[Python] The Pants Rust inference parser exhibits some dependency misinterpretation issues when python-infer.string_imports is enabled. #20324
Comments
Point 1 seems like a bug. We'll likely have to find the right tree-sitter incantation for that. Points 2 and 3 are more of a "quirk" than a bug. However, since they aren't valid dotted module names (as far as I know), I think it's safe to also "fix" this quirk.
Similar to point 2: the rust parser also causes a warning for every file that has either of these strings, neither of which is a valid python module:
I get that using pants 2.18.2 with this in pants.toml (from the StackStorm project):
[python-infer]
string_imports = true
string_imports_min_dots = 1 # tools/config_gen.py has import strings with only one dot.
unowned_dependency_behavior = "ignore"
ambiguity_resolution = "by_source_root"
use_rust_parser = true
The old parser (in 2.18.x) explicitly required alphanumeric characters before the dot: pants/src/python/pants/backend/python/dependency_inference/scripts/general_dependency_visitor.py Lines 26 to 31 in 3da5ae5
The new rust parser, however, seems to only exclude strings that contain whitespace or a backslash: pants/src/rust/engine/dep_inference/src/python/mod.rs Lines 325 to 328 in 3632a66
Can we make the rust parser be more selective on which strings might be dependencies? In particular, a string that only consists of "." characters. Maybe with something like this:
if !text.ends_with(".") && !text.contains(|c: char| c.is_ascii_whitespace() || c == '\\') {
    self.string_candidates
        .insert(text.to_string(), (range.start_point.row + 1) as u64);
}
Oh. That wouldn't work because the detected strings are used for both imports and assets. pants/src/python/pants/backend/python/dependency_inference/parse_python_dependencies.py Lines 117 to 130 in 8eb5557
So, maybe:
if python_infer_subsystem.string_imports or python_infer_subsystem.assets:
    for string, line in native_result.string_candidates.items():
        slash_count = string.count("/")
        if (
            python_infer_subsystem.string_imports
            and not slash_count
            and not string.endswith(".")
            and string.count(".") >= python_infer_subsystem.string_imports_min_dots
        ):
            imports.setdefault(string, (line, True))
        if (
            python_infer_subsystem.assets
            and slash_count >= python_infer_subsystem.assets_min_slashes
        ):
            assets.add(string)
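For anyone who wants to try this filter outside of Pants, here is a minimal standalone sketch of the same logic. `InferOptions` and `classify` are hypothetical names standing in for the `[python-infer]` subsystem and the surrounding rule code, and the default values are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, Set, Tuple


@dataclass
class InferOptions:
    # Hypothetical stand-in for the [python-infer] options used in the snippet above.
    string_imports: bool = True
    string_imports_min_dots: int = 1
    assets: bool = True
    assets_min_slashes: int = 1


def classify(
    string_candidates: Dict[str, int], opts: InferOptions
) -> Tuple[Dict[str, Tuple[int, bool]], Set[str]]:
    """Split candidate strings into possible imports and possible assets."""
    imports: Dict[str, Tuple[int, bool]] = {}
    assets: Set[str] = set()
    for string, line in string_candidates.items():
        slash_count = string.count("/")
        if (
            opts.string_imports
            and not slash_count
            and not string.endswith(".")  # drops "a.b.", "retrying...", ".", ".."
            and string.count(".") >= opts.string_imports_min_dots
        ):
            imports.setdefault(string, (line, True))
        if opts.assets and slash_count >= opts.assets_min_slashes:
            assets.add(string)
    return imports, assets


# Candidate strings mapped to the line they were found on (made-up sample data).
candidates = {
    "a.b.c": 1,
    "a.b.": 2,
    "retrying...": 3,
    ".": 4,
    "..": 5,
    "data/config.json": 6,
}
imports, assets = classify(candidates, InferOptions())
print(imports)
print(assets)
```

With this sample data only "a.b.c" survives as an import candidate and only "data/config.json" as an asset candidate, so the asset path is unaffected by the trailing-dot check.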
@liudonggalaxy Do you get any warnings or errors for your last two points:
Hello,
To resolve this issue and ensure the CircleCI job passes, we modified the following source code.
->
Ignores strings that are not valid python modules such as strings that end in "." (such as "." and ".."). Related: #20324
…20483) Ignores strings that are not valid python modules such as strings that end in "." (such as "." and ".."). Related: #20324 (fixes only points 2 and 3 where the strings end in "."). Co-authored-by: Jacob Floyd
#20484) Ignores any strings that have the ignore pragma comment. This used to work with the python-based parser, but apparently there weren't tests so it was not carried into the rust-based parser. Related: #20324, #20472 Co-authored-by: Jacob Floyd
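To illustrate what "ignores any strings that have the ignore pragma comment" means in practice, here is a rough Python sketch. It is not the Rust implementation from the PR; `strings_without_pragma` is a hypothetical helper, and it assumes the pragma in question is the `# pants: no-infer-dep` comment placed at the end of the line.

```python
import ast
import io
import tokenize

# Assumption: this is the ignore pragma the change refers to.
PRAGMA = "# pants: no-infer-dep"


def strings_without_pragma(source: str):
    """Return (value, line) for string literals not on a line carrying the pragma."""
    tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
    # Lines whose trailing comment is the pragma.
    ignored_lines = {
        tok.start[0]
        for tok in tokens
        if tok.type == tokenize.COMMENT and tok.string.rstrip().endswith(PRAGMA)
    }
    return [
        (ast.literal_eval(tok.string), tok.start[0])
        for tok in tokens
        # Use the line where the literal ends, so multi-line strings are handled.
        if tok.type == tokenize.STRING and tok.end[0] not in ignored_lines
    ]


source = (
    'A = "a.b.c"\n'
    'B = "d.e.f"  # pants: no-infer-dep\n'
)
print(strings_without_pragma(source))
```

Only the string on the first line is reported; the second one is skipped because its line carries the pragma.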
With #20472, only dotted strings that are made up of valid python identifiers will be considered as possible imports. So, points 2 and 3 should be resolved in pants 2.20 or 2.19.1 (once released). To fix the first issue, someone needs to figure out how to deal with concatenated strings with tree-sitter in the dependency parser.
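As a rough illustration of the "valid python identifiers" rule, the check could look something like the sketch below. This is Python rather than the Rust code from #20472, and `looks_like_module` is a made-up name. Note that `str.isidentifier()` also accepts keywords such as `import`, which the real parser may or may not treat differently.

```python
def looks_like_module(candidate: str, min_dots: int = 1) -> bool:
    """Return True if the string is a dotted name built only from identifiers."""
    parts = candidate.split(".")
    # N dots produce N + 1 parts, so compare the dot count with min_dots.
    if len(parts) - 1 < min_dots:
        return False
    # An empty part (leading, trailing, or doubled dot) is never an identifier.
    return all(part.isidentifier() for part in parts)


for text in ["a.b.c.d", "tools.config_gen", "a.b.", "retrying...", ".", ".."]:
    print(f"{text!r}: {looks_like_module(text)}")
```

The strings from points 2 and 3 ("a.b." and "retrying...") and the dot-only strings are rejected because splitting on "." leaves at least one empty part.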
#20601) Ignores any strings that have the ignore pragma comment. This used to work with the python-based parser, but apparently there weren't tests so it was not carried into the rust-based parser. Related: #20324, #20472 Co-authored-by: Jacob Floyd
Tree-sitter uses The tree-sitter playground is useful for figuring these out. For example,
Rough draft WIP fix at #22050 for handling concatenated string literals. It needs tests (beyond the minimal example already added) and some evaluation of whether refactoring part of the parser is worthwhile as well.
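For context on why concatenated literals behave differently between the two parsers, here is a small sketch. It assumes the classic parser works from CPython's `ast`, which merges adjacent string literals into a single constant; the source line below is a made-up example, since the snippet from the original report is not shown here.

```python
import ast

# Hypothetical example of implicit string concatenation.
source = 'MODULE = "a.b" ".c.d"\n'

# CPython's parser folds adjacent literals into one ast.Constant node.
strings = [
    node.value
    for node in ast.walk(ast.parse(source))
    if isinstance(node, ast.Constant) and isinstance(node.value, str)
]
print(strings)  # ['a.b.c.d']
```

A parser that visits each literal fragment separately would instead see "a.b" and ".c.d" as two strings, which matches the `a.b` versus `a.b.c.d` difference described in point 1.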
Maintainer summary
[python-infer].string_imports
Concatenated string literals (issue 1 in the original report)
a.b. (issue 2 in the original report)
retrying... (issue 3 in the original report)
".", "..", etc. (issue reported in comments)
Original report
Describe the bug
The Pants Rust inference parser exhibits some dependency misinterpretation issues when python-infer.string_imports is enabled. This is evident in the following examples:
1. The Rust parser treats a.b as a dependency, whereas the classic parser correctly discerns a.b.c.d:
2. For BASE_PATH = 'a.b.', the Rust parser mistakenly treats it as a dependency, unlike the classic parser.
3. For print('retrying...'), the Rust parser incorrectly interprets it as a dependency, while the classic parser does not.
To compare the Rust and classic parsers, the following commands were executed on both MacOS and Linux:
Here is the pants configuration in pants.toml.
Pants version
2.17.0
OS
Both MacOS and Linux.
Additional info