Skip to content

gh-123539: Improve SyntaxError msg for import as with not a name #123629

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
May 2, 2025

Conversation

sobolevn
Copy link
Member

@sobolevn sobolevn commented Sep 3, 2024

Copy link
Member

@pablogsal pablogsal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to not change or introduce new first-pass rules when introducing error messages. If you have problems with precedence, perhaps you can just move the invalid rule the first one

@bedevere-app
Copy link

bedevere-app bot commented Sep 3, 2024

When you're done making the requested changes, leave the comment: I have made the requested changes; please review again.

@lysnikolaou
Copy link
Member

I would prefer to not change or introduce new first-pass rules when introducing error messages. If you have problems with precedence, perhaps you can just move the invalid rule the first one

Well, I think in this case it's not possible to do it otherwise. You wanna to be checking the NAME variant first to be sure that there's no valid syntax there, then check the invalid rule for the error message, but before the no alias alternative is visited, since that will succeed in the case of import a as a.b (it'll just parse import a and be done with it).

FWIW my guesstimate is that this might be frequent enough that changing existing rules is warranted.

@pablogsal
Copy link
Member

pablogsal commented Sep 3, 2024

Well, I think in this case it's not possible to do it otherwise. You wanna to be checking the NAME variant first to be sure that there's no valid syntax there, then check the invalid rule for the error message, but before the no alias alternative is visited, since that will succeed in the case of import a as a.b (it'll just parse import a and be done with it).

Yeah, but we are adding now a lookahead (&(',' | ')') which also includes NEWLINE. I am a bit wary of introducing rules that deal explicitly with NEWLINE tokens because I have seen those backfire in the past. For example, it's not imediately clear why a file that just has import a as b (no newline at the end) is not now a syntax error (is because the implicit newline, but that's not obvious when you read the grammar file).

@sobolevn
Copy link
Member Author

sobolevn commented Sep 3, 2024

I will try some other approaches, I might come up with something simplier.

In the meantime I also noticed that

>>> from a import b.c
  File "<stdin>", line 1
    from a import b.c
                   ^
SyntaxError: invalid syntax

this can also be improved, not only as. But, I will defer this to another PR.

@pablogsal
Copy link
Member

But, I will defer this to another PR.

Whatever you prefer, it's ok to do it all here as we are already discussing this rule :)

@sobolevn
Copy link
Member Author

sobolevn commented Sep 4, 2024

I've spent several hours trying something else:

  1. Converting b=['as' z=NAME { z }] to its own rules import_alias with invalid_import_alias rule inside, but there's no clear way to tell apart 'as' NAME and 'as' NAME.NAME without some extra complexity and wierd lookaheads
  2. I tried to just add invalid_dotted_as_name: dotted_name 'as' a=expression as the first rule to dotted_name, but this way it can accidentally match first on the second on the valid code :(
  3. I tried to simplify the current solution, but it is not easy as well. It requires look aheads.

Right now I think that it is better to close this PR, unfortunatelly.

@pablogsal
Copy link
Member

Right now I think that it is better to close this PR, unfortunatelly.

If @lysnikolaou it's not too worried about the lookahead I am happy to consider landing it as is currently, but I am a bit worried that this also makes the grammar more difficult to reason about in the official docs because there it won't be apparent why the lookahead is needed (and other rules don't have it).

@pablogsal
Copy link
Member

pablogsal commented Sep 4, 2024

I've spent several hours trying something else:

  1. Converting b=['as' z=NAME { z }] to its own rules import_alias with invalid_import_alias rule inside, but there's no clear way to tell apart 'as' NAME and 'as' NAME.NAME without some extra complexity and wierd lookaheads
  2. I tried to just add invalid_dotted_as_name: dotted_name 'as' a=expression as the first rule to dotted_name, but this way it can accidentally match first on the second on the valid code :(
  3. I tried to simplify the current solution, but it is not easy as well. It requires look aheads.

Right now I think that it is better to close this PR, unfortunatelly.

This works:

diff --git a/Grammar/python.gram b/Grammar/python.gram
index c8415552b28..8eb9bb7473c 100644
--- a/Grammar/python.gram
+++ b/Grammar/python.gram
@@ -218,17 +218,17 @@ import_from_targets[asdl_alias_seq*]:
 import_from_as_names[asdl_alias_seq*]:
     | a[asdl_alias_seq*]=','.import_from_as_name+ { a }
 import_from_as_name[alias_ty]:
-    | a=NAME 'as' b=NAME &(',' | ')' | NEWLINE) {
-        _PyAST_alias(a->v.Name.id, b->v.Name.id, EXTRA) }
     | invalid_import_from_as_name
-    | a=NAME { _PyAST_alias(a->v.Name.id, NULL, EXTRA) }
+    | a=NAME b=['as' z=NAME { z }] { _PyAST_alias(a->v.Name.id,
+                                               (b) ? ((expr_ty) b)->v.Name.id : NULL,
+                                               EXTRA) }
 dotted_as_names[asdl_alias_seq*]:
     | a[asdl_alias_seq*]=','.dotted_as_name+ { a }
 dotted_as_name[alias_ty]:
-    | a=dotted_name 'as' b=NAME &(',' | NEWLINE) {
-        _PyAST_alias(a->v.Name.id, b->v.Name.id, EXTRA) }
     | invalid_dotted_as_name
-    | a=dotted_name { _PyAST_alias(a->v.Name.id, NULL, EXTRA) }
+    | a=dotted_name b=['as' z=NAME { z }] { _PyAST_alias(a->v.Name.id,
+                                                      (b) ? ((expr_ty) b)->v.Name.id : NULL,
+                                                      EXTRA) }
 dotted_name[expr_ty]:
     | a=dotted_name '.' b=NAME { _PyPegen_join_names_with_dot(p, a, b) }
     | NAME
@@ -1324,11 +1324,11 @@ invalid_import_from_targets:
     | token=NEWLINE {
         RAISE_SYNTAX_ERROR_STARTING_FROM(token, "Expected one or more names after 'import'") }
 invalid_dotted_as_name:
-    | dotted_name 'as' a=expression {
+    | dotted_name 'as' !(NAME (',' | ')' | NEWLINE)) a=expression {
         RAISE_SYNTAX_ERROR_KNOWN_LOCATION(a,
             "cannot use %s as import target", _PyPegen_get_expr_name(a)) }
 invalid_import_from_as_name:
-    | NAME 'as' a=expression {
+    | NAME 'as' !(NAME (',' | ')' | NEWLINE))   a=expression {
         RAISE_SYNTAX_ERROR_KNOWN_LOCATION(a,
             "cannot use %s as import target", _PyPegen_get_expr_name(a)) }
./python.exe -m test test_syntax test_grammar
Using random seed: 2322618595
Raised RLIMIT_NOFILE: 256 -> 1024
0:00:00 load avg: 5.06 Run 2 tests sequentially in a single process
0:00:00 load avg: 5.06 [1/2] test_syntax
0:00:00 load avg: 5.06 [2/2] test_grammar

== Tests result: SUCCESS ==

All 2 tests OK.

Total duration: 268 ms
Total tests: run=116
Total test files: ru

Notice this doesn't modify the current working rules but just adds the invalid with higher precedence + filtering of the expected one. The filtering is needed otherwise it will match something valid if there is a syntax error afterwards.

@sobolevn
Copy link
Member Author

@pablogsal thanks a lot for your help, great idea! Sorry, it took me so long to follow up on this PR.

Now it should be ready to review again 🫶

@sobolevn sobolevn requested a review from pablogsal March 13, 2025 15:17
@sobolevn
Copy link
Member Author

sobolevn commented May 2, 2025

@pablogsal @lysnikolaou I would like to get this merged before the feature freeze, it would be amazing if you have some time to look at it :)

Снимок экрана 2025-05-02 в 11 08 11

@pablogsal
Copy link
Member

I'm merging this for now but could you create another PR adding the improvement to the what's new for 3.14 in the syntax error section?

@sobolevn
Copy link
Member Author

sobolevn commented May 2, 2025

Thanks a lot, @pablogsal! Without your help - it would be so much harder! 🫶

Sure, will do!

@pablogsal pablogsal enabled auto-merge (squash) May 2, 2025 08:23
@pablogsal
Copy link
Member

Fantastic job @sobolevn 👌🤘

@pablogsal pablogsal merged commit a6ddd07 into python:main May 2, 2025
45 of 46 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants