-
-
Notifications
You must be signed in to change notification settings - Fork 8.2k
py/parse.c: Implement function keyword argument syntax restriction. #8479
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
py/parse.c: Implement function keyword argument syntax restriction. #8479
Conversation
Implemented function keyword argument syntax restriction added with Python 3.8 (listed under issue micropython#7899). This is done by adding a switch in the push_result_rule function in parse.c to turn off the parentheses removal optimization for atom_paren rules encountered while parsing an arglist rule (meaning this atom_paren rule is a function argument) and when the enclosed node is a lone ID. Keeping the atom_paren rule in the parse tree allows the compiler to catch the syntax error when the rule happens to be a keyword argument. In order to determine whether an atom_paren rule is a function argument, a bit is added to the parser structure, which is set in mp_parse when an arglist rule is being parsed (unset once done with it). Other options were considered: hard-coding the distance on the rule stack between an arglist rule and an atom_paren rule in the added switch, using a global boolean variable, using a local boolean variable (by expanding the function signature of the push_result_rule function). Adding a field to the parser structure seemed to provide the best balance between code size increase, code changes, memory usage overhead, code maintainability, etc. Because the compile_atom_paren function did not expect this rule to be a parent for a lone ID, a switch is added in that function to handle this. A new test file fun_kwargs_syntax.py is added to make sure the syntax restriction is in effect, and a test is added to fun2.py to make sure extraneous parenthese around ID function arguments does not cause any issues. Signed-off-by: Rayane Chatrieux <rayane.chatrieux@gmail.com>
Thanks for the contribution! As far as I understand, this is now restricting parenthesis around id's in a keyword argument, like |
@@ -1073,12 +1080,21 @@ mp_parse_tree_t mp_parse(mp_lexer_t *lex, mp_parse_input_kind_t input_kind) { | |||
default: { | |||
assert((rule_act & RULE_ACT_KIND_MASK) == RULE_ACT_LIST); | |||
|
|||
if (rule_id == RULE_arglist && !MP_PARSE_IS_PARSING_ARGLIST(&parser)) { | |||
parser.cur_view |= MP_PARSE_PARSING_ARGLIST; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of maintaining this bit, did you consider peeking into the rule stack? This information should already be there, you just need to do something like parser->rule_stack[parser->rule_stack_top - 1].rule_id == RULE_arglist
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did, and I think the correct code to do this would be parser->rule_stack[parser->rule_stack_top - 15].rule_id == RULE_arglist
, where 15 is that distance between the RULE_arglist
and RULE_atom_paren
that would guarantee that the second is a child of the first. I was able to implement the restriction this way. My concern with doing this is that it creates a tight dependence on the grammar file, i.e. if someone makes a change to it that should seemingly have no effect on this, this could silently break. I propose the bit solution to hopefully add some robustness.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the approach her work with nested function calls, eg f(f(), (a)=2)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep. On this input code, after parsing through the first argument f()
, the parser will revisit the arglist
rule, allowing it to re-assert the bit, before recursing down through the (a) = 2
argument. Added a test a did a quick run just now to confirm.
Yes, the intent is to restrict syntax such as |
Codecov Report
@@ Coverage Diff @@
## master #8479 +/- ##
==========================================
- Coverage 98.40% 98.25% -0.15%
==========================================
Files 153 154 +1
Lines 20188 20296 +108
==========================================
+ Hits 19865 19941 +76
- Misses 323 355 +32
Continue to review full report at Codecov.
|
…n-main Translations update from Hosted Weblate
Synopsis: Implements the function keyword argument syntax restriction added with Python 3.8 (listed under issue #7899).
Details: This is done by adding a switch in the
push_result_rule
function in parse.c to turn off the parentheses removal optimization foratom_paren
rules encountered while parsing anarglist
rule (meaning thisatom_paren
rule is a function argument) and when the enclosed node is a lone ID. Keeping theatom_paren
rule in the parse tree allows the compiler to catch the syntax error when the rule happens to be a keyword argument.In order to determine whether an
atom_paren
rule is a function argument, a bit is added to the parser structure, which is set inmp_parse
when anarglist
rule is being parsed (unset once done with it). Other options were considered: hard-coding the distance on the rule stack between anarglist
rule and anatom_paren
rule in the added switch, using a global boolean variable, using a local boolean variable (by expanding the function signature of thepush_result_rule
function). Adding a field to the parser structure seemed to provide the best balance between code size increase, code changes, memory usage overhead, code maintainability, etc.Because the
compile_atom_paren
function did not expect this rule to be a parent for a lone ID, a switch is added in that function to handle this.A new test file fun_kwargs_syntax.py is added to make sure the syntax restriction is in effect, and a test is added to fun2.py to make sure extraneous parentheses around ID function arguments does not
cause any issues.