py/parse.c: Implement function keyword argument syntax restriction. #8479

RayaneCTX · 2022-04-01T17:20:16Z

Synopsis: Implements the function keyword argument syntax restriction added with Python 3.8 (listed under issue #7899).

Details: This is done by adding a switch in the push_result_rule function in parse.c to turn off the parentheses removal optimization for atom_paren rules encountered while parsing an arglist rule (meaning this atom_paren rule is a function argument) and when the enclosed node is a lone ID. Keeping the atom_paren rule in the parse tree allows the compiler to catch the syntax error when the rule happens to be a keyword argument.

In order to determine whether an atom_paren rule is a function argument, a bit is added to the parser structure, which is set in mp_parse when an arglist rule is being parsed (unset once done with it). Other options were considered: hard-coding the distance on the rule stack between an arglist rule and an atom_paren rule in the added switch, using a global boolean variable, using a local boolean variable (by expanding the function signature of the push_result_rule function). Adding a field to the parser structure seemed to provide the best balance between code size increase, code changes, memory usage overhead, code maintainability, etc.

Because the compile_atom_paren function did not expect this rule to be a parent for a lone ID, a switch is added in that function to handle this.

A new test file fun_kwargs_syntax.py is added to make sure the syntax restriction is in effect, and a test is added to fun2.py to make sure extraneous parentheses around ID function arguments does not
cause any issues.

Implemented function keyword argument syntax restriction added with Python 3.8 (listed under issue micropython#7899). This is done by adding a switch in the push_result_rule function in parse.c to turn off the parentheses removal optimization for atom_paren rules encountered while parsing an arglist rule (meaning this atom_paren rule is a function argument) and when the enclosed node is a lone ID. Keeping the atom_paren rule in the parse tree allows the compiler to catch the syntax error when the rule happens to be a keyword argument. In order to determine whether an atom_paren rule is a function argument, a bit is added to the parser structure, which is set in mp_parse when an arglist rule is being parsed (unset once done with it). Other options were considered: hard-coding the distance on the rule stack between an arglist rule and an atom_paren rule in the added switch, using a global boolean variable, using a local boolean variable (by expanding the function signature of the push_result_rule function). Adding a field to the parser structure seemed to provide the best balance between code size increase, code changes, memory usage overhead, code maintainability, etc. Because the compile_atom_paren function did not expect this rule to be a parent for a lone ID, a switch is added in that function to handle this. A new test file fun_kwargs_syntax.py is added to make sure the syntax restriction is in effect, and a test is added to fun2.py to make sure extraneous parenthese around ID function arguments does not cause any issues. Signed-off-by: Rayane Chatrieux <rayane.chatrieux@gmail.com>

dpgeorge · 2022-04-02T10:56:27Z

Thanks for the contribution!

As far as I understand, this is now restricting parenthesis around id's in a keyword argument, like f((a)=2), is that correct? Because things like f(a or b=2) are already a SyntaxError in MicroPython.

dpgeorge · 2022-04-02T10:58:22Z

py/parse.c

@@ -1073,12 +1080,21 @@ mp_parse_tree_t mp_parse(mp_lexer_t *lex, mp_parse_input_kind_t input_kind) {
            default: {
                assert((rule_act & RULE_ACT_KIND_MASK) == RULE_ACT_LIST);

+                if (rule_id == RULE_arglist && !MP_PARSE_IS_PARSING_ARGLIST(&parser)) {
+                    parser.cur_view |= MP_PARSE_PARSING_ARGLIST;


Instead of maintaining this bit, did you consider peeking into the rule stack? This information should already be there, you just need to do something like parser->rule_stack[parser->rule_stack_top - 1].rule_id == RULE_arglist.

I did, and I think the correct code to do this would be parser->rule_stack[parser->rule_stack_top - 15].rule_id == RULE_arglist, where 15 is that distance between the RULE_arglist and RULE_atom_paren that would guarantee that the second is a child of the first. I was able to implement the restriction this way. My concern with doing this is that it creates a tight dependence on the grammar file, i.e. if someone makes a change to it that should seemingly have no effect on this, this could silently break. I propose the bit solution to hopefully add some robustness.

Does the approach her work with nested function calls, eg f(f(), (a)=2)?

Yep. On this input code, after parsing through the first argument f(), the parser will revisit the arglist rule, allowing it to re-assert the bit, before recursing down through the (a) = 2 argument. Added a test a did a quick run just now to confirm.

RayaneCTX · 2022-04-02T12:07:24Z

Thanks for the contribution!

As far as I understand, this is now restricting parenthesis around id's in a keyword argument, like f((a)=2), is that correct? Because things like f(a or b=2) are already a SyntaxError in MicroPython.

Yes, the intent is to restrict syntax such as f((a)=2) only.

codecov-commenter · 2022-04-10T13:09:49Z

Codecov Report

Merging #8479 (3622ed5) into master (665f0e2) will decrease coverage by 0.14%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master    #8479      +/-   ##
==========================================
- Coverage   98.40%   98.25%   -0.15%     
==========================================
  Files         153      154       +1     
  Lines       20188    20296     +108     
==========================================
+ Hits        19865    19941      +76     
- Misses        323      355      +32

Impacted Files	Coverage Δ
py/parse.h	`100.00% <ø> (ø)`
py/compile.c	`99.94% <100.00%> (-0.01%)`	⬇️
py/parse.c	`99.16% <100.00%> (ø)`
py/bc.c	`81.88% <0.00%> (-1.21%)`	⬇️
py/persistentcode.c	`97.17% <0.00%> (-0.10%)`	⬇️
py/showbc.c	`98.32% <0.00%> (-0.04%)`	⬇️
py/vm.c	`98.98% <0.00%> (-0.01%)`	⬇️
py/bc.h	`100.00% <0.00%> (ø)`
... and 15 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 665f0e2...3622ed5. Read the comment docs.

…n-main Translations update from Hosted Weblate

dpgeorge reviewed Apr 2, 2022

View reviewed changes

tests/basics/fun_kwargs_syntax: Added a nested call test.

3622ed5

RayaneCTX mentioned this pull request Apr 8, 2022

Is the parser technology gap between MP and CPython going to pose a serious problem? #8507

Closed

dpgeorge added the py-core Relates to py/ directory in source label Apr 10, 2022

tannewt added a commit to tannewt/circuitpython that referenced this pull request Oct 17, 2023

Merge pull request micropython#8479 from weblate/weblate-circuitpytho…

e232b13

…n-main Translations update from Hosted Weblate

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

py/parse.c: Implement function keyword argument syntax restriction. #8479

py/parse.c: Implement function keyword argument syntax restriction. #8479

RayaneCTX commented Apr 1, 2022 •

edited

Loading

Uh oh!

dpgeorge commented Apr 2, 2022

Uh oh!

dpgeorge Apr 2, 2022

Uh oh!

RayaneCTX Apr 2, 2022 •

edited

Loading

Uh oh!

dpgeorge Apr 2, 2022

Uh oh!

RayaneCTX Apr 2, 2022 •

edited

Loading

Uh oh!

RayaneCTX commented Apr 2, 2022 •

edited

Loading

Uh oh!

codecov-commenter commented Apr 10, 2022

Uh oh!

Uh oh!

Uh oh!

py/parse.c: Implement function keyword argument syntax restriction. #8479

Are you sure you want to change the base?

py/parse.c: Implement function keyword argument syntax restriction. #8479

Conversation

RayaneCTX commented Apr 1, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

dpgeorge commented Apr 2, 2022

Uh oh!

dpgeorge Apr 2, 2022

Choose a reason for hiding this comment

Uh oh!

RayaneCTX Apr 2, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

dpgeorge Apr 2, 2022

Choose a reason for hiding this comment

Uh oh!

RayaneCTX Apr 2, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

RayaneCTX commented Apr 2, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

codecov-commenter commented Apr 10, 2022

Codecov Report

Uh oh!

Uh oh!

RayaneCTX commented Apr 1, 2022 •

edited

Loading

RayaneCTX Apr 2, 2022 •

edited

Loading

RayaneCTX Apr 2, 2022 •

edited

Loading

RayaneCTX commented Apr 2, 2022 •

edited

Loading