py: Implement partial PEP-498 (f-string) support (v2) #6247
This is currently +432 bytes on PYBV11.
This implements (most of) the PEP-498 spec for f-strings, with two exceptions:

- raw f-strings (`fr` or `rf` prefixes) raise `NotImplementedError`
- one special corner case does not function as specified in the PEP (more on that in a moment)

This is implemented in the core as a syntax translation, brute-forcing all f-strings to run through `str.format`. For example, the statement `x='world'; print(f'hello {x}')` gets translated *at a syntax level* (injected into the lexer) to `x='world'; print('hello {}'.format(x))`. While this may lead to weird column results in tracebacks, it seemed like the fastest, most efficient, and *likely* most RAM-friendly option, despite being implemented under the hood with a completely separate `vstr_t`.

Since [string concatenation of adjacent literals is implemented in the lexer](micropython@534b7c3), two side effects emerge:

- All strings with at least one f-string portion are concatenated into a single literal which *must* be run through `str.format()` wholesale, and:
- Concatenating a plain (non-f) string literal that contains interpolation characters with an f-string will cause `IndexError`/`KeyError`, which is both different from CPython *and* different from the corner case mentioned in the PEP, which gives the following example:

```python
x = 10
y = 'hi'
assert ('a' 'b' f'{x}' '{c}' f'str<{y:^4}>' 'd' 'e') == 'ab10{c}str< hi >de'
```

The above-linked commit detailed a pretty solid case for leaving string concatenation in the lexer rather than putting it in the parser, and undoing that decision would likely be disproportionately costly on resources for the sake of a probably-low-impact corner case. An alternative to become compliant with this corner case of the PEP would be to revert to string concatenation in the parser *only when an f-string is part of concatenation*, though I've done no investigation on the difficulty or costs of doing this.

A decent set of tests is included. I've manually tested this on the `unix` port on Linux and on a Feather M4 Express (`atmel-samd`) and things seem sane.
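To make the syntax translation described above concrete, here is a minimal Python sketch of the idea. It is not the PR's C lexer code: `translate_fstring` and `to_format_call` are hypothetical names, and the sketch assumes flat replacement fields only (no nested braces, no `:` inside the expression itself).

```python
def translate_fstring(body):
    """Split an f-string body into a str.format template and argument list.

    Hypothetical helper for illustration; handles '{{', '}}' and flat
    '{expr}' / '{expr:spec}' fields only.
    """
    template = []
    args = []
    i = 0
    while i < len(body):
        if body.startswith("{{", i) or body.startswith("}}", i):
            # Escaped braces pass through; str.format unescapes them later.
            template.append(body[i:i + 2])
            i += 2
        elif body[i] == "{":
            end = body.index("}", i)  # raises ValueError if unterminated
            expr, sep, spec = body[i + 1:end].partition(":")
            args.append(expr.strip())
            # The expression moves to the argument list; the spec stays put.
            template.append("{" + sep + spec + "}")
            i = end + 1
        else:
            template.append(body[i])
            i += 1
    return "".join(template), args


def to_format_call(body):
    """Return the source text of the equivalent str.format call."""
    template, args = translate_fstring(body)
    return "{!r}.format({})".format(template, ", ".join(args))


print(to_format_call("hello {x}"))  # 'hello {}'.format(x)
```

Because the expressions are only moved, not evaluated, the translation itself needs no knowledge of the surrounding scope; evaluation is left to the ordinary call that the rewritten source produces.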
ping :)
This rebases cleanly on v1.13 and has the following code-size changes:
So the feature can be completely disabled, which is good. @jimmo what was the status of this, were there some things to improve/test?
I started writing more comprehensive tests and started to doubt whether this is the right approach. f-strings are complicated! That said, I don't have a better idea! I don't think we need to aim for a perfect implementation (which I think is just about impossible to do in a sufficiently "micro" way), I wanted to characterise exactly what we do and don't support, and ensure that we do something sensible for the unsupported cases. Right now the lexer gets a bit confused by nested braces. (You can put f-strings inside f-strings). I'll dig up my notes and add more details soon.
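One way nested braces can trip up a simple scanner, sketched in Python. This is an illustration of the general problem, not a description of what the PR's lexer actually does; both function names are hypothetical.

```python
def naive_field_end(body, start):
    """Index of the first '}' after start: stops too early on nesting."""
    return body.index("}", start)


def depth_aware_field_end(body, start):
    """Index of the '}' that closes the field opened at body[start]."""
    depth = 0
    for i in range(start, len(body)):
        if body[i] == "{":
            depth += 1
        elif body[i] == "}":
            depth -= 1
            if depth == 0:
                return i
    raise ValueError("unterminated replacement field")


body = "{value:{width}}"  # body of f'{value:{width}}', a nested format spec
print(naive_field_end(body, 0))        # 13, inside the field
print(depth_aware_field_end(body, 0))  # 14, the real end of the field
```

`str.format` itself accepts one level of nesting in the format spec, so a translation to `'{:{}}'.format(value, width)` is possible for this case; f-strings nested inside f-strings would need more than brace counting, since the inner string has its own quoting.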
Ok, thanks. I'm happy to support f-strings in just a simple way to start with, with known limitations. And "simple" may anyway end up being enough.
This is a very minor fixup of #4998 by @klardotsh, just to address the outstanding review comments (code formatting, testing across all ports, etc).
Also enables the feature by default on ESP32 and STM32 (as well as Unix and Windows from the original PR).
Thanks to @klardotsh for doing all the hard work implementing this!! Original PR description below:
This implements (most of) the PEP-498 spec for f-strings, with two exceptions:

- raw f-strings (`fr` or `rf` prefixes) raise `NotImplementedError`
- one special corner case does not function as specified in the PEP (more on that in a moment)

This is implemented in the core as a syntax translation, brute-forcing all f-strings to run through `str.format`. For example, the statement `x='world'; print(f'hello {x}')` gets translated *at a syntax level* (injected into the lexer) to `x='world'; print('hello {}'.format(x))`. While this may lead to weird column results in tracebacks, it seemed like the fastest, most efficient, and *likely* most RAM-friendly option, despite being implemented under the hood with a completely separate `vstr_t`.

Since string concatenation of adjacent literals is implemented in the lexer, two side effects emerge:

- All strings with at least one f-string portion are concatenated into a single literal which *must* be run through `str.format()` wholesale, and:
- Concatenating a plain (non-f) string literal that contains interpolation characters with an f-string will cause `IndexError`/`KeyError`, which is both different from CPython *and* different from the corner case mentioned in the PEP, whose example is that, with `x = 10` and `y = 'hi'`, the expression `'a' 'b' f'{x}' '{c}' f'str<{y:^4}>' 'd' 'e'` equals `'ab10{c}str< hi >de'`.

The above-linked commit detailed a pretty solid case for leaving string concatenation in the lexer rather than putting it in the parser, and undoing that decision would likely be disproportionately costly on resources for the sake of a probably-low-impact corner case. An alternative to become compliant with this corner case of the PEP would be to revert to string concatenation in the parser *only when an f-string is part of concatenation*, though I've done no investigation on the difficulty or costs of doing this.

A decent set of tests is included. I've manually tested this on the `unix` port on Linux and on a Feather M4 Express (`atmel-samd`) and things seem sane.
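The concatenation side effect can be reproduced under CPython with plain `str.format`. In the sketch below, `merged` and `format_merged` are hypothetical: `merged` is an assumed shape for the single literal the lexer would produce from the PEP's example, not output captured from MicroPython.

```python
x = 10
y = 'hi'

# CPython (per the PEP) formats only the f-prefixed pieces, so the braces
# in the plain literal '{c}' survive untouched.
cpython_result = 'a' 'b' f'{x}' '{c}' f'str<{y:^4}>' 'd' 'e'


def format_merged(template, *args):
    """Format a merged literal; return the exception name on failure."""
    try:
        return template.format(*args)
    except (IndexError, KeyError) as exc:
        return type(exc).__name__


# Assumed result of merging all pieces before translation: '{c}' is now
# treated as a named field and '{}' as a positional one.
merged = 'ab{}{c}str<{:^4}>de'

print(cpython_result)               # ab10{c}str< hi >de
print(format_merged(merged, x, y))  # KeyError
print(format_merged('{}{}', x))     # IndexError
```

The last call models a plain `'{}'` literal next to `f'{x}'`: the merged template has two positional fields but only one argument, hence the `IndexError`.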