-
-
Notifications
You must be signed in to change notification settings - Fork 31.8k
gh-131507: Add support for syntax highlighting in PyREPL #133247
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
🤖 New build scheduled with the buildbot fleet by @ambv for commit ffebbbe 🤖 Results will be shown at: https://buildbot.python.org/all/#/grid?branch=refs%2Fpull%2F133247%2Fmerge If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for being too nitpicky, but I still find the blue used for keywords to be too dark when coupled with dark-themed terminals (e.g. #131562 (comment)). Do you think we could use a more contrasting color?
I disagree about the blue color, the fact that Ubuntu colors it too dim is not actionable for the Python project. Adjust your terminal. |
The refleak failures are unrelated, see #133258. |
Having a closer look, I see you are only reading in chunks in case of |
@chris-eibl I simplified the Unix case to remove the additional buffering since it complicates the codebase. Your PR was adding that same buffering to Windows. Given that we now achieve the same performance without double-buffering, I'd say we don't need the other PR. |
I think that the default theme polishing can be addressed in following pr(s?), as per pr description:
|
So this passed all buildbots save for s390x. The s390x failure is unrelated, Eric is dealing with it: #133265. |
Co-authored-by: Victorien <65306057+Viicos@users.noreply.github.com> Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
Sorry for only spotting the Yeah, simple is always better 👍
I can confirm: this PR is now way faster pasting in the virtual terminal mode on Windows 🚀 |
done = "\x1b[201~" | ||
data = "" | ||
import time | ||
start = time.time() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Leftover from testing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The trace below shows time. I can move the import up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then use perf_counter
please
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a bunch of questions and nitpicks but overall this looks fantastic.
I've reviewed the tokenizer-related parts of the implementation and I'm comfortable with the current approach. We have already discussed this offline but for everyone else reading: while the soft keyword detection uses heuristics, this is a reasonable compromise as bringing in a more correct solution would require a full parser run, which would be much heavier and slower for this use case and we don't even have the technology now to do partial input in the PEG parser so whatever we do will be in python and much slower.
I've also run some performance benchmarks and things look really good — the impact on responsiveness is minimal, even with syntax highlighting enabled by default. And on Windows seem to be super fast.
I'm happy to explore further optimizations in the future, such as avoiding repeated tokenization during screen refreshes. But as it stands, this looks solid. Great work!
The only open question is what to do for setting the theme officially (right now is a "experimental" API) but I think since that affects other parts of the interpreter (such as tracebacks) this is out of scope of this particular PR, so I propose to discuss this separately |
|
Good catch, @chris-eibl. I'll fix it forward in a subsequent PR. |
Calling cpython/Lib/_pyrepl/windows_console.py Lines 150 to 155 in fac41f5
does fix it for me. |
This is a much improved version of gh-131562. It uses the tokenizer for better speed and pattern matching for more robust handling of soft keywords. While it still won't hit all cases correctly, it's better than
idlelib
's regular expression-based colorizer and unlike our glorious PEG parser it supports incomplete input, which is crucial for an interactive shell.Relatedly, pasting support was tweaked to be way faster. Now the entire contents of Frankenstein can be pasted within 3 seconds both on Unix and Windows as long as bracketed pasting is supported by the terminal. This is a necessary tweak for syntax highlighting not to cripple performance of pastes above 5kB.
There is experimental support for theming through
_colorize.set_theme()
that's mentioned in "What's New" but otherwise undocumented so far.📚 Documentation preview 📚: https://cpython-previews--133247.org.readthedocs.build/