-
-
Notifications
You must be signed in to change notification settings - Fork 31.8k
gh-131878: Fix input of unicode characters with two or more code points in new pyrepl on Windows #131901
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
- with two codepoints or more
LGTM. I cannot do a thorough review, but at least confirm, that with this PR I can enter |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested with a Czech keyboard and everything works. Thanks!
Co-authored-by: Tomas R. <tomas.roun8@gmail.com>
…epl-for-long-unicode-chars
Had some time now: this needs fixes for Linux. Otherwise looks good, thanks @sergey-miryanov! |
Since this wasn't caught by the tests, we should also probably add a test for that. |
I have suggested the |
Co-authored-by: Chris Eibl <138194463+chris-eibl@users.noreply.github.com>
@chris-eibl Please take a look! I would like to keep tests for paste mode to be sure that escape sequences not mixed with input if error occurred. But I'm OK to remove them if you think it is not necessary. |
LGTM. Sure, more tests are always better. Though, AFAIU, there is nothing special about paste mode here:
IMHO, both are reasonable, but "paste mode" makes them look too special about pasting? So I'd rename them and use arbitrary "one byte chars" in there. |
Sequence of "one byte chars" is a control command to enable "paste mode". Originally in gh-131878 problem aroise from case where the wrongly passed unicode string puts to the buffer and mixed with following control command that disables paste mode and asserted in those code path. So, I want to check that we don't "corrupt" chars buffer. |
Yupp, I know. But as said, there is nothing special about bracketed pasting here. From the perspective of the eventqueue test, these are just arbitrary bytes. Whether bracketed pasting is working correctly, is tested in cpython/Lib/test/test_pyrepl/test_pyrepl.py Line 1201 in 1b7470f
If you'd like to showcase that sequences of "one byte chars" interleaved with "multi byte chars" work correctly via "paste mode", then maybe just add a comment? And small nit: cpython/Lib/test/test_pyrepl/test_pyrepl.py Lines 1227 to 1228 in 1b7470f
To me, those two tests have merit, but are too special about "paste mode". |
@chris-eibl It seems that I did not correctly assume how paste-mode works (shame on me!). I removed one test because the same code path is covered by |
I don't have a better name - the comment in there clarifies what we intend to test (not much tbh). Otherwise LGTM. Thanks for bearing with me! |
Co-authored-by: Chris Eibl <138194463+chris-eibl@users.noreply.github.com>
@chris-eibl Thanks for the review and patience! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @sergey-miryanov!
If unicode characters with two or more codepoints (ñ or é) typed or pasted, then it should be converted to bytes before passing to
eventqueue.push
.Also fixed handling of exceptions while decoding buffer. If exception occurred then buffer should be flushed to prevent mixing it with following control commands (for example "\x1b[201").