gh-131878: Fix input of unicode characters with two or more code points in new pyrepl on Windows #131901

sergey-miryanov · 2025-03-30T12:01:49Z

If unicode characters with two or more code points (ñ or é) are typed or pasted, they should be converted to bytes before being passed to eventqueue.push.
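A minimal sketch of that conversion, assuming a hypothetical `push` callback that accepts one byte value (an int below 256) per call. The helper name `push_text` is illustrative and is not part of pyrepl:

```python
def push_text(text, push, encoding="utf-8"):
    """Encode text and hand it to push() one byte value at a time."""
    for byte in text.encode(encoding):
        push(byte)  # iterating over bytes yields ints in range(256)


received = []
push_text("\u00f1", received.append)  # "ñ" as a single precomposed character
print(received)  # [195, 177]
```

The point is that the queue only ever sees values that fit in one byte, so a character such as ñ arrives as two separate pushes rather than as one oversized value.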

Also fixed the handling of exceptions while decoding the buffer. If an exception occurs, the buffer should be flushed so that its contents are not mixed with the control commands that follow (for example "\x1b[201~").
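One way to picture the flush-on-error idea is the sketch below. It is not the code in Lib/_pyrepl/base_eventqueue.py: `TinyQueue` is a hypothetical stand-in, and it uses Python's incremental decoder (an assumption made for brevity) to tell an incomplete character apart from an invalid one.

```python
import codecs


class TinyQueue:
    """Hypothetical stand-in for an event queue; not the pyrepl class."""

    def __init__(self, encoding="utf-8"):
        self.decoder = codecs.getincrementaldecoder(encoding)()
        self.events = []

    def push(self, byte):
        data = bytes([byte])
        try:
            text = self.decoder.decode(data)
        except UnicodeDecodeError:
            # Flush the pending partial character so it cannot be glued
            # onto the bytes that follow, then decode this byte on its own.
            self.decoder.reset()
            try:
                text = self.decoder.decode(data)
            except UnicodeDecodeError:
                self.decoder.reset()
                return
        if text:
            self.events.append(text)


q = TinyQueue()
for b in b"\xc3" + b"\x1b[201~":  # a dangling lead byte, then a control command
    q.push(b)
print("".join(q.events) == "\x1b[201~")  # True
```

The dangling `0xC3` byte is dropped, and the control command that follows it comes through intact.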

Issue: New REPL on Windows exits when accented character is pasted/typed #131878

- with two codepoints or more

python-cla-bot · 2025-04-06T13:56:24Z

All commit authors signed the Contributor License Agreement.

chris-eibl · 2025-04-14T16:19:37Z

LGTM. I cannot do a thorough review, but at least confirm, that with this PR I can enter äÄöÖüÜ, etc, again 🚀

tomasr8

Tested with a Czech keyboard and everything works. Thanks!


chris-eibl · 2025-04-26T22:42:40Z

LGTM. I cannot do a thorough review, but at least confirm, that with this PR I can enter äÄöÖüÜ, etc, again 🚀

Had some time now: this needs fixes for Linux. Otherwise looks good, thanks @sergey-miryanov!

tomasr8 · 2025-04-27T14:26:30Z

This breaks Linux, because only one char is inserted in UnixConsole.get_event() [...]

Since this wasn't caught by the tests, we should also probably add a test for that.

chris-eibl · 2025-04-27T14:28:08Z

I have suggested a test, test_push_unicode_character_two_bytes, which tests exactly that.


sergey-miryanov · 2025-04-27T17:29:49Z

@chris-eibl Please take a look! I would like to keep the tests for paste mode, to be sure that escape sequences are not mixed with input if an error occurs. But I'm OK with removing them if you think they are not necessary.

chris-eibl · 2025-04-28T07:10:17Z

LGTM. Sure, more tests are always better. Though, AFAIU, there is nothing special about paste mode here:

- test_push_unicode_character_two_bytes_in_paste_mode just asserts that a valid "two byte char" within a sequence of "one byte chars" works as expected.
- Likewise, test_push_unicode_character_as_str_in_paste_mode asserts that when an exception raised while feeding bytes via push is caught, we can continue pushing afterwards.

IMHO, both are reasonable, but "paste mode" makes them look too special about pasting?

So I'd rename them and use arbitrary "one byte chars" in there.
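A rough sketch of what such a renamed test could look like, with arbitrary "one byte chars" around a "two byte char". The helper `decode_pushed` is hypothetical and only imitates feeding bytes one at a time; the real tests in Lib/test/test_pyrepl/test_eventqueue.py exercise the event queue itself.

```python
import codecs
import unittest


def decode_pushed(byte_values, encoding="utf-8"):
    """Hypothetical helper: decode bytes fed one at a time, like repeated push() calls."""
    decoder = codecs.getincrementaldecoder(encoding)()
    out = []
    for value in byte_values:
        text = decoder.decode(bytes([value]))
        if text:  # empty while a multi-byte character is still incomplete
            out.append(text)
    return out


class PushTests(unittest.TestCase):
    def test_two_byte_char_between_one_byte_chars(self):
        data = "a\u00f1b".encode("utf-8")  # b"a\xc3\xb1b"
        self.assertEqual(decode_pushed(data), ["a", "\u00f1", "b"])
```

Saved as a module, this can be run with `python -m unittest`.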

sergey-miryanov · 2025-04-28T07:29:40Z

The sequence of "one byte chars" is a control command that enables "paste mode". Originally, in gh-131878, the problem arose when a wrongly passed unicode string was put into the buffer and got mixed with the following control command that disables paste mode, which then tripped an assertion in that code path. So I want to check that we don't "corrupt" the chars buffer.

chris-eibl · 2025-04-28T07:44:49Z

Yupp, I know. But as said, there is nothing special about bracketed pasting here. From the perspective of the eventqueue test, these are just arbitrary bytes. Whether bracketed pasting is working correctly, is tested in

cpython/Lib/test/test_pyrepl/test_pyrepl.py

Line 1201 in 1b7470f

def test_bracketed_paste(self):

If you'd like to showcase that sequences of "one byte chars" interleaved with "multi byte chars" work correctly via "paste mode", then maybe just add a comment?

And small nit: ~ is missing in the escape sequences, see

cpython/Lib/test/test_pyrepl/test_pyrepl.py

Lines 1227 to 1228 in 1b7470f

    
    paste_start = "\x1b[200~"
    paste_end = "\x1b[201~"

To me, those two tests have merit, but are too special about "paste mode".
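For reference, a small sketch of the byte stream a bracketed paste produces, using the two escape sequences quoted above. The function name `wrap_paste` is made up for illustration and does not exist in pyrepl.

```python
PASTE_START = "\x1b[200~"
PASTE_END = "\x1b[201~"


def wrap_paste(text, encoding="utf-8"):
    """Return the bytes a terminal would send for a bracketed paste of text."""
    return (PASTE_START + text + PASTE_END).encode(encoding)


payload = wrap_paste("\u00e4")  # "ä"
print(payload)  # b'\x1b[200~\xc3\xa4\x1b[201~'
```

From the event queue's point of view this is simply a run of one-byte values with a two-byte character in the middle.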

sergey-miryanov · 2025-04-28T18:09:44Z

@chris-eibl It seems I misunderstood how paste mode works (shame on me!). I removed one test because the same code path is covered by test_push_unicode_character_two_bytes. I also renamed and slightly simplified test_push_single_chars_and_unicode_character_as_str (not sure the name is OK, though).


chris-eibl · 2025-04-28T19:51:41Z

I don't have a better name - the comment in there clarifies what we intend to test (not much tbh).
Just a small nit on the comment.

Otherwise LGTM. Thanks for bearing with me!


sergey-miryanov · 2025-04-29T05:20:07Z

@chris-eibl Thanks for the review and patience!

chris-eibl

Thank you @sergey-miryanov!


tomasr8

Tested again after the simplifications. WFM on both Windows and Linux :)


ambv

Gentlemen, excellent work. Not only does it fix the bug, but it makes the codebase simpler and more elegant. If not for the new test, the net number of lines added here would be negative. Impressive!

miss-islington-app · 2025-05-05T16:45:07Z

Thanks @sergey-miryanov for the PR, and @ambv for merging it 🌮🎉.. I'm working now to backport this PR to: 3.13.
🐍🍒⛏🤖

miss-islington-app · 2025-05-05T16:45:22Z

Sorry, @sergey-miryanov and @ambv, I could not cleanly backport this to 3.13 due to a conflict.
Please backport using cherry_picker on command line.

cherry_picker 0c5151bc81ec8e8588bef4389df12a9ab50e9fa0 3.13

sergey-miryanov · 2025-05-05T17:10:03Z

@ambv Please take a look - this is the same problem with backport as in earlier PR - #130805 (comment)

…ore code points in new pyrepl on Windows (pythongh-131901) (cherry picked from commit 0c5151b) Co-authored-by: Sergey Miryanov <sergey.miryanov@gmail.com> Co-authored-by: Tomas R. <tomas.roun8@gmail.com> Co-authored-by: Chris Eibl <138194463+chris-eibl@users.noreply.github.com>

bedevere-app · 2025-05-05T19:49:49Z

GH-133468 is a backport of this pull request to the 3.13 branch.

…de points in new pyrepl on Windows (gh-131901) (gh-133468) (cherry picked from commit 0c5151b) Co-authored-by: Sergey Miryanov <sergey.miryanov@gmail.com> Co-authored-by: Tomas R. <tomas.roun8@gmail.com> Co-authored-by: Chris Eibl <138194463+chris-eibl@users.noreply.github.com>

bedevere-bot · 2025-05-05T21:30:47Z

⚠️⚠️⚠️ Buildbot failure ⚠️⚠️⚠️

Hi! The buildbot AMD64 Debian root 3.13 (tier-1) has failed when building commit 891232f.

What do you need to do:

Don't panic.
Check the buildbot page in the devguide if you don't know what the buildbots are or how they work.
Go to the page of the buildbot that failed (https://buildbot.python.org/#/builders/1441/builds/1169) and take a look at the build logs.
Check if the failure is related to this commit (891232f) or if it is a false positive.
If the failure is related to this commit, please, reflect that on the issue and make a new Pull Request with a fix.

You can take a look at the buildbot page here:

https://buildbot.python.org/#/builders/1441/builds/1169

Summary of the results of the build (if available):

Click to see traceback logs

remote: Enumerating objects: 26, done.        
remote: Counting objects: 100% (26/26), done.        
remote: Compressing objects: 100% (3/3), done.        
remote: Total 14 (delta 12), reused 12 (delta 11), pack-reused 0 (from 0)        
From https://github.com/python/cpython
 * branch                    3.13       -> FETCH_HEAD
Note: switching to '891232f3386dd8b20a216a473954c1b01cede7ec'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command. Example:

  git switch -c <new-branch-name>

Or undo this operation with:

  git switch -

Turn off this advice by setting config variable advice.detachedHead to false

HEAD is now at 891232f3386 [3.13] gh-131878: Fix input of unicode characters with two or more code points in new pyrepl on Windows (gh-131901) (gh-133468)
Switched to and reset branch '3.13'

configure: WARNING: no system libmpdecimal found; falling back to bundled libmpdecimal (deprecated and scheduled for removal in Python 3.15)
configure: WARNING: pkg-config is missing. Some dependencies may not be detected correctly.

find: ‘build’: No such file or directory
find: ‘build’: No such file or directory
find: ‘build’: No such file or directory
find: ‘build’: No such file or directory
make: [Makefile:3116: clean-retain-profile] Error 1 (ignored)

…e points in new pyrepl on Windows (pythongh-131901) Co-authored-by: Tomas R. <tomas.roun8@gmail.com> Co-authored-by: Chris Eibl <138194463+chris-eibl@users.noreply.github.com>

Fix input of long unicode characters

04812d4

- with two codepoints or more

sergey-miryanov requested review from pablogsal, lysnikolaou and ambv as code owners March 30, 2025 12:01

bedevere-app bot added the awaiting review label Mar 30, 2025

bedevere-app bot mentioned this pull request Mar 30, 2025

New REPL on Windows exits when accented character is pasted/typed #131878

Closed

sergey-miryanov added 2 commits March 30, 2025 19:49

Add new entry

9203df4

Do not change simple_interact

9d7e3e5

picnixz mentioned this pull request Mar 30, 2025

gh-131878: Handle top level exceptions in new pyrepl and prevent of closing it #131910

Merged

chris-eibl mentioned this pull request Apr 20, 2025

GH-132439: Fix REPL swallowing characters entered with AltGr on cmd.exe #132440

Merged

tomasr8 approved these changes Apr 26, 2025

Lib/_pyrepl/base_eventqueue.py (outdated)

Lib/test/test_pyrepl/test_eventqueue.py (outdated)

bedevere-app bot added awaiting core review and removed awaiting review labels Apr 26, 2025

sergey-miryanov and others added 2 commits April 26, 2025 21:31

Do not use bytearray to convert char to bytes

7fd3418

Co-authored-by: Tomas R. <tomas.roun8@gmail.com>

Merge branch 'main' into pythongh-131878-fix-unicode-input-in-new-pyr…

a2fb7ee

…epl-for-long-unicode-chars

chris-eibl reviewed Apr 26, 2025

Lib/_pyrepl/base_eventqueue.py (outdated)

chris-eibl reviewed Apr 26, 2025

Lib/_pyrepl/windows_console.py (outdated)

sergey-miryanov and others added 4 commits April 27, 2025 21:52

Push char one by one in windows_console

b1fa88d

Co-authored-by: Chris Eibl <138194463+chris-eibl@users.noreply.github.com>

Apply Chris Eibl suggestions and use only int and bytes in eventqueue…

925f3a2

….push

Do not use finally to reset keymap

aa56581

Rename tests along Chris Eibl suggestions

1eaeca3

sergey-miryanov added 2 commits April 28, 2025 23:03

Fix tests

68d978e

Add comment

7f8fb7f

chris-eibl reviewed Apr 28, 2025

Lib/test/test_pyrepl/test_eventqueue.py (outdated)

Update comments in the test

9519208

Co-authored-by: Chris Eibl <138194463+chris-eibl@users.noreply.github.com>

chris-eibl approved these changes Apr 29, 2025


tomasr8 reviewed May 4, 2025

Lib/_pyrepl/base_eventqueue.py (outdated)

tomasr8 approved these changes May 4, 2025


Use to_bytes

a855ba8

Co-authored-by: Tomas R. <tomas.roun8@gmail.com>

ambv approved these changes May 5, 2025


bedevere-app bot added awaiting merge and removed awaiting core review labels May 5, 2025

ambv merged commit 0c5151b into python:main May 5, 2025
52 checks passed

bedevere-app bot removed the awaiting merge label May 5, 2025

ambv added the needs backport to 3.13 bugs and security fixes label May 5, 2025

miss-islington-app bot assigned ambv May 5, 2025

bedevere-app bot removed the needs backport to 3.13 bugs and security fixes label May 5, 2025

chris-eibl mentioned this pull request May 6, 2025

gh-131878: Fix input of unicode characters with two or more code points in the REPL on Windows in vt mode #133030

Closed
