[3.13] gh-119517: Fixes for pasting in pyrepl (GH-120253) #120353

miss-islington · 2024-06-11T16:42:19Z

Remove pyrepl's optimization for self-insert

This will be replaced by a less specialized optimization.

Use line-buffering when pyrepl echoes pastes

Previously echoing was totally suppressed until the entire command had
been pasted and the terminal ended paste mode, but this gives the user
no feedback to indicate that an operation is in progress. Drawing
something to the screen once per line strikes a balance between
perceived responsiveness and performance.

Remove dead code from pyrepl

msg_at_bottom is always true.

Speed up pyrepl's screen rendering computation

The Reader in pyrepl doesn't hold a complete representation of the
screen area being drawn as persistent state. Instead, it recomputes it,
on each keypress. This is fast enough for a few hundred bytes, but
incredibly slow as the input buffer grows into the kilobytes (likely
because of pasting).

Rather than making some expensive and expansive changes to the repl's
internal representation of the screen, add some caching: remember some
data from one refresh to the next about what was drawn to the screen
and, if we don't find anything that has invalidated the results that
were computed last time around, reuse them. To keep this caching as
simple as possible, all we'll do is look for lines in the buffer that
were above the cursor the last time we were asked to update the screen,
and that are still above the cursor now. We assume that nothing can
affect a line that comes before both the old and new cursor location
without us being informed. Based on this assumption, we can reuse old
lines, which drastically speeds up the overwhelmingly common case where
the user is typing near the end of the buffer.

Speed up pyrepl prompt drawing

Cache the can_colorize() call rather than repeatedly recomputing it.
This call looks up an environment variable, and is called once per
character typed at the REPL. The environment variable lookup shows up as
a hot spot when profiling, and we don't expect this to change while the
REPL is running.

Speed up pasting multiple lines into the REPL

Previously, we were checking whether the command should be accepted each
time a line break was encountered, but that's not the expected behavior.
In bracketed paste mode, we expect everything pasted to be part of
a single block of code, and encountering a newline shouldn't behave like
a user pressing to execute a command. The user should always
have a chance to review the pasted command before running it.

Use a read buffer for input in pyrepl

Previously we were reading one byte at a time, which causes much slower
IO than necessary. Instead, read in chunks, processing previously read
data before asking for more.

Optimize finding width of a single character

wlen finds the width of a multi-character string by adding up the
width of each character, and then subtracting the width of any escape
sequences. It's often called for single character strings, however,
which can't possibly contain escape sequences. Optimize for that case.

Optimize disp_str for ASCII characters

Since every ASCII character is known to display as single width, we can
avoid not only the Unicode data lookup in disp_str but also the one
hidden in str_width for them.

Speed up cursor movements in long pyrepl commands

When the current pyrepl command buffer contains many lines, scrolling up
becomes slow. We have optimizations in place to reuse lines above the
cursor position from one refresh to the next, but don't currently try to
reuse lines below the cursor position in the same way, so we wind up
with quadratic behavior where all lines of the buffer below the cursor
are recomputed each time the cursor moves up another line.

Optimize this by only computing one screen's worth of lines beyond the
cursor position. Any lines beyond that can't possibly be shown by the
console, and bounding this makes scrolling up have linear time
complexity instead.

(cherry picked from commit 32a0fab)

Co-authored-by: Matt Wozniski mwozniski@bloomberg.net
Signed-off-by: Matt Wozniski mwozniski@bloomberg.net
Co-authored-by: Pablo Galindo pablogsal@gmail.com

Issue: Odd behavior from pasting large text in the new REPL #119517

* Remove pyrepl's optimization for self-insert This will be replaced by a less specialized optimization. * Use line-buffering when pyrepl echoes pastes Previously echoing was totally suppressed until the entire command had been pasted and the terminal ended paste mode, but this gives the user no feedback to indicate that an operation is in progress. Drawing something to the screen once per line strikes a balance between perceived responsiveness and performance. * Remove dead code from pyrepl `msg_at_bottom` is always true. * Speed up pyrepl's screen rendering computation The Reader in pyrepl doesn't hold a complete representation of the screen area being drawn as persistent state. Instead, it recomputes it, on each keypress. This is fast enough for a few hundred bytes, but incredibly slow as the input buffer grows into the kilobytes (likely because of pasting). Rather than making some expensive and expansive changes to the repl's internal representation of the screen, add some caching: remember some data from one refresh to the next about what was drawn to the screen and, if we don't find anything that has invalidated the results that were computed last time around, reuse them. To keep this caching as simple as possible, all we'll do is look for lines in the buffer that were above the cursor the last time we were asked to update the screen, and that are still above the cursor now. We assume that nothing can affect a line that comes before both the old and new cursor location without us being informed. Based on this assumption, we can reuse old lines, which drastically speeds up the overwhelmingly common case where the user is typing near the end of the buffer. * Speed up pyrepl prompt drawing Cache the `can_colorize()` call rather than repeatedly recomputing it. This call looks up an environment variable, and is called once per character typed at the REPL. The environment variable lookup shows up as a hot spot when profiling, and we don't expect this to change while the REPL is running. * Speed up pasting multiple lines into the REPL Previously, we were checking whether the command should be accepted each time a line break was encountered, but that's not the expected behavior. In bracketed paste mode, we expect everything pasted to be part of a single block of code, and encountering a newline shouldn't behave like a user pressing <Enter> to execute a command. The user should always have a chance to review the pasted command before running it. * Use a read buffer for input in pyrepl Previously we were reading one byte at a time, which causes much slower IO than necessary. Instead, read in chunks, processing previously read data before asking for more. * Optimize finding width of a single character `wlen` finds the width of a multi-character string by adding up the width of each character, and then subtracting the width of any escape sequences. It's often called for single character strings, however, which can't possibly contain escape sequences. Optimize for that case. * Optimize disp_str for ASCII characters Since every ASCII character is known to display as single width, we can avoid not only the Unicode data lookup in `disp_str` but also the one hidden in `str_width` for them. * Speed up cursor movements in long pyrepl commands When the current pyrepl command buffer contains many lines, scrolling up becomes slow. We have optimizations in place to reuse lines above the cursor position from one refresh to the next, but don't currently try to reuse lines below the cursor position in the same way, so we wind up with quadratic behavior where all lines of the buffer below the cursor are recomputed each time the cursor moves up another line. Optimize this by only computing one screen's worth of lines beyond the cursor position. Any lines beyond that can't possibly be shown by the console, and bounding this makes scrolling up have linear time complexity instead. --------- (cherry picked from commit 32a0fab) Co-authored-by: Matt Wozniski <mwozniski@bloomberg.net> Signed-off-by: Matt Wozniski <mwozniski@bloomberg.net> Co-authored-by: Pablo Galindo <pablogsal@gmail.com>

This was referenced Jun 11, 2024

Odd behavior from pasting large text in the new REPL #119517

Closed

gh-119517: Fixes for pasting in pyrepl #120253

Merged

bedevere-app bot added skip news topic-repl Related to the interactive shell awaiting review labels Jun 11, 2024

pablogsal enabled auto-merge (squash) June 11, 2024 16:42

pablogsal approved these changes Jun 11, 2024

View reviewed changes

bedevere-app bot added awaiting merge and removed awaiting review labels Jun 11, 2024

pablogsal merged commit 4936575 into python:3.13 Jun 11, 2024

bedevere-app bot removed the awaiting merge label Jun 11, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

[3.13] gh-119517: Fixes for pasting in pyrepl (GH-120253) #120353

[3.13] gh-119517: Fixes for pasting in pyrepl (GH-120253) #120353

Uh oh!

miss-islington commented Jun 11, 2024 •

edited by bedevere-app bot

Loading

Uh oh!

Uh oh!

Uh oh!

[3.13] gh-119517: Fixes for pasting in pyrepl (GH-120253) #120353

[3.13] gh-119517: Fixes for pasting in pyrepl (GH-120253) #120353

Uh oh!

Conversation

miss-islington commented Jun 11, 2024 • edited by bedevere-app bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

miss-islington commented Jun 11, 2024 •

edited by bedevere-app bot

Loading