bpo-36670, regrtest: Fix WindowsLoadTracker() for partial line #16550

vstinner · 2019-10-02T22:40:40Z

WindowsLoadTracker.read_output() now uses a short buffer for
incomplete line.

https://bugs.python.org/issue36670

WindowsLoadTracker.read_output() now uses a short buffer for incomplete line.

vstinner · 2019-10-02T22:45:15Z

I managed to reproduce the bug https://bugs.python.org/issue36670#msg353768 using this change:

diff --git a/Lib/test/libregrtest/win_utils.py b/Lib/test/libregrtest/win_utils.py
index f0c17b906f..78429faa89 100644
--- a/Lib/test/libregrtest/win_utils.py
+++ b/Lib/test/libregrtest/win_utils.py
@@ -14,7 +14,7 @@ BUFSIZE = 8192
 LOAD_FACTOR_1 = 0.9200444146293232478931553241
 
 # Seconds per measurement
-SAMPLING_INTERVAL = 5
+SAMPLING_INTERVAL = 0
 # Windows registry subkey of HKEY_LOCAL_MACHINE where the counter names
 # of typeperf are registered
 COUNTER_REGISTRY_KEY = (r"SOFTWARE\Microsoft\Windows NT\CurrentVersion"

I tested manually that this change fix the bug.

db3l · 2019-10-02T23:00:44Z

Perhaps it's worth a comment in the code for future readers that due to when typeperf generates the new line (before each interval's output), waiting for it will always delay the current value by the sampling interval?

vstinner · 2019-10-02T23:08:25Z

@db3l: "Perhaps it's worth a comment in the code for future readers that due to when typeperf generates the new line (before each interval's output), waiting for it will always delay the current value by the sampling interval?"

Sorry, I didn't understand your comment.

"waiting for it": do you mean calling ReadFile() again would block?

My change fixed the issue for me. So I merged the PR. I understand that previously, the code dropped some samples depending if the line was fully written into the pipe or not.

db3l · 2019-10-02T23:17:30Z

No, I was just referring to the timing of processing each sample. Your change should definitely fix the original issue of reading a partial sample line, but it now also delays each sample by the sampling interval.

That is, in my testing, typeperf generates output at a given sampling interval of:
CRLF sample data (sampling delay)
CRLF sample data (sampling delay)
etc...

E.g., it outputs the newline at the beginning of each sample. So by waiting for the LF, you won't return the line with sample 'n' until sample 'n+1' starts being generated, thus introducing a sample time delay that wasn't there before.

Hardly a major issue (particularly with the current 5s sampling interval). But if I were just reading the code, I'd probably guess the CRLF immediately followed each sample, and wouldn't expect waiting for a complete line to introduce much latency. So it seemed potentially comment-worthy.

vstinner · 2019-10-02T23:32:11Z

Oh, I see. I understand that typeperf calls write() 3 times:

write("\r\n")
write(timestamp)
write(value)

But there are also special cases for blank lines at header line at the beginning... Oh... I didn't expect that it would be so complex to parse a command output :-(

I see two options:

do nothing and keep the current sub-optimal code which "works" but with a delay of 5 seconds
experimental smarter workaround for incomplete lines: need to detect once we saw all header and blank lines, and then ensure that each line looks complete (contains ","? use a regex?)

I suggest to leave the code as it is. It's only used for debug purpose and it's only used in regrtest.

@db3l: If you come up with a comment, I can add it to the code :-)

db3l · 2019-10-03T00:00:27Z

Without access to typeperf's source we're just guessing, but yes, that's how I expect it's working (I suppose the initial "\r\n" might be combined with the timestamp into a single write() call at the start of each sample). As for blank lines this change actually generates more of them (one for every sample) due to the CRLF being at the front of the buffer, but they just get skipped.

I do think there's a third choice, which is just the older behavior (ignore parse issues without a warning). It feels wrong knowing there's a race condition, but it introduces no additional latency and ignoring the partial reads doesn't break anything. That's what's been happening prior to this point anyway.

However, for properly handling partial lines, your approach seems the simplest aside from the latency question (which seems minor as long as the sampling interval remains small). Trying to identify complete lines other than via newline feels like it would be fragile, or need to involve its own latency anyway, such as a timeout.

As for a comment, maybe just after your self._buffer comment, something like:

"typeperf generates CRLF before each sample, rather than after, so this will introduce a SAMPLING_INTERVAL delay"

vstinner · 2019-10-03T07:51:39Z

experimental smarter workaround for incomplete lines: need to detect once we saw all header and blank lines, and then ensure that each line looks complete (contains ","? use a regex?)

Well, I found a simple solution: parse the last line. If it's valid, pass it the parser. Otherwise, put it in a buffer... No regex, no parser state. See my PR #16553.

…H-16550) (GH-16560) * bpo-36670, regrtest: Fix WindowsLoadTracker() for partial line (GH-16550) WindowsLoadTracker.read_output() now uses a short buffer for incomplete line. (cherry picked from commit 3e04cd2) * bpo-36670: Enhance regrtest WindowsLoadTracker (GH-16553) The last line is now passed to the parser even if it does not end with a newline, but only if it's a valid value. (cherry picked from commit c65119d) * bpo-36670: Enhance regrtest (GH-16556) * Add log() method: add timestamp and load average prefixes to main messages. * WindowsLoadTracker: * LOAD_FACTOR_1 is now computed using SAMPLING_INTERVAL * Initialize the load to the arithmetic mean of the first 5 values of the Processor Queue Length value (so over 5 seconds), rather than 0.0. * Handle BrokenPipeError and when typeperf exit. * format_duration(1.5) now returns '1.5 sec', rather than '1 sec 500 ms' (cherry picked from commit 098e256)

…ythonGH-16550) (pythonGH-16560) * bpo-36670, regrtest: Fix WindowsLoadTracker() for partial line (pythonGH-16550) WindowsLoadTracker.read_output() now uses a short buffer for incomplete line. (cherry picked from commit 3e04cd2) * bpo-36670: Enhance regrtest WindowsLoadTracker (pythonGH-16553) The last line is now passed to the parser even if it does not end with a newline, but only if it's a valid value. (cherry picked from commit c65119d) * bpo-36670: Enhance regrtest (pythonGH-16556) * Add log() method: add timestamp and load average prefixes to main messages. * WindowsLoadTracker: * LOAD_FACTOR_1 is now computed using SAMPLING_INTERVAL * Initialize the load to the arithmetic mean of the first 5 values of the Processor Queue Length value (so over 5 seconds), rather than 0.0. * Handle BrokenPipeError and when typeperf exit. * format_duration(1.5) now returns '1.5 sec', rather than '1 sec 500 ms' (cherry picked from commit 098e256) (cherry picked from commit de3195c) Co-authored-by: Victor Stinner <vstinner@python.org>

…H-16550) (GH-16560) * bpo-36670, regrtest: Fix WindowsLoadTracker() for partial line (GH-16550) WindowsLoadTracker.read_output() now uses a short buffer for incomplete line. (cherry picked from commit 3e04cd2) * bpo-36670: Enhance regrtest WindowsLoadTracker (GH-16553) The last line is now passed to the parser even if it does not end with a newline, but only if it's a valid value. (cherry picked from commit c65119d) * bpo-36670: Enhance regrtest (GH-16556) * Add log() method: add timestamp and load average prefixes to main messages. * WindowsLoadTracker: * LOAD_FACTOR_1 is now computed using SAMPLING_INTERVAL * Initialize the load to the arithmetic mean of the first 5 values of the Processor Queue Length value (so over 5 seconds), rather than 0.0. * Handle BrokenPipeError and when typeperf exit. * format_duration(1.5) now returns '1.5 sec', rather than '1 sec 500 ms' (cherry picked from commit 098e256) (cherry picked from commit de3195c) Co-authored-by: Victor Stinner <vstinner@python.org>

…nGH-16550) WindowsLoadTracker.read_output() now uses a short buffer for incomplete line.

bpo-36670, regrtest: Fix WindowsLoadTracker() for partial line

140ec9a

WindowsLoadTracker.read_output() now uses a short buffer for incomplete line.

vstinner added the skip news label Oct 2, 2019

the-knights-who-say-ni added the CLA signed label Oct 2, 2019

bedevere-bot added the awaiting core review label Oct 2, 2019

vstinner merged commit 3e04cd2 into python:master Oct 2, 2019

vstinner deleted the typeperf_fix branch October 2, 2019 23:04

bedevere-bot removed the awaiting core review label Oct 2, 2019

miss-islington mentioned this pull request Oct 3, 2019

[3.7] [3.8] bpo-36670, regrtest: Fix WindowsLoadTracker() for partial line (GH-16550) (GH-16560) #16562

Merged

jacobneiltaylor pushed a commit to jacobneiltaylor/cpython that referenced this pull request Dec 5, 2019

bpo-36670, regrtest: Fix WindowsLoadTracker() for partial line (pytho…

325aef9

…nGH-16550) WindowsLoadTracker.read_output() now uses a short buffer for incomplete line.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

bpo-36670, regrtest: Fix WindowsLoadTracker() for partial line #16550

bpo-36670, regrtest: Fix WindowsLoadTracker() for partial line #16550

Uh oh!

vstinner commented Oct 2, 2019 •

edited by bedevere-bot

Loading

Uh oh!

vstinner commented Oct 2, 2019

Uh oh!

db3l commented Oct 2, 2019

Uh oh!

vstinner commented Oct 2, 2019

Uh oh!

db3l commented Oct 2, 2019 •

edited

Loading

Uh oh!

vstinner commented Oct 2, 2019

Uh oh!

db3l commented Oct 3, 2019

Uh oh!

vstinner commented Oct 3, 2019

Uh oh!

Uh oh!

Uh oh!

bpo-36670, regrtest: Fix WindowsLoadTracker() for partial line #16550

bpo-36670, regrtest: Fix WindowsLoadTracker() for partial line #16550

Uh oh!

Conversation

vstinner commented Oct 2, 2019 • edited by bedevere-bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

vstinner commented Oct 2, 2019

Uh oh!

db3l commented Oct 2, 2019

Uh oh!

vstinner commented Oct 2, 2019

Uh oh!

db3l commented Oct 2, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

vstinner commented Oct 2, 2019

Uh oh!

db3l commented Oct 3, 2019

Uh oh!

vstinner commented Oct 3, 2019

Uh oh!

Uh oh!

vstinner commented Oct 2, 2019 •

edited by bedevere-bot

Loading

db3l commented Oct 2, 2019 •

edited

Loading