-
-
Notifications
You must be signed in to change notification settings - Fork 32.1k
bpo-36670, regrtest: Fix WindowsLoadTracker() for partial line #16550
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WindowsLoadTracker.read_output() now uses a short buffer for incomplete line.
I managed to reproduce the bug https://bugs.python.org/issue36670#msg353768 using this change:
I tested manually that this change fix the bug. |
Perhaps it's worth a comment in the code for future readers that due to when typeperf generates the new line (before each interval's output), waiting for it will always delay the current value by the sampling interval? |
@db3l: "Perhaps it's worth a comment in the code for future readers that due to when typeperf generates the new line (before each interval's output), waiting for it will always delay the current value by the sampling interval?" Sorry, I didn't understand your comment. "waiting for it": do you mean calling ReadFile() again would block? My change fixed the issue for me. So I merged the PR. I understand that previously, the code dropped some samples depending if the line was fully written into the pipe or not. |
No, I was just referring to the timing of processing each sample. Your change should definitely fix the original issue of reading a partial sample line, but it now also delays each sample by the sampling interval. That is, in my testing, typeperf generates output at a given sampling interval of: E.g., it outputs the newline at the beginning of each sample. So by waiting for the LF, you won't return the line with sample 'n' until sample 'n+1' starts being generated, thus introducing a sample time delay that wasn't there before. Hardly a major issue (particularly with the current 5s sampling interval). But if I were just reading the code, I'd probably guess the CRLF immediately followed each sample, and wouldn't expect waiting for a complete line to introduce much latency. So it seemed potentially comment-worthy. |
Oh, I see. I understand that typeperf calls write() 3 times: write("\r\n") But there are also special cases for blank lines at header line at the beginning... Oh... I didn't expect that it would be so complex to parse a command output :-( I see two options:
I suggest to leave the code as it is. It's only used for debug purpose and it's only used in regrtest. @db3l: If you come up with a comment, I can add it to the code :-) |
Without access to typeperf's source we're just guessing, but yes, that's how I expect it's working (I suppose the initial "\r\n" might be combined with the timestamp into a single write() call at the start of each sample). As for blank lines this change actually generates more of them (one for every sample) due to the CRLF being at the front of the buffer, but they just get skipped. I do think there's a third choice, which is just the older behavior (ignore parse issues without a warning). It feels wrong knowing there's a race condition, but it introduces no additional latency and ignoring the partial reads doesn't break anything. That's what's been happening prior to this point anyway. However, for properly handling partial lines, your approach seems the simplest aside from the latency question (which seems minor as long as the sampling interval remains small). Trying to identify complete lines other than via newline feels like it would be fragile, or need to involve its own latency anyway, such as a timeout. As for a comment, maybe just after your self._buffer comment, something like: "typeperf generates CRLF before each sample, rather than after, so this will introduce a SAMPLING_INTERVAL delay" |
Well, I found a simple solution: parse the last line. If it's valid, pass it the parser. Otherwise, put it in a buffer... No regex, no parser state. See my PR #16553. |
…H-16550) (GH-16560) * bpo-36670, regrtest: Fix WindowsLoadTracker() for partial line (GH-16550) WindowsLoadTracker.read_output() now uses a short buffer for incomplete line. (cherry picked from commit 3e04cd2) * bpo-36670: Enhance regrtest WindowsLoadTracker (GH-16553) The last line is now passed to the parser even if it does not end with a newline, but only if it's a valid value. (cherry picked from commit c65119d) * bpo-36670: Enhance regrtest (GH-16556) * Add log() method: add timestamp and load average prefixes to main messages. * WindowsLoadTracker: * LOAD_FACTOR_1 is now computed using SAMPLING_INTERVAL * Initialize the load to the arithmetic mean of the first 5 values of the Processor Queue Length value (so over 5 seconds), rather than 0.0. * Handle BrokenPipeError and when typeperf exit. * format_duration(1.5) now returns '1.5 sec', rather than '1 sec 500 ms' (cherry picked from commit 098e256)
…ythonGH-16550) (pythonGH-16560) * bpo-36670, regrtest: Fix WindowsLoadTracker() for partial line (pythonGH-16550) WindowsLoadTracker.read_output() now uses a short buffer for incomplete line. (cherry picked from commit 3e04cd2) * bpo-36670: Enhance regrtest WindowsLoadTracker (pythonGH-16553) The last line is now passed to the parser even if it does not end with a newline, but only if it's a valid value. (cherry picked from commit c65119d) * bpo-36670: Enhance regrtest (pythonGH-16556) * Add log() method: add timestamp and load average prefixes to main messages. * WindowsLoadTracker: * LOAD_FACTOR_1 is now computed using SAMPLING_INTERVAL * Initialize the load to the arithmetic mean of the first 5 values of the Processor Queue Length value (so over 5 seconds), rather than 0.0. * Handle BrokenPipeError and when typeperf exit. * format_duration(1.5) now returns '1.5 sec', rather than '1 sec 500 ms' (cherry picked from commit 098e256) (cherry picked from commit de3195c) Co-authored-by: Victor Stinner <vstinner@python.org>
…H-16550) (GH-16560) * bpo-36670, regrtest: Fix WindowsLoadTracker() for partial line (GH-16550) WindowsLoadTracker.read_output() now uses a short buffer for incomplete line. (cherry picked from commit 3e04cd2) * bpo-36670: Enhance regrtest WindowsLoadTracker (GH-16553) The last line is now passed to the parser even if it does not end with a newline, but only if it's a valid value. (cherry picked from commit c65119d) * bpo-36670: Enhance regrtest (GH-16556) * Add log() method: add timestamp and load average prefixes to main messages. * WindowsLoadTracker: * LOAD_FACTOR_1 is now computed using SAMPLING_INTERVAL * Initialize the load to the arithmetic mean of the first 5 values of the Processor Queue Length value (so over 5 seconds), rather than 0.0. * Handle BrokenPipeError and when typeperf exit. * format_duration(1.5) now returns '1.5 sec', rather than '1 sec 500 ms' (cherry picked from commit 098e256) (cherry picked from commit de3195c) Co-authored-by: Victor Stinner <vstinner@python.org>
…nGH-16550) WindowsLoadTracker.read_output() now uses a short buffer for incomplete line.
WindowsLoadTracker.read_output() now uses a short buffer for
incomplete line.
https://bugs.python.org/issue36670