-
-
Notifications
You must be signed in to change notification settings - Fork 31.9k
gh-117151: IO performance improvement, increase io.DEFAULT_BUFFER_SIZE to 128k #118144
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool. If this change has little impact on Python users, wait for a maintainer to apply the |
Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool. If this change has little impact on Python users, wait for a maintainer to apply the |
@serhiy-storchaka hello, you were kind to review my first PR for buffering performance (118037), would you be able to review this PR too? |
@morotti You still need to write a news entry for the PR. For your other PR this is was not required, but due to the performance impact of this PR I think we should. The most convenient way for this is to click on the The CLA is not yet working, but it worked for the other PR (which was created after this one), so maybe it will resolve itself automatically in a couple of days |
Pinging @benjaminp as expert on io. Could you review this PR? Are there other benchmarks (besides the one in the corresponding issue) we can perform to test this PR? |
Thank you for reviewing, The CLA check is green now and I added a news entry. |
@eendebakpt @benjaminp could you review? |
@morotti The PR looks good from my side, but I am no core dev so I cannot approve. Currently many core devs are at pycon US, so I think we should wait a bit more. If in a few weeks there has been no further response, we can post a message to discourse (see https://devguide.python.org/getting-started/pull-request-lifecycle/#reviewing). |
Requesting Serhiy's review, since he reviewd the other linked (and merged) PR #118037. |
Misc/NEWS.d/next/Core and Builtins/2024-04-30-14-03-09.gh-issue-117151.yt2H8c.rst
Outdated
Show resolved
Hide resolved
Misc/NEWS.d/next/Core and Builtins/2024-04-30-14-03-09.gh-issue-117151.yt2H8c.rst
Outdated
Show resolved
Hide resolved
Misc/NEWS.d/next/Core and Builtins/2024-04-30-14-03-09.gh-issue-117151.yt2H8c.rst
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was convinced that such a change could be beneficial.
But for the case if it has unexpected effect, please ask on https://discuss.python.org/. Update also the C implementation of open()
which is used by default.
Misc/NEWS.d/next/Core and Builtins/2024-04-30-14-03-09.gh-issue-117151.yt2H8c.rst
Outdated
Show resolved
Hide resolved
Please do not use rebase and force-push. It makes reviewing more difficult. |
I think you were looking into the winconsoleio optimization? I think we should keep it as a separate issue for a separate PR, or we'll never complete anything. ^^ I do have an old branch with comments, from when I was investigating too https://github.com/man-group/cpython/blob/io-buffer-size-archive/Modules/_io/winconsoleio.c#L51 (The comments and the branch are obsolete as some other people have redone some of the buffering already, the 10 year old code was counting characters wrong or something). TL;DR Old versions of windows were buggy with large writes to the console, theoretically above 65k, practically with as little as 26k, so there is code in winconsole to split writes into smaller writes. This is a windows XP era bug. No longer an issue in windows 8 (oldest windows supported by the interpreter). I think we're in agreement that the old code can be removed. I think you have branches and PR to do that. :) |
alright, I've removed the constants _MAXIMUM_BUFFER_SIZE. build green. I think we're good to merge |
Please use |
I've updated the branch to main, anything else you want? |
@cmaloney @gpshead @erlend-aasland the PR is approved and passing builds. can this be merged? |
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
Misc/NEWS.d/next/Library/2024-04-30-14-03-09.gh-issue-117151.yt2H8c.rst
Outdated
Show resolved
Hide resolved
@gpshead I've made the changes, would you be able to review again? |
hello @gpshead @cmaloney @serhiy-storchaka @eendebakpt would you be able to review and unblock the PR? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the change looks good, but this PR was created with the checkbox allowing committers to make direct edits to the PR branch disabled so we can't take care of trivia that may be blocking it ourselves.
can you merge master so that it reruns modern CI?
@gpshead merged with master. sorry, I haven't found any checkbox to allow to make direct edits or I would have ticket. I suspect it's because the fork is in my work organization, maybe that comes with extra restrictions. |
…ER_SIZE to 128k (pythonGH-118144) Co-authored-by: rmorotti <romain.morotti@man.com>
See discussion gh-117151
This patch adjusts the buffer size. That gives 3 to 5 times I/O performance improvement on modern hardware.