-
-
Notifications
You must be signed in to change notification settings - Fork 31.9k
gh-117151: increase default buffer size of shutil.copyfileobj() to 256k. #119783
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
… to 256k. it was set to 16k in the 1990s. it was raised to 64k in 2019. the discussion at the time mentioned another 5% improvement by raising to 128k and settled for a very conservative setting. it's 2024 now, I think it should be revisited to match modern hardware. I am measuring 0-15% performance improvement when raising to 256k on various types of disk. there is no downside as far as I can tell. this function is only intended for sequential copy of full files (or file like objects). it's the typical use case that benefits from larger operations. for reference, I came across this function while trying to profile pip that is using it to copy files when installing python packages.
I think we can skip the NEWS entry. it's not significant. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please check the box in github to allow others to push changes to your PR branch.
This needs a NEWS entry.
Misc/NEWS.d/next/Library/2024-10-03-05-00-25.gh-issue-117151.Prdw_W.rst:
The default buffer size used by :func:`shutil.copyfileobj` has been
increased from 64k to 256k on non-Windows platforms. It was already larger
on Windows.
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
I have made the requested changes; please review again Thanks for reading the PR. I added a news entry. I don't have any tickbox to allow maintainers to edit. Maybe because it's an organization repo instead of a personal fork. |
Thanks for making the requested changes! @gpshead: please review the changes made to this pull request. |
Hello,
As part of other work looking into I/O and buffering optimizations (see github ticket).
I am offering to increase the default buffer size of shutil.copyfileobj() to 256k.
it was set to 16k in the 1990s.
it was raised to 64k in 2019. the discussion at the time mentioned another 5% improvement by raising to 128k and settled for a very conservative setting.
it's 2024 now, I think it should be revisited to match modern hardware. I am measuring 0-15% performance improvement when raising to 256k on various types of disk. there is no downside as far as I can tell.
this function is only intended for sequential copy of full files (or file like objects). it's the typical use case that benefits from larger operations.
for reference, I came across this function while trying to profile pip that is using it to copy files when installing python packages.