gh-117151: increase default buffer size of shutil.copyfileobj() to 256k. #119783

Merged
gpshead merged 2 commits into python:main on Oct 4, 2024

Conversation

morotti
Contributor

@morotti morotti commented May 30, 2024

Hello,

This came out of other work looking into I/O and buffering optimizations (see the GitHub issue).

I am proposing to increase the default buffer size of shutil.copyfileobj() to 256k.
It was set to 16k in the 1990s.
It was raised to 64k in 2019. The discussion at the time mentioned another 5% improvement from raising it to 128k, but settled on a very conservative setting.

It's 2024 now, and I think the value should be revisited to match modern hardware. I am measuring a 0-15% performance improvement when raising it to 256k on various types of disk. There is no downside as far as I can tell.
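
For anyone who wants to try a comparison like this on their own hardware, here is a rough timing sketch. It is not the benchmark behind the numbers above: the helper names (time_copy, compare_buffer_sizes), the 8 MiB test file and the list of buffer sizes are all assumptions made for illustration, and results will depend heavily on the disk and on the OS page cache.

```python
import os
import shutil
import tempfile
import time


def time_copy(src_path, buf_size, repeats=3):
    """Return the best wall-clock time (seconds) over `repeats` copies."""
    best = float("inf")
    for _ in range(repeats):
        with open(src_path, "rb") as src, tempfile.TemporaryFile() as dst:
            start = time.perf_counter()
            # The third argument is the chunk size used by the read/write loop.
            shutil.copyfileobj(src, dst, buf_size)
            best = min(best, time.perf_counter() - start)
    return best


def compare_buffer_sizes(size_bytes=8 * 1024 * 1024, sizes_kib=(16, 64, 128, 256)):
    """Time copies of one temporary file with several buffer sizes (hypothetical helper)."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(os.urandom(size_bytes))
        return {kib: time_copy(path, kib * 1024) for kib in sizes_kib}
    finally:
        os.remove(path)


if __name__ == "__main__":
    for kib, seconds in compare_buffer_sizes().items():
        print(f"{kib:>4}k: {seconds * 1000:.2f} ms")
```

A file this small will mostly be served from cache, so a serious measurement would need much larger files and several runs per storage type.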

This function is only intended for sequential copies of full files (or file-like objects), which is the typical use case that benefits from larger operations.

For reference, I came across this function while profiling pip, which uses it to copy files when installing Python packages.
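
Callers who want a larger buffer without waiting for a new default can already pass the size explicitly as the third argument of shutil.copyfileobj(). A minimal sketch follows; the copy_stream wrapper and the BUF_SIZE name are hypothetical, and the in-memory streams only stand in for real files opened in binary mode.

```python
import io
import shutil

BUF_SIZE = 256 * 1024  # 256 KiB, the value proposed in this PR


def copy_stream(src, dst, buf_size=BUF_SIZE):
    """Copy everything from src to dst, reading buf_size bytes at a time."""
    shutil.copyfileobj(src, dst, buf_size)


# Small in-memory demonstration with a payload slightly over 1 MiB.
payload = b"x" * (1024 * 1024 + 123)
source = io.BytesIO(payload)
target = io.BytesIO()
copy_stream(source, target)
copied_ok = target.getvalue() == payload
```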

@morotti
Contributor Author

morotti commented May 30, 2024

I think we can skip the NEWS entry. It's not significant.

Member

@gpshead gpshead left a comment

Please check the box in GitHub to allow others to push changes to your PR branch.

This needs a NEWS entry.

Misc/NEWS.d/next/Library/2024-10-03-05-00-25.gh-issue-117151.Prdw_W.rst:

The default buffer size used by :func:`shutil.copyfileobj` has been
increased from 64k to 256k on non-Windows platforms.  It was already larger
on Windows.
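
To see which default a given interpreter uses, one option is to read the module-level constant in shutil. This is only a sketch: COPY_BUFSIZE is an undocumented implementation detail of CPython's shutil module, not a public API, so the code falls back to None when the attribute is missing. The describe_default_bufsize name is hypothetical.

```python
import shutil
import sys


def describe_default_bufsize():
    """Return shutil's default copy buffer size in bytes, or None if unavailable."""
    # COPY_BUFSIZE is an undocumented CPython detail; do not rely on it in production code.
    return getattr(shutil, "COPY_BUFSIZE", None)


size = describe_default_bufsize()
if size is None:
    print("This Python does not expose shutil.COPY_BUFSIZE")
else:
    print(f"{sys.platform}: default copy buffer is {size // 1024}k")
```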

@bedevere-app

bedevere-app bot commented Oct 3, 2024

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@morotti
Contributor Author

morotti commented Oct 3, 2024

I have made the requested changes; please review again

Thanks for reading the PR. I added a NEWS entry.

I don't have any tickbox to allow maintainers to edit. Maybe because it's an organization repo instead of a personal fork.

@bedevere-app

bedevere-app bot commented Oct 3, 2024

Thanks for making the requested changes!

@gpshead: please review the changes made to this pull request.

@bedevere-app bedevere-app bot requested a review from gpshead October 3, 2024 11:59
@gpshead gpshead merged commit 6efd95c into python:main Oct 4, 2024
35 of 38 checks passed
@morotti morotti deleted the shutil-copy-buf-size branch February 5, 2025 11:39