bpo-40049: Check if symlink exists when extracting from tarfile #19187
Hello, and thanks for your contribution! I'm a bot set up to make sure that the project can legally accept this contribution by verifying everyone involved has signed the PSF contributor agreement (CLA). We couldn't find a bugs.python.org (b.p.o) account corresponding to the following GitHub usernames: This might be simply due to a missing "GitHub Name" entry in one's b.p.o account settings. This is necessary for legal reasons before we can look at this contribution. Please follow the steps outlined in the CPython devguide to rectify this issue. You can check yourself to see if the CLA has been received. Thanks again for the contribution, we look forward to reviewing it!
Is there anyone available to review this?
I'm pretty sure this should have a unit test.
Misc/NEWS.d/next/Library/2020-03-27-20-49-32.bpo-40049.8079ca.rst
Co-authored-by: Zackery Spytz <zspytz@gmail.com>
@ZackerySpytz thanks for taking a look. I've added a unit test and confirmed that it fails on master and passes on this branch with both Windows 10 and Ubuntu.
Closing and re-opening to restart the Travis-CI check.
GNU tar (v1.30, Ubuntu 20.04) does indeed overwrite files with symlinks upon extracting, while both |
This looks good, but it seems to me that the root issue here is not the backwards seek but the incorrect behavior of not overwriting existing files with symlinks.
If you agree, @jonnyhsu, please change the reasoning in the test comment and the NEWS entry accordingly, and I'd be happy to merge this.
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
@@ -2232,6 +2232,8 @@ def makelink(self, tarinfo, targetpath):
        try:
            # For systems that support symbolic and hard links.
            if tarinfo.issym():
                if os.path.lexists(targetpath):
Perhaps include a short comment here explaining why this is needed? Something along the lines of "tar should overwrite existing files with symlinks, but os.symlink raises an exception rather than overwriting."
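The check-then-overwrite idea discussed in this review can be sketched outside of tarfile. The diff excerpt above stops at the `os.path.lexists()` check, so the removal step below is an assumption about what follows it, and `force_symlink` is a hypothetical helper name, not part of tarfile or of this PR.

```python
import os


def force_symlink(linkname, targetpath):
    """Create a symlink at targetpath pointing to linkname, replacing any
    existing file or symlink there.

    Hypothetical helper for illustration only; not tarfile's implementation.
    """
    # lexists() is also true for dangling symlinks, which exists() would miss.
    if os.path.lexists(targetpath):
        # Assumed removal step: os.unlink() handles files and symlinks,
        # but it fails if targetpath is a directory.
        os.unlink(targetpath)
    os.symlink(linkname, targetpath)
```

Note that the check and the removal are two separate steps, so this sketch is not atomic: another process could recreate the path between `os.unlink()` and `os.symlink()`.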
When extracting a tarfile, os.symlink() raises an exception if the symlink's path already exists. tarfile then assumes that the platform does not support symlinks and scans the entire archive for the link's destination file. With a normal file this goes unnoticed, but when processing stream data it raises a StreamError, because tarfile must seek backwards to resume extraction where it left off.
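The underlying `os.symlink()` behavior can be reproduced without tarfile. The following is a minimal sketch, assuming a POSIX system (or Windows with symlink privileges); `symlink_collision_demo` and the path names are made up for the example.

```python
import os
import tempfile


def symlink_collision_demo():
    """Create the same dangling symlink twice and report what happens.

    Returns (raised, exists, lexists). Illustrative only.
    """
    with tempfile.TemporaryDirectory() as tmp:
        link = os.path.join(tmp, "link")
        # First call succeeds; the target does not exist, so the link dangles.
        os.symlink("missing-target", link)
        try:
            # Second call hits the existing path and does not overwrite it.
            os.symlink("missing-target", link)
            raised = False
        except FileExistsError:
            raised = True
        # exists() follows the link; lexists() looks at the link itself.
        return raised, os.path.exists(link), os.path.lexists(link)
```

On a dangling link, `os.path.exists()` reports `False` while `os.path.lexists()` reports `True`, which is one reason to check with `lexists()` before creating the link.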
https://bugs.python.org/issue40049