Skip to content

tarfile.py: TarFile.addfile not adding all files #116931

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
elehcimd opened this issue Mar 17, 2024 · 3 comments
Closed

tarfile.py: TarFile.addfile not adding all files #116931

elehcimd opened this issue Mar 17, 2024 · 3 comments
Labels
3.11 only security fixes 3.12 only security fixes 3.13 bugs and security fixes type-bug An unexpected behavior, bug, or error

Comments

@elehcimd
Copy link

elehcimd commented Mar 17, 2024

Bug report

Bug description:

I would expect both files A and B to be stored in the tar file. However, only A is archived.

# creating the test directory
!rm -rf test1.tar test1
!mkdir test1
!echo thisisa >test1/A
!echo thisisb >test1/B
import tarfile
archive = tarfile.open("test1.tar", mode="w", format=tarfile.GNU_FORMAT)
archive.addfile(archive.gettarinfo(name="test1/A"))
archive.addfile(archive.gettarinfo(name="test1/B"))
archive.close()
print(tarfile.open("test1.tar", mode="r").getnames())

Expected output:

['test1/A', 'test1/B']

Returned output:

['test1/A']

Reproduced on these Python versions:

Python 3.11.6 (main, Nov 28 2023, 09:22:32) [Clang 14.0.0 (clang-1400.0.29.202)] on darwin
Python 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0] on linux

CPython versions tested on:

3.11

Operating systems tested on:

macOS

Linked PRs

@elehcimd elehcimd added the type-bug An unexpected behavior, bug, or error label Mar 17, 2024
@serhiy-storchaka serhiy-storchaka added 3.11 only security fixes 3.12 only security fixes 3.13 bugs and security fixes labels Mar 18, 2024
@serhiy-storchaka
Copy link
Member

gettarinfo() creates a TarInfo object from the result of os.stat(). In particularly it sets the size of a TarInfo object.

addfile() usually takes two arguments: a TarInfo object and a file-like object. It writes a header, taking name, size, and other attributes from the TarInfo object, and then copies the content of the file-like object (with a padding if necessary). When the file does not have content (empty file, directory, other nodes) the second argument is omitted.

So you wrote a header for the first file, but not its content. Then you wrote the header for the second file which was interpreted as the content of the first file.

There is perhaps a lack of documentation, but I think that TarFile should detect such errors and raise an exception. Not writing a content for a non-empty member is an error. Writing a wrong amount of data is also an error.

cc @gustaebel

encukou pushed a commit that referenced this issue Apr 19, 2024
Tarfile.addfile now throws an ValueError when the user passes
in a non-zero size tarinfo but does not provide a fileobj,
instead of writing an incomplete entry.
@lyc8503
Copy link
Contributor

lyc8503 commented Apr 19, 2024

I've submitted a PR to fix this issue and now it has been merged.

Now people should get a ValueError when they forget to pass in a fileobj.

I think this issue can be closed if there're no more changes needed.

@encukou
Copy link
Member

encukou commented Apr 19, 2024

Thank you!

This is technically a new feature, so it won't be backported to 3.12.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.11 only security fixes 3.12 only security fixes 3.13 bugs and security fixes type-bug An unexpected behavior, bug, or error
Projects
Status: Done
Development

No branches or pull requests

4 participants