Zip::InputStream reads only first file, no errors raised #493
Comments
Hello, thanks for letting us know about this. I think OSX Archive Utility causes us quite a few issues - or we cause it quite a few issues. Would it be at all possible that you could supply us with two zip files - one created with OSX Archive Utility and one with the command line tool - with the same files in? I don't have a Mac but I'm keen to debug this. |
Hi @hainesr. Sure:
Archive.zip was created in OSX by selecting both files, right clicking, and choosing "Compress 2 items" from the context menu.
commandline.zip was created by issuing the following command in a console:
$ zip commandline.zip 1.txt 2.txt
FYI:
$ zip --help
Copyright (c) 1990-2008 Info-ZIP - Type 'zip "-L"' for software license.
Zip 3.0 (July 5th 2008). Usage:
...
And trying to stream the contents:
>> puts Gem.loaded_specs['rubyzip'].version
2.3.0
=> nil
>> puts RUBY_VERSION
2.7.1
=> nil
>>
>> require 'zip'
=> true
>>
>> zip_path = "#{dir}/Archive.zip"
>>
?> Zip::InputStream.open(zip_path) do |io|
?>   while (entry = io.get_next_entry)
?>     puts entry.name
?>   end
>> end
1.txt
=> nil
>>
>> zip_path = "#{dir}/commandline.zip"
>>
?> Zip::InputStream.open(zip_path) do |io|
?>   while (entry = io.get_next_entry)
?>     puts entry.name
?>   end
>> end
1.txt
2.txt
=> nil |
Many thanks for this @al. I think you are seeing this behaviour due to the fact that we don't handle data descriptors properly yet (#460, #269, #295) and OSX Archive Utility builds Zip files like no other tool - there's no need for it to use data descriptors, but it does. It also kind of uses them wrong. Anyway, this needs fixing, and I'll get on it ASAP. In the meantime, if you can possibly use |
Great, thanks @hainesr |
I've looked at this a bit deeper now. This appears to be a really specific bug thanks to the frankly weird and non-standard (I was going to say 'wrong') way OSX Archive Utility builds Zip files. Think Different indeed.
Archive looks like it's effectively streaming to disk as it's deflating files, and so doesn't know the compressed size of a file when it's writing the local header - which comes before the actual data. So it uses a data descriptor to store that info after the data. All fine and standard.
But Archive does know the uncompressed data size when it's writing the local header, and it does write that into the local header. Which is not standard, and is why we're seeing this specific bug.
Rubyzip checks the local header for the streaming flag (gp flags bit 3) and that the compressed size, uncompressed size and CRC check are all zero before deciding it can't extract an entry using
All this is to say that the obvious fix for this issue is to raise the error about not being able to extract with
@al, may I use your Archive.zip as a fixture for testing? |
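To make the header check described above concrete, here is a minimal sketch in plain Ruby that reads a Zip local file header and reports whether it sets the streaming flag (general purpose bit 3) while still carrying a non-zero CRC or size. It is not rubyzip's code: the constant, struct and method names are hypothetical, and it assumes only the fixed 30-byte local header layout from the Zip format specification.

```ruby
# Fixed part of a Zip local file header (30 bytes, little-endian):
# signature, version needed, flags, method, mod time, mod date,
# CRC-32, compressed size, uncompressed size, name length, extra length.
LOCAL_HEADER_FORMAT    = 'VvvvvvVVVvv'
LOCAL_HEADER_SIZE      = 30
LOCAL_HEADER_SIGNATURE = 0x04034b50
DATA_DESCRIPTOR_FLAG   = 0x0008 # general purpose flag bit 3

# Hypothetical container for the fields this sketch cares about.
LocalHeader = Struct.new(:flags, :crc32, :compressed_size,
                         :uncompressed_size, :name)

# Reads one local header from the current position of +io+.
# Returns nil if the data is too short or the signature does not match.
def read_local_header(io)
  fixed = io.read(LOCAL_HEADER_SIZE)
  return nil if fixed.nil? || fixed.bytesize < LOCAL_HEADER_SIZE

  sig, _version, flags, _method, _time, _date,
    crc, csize, usize, name_len, _extra_len = fixed.unpack(LOCAL_HEADER_FORMAT)
  return nil unless sig == LOCAL_HEADER_SIGNATURE

  LocalHeader.new(flags, crc, csize, usize, io.read(name_len))
end

# True when the entry says its sizes and CRC follow the data.
def uses_data_descriptor?(header)
  (header.flags & DATA_DESCRIPTOR_FLAG) != 0
end

# True for the case described in this thread: bit 3 is set, yet the
# local header still carries a non-zero CRC or size.
def partially_filled_streaming_header?(header)
  uses_data_descriptor?(header) &&
    [header.crc32, header.compressed_size, header.uncompressed_size].any? { |v| v != 0 }
end

# Convenience wrapper: inspect the first entry of a Zip file on disk.
def first_local_header(path)
  File.open(path, 'rb') { |io| read_local_header(io) }
end
```

Calling `partially_filled_streaming_header?(first_local_header('Archive.zip'))` would be one way to confirm whether a given archive falls into this category. A check that requires all three fields to be zero would not treat such an entry as streamed.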
Sure, the poem that those verses are taken from is out of copyright I believe. And thanks again for looking into this so promptly. |
For anyone else running into this:
$ zip -d filename.zip \*/.DS_Store
$ zip -d filename.zip __MACOSX/\* |
rubyzip v2.3.0
I'm encountering what appears to be a duplicate of #227, i.e. only the first file in an archive being extracted.
That issue, long since closed, suggests that an error should be raised (presumably from rubyzip/lib/zip/input_stream.rb, line 138 in 750c474).
Code is simply:
Ideally I'd expect the names of all files in the archive to be displayed, or at least an error to be raised. Instead, only one name is printed and no error is indicated.
The problem occurs when the Zip archive is created by the OSX Archive Utility. Archives created with the zip command line tool are handled as expected, i.e. all names are printed.
Note that Zip::InputStream is being used to deal with the potential for non-unique file names, as suggested in #342.
Any thoughts?