Bzip2 #425

Closed
wants to merge 4 commits into from
7 changes: 7 additions & 0 deletions README.md
@@ -36,6 +36,13 @@ Or in your Gemfile:
gem 'rubyzip'
```

If you want to read zip files that use the bzip2 compression method,
you must also install the optional dependency `ffi` (and have libbz2 available on your system):

```ruby
gem 'ffi'
```

## Usage

### Basic zip archive creation
4 changes: 4 additions & 0 deletions lib/zip.rb
@@ -26,6 +26,10 @@
require 'zip/crypto/traditional_encryption'
require 'zip/inflater'
require 'zip/deflater'
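begin
  # Optional: requires the `ffi` gem.
  require 'zip/bzip2_decompressor'
rescue LoadError
  # bzip2 support is unavailable; bzip2 entries are reported as unsupported.
end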
require 'zip/streamable_stream'
require 'zip/streamable_directory'
require 'zip/constants'
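
The `begin`/`rescue LoadError` guard in this hunk makes bzip2 support optional: if the require fails, loading carries on without it. A minimal, self-contained sketch of that pattern is shown here. `try_require` and the library name `no_such_library_for_demo` are hypothetical names used only for illustration and are not part of rubyzip.

```ruby
# Hypothetical helper (not part of rubyzip): attempts to load an optional
# library and reports whether it is available instead of raising.
def try_require(name)
  require name
  true
rescue LoadError
  false
end

# 'set' ships with Ruby, so this load succeeds.
SET_AVAILABLE = try_require('set')

# A made-up name, so the LoadError is rescued and false is returned.
MISSING_AVAILABLE = try_require('no_such_library_for_demo')
```

A flag like this can gate feature checks, much as `get_decompressor` tests `defined?(::Zip::Bzip2Decompressor)` before using the class.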
104 changes: 104 additions & 0 deletions lib/zip/bzip2/decompress.rb
@@ -0,0 +1,104 @@
require 'zip/bzip2/libbz2'
require 'zip/bzip2/errors'

module Zip
  module Bzip2
    class Decompress
      OUT_BUFFER_SIZE = 4096

      class << self
        private

        # Returns the finalizer proc. It runs in the class context, where the
        # instance method check_error is not available, so the return code of
        # BZ2_bzDecompressEnd is ignored here.
        def finalize(stream)
          ->(_id) { Libbz2::BZ2_bzDecompressEnd(stream) }
        end
      end

      def initialize(options = {})
        small = options[:small]

        @stream = Libbz2::BzStream.new
        @out_eof = false

        res = Libbz2::BZ2_bzDecompressInit(stream, 0, small ? 1 : 0)
        check_error(res)

        ObjectSpace.define_finalizer(self, self.class.send(:finalize, stream))
      end

      def decompress(decompress_string)
        return nil if @out_eof

        out_buffer = nil
        in_buffer = nil
        begin
          out_buffer = ::FFI::MemoryPointer.new(1, OUT_BUFFER_SIZE)
          in_buffer = ::FFI::MemoryPointer.new(1, decompress_string.bytesize)

          in_buffer.write_bytes(decompress_string)
          stream[:next_in] = in_buffer
          stream[:avail_in] = in_buffer.size

          result = String.new
          while stream[:avail_in].positive?
            stream[:next_out] = out_buffer
            stream[:avail_out] = out_buffer.size

            res = Libbz2::BZ2_bzDecompress(stream)
            check_error(res)

            # Append in place rather than building a new string per chunk.
            result << out_buffer.read_bytes(out_buffer.size - stream[:avail_out])

            if res == Libbz2::BZ_STREAM_END
              @out_eof = true

              res = Libbz2::BZ2_bzDecompressEnd(stream)
              ObjectSpace.undefine_finalizer(self)
              check_error(res)

              break
            end
          end
          result
        ensure
          in_buffer.free if in_buffer
          out_buffer.free if out_buffer
        end
      end

      def finished?
        @out_eof
      end

      protected

      attr_reader :stream

      private

      def check_error(res)
        return res if res >= 0

        error_class = case res
                      when Libbz2::BZ_MEM_ERROR
                        MemError
                      when Libbz2::BZ_DATA_ERROR
                        DataError
                      when Libbz2::BZ_DATA_ERROR_MAGIC
                        MagicDataError
                      when Libbz2::BZ_CONFIG_ERROR
                        ConfigError
                      else
                        raise UnexpectedError.new(res)
                      end

        raise error_class.new
      end
    end
  end
end
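
The mapping in `check_error` from libbz2 return codes to exception classes could also be written as a lookup table. The sketch below is an alternative formulation, not the code in this PR. It is self-contained, so the error classes and the `Bz2ErrorSketch` module are stand-ins defined locally (hypothetical names), and only the numeric codes are taken from `libbz2.rb`.

```ruby
# Stand-in module (hypothetical name) so the sketch runs without ffi or libbz2.
module Bz2ErrorSketch
  class Error < IOError; end
  class MemError < Error; end
  class DataError < Error; end
  class MagicDataError < DataError; end
  class ConfigError < DataError; end
  class UnexpectedError < Error; end

  # Numeric codes as declared in lib/zip/bzip2/libbz2.rb.
  ERROR_CLASSES = {
    -3 => MemError,        # BZ_MEM_ERROR
    -4 => DataError,       # BZ_DATA_ERROR
    -5 => MagicDataError,  # BZ_DATA_ERROR_MAGIC
    -9 => ConfigError      # BZ_CONFIG_ERROR
  }.freeze

  # Returns non-negative codes unchanged; raises the mapped class otherwise.
  # Codes without an entry fall back to UnexpectedError.
  def self.check_error(res)
    return res if res >= 0

    klass = ERROR_CLASSES.fetch(res, UnexpectedError)
    raise klass, "libbz2 error code: #{res}"
  end
end
```

Because `MagicDataError` descends from `DataError`, a caller that rescues `DataError` also catches the magic-bytes case.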
61 changes: 61 additions & 0 deletions lib/zip/bzip2/errors.rb
@@ -0,0 +1,61 @@
module Zip
  module Bzip2
    # Base class for Zip::Bzip2 exceptions.
    class Error < IOError
    end

    # Raised if a failure occurred allocating memory to complete a request.
    class MemError < Error
      # Initializes a new instance of MemError.
      #
      # @private
      def initialize #:nodoc:
        super('Could not allocate enough memory to perform this request')
      end
    end

    # Raised if a data integrity error is detected (a mismatch between
    # stored and computed CRCs or another anomaly in the compressed data).
    class DataError < Error
      # Initializes a new instance of DataError.
      #
      # @param message [String] Exception message (overrides the default).
      # @private
      def initialize(message = nil) #:nodoc:
        super(message || 'Data integrity error detected (mismatch between stored and computed CRCs, or other anomaly in the compressed data)')
      end
    end

    # Raised if the compressed data does not start with the correct magic
    # bytes ('BZh').
    class MagicDataError < DataError
      # Initializes a new instance of MagicDataError.
      #
      # @private
      def initialize #:nodoc:
        super('Compressed data does not start with the correct magic bytes (\'BZh\')')
      end
    end

    # Raised if libbz2 detects that it has been improperly compiled.
    class ConfigError < DataError
      # Initializes a new instance of ConfigError.
      #
      # @private
      def initialize #:nodoc:
        super('libbz2 has been improperly compiled on your platform')
      end
    end

    # Raised if libbz2 reported an unexpected error code.
    class UnexpectedError < Error
      # Initializes a new instance of UnexpectedError.
      #
      # @param error_code [Integer] The error_code reported by libbz2.
      # @private
      def initialize(error_code) #:nodoc:
        super("An unexpected error was detected (error code: #{error_code})")
      end
    end
  end
end
100 changes: 100 additions & 0 deletions lib/zip/bzip2/libbz2.rb
@@ -0,0 +1,100 @@
# This file is copied from:
#
# https://github.com/philr/bzip2-ffi/raw/master/lib/bzip2/ffi/libbz2.rb
#

# Copyright (c) 2015-2016 Philip Ross
#
# Permission is hereby granted, free of charge, to any person obtaining a copy of
# this software and associated documentation files (the "Software"), to deal in
# the Software without restriction, including without limitation the rights to
# use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
# of the Software, and to permit persons to whom the Software is furnished to do
# so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
# THE SOFTWARE.

require 'ffi'

module Zip
  module Bzip2
    # FFI bindings for the libbz2 low-level interface.
    #
    # See bzlib.h and http://bzip.org/docs.html.
    #
    # @private
    module Libbz2 #:nodoc:
      extend ::FFI::Library

      ffi_lib ['bz2', 'libbz2.so.1', 'libbz2.dll']

      BZ_RUN = 0
      BZ_FLUSH = 1
      BZ_FINISH = 2

      BZ_OK = 0
      BZ_RUN_OK = 1
      BZ_FLUSH_OK = 2
      BZ_FINISH_OK = 3
      BZ_STREAM_END = 4
      BZ_SEQUENCE_ERROR = -1
      BZ_PARAM_ERROR = -2
      BZ_MEM_ERROR = -3
      BZ_DATA_ERROR = -4
      BZ_DATA_ERROR_MAGIC = -5
      BZ_CONFIG_ERROR = -9

      # void *(*bzalloc)(void *,int,int);
      callback :bzalloc, [:pointer, :int, :int], :pointer

      # void (*bzfree)(void *,void *);
      callback :bzfree, [:pointer, :pointer], :void

      # typedef struct { ... } bz_stream;
      class BzStream < ::FFI::Struct #:nodoc:
        layout :next_in, :pointer,
               :avail_in, :uint,
               :total_in_lo32, :uint,
               :total_in_hi32, :uint,

               :next_out, :pointer,
               :avail_out, :uint,
               :total_out_lo32, :uint,
               :total_out_hi32, :uint,

               :state, :pointer,

               :bzalloc, :bzalloc,
               :bzfree, :bzfree,
               :opaque, :pointer
      end

      # int BZ2_bzCompressInit(bz_stream* strm, int blockSize100k, int verbosity, int workFactor);
      attach_function :BZ2_bzCompressInit, [BzStream.by_ref, :int, :int, :int], :int

      # int BZ2_bzCompress (bz_stream* strm, int action);
      attach_function :BZ2_bzCompress, [BzStream.by_ref, :int], :int

      # int BZ2_bzCompressEnd (bz_stream* strm);
      attach_function :BZ2_bzCompressEnd, [BzStream.by_ref], :int

      # int BZ2_bzDecompressInit (bz_stream *strm, int verbosity, int small);
      attach_function :BZ2_bzDecompressInit, [BzStream.by_ref, :int, :int], :int

      # int BZ2_bzDecompress (bz_stream* strm);
      attach_function :BZ2_bzDecompress, [BzStream.by_ref], :int

      # int BZ2_bzDecompressEnd (bz_stream *strm);
      attach_function :BZ2_bzDecompressEnd, [BzStream.by_ref], :int
    end
  end
end
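
`BzStream` exposes the running byte counts as pairs of 32-bit fields (`total_in_lo32`/`total_in_hi32` and the `total_out` equivalents). This PR does not read them, but combining a pair into one integer could be sketched as follows. `combine_total` is a hypothetical helper and is not part of the bindings.

```ruby
# Hypothetical helper: joins the high and low 32-bit halves of a counter
# into one integer. Ruby integers are arbitrary precision, so no overflow.
def combine_total(lo32, hi32)
  (hi32 << 32) | (lo32 & 0xFFFF_FFFF)
end

combine_total(4096, 0) # => 4096
combine_total(0, 1)    # => 4294967296 (2**32)
```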
59 changes: 59 additions & 0 deletions lib/zip/bzip2_decompressor.rb
@@ -0,0 +1,59 @@
require 'zip/bzip2/decompress'

module Zip
  class Bzip2Decompressor < Decompressor #:nodoc:all
    def initialize(input_stream, decrypter = NullDecrypter.new)
      super(input_stream)
      @bzip2_ffi_decompressor = Bzip2::Decompress.new
      @output_buffer = ''.dup
      @has_returned_empty_string = false
      @decrypter = decrypter
    end

    def sysread(number_of_bytes = nil, buf = '')
      read_everything = number_of_bytes.nil?
      while read_everything || @output_buffer.bytesize < number_of_bytes
        break if internal_input_finished?
        @output_buffer << internal_produce_input(buf)
      end
      return value_when_finished if @output_buffer.empty? && input_finished?
      end_index = read_everything ? @output_buffer.bytesize : number_of_bytes
      @output_buffer.slice!(0...end_index)
    end

    def produce_input
      if @output_buffer.empty?
        internal_produce_input
      else
        @output_buffer.slice!(0...(@output_buffer.length))
      end
    end

    # To be used with produce_input, not read (as read may still have more
    # data cached).
    # NOTE: it is unclear whether data is cached anywhere other than
    # @output_buffer, so the comment above may be wrong.
    def input_finished?
      @output_buffer.empty? && internal_input_finished?
    end

    alias eof input_finished?
    alias eof? input_finished?

    private

    def internal_produce_input(buf = '')
      @bzip2_ffi_decompressor.decompress(@decrypter.decrypt(@input_stream.read(1024, buf)))
    rescue Bzip2::Error => e
      raise DecompressionError, e.message
    end

    def internal_input_finished?
      @bzip2_ffi_decompressor.finished?
    end

    def value_when_finished # mimic behaviour of ruby File object.
      return if @has_returned_empty_string
      @has_returned_empty_string = true
      ''
    end
  end
end
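
`sysread` buffers decompressed output and, once the input is exhausted, returns an empty string on the first call and `nil` afterwards. The sketch below reproduces that buffering logic without libbz2. `ChunkedReaderSketch` and its array of pre-made chunks are hypothetical stand-ins for the decompressor and its input stream.

```ruby
# Hypothetical stand-in for Bzip2Decompressor: it serves pre-made chunks
# instead of decompressing, but buffers and slices output the same way.
class ChunkedReaderSketch
  def initialize(chunks)
    @chunks = chunks.dup
    @output_buffer = String.new
    @has_returned_empty_string = false
  end

  def sysread(number_of_bytes = nil)
    read_everything = number_of_bytes.nil?
    # Refill until enough bytes are buffered or the source is exhausted.
    while read_everything || @output_buffer.bytesize < number_of_bytes
      break if @chunks.empty?

      @output_buffer << @chunks.shift
    end
    return value_when_finished if @output_buffer.empty? && @chunks.empty?

    end_index = read_everything ? @output_buffer.bytesize : number_of_bytes
    @output_buffer.slice!(0...end_index)
  end

  private

  # First call after exhaustion returns '', later calls return nil.
  def value_when_finished
    return if @has_returned_empty_string

    @has_returned_empty_string = true
    ''
  end
end

reader = ChunkedReaderSketch.new(%w[abc def gh])
reader.sysread(4) # => "abcd"
reader.sysread    # => "efgh"
reader.sysread    # => ""
reader.sysread    # => nil
```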
1 change: 1 addition & 0 deletions lib/zip/entry.rb
@@ -3,6 +3,7 @@ module Zip
  class Entry
    STORED = 0
    DEFLATED = 8
    BZIP2ED = 12
    # Language encoding flag (EFS) bit
    EFS = 0b100000000000

1 change: 1 addition & 0 deletions lib/zip/errors.rb
@@ -7,6 +7,7 @@ class EntryNameError < Error; end
  class EntrySizeError < Error; end
  class InternalError < Error; end
  class GPFBit3Error < Error; end
  class DecompressionError < Error; end

  # Backwards compatibility with v1 (delete in v2)
  ZipError = Error
4 changes: 4 additions & 0 deletions lib/zip/input_stream.rb
@@ -152,6 +152,10 @@ def get_decompressor
        header = @archive_io.read(@decrypter.header_bytesize)
        @decrypter.reset!(header)
        ::Zip::Inflater.new(@archive_io, @decrypter)
      elsif @current_entry.compression_method == ::Zip::Entry::BZIP2ED && defined?(::Zip::Bzip2Decompressor)
        header = @archive_io.read(@decrypter.header_bytesize)
        @decrypter.reset!(header)
        ::Zip::Bzip2Decompressor.new(@archive_io, @decrypter)
      else
        raise ::Zip::CompressionMethodError,
              "Unsupported compression method #{@current_entry.compression_method}"
1 change: 1 addition & 0 deletions rubyzip.gemspec
@@ -23,6 +23,7 @@ Gem::Specification.new do |s|
    'wiki_uri' => 'https://github.com/rubyzip/rubyzip/wiki'
  }
  s.required_ruby_version = '>= 2.4'
  s.add_development_dependency 'ffi', '~> 1.0'
  s.add_development_dependency 'rake', '~> 10.3'
  s.add_development_dependency 'pry', '~> 0.10'
  s.add_development_dependency 'minitest', '~> 5.4'