Isolate Dir.chdir to a new process, or mutex #372

Closed
wants to merge 15 commits into from
Changes from 4 commits
2 changes: 1 addition & 1 deletion lib/git.rb
@@ -108,7 +108,7 @@ def self.export(repository, name, options = {})
     options.delete(:remote)
     repo = clone(repository, name, {:depth => 1}.merge(options))
     repo.checkout("origin/#{options[:branch]}") if options[:branch]
-    Dir.chdir(repo.dir.to_s) { FileUtils.rm_r '.git' }
+    FileUtils.rm_r(File.join(repo.dir.to_s, '.git'))
   end

   # Same as g.config, but forces it to be at the global level
39 changes: 31 additions & 8 deletions lib/git/base.rb
@@ -3,6 +3,12 @@
 module Git

   class Base
+    # Adding a mutex to the class because each repo should be sharing the same mutex
+    # in case we need to Dir.chdir and we don't have fork() support to isolate that
+    class << self
+      attr_accessor :chdir_semaphore
+    end
+    Git::Base.chdir_semaphore = Mutex.new
Contributor

This works perfectly fine, but maybe the approach below would be even better. It shouldn't really be an accessor, more like a reader: the mutex is set once and shouldn't be reassigned from elsewhere, right?

class << self
  def chdir_semaphore
    @chdir_semaphore ||= Mutex.new
  end
end

Contributor Author

I think you're right, but are you sure that's thread-safe? If two threads call chdir_semaphore before it is set, is it possible that both end up calling Mutex.new?

Contributor

True. So maybe something like this then:

@chdir_semaphore = Mutex.new

class << self
  attr_reader :chdir_semaphore
end

(If I remember the context right, the `@chdir_semaphore` assignment in the class body sets a class-level instance variable. Please try this out before trusting me blindly on this.)

Contributor Author

@perlun I tested with your suggestion, it works 👍
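
For readers following this thread, here is a minimal sketch of the eager-initialization pattern agreed on above. `RepoBase` is a hypothetical stand-in for `Git::Base`, not the project's class; only `Mutex`, `Thread`, and `attr_reader` are standard Ruby.

```ruby
# Hypothetical stand-in for Git::Base (name chosen for illustration only).
class RepoBase
  # Evaluated once, while the class body is loaded, so there is no lazy
  # `||=` check that two threads could race on.
  @chdir_semaphore = Mutex.new

  class << self
    # Reader only: callers can lock the mutex but cannot replace it.
    attr_reader :chdir_semaphore
  end
end

# Eight threads ask for the mutex at once; each thread's value is what it saw.
seen = Array.new(8) { Thread.new { RepoBase.chdir_semaphore } }.map(&:value)

# Number of distinct mutex objects observed across all threads.
puts seen.map(&:object_id).uniq.size
```

One caveat with this pattern: class-level instance variables are not inherited, so a subclass calling `chdir_semaphore` on itself would get `nil`. The code in this PR always goes through `Git::Base.chdir_semaphore`, so it would not be affected.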


     include Git::Base::Factory

@@ -92,18 +98,37 @@ def initialize(options = {})
       @index = options[:index] ? Git::Index.new(options[:index], false) : nil
     end

-    # changes current working directory for a block
-    # to the git working directory
+    # changes current working directory for a block to the git working directory.
+    #
+    # Note: If we can fork() or spawn(), Dir.chdir will happen in a new process
+    # otherwise, we will use a mutex to prevent threading errors
+    # See https://github.com/ruby-git/ruby-git/issues/355 for more info
     #
     # example
     # @git.chdir do
     # # write files
     # @git.add
     # @git.commit('message')
     # end
-    def chdir # :yields: the Git::Path
-      Dir.chdir(dir.path) do
-        yield dir.path
+    def chdir(&block) # :yields: the Git::Path
+      chdir_block = Proc.new do
+        Dir.chdir(dir.path) do
+          block.call(dir.path)
+        end
       end
+
+      if Process.respond_to?(:fork)
+        # Forking this process so that we can be threadsafe
+        pid = Process.fork do
+          chdir_block.call
+        end
+        Process.wait(pid)
+      else
+        # Windows and NetBSD 4 don't support fork()
Contributor

Interesting anecdote, I wasn't aware of NetBSD 4 here. But OTOH, it's past EOL already. Maybe we can drop that reference and instead mention JRuby (which is actively used by many), since it also has problems with fork() on the JVM.

Contributor Author

True dat
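
To make the two branches discussed here concrete, a self-contained sketch of the fork-or-mutex idea follows. `ScratchRepo` is a hypothetical stand-in for `Git::Base`, and the `exit!` call and boolean return value are choices made for this sketch, not something taken from the PR.

```ruby
require 'tmpdir'

# Hypothetical stand-in for Git::Base, reduced to the chdir logic.
class ScratchRepo
  @chdir_semaphore = Mutex.new

  class << self
    attr_reader :chdir_semaphore
  end

  def initialize(path)
    @path = path
  end

  # Runs the block with the working directory set to @path.
  # Returns true when the block finished without raising (fork branch);
  # the mutex branch lets exceptions propagate to the caller.
  def chdir(&block)
    if Process.respond_to?(:fork)
      # The child has its own working directory, so threads in the parent
      # never observe the chdir.
      pid = Process.fork do
        code = 0
        begin
          Dir.chdir(@path) { block.call(@path) }
        rescue StandardError
          code = 1
        end
        # exit! skips at_exit handlers inherited from the parent process.
        exit!(code)
      end
      _, status = Process.wait2(pid)
      status.success?
    else
      # No fork available: serialize every chdir within this process.
      self.class.chdir_semaphore.synchronize do
        Dir.chdir(@path) { block.call(@path) }
      end
      true
    end
  end
end

Dir.mktmpdir do |dir|
  repo = ScratchRepo.new(dir)
  before = Dir.pwd
  repo.chdir { File.write('marker.txt', 'written from inside chdir') }
  puts File.exist?(File.join(dir, 'marker.txt'))
  puts Dir.pwd == before
end
```

Worth keeping in mind for the fork branch: anything the block changes in memory (local variables, instance variables, its return value) stays in the child process. Only effects on the filesystem or other external state are visible to the parent.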

+        # let's use a mutex to prevent race conditions with threads
+        Git::Base.chdir_semaphore.synchronize do
+          chdir_block.call
+        end
+      end
     end

@@ -144,9 +169,7 @@ def repo

     # returns the repository size in bytes
     def repo_size
-      Dir.chdir(repo.path) do
-        return `du -s`.chomp.split.first.to_i
-      end
+      return `du -s #{repo.path}`.chomp.split.first.to_i
Contributor

Very good fix, absolutely no reason to chdir in this case. 🤦‍♂️
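
A side note on the new line: it interpolates `repo.path` into a shell string, so a path containing spaces or shell metacharacters would be split or interpreted by the shell. The sketch below passes the path as a separate argument through `IO.popen` instead. `disk_usage` is a hypothetical helper name, and the sketch assumes a `du` binary is available on the PATH.

```ruby
require 'tmpdir'

# Hypothetical helper: returns the first column of `du -s <path>` as an Integer.
# The unit is whatever the local `du` reports (blocks, which may not be bytes).
def disk_usage(path)
  # Array form: no shell is involved, so spaces in `path` are safe.
  output = IO.popen(['du', '-s', path], &:read)
  output.chomp.split.first.to_i
end

# The directory name deliberately contains spaces.
Dir.mktmpdir('repo with space') do |dir|
  File.write(File.join(dir, 'file.txt'), 'x' * 4096)
  puts disk_usage(dir)
end
```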

     end

     def set_index(index_file, check = true)
14 changes: 3 additions & 11 deletions lib/git/lib.rb
@@ -310,7 +310,7 @@ def branches_all
     def list_files(ref_dir)
       dir = File.join(@git_dir, 'refs', ref_dir)
       files = []
-      Dir.chdir(dir) { files = Dir.glob('**/*').select { |f| File.file?(f) } } rescue nil
+      files = Dir.glob(File.join(dir, '**/*')).select { |f| File.file?(f) }
Contributor

I wonder whether removing the `rescue nil` could be dangerous. Unfortunately, I don't know why it was there in the first place.

Contributor Author

Yeah, I was thinking about that too. I wasn't convinced it was needed, but it's probably safer to keep it, since passing nil into File.file? or Dir.glob can raise an error.
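
To help settle the `rescue nil` question, the sketch below checks which calls raise. `list_ref_files` is a hypothetical helper that mirrors the new `list_files` body with the directory passed in explicitly.

```ruby
require 'tmpdir'

# Hypothetical helper mirroring the new list_files body, with the directory
# passed in explicitly instead of being derived from @git_dir.
def list_ref_files(dir)
  Dir.glob(File.join(dir, '**/*')).select { |f| File.file?(f) }
end

Dir.mktmpdir do |root|
  missing = File.join(root, 'refs', 'nope')

  # A missing directory is not an error for Dir.glob: it matches nothing.
  p list_ref_files(missing)

  # Dir.chdir on the same path raises, which is what the old `rescue nil` hid.
  begin
    Dir.chdir(missing) { }
  rescue Errno::ENOENT => e
    puts e.class
  end

  # A nil directory fails earlier, inside File.join.
  begin
    list_ref_files(nil)
  rescue TypeError => e
    puts e.class
  end
end
```

In the hunk above, `File.join(@git_dir, 'refs', ref_dir)` sits on its own line, outside the reach of the `rescue nil` modifier, so a nil `@git_dir` would raise both before and after this change. Note also that the returned paths now carry the `dir` prefix, whereas the `Dir.chdir` version returned paths relative to `dir`. Whether any caller depends on that is not visible in this diff.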

       files
     end

@@ -442,11 +442,7 @@ def config_get(name)
         command('config', ['--get', name])
       end

-      if @git_dir
-        Dir.chdir(@git_dir, &do_get)
-      else
-        do_get.call
-      end
+      do_get.call(@git_dir)
Contributor Author

Not 💯 sure this will work, but tests pass. Seems like we might need to pass the path to the config command.

     end

     def global_config_get(name)
@@ -458,11 +454,7 @@ def config_list
         parse_config_list command_lines('config', ['--list'])
       end

-      if @git_dir
-        Dir.chdir(@git_dir, &build_list)
-      else
-        build_list.call
-      end
+      build_list.call(@git_dir)
Contributor Author

Same here: I think we need to use the path variable in the config command if it exists.
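
One possible way to act on the two comments about `config` is sketched below: hand the repository location to git itself through its `--git-dir` option, so the working directory of the Ruby process no longer matters. Both helper names are hypothetical, and this does not reflect how the library's own `command` method assembles its arguments.

```ruby
require 'open3'

# Hypothetical helper: builds the argv for reading one config key. When
# git_dir is given, it is passed through git's --git-dir option.
def config_get_argv(name, git_dir = nil)
  argv = ['git']
  argv << "--git-dir=#{git_dir}" if git_dir
  argv + ['config', '--get', name]
end

# Hypothetical runner: returns the trimmed value, or nil if git exits non-zero.
# Not called below, so the sketch runs even where git is not installed.
def config_get_value(name, git_dir = nil)
  out, status = Open3.capture2(*config_get_argv(name, git_dir))
  status.success? ? out.chomp : nil
end

p config_get_argv('user.name', '/tmp/example/.git')
p config_get_argv('user.name')
```

The same shape would apply to `config --list`: only the trailing arguments change.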

     end

     def global_config_list
16 changes: 8 additions & 8 deletions lib/git/status.rb
@@ -171,14 +171,14 @@ def construct_status

     def fetch_untracked
       ignore = @base.lib.ignored_files
-
-      Dir.chdir(@base.dir.path) do
-        Dir.glob('**/*', File::FNM_DOTMATCH) do |file|
-          next if @files[file] || File.directory?(file) ||
-                  ignore.include?(file) || file =~ %r{^.git\/.+}
-
-          @files[file] = { path: file, untracked: true }
-        end
+      root_base_dir_path_length = File.join(@base.dir.path, '/').length
+      Dir.glob(File.join(@base.dir.path, '**/*'), File::FNM_DOTMATCH) do |file|
+        # Strip off the `base.dir.path` to make these relative paths
+        relative_file_name = file.slice(root_base_dir_path_length..file.length)
+        next if @files[relative_file_name] || File.directory?(file) ||
+                ignore.include?(relative_file_name) || relative_file_name =~ %r{^.git\/.+}
+
+        @files[relative_file_name] = { path: relative_file_name, untracked: true }
       end
     end
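
As an alternative to slicing the prefix off each path by hand, `Dir.glob` accepts a `base:` keyword (Ruby 2.5 and later) that returns paths relative to the given directory without any `Dir.chdir`. The sketch below assumes that Ruby version is acceptable for the project. `untracked_candidates` is a hypothetical helper, and its `known` argument stands in for the tracked and ignored file lists.

```ruby
require 'tmpdir'

# Hypothetical helper: lists files under `root` as paths relative to `root`,
# skipping directories, anything under .git/, and names listed in `known`.
def untracked_candidates(root, known = [])
  Dir.glob('**/*', File::FNM_DOTMATCH, base: root).reject do |rel|
    File.directory?(File.join(root, rel)) ||
      rel.start_with?('.git/') ||
      known.include?(rel)
  end.sort
end

Dir.mktmpdir do |root|
  Dir.mkdir(File.join(root, '.git'))
  Dir.mkdir(File.join(root, 'sub'))
  File.write(File.join(root, '.git', 'config'), '')
  File.write(File.join(root, '.hidden'), '')
  File.write(File.join(root, 'a.txt'), '')
  File.write(File.join(root, 'sub', 'b.txt'), '')

  # 'a.txt' plays the role of an already tracked file.
  p untracked_candidates(root, ['a.txt'])
end
```

The sketch uses a literal `start_with?('.git/')` check rather than the regular expression from the diff, so the leading dot is matched literally.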
