-
Notifications
You must be signed in to change notification settings - Fork 533
Isolate Dir.chdir to a new process, or mutex #372
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Instead of using chdir blocks, we should use absolute paths so that we can enable threaded usage of the library Fixes #355
lib/git/lib.rb
Outdated
if @git_dir | ||
Dir.chdir(@git_dir, &do_get) | ||
else | ||
build_list.call |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a typo in the original code, submitting a pr for this
Signed-off-by: Joshua Liebowitz <taquitos@google.com>
I am now testing with the latest change. Will update after we have it running for a bit. |
Now that we’ve made progress on ruby-git/ruby-git#372 We can limit mutexes to be individual repo-scoped instead of 1 giant mutex for all.
Signed-off-by: Joshua Liebowitz <taquitos@google.com>
lib/git/lib.rb
Outdated
else | ||
do_get.call | ||
end | ||
do_get.call(@git_dir) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not 💯 this will work, but tests pass. Seems like we might need to pass the path
to the config
command.
lib/git/lib.rb
Outdated
else | ||
build_list.call | ||
end | ||
build_list.call(@git_dir) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here, I think we need to utilize the path
variable in the config
command if it exists
Signed-off-by: Joshua Liebowitz <taquitos@google.com>
@@ -8,6 +8,7 @@ class GitExecuteError < StandardError | |||
class Lib | |||
|
|||
@@semaphore = Mutex.new | |||
@@config_semaphore = Mutex.new |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
calls to config also should be protected, but it might be heavy handed to fork
@perlun I'd love to get your thoughts. The changes don't make everything thread-safe, but it takes care of a lot of issues. There are a few places where paths can't be passed in |
@taquitos Thanks, will try to get this reviewed ASAP. Two minor things:
Thanks in advance. |
@perlun when you get a moment 👍 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some very good fixes here, bravo! 👏
I gave some notes on things I think could be addressed before merge. Please let me know what you think about these.
lib/git/base.rb
Outdated
class << self | ||
attr_accessor :chdir_semaphore | ||
end | ||
Git::Base.chdir_semaphore = Mutex.new |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This works perfectly fine, but maybe the approach below would be even better. It shouldn't really be an accessor but more like a reader since it's an immutable data structure that shouldn't be tampered with elsewhere, right?
class << self
def chdir_semaphore
@chdir_semaphore ||= Mutex.new
end
end
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you're right, but are you sure that's threadsafe? If we have 2 threads that call chdir_semaphore
before it is set, is it possible they both end up calling Mutex.new
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True. So maybe something like this then:
@chdir_semaphore = Mutex.new
class << self
attr_reader :chdir_semaphore
end
(if I remember the context right - the `@chdir_semaphore call in normal class context will assign it to the class-level variable. Please try this out before trusting me blindly on this.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@perlun I tested with your suggestion, it works 👍
lib/git/base.rb
Outdated
end | ||
Process.wait(pid) | ||
else | ||
# Windows and NetBSD 4 don't support fork() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting anecdote, I wasn't aware of NetBSD 4 here. But OTOH, it's past EOL already. Maybe we can drop that reference and instead mention JRuby (which is actively used by many), since it also has problems with fork()
on the JVM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True dat
Dir.chdir(repo.path) do | ||
return `du -s`.chomp.split.first.to_i | ||
end | ||
return `du -s #{repo.path}`.chomp.split.first.to_i |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very good fix, absolutely no reason to chdir
in this case. 🤦♂️
lib/git/lib.rb
Outdated
@@ -310,7 +311,7 @@ def branches_all | |||
def list_files(ref_dir) | |||
dir = File.join(@git_dir, 'refs', ref_dir) | |||
files = [] | |||
Dir.chdir(dir) { files = Dir.glob('**/*').select { |f| File.file?(f) } } rescue nil | |||
files = Dir.glob(File.join(dir, '**/*')).select { |f| File.file?(f) } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if the rescue nil
removal can be dangerous? Unfortunately not aware of its root cause of being there though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I was thinking about that too. I wasn't convinced it was needed, but it's probably safer since passing nil
into File.file?
or Dir.glob
can raise an error
@taquitos Is this reviewable now? If so, please rebase and I'll give it a look ASAP. |
Revert "355"
Instead of using chdir blocks, we should use absolute paths so that we can enable threaded usage of the library If Process.fork() is available, we'll use that for git.chdir actions, otherwise, we'll use a global mutex to prevent threads from calling git.chdir before the previous execution has completed. Fixes #355 Note: if you do async work in git.chdir all bets are off. - Reduce and isolate usage of Dir.chdir - fork() when using Dir.chdir, otherwise, use a mutex - wrap calls to in a semaphore - replace Dir.chdir with paths Signed-off-by: Joshua Liebowitz <taquitos@google.com>
Ugh. Making new PR |
Instead of using chdir blocks, we should use absolute paths so that we can enable threaded usage of the library
If
Process.fork()
is available, we'll use that forgit.chdir
actions, otherwise, we'll use a global mutex to prevent threads from callinggit.chdir
before the previous execution has completed.Fixes #355
Note: if you do async work in
git.chdir
all bets are off.Signed-off-by: Joshua Liebowitz