-
-
Notifications
You must be signed in to change notification settings - Fork 849
Make #fetch_events_async and #fetch_comments_async thread-safe #729
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@mizoR - Thank you! Thank you! Thank you, for finding and solving this! At Vox Pupuli, this issue has been plaguing us for months. @ferrarimarco Can this be merged? Thanks again! |
Great PR! Can you please provide a test for this as well? Thanks again for the contribution :) |
f480bd3
to
9d2986c
Compare
I tried to add test code now. But it has not been easy for me... This is a bug that reproduces only when the response requested in the second thread arrives before the response requested in the first thread. So this is difficult to express in tests. On the other hand, this PR is a change of Or could you give me any hints if you have good idea about how to test 🙏 Anyway thanks for your reply 🙂 |
I think you can starting by mocking Thread.new() and work from there :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess fetch_comments_async also suffers from the same issue.
@ferrarimarco I really doubt you can replicate race conditions with mocking threads. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reasoning behind the change is good, it is clear what's being done: a client per thread, indexed by an integer.
Thanks for submitting this.
I rebased my commits to resolve conflicts :) Could you check it again? Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm guessing that we will iterate on this, to make sure that it's easy not to make a code change that inadvertently breaks the functionality.
Tools we can consider: a resource pool and concurrent-ruby's Promise.
Should |
Thanks! As far as I've grep'd, there are three functions that use threads.
#fetch_events_async and #fetch_comments_async are using Octokit#last_response, so there was a race condition problem, so I fixed #fetch_comments_async as well as #fetch_events_async. #fetch_tag_shas_async doesn't use Octokit#last_response, so I think it will be fine as is. |
While I think this PR looks okay... I think creates a situation where future changes of the code may introduce thread safety issues or break assumptions about the sharing of clients. My advice is to use a thread pool and queue jobs into it: https://github.com/ruby-concurrency/concurrent-ruby/blob/master/docs-source/thread_pools.md In addition, unless the client is using persistent connections under the hood, caching them probably doesn't serve much purpose. Finally, you are trying to maximise I/O throughput by putting what appears to be HTTP/1.1 requests on different threads. If that's the case, you might want to investigate async-http which would potentially improve throughput, reduce resource utilisation and avoid the need for thread synchronisation primitives. |
I had a quick look and octokit is using faraday: https://github.com/octokit/octokit.rb/blob/4-stable/octokit.gemspec#L9 We have the |
@ioquatix while I agree that's a much cleaner solution, this currently fixes a nasty bug. Do you think you can come up with an alternative patch or should this be merged until a proper solution can be implemented? |
I have seen the alternative PR, and I think we ought to use that solution.
OK, I agree with that too. I close this. Thanks! |
@mizoR thanks for your work and bringing attention to this issue as well as a potential solution, we all learnt a lot! |
iterate_page(client, ...)
is called in a thread when getting issue_events. Butclient
is not thread safe. So sometimes changelog output can be different every time we rungithub_changelog_generator
.This PR fixs it with assigning specified octkit each thread, thanks!