Skip to content

Make #fetch_events_async and #fetch_comments_async thread-safe #729

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

mizoR
Copy link

@mizoR mizoR commented Jun 22, 2019

iterate_page(client, ...) is called in a thread when getting issue_events. But client is not thread safe. So sometimes changelog output can be different every time we run github_changelog_generator.

This PR fixs it with assigning specified octkit each thread, thanks!

@ferrarimarco ferrarimarco added this to the 2.0.0 milestone Jun 22, 2019
@alexjfisher
Copy link
Contributor

@mizoR - Thank you! Thank you! Thank you, for finding and solving this! At Vox Pupuli, this issue has been plaguing us for months.

@ferrarimarco Can this be merged?

Thanks again!

@ferrarimarco
Copy link
Contributor

Great PR!

Can you please provide a test for this as well? Thanks again for the contribution :)

@mizoR
Copy link
Author

mizoR commented Sep 26, 2019

@ferrarimarco

I tried to add test code now.

But it has not been easy for me... This is a bug that reproduces only when the response requested in the second thread arrives before the response requested in the first thread. So this is difficult to express in tests.

On the other hand, this PR is a change of GitHubChangelogGenerator::OctoFetcher#fetch_events_async and #fetch_events_async test passes. So could you consider merging without adding tests?

Or could you give me any hints if you have good idea about how to test 🙏

Anyway thanks for your reply 🙂

@ferrarimarco
Copy link
Contributor

I think you can starting by mocking Thread.new() and work from there :)

Copy link
Contributor

@ekohl ekohl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess fetch_comments_async also suffers from the same issue.

@ekohl
Copy link
Contributor

ekohl commented Nov 12, 2019

@ferrarimarco I really doubt you can replicate race conditions with mocking threads.

Copy link
Collaborator

@olleolleolle olleolleolle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reasoning behind the change is good, it is clear what's being done: a client per thread, indexed by an integer.

Thanks for submitting this.

@mizoR mizoR force-pushed the fix-thread-unsafe branch from 9d2986c to 40abe37 Compare April 8, 2020 09:02
@mizoR
Copy link
Author

mizoR commented Apr 8, 2020

I rebased my commits to resolve conflicts :) Could you check it again?

Thanks!

olleolleolle
olleolleolle previously approved these changes Apr 8, 2020
Copy link
Collaborator

@olleolleolle olleolleolle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm guessing that we will iterate on this, to make sure that it's easy not to make a code change that inadvertently breaks the functionality.

Tools we can consider: a resource pool and concurrent-ruby's Promise.

@ekohl
Copy link
Contributor

ekohl commented Apr 8, 2020

Should fetch_comments_async use the same logic? It also uses threads. Basically grep for MAX_THREAD_NUMBER to find where to apply the pattern.

@mizoR mizoR force-pushed the fix-thread-unsafe branch from 40abe37 to 0746291 Compare April 8, 2020 09:40
@mizoR mizoR changed the title Make #fetch_events_async thread safe Make #fetch_events_async and #fetch_comments_async thread-safe Apr 8, 2020
@mizoR
Copy link
Author

mizoR commented Apr 8, 2020

Thanks! As far as I've grep'd, there are three functions that use threads.

  • #fetch_events_async
  • #fetch_comments_async
  • #fetch_tag_shas_async

#fetch_events_async and #fetch_comments_async are using Octokit#last_response, so there was a race condition problem, so I fixed #fetch_comments_async as well as #fetch_events_async.

#fetch_tag_shas_async doesn't use Octokit#last_response, so I think it will be fine as is.

@ioquatix
Copy link
Contributor

ioquatix commented Apr 8, 2020

While I think this PR looks okay... I think creates a situation where future changes of the code may introduce thread safety issues or break assumptions about the sharing of clients.

My advice is to use a thread pool and queue jobs into it:

https://github.com/ruby-concurrency/concurrent-ruby/blob/master/docs-source/thread_pools.md

In addition, unless the client is using persistent connections under the hood, caching them probably doesn't serve much purpose.

Finally, you are trying to maximise I/O throughput by putting what appears to be HTTP/1.1 requests on different threads. If that's the case, you might want to investigate async-http which would potentially improve throughput, reduce resource utilisation and avoid the need for thread synchronisation primitives.

@ioquatix
Copy link
Contributor

ioquatix commented Apr 8, 2020

I had a quick look and octokit is using faraday:

https://github.com/octokit/octokit.rb/blob/4-stable/octokit.gemspec#L9

We have the async-http-faraday adapter so it should work. Just FYI.

@ekohl
Copy link
Contributor

ekohl commented Apr 8, 2020

@ioquatix while I agree that's a much cleaner solution, this currently fixes a nasty bug. Do you think you can come up with an alternative patch or should this be merged until a proper solution can be implemented?

@ioquatix ioquatix mentioned this pull request Apr 9, 2020
@ioquatix
Copy link
Contributor

@ekohl #784 is now ready for review and should be merged in favour of this IMHO.

@olleolleolle olleolleolle dismissed their stale review April 10, 2020 15:36

I have seen the alternative PR, and I think we ought to use that solution.

@mizoR
Copy link
Author

mizoR commented Apr 10, 2020

OK, I agree with that too. I close this. Thanks!

@mizoR mizoR closed this Apr 10, 2020
@mizoR mizoR deleted the fix-thread-unsafe branch April 10, 2020 20:51
@ioquatix
Copy link
Contributor

@mizoR thanks for your work and bringing attention to this issue as well as a potential solution, we all learnt a lot!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants