process stucks/hangs and the memory usage increases up to 4.6Gib #651

Closed
ArtieReus opened this issue May 8, 2018 · 6 comments

Comments

@ArtieReus
Contributor

Hi again,
I was very happy to learn that my pull request #616 was merged a few days ago, so I could get rid of my fork. Sadly, I found that with the latest state of the repo the process hangs and memory usage climbs to 4.6 GiB.

[Screenshot, 2018-05-08 13:35:17]

So I decided to use git bisect to figure out when the bug was introduced. As the starting point I used commit f977cd1, which was added a few days before I opened my fork, and I tested every step along the way.

$ git bisect start
$ git bisect bad
$ git bisect good f977cd107243afa382be0f69db27119d0450985f

At each step I built the gem and tested it against my project locally, not from Docker.
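The manual build-and-test loop described above could in principle be handed to `git bisect run`, which marks a commit good when the check exits with 0 and bad when it exits with 1. The following is only a sketch under assumptions: the helper name `bisect_status` and the idea of treating a run that exceeds a time limit as "bad" are illustrative choices, not something used in this thread.

```ruby
require "timeout"

# Hypothetical helper: runs a command and maps the outcome to an exit
# status that `git bisect run` understands (0 = good, 1 = bad).
# A command that does not finish within limit_seconds counts as bad.
def bisect_status(command, limit_seconds)
  pid = Process.spawn(*command, out: File::NULL, err: File::NULL)
  begin
    _, status = Timeout.timeout(limit_seconds) { Process.wait2(pid) }
    status.success? ? 0 : 1
  rescue Timeout::Error
    # The process is presumed hung: kill it and reap it before reporting.
    Process.kill("KILL", pid)
    Process.wait(pid)
    1
  end
end

# When run as a script, the command to check is taken from ARGV.
# The 300-second limit is an assumed value; tune it to the project.
if __FILE__ == $PROGRAM_NAME && !ARGV.empty?
  exit bisect_status(ARGV, 300)
end
```

Saved under a hypothetical name such as `check.rb`, it might be invoked as `git bisect run ruby check.rb <command that runs the generator>`, so that git walks the history without manual `good`/`bad` calls.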

The result was the following:

372875f74838e5d7ab38bc337ec4333b540c13cd is the first bad commit
commit 372875f74838e5d7ab38bc337ec4333b540c13cd
Author: Hunter Haugen <hunter@puppet.com>
Date:   Thu Feb 8 16:28:39 2018 -0800

Use git history to find PRs in a tag instead of time

Fixes #617

Before this change PRs were associated with tags by the merge date of
the PR, regardless of git branching history. In this change the oldest
tag in which the merge commit of each PR occurs in is determined by
using the Github compare API. That association is then used to sort PRs
into their respective release.

This change uses caching of commit history fetching and comparisons for
speedup.

:040000 040000 85ea95fe6eee626debc717ad31c2e56b790bd5bf ace182352a34c7ece0d0f4c76e0e5b068d152b99 M	lib
:040000 040000 7bf68589dad03500e971f992cfa4194824e60177 bd10aac7205bb84900a4a07a7ea3c3a634712e5c M	spec

I haven't had time to look into the commit. Until we find a solution, I will open a fork again, based on the commit before the bug, and cherry-pick the commits from my pull request.

Best Arturo

@ArtieReus ArtieReus changed the title process stucks/hangs and the memory usage increased to 4.6Gib process stucks/hangs and the memory usage increases to 4.6Gib May 8, 2018
@ArtieReus ArtieReus changed the title process stucks/hangs and the memory usage increases to 4.6Gib process stucks/hangs and the memory usage increases up to 4.6Gib May 8, 2018
ArtieReus added a commit to sapcc/elektra that referenced this issue May 8, 2018
The include_body_issue PR was rebased onto master before it was merged in the original repo. Since the fork began, more commits have landed on master, and with them a bug that I reported in this issue: github-changelog-generator/github-changelog-generator#651

Until this is fixed, we should stay on the fork.

 [ci skip]
@ArtieReus
Contributor Author

Any reactions on that?

hunner added a commit to hunner/github-changelog-generator that referenced this issue May 11, 2018
This code is incorrectly spawning one thread per issue rather than one
thread per slice of issues.

Closes github-changelog-generator#651
@hunner
Contributor

hunner commented May 11, 2018

That's not great. I am able to replicate this with bundle exec ./bin/github_changelog_generator --user sapcc --project elektra on the current master (7fe9adb). du -sh says the repo is 29M cloned on-disk.

The repo has 59 closed issues, 199 closed PRs, and 4617 commits.

I ^Cd when the memory usage was ~3G and got this stacktrace:

.../lib/github_changelog_generator/generator/generator_fetcher.rb:46:in `join': Interrupt
	from .../lib/github_changelog_generator/generator/generator_fetcher.rb:46:in `each'
	from .../lib/github_changelog_generator/generator/generator_fetcher.rb:46:in `block in detect_actual_closed_dates'
	from .../lib/github_changelog_generator/generator/generator_fetcher.rb:41:in `each'
	from .../lib/github_changelog_generator/generator/generator_fetcher.rb:41:in `each_slice'
	from .../lib/github_changelog_generator/generator/generator_fetcher.rb:41:in `detect_actual_closed_dates'
	from .../lib/github_changelog_generator/generator/generator.rb:150:in `fetch_issues_and_pr'
	from .../lib/github_changelog_generator/generator/generator.rb:47:in `compound_changelog'
	from .../lib/github_changelog_generator.rb:35:in `run'
	from ./bin/github_changelog_generator:5:in `<main>'

I removed the threading around find_closed_date_by_commit() and it ran with no problems (though every PR before number 143 has no associated merge commit; perhaps a history rebase happened around then). So it must be something about how the closing dates are found by commit when threading is involved.

Oh yeah, apparently

threads = []
issues_slice.each do |issue|
  threads << Thread.new { find_closed_date_by_commit(issue) }
end
threads.each(&:join)
is spawning one thread per issue rather than one thread per batch of issues.
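To make the difference concrete, here is a minimal sketch that counts how many threads each approach creates. It is not the project's code: `find_closed_date_by_commit` is replaced by a stand-in that only marks the issue, the method names `per_issue_threads` and `per_slice_threads` are hypothetical, and the slice size of 30 is an assumed value. The 258 sample items correspond to the 59 closed issues plus 199 closed PRs mentioned above.

```ruby
# Stand-in for the real per-issue lookup; it only records that it ran.
def find_closed_date_by_commit(issue)
  issue[:checked] = true
end

# Pattern from the snippet above: one thread per issue inside each slice.
# Returns the total number of threads that were created.
def per_issue_threads(issues, slice_size)
  spawned = 0
  issues.each_slice(slice_size) do |issues_slice|
    threads = issues_slice.map do |issue|
      spawned += 1
      Thread.new { find_closed_date_by_commit(issue) }
    end
    threads.each(&:join)
  end
  spawned
end

# Alternative: one thread per slice; each thread walks its slice in order.
# Returns the total number of threads that were created.
def per_slice_threads(issues, slice_size)
  threads = issues.each_slice(slice_size).map do |issues_slice|
    Thread.new { issues_slice.each { |issue| find_closed_date_by_commit(issue) } }
  end
  threads.each(&:join)
  threads.size
end

issues = Array.new(258) { |n| { number: n + 1 } }
puts "threads, one per issue: #{per_issue_threads(issues, 30)}"
puts "threads, one per slice: #{per_slice_threads(issues, 30)}"
```

With these assumed numbers the first variant creates one thread for every issue or PR, while the second creates only one per slice. Whether either is worth it depends on how much of the per-issue work is actually network-bound.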

hunner added a commit to hunner/github-changelog-generator that referenced this issue May 11, 2018
The octo_fetcher caches the commits, so removing the threading around
methods requesting commits from the cache reduces CPU time and memory
usage.

Closes github-changelog-generator#651
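The reasoning in the commit message above can be illustrated with a small sketch: once lookups are served from an in-memory cache, a plain sequential loop does the same work without any thread bookkeeping. Everything here is an assumption for illustration only. `CommitCache`, `closed_dates`, and the sample SHAs and dates are made up and do not reflect the actual octo_fetcher API.

```ruby
# Hypothetical cache keyed by commit SHA. The fetcher block runs only on
# a cache miss; repeated lookups of the same SHA are served from the Hash.
class CommitCache
  attr_reader :fetches

  def initialize(&fetcher)
    @fetches = 0
    @store = Hash.new do |hash, sha|
      @fetches += 1
      hash[sha] = fetcher.call(sha)
    end
  end

  def commit(sha)
    @store[sha]
  end
end

# Sequential lookup: no Thread.new and no join, just one pass over the issues.
def closed_dates(issues, cache)
  issues.map { |issue| cache.commit(issue[:sha])[:date] }
end

# Made-up sample data; the fetcher stands in for a remote API call.
cache = CommitCache.new { |sha| { sha: sha, date: "date-of-#{sha}" } }
issues = [{ sha: "abc1" }, { sha: "abc2" }, { sha: "abc1" }]

puts closed_dates(issues, cache).inspect
puts "fetches performed: #{cache.fetches}"
```

The third issue reuses the SHA of the first, so the fetcher is called twice for three lookups.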
@ArtieReus
Contributor Author

Thx for the root cause analysis!

@hunner
Contributor

hunner commented May 15, 2018

@ArtieReus If the fix in #653 fixes the issue for you I'd love to know!

@ArtieReus
Contributor Author

@hunner let me try the fix against my repo tomorrow and I will post the results. Thx

@ArtieReus
Contributor Author

ArtieReus commented May 16, 2018

@hunner I tested the fix in #653 and it works well against the elektra repo. It takes a bit longer than before (where "before" means the state at the end of January), but it runs to completion:

Done!
Generated log placed in ///host/CHANGELOG.md

real	2m21.545s
user	0m0.245s
sys	0m0.096s

Memory usage is around 800 MiB, which is a bit much but OK:
[Screenshot, 2018-05-16 10:48:29]

Thx
