WIP, TST: bump up CPU usage on shippable nodes #11915
Conversation
Doesn't seem to help with ...
The overall ARM CI runs are getting faster here, but the ...
IIRC, ...
Yes indeed, warnings are generally not thread safe; that is also true for ...
Wouldn't it be possible to parallelize in processes rather than threads, and only parallelize whole files or so? (I might be thinking completely wrong here; I didn't try to understand it in depth.)
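For illustration only, a minimal sketch of what per-file, process-based parallelism could look like (this is not what runtests.py does; the file glob and worker count are hypothetical, and it assumes pytest is on the PATH in a NumPy source checkout):

```python
# Hypothetical per-file, process-based test parallelism: each worker process
# runs pytest on one test file, so warning-filter mutations stay isolated
# within a single process rather than racing across threads.
import glob
import subprocess
from multiprocessing import Pool

def run_file(path):
    # Return (file, pytest exit code) for a single test file.
    return path, subprocess.run(["pytest", "-q", path]).returncode

if __name__ == "__main__":
    files = sorted(glob.glob("numpy/**/tests/test_*.py", recursive=True))
    with Pool(processes=4) as pool:
        results = pool.map(run_file, files)
    failed = [path for path, code in results if code != 0]
    print("failed files:", failed or "none")
```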
@tylerjereddy, hmm, is that the only/main error-prone test? Maybe the module filter is particularly bad or so, not sure; it is plausible we do not use it at all. Filtering by module is particularly hacky (the only really bad part IMO), so maybe there is something particularly wrong there, though I'm not sure why or what. How does the threading run the tests anyway?
Boosting up to the full 96 cores also causes one more failure (...). Perhaps the conclusion is that, for now, we don't have much to gain from using more cores on these ARM nodes, and plenty more susceptibility to testing issues. I could see investing time in those testing issues if we were getting a huge speed boost, but it's a little less clear that's justified here.
Well, it might be that the test is just random for some other reason; it is a pretty funny test, creating a fake module and so on, I think... It doesn't sound quite like it is related to just threads manipulating the warning filters anyway, and since you think it is process oriented, I am sure it is; it just sounds so much more sensible ;). So frankly, if this is the only test stopping you, we can look into fixing the test up. The only thing is, I am not quite sure where to start debugging (and don't have much time), so maybe it is not worth the trouble :).
One other question might be -- do we expect building NumPy to scale well with increased parallelism? I've always informally thought it helped when I work on x86_64, but we don't seem to squeeze much of anything out even going from 1 to 40 cores on ARM here. Being able to independently compile modules, instead of following a fairly strict serial order, may be at odds with the way the project organization / dependencies evolved over time.
Is that true? I recently tested parallel test runs for scipy with ... I haven't checked how ...
One thing I just noticed yesterday is that the ...
@rgommers the warnings filters all work based on mutating the global warnings state. EDIT: More importantly, catch_warnings and suppress_warnings replace the global filter list wholesale (and restore it on exit), so changes made concurrently by other threads can get clobbered.
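To make the global-state point concrete, here is a standalone Python sketch (not NumPy code): warnings.catch_warnings() saves the module-level filter state on entry and puts it back on exit, so any filter installed inside the context is visible to, and can be clobbered by, every other thread in the process.

```python
# Demonstrates that warning filters are process-global: a filter installed
# inside catch_warnings() affects warnings raised anywhere in the process,
# and the previous state is restored wholesale when the context exits.
import warnings

def noisy():
    warnings.warn("something happened", UserWarning)

with warnings.catch_warnings():
    warnings.simplefilter("error")   # mutates the global filter state
    try:
        noisy()                      # any code running now sees the filter
    except UserWarning:
        print("raised as an error inside the context")

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    noisy()                          # previous filters restored; plain warning again
    print("recorded", len(caught), "warning(s) in the second context")
```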
Should be fine. From the PR description that introduced parallel builds (gh-5161): "The parallelization is limited to within the files of an extension, so only numpy multiarraymodule really profits..."
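For concreteness, a hedged sketch of how such a parallel build is requested (assuming a NumPy source checkout as the working directory; the job count of 8 is arbitrary, and the NPY_NUM_BUILD_JOBS environment variable is, as far as I know, the other knob numpy.distutils consults):

```python
# Request a parallel numpy.distutils build via the -j/--parallel option added
# in gh-5161.  As quoted above, the parallelism only applies within a single
# extension's file list, so large extensions like multiarray benefit most.
import os
import subprocess

env = dict(os.environ)
# Alternative knob (assumption: numpy.distutils also reads this variable).
env.setdefault("NPY_NUM_BUILD_JOBS", "8")

subprocess.run(["python", "setup.py", "build", "-j", "8"], env=env, check=True)
```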
Compare 3dfa3d2 to 2e4e10b
So, just ... We only get about a minute faster if we resolve that anyway, so maybe it's not a big priority given the much larger queue lengths in comparison.
@tylerjereddy Do you want to keep this open?
I guess I'll close it -- Stefan seems to be redirecting my energy away from CI stuff anyway |
The Shippable team noted that although the pool of nodes is shared, once a node is allocated for a job it is exclusively assigned, so we can use all the cores. Trying to bump up usage here; I'll remove the WIP tag if it helps.
I ran "lscpu" a week or two ago in one of the jobs and the number was quite large -- I can't remember how much of that is physical vs. logical, but it's worth trying to bump up. I think previous testing with small core counts suggested that it can help more for testing than for building, but let's see.
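For reference, a small hedged snippet for checking those counts from inside a job; os.cpu_count() reports logical CPUs only, and psutil (which may well not be installed in the CI image) is one way to get the physical count:

```python
# Report logical vs. physical CPU counts from Python; lscpu shows the same
# information, but this works from inside a test job without shelling out.
import os

print("logical CPUs:", os.cpu_count())

try:
    import psutil  # assumption: psutil may not be available in the CI image
    print("physical cores:", psutil.cpu_count(logical=False))
except ImportError:
    print("psutil not available; physical core count unknown")
```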