TST: improve warnings parallel test safety #12015
Conversation
Not sure if we want a sort of mock test to ensure both code paths get explored. I suspect the Travis coverage job isn't prone to the stochastic failure anyway (it probably isn't using several cores for the test suite), but if we ever have parallel testing + coverage, then in theory we could also see stochastic fluctuations in the covered code path.
numpy/testing/tests/test_utils.py (outdated diff):
    # 0 warnings mean the absence of the
    # dictionary attribute, which seems
    # to have been causing stochastic failures
    # in parallel test scenarios
Yeah, OK, now I get it: with parallel tests __warningregistry__ was never created, which is OK if there are no warnings. The same could be done in _get_fresh_mod, but I think you are right and it is cleaner here. The logic is slightly reversed IMO: no __warningregistry__ means that no warnings occurred.
Will merge in a bit in any case unless someone beats me to it; nice tracking down.
Maybe in case it is not quite clear: what happens is that the same module gets reused, and the first time the module gets a warning, __warningregistry__ is created. In a non-parallel test run, there is apparently some test that always runs first and ensures __warningregistry__ is created (just by creating a warning). In a parallel test run, there is a test that creates no warning and tests for that, so __warningregistry__ does not exist yet.
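For concreteness, here is a minimal sketch of the kind of guard being discussed for assert_warn_len_equal; the signature and the final assertion are simplified assumptions rather than the exact code in this PR:

```python
def assert_warn_len_equal(mod, n_in_context):
    # A module only gains a __warningregistry__ attribute once a warning
    # has actually been raised in it. In a parallel test run no earlier
    # test may have done so, in which case the missing attribute is
    # equivalent to "zero warnings recorded".
    try:
        mod_warns = mod.__warningregistry__
    except AttributeError:
        mod_warns = {}
    assert len(mod_warns) == n_in_context
```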
* previously, stochastic failures were possible when running the NumPy test suite in parallel because of an assumption made by assert_warn_len_equal()
* we now guard against this failure by gracefully handling the testing contexts of both serial and parallel execution of the module
Force-pushed from e7b5f44 to 2f44aff.
Revised to clarify the reason for the previous parallel test failure based on feedback from @seberg. Also added a crude unit test to (hopefully) ensure both code paths are explored irrespective of the serial vs. parallel execution model. Broadly speaking, I suppose one might argue that a unit test which depends on the execution order of other tests in the module, or even within the entire suite, isn't strictly speaking a unit test, but in any case it would be useful not to have stochastic failures in the CI coming from our own end at least.
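A sketch of what such a test could look like, assuming the assert_warn_len_equal helper shown above is defined earlier in the same test module; the class names and warning counts are illustrative assumptions:

```python
def test_warn_len_equal_call_scenarios():
    # Simulate a module that has never raised a warning and therefore
    # has no __warningregistry__ attribute (the parallel-run situation).
    class _FreshModule:
        pass

    assert_warn_len_equal(mod=_FreshModule(), n_in_context=0)

    # Simulate a module that has already raised warnings and therefore
    # carries a populated __warningregistry__ (the serial-run situation).
    class _UsedModule:
        def __init__(self):
            self.__warningregistry__ = {'warning1': 1, 'warning2': 2}

    assert_warn_len_equal(mod=_UsedModule(), n_in_context=2)
```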
I doubt we really need that test to test the testing much, but it can't hurt either. Thanks Tyler!
Related to #12005 (comment) and #11915 (comment)
@seberg seems to be familiar with some of the details -- in particular, it seems to really just be this one test, test_suppress_warnings_module, that is process- and/or thread-unstable lately, apparently because of a call to assert_warn_len_equal.
I've created a gist with a script that repeats the problematic test module on 4 cores 20 times and parses the output statistics.
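The gist itself is not reproduced here, but a minimal sketch of that kind of repeat-runner (the test module path, worker count, repeat count, and use of pytest-xdist are all assumptions) might look like:

```python
import subprocess
from collections import Counter

# Repeatedly run the problematic test module under pytest-xdist with 4
# workers and tally pass/fail outcomes to expose stochastic failures.
TEST_MODULE = "numpy/testing/tests/test_utils.py"
N_REPEATS = 20

results = Counter()
for _ in range(N_REPEATS):
    proc = subprocess.run(
        ["python", "-m", "pytest", "-n", "4", "-q", TEST_MODULE],
        capture_output=True, text=True)
    results["pass" if proc.returncode == 0 else "fail"] += 1

print(dict(results))
```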
On a clean build of latest master (6f0a0fa) for me locally, python warning_probe.py:

Same script with this branch:

That looks promising, I suppose?