tests/extmod/select_poll_eintr: Skip unreliable test in Github CI. #17745
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## master #17745 +/- ##
=======================================
Coverage 98.41% 98.41%
=======================================
Files 171 171
Lines 22210 22210
=======================================
Hits 21857 21857
Misses 353 353
extmod/select_poll_eintr.py is a constant source of spurious failures in Github CI. This PR adds it to the list of tests skipped in that environment in order to improve the test suite's false positive rate and positive predictive value in detecting defects.
Signed-off-by: Anson Mansfield <amansfield@mantaro.com>
28ea4a6 to 26d9bf2
Thanks for the very detailed analysis! I should have been clearer that this is intended to be fixed (with a workaround for the true bug) by #17655.
Oh, I think that did actually come up in the search I did, guess I should've read further.
And that PR has just been merged, so CI should be a lot happier now.
Ty! This test is the main reason I didn't notice the other rv32 test failures when I was originally reviewing #17716: I had become so conditioned to seeing one or two failures in every CI run that I stopped looking closely at them.
This still seems to be flaky on macOS in my experience.
Can you point to a few recent runs where it fails? Or is it only locally on your machine?
Sorry, this is locally on my machine building through the Nix package manager. Just wanted to comment as an FYI as I was looking for other reports of similar failures. I simply disabled the test on our end.
OK, thanks, good to know. The actual bug that leads to this unreliability is described in #11604. Hopefully it will be fixed one day!
Summary
extmod/select_poll_eintr.py is a constant source of spurious failures in Github CI.
This PR adds it to the list of tests skipped when running on Github CI, to help reduce the overall false positive rate and improve the predictive value of the test failure indication.
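For readers unfamiliar with environment-gated skips, the idea can be sketched roughly as follows. This is only an illustration, not the actual change in this PR: the names CI_ONLY_SKIPS, build_skip_set and select_tests are hypothetical, and the only assumption about the environment is that GitHub Actions sets GITHUB_ACTIONS=true in its runners.

```python
import os

# Hypothetical skip list: only this entry is taken from the PR description.
CI_ONLY_SKIPS = {"extmod/select_poll_eintr.py"}


def build_skip_set(environ=None):
    """Return the set of tests to skip for the given environment mapping."""
    env = os.environ if environ is None else environ
    skip = set()
    # GitHub Actions runners export GITHUB_ACTIONS=true.
    if env.get("GITHUB_ACTIONS") == "true":
        skip |= CI_ONLY_SKIPS
    return skip


def select_tests(all_tests, environ=None):
    """Filter out skipped tests while preserving the original order."""
    skip = build_skip_set(environ)
    return [t for t in all_tests if t not in skip]


if __name__ == "__main__":
    tests = ["extmod/select_poll_eintr.py", "basics/slice_optimse.py"]
    print(select_tests(tests, {"GITHUB_ACTIONS": "true"}))
    print(select_tests(tests, {}))
```

Gating on the environment rather than removing the test keeps it available for local runs, where a spurious failure costs less than it does in CI.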
Testing
I examined a sample of the last 25 failed Github Actions runs, tabulated their causes, and calculated relevant confusion matrix statistics over the results to determine that there is in fact adequate statistical evidence to support my original anecdotal experience with extmod/select_poll_eintr.py being problematic.
stackless_clang
extmod/select_poll_eintr.py
settrace_stackless
extmod/select_poll_eintr.py
settrace_stackless
extmod/select_poll_eintr.py
stackless_clang
macos
10 other jobs
extmod/select_poll_eintr.py
extmod/select_poll_eintr.py basics/slice_optimse.py
basics/slice_optimse.py
standard_v2
longlong
extmod/select_poll_eintr.py
(many failures)
standard_v2
extmod/select_poll_eintr.py
standard_v2
settrace_stackless
extmod/select_poll_eintr.py
extmod/select_poll_eintr.py
(Note that reproducible was excluded from tabulation as it doesn't run extmod/select_poll_eintr.py.)
20 of these 25 examined runs include extmod/select_poll_eintr.py as a failure, compared to only 6 runs that include any other kind of failure.
As far as I can tell, none of these failures have anything to do with changes made to the select module in the triggering branch, making all but the one run that also included another failure false positives.
Over the same sample period, there were a total of 9 passing unix runs. Under the assumption that all 6 non-extmod/select_poll_eintr.py failed runs are true positives and that all 9 of these passing runs are true negatives, that gives the test suite with extmod/select_poll_eintr.py included a false positive rate of 67.8%, a positive predictive value of only 24%, and an F1 score of 0.387. These values support the conclusion that the rate of spurious failures is excessive, and that the usefulness of the CI failure indicator is diluted as a result.
Considering extmod/select_poll_eintr.py individually, this test has a per-job false positive rate of 5.5% and a per-run false positive rate of 60.6%. This supports the conclusion that the weak predictive value of the test suite is largely attributable to this test.
Overall, the sample I examined supports the conclusion that extmod/select_poll_eintr.py is problematic and should be excluded from Github CI runs going forward.
Statistics Code, for anyone who cares to check my math:
Output:
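As a cross-check on the run-level figures in the Testing section, here is a minimal sketch that recomputes them from the counts quoted there. It is not the author's original statistics code. It assumes, as the description does, that the 6 runs with a non-select_poll_eintr failure are true positives and the 9 passing runs are true negatives, and additionally assumes there are no false negatives. The per-job and per-run figures for the individual test are not recomputed, since the job counts behind them are not given here.

```python
# Counts quoted in the Testing section.
failed_runs = 25          # failed runs examined
runs_with_eintr = 20      # failed runs that include select_poll_eintr
runs_with_other = 6       # failed runs that include any other failure
passing_runs = 9          # passing unix runs over the same period

# Runs counted in both groups (inclusion-exclusion).
overlap = runs_with_eintr + runs_with_other - failed_runs

# Assumed run-level confusion matrix.
tp = runs_with_other                # runs with a genuine failure
fp = failed_runs - runs_with_other  # runs that failed only on select_poll_eintr
tn = passing_runs
fn = 0                              # assumption: no real defect went undetected

fpr = fp / (fp + tn)                # false positive rate
ppv = tp / (tp + fp)                # positive predictive value (precision)
recall = tp / (tp + fn)
f1 = 2 * ppv * recall / (ppv + recall)

print(f"overlap={overlap} tp={tp} fp={fp} tn={tn} fn={fn}")
print(f"FPR={fpr:.4f} PPV={ppv:.4f} F1={f1:.4f}")
```

Under these assumptions the false positive rate is 19/28 (about 0.6786), the positive predictive value is 6/25 = 0.24, and the F1 score is about 0.387, which agrees with the figures quoted in the description up to rounding.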