-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
TST Relax test_minibatch_sensible_reassign
to avoid CI failures with single global random seed
#29278
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TST Relax test_minibatch_sensible_reassign
to avoid CI failures with single global random seed
#29278
Conversation
Some observations
|
For all the failing builds the number of clusters is 10, so using > 9 may be a quick fix:
|
test_minibatch_sensible_reassign
test_minibatch_sensible_reassign
I tested all the random seeds in the CI and the only problematic seed on all four failing builds is free-threaded fails with Haskell (still can't reproduce locally though with same CPU architecture ...), the two other failing builds using OpenBLAS are using SkyLakeX, the MKL one we don't know ( |
test_minibatch_sensible_reassign
test_minibatch_sensible_reassign
Relaxing the number of clusters check to be |
test_minibatch_sensible_reassign
to avoid CI failures with single global random seed
@@ -437,21 +437,24 @@ def test_minibatch_sensible_reassign(global_random_seed): | |||
n_clusters=20, batch_size=10, random_state=global_random_seed, init="random" | |||
).fit(zeroed_X) | |||
# there should not be too many exact zero cluster centers | |||
assert km.cluster_centers_.any(axis=1).sum() > 10 | |||
num_non_zero_clusters = km.cluster_centers_.any(axis=1).sum() | |||
assert num_non_zero_clusters > 9, f"{num_non_zero_clusters=} is too small" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added assertion string (i.e. second argument in the assert, not sure if there is a more exact name) because it seems like pytest assertion rewriting is a bit broken (needs a bit of investigation as to why). If this ever fails again, at least the message will tell how far we are from the threshold
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Ideally the test should be reworked with assertion where we know in advance what we really expect, but let's go with this quick tweak for now.
…h single global random seed (scikit-learn#29278)
…h single global random seed (scikit-learn#29278)
…h single global random seed (#29278)
Close #29253.
Summarising my comments below, relaxing the check to be
> 9
instead of> 10
makes the CI pass on all random seeds, see Azure logs on PR commit 162684a. I have not been able to reproduce the issue locally.This issue has been seen in multiple CI builds from time to time e.g. #27967 (comment) or #26802 (comment)