
Make test_stem less flaky. #16735


Merged 1 commit on Mar 13, 2020

Conversation

QuLogic (Member) commented Mar 11, 2020

PR Summary

Since parametrizing the test allows its cases to run in parallel, the test becomes flaky: one process can overwrite the result image of another.

Our standard way of dealing with tests that share a baseline image is to pass duplicate filenames to image_comparison, because those comparisons are serialized.

This is basically the same as #16656, applied to test_stem.
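For illustration, a minimal sketch of the pattern (the exact diff is in the PR; the parametrized argument, test body, and filenames here are hypothetical, written against the matplotlib 3.2-era API where `use_line_collection` existed):

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.testing.decorators import image_comparison

# Before (sketch): parametrization lets pytest-xdist run the two cases in
# parallel, and both write a result image named after the same baseline.
#
#     @pytest.mark.parametrize('use_line_collection', [True, False])
#     @image_comparison(['test_stem.png'], style='mpl20')
#     def test_stem(use_line_collection): ...

# After (sketch): one test, so the cases run serially; the baseline name
# is repeated once per generated figure, since image_comparison compares
# figures to baseline images in order.
@image_comparison(['test_stem.png', 'test_stem.png'], style='mpl20')
def test_stem():
    x = np.linspace(0.1, 2 * np.pi, 100)
    for use_line_collection in [True, False]:
        fig, ax = plt.subplots()
        ax.stem(x, np.cos(x), use_line_collection=use_line_collection)
```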

PR Checklist

  • Has pytest-style unit tests
  • Code is Flake8 compliant
  • [N/A] New features are documented, with examples if plot related
  • [N/A] Documentation is sphinx and numpydoc compliant
  • [N/A] Added an entry to doc/users/next_whats_new/ if major new feature (follow instructions in README.rst there)
  • [N/A] Documented in doc/api/api_changes.rst if API changed in a backward-incompatible way

QuLogic added this to the v3.2.1 milestone on Mar 11, 2020
timhoffm (Member)

Would this be fixed by #16693?

QuLogic (Member, Author) commented Mar 11, 2020

No, that's only for check_figures_equal, not image_comparison. And in that case the filename doesn't actually matter, since it doesn't correspond to checked-in files, so it can be changed at will.
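For context, `check_figures_equal` renders two figures built inside the test and diffs them against each other, so its output filenames are scratch space rather than checked-in baselines; a minimal sketch (the test body is hypothetical):

```python
from matplotlib.testing.decorators import check_figures_equal

# The two figures are compared against each other; the files written to
# disk are temporary results, not checked-in baseline images, so they
# can be renamed freely.
@check_figures_equal(extensions=['png'])
def test_stem_defaults(fig_test, fig_ref):
    fig_test.subplots().stem([1, 2, 3], bottom=0)  # explicit default
    fig_ref.subplots().stem([1, 2, 3])
```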

timhoffm (Member) commented Mar 11, 2020

Sorry about the confusion with check_figures_equal.

The approach here unrolls the parametrization into a helper loop. Exchanging the declarative parametrization for code logic makes it less readable. But what bothers me more is that pytest will no longer report which of the parameters failed.

I would prefer to solve this in the test infrastructure if possible. I searched briefly for a way to disable parallelization for individual tests, but didn't find anything. Could we set a lock based on the filename in image_comparison? That would prevent simultaneous access to the result file. AFAICS a passing test could still overwrite a failing test's result image, so you would see the failure but might not have the offending image anymore; but that's no different from the manual solution proposed here.

Alternatively/additionally, could image_comparison carry a counter and number the result images stem.png, stem1.png, ...?
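As a rough sketch of that renaming idea (not actual matplotlib code): a plain per-process counter would not distinguish pytest-xdist worker processes, which run in parallel, so this hypothetical helper also tags the name with the worker id that pytest-xdist exposes via the `PYTEST_XDIST_WORKER` environment variable:

```python
import itertools
import os

_counters = {}

def _unique_result_name(baseline):
    # Hypothetical helper: number repeated uses of a baseline name within
    # this process (stem.png, stem1.png, ...) and append the pytest-xdist
    # worker id (e.g. 'gw0') so separate worker processes never collide.
    stem, _, ext = baseline.rpartition('.')
    count = next(_counters.setdefault(baseline, itertools.count()))
    worker = os.environ.get('PYTEST_XDIST_WORKER', '')
    name = stem + (str(count) if count else '') + ('-' + worker if worker else '')
    return f'{name}.{ext}'
```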

QuLogic (Member, Author) commented Mar 12, 2020

We could use a lock; Cartopy uses filelock for this purpose. But I think that might be too large a change for 3.2.x, so how about we merge this (I can retarget it directly if needed) and try the lock out for 3.3.0?
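For reference, a minimal sketch of that lock-based idea, assuming the third-party `filelock` package; the helper name and call site are hypothetical, not matplotlib's actual machinery:

```python
from filelock import FileLock  # third-party package, as used by Cartopy

def _compare_serialized(fig, result_path, compare):
    # Hypothetical wrapper: hold a per-file lock while writing the result
    # image and comparing it, so parallel pytest-xdist workers cannot
    # overwrite each other's output for the same baseline name.
    with FileLock(str(result_path) + '.lock'):
        fig.savefig(result_path)
        compare(result_path)
```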

timhoffm (Member)

I don't see a problem backporting a lock to 3.2.1; it's an internal bug fix. But if you want to backport this explicit fix instead, I'm also fine with that.

QuLogic (Member, Author) commented Mar 12, 2020

It's not just a lock, but also a new dependency. And 3.2.1 is scheduled for Friday; that would take some time to test, I think.

tacaswell merged commit 4160365 into matplotlib:master on Mar 13, 2020
meeseeksmachine pushed a commit to meeseeksmachine/matplotlib that referenced this pull request on Mar 13, 2020

QuLogic added a commit that referenced this pull request on Mar 13, 2020: Backport PR #16735 on branch v3.2.x (Make test_stem less flaky.) (branch …735-on-v3.2.x)
QuLogic deleted the flaky-test_stem branch on March 14, 2020 at 06:24