Make image_comparison more pytest-y #8380

QuLogic · 2017-03-26T20:08:21Z

I'm a bit tired of image_comparison breaking pytest subtly. It was written in a way that mirrors nose, with the test function run as "setup" and the "test" itself just saving and comparing the results. This has a number of disadvantages.

However, switching to a decorated function had a problem: we couldn't change the signature of the decorated function, or pytest would get confused about where to send the fixtures. So we couldn't, for example, parametrize based on extension. The trick I've found is that the test doesn't need to take a fixture if one of its fixtures does. So I hacked this together by having the fixture send the parameters back to the wrapper via the decorated function. Now, image_comparison is simply another decorator as far as pytest is concerned (because it doesn't modify the function signature).
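The hand-off described above can be sketched without pytest at all. In this minimal sketch, `image_comparison_sketch`, `fake_fixture`, and `draw_plot` are hypothetical stand-ins, not matplotlib's code; only the idea of stashing a `parameters` attribute on the wrapped function is taken from this PR.

```python
import functools
import inspect


def image_comparison_sketch(func):
    """Hypothetical stand-in for the pytest path of image_comparison."""
    @functools.wraps(func)  # copies metadata and sets wrapper.__wrapped__ = func
    def wrapper(*args, **kwargs):
        # Read what the fixture stashed on the original function.
        baseline_images, extension = func.parameters
        func(*args, **kwargs)  # figure creation now happens in the test phase
        # A real wrapper would save and compare images; here we only build names.
        return [f"{name}.{extension}" for name in baseline_images]
    return wrapper


def fake_fixture(test_function, baseline_images, extension):
    """Mimics the fixture: annotate the function behind the wrapper."""
    test_function.__wrapped__.parameters = (baseline_images, extension)


@image_comparison_sketch
def draw_plot():
    pass  # a real test would draw a figure here


fake_fixture(draw_plot, ["plot_a", "plot_b"], "png")
expected_files = draw_plot()

# The wrapper reports the same signature as the undecorated function.
same_signature = (
    inspect.signature(draw_plot) == inspect.signature(draw_plot.__wrapped__)
)
```

Because the wrapper never adds arguments of its own, anything that inspects the signature sees only what the original function asked for.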

This method has a number of advantages:

- Test code is actually run as the test phase, not the setup phase. This makes more sense semantically:
  - Problems in test code are failures instead of errors.
  - Stuff like "Use some more pytest plugins: warnings & rerunfailures" (#8346) will work correctly now, because it assumes test code is actually in the test phase and doesn't try to capture warnings from the setup phase.
- No signature changes means we can use any fixture we want on an image test. (Marks already worked, but now we can ask for, e.g., monkeypatch or something else as an argument.)
- Image comparisons can be parametrized. At the moment, I've only set up baseline_images as a possible fixture, but this could be extended. This can be seen being used in the mathtext tests in the last commit.

There's only one obvious (to me) disadvantage:

Previously, the test phase was just comparisons, so we could parametrize this by baseline image name. Now with the test phase as figure creation and comparison, we can't parametrize on image name. Since we already parametrize on extension, I think this is a small price to pay.

phobson · 2017-03-28T16:31:24Z

This is impressive work! It's also a bit over my head.

Would these changes render pytest-mpl obsolete, or does the image_comparison remain pretty low-level under this code?

QuLogic · 2017-03-29T02:54:24Z

I did not really set out to do so, so I can't say that definitively. And realistically, our decorator had mostly (modulo the minor issues fixed here) always worked and probably will continue to work just fine. It's really a matter of whether we want to continue developing it or not.

The change does look a bit bigger than it really is; what it boils down to is splitting one class that does everything into one class that does just comparison (so some stuff got moved up), one class that works just with nose (some stuff got moved down), and one wrapper function that works just with pytest (a bunch of stuff got deleted). And then tweaking the original decorator so it calls the right one.
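As a rough illustration of that split, here is a toy sketch that compares numbers instead of images. All names (`ComparisonBase`, `NoseComparison`, `pytest_wrapper`, `comparison_decorator`) are hypothetical, and the real classes do considerably more; the point is only the shape: a shared comparison class, a generator-style nose subclass, a plain wrapper for pytest, and a decorator that picks between them.

```python
import functools


class ComparisonBase:
    """Hypothetical analogue of the comparison-only class."""

    def __init__(self, tol):
        self.tol = tol

    def compare(self, actual, expected):
        # Stand-in for an image comparison: numbers instead of image files.
        if abs(actual - expected) > self.tol:
            raise AssertionError(f"{actual} differs from {expected}")


class NoseComparison(ComparisonBase):
    """Hypothetical analogue of the nose-only subclass (generator style)."""

    def generate(self, func, expected):
        actual = func()  # runs while the generator is consumed, before the check
        yield self.compare, actual, expected  # the yielded call is the "test"


def pytest_wrapper(func, tol, expected):
    """Hypothetical analogue of the pytest-only wrapper function."""
    checker = ComparisonBase(tol)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Creation and comparison both happen inside the test call.
        checker.compare(func(*args, **kwargs), expected)
    return wrapper


def comparison_decorator(expected, tol=0.0, use_nose=False):
    """Picks the framework-specific path, like the tweaked decorator."""
    def decorator(func):
        if use_nose:
            return lambda: NoseComparison(tol).generate(func, expected)
        return pytest_wrapper(func, tol, expected)
    return decorator


@comparison_decorator(expected=1.0, tol=0.1)
def make_value():
    return 1.05


make_value()  # passes: 1.05 is within 0.1 of 1.0
```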

dopplershift · 2017-03-29T16:26:58Z

Well, pytest-mpl uses our image_comparison code, so it won't obsolete it. At the very least, pytest-mpl provides an actual pytest plugin, with things like extra command-line arguments.

What we need to be careful about is making sure that this doesn't break pytest-mpl.

QuLogic · 2017-03-29T22:20:56Z

It looks like pytest-mpl uses ImageComparisonTest, which was already renamed on master at some point (not by me, so I don't know when.) I think this PR is actually useful for pytest-mpl because it splits the comparison part from the decorator part (which it doesn't use at all.) I tried to keep the external API the same from before this PR, if you don't feel like changing it (but as I said, it's already been broken.)

dopplershift · 2017-03-29T22:49:44Z

Wow, that was really hard to find. #7097

QuLogic · 2017-03-30T01:09:04Z

I added a rename back to the old name; I don't think it should be a problem, but we don't use it anymore, so it might be useful to try it out.

Instead of a heavy do-it-all class, split ImageComparisonDecorator into a smaller class that does just the comparison stuff and one that does only nose. For pytest, use a wrapper function that's decorated only by pytest decorators, and don't try to modify the function signature. By using a separate fixture, we can indirectly return the parameterized arguments instead. This stops pytest from getting confused about what takes what argument. The biggest benefit is that test code is now run as the *test*, whereas previously, it was run as the *setup* causing it to have all sorts of semantic irregularities.

QuLogic · 2017-04-15T06:50:23Z

I've updated this with:

- a rebase,
- restoring ImageComparisonTest and ImageComparisonTest.remove_text, which should allow pytest-mpl to continue working (@dopplershift: it would be nice if pytest-mpl used the remove_ticks_and_titles function directly, though), and
- a test of the nose version of image_comparison, which should ensure it continues to work for downstream users (assuming I've got the test written up correctly).

QuLogic · 2017-04-15T21:15:21Z

Looks like even indirectly, we don't install nose; should I install it for this one test?

anntzer · 2017-04-15T21:37:53Z

lib/matplotlib/testing/conftest.py

+    # pytest won't get confused.
+    # We annotate the decorated function with any parameters captured by this
+    # fixture so that they can be used by the wrapper in image_comparison.
+    baseline_images = request.keywords['baseline_images'].args[0]


Looks like from a quick skim through the codebase that baseline_images is always a list with a single element.

Should we switch to having a single baseline_image (singular)? Perhaps the ability to compare multiple images is useful, though.

If we want to assert that baseline_images is always of length 1, I prefer

baseline_image, = request.keywords["baseline_images"].args

which will error when multiple values are passed.
If we don't, then we shouldn't silently drop extra values here.

The [0] refers to the fact that I passed the original baseline_images as the first argument of the marker. There are multiple tests that do use more than one baseline image.

So request.keywords['baseline_images'] contains all args, not just baseline_images? If so that's probably worth a comment.

request.keywords['baseline_images'] contains the pytest marker; .args are the positional arguments passed to the marker, .kwargs are the keyword arguments passed to the marker, etc.

Perhaps I should have given the marker a different name...
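To make the distinction in this thread concrete, here is a small sketch using a hypothetical `Marker` namedtuple in place of the real pytest marker object (an assumption made so the snippet runs without pytest). It shows that `[0]` picks the first marker argument, which is the whole list of baseline images, and how single-element unpacking behaves.

```python
from collections import namedtuple

# Hypothetical stand-in for the marker object stored in request.keywords.
Marker = namedtuple("Marker", ["args", "kwargs"])

# Analogue of a marker created with one positional argument: the list of names.
marker = Marker(args=(["plot_a", "plot_b"],), kwargs={})

# [0] selects the first marker argument, i.e. the whole list; nothing is dropped.
baseline_images = marker.args[0]

# Single-element unpacking asserts that there is exactly one marker argument.
(only_argument,) = marker.args

# Applied to the list itself, the same idiom rejects multiple baseline images.
try:
    (single_image,) = baseline_images
    unpack_error = None
except ValueError as exc:
    unpack_error = str(exc)
```

With two names in the list, the last unpacking raises `ValueError`, which is the "will error when multiple values are passed" behaviour mentioned above.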

anntzer · 2017-04-15T21:38:55Z

lib/matplotlib/testing/conftest.py

+
+    func = request.function
+    func.__wrapped__.parameters = (baseline_images, extension)
+    yield


I think the typical pattern is try: yield finally: ... (even though it probably doesn't matter here)

The one above uses it, so I'll change this one too to be consistent.
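For reference, the try/yield/finally shape can be sketched as a plain generator driven by hand, roughly the way a test runner would drive a yield fixture. `parameters_fixture` and `sample_test` are hypothetical names, and the cleanup step (deleting the attribute) is an assumption about what teardown might do, not a quote from the diff.

```python
def parameters_fixture(func, parameters):
    """Hypothetical generator in the shape of a pytest yield fixture."""
    func.parameters = parameters  # setup: annotate the function
    try:
        yield
    finally:
        del func.parameters  # teardown: runs on normal resume and on close()


def sample_test():
    pass


# Normal path: the runner resumes the generator after the test finishes.
gen = parameters_fixture(sample_test, (["plot_a"], "png"))
next(gen)
during_test = sample_test.parameters
next(gen, None)  # resume past the yield; the generator finishes
cleaned_after_resume = not hasattr(sample_test, "parameters")

# Abandoned path: closing the generator raises GeneratorExit at the yield.
gen = parameters_fixture(sample_test, (["plot_a"], "png"))
next(gen)
gen.close()
cleaned_after_close = not hasattr(sample_test, "parameters")
```

In the normal path the code after `yield` would run even without `finally`; the `finally` only matters when the generator is not resumed normally, which is why the pattern is mostly about consistency here.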

anntzer · 2017-04-15T21:50:32Z

The naming is slightly confusing. If I understand correctly, _ImageComparisonBase is the class now used by pytest and ImageComparisonTest (which inherits it) is used by nose. If so, perhaps rename the first one to ImageComparison and the second to NoseImageComparison?

Looks reasonable enough to install nose to test nose compatibility.

QuLogic · 2017-04-15T21:53:09Z

Can't change the name of the nose one for backwards compatibility. I don't really want to make _ImageComparisonBase public for the same reason.

anntzer · 2017-04-15T21:56:22Z

That's reasonable, but can you leave comments to that effect then? Again, it took me a while to understand the reason for the split.

QuLogic · 2017-04-15T22:21:04Z

Added some comments and handled some of the other stuff you mentioned.

anntzer · 2017-04-15T22:43:05Z

lib/matplotlib/testing/decorators.py

+    Nose-based image comparison class
+
+    This class generates tests for a nose-based testing framework. Ideally,
+    this class would not be public, and the only publically visibile API would


typo: visible.

anntzer · 2017-04-15T22:43:22Z

lib/matplotlib/testing/decorators.py

+    This function creates a decorator that wraps a figure-generating function
+    with image comparison code. Pytest can become confused if we change the
+    signature of the function, so we indirectly pass anything we need via the
+    ``mpl_image_comparison_parameters`` fixture and extra markers.


single backquotes (it's a reference)

anntzer · 2017-04-15T22:48:02Z

LGTM modulo typo and conditional on tests passing as usual :-)

QuLogic · 2017-04-16T19:47:10Z

Even codecov is happy!

QuLogic added the topic: testing label Mar 26, 2017

QuLogic added this to the 2.1 (next point release) milestone Mar 26, 2017

QuLogic force-pushed the pytest-image_comparison branch from cd02a49 to 0913080 Compare March 27, 2017 03:53

QuLogic added 8 commits April 15, 2017 02:45

TST: Use same default style in the pytest fixture.

14b5d17

TST: Allow baseline_images to be a fixture.

f736296

The main point is it can be indirectly determined from other parametrizations.

TST: Properly parametrize the last mathtext tests.

3810c2a

Restore ImageComparisonTest and its static methods.

878dfaa

Raise consistent shape error out of save_diff_image.

fd8e844

A similar check is done in calculate_rms.

Stop handling GeneratorExit in nose image comparison.

75bebef

While this does cause a periodic printout if using the image_comparison decorator from nose, trying to handle this exception means that test images don't get checked at all, which is even worse.

Add tests for nose version of image_comparison.

ba7106a

This should ensure that it continues to work for downstream users even though we've stopped using it.

QuLogic force-pushed the pytest-image_comparison branch from 0ab7f4b to ba7106a Compare April 15, 2017 06:47

QuLogic mentioned this pull request Apr 15, 2017

Use some more pytest plugins: warnings & rerunfailures #8346

Merged

anntzer requested changes Apr 15, 2017

QuLogic mentioned this pull request Apr 15, 2017

Fix incorrect text line spacing. #8495

Merged

anntzer reviewed Apr 15, 2017

anntzer approved these changes Apr 15, 2017

anntzer changed the title ~~Make image_comparison more pytest-y~~ [MRG+1] Make image_comparison more pytest-y Apr 15, 2017

QuLogic added 2 commits April 15, 2017 18:48

Handle comments from PR.

c186408

* Install nose in one build.
* Add docstrings and comments on new functions/classes to clarify what they do.
* Use consistent `yield` pattern for fixture.

Fix nose image comparison test on Python 2.

d1c1e1b

QuLogic force-pushed the pytest-image_comparison branch from 8fc25dc to d1c1e1b Compare April 15, 2017 22:49

tacaswell merged commit 9bcadae into matplotlib:master Apr 16, 2017

QuLogic deleted the pytest-image_comparison branch April 17, 2017 05:01

QuLogic changed the title ~~[MRG+1] Make image_comparison more pytest-y~~ Make image_comparison more pytest-y Apr 17, 2017
