
The example of the testing decorator does not work. #18168


Closed
SmirnGreg opened this issue Aug 4, 2020 · 16 comments · Fixed by #21256

Comments

SmirnGreg commented Aug 4, 2020

Bug report

Bug summary

The example of the testing decorator does not work.
https://matplotlib.org/3.2.1/devel/testing.html#writing-an-image-comparison-test

Code for reproduction

from matplotlib.testing.decorators import image_comparison
import matplotlib.pyplot as plt

@image_comparison(baseline_images=['line_dashes'], remove_text=True,
                  extensions=['png'])
def test_line_dashes():
    fig, ax = plt.subplots()
    ax.plot(range(10), linestyle=(0, (3, 3)), lw=5)

Run with:

pytest

Actual outcome

=================================== test session starts ===================================
platform linux -- Python 3.7.6, pytest-5.4.3, py-1.8.2, pluggy-0.13.1
rootdir: /home/smirnov/langlearn/mpltest
plugins: openfiles-0.5.0, astropy-header-0.1.2, arraydiff-0.3, hypothesis-5.16.1, remotedata-0.3.2, doctestplus-0.7.0
collected 0 items / 1 error

========================================= ERRORS ==========================================
______________________________ ERROR collecting test_mpl.py _______________________________
In test_line_dashes: function uses no argument 'extension'
==================================== warnings summary =====================================
/home/smirnov/anaconda3/lib/python3.7/site-packages/matplotlib/testing/decorators.py:242
  /home/smirnov/anaconda3/lib/python3.7/site-packages/matplotlib/testing/decorators.py:242: PytestUnknownMarkWarning: Unknown pytest.mark.baseline_images - is this a typo?  You can register custom marks to avoid this warning - for details, see https://docs.pytest.org/en/latest/mark.html
    @pytest.mark.baseline_images(baseline_images)

/home/smirnov/anaconda3/lib/python3.7/site-packages/matplotlib/testing/decorators.py:244
  /home/smirnov/anaconda3/lib/python3.7/site-packages/matplotlib/testing/decorators.py:244: PytestUnknownMarkWarning: Unknown pytest.mark.style - is this a typo?  You can register custom marks to avoid this warning - for details, see https://docs.pytest.org/en/latest/mark.html
    @pytest.mark.style(style)

-- Docs: https://docs.pytest.org/en/latest/warnings.html
================================= short test summary info =================================
ERROR test_mpl.py
!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!!!!!!!!
============================== 2 warnings, 1 error in 0.39s ===============================

Expected outcome

Matplotlib version

  • Operating system: Windows 10 (native and WSL2)
  • Matplotlib version: 3.1.3
  • Matplotlib backend (print(matplotlib.get_backend())): not applicable
  • Python version: Python 3.7.7 (default, May 6 2020, 11:45:54) [MSC v.1916 64 bit (AMD64)]
  • Jupyter version (if applicable): not applicable
  • Other libraries: pytest 5.4.3

Installed via pip install <package that depends on matplotlib and has it in requirements> within a new Conda environment.

I want a way to test whether the figures generated by my code look the same as (or similar to) figures generated earlier.

QuLogic (Member) commented Aug 4, 2020

Matplotlib version: 3.1.3
Other libraries: pytest 5.4.3

You need to update Matplotlib, or downgrade pytest.

SmirnGreg (Author) commented Aug 6, 2020

@QuLogic
Thanks, with mpl 3.3.0 it works fine; with 3.2 it does not.

QuLogic added the Community support label Aug 6, 2020
SmirnGreg (Author) commented

I still have an issue with testing. Tests work fine when I run them one by one, but when I run pytest on a whole folder, each subsequent test still sees the previously opened figures and tries to compare all of them with the saved baseline image.

Should I explicitly close all the open figures at the beginning of each test with plt.close('all')? This works...
Input:

from matplotlib.testing.decorators import image_comparison
from matplotlib import pyplot as plt

@image_comparison(baseline_images=['line_dashes1'], remove_text=True, extensions=['png'])
def test_line_dashes1():
    fig, ax = plt.subplots()
    ax.plot(range(10), linestyle=(0, (3, 3)), lw=5)

@image_comparison(baseline_images=['line_dashes2'], remove_text=True, extensions=['png'])
def test_line_dashes2():
    fig, ax = plt.subplots()
    ax.plot(range(10), linestyle=(0, (3, 3)), lw=5)

Output of pytest on this file:

============================= test session starts =============================
platform win32 -- Python 3.7.7, pytest-5.4.3, py-1.9.0, pluggy-0.13.1 -- C:\Users\smirn\.conda\envs\gofish\python.exe
cachedir: .pytest_cache
rootdir: D:\astrowork\prodige\imager
collecting ... collected 2 items

test_mpl_sanity.py::test_line_dashes1[png] PASSED                        [ 50%]
test_mpl_sanity.py::test_line_dashes2[png] FAILED                        [100%]
test\test_mpl_sanity.py:8 (test_line_dashes2[png])
extension = 'png'
request = <FixtureRequest for <Function test_line_dashes2[png]>>, args = ()
kwargs = {}, __tracebackhide__ = True
img = <matplotlib.testing.decorators._ImageComparisonBase object at 0x000001BD9B0F0BC8>
needs_lock = False, our_baseline_images = ['line_dashes2']

    @functools.wraps(func)
    @pytest.mark.parametrize('extension', extensions)
    @pytest.mark.style(style)
    @_checked_on_freetype_version(freetype_version)
    @functools.wraps(func)
    def wrapper(*args, extension, request, **kwargs):
        __tracebackhide__ = True
        if 'extension' in old_sig.parameters:
            kwargs['extension'] = extension
        if 'request' in old_sig.parameters:
            kwargs['request'] = request
    
        img = _ImageComparisonBase(func, tol=tol, remove_text=remove_text,
                                   savefig_kwargs=savefig_kwargs)
        matplotlib.testing.set_font_settings_for_testing()
        func(*args, **kwargs)
    
        # If the test is parametrized in any way other than applied via
        # this decorator, then we need to use a lock to prevent two
        # processes from touching the same output file.
        needs_lock = any(
            marker.args[0] != 'extension'
            for marker in request.node.iter_markers('parametrize'))
    
        if baseline_images is not None:
            our_baseline_images = baseline_images
        else:
            # Allow baseline image list to be produced on the fly based on
            # current parametrization.
            our_baseline_images = request.getfixturevalue(
                'baseline_images')
    
        assert len(plt.get_fignums()) == len(our_baseline_images), (
            "Test generated {} images but there are {} baseline images"
>           .format(len(plt.get_fignums()), len(our_baseline_images)))
E       AssertionError: Test generated 2 images but there are 1 baseline images

C:\Users\smirn\.conda\envs\gofish\lib\site-packages\matplotlib\testing\decorators.py:280: AssertionError


================================== FAILURES ===================================
___________________________ test_line_dashes2[png] ____________________________

extension = 'png'
request = <FixtureRequest for <Function test_line_dashes2[png]>>, args = ()
kwargs = {}, __tracebackhide__ = True
img = <matplotlib.testing.decorators._ImageComparisonBase object at 0x000001BD9B0F0BC8>
needs_lock = False, our_baseline_images = ['line_dashes2']

    @functools.wraps(func)
    @pytest.mark.parametrize('extension', extensions)
    @pytest.mark.style(style)
    @_checked_on_freetype_version(freetype_version)
    @functools.wraps(func)
    def wrapper(*args, extension, request, **kwargs):
        __tracebackhide__ = True
        if 'extension' in old_sig.parameters:
            kwargs['extension'] = extension
        if 'request' in old_sig.parameters:
            kwargs['request'] = request
    
        img = _ImageComparisonBase(func, tol=tol, remove_text=remove_text,
                                   savefig_kwargs=savefig_kwargs)
        matplotlib.testing.set_font_settings_for_testing()
        func(*args, **kwargs)
    
        # If the test is parametrized in any way other than applied via
        # this decorator, then we need to use a lock to prevent two
        # processes from touching the same output file.
        needs_lock = any(
            marker.args[0] != 'extension'
            for marker in request.node.iter_markers('parametrize'))
    
        if baseline_images is not None:
            our_baseline_images = baseline_images
        else:
            # Allow baseline image list to be produced on the fly based on
            # current parametrization.
            our_baseline_images = request.getfixturevalue(
                'baseline_images')
    
        assert len(plt.get_fignums()) == len(our_baseline_images), (
            "Test generated {} images but there are {} baseline images"
>           .format(len(plt.get_fignums()), len(our_baseline_images)))
E       AssertionError: Test generated 2 images but there are 1 baseline images

C:\Users\smirn\.conda\envs\gofish\lib\site-packages\matplotlib\testing\decorators.py:280: AssertionError
============================== warnings summary ===============================
C:\Users\smirn\.conda\envs\gofish\lib\site-packages\matplotlib\testing\decorators.py:248
  C:\Users\smirn\.conda\envs\gofish\lib\site-packages\matplotlib\testing\decorators.py:248: PytestUnknownMarkWarning: Unknown pytest.mark.style - is this a typo?  You can register custom marks to avoid this warning - for details, see https://docs.pytest.org/en/latest/mark.html
    @pytest.mark.style(style)

-- Docs: https://docs.pytest.org/en/latest/warnings.html
=========================== short test summary info ===========================
FAILED test_mpl_sanity.py::test_line_dashes2[png] - AssertionError: Test gene...
=================== 1 failed, 1 passed, 1 warning in 1.14s ====================

Process finished with exit code 1


SmirnGreg reopened this Aug 6, 2020
jklymak (Member) commented Aug 6, 2020

@SmirnGreg perhaps discuss this at discourse.matplotlib.org ?

SmirnGreg (Author) commented

Dear @jklymak

I can discuss this on Stack Overflow or some other forum as well, but I believe there is a serious issue with image testing in matplotlib that may require a change to the library code.

Let me explain my point. Matplotlib is the most heavily used plotting library in the Python community; in science and data science, it is definitely the most used. Groups of scientists rely on matplotlib, and since matplotlib moved to version 3 and dropped the Python 2 legacy, the community may expect stability. Current industry practice, as well as serious scientific development, includes writing tests for one's code, especially when using rapidly changing frameworks. Matplotlib tries to provide a testing environment, such as the image_comparison decorator, and that is great. But the issue that prevents running multiple test cases sequentially is right here.

Currently, the documentation for image_comparison states that

Compare images generated by the test with those specified in
*baseline_images*, which must correspond, else an `ImageComparisonFailure`
exception will be raised.

But it does not compare only the images generated by the test; it compares all open figures.

As far as I can see (I have not gone through the source code yet), the decorator compares all currently open figures against the saved images, while the previous test leaves its figure open. I can propose three solutions:

  1. The decorator first records all open figures, then runs the test, and then compares only the newly created figures against the saved images.
  2. The decorator first records all open figures, then runs the test, compares all currently open figures, and finally closes the new ones.
  3. Add plt.close('all') to the corresponding documentation example and update the docstring of image_comparison (see the sketch below).
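
For illustration, option 3 would only need the documented example to close leftover figures before plotting. A minimal sketch (not a proposed patch) against the public image_comparison API:

from matplotlib.testing.decorators import image_comparison
import matplotlib.pyplot as plt

@image_comparison(baseline_images=['line_dashes'], remove_text=True,
                  extensions=['png'])
def test_line_dashes():
    # Close figures left open by earlier tests so the decorator only
    # compares the figure created below against the baseline.
    plt.close('all')
    fig, ax = plt.subplots()
    ax.plot(range(10), linestyle=(0, (3, 3)), lw=5)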

I am ready to implement one of the above if such a PR is likely to be accepted, ideally with support from the core team if possible.

Best wishes,
Greg.

jklymak (Member) commented Aug 6, 2020

I suggest discourse because I think you need help setting up your testing environment, and this is not the appropriate venue for that.

tacaswell (Member) commented

The way the tests work, they have to collect all of the currently open figures (one could argue we should be returning the tuple of figures we want tested, but that is neither here nor there). The way we cope with this in our test suite (which uses these tools) is via

@pytest.fixture(autouse=True)
def mpl_test_settings(request):
    from matplotlib.testing.decorators import _cleanup_cm

    with _cleanup_cm():

        backend = None
        backend_marker = request.node.get_closest_marker('backend')
        if backend_marker is not None:
            assert len(backend_marker.args) == 1, \
                "Marker 'backend' must specify 1 backend."
            backend, = backend_marker.args
            skip_on_importerror = backend_marker.kwargs.get(
                'skip_on_importerror', False)
            prev_backend = matplotlib.get_backend()

            # special case Qt backend importing to avoid conflicts
            if backend.lower().startswith('qt4'):
                if any(k in sys.modules for k in ('PyQt5', 'PySide2')):
                    pytest.skip('Qt5 binding already imported')
                try:
                    import PyQt4
                # RuntimeError if PyQt5 already imported.
                except (ImportError, RuntimeError):
                    try:
                        import PySide
                    except ImportError:
                        pytest.skip("Failed to import a Qt4 binding.")
            elif backend.lower().startswith('qt5'):
                if any(k in sys.modules for k in ('PyQt4', 'PySide')):
                    pytest.skip('Qt4 binding already imported')
                try:
                    import PyQt5
                # RuntimeError if PyQt4 already imported.
                except (ImportError, RuntimeError):
                    try:
                        import PySide2
                    except ImportError:
                        pytest.skip("Failed to import a Qt5 binding.")

        # Default of cleanup and image_comparison too.
        style = ["classic", "_classic_test_patch"]
        style_marker = request.node.get_closest_marker('style')
        if style_marker is not None:
            assert len(style_marker.args) == 1, \
                "Marker 'style' must specify 1 style."
            style, = style_marker.args

        matplotlib.testing.setup()
        with cbook._suppress_matplotlib_deprecation_warning():
            if backend is not None:
                # This import must come after setup() so it doesn't load the
                # default backend prematurely.
                import matplotlib.pyplot as plt
                try:
                    plt.switch_backend(backend)
                except ImportError as exc:
                    # Should only occur for the cairo backend tests, if neither
                    # pycairo nor cairocffi are installed.
                    if 'cairo' in backend.lower() or skip_on_importerror:
                        pytest.skip("Failed to switch to backend {} ({})."
                                    .format(backend, exc))
                    else:
                        raise
            matplotlib.style.use(style)
        try:
            yield
        finally:
            if backend is not None:
                plt.switch_backend(prev_backend)

which is auto-used for all of the tests, makes sure that any global state that is touched is cleaned up, and (implicitly) closes all of the figures via plt.switch_backend.

I think the fix here is to document importing that fixture into your conftest.py
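
For an external project, that documentation could boil down to a one-line conftest.py (a minimal sketch; the autouse fixture then applies to every test in the suite):

# conftest.py at the root of your test suite
# Importing the autouse fixture lets pytest discover it, so matplotlib's
# global state (including open figures) is cleaned up around each test.
from matplotlib.testing.conftest import mpl_test_settings  # noqa: F401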

tacaswell added this to the v3.4.0 milestone Aug 6, 2020
SmirnGreg (Author) commented

Dear @tacaswell,

thanks a lot, that is what I was looking for! It seems that when testing not matplotlib itself but one's own packages, the following import makes everything work (as the fixture is then applied to every test case):

from matplotlib.testing.conftest import mpl_test_settings

No further setup is required with the latest pip releases of pytest and matplotlib.
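
Put together, a test module along these lines should then pass both tests in sequence (a sketch based on the example above, with the module-level fixture import as the only change):

from matplotlib.testing.conftest import mpl_test_settings  # noqa: F401
from matplotlib.testing.decorators import image_comparison
import matplotlib.pyplot as plt

@image_comparison(baseline_images=['line_dashes1'], remove_text=True,
                  extensions=['png'])
def test_line_dashes1():
    fig, ax = plt.subplots()
    ax.plot(range(10), linestyle=(0, (3, 3)), lw=5)

@image_comparison(baseline_images=['line_dashes2'], remove_text=True,
                  extensions=['png'])
def test_line_dashes2():
    fig, ax = plt.subplots()
    ax.plot(range(10), linestyle=(0, (3, 3)), lw=5)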

tacaswell (Member) commented

@SmirnGreg Where would you have expected to find that information (and can you open a PR putting it there)?

timhoffm (Member) commented Aug 7, 2020

I'm wondering if our testing functions are intended to be public for the end user. How does our internal testing compare to pytest-mpl? A pytest plugin seems to be the cleaner approach compared to importing from conftest.

dopplershift (Contributor) commented

I would 100% say everyone else should use pytest-mpl and not matplotlib's own testing code. And honestly, if we had cycles to burn, I'd say we should migrate matplotlib itself to pytest-mpl.
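
For comparison, a pytest-mpl version of the example looks roughly like this (a sketch: the decorated test returns the figure, baselines are generated with pytest --mpl-generate-path=baseline and compared on later runs with pytest --mpl):

import pytest
import matplotlib.pyplot as plt

@pytest.mark.mpl_image_compare
def test_line_dashes():
    fig, ax = plt.subplots()
    ax.plot(range(10), linestyle=(0, (3, 3)), lw=5)
    return fig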

SmirnGreg (Author) commented

@timhoffm @dopplershift
thank you for the discussion!

  1. I wanted to find a way to write tests for the output figures I plot with matplotlib. I went to matplotlib.org and typed "test" in the search box, and the first two links point to the matplotlib.testing package. You are right, it is written there that it is for developers, but users may also be interested in such functionality. Also, matplotlib.testing ships alongside the main matplotlib package, not only in the developer version. I assumed it was a valid way of testing, tried to run an example, and failed.
  2. pytest-mpl has some issues, namely:
    a. It still depends on matplotlib.testing.compare.compare_images. It has a wrapper around it, but...
    b. I find the original testing decorators more convenient. First, and most important, I like the way they store the figures in a folder named after the file containing the test case.
    c. I also like that the decorator saves the output figure even if the tolerance test has passed; that is helpful if the tolerance value was set too high but I still want to check by eye from time to time.
    d. Its current release version has a leading 0. Does that mean the API can change dramatically over time? I wanted a more stable solution.
    e. The decorator API is much richer than that of pytest-mpl.

Based on the comment by @tacaswell, I would propose one or more of the following:

  • Add a note on testing stating that pytest-mpl is the suggested tool for testing.
  • Add a note here that this is how a test should be written in the matplotlib testing environment, and that one can import a fixture (from matplotlib.testing.conftest import mpl_test_settings) to make it close the figures after the test and roll back to the initial state.
  • Add a similar note to the decorator docstring.
  • Maybe it is better to have this (and other important) fixtures imported within the decorator itself, so it works as in the example without any change to the docs? I don't know what side effects this could have on the existing tests; maybe it is better not to touch that.

What would you say?

timhoffm (Member) commented Nov 6, 2020

I'd go with the three documentation bullets.

hs2361 (Contributor) commented Nov 9, 2020

@SmirnGreg just to be clear, adding

from matplotlib.testing.conftest import mpl_test_settings

fixes the problem with the tests failing? If so, I'll go ahead and add this line to the example in the docs as well, in the places that you suggested:

Add a note on testing stating that pytest-mpl is the suggested tool for testing.
Add a note here that this is how a test should be written in the matplotlib testing environment, and that one can import a fixture (from matplotlib.testing.conftest import mpl_test_settings) to make it close the figures after the test and roll back to the initial state.
Add a similar note to the decorator docstring.

@timhoffm Would that be alright?

timhoffm (Member) commented Nov 9, 2020

Um, not the pytest expert here, but I think it's sufficient to have mpl_test_settings imported in one conftest.py per project. Since that's already done in the matplotlib codebase and matplotlib.testing is not to be used by external users, no documentation is needed concerning the import.

tacaswell (Member) commented

Maybe it is better to have this (and other important) fixtures imported within the decorator itself, so it works as in the example without any change to the docs? I don't know what side effects this could have on the existing tests; maybe it is better not to touch that.

This won't work, as pytest needs to find the fixture before it finds the test. It might make sense to add a plt.close('all') to the test decorator, though?
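
As a user-side stop-gap (rather than a change to the decorator), a hypothetical autouse fixture in a project's conftest.py would give the same figure cleanup:

# conftest.py: hypothetical user-side workaround, not matplotlib's own fixture
import pytest
import matplotlib.pyplot as plt

@pytest.fixture(autouse=True)
def close_figures():
    # Run the test, then close any figures it left open so the next
    # image_comparison test only sees its own figures.
    yield
    plt.close('all')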

but I think it's sufficient to have mpl_test_settings imported in one conftest.py per project.

Yes, pytest can find fixtures either in a conftest.py or at the module level. If you want it everywhere in a package, put it in conftest.py; if you only want it in a single test module, import it at the module level. Either will work; it depends on how aggressively you want to reset Matplotlib's global state.

For better or worse, I think users have discovered our testing module and are using it, so we should document it; however, I would not add it to the example, as the example is about how to write tests for our own test suite.

I do think that pytest-mpl is a better option (and I agree with @dopplershift that if we had effort we would migrate to depending on them).
