mathtext/mathfont intermittent failures #7911

Closed
QuLogic opened this issue Jan 21, 2017 · 12 comments · Fixed by #8708

@QuLogic
Member

QuLogic commented Jan 21, 2017

I'm opening this as an issue separate from #6586 to better document the investigation into the root cause. This affects both the v2.0.x and master branches.

We can make the failures a bit more reproducible by repeating the tests. What I did was the following:

  1. Choose one of the mathtext/mathfont tests (e.g. mathfont_*_08).
  2. Copy that test image 100 times for each font set:

         for i in `seq -w 0 99`; do
           for fs in cm dejavusans dejavuserif stix stixsans; do
               cp test_mathtext/mathfont_${fs}_08.png test_fail/mathfont_${fs}_${i}.png
           done
         done

  3. Modify test_mathtext.py to drop the other tests and run only 08 100 times over.
  4. Edit Text._get_layout to print out whether the cache hit worked or failed (a sketch of such a patch follows below).
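For step 4, the debug patch can be applied without editing the source at all, e.g. by wrapping the cache lookup. This is only a sketch, assuming the _get_layout/_cached internals of that era:

    from matplotlib.text import Text

    _orig_get_layout = Text._get_layout

    def _noisy_get_layout(self, renderer):
        # report whether the shared layout cache would serve this call
        key = self.get_prop_tup(renderer=renderer)
        print('layout cache', 'HIT' if key in self._cached else 'MISS', key)
        return _orig_get_layout(self, renderer)

    Text._get_layout = _noisy_get_layout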

In the end, you will get about 2 or 3 failures out of the 500. On the failing runs, the cache returns a hit. It turns out that every once in a while, a new renderer will have the same id() as an earlier one (or be the same object, I don't know which). Since the cache key does not include the mathtext fontset, the cached layout can belong to a different fontset, and the text ends up in the wrong place.

I'm not sure how to easily fix this. Ideally, the fontset would be part of the key, but it's not actually used by Text objects, only by the MathTextParser. According to #7107, baking the rcParam directly into initialization is a hard problem. Maybe we should find a way to disable the cache during tests?
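One possible shape for "disable the cache during tests" is a hedged sketch like the following, assuming Text._cached is the shared dict-like layout cache (this is not what was eventually merged):

    from unittest import mock
    from matplotlib.text import Text

    class NullCache(dict):
        """A cache that never retains anything, forcing a fresh layout."""
        def __setitem__(self, key, value):
            pass  # drop all writes, so lookups always miss

    # wrap a flaky test so cached layouts cannot leak across fontsets
    with mock.patch.object(Text, '_cached', NullCache()):
        ...  # run the mathtext image comparison here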

@jkseppan
Member

I'm not sure I follow this 100%, but just a thought looking at the caching in text.py and mathtext.py:

    # matplotlib/text.py
    def get_prop_tup(self, renderer=None):
        """
        Return a hashable tuple of properties. ...
        """
        x, y = self.get_unitless_position()
        return (x, y, self.get_text(), self._color,
                self._verticalalignment, self._horizontalalignment,
                hash(self._fontproperties),
                self._rotation, self._rotation_mode,
                self.figure.dpi, id(renderer or self._renderer),
                )

    # matplotlib/mathtext.py
    def parse(self, s, dpi=72, prop=None):
        ...
        cacheKey = (s, dpi, hash(prop))
        result = self._cache.get(cacheKey)
        if result is not None:
            return result

Both use hash(prop) as part of the cache key. What the hash function guarantees is that if two objects are equal they have the same hash value, not the other way around. Could we be getting an occasional hash collision?
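For what it's worth, CPython has trivial hash collisions, so equal hashes really do not imply equal objects:

    # CPython reserves -1 as an internal error code, so hash(-1) is -2,
    # colliding with hash(-2) even though the values differ
    print(hash(-1) == hash(-2))  # True on CPython
    print(-1 == -2)              # False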

@QuLogic
Member Author

QuLogic commented Jan 22, 2017

In these tests, the FontProperties are exactly the same across the five fontsets. Only the mathtext.fontset rcParam changes, and that is checked only in mathtext.py, while the cached layout is looked up in text.py.
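A quick check of that claim (hedged: FontProperties hashes only its own font attributes, and mathtext.fontset is not among them):

    import matplotlib
    from matplotlib.font_manager import FontProperties

    hashes = set()
    for fs in ('cm', 'dejavusans', 'dejavuserif', 'stix', 'stixsans'):
        matplotlib.rcParams['mathtext.fontset'] = fs
        hashes.add(hash(FontProperties()))
    print(len(hashes))  # 1: the hash is identical under all five fontsets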

QuLogic added a commit to QuLogic/matplotlib that referenced this issue Mar 19, 2017
See matplotlib#7911 for why these tests are flaky and matplotlib#7107 for why they are not
so easy to fix.
QuLogic added a commit to QuLogic/matplotlib that referenced this issue Mar 19, 2017
See matplotlib#7911 for why these tests are flaky and matplotlib#7107 for why they are not
so easy to fix.
@tacaswell
Member

I suspect it is a new renderer which has the same id (see https://stackoverflow.com/questions/20753364/why-is-the-id-of-a-python-class-not-unique-when-called-quickly; I have hit this at my day job).

The fix looks like it is to give renderers some sort of unique id (uuid4 or just a random number) and use that instead of id(). Maybe also put it first in the tuple so that the comparison fails fast when the renderer differs.
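The id() reuse is easy to reproduce in CPython, where id() is the object's memory address and addresses are recycled as soon as an object is garbage collected:

    class FakeRenderer:
        pass

    # the first instance is freed before the second is allocated, so CPython
    # typically hands out the same address, and therefore the same id()
    print(id(FakeRenderer()) == id(FakeRenderer()))  # usually True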

tacaswell added this to the 2.1 (next point release) milestone Mar 23, 2017
@tacaswell
Member

Exact work:

  • add a _uid attribute to backend_bases.py:RendererBase
  • change Text.get_prop_tup to use that _uid instead of id() to identify the renderer. This needs to account for the renderer being None and stay backward-compatible with renderers not derived from ours, something like getattr(renderer, '_uid', None) or getattr(self._renderer, '_uid', None) or id(renderer or self._renderer) (sketched below)
  • document the (private) API change
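A minimal sketch of that plan (hypothetical names; the actual fix landed in #8708 and may differ):

    import uuid

    class RendererBase:
        def __init__(self):
            # unlike id(), this survives the memory address being recycled
            self._uid = uuid.uuid4().hex

    def renderer_key(renderer, fallback_renderer):
        # stand-in for the id(renderer or self._renderer) term in
        # Text.get_prop_tup, falling back to id() for third-party renderers
        return (getattr(renderer, '_uid', None)
                or getattr(fallback_renderer, '_uid', None)
                or id(renderer or fallback_renderer))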

@tacaswell
Member

Testing this is hard, given that it relies on triggering a race condition inside CPython.

QuLogic added a commit to QuLogic/matplotlib that referenced this issue Mar 24, 2017
See matplotlib#7911 for why these tests are flaky and matplotlib#7107 for why they are not
so easy to fix.
dstansby pushed a commit to dstansby/matplotlib that referenced this issue Apr 12, 2017
See matplotlib#7911 for why these tests are flaky and matplotlib#7107 for why they are not
so easy to fix.
QuLogic added a commit to QuLogic/matplotlib that referenced this issue Apr 17, 2017
See matplotlib#7911 for why these tests are flaky and matplotlib#7107 for why they are not
so easy to fix.
QuLogic added a commit to QuLogic/matplotlib that referenced this issue Apr 21, 2017
See matplotlib#7911 for why these tests are flaky and matplotlib#7107 for why they are not
so easy to fix.
QuLogic added a commit to QuLogic/matplotlib that referenced this issue Apr 23, 2017
See matplotlib#7911 for why these tests are flaky and matplotlib#7107 for why they are not
so easy to fix.
QuLogic added a commit to QuLogic/matplotlib that referenced this issue Apr 28, 2017
See matplotlib#7911 for why these tests are flaky and matplotlib#7107 for why they are not
so easy to fix.
@matthew-brett
Contributor

Any progress here? It seems to me that around half of the daily wheel build / test runs fail with one of these errors; that is, one run in the build matrix fails, meaning I have to go and restart that build manually.

@QuLogic
Member Author

QuLogic commented May 13, 2017

On master? They should be marked to re-run on failure.

@matthew-brett
Contributor

Yes, on master. This is using pytest.

@matthew-brett
Contributor

Errors continue, at or around 50% of build runs. Here's the latest build matrix:

https://travis-ci.org/MacPython/matplotlib-wheels/builds/234736344

Failure build log:

https://travis-ci.org/MacPython/matplotlib-wheels/jobs/234736346

This is another plea for a progress update; it's tiring to restart builds by hand every other day.

matthew-brett added a commit to MacPython/matplotlib-wheels that referenced this issue May 22, 2017
Intermittent test failures - see:
matplotlib/matplotlib#7911.  Install
pytest-rerunfailures to trigger rebuild on failure.
@matthew-brett
Contributor

I've added pytest-rerunfailures to the wheel-building repo as a workaround. But I guess this issue is also going to come up for people running tests on their own machines.
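For reference, the plugin can be enabled globally with the --reruns command-line flag (e.g. pytest --reruns 2), or per-test with its marker; the test name here is purely illustrative:

    import pytest

    # pytest-rerunfailures: retry this test up to twice before failing the run
    @pytest.mark.flaky(reruns=2)
    def test_mathfont_cm_08():
        ...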

@tacaswell
Member

I suspect that this failure is rare in real life, as it is a consequence of running a somewhat pathological test where all of these values are the same:

        return (x, y, self.get_text(), self._color,
                self._verticalalignment, self._horizontalalignment,
                hash(self._fontproperties),
                self._rotation, self._rotation_mode,
                self.figure.dpi, id(renderer or self._renderer),
                self._linespacing
                )

but with different renderers and rapidly changing rcParams.

For whatever reason, these collisions seem to happen far more often on Travis than on my machines. I assume this has something to do with Travis being all VMs and/or low on memory?

@QuLogic
Member Author

QuLogic commented May 23, 2017

Ah, I thought you were already using the rerunfailures plugin; that's what cut down on the failures on master. #7107 is marked as hard, so I'm not sure why this issue is new-contributor-friendly, but perhaps there's some way to modify only the tests to work nicely.

It's a bit of a hack, but what if we empty Text._cached at the end of each of these tests? The sheer number of these tests is already a bit of a time sink, so this may or may not be a good idea.
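That hack could look something like this (a sketch only, assuming Text._cached is the shared dict-like layout cache):

    import pytest
    from matplotlib.text import Text

    @pytest.fixture(autouse=True)
    def clear_layout_cache():
        # run the test, then empty the shared cache so a recycled renderer
        # id() cannot resurrect a stale layout in the next test
        yield
        Text._cached.clear()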

QuLogic mentioned this issue Jun 3, 2017