
CI: Disable numpy avx512 instructions #22099


Merged · 2 commits · Jan 4, 2022

Conversation

greglucas
Contributor

PR Summary

This disables the AVX512 instruction set at runtime within the CI using the NPY_DISABLE_CPU_FEATURES environment variable. This is due to small floating point differences causing test failures when using that instruction set on numpy 1.22 wheels on the GitHub Actions runners.
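For reference, a minimal sketch of how the variable takes effect (the feature names listed here are an assumption, not necessarily the exact set disabled in the workflow). numpy reads the variable when it is first imported, which is why CI has to set it at the job level rather than in test code:

```python
import os

# Assumed feature names for illustration; numpy warns at import time if a
# listed feature is unavailable on the host machine.
os.environ["NPY_DISABLE_CPU_FEATURES"] = "AVX512F AVX512CD AVX512_SKX"

import numpy as np  # runtime dispatch now skips the listed AVX512 paths
```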

PR Checklist

Tests and Styling

  • Has pytest style unit tests (and pytest passes).
  • Is Flake 8 compliant (install flake8-docstrings and run flake8 --docstring-convention=all).

Documentation

  • New features are documented, with examples if plot related.
  • New features have an entry in doc/users/next_whats_new/ (follow instructions in README.rst there).
  • API changes documented in doc/api/next_api_changes/ (follow instructions in README.rst there).
  • Documentation is sphinx and numpydoc compliant (the docs should build without error).

This removes the NPY_DISABLE_CPU_FEATURES flag from the sphinx and tk tests,
because numpy emits warnings on CI when the flag is set, which causes the
subprocess-based tests to fail. The features don't need to be disabled for
these tests, so remove the variable from the environment that is passed in,
as sketched below.
@timhoffm added this to the v3.5.2 milestone · Jan 3, 2022
@jklymak
Member

jklymak commented Jan 3, 2022

This is pretty obscure. Does it make sense to just up the test tolerance instead? Though I think a couple of tests were quite wrong. Why did these flags have such a large effect?

@dstansby
Member

dstansby commented Jan 3, 2022

> This is pretty obscure. Does it make sense to just up the test tolerance instead? Though I think a couple of tests were quite wrong. Why did these flags have such a large effect?

I don't think any were unexpectedly wrong. I downloaded the failed images and all were minor changes apart from errorbar_mixed, which had a large change because (I presume) a small difference caused the axis limit to jump from 1e-2 to 1e-3.

I think this is the most pragmatic way to go forward, without having to play whack-a-mole with test tolerances across the test suite.

@jklymak
Member

jklymak commented Jan 3, 2022

Oh, I have the opposite point of view. If we have tests that are susceptible to compiler foibles, maybe we should fix the tests or increase their tolerances, because there will always be more compiler foibles.

@greglucas
Contributor Author

I agree this is a pretty big hammer, saying we won't test against certain floating point instruction sets. However, I assume we rely on numpy for some verification that their floating point calculations are "good enough" for them and this is impacting us because we are doing pixel-perfect comparisons with floating point inputs. I can see both arguments here, so I'm not sure which one people would be more generally comfortable with (increased tolerances, or reduced instruction sets).

There are only a few tests failing, so the tolerances on a few of them could probably be bumped. For the errorbar one, though, we should update the autoscaling in that test by bumping the np.minimum() call to add a small epsilon rather than adding a large tolerance.
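As a toy illustration of the failure mode (not matplotlib's actual autoscale code, and with made-up numbers): if the lower limit is effectively snapped to a decade, a few-ULP dip below 1e-2 drops it to 1e-3, and a small epsilon makes it stable.

```python
import numpy as np

lo = 1e-2 - 1e-16                       # tiny floating point slop below 1e-2
print(10.0 ** np.floor(np.log10(lo)))   # 0.001: the limit jumps a full decade

# Adding a small epsilon after np.minimum() keeps the limit stable.
lo_robust = np.minimum(lo, 1e-2) + 1e-12
print(10.0 ** np.floor(np.log10(lo_robust)))  # 0.01
```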

@timhoffm
Member

timhoffm commented Jan 3, 2022

We have to live with the fact that our dependencies may introduce minor variations. First and foremost we don't want brittle tests. There are two ways to get there:

  1. Make the test environment as reproducible as possible
  2. Increase tolerances

I argue that if we can easily get away with (1), that's better than (2). We have more control and don't blindly accept other changes as well. For the same reason we pin freetype or remove text from test images altogether.

@timhoffm
Member

timhoffm commented Jan 4, 2022

I'll merge as is to unbreak the builds. @jklymak, if you think tolerances are the better approach, we can discuss this in the next call and maybe change it then.

@jklymak
Member

jklymak commented Jan 4, 2022

Sure, I've added to the call agenda.

dstansby added a commit that referenced this pull request on Jan 4, 2022 (…099-on-v3.5.x):
Backport PR #22099 on branch v3.5.x (CI: Disable numpy avx512 instructions)
@greglucas deleted the numpy-cpu-avx512 branch · January 4, 2022 15:23
@tacaswell
Member

I may have over-learned from the issue we had with "rendering the wrong glyph but tests are passing", but I am very, very concerned about bumping tolerances on anything but a per-test basis.

@jklymak
Member

jklymak commented Jan 4, 2022

Sure, I wasn't suggesting a general large tolerance. But here we have one test that had a large change because of floating point slop, so in my opinion that test should be made more robust. The other changes are pretty small, so upping those tolerances would also make sense to me.
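For concreteness, a per-test bump would look something like this; the tol value is illustrative, not the one actually chosen:

```python
from matplotlib.testing.decorators import image_comparison

# tol is the allowed RMS difference against the baseline image; the default
# of 0 requires an exact match. The value below is purely illustrative.
@image_comparison(baseline_images=['errorbar_mixed'], extensions=['png'],
                  tol=0.02)
def test_errorbar_mixed():
    ...  # plotting code under test
```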

@greglucas
Contributor Author

Here is a link to a commit that does what @jklymak suggests, increasing tolerances/dtypes. All of the changes were pretty small.
greglucas@c030e05
I tried to move all of the failing tests to use longdouble rather than increasing tolerances, but there ended up being some unsafe downcasting to float64 in the qhull procedures of trisurf.
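An illustration of that pitfall (platform-dependent: on x86 Linux/macOS np.longdouble is 80-bit extended precision, while on Windows it is just float64):

```python
import numpy as np

one = np.longdouble(1)
a = one + np.finfo(np.longdouble).eps  # ~1.08e-19 above 1 on x86
print(a - one)               # nonzero: longdouble resolves the difference
print(np.float64(a) - 1.0)   # 0.0: a float64 round-trip (as in the
                             # float64-only Qhull path) rounds it away
```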

@tacaswell
Member

I came across https://etna.math.kent.edu/vol.52.2020/pp358-369.dir/pp358-369.pdf recently which has a case where increasing the precision of floats can make things worse (sometimes the rounding is actually in your favor!).

@tacaswell
Member

@greglucas, can you open a PR with those tolerances? They seem reasonable to me.

@greglucas
Contributor Author

Sure, see #22132.

That is an interesting finding! I didn't think about trying single precision instead in the update...
