TST: Lower tolerance of mathtext tests, update baseline images to master #5305
@zblz: What platform did you generate the pngs on? The testing is very sensitive to versions and settings in freetype, and if it doesn't match what Travis is doing, there will be problems. For example, if I generate images on a Mac, they are completely unsuitable. All this is the basis for doing imperfect comparisons... Also a heads up -- when the default styles change in 2.0, we will probably do a mass regeneration of all test images. It's probably fine to do just these math ones now, but of course when the default math font changes they will need to be done again. |
I'm using Debian testing, and wanted to do this PR mostly to check how Travis would behave with respect to the images I generated. I see now that there are a lot of failures, some even at a 20% level. A 1% threshold was definitely optimistic, but I set it because a lot of the accent tests were failing at the 4-5% level, so we should have a method to catch those. Could we use the images generated by Travis as baseline in the repo? Are they stored somewhere we can access? However, that might mean having to update them every time there are changes in the Travis environment, which is not ideal as it is not something we can control completely. Regarding the default mathtext fontset, the |
That's a good idea. You can get the images from Travis-CI -- they are uploaded to an Amazon S3 account -- but only when building a branch in the main matplotlib repo, not when building a random pull request. This is for security reasons and more-or-less enforced by the Travis security model. Since you have commit rights, you can push your branch to the matplotlib repo with something like:
Travis should build that branch and provide a URL to a downloadable tarball of images at the bottom of the output. My long term plan for fixing this -- and maybe I'll even get to this today -- is to add a build configuration option that will download and build against a specific version of freetype, with particular settings. The idea is that all matplotlib developers and our .travis.yml would turn this on and should get identical results. Then we can turn the tolerance all the way down to zero. That's the idea anyway. This problem has gone on a long time, and we're finding more and more cases where things are being missed due to the thresholding.
Oh, of course that's right. No risk of duplicating effort here then. There is some math scattered here and there throughout the test suite that will be affected, but that's much less stuff. |
I just checked and Travis used freetype 2.4.8 whereas I have 2.6.0 installed... Having the same freetype everywhere would definitely help here. If you can add this option in the near future we should wait to replace the images with a self-built freetype, otherwise I will replace the images here with the Travis-generated ones. |
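A minimal sketch of the version check described above, assuming plain dotted version strings. The helper names are hypothetical and not part of matplotlib; in a real check the local string might come from `matplotlib.ft2font.__freetype_version__`, if your build exposes it. As noted elsewhere in this thread, a matching version number alone does not guarantee identical rendering, since build settings also matter.

```python
def parse_version(text):
    """Turn a dotted version string such as '2.4.8' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def freetype_matches(local, reference):
    """Return True when both version strings name the same freetype release."""
    return parse_version(local) == parse_version(reference)


# Versions quoted in the comment above: Travis vs. a Debian testing machine.
travis_version = "2.4.8"
local_version = "2.6.0"

if not freetype_matches(local_version, travis_version):
    print(
        "freetype mismatch: local %s vs. Travis %s; "
        "baseline images generated locally may not match CI"
        % (local_version, travis_version)
    )
```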
We might consider switching Travis over to the "new" Ubuntu 14.04 Trusty images, which have a more recent version of freetype and TeX that is hopefully closer to what we are using for development. Update: 14.04 has 2.5.2, whereas 12.04 has 2.4.8: https://launchpad.net/ubuntu/%2Bsource/freetype |
re: 2.0 style changes The problem I have with a mass regeneration of baseline images is that it On Fri, Oct 23, 2015 at 9:15 AM, Jens Hedegaard Nielsen <
@jenshnielsen -- in the last commit I changed |
I think the predictable freetype version is a pretty high priority, so I plan to start on it today (though may not finish, and who knows what surprises it will bring). I think we might as well hold off on merging this until that's in. I like the idea of pegging the Travis environment. I didn't realize one could do that. I think it's still worthwhile doing a local freetype build so that developers can run the tests on their own machines without requiring a specific version of the OS etc. |
@WeatherGod: Good point about the classic style. That makes sense. |
And just a note for completeness: The images depend not just on the version of freetype, but how it was configured -- whether bytecode hinter or not etc. Different Linux distros build it in different ways, even if they have the same exact version. |
See #5306, which should now (fingers crossed) be passing... |
This PR is now redundant as #5306 is definitely a better long term option. Closing. |
As mentioned in #4887, #5301, a lot of the mathtext tests are only passing because the tolerance is set quite high. In a couple of recent PRs only the failing images were replaced in the baseline even though a few others were modified by the PR and were not failing the tests because of the tolerance.
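To illustrate how a high tolerance can hide a real rendering change, here is a rough sketch of a thresholded comparison. It assumes a plain root-mean-square pixel difference on grayscale images stored as lists of rows; the function names and tolerance values are hypothetical, and this is not necessarily the exact metric the matplotlib test suite computes.

```python
import math


def rms_difference(expected, actual):
    """RMS difference between two equal-sized grayscale images (rows of 0-255 ints)."""
    if len(expected) != len(actual) or any(
        len(row_e) != len(row_a) for row_e, row_a in zip(expected, actual)
    ):
        raise ValueError("images must have the same dimensions")
    total = 0
    count = 0
    for row_e, row_a in zip(expected, actual):
        for pixel_e, pixel_a in zip(row_e, row_a):
            total += (pixel_e - pixel_a) ** 2
            count += 1
    return math.sqrt(total / count)


def images_match(expected, actual, tol):
    """Return True when the RMS difference does not exceed the tolerance."""
    return rms_difference(expected, actual) <= tol


# A 100x100 white baseline, and a copy where a 4x4 block (say, an accent)
# is rendered black instead of white.
baseline = [[255] * 100 for _ in range(100)]
changed = [row[:] for row in baseline]
for y in range(10, 14):
    for x in range(10, 14):
        changed[y][x] = 0

print(images_match(baseline, changed, tol=50))  # loose tolerance
print(images_match(baseline, changed, tol=1))   # strict tolerance
```

With these illustrative numbers the small block changes the RMS value only slightly, so the loose tolerance accepts the image while the strict one rejects it.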
In this PR the tolerance is lowered to 1%, and the baseline images updated to the master rendering for the following mathtext tests (comment from #5301):
\, as in 09. Baseline looks wrong to me: 2 \, a should render to a very small space, and the baseline spacing is pretty big.
! (baseline has no space, master has it as per the _punctuation_symbols spacing)