FIX: mathtext accents #4887

tacaswell · 2015-08-08T04:44:09Z

This is a follow up to #4588

The change in ae91e9f fixed
the symbols that started with accent names, but broke all of
the other accents.

This fixes both by special casing the 8 named symbols that start with
an accent as a prefix.

For example, '\doteq' should be parsed as a single symbol, not as two symbols
(e, q) with a dot over the e ('\dot{e}q').
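
To illustrate the special-casing idea, here is a minimal sketch in plain Python. It is not the parser code from this PR: the accent and symbol tables are abbreviated, hypothetical stand-ins, and `find_snowflakes` and `parse_command` are names made up for the example. It derives the symbols that begin with an accent name and tries them before falling back to the accent rule.

```python
# Abbreviated, hypothetical tables; the real mathtext tables are much larger.
ACCENTS = ("dot", "ddot", "hat", "bar", "vec", "tilde")
SYMBOLS = ("doteq", "doteqdot", "dotplus", "dots", "hbar", "ldots",
           "alpha", "barwedge")


def find_snowflakes(symbols, accents):
    """Return the symbol names that start with an accent name."""
    return sorted(s for s in symbols if any(s.startswith(a) for a in accents))


def parse_command(text, symbols=SYMBOLS, accents=ACCENTS):
    """Classify the command at the start of `text`.

    Returns ('symbol', name) or ('accent', name).
    """
    if not text.startswith("\\"):
        raise ValueError("expected a backslash command")
    body = text[1:]
    # Try the special-cased symbols first, longest name first, so that
    # 'doteqdot' is not cut short to 'doteq'.
    for name in sorted(find_snowflakes(symbols, accents), key=len, reverse=True):
        if body.startswith(name):
            return ("symbol", name)
    # Only then fall back to treating the prefix as an accent.
    for name in sorted(accents, key=len, reverse=True):
        if body.startswith(name):
            return ("accent", name)
    raise ValueError("unknown command: " + text)


print(parse_command(r"\doteq"))    # a single symbol
print(parse_command(r"\dot{e}q"))  # a dot accent applied to e
```

With these tables, `\doteq` and `\dots` resolve to symbols, while `\dot{e}q` and `\dot s` still resolve to the `dot` accent.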

tacaswell · 2015-08-08T05:08:32Z

In adding tests for this I confirmed that our text tests are not strict enough to catch major issues. The second commit on this PR changes which symbols are in the test, and it still passed locally.

zblz · 2015-08-08T09:10:32Z

@tacaswell: Look at all those nice accents and symbols:

Thanks for fixing this. Is there a way to set a per-test tolerance in the mathtext tests? I imagine we don't want to jack up the strictness for all tests, but we really need an accent test.
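
For context on what a per-test tolerance would mean, here is a minimal sketch. It assumes the tolerance is a threshold on the root-mean-square pixel difference between the baseline and the rendered image. The flat pixel lists, the `PER_TEST_TOL` table, and the test names are hypothetical and are not the matplotlib testing API.

```python
import math

DEFAULT_TOL = 32                    # the loose threshold applied to every test
PER_TEST_TOL = {"accent_00": 16}    # hypothetical stricter override


def rms_difference(expected, actual):
    """Root-mean-square difference between two equally sized pixel lists."""
    if len(expected) != len(actual):
        raise ValueError("images must have the same number of pixels")
    squared = sum((e - a) ** 2 for e, a in zip(expected, actual))
    return math.sqrt(squared / len(expected))


def passes(test_name, expected, actual):
    """Compare against the per-test tolerance, else the default one."""
    tol = PER_TEST_TOL.get(test_name, DEFAULT_TOL)
    return rms_difference(expected, actual) <= tol


baseline = [0, 0, 0, 0]
rendered = [0, 0, 40, 40]           # half of the pixels changed noticeably
print(passes("mathtext_00", baseline, rendered))  # judged at the default tol
print(passes("accent_00", baseline, rendered))    # judged at the stricter tol
```

The same rendering passes at the loose default and fails at the stricter override, which is the kind of difference an accent test needs to catch.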

zblz · 2015-08-08T09:40:46Z

I gave it a shot and made a new category in test_mathtext called accent_test with a few accent tests and lowered their tolerance to 16. Using baseline images from this PR, the tests now fail, even though it seems that most of the discrepancy comes from the modified spacing when accents are shifted rather than the accents themselves.

The branch is here: https://github.com/zblz/matplotlib/tree/fix-accent-tests

@tacaswell : do you want me to do a PR on that or will you cherry-pick them here?

tacaswell · 2015-08-08T14:10:26Z

I think we do want to push the threshold up on all of them.

I started to do some work last night to sort out what fails when the threshold is pushed down, and some of them should fail. For example, \doteq is used in test 01, which has been happily passing.

tacaswell · 2015-08-09T01:58:20Z

Fun thing to do: apply the following threshold change, run the tests, and tabulate the failures.

diff --git a/lib/matplotlib/tests/test_mathtext.py b/lib/matplotlib/tests/test_mathtext.py
index 745588a..71d9d0a 100644
--- a/lib/matplotlib/tests/test_mathtext.py
+++ b/lib/matplotlib/tests/test_mathtext.py
@@ -151,7 +151,7 @@ for fonts, chars in font_test_specs:
 def make_set(basename, fontset, tests, extensions=None):
     def make_test(filename, test):
         @image_comparison(baseline_images=[filename], extensions=extensions,
-                          tol=32)
+                          tol=10)
         def single_test():
             matplotlib.rcParams['mathtext.fontset'] = fontset
             fig = plt.figure(figsize=(5.25, 0.75))

python tests.py -s --processes=4 --process-timeout=300 matplotlib.tests.test_mathtext 2> math-fail.txt

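import pandas as pd

# Keep FAIL lines; the [51:-5] slice trims the test-name prefix and suffix, leaving '<font>_<number>'.
with open('/home/tcaswell/other_source/matplotlib/math-fail.txt') as f:
    lns = [ln.strip()[51:-5].split('_') for ln in f.readlines() if ln.startswith('FAIL:')]

# One row per failing test, with 'font' and 'number' columns.
df = pd.DataFrame({k: v for k, v in zip(['font', 'number'], zip(*lns))})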

Gives

In [104]: df.groupby('number').count()
Out[104]: 
        font
number      
00         9
01         9
05         9
06         5
07         4
09         9
15         1
16         2
17         2
24         2
26         2
27         9
30         4
39         2
42         2
44         2
45         3
50         9
51         1
54         9
55         3
56         5
58         1
60         6
61         3
62         2
64         9
72         9

Some of the baseline images (like 01) are clearly wrong.
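
The tally above relies on fixed slice offsets. A rough stdlib-only alternative is sketched below. It assumes each failure line looks like `FAIL: matplotlib.tests.test_mathtext.test_mathtext_<font>_<number>.test`, which is inferred from the slice offsets rather than taken from real output; `FAIL_RE`, `count_failures`, and the sample lines are hypothetical.

```python
import re
from collections import Counter

# Assumed line format; adjust the pattern if the real test names differ.
FAIL_RE = re.compile(
    r"^FAIL: .*mathtext_(?P<font>[a-z]+)_(?P<number>\d+)\.test$"
)


def count_failures(lines):
    """Count failing font sets per test number."""
    counts = Counter()
    for line in lines:
        match = FAIL_RE.match(line.strip())
        if match:
            counts[match.group("number")] += 1
    return dict(sorted(counts.items()))


sample = [
    "FAIL: matplotlib.tests.test_mathtext.test_mathtext_cm_01.test",
    "FAIL: matplotlib.tests.test_mathtext.test_mathtext_stix_01.test",
    "FAIL: matplotlib.tests.test_mathtext.test_mathtext_cm_15.test",
    "some unrelated log line",
]
print(count_failures(sample))
```

Matching on the name pattern instead of character positions keeps the count working if the module path changes length.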

tacaswell · 2015-08-09T03:50:10Z

Brain dump of my notes from the tests that have 9 of 9 fail with tol=0; 63 (?!) of them have at least 1 failure.

*** 00 (fixed)
 - \dots is really a symbol, should change

*** 01 (fixed)
 - baseline image is wrong, built against bad rendering of \doteq

*** 05
 - spacing issues

*** 06
 - spacing issues

*** 07
 - spacing issues

*** 09
 - spacing issues, looks like baseline is correct

*** 26 (fixed)

 - accents in the wrong place

*** 27
 - spacing issues, looks like baseline is wrong

*** 34 (fixed)
 - length of frac line was a bit too long, shortened
 - probably due to issues with right pad on super

*** 39
 - wide accents slightly shifted, not clear to me which one is
   correct

*** 42
 - spacing issues

*** 44

 - frac length issues
 - binomial does not look right
 - superscript looks too close to base letter
 - probably due to issues with right pad on super

*** 50
 - spacing issue, looks like baseline is correct

*** 54
 - spacing issues

*** 56
 - not clear either version is correct

    r'${\int }_{1}^{x}\frac{\mathrm{dt}}{t}$',

*** 60
 - spacing issues from right pad on superscript

*** 64
 - spacing issues

*** 72 (fixed)
 - clearly wrong

@zblz I can pull the last 3 commits if they conflict with what you have been working on, which I have not been following as closely as I should have been.

zblz · 2015-08-09T11:24:40Z

@tacaswell: They will definitely conflict with PRs #4872 and #4873, for they change spacing of symbols and position of sub/superscripts respectively. With tol=10 the tests will likely fail with those changes (good!). I removed the baseline images from #4873 because I am still fine-tuning some corner cases.

If you merge this PR I can rebase on these without baseline images and then regenerate them.

tacaswell · 2015-08-10T04:40:48Z

I pulled off the last three commits; I would rather fix them up all at once than have multiple generations of the test images in the repo.

The changes to 01 do not conflict with your work. There will be a merge conflict on 72, but that should not be too hard to clean up.

tacaswell · 2015-08-10T04:41:27Z

@matplotlib/developers @mdboom is on vacation, can anyone else review the changes to the parser logic?

jkseppan · 2015-08-10T11:44:06Z

lib/matplotlib/tests/test_mathtext.py

@@ -12,7 +12,7 @@
 from matplotlib import mathtext

 math_tests = [
-    r'$a+b+\dots+\dot{s}+\ldots$',
+    r'$a+b+\dot s+\dot{s}+\ldots$',


I'm pretty sure this case is intended to test that \dots produces three dots and \dot{s} produces an s with a dot over it.

tacaswell · 2015-08-10

I updated the test to match the image; I will update the image to test all three.

zblz · 2015-08-10

According to the TeX definitions, \dots should not produce three dots: only \ldots should. \dots would give an error.

tacaswell · 2015-08-10

@zblz Can you point me to a reference on that?

zblz · 2015-08-10

Well, I just checked and it appears I was wrong. Sorry for that. \dots should indeed produce three dots and is a text and math command, whereas \ldots (and others such as \cdots) are math-only commands. I checked it in the LaTeX comprehensive symbol list: table 3 for math and text, and table 189 for math only.

jkseppan · 2015-08-10

The \dots that I know is a macro in amsmath; it includes some magic to choose a suitable height for the dots: http://tex.stackexchange.com/questions/649/how-do-magic-dots-work-in-amsmath

mdboom · 2015-08-20T17:40:48Z

The changes to the parser seem like a bit of a hack, but I don't know if that should hold up this PR. I think longer term, the mathtext parser is due for a simplification or overhaul -- it's grown rather crufty with special cases over the years. But perfect is probably the enemy of the good here. At least snowflake is autogenerated here so it should work even as new symbols are added.

tacaswell · 2015-08-20T17:46:20Z

I picked the name snowflake because it is a hack 😉

I spent a fair amount of time trying to write parsing rules that would not match depending on the next character, but I don't have enough experience with parsing/lexing to be sure whether I did not sort it out because I don't know what I am doing or because it can't be done.

In either case, this either needs to be replaced by a better solution or merged for 1.5 as my last attempt to touch the parser really really broke it (and it turns out that comment was right).
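
As an illustration of a rule that depends on the next character, here is a sketch using a negative lookahead in Python's `re` module. It is not the pyparsing grammar that mathtext uses, and whether the same rule can be expressed cleanly there is left open. `ACCENT_NAMES` is an abbreviated, hypothetical list, and `match_accent` is a name made up for the example.

```python
import re

# Abbreviated, hypothetical accent list.
ACCENT_NAMES = ("dot", "ddot", "hat", "bar")

# Longest names first so 'ddot' is tried before 'dot'. The (?![A-Za-z])
# lookahead rejects the match when another letter follows the accent name,
# so '\doteq' is not read as '\dot' followed by 'eq'.
ACCENT_RE = re.compile(
    r"\\(" + "|".join(sorted(ACCENT_NAMES, key=len, reverse=True)) + r")(?![A-Za-z])"
)


def match_accent(text):
    """Return the accent name at the start of `text`, or None."""
    match = ACCENT_RE.match(text)
    return match.group(1) if match else None


print(match_accent(r"\dot{e}q"))  # accent name
print(match_accent(r"\doteq"))    # None: a letter follows 'dot'
```

This mirrors how TeX reads a control word as the longest run of letters after the backslash, so no explicit list of exceptions is needed in this toy version.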

mdboom · 2015-08-28T18:13:53Z

I'm happy to merge this once it's rebased and tested again...


tacaswell · 2015-08-29T02:57:56Z

@zblz @mdboom Both are rebased on current master and all of the test updating shoved into one commit.

tacaswell · 2015-08-29T03:09:02Z

~~The second round of updated test images is cases where all 9 tests failed for that test at a tolerance of 15~~

Changed my mind about including tests at a higher threshold than is currently required; I don't have the bandwidth to check that the new images are any more correct than the old.

tacaswell · 2015-09-01T15:35:12Z

ping @mdboom

FIX: mathtext accents

tacaswell added the Release critical and topic: text labels Aug 8, 2015

tacaswell added this to the next point release milestone Aug 8, 2015

tacaswell added the status: needs review label Aug 8, 2015

tacaswell force-pushed the fix_mathtext_accents branch from adcdcb1 to 98e167b Compare August 10, 2015 04:37

zblz mentioned this pull request Aug 10, 2015

mathtext: Finetuning sup/super block to match TeX reference #4873

Merged

jkseppan reviewed Aug 10, 2015

zblz mentioned this pull request Aug 27, 2015

mathtext: Increase space after spaced symbols #4872

Closed

tacaswell added 2 commits August 28, 2015 22:12

TST: update test symbols

ef62fd0

*Note this does not update the test image and still passes locally*

tacaswell force-pushed the fix_mathtext_accents branch from 98e167b to 46d9ee6 Compare August 29, 2015 02:35

zblz and others added 3 commits August 28, 2015 22:50

set do_kern=True in spaced symbols

acd48c6

add dotplus and doteqdot to relation symbols

5ad5a5d

TST: update mathtext tests

a6682a8

Updated:
- 01 (snowflake)
- 05 (spacing)
- 06 (spacing)
- 20 (spacing)
- 21 (spacing)
- 37 (spacing)
- 53 (spacing)
- 54 (spacing)

Added:
- 77 (snowflake testing)

tacaswell force-pushed the fix_mathtext_accents branch from 574d380 to a6682a8 Compare August 29, 2015 02:55

tacaswell force-pushed the fix_mathtext_accents branch from b44b671 to a6682a8 Compare August 29, 2015 03:20

mdboom added a commit that referenced this pull request Sep 1, 2015

Merge pull request #4887 from tacaswell/fix_mathtext_accents

b7ef642

FIX: mathtext accents

mdboom merged commit b7ef642 into matplotlib:master Sep 1, 2015

mdboom removed the status: needs review label Sep 1, 2015

tacaswell deleted the fix_mathtext_accents branch September 1, 2015 15:43

This was referenced Oct 22, 2015

BUG: Dot should not be spaced when used as a decimal separator #5301

Merged

TST: Lower tolerance of mathtext tests, update baseline images to master #5305

Closed
