-
-
Notifications
You must be signed in to change notification settings - Fork 7.9k
FIX: mathtext accents #4887
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FIX: mathtext accents #4887
Conversation
In adding tests for this I confirmed that out text tests are not strict enough to catch major issues. The second commit on this PR changes what symbols are in the test and it still passed locally. |
@tacaswell: Look at all those nice accents and symbols: Thanks for fixing this. Is there a way to set a per-test tolerance in the mathtext tests? I imagine we don't want to jack up the strictness for all tests, but we really need an accent test. |
I gave it a shot and made a new category in test_mathtext called The branch is here: https://github.com/zblz/matplotlib/tree/fix-accent-tests @tacaswell : do you want me to do a PR on that or will you cherry-pick them here? |
I think we do want to push the threshold up on all of them. I started to do some work last night to sort out what fails when the threshold is push down and some of them should fail for example |
fun thing to do, apply the following threshold diff --git a/lib/matplotlib/tests/test_mathtext.py b/lib/matplotlib/tests/test_mathtext.py
index 745588a..71d9d0a 100644
--- a/lib/matplotlib/tests/test_mathtext.py
+++ b/lib/matplotlib/tests/test_mathtext.py
@@ -151,7 +151,7 @@ for fonts, chars in font_test_specs:
def make_set(basename, fontset, tests, extensions=None):
def make_test(filename, test):
@image_comparison(baseline_images=[filename], extensions=extensions,
- tol=32)
+ tol=10)
def single_test():
matplotlib.rcParams['mathtext.fontset'] = fontset
fig = plt.figure(figsize=(5.25, 0.75))
with open('/home/tcaswell/other_source/matplotlib/math-fail.txt') as f:
lns = [ln.strip()[51:-5].split('_') for ln in f.readlines() if ln.startswith('FAIL:')]
df = pd.DataFrame({k: v for k, v in zip(['font', 'number'], zip(*lns))}) Gives
Some of the baseline images (like 01) are clearly wrong. |
Brain dump of my notes from the tests that have 9 of 9 fail with a
@zblz I can pull the last 3 commits if they are conflicting with what you have been working on which I have not been following as closely as I should have been. |
@tacaswell: They will definitely conflict with PRs #4872 and #4873, for they change spacing of symbols and position of sub/superscripts respectively. With If you merge this PR I can rebase on these without baseline images and then regenerate them. |
adcdcb1
to
98e167b
Compare
I pulled off the last three commits, I would rather fix them up all at once then have multiple generations of the test images in the repo. The changes to 01 do not conflict with your work. There will be a merge conflict on 72, but that should not be too hard to clean up. |
@matplotlib/developers @mdboom is on vacation, can anyone else review the changes to the parser logic? |
@@ -12,7 +12,7 @@ | |||
from matplotlib import mathtext | |||
|
|||
math_tests = [ | |||
r'$a+b+\dots+\dot{s}+\ldots$', | |||
r'$a+b+\dot s+\dot{s}+\ldots$', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pretty sure this case is intended to test that \dots produces three dots and \dot{s} produces an s with a dot over it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated the test to match the image, I will update image to test all three.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to the TeX definitions, \dots
should not produce three dots: only \ldots
should. \dots
would give an error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zblz Can you point me to a reference on that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, I just checked and it appears I was wrong. Sorry for that. \dots
should indeed produce three dots and is a text and math command, whereas \ldots
(and other \cdots
, etc) are math-only commands. I checked it in the LaTeX comprehensive symbol list table 3 for math and text and table 189 for math only.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The \dots that I know is a macro in amsmath, it includes some magic to choose a suitable height for the dots: http://tex.stackexchange.com/questions/649/how-do-magic-dots-work-in-amsmath
The changes to the parser seem like a bit of a hack, but I don't know if that should hold up this PR. I think longer term, the mathtext parser is due for a simplification or overhaul -- it's grown rather crufty with special cases over the years. But perfect is probably the enemy of the good here. At least |
I picked the name snow flake because it is a hack 😉 I spent a fair amount of time trying to write parsing rules that would not match depending on the next character, but I don't have enough experience with parse/lexxing to be sure if I did not sort it out because I don't know what I am doing or because it can't be done. In either case, this either needs to be replaced by a better solution or merged for 1.5 as my last attempt to touch the parser really really broke it (and it turns out that comment was right). |
I'm happy to merge this once it's rebased and tested again... |
This is a follow up to matplotlib#4588 The change in ae91e9f fixed the symbols that started with accent names, but broke all of the other accents. This fixes both by special casing the 8 named symbols that start with an accent as a prefix. ex '\doteq' should be parsed as a single symbol, not as as two symbols (e, q) with a dot over the e ('\dot{e}q')
*Note this does not update the test image and still passes locally*
98e167b
to
46d9ee6
Compare
Updated - 01 (snowflake) - 05 (spacing) - 06 (spacing) - 20 (spacing) - 21 (spacing) - 37 (spacing) - 53 (spacing) - 54 (spacing) Added: - 77 (snowflake testing)
574d380
to
a6682a8
Compare
Changed my mind about including tests at a higher threshold that is currently required, I don't have the bandwith to check that the new images are any more correct that the old. |
b44b671
to
a6682a8
Compare
ping @mdboom |
This is a follow up to #4588
The change in ae91e9f fixed
the symbols that started with accent names, but broke all of
the other accents.
This fixes both by special casing the 8 named symbols that start with
an accent as a prefix.
ex '\doteq' should be parsed as a single symbol, not as as two symbols
(e, q) with a dot over the e ('\dot{e}q')
attn @zblz