issues parsing AFM fonts #12987

Closed
tacaswell opened this issue Dec 14, 2018 · 8 comments · Fixed by #13441

@tacaswell
Member

tacaswell commented Dec 14, 2018

While investigating the Linux failures in #12982, I discovered that we are not parsing AFM files correctly.

https://www.adobe.com/content/dam/acom/en/devnet/font/pdfs/5004.AFM_Spec.pdf for reference...

  1. Weird values for the underline information. Some files have lines like
    UnderlinePosition -41,5039
    UnderlineThickness 43,9453
    
    where we expect a "number", which can be a decimal, that we can cast to float (see page 10 of the PDF). An example of this (on Arch) is /usr/share/texmf-dist/fonts/afm/public/dejavu/DejaVuSansCondensed-BoldOblique.afm
  2. Some strings appear to be encoded as latin-1 (b'Notice' b'Copyright (c) Typographer Mediengestaltung, 2000. All rights reserved. Zallman-Caps \xa9 1991 by D. Rakowski.') in order to include ©. According to the first section of the linked PDF, AFM files are ASCII encoded.
  3. Some fonts parse with weights that are not in the weight dictionary (which is what caused the failure on Azure):
In [7]: [(f.fname, f.weight) for f in fm.afmlist if f.weight not in matplotlib.font_manager.weight_dict]
Out[7]: 
[('/usr/share/texmf-dist/fonts/afm/public/aurical/AuriocusKalligraphicusRslant.afm',
  'reverseoblique'),
 ('/usr/share/texmf-dist/fonts/afm/public/aurical/AuriocusKalligraphicusSlant.afm',
  'oblique'),
 ('/usr/share/texmf-dist/fonts/afm/public/aurical/AmiciLogoRslant.afm',
  'reverseoblique'),
 ('/usr/share/texmf-dist/fonts/afm/public/prodint/prodint.afm', 'plain'),
 ('/usr/share/texmf-dist/fonts/afm/public/aurical/LukasSvatbaSlant.afm',
  'oblique'),
 ('/usr/share/texmf-dist/fonts/afm/public/aurical/JanaSkrivanaRslant.afm',
  'reverseoblique'),
 ('/usr/share/texmf-dist/fonts/afm/public/txfontsb/FreeSerifb-SmallCapsAlt.afm',
  'smallcaps'),
 ('/usr/share/texmf-dist/fonts/afm/public/aurical/LukasSvatbaRslant.afm',
  'reverseoblique'),
 ('/usr/share/texmf-dist/fonts/afm/public/cochineal/Cochineal-Italic.afm',
  'italic'),
 ('/usr/share/texmf-dist/fonts/afm/public/aurical/LukasSvatbaBoldRslant.afm',
  'boldreverseoblique'),
 ('/usr/share/texmf-dist/fonts/afm/public/tex-gyre/qcsbi.afm', 'bolditalic'),
 ('/usr/share/texmf-dist/fonts/afm/public/aurical/JanaSkrivanaBoldRslant.afm',
  'boldreverseoblique'),
 ('/usr/share/texmf-dist/fonts/afm/public/mathpazo/fplmbb.afm',
  'blackboardbold'),
 ('/usr/share/texmf-dist/fonts/afm/public/dejavu/DejaVuSans-ExtraLight.afm',
  'extra-light'),
 ('/usr/share/texmf-dist/fonts/afm/public/aurical/JanaSkrivanaBoldSlant.afm',
  'boldoblique'),
 ('/usr/share/texmf-dist/fonts/afm/public/aurical/AuriocusKalligraphicusBoldRslant.afm',
  'boldreverseoblique'),
 ('/usr/share/texmf-dist/fonts/afm/public/aurical/AuriocusKalligraphicusBoldSlant.afm',
  'boldoblique'),
 ('/usr/share/texmf-dist/fonts/afm/public/aurical/AmiciLogoSlant.afm',
  'oblique'),
 ('/usr/share/texmf-dist/fonts/afm/public/txfontsb/FreeSerifb-SmallCaps.afm',
  'smallcaps'),
 ('/usr/share/texmf-dist/fonts/afm/public/aurical/AmiciLogoBoldRslant.afm',
  'boldreverseoblique'),
 ('/usr/share/texmf-dist/fonts/afm/public/aurical/AmiciLogoBoldSlant.afm',
  'boldoblique'),
 ('/usr/share/texmf-dist/fonts/afm/public/aurical/LukasSvatbaBoldSlant.afm',
  'boldoblique'),
 ('/usr/share/texmf-dist/fonts/afm/public/aurical/JanaSkrivanaSlant.afm',
  'oblique')]

These are mostly weights that should have been split into a weight + a slant or a weight + a style.
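For item 2, a minimal sketch of a tolerant decode: try ASCII first (what the spec says) and fall back to latin-1 for non-conforming files. The helper name `decode_afm_value` is hypothetical and is not part of matplotlib; the sample bytes are the tail of the Notice string quoted above.

```python
def decode_afm_value(raw: bytes) -> str:
    """Decode an AFM header value, preferring ASCII but tolerating latin-1.

    Hypothetical helper for illustration only.
    """
    try:
        return raw.decode("ascii")
    except UnicodeDecodeError:
        # latin-1 maps every byte value to a code point, so this cannot fail.
        return raw.decode("latin-1")


notice = b"Zallman-Caps \xa9 1991 by D. Rakowski."
print(decode_afm_value(notice))  # Zallman-Caps © 1991 by D. Rakowski.
```

Since latin-1 never raises, the fallback keeps the parser going, at the cost of possibly mis-rendering bytes that were really written in some other encoding.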

@tacaswell tacaswell added this to the v3.0.3 milestone Dec 14, 2018
@anntzer
Contributor

anntzer commented Dec 14, 2018

re: #12982 (comment)

I am a bit surprised we have not had bug reports on this already.

I think this is because I just made this stricter in #12565... can revert if you want (and remilestoning as 3.1 as that's where that PR lands).

The parsing should probably do the same thing as font_manager does, i.e. look for substrings matching known weight strings.
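A rough sketch of that substring idea, assuming an illustrative list of weight names (not necessarily the exact contents of font_manager.weight_dict). The function name `guess_weight` and the "normal" default are hypothetical.

```python
# Illustrative weight names; an assumption, not copied from matplotlib.
KNOWN_WEIGHTS = [
    "ultralight", "light", "normal", "regular", "book", "medium", "roman",
    "semibold", "demibold", "demi", "bold", "heavy", "extra bold", "black",
]


def guess_weight(raw_weight, default="normal"):
    """Return the first known weight found as a substring of raw_weight."""
    name = raw_weight.lower()
    # Try longer names first so that "demibold" wins over "demi" or "bold".
    for weight in sorted(KNOWN_WEIGHTS, key=len, reverse=True):
        if weight in name:
            return weight
    return default


for raw in ["boldreverseoblique", "oblique", "extra-light", "bolditalic"]:
    print(raw, "->", guess_weight(raw))
```

Substring matching can still misfire: with this list, 'blackboardbold' matches "black" before "bold", so it is a heuristic rather than a proper split into weight and slant.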

@anntzer anntzer modified the milestones: v3.0.3, v3.1 Dec 14, 2018
@tacaswell
Member Author

Ah, I verified that the parsing is the same in 3.0.2, but not that we got the crash in 3.0.2.

I am much less concerned about this now that it is only on 3.1.

@tacaswell tacaswell added the Release critical For bugs that make the library unusable (segfaults, incorrect plots, etc) and major regressions. label Dec 14, 2018
@tacaswell tacaswell removed the Release critical For bugs that make the library unusable (segfaults, incorrect plots, etc) and major regressions. label Dec 17, 2018
@tacaswell
Member Author

3 is fixed by #12991; 2 is just annoying but harmless (and only happens with fonts that do not conform to the AFM spec); 1 may be floats written in a locale where "," is used as the decimal separator. Internally we are casting this to an int; maybe we want to use a float internally?
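If the comma really is a locale decimal separator, a tolerant conversion could look roughly like the sketch below. The name `to_float_tolerant` is hypothetical, and the sketch assumes that "," is never used as a thousands separator in these files.

```python
def to_float_tolerant(token):
    """Parse an AFM number, accepting "," as the decimal separator.

    Hypothetical helper; assumes "," never acts as a thousands separator.
    """
    if isinstance(token, bytes):
        # The encoding hardly matters here: any non-ASCII character
        # would make float() raise anyway.
        token = token.decode("latin-1")
    return float(token.replace(",", "."))


print(to_float_tolerant(b"-41,5039"))  # -41.5039
print(to_float_tolerant("43,9453"))    # 43.9453
```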

@anntzer
Contributor

anntzer commented Dec 17, 2018

As noted in #12991 (comment) it's really not clear at all to me what the point of this entire afm parsing is (other than for the pdf 14 core fonts, for which we could just hardcode the parsed data anyways)...

@tacaswell
Member Author

The issue in 3 came up when searching for the font closest to what the user asked for. I know there has been discussion of moving to just using fontconfig, but I am not sure that exists on Windows.

@anntzer
Contributor

anntzer commented Dec 17, 2018

That's not a fontconfig issue.
The only places we use afm fonts are:

  • in the pdf backend, when the pdf.use14corefonts rcparam is set. In that case the 14 core fonts are, well, fixed and we could just store the parsed versions once and for all (we can even leave the afm module in tools/, but not as public API so that we know how the parsed versions are generated).
  • in the ps backend, when the ps.useafm rcparam is set. First there's the case of the postscript 35 core fonts, which are essentially like the pdf 14 core fonts. Then we also pick up whatever other afm fonts are installed in the system; but for these, the user needs to have the actual font face (pfa/pfb) to make any use of the generated postscript file, so we may as well directly look for these (and freetype handles them just fine).

Brief reminder: The afm file only contains the metrics and kerning of each glyph (and font metadata), so the pdf/ps backends can use them to position text (with metrics, you know that the string "foo" is 12.34pt tall and 45.67pt wide), but any pdf or ps viewer will need to have access to the glyph outlines (i.e. the shape of each glyph), which are stored in a pfa/pfb file (for example) to actually render the pdf/ps; this is excluding the 14 pdf core fonts / 35 postscript core fonts that each (compliant) pdf/ps renderer is supposed to have embedded.

TLDR: When searching for the font closest to what the user asked for, if they want a core font then it is already parsed; otherwise we can just look up the pfa/pfb files, which FreeType handles for us.
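To make the "metrics only" point concrete, here is a small sketch of how a backend could measure a string from AFM-style advance widths, which are expressed in 1/1000 of the font size. The width table is illustrative (values assumed for the sake of the example), `string_width_pt` is a hypothetical name, and kerning is ignored.

```python
# Illustrative advance widths in 1/1000 em; assumed values for the example.
GLYPH_WIDTHS = {"f": 278, "o": 556}


def string_width_pt(text, font_size_pt, widths=GLYPH_WIDTHS):
    """Sum per-glyph advance widths and scale them to points (no kerning)."""
    total_units = sum(widths[ch] for ch in text)
    return total_units * font_size_pt / 1000.0


print(string_width_pt("foo", 10))  # 13.9
```

Nothing here says what the glyphs look like, which is why the viewer still needs the outlines from a pfa/pfb file or a built-in core font.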

@tacaswell
Member Author

Fair enough, do we currently look for the pfa/pfb files?

@anntzer
Contributor

anntzer commented Dec 18, 2018

No, but we could. On most systems we likely won't find anything, because they are installed in paths that we don't search (e.g. the texlive path) -- just like for afm, where at least on my system matplotlib only finds the fonts it ships itself.
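A sketch of what such a lookup might look like, assuming we are simply handed a list of directories to scan. `find_type1_fonts` is a hypothetical name, and the texlive directory in the usage line is an assumed location, not one matplotlib is known to search.

```python
from pathlib import Path


def find_type1_fonts(directories):
    """Recursively collect .pfa/.pfb files under the given directories."""
    found = []
    for directory in directories:
        root = Path(directory)
        if not root.is_dir():
            # Skip paths that do not exist on this system.
            continue
        for path in root.rglob("*"):
            if path.is_file() and path.suffix.lower() in {".pfa", ".pfb"}:
                found.append(path)
    return sorted(found)


# Assumed texlive location; yields an empty list if the path is absent.
print(len(find_type1_fonts(["/usr/share/texmf-dist/fonts/type1"])))
```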
