Use metric identifiers to parse an AFM character metric line #2783

jdeblese · 2014-02-02T19:26:52Z

An AFM Character Metric line contains key-value pairs, with some of these being mandatory and some optional. Matplotlib currently parses this using absolute positions, but there are no guarantees as to the order of keys on a line. For example, a Chinese font converted through FontForge results in an AFM with the following Character Metric:

C 0 ; WX 1000 ; WY 1377 ; N uni5200 ; B 73 -131 890 665 ;

The solution is to make use of the keys when parsing the metrics. Parsing a line out into a dictionary, as below, is one straightforward way.

Cheers,
Jw

jdeblese · 2014-02-19T14:34:53Z

Travis build failed due to a problem in string_width_height, which I never touched, and because I was mixing binary and unicode strings. I've fixed the latter.

tacaswell · 2014-02-19T15:07:06Z

@mdboom Can you take a look at this one?

tacaswell · 2015-02-07T23:04:44Z

@jdeblese Can you take another look at this? It looks like string_width_height uses the output from the code you changes and somehow empty things into the metrics dictionary.

jdeblese · 2015-02-08T03:38:44Z

Ok, I'll take another look. You're referring to the errors in the Travis build, right?

It may just be the 'list()' bit that I dropped from line 209, but I'll need to set up my test environment again to confirm that.

jdeblese · 2015-05-24T22:08:35Z

Ok, it was the missing "list()", and PEP8 compliance. Now passes the Travis checks.

tacaswell · 2015-05-24T22:22:14Z

Are all of the explicit 'b' markers really needed?

tacaswell · 2015-05-24T22:22:34Z

Other than that, this looks good to me 👍 .

jdeblese · 2015-05-24T22:37:54Z

I had some issues in python 3 back in 2013 without them, if I recall
correctly. Worth removing them?
On May 24, 2015 3:22 PM, "Thomas A Caswell" notifications@github.com
wrote:

Are all of the explicit 'b' markers really needed?

—
Reply to this email directly or view it on GitHub
#2783 (comment)
.

tacaswell · 2015-05-24T22:45:19Z

If it isn't too much trouble could you see which are not needed? My guess
is the startswith calls will need them and the dictionary getitems will
not.

It is one of those things that once it is there, and people forget/don't
know why, it will never go away and live on forever as a funny wart in the
code base. As long as you are touching this code (and you have enough
mental state of what is going on) it is worth cleaning up.

Are there examples of afm's like this we can download as part of the
testing on travis?

On Sun, May 24, 2015 at 6:37 PM Jan-willem De Bleser <
notifications@github.com> wrote:

I had some issues in python 3 back in 2013 without them, if I recall
correctly. Worth removing them?
On May 24, 2015 3:22 PM, "Thomas A Caswell" notifications@github.com
wrote:

Are all of the explicit 'b' markers really needed?

—
Reply to this email directly or view it on GitHub
<
https://github.com/matplotlib/matplotlib/pull/2783#issuecomment-105071672>
.

—
Reply to this email directly or view it on GitHub
#2783 (comment)
.

jdeblese · 2015-05-25T04:58:17Z

The line I give as example comes from the STXihei font, which I converted to an afm using fontforge. Probably can't include that one as part of the test suite, but maybe we can find others under more open licenses.

I'll take another look at where the 'b's might be needed.

tacaswell · 2015-05-25T04:58:46Z

@jdeblese Awesome, thank you.

jdeblese · 2015-05-25T05:31:18Z

Ok, so on my machine it complains if I remove any of the 'b' prefixes. The issue as I see it is that both the dictionary values as well as the keys are created directly from the byte literal string, without any conversion. Just using vals['WX'], for instance, will give me a key not found error, as 'WX' != b'WX'. Either I'll have to leave them all as byte literals, or I first have to convert 'line' to a unicode string.

I can change the lambda inside the filter call to (len(s) > 0) instead of (s != b''), however.

jdeblese · 2015-05-25T06:43:51Z

AFM files should be ASCII anyway, and the code already does a convert for 'name', so I pushed a commit that converts 'line' to a unicode string before any processing. Oddly enough Travis now fails for 2.6 with a file not found error on a part of the backend code, although not sure why.

We can either dig out what happened there, or I can revert that last commit and we just live with the warts.

Not much luck re: fonts. There's one or two Chinese fonts available under a GPL or Apache license, but I haven't tried seeing if they produce more complex metric files as I don't know if these could be distributed as part of the tests.

tacaswell · 2015-06-22T03:55:45Z

@jdeblese Sorry this fell off my radar.

The test failure was transient, we sometimes get funny things like that on travis (but it is free and generally awesome so it is ok). Restarted the test and it passed.

Thanks for your work on this!

FIX: Use metric identifiers to parse an AFM character metric line Don't make assumptions about order.

jdeblese · 2015-06-22T15:49:18Z

No problem, no rush on my part. Glad to hear it passes now, and if I can think of some test cases I'll make sure to pass them along.

Thanks for merging it!

jdeblese added 2 commits February 2, 2014 20:16

Use metric identifiers to parse an AFM character metric line

4e1f19a

Be consistent in string type, or python 3 complains

02d2e83

tacaswell added this to the v1.4.x milestone Jul 12, 2014

tacaswell assigned mdboom Jul 12, 2014

tacaswell modified the milestones: 1.5.0, v1.4.x Feb 7, 2015

tacaswell added the status: needs revision label Feb 7, 2015

jdeblese added 3 commits May 24, 2015 14:14

Var type of 'bbox' should be list, causes issues otherwise

240c5df

Reformatted to respect maximum line length

eb9250a

PEP8 Conformance

f5396cf

tacaswell added status: needs review and removed status: needs revision labels May 24, 2015

Convert byte-literal to string to avoid use of b''

5fc9809

tacaswell added a commit that referenced this pull request Jun 22, 2015

Merge pull request #2783 from jdeblese/master

e508f4e

FIX: Use metric identifiers to parse an AFM character metric line Don't make assumptions about order.

tacaswell merged commit e508f4e into matplotlib:master Jun 22, 2015

tacaswell removed the status: needs review label Jun 22, 2015

Uh oh!

Use metric identifiers to parse an AFM character metric line #2783

Use metric identifiers to parse an AFM character metric line #2783

Uh oh!

Conversation

jdeblese commented Feb 2, 2014

Uh oh!

jdeblese commented Feb 19, 2014

Uh oh!

tacaswell commented Feb 19, 2014

Uh oh!

tacaswell commented Feb 7, 2015

Uh oh!

jdeblese commented Feb 8, 2015

Uh oh!

jdeblese commented May 24, 2015

Uh oh!

tacaswell commented May 24, 2015

Uh oh!

tacaswell commented May 24, 2015

Uh oh!

jdeblese commented May 24, 2015

Uh oh!

tacaswell commented May 24, 2015

Uh oh!

jdeblese commented May 25, 2015

Uh oh!

tacaswell commented May 25, 2015

Uh oh!

jdeblese commented May 25, 2015

Uh oh!

jdeblese commented May 25, 2015

Uh oh!

tacaswell commented Jun 22, 2015

Uh oh!

jdeblese commented Jun 22, 2015

Uh oh!

Uh oh!