MEP 31: dimension unit handling #9226

Closed
jklymak opened this issue Sep 25, 2017 · 32 comments
Comments

@jklymak
Member

jklymak commented Sep 25, 2017

Added MEP 31 to discuss dimension unit handling:

https://github.com/matplotlib/matplotlib/wiki/MEP-31%3A-dimension-units-handling

@anntzer
Contributor

anntzer commented Sep 25, 2017

Can we consider using 1.23*cm instead? (where cm, etc. would be magic objects imported from dimunits). I think that would look better to me. (Looks like that's basically the pint syntax.)

@jklymak
Member Author

jklymak commented Sep 25, 2017

I'd definitely consider that, but it seems obscure to me. What is a "magic object"?

@anntzer
Contributor

anntzer commented Sep 25, 2017

The implementation details don't matter much at that point, but it's basically an object that defines a __mul__ so that the above expression "works".
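
A minimal sketch of such an object, for illustration only. The class name LengthUnit and the names pt, inch and cm are hypothetical and not part of matplotlib. Note that for 1.23*cm Python first tries float.__mul__, which returns NotImplemented, and then falls back to the reflected method __rmul__ on the unit, so that is the method that has to exist for the number-first spelling.

```python
# Hypothetical sketch: LengthUnit is not an existing matplotlib class.
class LengthUnit:
    """A length stored in typographic points (1 pt = 1/72 in)."""

    def __init__(self, points):
        self.points = float(points)

    def __rmul__(self, number):
        # Reached for `1.23 * cm` after float.__mul__ returns NotImplemented.
        if not isinstance(number, (int, float)):
            return NotImplemented  # e.g. cm * cm is rejected with a TypeError
        return LengthUnit(number * self.points)

    # Accept the unit-first spelling `cm * 1.23` as well.
    __mul__ = __rmul__

    def __repr__(self):
        return f"LengthUnit({self.points:g} pt)"


pt = LengthUnit(1.0)
inch = LengthUnit(72.0)
cm = LengthUnit(72.0 / 2.54)

pad = 1.23 * cm  # a new LengthUnit holding the padding in points
```

Because the product is itself a LengthUnit, a scaled unit can be reused as a unit in later expressions.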

@jklymak
Member Author

jklymak commented Sep 25, 2017

So something like?

import matplotlib.dimunits as du

pad=1.23 * du.cm
plt.dolayoutthingy(padding=pad)

where __mul__ is defined for a dimunits.cm object?

My only hesitation is that it requires another import by the user. But maybe you meant something even fancier...

@anntzer
Contributor

anntzer commented Sep 25, 2017

Yes but I would encourage just from matplotlib.units import cm. It'll just go into the large list of imports that's already at the top of every file... (numpy pyplot etc.)

@jklymak
Member Author

jklymak commented Sep 26, 2017

OK, I think I understand.

I'm not sure that it's a better way to go than a simple string and then a minimal parser inside the library. Requiring another import adds user friction. But maybe there are advantages that counterbalance that.
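
For comparison, a "minimal parser" for the string form could look roughly like the sketch below. The function name parse_length and the unit table are assumptions made for illustration; context-dependent units such as em or a relative unit are left out because they need a reference length.

```python
import re

# Hypothetical unit table: points per unit (1 pt = 1/72 in, 1 in = 2.54 cm).
POINTS_PER_UNIT = {"pt": 1.0, "in": 72.0, "cm": 72.0 / 2.54, "mm": 72.0 / 25.4}

# A decimal number followed by a lowercase unit suffix, e.g. "1.23cm".
_LENGTH_RE = re.compile(r"\s*([-+]?(?:\d+\.?\d*|\.\d+))\s*([a-z]+)\s*")


def parse_length(text):
    """Convert a string such as '1.23cm' to points, or raise ValueError."""
    match = _LENGTH_RE.fullmatch(text)
    if match is None or match.group(2) not in POINTS_PER_UNIT:
        raise ValueError(f"cannot parse length {text!r}")
    return float(match.group(1)) * POINTS_PER_UNIT[match.group(2)]
```

With this table, parse_length("3pt") gives 3.0, while an unknown suffix such as "2cms" raises ValueError.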

@u55
Contributor

u55 commented Sep 26, 2017

One advantage of using a multiplicative object versus string parsing is that the expression does not have to be exactly three characters, as was proposed in MEP-31. Writing 'cms' instead of 'cm' as an abbreviation for centimeter(s) looks very strange to me, and violates the SI specification. This also allows users to define their own custom units as easily as:

from matplotlib.units import cm
light_nanosecond = 30*cm

@jklymak
Member Author

jklymak commented Sep 26, 2017

@u55 Note that this is just for layout in matplotlib, not an arbitrary units package.

I suppose there may be use cases where a user could want to define their own unit, but I'd still argue that little benefit is outweighed by the cost to 99% of users who just want to specify usual typography units. I'm not trying to be obstinate here, so more counter-arguments are great!

As for the three-character unit name, that's totally flexible. I only did it so that relative units can be rel. Otherwise, I agree that two characters make sense.

@u55
Contributor

u55 commented Sep 26, 2017

@jklymak Just spitballing here.

... outweighed by the cost ...

I don't see the cost. I have never been annoyed by another optional import statement when it gains me flexibility.

... 99% of users who just want to specify usual typography units.

But what constitutes "usual typography units" to you may be different for another person. For instance, LaTeX defines other typography units that you have ignored: "ex", "bp", "pc", "dd", "cc", "nd", "nc", "sp".

@anntzer
Contributor

anntzer commented Sep 26, 2017

I didn't notice it, but if we go for strings, please no "s" in cm, in, pt, em. I'd rather have the parsing code be a tiny more complicated if needed.

@jklymak
Member Author

jklymak commented Sep 26, 2017

Fair enough about not adding s. Certainly we can include any arbitrary unit folks ask for. My list wasn’t meant to be exhaustive.

My issue with an added import is that I think we would prefer dimensions be expressed this way across the library. If we want people to do this then I think it should be as frictionless as possible. Coming from a non-python universe I find going to the top of the file and adding yet another import to be a friction. It’s also a documentation problem. “In order to specify dimensions you need to import the dimunits package and then multiply your number times the desired dimension” is not as clear to me as “enter a string with the desired dimension as the suffix”.

@afvincent
Contributor

@jklymak I guess it is more a problem of a long-lasting difference between non-Pythonistas or non-tech-savvy end-users who “simply” want to plot things with Matplotlib, and people who are more comfortable with Python and develop third-party tools based on Matplotlib.

I do not really know anything about the packaging philosophy of Matplotlib, but maybe we could think about providing some kind of import shortcuts for the most used units (if we go for multiplicative objects). Something like plt.inches, plt.centimeters (I think plt.cm is already taken :/), etc. as we already do for some formatters, colormaps, etc. (I am assuming that import matplotlib.pyplot as plt is very likely to always be used by end-users). This may decrease a bit the “friction”.

My 2 cents.

@u55
Contributor

u55 commented Sep 26, 2017

I guess I will walk back my objection about flexibility. Both pad='1.23cm' and pad=1.23*cm are flexible enough to allow custom units. My only objection about strings is the need for users to convert numbers to strings (only to have matplotlib functions convert the strings back to numbers). To me, keeping numbers as numbers, and optionally multiplying them by another number (a unit), seems more natural.

@anntzer
Contributor

anntzer commented Sep 26, 2017

@afvincent too bad "in" can't be used anyway, but I guess that suggests something like from matplotlib.pyplot import in_, cm_, ....

@jklymak
Member Author

jklymak commented Sep 26, 2017

@u55 That makes sense. I guess I was coming at it from the point of view that users would be unlikely to programmatically determine these dimensions, so typing it one way versus the other doesn't matter. Certainly, however, if these are going to be manipulated in scripts I can easily see the advantage of making these class instances rather than strings.

@jklymak
Member Author

jklymak commented Sep 26, 2017

@afvincent I just want to make sure that we are not being "tech-savvy" just to be "tech-savvy" and that there is an actual benefit to doing something more complicated. If there is a good reason to use the sledgehammer to crack a nut, then I'm all for it...

@u55
Contributor

u55 commented Sep 26, 2017

@jklymak How do you propose to handle 'dpi', which has units of inverse length?

For instance, this makes sense to me:

from matplotlib.units import cm
plt.figure(dpi=100/cm)

But for strings, we would have to define a syntax such as:

plt.figure(dpi='100/cm')
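
To make the object variant concrete, here is a rough sketch of how division by a unit could yield an inverse length. The Length and Resolution classes are hypothetical and do not exist in matplotlib. On Python 3 the `/` operator with a number on the left dispatches to `__rtruediv__` on the unit.

```python
# Hypothetical sketch; neither class exists in matplotlib.
class Length:
    def __init__(self, inches):
        self.inches = float(inches)

    def __rmul__(self, number):
        # number * unit -> a scaled Length
        if not isinstance(number, (int, float)):
            return NotImplemented
        return Length(number * self.inches)

    def __rtruediv__(self, number):
        # number / unit -> "number per unit", an inverse length
        if not isinstance(number, (int, float)):
            return NotImplemented
        return Resolution(number / self.inches)


class Resolution:
    """A count per unit length, stored per inch."""

    def __init__(self, per_inch):
        self.per_inch = float(per_inch)


inch = Length(1.0)
cm = Length(1.0 / 2.54)

# 100 dots per centimeter, expressed in dots per inch (about 254).
dpi = (100 / cm).per_inch
```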

@anntzer
Contributor

anntzer commented Sep 26, 2017

I think dpi should always be "dots per inch" regardless of anything else...

@jklymak
Member Author

jklymak commented Sep 26, 2017

@u55

I wasn't proposing that this handle inverse lengths. Are there other inverse lengths than dpi?

I think there have been many, many people who would like to specify figure sizes in centimeters, and padding can sensibly be done in many units, and is done inconsistently (e.g. colorbar uses a fraction of the subplotspec for pad, other pads are in points). So the goal here is to enforce consistency while still allowing flexibility. I don't think dpi is ever inconsistently defined (how it's applied is a different issue).

@u55
Contributor

u55 commented Sep 27, 2017

@anntzer Just playing Devil's advocate here, does this mean you intend to support

fig.set_size_inches(5*cm, 4*cm)

but not

fig.set_dpi(100/cm)

even though they both have "inches" implied in their names? Why be inconsistent? I thought the purpose of a units system was to allow users to specify quantities in their preferred units instead of the ones used by matplotlib functions internally?

@jklymak
Member Author

jklymak commented Sep 27, 2017

@u55 The purpose is to introduce consistency among the various places lengths are specified. Given that dpi doesn't have an inconsistency problem, I wouldn't choose one approach over another based on the ability to include dpi.

@anntzer
Contributor

anntzer commented Sep 27, 2017

I would rename set_size_inches to set_size (set_size_inches(5*cm, 4*cm) just seems awful to me) and leave dpi as it is (even in France, where the imperial system is basically unheard of, printer resolutions are quoted in "points par pouce", i.e. dpi (although I just realized that having the same word for dot and for point is probably quite confusing too...)).

@u55
Contributor

u55 commented Sep 27, 2017

Okay, it seems that I am fighting a losing battle. I concede. But I still think that the pervasiveness of the Imperial system is a poor reason not to supplant it.

As a counter-example to French printer resolutions, the PNG specification states that the intended pixel density is stored in units of pixels per meter.
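
For reference, the conversion between the two conventions is simple arithmetic. PNG stores the density in its pHYs chunk as unsigned integers, with the unit given as the meter, so the value has to be rounded. The helper names below are made up for this sketch.

```python
METERS_PER_INCH = 0.0254  # exact, by definition of the inch


def dpi_to_pixels_per_meter(dpi):
    """Convert dots per inch to the integer pixels-per-meter value PNG stores."""
    return round(dpi / METERS_PER_INCH)


def pixels_per_meter_to_dpi(pixels_per_meter):
    """Convert a stored pixels-per-meter value back to dots per inch."""
    return pixels_per_meter * METERS_PER_INCH
```

Because of the integer rounding, a round trip does not return exactly the original dpi: 100 dpi is stored as 3937 pixels per meter, which reads back as slightly less than 100.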

@jklymak
Member Author

jklymak commented Sep 27, 2017

I think if we go with a class with a __mul__ then it wouldn't hurt to also do the __rdiv__ (__rtruediv__ on Python 3). I just am not convinced that it's a strong argument for implementing such a class over the original string proposal.

So just to summarize the length arguments we have as contenders:

  1. pad="3pt", pad="0.1fr": no extra imports, but no direct math on the value unless the user does some string formatting, e.g. factor=2.5, pad="%fpt" % (factor*3).
  2. from dimunits import pt, fr; then pad=3*pt, pad=0.1*fr: an extra import, but math works directly, e.g. factor=2.5, pad=factor*3*pt, pad=factor*0.1*fr.

Any other pluses and minuses?
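
The two contenders can be put side by side in a small sketch. Everything here is hypothetical (the Dim class, from_string, and the pt and fr names); it only illustrates that both spellings can end up as the same internal value, and that a relative unit needs a reference length before it can be resolved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dim:
    """Hypothetical dimension: a value plus 'pt' (absolute) or 'fr' (relative)."""
    value: float
    unit: str

    def __rmul__(self, number):
        # Supports `3 * pt` and `factor * 3 * pt`.
        if not isinstance(number, (int, float)):
            return NotImplemented
        return Dim(number * self.value, self.unit)

    def to_points(self, reference_pt):
        # A fraction is resolved against a reference length given in points.
        return self.value * reference_pt if self.unit == "fr" else self.value


pt = Dim(1.0, "pt")
fr = Dim(1.0, "fr")


def from_string(text):
    """Option 1: parse '3pt' or '0.1fr' (two-character unit suffix)."""
    unit = text[-2:]
    if unit not in ("pt", "fr"):
        raise ValueError(f"unknown unit in {text!r}")
    return Dim(float(text[:-2]), unit)


factor = 2.5
option1 = from_string("%fpt" % (factor * 3))  # math via string formatting
option2 = factor * 3 * pt                     # math directly on the object
```

Here option1 and option2 compare equal, so the choice between the two is about the user-facing syntax rather than about what the library could do with the value afterwards.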

@u55
Contributor

u55 commented Sep 27, 2017

In the interest of completeness, although I prefer the units-as-objects syntax, I should point out that units-as-strings would be easier to define in and parse from the matplotlibrc; unit objects would be hard to parse there, in the same way that cycler instances are.

@anntzer
Contributor

anntzer commented Sep 27, 2017

I have a big plan to kill the current matplotlibrc syntax (as argued in other places), but will probably need to write a MEP to convince everyone...

Re: imports: instead of

import matplotlib.pyplot as plt

I usually write

from matplotlib import pyplot as plt

in which case, if the units are exposed at the toplevel (perhaps not a crazy idea?), one could just do

from matplotlib import pyplot as plt, cm, ...

@u55
Contributor

u55 commented Sep 27, 2017

@anntzer matplotlib.cm already exists as a sub-module for colormaps. I prefer to keep units in a separate namespace anyway, so that I can do:

from matplotlib import units as u
myborder = 0.5*u.cm

which mirrors the astropy.units documentation.

@anntzer
Contributor

anntzer commented Sep 27, 2017

(just throwing ideas now) or uppercase them, which also solves the problem with "in" being a keyword.
Yes, SI says units should be lowercased, but SI also says you shouldn't put a multiplication sign between the number and the unit, so...

@astrojuanlu

There are a number of competing units systems (Pint, astropy.units, probably others). Would it be worth it to work with them to include them in matplotlib, if the dependency policy allows?

@jklymak
Member Author

jklymak commented Sep 29, 2017

Yeah I know about those. Unless we want to support unit mathematics, which I don't think is really necessary, the main "tricky" part is making sure the right units are being used for the right dimensions. So we'd need a matplotlib wrapper at the very least.

@jrueb

jrueb commented Nov 15, 2019

Has there been any update on this? Does matplotlib allow for international units now in some way?

@jklymak
Member Author

jklymak commented Nov 15, 2019

@jrueb The core devs are relatively skeptical that adding this is worth the extra code/documentation/maintenance. We will never get away from point (defined as 1/72 of an inch) and dpi, so doing things in inches, while arbitrary, is simple. Admittedly this grates on folks, but you quickly go down rabbit holes if you start making things more flexible (see the long discussion in #12415).
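
Without any units support in the library, the conversion can be done on the user side before the values are passed in. A small sketch, with helper names that are made up for illustration:

```python
CM_PER_INCH = 2.54       # exact
POINTS_PER_INCH = 72.0   # the point as defined above: 1/72 of an inch


def cm_to_inches(*lengths_cm):
    """Convert one or more lengths in centimeters to a tuple in inches."""
    return tuple(length / CM_PER_INCH for length in lengths_cm)


def points_to_inches(points):
    """Convert a length in points to inches."""
    return points / POINTS_PER_INCH


# A 12.7 cm x 10.16 cm figure, expressed in inches (roughly 5 x 4).
figsize = cm_to_inches(12.7, 10.16)
```

The resulting tuple can be passed wherever matplotlib expects inches, for example as plt.figure(figsize=figsize).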


6 participants