Sizes of different markers are not perceptually uniform #15703
Comments
If I remember correctly, the size parameter refers to the length of a side
of the "reference square". Most of the markers are defined to be within
that square, while some of them are defined to enclose that square (like
the big diamond).
…On Sat, Nov 16, 2019 at 3:41 PM Hans Dembinski ***@***.***> wrote:
*Bug summary*
Markers of different types ("o", "s", "*" ...) do not visually appear to
be of the same size when their marker size (e.g. ms=8) is equal (matplotlib
3.1.1).
*Details*
Matplotlib has a great selection of markers, but the relative sizes
of these markers are not perceptually uniform, see
https://matplotlib.org/3.1.1/api/markers_api.html
or this example script:
import numpy as np
from matplotlib import pyplot as plt

plt.style.use("default")
x = np.arange(4)
y = np.ones(4)
# One line per marker type, offset vertically, all at the default marker size.
for imarker, marker in enumerate("os*pv^<>PDdX"):
    plt.plot(x, y + 0.1 * imarker, marker=marker)
plt.show()
The square "s" and the diamond "D" appear larger than the other markers.
The star "*" is the smallest, followed by the pentagon "p" and the plus "P".
[image: image]
<https://user-images.githubusercontent.com/2631586/68998928-f1aef080-08b8-11ea-8633-805e604aa96b.png>
*Expected outcome*
Markers should appear uniform in size. I think for the star this is very
obviously not the case. The area of the star is much smaller than for the
circle, and largest for the square and the diamond.
Nevertheless, I don't think an objective geometric criterion like area can
be used to make them perceptually uniform in size. I think this needs to be
hand-tuned by a human to take into account how humans perceive the relative
size of objects.
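The area claim can be checked with a quick calculation. The sketch below is an illustration, not Matplotlib code: it assumes each marker is inscribed in a reference square of side 1, and that the star is a regular five-pointed star whose inner radius is 0.381966 times its outer radius (an assumption about how the "*" marker is drawn).

```python
import math

def polygon_area(vertices):
    """Shoelace formula for a simple polygon given as (x, y) pairs."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(vertices, vertices[1:] + vertices[:1]):
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0

def star_vertices(outer=0.5, inner_ratio=0.381966, points=5):
    """Alternate outer and inner vertices around the centre, starting at the top."""
    verts = []
    for k in range(2 * points):
        radius = outer if k % 2 == 0 else outer * inner_ratio
        angle = math.pi / 2 + k * math.pi / points
        verts.append((radius * math.cos(angle), radius * math.sin(angle)))
    return verts

square_area = 1.0 * 1.0             # side 1
circle_area = math.pi * 0.5 ** 2    # diameter 1
star_area = polygon_area(star_vertices())

print(f"square {square_area:.3f}, circle {circle_area:.3f}, star {star_area:.3f}")
```

Under those assumptions the star covers roughly 28% of the square's area and about 36% of the circle's.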
|
This would need a discussion if we want to aim for perceptual uniformity. Anyway, it would be a breaking style change and could only be introduced in the context of a style makeover (likely not before 4.0). Added to the list of possible style changes #14331. |
Thanks! |
Moving discussion here from #16623. My thought is that while a "perceptually" uniform size would be amazing, we must account for a couple of non-trivial caveats:
In short, how you end up defining "perceptual uniformity" is likely to depend drastically on what test you use to compare the markers.
Aside from the technical issues involved in developing a principled metric for perceptual size, there is also the practical, sociological issue of what people's bosses will allow them to use. I'm currently a grad student, and I know that if I tried to pitch to my boss that I was using some phenomenological scaling for the sizes of the markers in my scatterplots, he would tell me to either find a paper I can cite or revert to what we've always done, which is to use an objective geometric criterion to scale our markers so that they appear perceptually uniform in a way that's easily justifiable.
So even if we do find a good metric for perceptual uniformity, get good data, standardize it, and get it implemented in time for 4.*, I think there's a very strong argument that we should also include an option to define markersize to mean something more geometrically/objectively definable, for those of us who have to justify our marker size choices to skeptical/adversarial reviewers. Conversely, if we don't get this done in time, I think it would be better to change the marker sizes to follow a well-defined, objective geometric criterion (see below) than to leave them as is, since they frankly produce rather ugly plots without hand-tuning the marker sizes. Here are the options I propose including alongside (or instead of) a perceptually-uniform approach:
|
I am happy to implement a couple of simple GUIs that could be used to do "randomized/single-blind" testing of the relative "sizes" of different markers (i.e. present scatter/line plots with two different markers with randomly chosen sizes included and just ask "which looks bigger? or do they look about the same?"). But ideally these would be distributed for many different people to use and the data collected in some centralized location to be analyzed later. Would posting the code as a gist here be appropriate? Attaching the data to comments on this thread? Or is this conversation more appropriate for the mailing list? |
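As a rough idea of what the data-collection side of such a test could look like, here is a minimal sketch with no GUI. Everything in it is hypothetical: the names `make_trial` and `record_response`, the marker list, and the size range are placeholders, and a real study would need a proper experimental design.

```python
import random

MARKERS = ["o", "s", "*", "p", "D"]  # placeholder subset of marker codes

def make_trial(rng, size_range=(4.0, 12.0)):
    """Pick two distinct markers and an independent random size for each."""
    left, right = rng.sample(MARKERS, 2)
    return {
        "left": (left, rng.uniform(*size_range)),
        "right": (right, rng.uniform(*size_range)),
    }

def record_response(trial, answer, log):
    """Append one answer; 'answer' is 'left', 'right', or 'same'."""
    if answer not in ("left", "right", "same"):
        raise ValueError(f"unexpected answer: {answer!r}")
    log.append({**trial, "answer": answer})

rng = random.Random(0)  # fixed seed only so the sketch is reproducible
log = []
trial = make_trial(rng)
record_response(trial, "same", log)
```

Each logged entry keeps both markers, both sizes, and the participant's answer, which is the minimum needed to later estimate at what size ratio two glyphs are judged equal.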
@brunobeltran Firstly, great to see more comments on this issue. There are two uses of markers. One is just to annotate different data sets so as to make them distinct. This was my original concern, and most of what you point out does not apply there. The second is the use in a scatter plot, where the size of the marker has meaning. However, even then only the relative size of the markers matters. I have never seen someone overlay a scatter plot of circles and a scatter plot of squares and then compare the relative sizes of circles and squares with each other. You only compare circles to circles and squares to squares. If that is true, then there is no problem in computing the size of a marker from the |
Good to hear that alternate perspective @HDembinski! I have actually used multiple glyphs in contexts where comparing their sizes is meaningful. Most often when including e.g. trade volume as a separate channel of information using size. IME, this can look good when there are very few data points, and is useful when color is already being used to signal a different, quantitative variable. Maybe your comment that you've never seen this before is a sign that I should think more carefully when including four quantitative channels of data using scatter's four main inputs (x, y, s, c). However, these types of plots are not uncommon (https://www.researchgate.net/figure/Scatterplot-of-average-error-rates-and-completion-times-for-all-99-configurations-The_fig3_221557219 and https://www.premraj.me/five-dimensional-scatterplot-using-ggplot2/ were both on the first "page-ish" of my images.google.com results for "scatterplot glyphs"). Regardless of whether or not the above is bad practice, people will still end up using different glyphs in a scatterplot sometimes. My original point (before I forgot to stop typing) was that we want these glyphs to "appear" the same size as well, and that the I guess I'm happy to move forward with building some kind of blind testing GUI to try to get some values for this I'll wait for the word of a maintainer on how/if to distribute this test and how/where to tabulate the results before I put any work into it, though. |
I am a bit concerned about the risk of spiraling scope on this, much of it outside of the technical skills of many of our developers (I know enough about designing surveys / user research to know I know nothing useful about designing surveys / user research). Doing this right seems like it is at least a master's thesis worth of work (and I may be underestimating). Does anyone in this thread know an academic who would be interested in partnering with us on this? I know that will drastically slow everything down, but if we are going to do this and possibly make a major breaking change to Matplotlib, we should make sure we are on sound footing. As to whether it should be done, I think I am leaning towards yes. We should do some research (either git log / mailing list spelunking or talking to people) to sort out if there was any systematic logic behind the current sizes. My suspicion is that there was not (or it was "look like MATLAB"); if there was, we should sort out if that reasoning still holds. In either case, if we do have solid guidance on better marker sizes, we should use it (but "solid guidance" is doing a lot of work in that sentence; I am thinking "published paper" level solid). |
And to be clear @HDembinski and @brunobeltran I am thankful you are both thinking about this, just cautious about making major changes. |
Thanks for the input @tacaswell. To be honest I agree that it sounds like a master's degree's worth of work. Having dug through the code for marker sizes in depth for a recent PR (#16607), I can tell you that the systematic logic for the current sizes appears (at least a posteriori) to be: "whatever was easiest to code". I go into detail for each marker in #16623. I can provide more explanation if need be, but I think that it's hard to read |
Also, to clarify, I take your comment to mean that if I were to perform this research myself, this thread would not be an appropriate place to share data (or to gather willing participants?) |
One could start by allowing users to actually size their markers consistently without needing to make use of private attributes, see |
@ImportanceOfBeingErnest While I agree that MarkerStyle's behavior in these cases should be changed, I think that maybe a separate Issue should be opened for this, since the posts you link suggest some reasonably complex set of API changes. Do you have any Issues open yet for these problems? If not, just @me in one and I'm happy to take the lead of implementing any API changes that are agreed on! After all, you can already implement custom |
@tacaswell I reluctantly agree that doing this correctly is a major task, similar in scope to the work that eventually led to the new colormaps. The question is whether we want to wait for this or whether we should go forward with a less ideal solution, which at least addresses some of the issues. So as an intermediate step, I propose to change the API so that |
I agree on the points brought up. To be pragmatic, I would also support @HDembinski's idea: given that this will only come with 4.x, why not put this as a planned feature? If a better way can be found before 4.x, we can go for that. Otherwise, the "best" is for 5.x or later and we have something good for sure "soon". |
What is the prior art on this? i.e. what do other packages do? |
I guess I'm a little leery of this - if you tell me the marker size of a square is 10 pts, I expect it to be 10 pts across. If you tell me a circle has a marker size of 10 pts I expect its diameter to be 10 pts. Obviously their areas are not perceptually equal. Except for the big diamond and maybe the "x" above, everything looks "correct" to me in that plot. |
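To put numbers on that trade-off, here is a small worked example. It assumes the marker size is read as the full width of the glyph (side of the square, diameter of the circle), which is the interpretation argued for above rather than a statement about what Matplotlib does.

```python
import math

size = 10.0  # marker size in points, read as the full width of the marker

square_area = size ** 2                  # 100.0 pt^2
circle_area = math.pi * (size / 2) ** 2  # about 78.5 pt^2

# Diameter a circle would need in order to cover the same area as the square.
equal_area_diameter = size * 2 / math.sqrt(math.pi)  # about 11.28 pt

print(f"square: {square_area:.1f} pt^2, circle: {circle_area:.1f} pt^2")
print(f"equal-area circle diameter: {equal_area_diameter:.2f} pt")
```

So equal width means the circle has about 21% less area than the square, while equal area means the circle is about 13% wider. The two criteria cannot both hold at once.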
I agree that linear width is more intuitive. Scatter sizing is maybe the only place where the opposite is even arguable, and that's because it's what Matlab (and then matplotlib) has always done.
But the markers are not all the same width! See #16623 for detailed breakdown. They are slightlyyyyy different widths and very different heights...Something should be consistent. Right now it's neither width, height, nor area. Agree as well that constant area now (4.x) and better as soon as it's available (even if only by 5.x) seems like best path forward. |
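One way to check the width and height claim without reading the source is to ask Matplotlib for each marker's outline and measure its bounding box. This is a sketch using the public `MarkerStyle.get_path()` and `MarkerStyle.get_transform()` methods; the helper name `marker_extent` is made up here, and the numbers it prints may differ between Matplotlib versions.

```python
from matplotlib.markers import MarkerStyle

def marker_extent(marker):
    """Return (width, height) of a built-in marker in units of the marker size."""
    style = MarkerStyle(marker)
    # get_path() is the untransformed outline; get_transform() maps it into
    # the coordinate system in which the reference square has side 1.
    outline = style.get_path().transformed(style.get_transform())
    bbox = outline.get_extents()
    return bbox.width, bbox.height

for marker in "os*pDd":
    width, height = marker_extent(marker)
    print(f"{marker!r}: width={width:.3f} height={height:.3f}")
```

A value of 1.0 means the glyph spans the full reference square in that direction. Anything above 1.0, as expected for the big diamond, means it extends beyond the square.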
@brunobeltran Thanks for the link to #16623. I agree with your proposed path 2 in #16623 and we should make everything touch your unit box in at least two spots ;-) |
Right,
|
So I think the first step here is to hack on the Once we have that we can sort out what the API to control it (on master we can now pass a |
Being able to pass the #16773 would allow creating any set of custom markers, so one could at least provide a module with respective markers. I think this is important because no matter what the default markers in matplotlib are, they will never satisfy everyone. I.e.
|
I'd like to add a point: |