-
-
Notifications
You must be signed in to change notification settings - Fork 7.9k
scatter() behavior with different size inputs depends on the backend but both variants are inconsistent with the docs #12021
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
crossref #11383 |
I think Mysterious cycling souds lie API run amok to me. But if we change it someone will complain.... |
I'm going to mark this as RC for the next release (even though the bug is old), because the more I think of it the more I fear that something like the following is waiting to happen (or has already happened): someone plots some data with scatter() with accidentally mismatched sizes, visualizes it, is happy with the result, generates a pdf because that's what the journal submission requires, \includegraphics it and doesn't pay more attention to it, and ends with an incorrect figure. (Well, sure, it was also incorrect at the beginning but the pdf is not only incorrect but also different from what was visually checked.) (I agree that c and s should just be broadcastable to x and y, which effectively means size 1 or n.) For me it's the combination of silently dropping/inventing data and of doing so differently in two different backends that makes this RC. Feel free to unlabel if you disagree. |
crossref also #11663, which is a good start for better input validation in |
This issue has been marked "inactive" because it has been 365 days since the last comment. If this issue is still present in recent Matplotlib releases, or the feature request is still wanted, please leave a comment and this label will be removed. If there are no updates in another 30 days, this issue will be automatically closed, but you are free to re-open or create a new issue if needed. We value issue reports, and this procedure is meant to help us resurface and prioritize issues that have not been addressed yet, not make them disappear. Thanks for your help! |
I've extracted the remaining open issue #12021 (comment) into #28833. |
scatter()'s docstring states
i.e.
s
must have the same size asx
/y
(n
) andc
too, orc
can have size 1 as well. But this is not the case:x
,y
,c
, ands
are "cycled over" until covering the longest of them, as can be shown by(plots a second point of size 300 over (0, 0) -- but only for vector backends; for agg, the size=300 point is dropped)
or
(reuses "r" for the last point).
The difference between vector and raster backends comes from _backend_agg.h using
size_t N = std::max(Npaths, Noffsets);
(so not taking the size ofs
(i.e.transforms
) into account), whereas the vector backends ultimately rely on backend_bases' _iter_collection_raw_paths, which usesN = max(Npaths, Ntransforms)
and then _iter_collection, which hasN = max(Npaths, Noffsets)
(so all three are taken into account).Obviously this behavior goes down what semantics we want for draw_path_collection (... having had to figure it out myself for mplcairo), but I think at least that cycling the positions when the sizes or colors are longer is misguided and would be hiding a bug most of the time (indeed, I initially missed the fact that the example in #12010 plotted 4e6 points (at 2e3 positions, each of them overlayed 2e3 times), not 2e3, due to the size of
s
).I can imagine wanting to cycle over a shorter
c
ors
, not that I particularly like it as it can also hide bugs.If we really don't want to change anything, then the docstring should be updated.
mpl master/3.0rc and probably since a long time ago.
The text was updated successfully, but these errors were encountered: