plt.yscale('log') after plt.scatter() behaves unpredictably in this example. #6915
Tested using latest master 2.0.0b3 and python 3.5. Problem still present. I played around a bit with this. It seems to affect both OO style and pyplot. Of note in the latest version, the ticks and tick labels disappear. Also, the automatic scaling when setting the scaling beforehand does not leave enough space at the bottom of the scale. The points end up along the x-axis. EDIT: More playing. So it seems that this problem is data dependent. My guess is that something about the sample data you are trying to plot is messing up an autoscaling algorithm somewhere. |
OK the data-dependence is great to know for two reasons: 1) I was pretty sure I hadn't seen this behavior before, 2) it offers a clue that can narrow down what's going wrong. I'll write a little script that creates a bunch of different kinds of data sets and try to narrow down which conditions are necessary to cause the problem. |
@uhoh-SE did you have any luck at tracking what kind of data triggers the problem and possibly where the limit of that data might lie? |
@LindyBalboa Thanks for the reminder - yes I got to this point and then said my brain hurts! I can't look at this more until the weekend, I'll post this Monte-Carlo simulation here in case someone wants to look at it. It was written quickly (not intended to be shared) so it is a bit scrappy and there are no comments. For 4000 plot simulations it took about 6 minutes on my laptop. In general the dots should be stair-casing between the two lines. There are a few things happening - something about the value -2.33 that is probably an excellent clue, but while Is it possible to search for an occurrence of It seems to happen equally for both x and y axes, and for either axis alone if only one is log scale. I guess there's no longer any question that it is a bug at least!
|
Ah that is fantastic! That does definitely help narrow it a bit. I'm looking forward to looking into this later 👍 |
I did some tracking with pdb and things start to get a little funny inside mpl/axes/_base.py (lines 3142-3165):

    def set_yscale(self, value, **kwargs):
        """
        Call signature::

          set_yscale(value)

        Set the scaling of the y-axis: %(scale)s

        ACCEPTS: [%(scale)s]

        Different kwargs are accepted, depending on the scale:
        %(scale_docs)s
        """
        # If the scale is being set to log, clip nonposy to prevent headaches
        # around zero
        if value.lower() == 'log' and 'nonposy' not in kwargs.keys():
            kwargs['nonposy'] = 'clip'

        g = self.get_shared_y_axes()
        for ax in g.get_siblings(self):
            ax.yaxis._set_scale(value, **kwargs)  # <<< I think the error is located somewhere in
            ax._update_transScale()               # <<< these three lines. There is a visual bug
            ax.stale = True                       # <<< upon ax.stale = True
        self.autoscale_view(scalex=False)
|
Okay I have narrowed this down a little more. File mpl/axes/_base.py, method There is a call on line 2270 for The Too tired to continue tonight. Maybe someone else can crack the cpp easily now that we know what is happening. |
@LindyBalboa Very brave! I can't help with this part (more than a few dozen lines of Python makes me dizzy and want to start eating instant coffee right out of the jar) but I can mention that I looked more closely, ran more Monte Carlos and still I see this funny pattern. The strange effect still seems to happen when the minimum X value is below about Is it possible to do a simple text search for |
@uhoh-SE can you explain those plots (or give them axis labels!). |
@tacaswell OK sure I'll make something more self-explanatory today, thanks. Briefly each dot is the result of one of 4000 log-log plots from 4000 random data sets (of 12 points each). The X coordinate of the dot is the actual minimum x (abscissa) value of the 12 points, and the Y coordinate of the dot is the lower limit chosen by matplotlib. The staircasing between the two lines is the expected and desired (presumably) behavior of a plotting routine. The streaks to the left represent cases where the minimum x (abscissa) value is much lower than the lower limit chosen, resulting in missing data from the final plot. Because I didn't want to actually use a log plot to accurately portray a bug with log plotting, I am plotting the base-10 logarithms on a linear scale. The visual appearance of some kind of vertical band at around -2.5 is striking and has no simple explanation that I can think of. Therefore it may be a useful hint. |
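For anyone who wants to reproduce this kind of survey, here is a minimal sketch of the idea described above. It is not the original Monte Carlo script: the function name `lower_limit_survey`, the log-uniform data model over six decades, and the small trial count are all assumptions made for illustration. It records, for each random data set, the base-10 log of the smallest x value and the base-10 log of the lower x limit matplotlib chose.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so no window is needed
import matplotlib.pyplot as plt
import numpy as np


def lower_limit_survey(n_trials=50, n_points=12, seed=0):
    """Hypothetical helper: compare min(x) with the chosen lower x limit."""
    rng = np.random.default_rng(seed)
    log_min = np.empty(n_trials)
    log_lim = np.empty(n_trials)
    for i in range(n_trials):
        # Assumed data model: points log-uniform over six orders of magnitude
        x = 10.0 ** rng.uniform(-3, 3, n_points)
        y = 10.0 ** rng.uniform(-3, 3, n_points)
        fig, ax = plt.subplots()
        ax.scatter(x, y)          # collection is created on linear axes
        ax.set_xscale('log')      # scale is switched afterwards
        ax.set_yscale('log')
        log_min[i] = np.log10(x.min())
        log_lim[i] = np.log10(ax.get_xlim()[0])
        plt.close(fig)            # avoid accumulating open figures
    return log_min, log_lim


log_min, log_lim = lower_limit_survey()
# A plot "loses" data when the lower limit lies above the smallest x value
missing = log_lim > log_min
print(f"{missing.sum()} of {missing.size} plots cut off the smallest x value")
```

Plotting `log_min` against `log_lim` on linear axes gives the same kind of picture as described above; how many points fall outside the staircase will depend on the matplotlib version in use.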
The first example is even worse on v2.x. |
Piece of the puzzle:
|
I haven't traced it back to the cause, but I found a place to interrupt the error chain: Inside ticker.py, in the
This is really annoying because I have been trying to track down when the minpos attribute is getting messed up, but everything looks normal! |
The first basic problem is that for collections the dataLim is a Bbox calculated when the collection is added to the Axes. Those dataLim change depending on the transforms, however, so when a transform changes they need to be updated by removing and re-adding each collection. When I do that, I get inclusive but too-large ax.dataLim. I will try to track that down tomorrow. |
As far as I know natural logarithms or exp() are not likely to appear in plotting routines - most things should be base 10. However, there is that artifact at 10** -2.3466, or ~0.0045 which looks like natural logs were used. Is it possible to do a quick text search for |
Thanks for jumping in on this efiring. This has been bugging me for a while. I was hoping for a nice mid-difficulty fix but found quite a rabbit hole. At least I'm sharpening my debugging skills in the process! If you look at the error trace I posted, the offending statement is |
@LindyBalboa for what it's worth, |
In all my years of taking math and physics, a base 10 logarithm has always been |
It is surely regional:) I was taught |
I'm going to admit defeat. A simpler test case is

    import numpy as np
    import matplotlib.pyplot as plt
    X = np.linspace(0, 1, 3)
    Y = np.logspace(np.log10(5e-4), np.log10(15), 3)
    fig, ax = plt.subplots()
    #ax.set_yscale('log')
    col = ax.scatter(X, Y)
    ax.set_yscale('log')
    plt.show()

There is considerable updating of the collection and the axes that would need to get done for this to work with the generation of any collection being before a switch to a log scale. Removing the collection, executing |
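To make the test case easier to check without looking at the figure, here is a small sketch that reports whether the y limits enclose the data for both orderings of `scatter()` and `set_yscale('log')`. The helper name `ylim_encloses_data` is made up for this example, and the result will depend on which matplotlib version you run it against, so no expected output is given.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so no window is needed
import matplotlib.pyplot as plt
import numpy as np


def ylim_encloses_data(scale_first):
    """Hypothetical helper: run the test case and check the y limits.

    scale_first=True sets the log scale before scatter(),
    scale_first=False sets it afterwards (the problematic ordering).
    """
    X = np.linspace(0, 1, 3)
    Y = np.logspace(np.log10(5e-4), np.log10(15), 3)
    fig, ax = plt.subplots()
    if scale_first:
        ax.set_yscale('log')
    ax.scatter(X, Y)
    if not scale_first:
        ax.set_yscale('log')
    lo, hi = ax.get_ylim()
    plt.close(fig)
    return (lo, hi), bool(lo <= Y.min() and hi >= Y.max())


for scale_first in (True, False):
    ylim, ok = ylim_encloses_data(scale_first)
    print(f"scale_first={scale_first}: ylim={ylim}, encloses data: {ok}")
```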
@LindyBalboa OK so the funkiness shown in my Monte Carlo plot above is happening at -2.305..ish, which is what you'd get if you were trying to take Or... a situation where |
This problem has nothing to do with magic values; it is a matter of how transforms are made, modified, and used in the autoscaling, when dealing with collections. I'm not 100% sure, but it looks like a fairly fundamental limitation, not a simple bug. I don't expect the solution to be a one-liner. |
@efiring OK, well in the Monte Carlo simulation above I did run thousands of plots with simulated data ranging over six orders of magnitude, from very narrow to very wide ranges, and that -2.305 repeated very consistently. Sorry but I think this is meaningful. |
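For reference, the numbers being discussed can be checked with plain arithmetic. This is only a sketch: comparing against ln(0.1) is an assumption about which natural-log value might be relevant (the thread does not settle this), and as noted above the cause may lie in the transforms rather than in any particular value.

```python
import math

# Value quoted earlier in the thread for the artifact, as a base-10 exponent
observed_log10 = -2.3466
value = 10 ** observed_log10

# Assumed comparison: the natural log of 0.1, next to its base-10 log
ln_tenth = math.log(0.1)
log10_tenth = math.log10(0.1)

print(f"10**{observed_log10} = {value:.4g}")
print(f"ln(0.1)    = {ln_tenth:.4f}")
print(f"log10(0.1) = {log10_tenth:.4f}")
```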
@LindyBalboa just to be sure: did you change those sizes for the Monte Carlo bit too? |
Yes, that is the output of the Monte Carlo code, just adding I was just following up on efirings hunch
|
Yeah, I got that, I just meant that there are two kinds of |
@efiring Amazing! Somehow my brain did not actually process the words 'marker size' properly when I read what you wrote. OK I understand better now, thanks! Also thanks @LindyBalboa for the new plots. |
I'm running into this too at the moment. A workaround is to use ax.plot(x, y, marker='o', linewidth=0), which does all the auto-scaling properly but still looks like a scatter plot. |
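Here is that workaround applied to the simpler test case posted earlier in the thread, as a minimal sketch. The Agg backend and the printed summary are additions for illustration. Because `plot()` creates a Line2D rather than a collection, switching to a log scale afterwards should still autoscale to the data.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so no window is needed
import matplotlib.pyplot as plt
import numpy as np

# Same data as the simpler test case in this thread
X = np.linspace(0, 1, 3)
Y = np.logspace(np.log10(5e-4), np.log10(15), 3)

fig, ax = plt.subplots()
# Markers only, no connecting line: looks like a scatter plot
ax.plot(X, Y, marker='o', linewidth=0)
ax.set_yscale('log')  # scale is changed after plotting

lo, hi = ax.get_ylim()
print(f"ylim = ({lo:.3g}, {hi:.3g}), data range = ({Y.min():.3g}, {Y.max():.3g})")
```

The trade-off is that `plot()` uses one marker size and color for all points, so this only replaces `scatter()` when per-point sizes or colors are not needed.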
For suppressing the line between markers, using |
As mentioned in this SE question a scatter plot is not autoscaling to include all of the data if plt.yscale('log') is used after plt.scatter(). This happens for the y-axis but not the x-axis in the example, and does not happen for plt.plot(). In an earlier answer by a developer, ax.set_yscale('log') is shown following ax.scatter(), so I am wondering if this may be a bug.
Using matplotlib version 1.5.1 and python 2.7.11