-
-
Notifications
You must be signed in to change notification settings - Fork 7.9k
Cleanup: broadcasting #7562
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cleanup: broadcasting #7562
Conversation
@@ -25,7 +25,7 @@ | |||
# Generate random data | |||
data = np.random.uniform(0, 1, (64, 75)) | |||
X = np.linspace(-1, 1, data.shape[-1]) | |||
G = 1.5 * np.exp(-4 * X * X) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In most cases I think this construct is marginally faster.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# float is probably the most common case
In [8]: x = np.arange(100000).astype(float)
In [9]: %timeit x * x
10000 loops, best of 3: 65.5 µs per loop
In [10]: %timeit x ** 2
10000 loops, best of 3: 66.4 µs per loop
(and you can easily get them reversed on another run).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
interesting. That is an apparently wrong rule of thumb I have been using for a while.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would you think this would be the case? (I am honestly puzzled.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It dates back to the early days of computers and compilers, when computers were slow and compilers were not so bright. Multiplication is faster than taking a power, in general. But now compilers recognize things like this and find the best algorithm. In the case of numpy, I'm pretty sure there is explicit internal optimization of small integer powers so that we wouldn't have to use that ancient manual optimization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I respectfully request the children remain off of my lawn. 😈
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As @efiring says, numpy hard-codes power(x, 2)
to fall back on square(x)
, along with a handful of other constants (I think (-1, 0, 0.5, 1, 2)
)
lib/matplotlib/axes/_axes.py
Outdated
# make them safe to take len() of | ||
_left = left | ||
left = make_iterable(left) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this break unit handling? It is probably better to leave the duck-ish typing here and not force to numpy
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know what unit handling is actually supposed to do, but in any case if an object array (or list of objects, or single object) is passed in then atleast_1d will return an object array.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At the top of examples/units/radian_demo.py
, I added:
print(x[0])
print(np.atleast_1d(x)[0])
print(np.atleast_1d(x[0]))
and it printed:
[ 0.] in radians
[ 0.] in radians
[0.0]
so I guess it might have broken something, unit-wise?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems due to a useless and incorrect implementation of __array__
. Tried to fix the example...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is when the unit is carried on the container class, not the values. This will strip the units and turn it into a plain numpy array. These sorts of things tend to be implemented as numpy array sub-classes (ex pint and yt).
This will also force a copy of pandas Series
we get in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These need to stay as they were or the process_unic_info
call needs to be moved above these casts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
atleast_1d
calls asanyarray
so it passes subclasses through with no problem. It is true that pandas Series do not subclass numpy arrays and get copied, but given that we're going to generate one Rectangle object per entry in the Series I'd say the additional cost of the copy is rather small.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, note that the current implementation is actually buggy: bar([1, 2], [3, 4], width=[.8])
works but bar([1, 2], [3, 4], width=np.array([.8]))
currently raises an exception due to the fact that width *= nbars
works elementwise here. (Can you open a separate issue if you don't think this PR can get merged any time soon, so that it doesn't get lost?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Edit: Actually asarray doesn't even copy the underlying buffer for pandas series:
In [1]: s = pd.Series([1, 2, 3]); t = np.asarray(s); t[0] = 42; s
Out[1]:
0 42
1 2
2 3
dtype: int64
so the point is moot.
7b3171b
to
f15e8f4
Compare
@@ -440,8 +440,8 @@ def transform_non_affine(self, xy): | |||
|
|||
quarter_x = 0.25 * x | |||
half_y = 0.5 * y | |||
z = np.sqrt(1.0 - quarter_x*quarter_x - half_y*half_y) | |||
longitude = 2 * np.arctan((z*x) / (2.0 * (2.0*z*z - 1.0))) | |||
z = np.sqrt(1 - quarter_x * quarter_x - half_y * half_y) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not using **2
on these ones?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure.
c = np.sqrt(area) | ||
r = np.sqrt(x*x + y*y) | ||
r = np.sqrt(x * x + y * y) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
np.hypot
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we have to use hypot every time, especially in examples, as the function may be somewhat unknown.
lib/matplotlib/axes/_axes.py
Outdated
# make them safe to take len() of | ||
_left = left | ||
left = make_iterable(left) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At the top of examples/units/radian_demo.py
, I added:
print(x[0])
print(np.atleast_1d(x)[0])
print(np.atleast_1d(x[0]))
and it printed:
[ 0.] in radians
[ 0.] in radians
[0.0]
so I guess it might have broken something, unit-wise?
@@ -318,7 +318,7 @@ def draw_text(self, gc, x, y, s, prop, angle, ismath=False, mtext=None): | |||
if angle == 0.0: | |||
gfx_ctx.DrawText(s, x, y) | |||
else: | |||
rads = angle / 180.0 * math.pi | |||
rads = math.radians(angle) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not deg2rad
like the other PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
math
only has radians
and degrees
, not deg2rad
and rad2deg
. I seems that both names are used in the codebase (even before my earlier PR).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, I meant to go back and check whether that was math
or np
and forgot about it.
lib/matplotlib/projections/geo.py
Outdated
@@ -388,8 +388,8 @@ def transform_non_affine(self, xy): | |||
|
|||
quarter_x = 0.25 * x | |||
half_y = 0.5 * y | |||
z = np.sqrt(1.0 - quarter_x*quarter_x - half_y*half_y) | |||
longitude = 2 * np.arctan((z*x) / (2.0 * (2.0*z*z - 1.0))) | |||
z = np.sqrt(1 - quarter_x * quarter_x - half_y * half_y) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No **2
again?
lib/matplotlib/tests/test_axes.py
Outdated
@@ -1445,8 +1445,8 @@ def bump(a): | |||
y = 2 * np.random.random() - .5 | |||
z = 10 / (.1 + np.random.random()) | |||
for i in range(m): | |||
w = (i / float(m) - y) * z | |||
a[i] += x * np.exp(-w * w) | |||
w = (i / m - y) * z |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could be written as a NumPy expression instead of a loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure.
lib/mpl_toolkits/mplot3d/axes3d.py
Outdated
raise ValueError(("Argument 'zs' must be of same size as 'xs' " | ||
"and 'ys' or of size 1.")) | ||
|
||
xs, ys, zs = np.broadcast_arrays(*map(np.ma.ravel, [xs, ys, zs])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This possibly makes any error about incorrect shapes a bit more obscure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
broadcast_to
explicitly prints the nonmatching shape in the error message, but not broadcast_arrays
. I'll open an issue on numpy but I don't think this should hold up this PR.
(Note how just below we gain checking on the size of dx/dy/dz, which was not present before.)
f15e8f4
to
4a15fec
Compare
Looks like lots of nice cleanups here, once again. In future, however, I suggest that you reconsider your policy of always surrounding operators by spaces. This is not required by PEP 8, and although I am a fan of white space, I think that in many cases omitting spaces around operators improves readability. E.g., |
Sounds like a reasonable policy, thanks for the notice. |
Looks like there the build failure comes from the newly released sphinx 1.5. |
Yep, #7569. |
@@ -20,22 +20,21 @@ | |||
|
|||
# Create the mesh in polar coordinates and compute x, y, z. | |||
radii = np.linspace(min_radius, 0.95, n_radii) | |||
angles = np.linspace(0, 2*np.pi, n_angles, endpoint=False) | |||
angles = np.linspace(0, 2 * np.pi, n_angles, endpoint=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am very strongly opposed to unnecessary whitespace changes of this form, and there are quite a lot of them in this PR.
They are also not related to the title and description of the PR.
3fd7fea
to
d01ce65
Compare
I got rid of all the trivial whitespace changes. (Lines with both whitespace changes and nonwhitespace changes kept their changes.) |
6803631
to
f0c3b4f
Compare
@anntzer Thankyou, but you do need to be consistent. Changing some is worse than changing all or none. |
f0c3b4f
to
9d6d81b
Compare
I think it's all fairly consistent now. |
9d6d81b
to
b69c4f6
Compare
There seems to be some issues left with the unit handling code that I need to look into. |
b69c4f6
to
dfa77b9
Compare
@@ -91,7 +91,7 @@ | |||
fig = plt.figure() | |||
ax = fig.add_subplot(111, projection='polar') | |||
r = np.arange(0, 1, 0.001) | |||
theta = 2*2*np.pi*r | |||
theta = 4 * np.pi * r |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The 2*2
here may have been pedagogical (ex 2 * tau
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reverted; however I chose to leave the use of **2
instead of X * X
above (unless you feel strongly about it).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, @efiring explained that one correctly. If it isn't faster, use the more obvious way to write it.
lib/matplotlib/axes/_axes.py
Outdated
# make them safe to take len() of | ||
_left = left | ||
left = make_iterable(left) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is when the unit is carried on the container class, not the values. This will strip the units and turn it into a plain numpy array. These sorts of things tend to be implemented as numpy array sub-classes (ex pint and yt).
This will also force a copy of pandas Series
we get in.
lib/matplotlib/axes/_axes.py
Outdated
# make them safe to take len() of | ||
_left = left | ||
left = make_iterable(left) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These need to stay as they were or the process_unic_info
call needs to be moved above these casts.
y = np.atleast_2d(*args) | ||
elif len(args) > 1: | ||
y = np.row_stack(args) | ||
y = np.row_stack(args) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are not equivalent
In [18]: np.atleast_2d([1, 2, 3])
Out[18]: array([[1, 2, 3]])
In [19]: np.row_stack([1, 2, 3])
Out[19]:
array([[1],
[2],
[3]])
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it's np.atleast_2d(*args)
(note the unpack) and only in the case where len(args) == 1
.
Say args
has shape (after casting to an array) (1, x, y, ...)
.
np.atleast_2d(*args) == np.atleast_2d(<shape (x, y, ...)>) == <shape (x, y, ...)>
np.row_stack(args)
also has shape (x, y, ...).
In the case args
is 2D (shape (1, x)), both expressions result in a shape (1, x) as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🐑
for i in range(m): | ||
w = (i / float(m) - y) * z | ||
a[i] += x * np.exp(-w * w) | ||
a += x * np.exp(-((np.arange(m) / m - y) * z) ** 2) | ||
a = np.zeros((m, n)) | ||
for i in range(n): | ||
for j in range(5): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this inner loop here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because whoever came up with that example decided to add some random numbers five times to each column rather than only once...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🐑 Yeah, I am greatly confused by the local operating in-place function...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
stackplot_demo2
(from which this function comes) has the slightly more helpful docstring "Return n random Gaussian mixtures, each of length m." I can copy it there, or just inline the function (also in the demo), let me know if you have a preference.
dfa77b9
to
694ea11
Compare
@tacaswell uses ⚔️ for conflicts... |
4ae08f2
to
79bc88d
Compare
79bc88d
to
9a077e8
Compare
Rebased. |
Still LGTM, though maybe squash the import fixup into the relevant commits. Might also want to rebase just to get CircleCI+AppVeyor fixes too.. |
86a86e2
to
33fef83
Compare
I squashed-rebased everything as splitting the fixup commit across the rebase seems a bit tricky, plus everything is still more or less related. |
I forgot; why did the test image change? |
Because the previous behavior was incorrect: see the leftmost green "triangle", which is did not go all the way to the left but now does. |
Ah, I see it now. |
Needs a rebase; I'm also going to dismiss @tacaswell's review which seems to be outdated. |
33fef83
to
aee5e1e
Compare
done |
And of course in that time a PR went in that causes conflicts... |
aee5e1e
to
1e68491
Compare
and fixed again... |
@@ -9,7 +9,7 @@ | |||
import matplotlib.pyplot as plt | |||
|
|||
t = np.arange(0.0, 1.0 + 0.01, 0.01) | |||
s = np.cos(2 * 2 * np.pi * t) | |||
s = np.cos(2 * 2*np.pi * t) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just curious: why is this the one where you don't have spaces around it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nm...I guess I see it now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just one question about imports...
lib/matplotlib/projections/polar.py
Outdated
import six | ||
|
||
from collections import OrderedDict | ||
import warnings |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this import necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same thing actually for six.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably some rebase made the thing irrelevant. Fixed.
Also fixes a bug in fill_between with masked data. In the modified test figures, the area in green is supposed to correspond to the part of the hatched area where the curve is below y=2. The new behavior is the correct one. Also fixes cbook._reshape2D for scalar object inputs. Before the fix, `plt.hist(None)` would fail with `x must have 2 or fewer dimensions`, which it does have. Now it fails with a bit later with `unsupported operands type(s) for +: 'NoneType' and 'float'`, which is hopefully clearer.
1e68491
to
5348ce9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just waiting on CI to pass...
np.hypot
where appropriate.np.pi
instead ofmath.pi
.