MAINT Use np.full when possible #12379

rth · 2018-10-03T08:41:30Z

This uses np.full in a few places instead of cst*np.ones which avoids a memory copy and should be ~2x faster.

anntzer · 2018-10-03T09:02:47Z

You'll probably need to wait until we bump to numpy>=1.12 (next january, by #12358) or specify a bunch of dtypes manually, see #10571 for basically the same thing.

rth · 2018-10-03T13:40:47Z

Thanks for your feedback @anntzer ! I was not aware of the change of behavior in numpy 1.12.

I have specified the dtype manually in the few places it was missing -- I think it's more reliable in any case than relying on the dtype of the second argument (i.e. 5 vs 5.).

This PR is less comprehensive that the one you did earlier, as I only made the changes in places where I though this had a change to impact the run time performance of the library (e.g. excluding examples, and some tests where the array size is very small).

anntzer · 2018-10-03T13:56:25Z

lib/matplotlib/axes/_axes.py

@@ -3978,7 +3978,7 @@ def dopatch(xs, ys, **kwargs):
            # maybe draw the fliers
            if showfliers:
                # fliers coords
-                flier_x = np.ones(len(stats['fliers'])) * pos
+                flier_x = np.full(len(stats['fliers']), pos, dtype='float64')


I think dtype=float is generally preferrable, unless you have a good reason to force float64 even on 32bit architectures (which at least we didn't do before).
same comment applies throughout.

actually, no, dtype=float is not preferable as it is referencing python floats. dtype='float64' or dtype=np.float_ (if you don't want to specify the width) is generally considered preferable. Also note dtype=np.int_ and dtype=np.bool_ all exist for similar reasons.

but dtype=float will give you a proper np.float_ array... at this place, it's literally adding 4 characters for no change in behavior.

You stated "is generally preferrable". Meanwhile: https://stackoverflow.com/questions/6205020/numpy-types-with-underscore-int-float-etc

Travis's Numpy book states reasons for why he created those dtypes. Yes, in many cases, you end up with the same thing, but they do exist for a reason.

Eh... the link you quote literally says "Whenever a data type is required, as an argument, the standard Python types are recognized as well" and I don't think (?) there's a single sentence in this quote saying that the dtypes are preferable in this use-case.

My point was, and still is, that the fact that python datatypes are accepted is merely a convenience, and is not declared as "generally preferable". I have never seen anyone before state that the python datatypes are "generally preferable", and instead, I have seen people state the opposite because "explicit is better than implicit". I think what the submitter has is perfectly fine. And if the submitter had it as dtype=float in the original submission, I probably wouldn't have said anything anyway, either. I am only opposing the assertion that your suggestion was "generally preferable".

I am only opposing the assertion that your suggestion was "generally preferable".

That's fine with me, let's agree to not derail this PR and postpone this discussion to, well, not now.

To address the initial comment,

>>> import numpy as np >>> np.dtype(float) dtype('float64') >>> np.dtype(np.float64) dtype('float64') >>> np.dtype(np.float_) dtype('float64')

so the actual dtype when providing dtype=float will be float64. That will be true for both 32 bit and 64 bit systems (checked in a 32 bit Debian docker container), unlike np.dtype(int) which can be int32 or int64 depending on the system.

I also think it's better to say float64 explicitly even if it's 3 char longer, because that's the actual numpy dtype we get, and it removes the need for readers to think about the sizeof float on 32/64 bit architecture and about possible implications of that.

FWIW, np.float64 is a slightly better spelling than 'float64'

anntzer · 2018-10-03T13:56:48Z

lib/matplotlib/hatch.py

@@ -169,7 +169,8 @@ def __init__(self, hatch, density):
        self.num_rows = (hatch.count('*')) * density
        path = Path.unit_regular_star(5)
        self.shape_vertices = path.vertices
-        self.shape_codes = np.ones(len(self.shape_vertices)) * Path.LINETO
+        self.shape_codes = np.full(len(self.shape_vertices), Path.LINETO,
+                                   dtype='float64')


that should be some integer type, i think?

anntzer · 2018-10-03T13:57:41Z

lib/matplotlib/legend_handler.py

@@ -231,7 +231,7 @@ def create_artists(self, legend, orig_handle,
        xdata, xdata_marker = self.get_xdata(legend, xdescent, ydescent,
                                             width, height, fontsize)

-        ydata = ((height - ydescent) / 2.) * np.ones(xdata.shape, float)
+        ydata = np.full(xdata.shape, ((height - ydescent) / 2.), float)


may as well drop the dot after "/2" while you're at it, the __future__(division) is now

anntzer · 2018-10-03T13:58:47Z

lib/matplotlib/path.py

@@ -316,7 +316,7 @@ def make_compound_path_from_polys(cls, XY):
        stride = numsides + 1
        nverts = numpolys * stride
        verts = np.zeros((nverts, 2))
-        codes = np.ones(nverts, int) * cls.LINETO
+        codes = np.full(nverts, cls.LINETO, dtype=int)


should perhaps have been cls.code_type?

anntzer · 2018-10-05T11:36:38Z

There's a bunch of FutureWarnings on the numpy==1.10 build (related to the change in dtype) which we probably don't want. You can make the dtype explicit in these calls, or just see whether this is gone in numpy 1.11 and bump the dependency. (If numpy 1.12 is needed, that's allowed next january.)

rth · 2018-10-05T11:50:16Z

Thanks @anntzer! I know. Trying to track them down but so far unsuccessfully as the warning are not raised with stacklevel=2 so they don't say where in matplotlib they are actually raised. All the np.full calls from this PR have an explicit dtype.
Also one of the latest dtype changes made a test fail, and I need to find out which one.

I'll run the tests locally to investigate, and will fix it -- it may take a few days though.

anntzer · 2018-10-05T11:52:22Z

the warning are not raised with stacklevel=2

would be nice (for the future) for numpy to do that :)

rth · 2019-01-22T14:25:55Z

I rebased on the latest Python3 only master with a more recent numpy. CI seems happy now.

anntzer · 2019-01-22T15:12:02Z

lib/matplotlib/hatch.py

@@ -169,7 +169,8 @@ def __init__(self, hatch, density):
        self.num_rows = (hatch.count('*')) * density
        path = Path.unit_regular_star(5)
        self.shape_vertices = path.vertices
-        self.shape_codes = np.ones(len(self.shape_vertices)) * Path.LINETO
+        self.shape_codes = np.full(len(self.shape_vertices), Path.LINETO,
+                                   dtype=np.int32)


Probably should be Path.code_type.

anntzer · 2019-01-22T15:13:44Z

lib/matplotlib/legend_handler.py

@@ -231,7 +231,7 @@ def create_artists(self, legend, orig_handle,
        xdata, xdata_marker = self.get_xdata(legend, xdescent, ydescent,
                                             width, height, fontsize)

-        ydata = ((height - ydescent) / 2.) * np.ones(xdata.shape, float)
+        ydata = np.full(xdata.shape, ((height - ydescent) / 2), float)


np.full_like(xdata, (height - ydescent) / 2)

anntzer · 2019-01-22T15:15:28Z

lib/matplotlib/path.py

@@ -552,7 +552,8 @@ def interpolated(self, steps):
        vertices = simple_linear_interpolation(self.vertices, steps)
        codes = self.codes
        if codes is not None:
-            new_codes = Path.LINETO * np.ones(((len(codes) - 1) * steps + 1, ))
+            new_codes = np.full(((len(codes) - 1) * steps + 1, ), Path.LINETO,


can pass the shape as scalar instead of 1-tuple

anntzer · 2019-01-22T15:15:46Z

lib/matplotlib/path.py

@@ -864,7 +865,7 @@ def arc(cls, theta1, theta2, n=None, is_wedge=False):
        if is_wedge:
            length = n * 3 + 4
            vertices = np.zeros((length, 2), float)
-            codes = cls.CURVE4 * np.ones((length, ), cls.code_type)
+            codes = np.full((length, ), cls.CURVE4, dtype=cls.code_type)


can pass shape as scalar

anntzer · 2019-01-22T15:15:53Z

lib/matplotlib/path.py

@@ -873,7 +874,7 @@ def arc(cls, theta1, theta2, n=None, is_wedge=False):
        else:
            length = n * 3 + 1
            vertices = np.empty((length, 2), float)
-            codes = cls.CURVE4 * np.ones((length, ), cls.code_type)
+            codes = np.full((length, ), cls.CURVE4, dtype=cls.code_type)


can pass shape as scalar

anntzer · 2019-01-22T15:16:45Z

lib/matplotlib/tri/trirefine.py

@@ -243,7 +243,7 @@ def _refine_triangulation_once(triangulation, ancestors=None):
                                         np.arange(ntri, dtype=np.int32)]))
        edge_apexes = np.ravel(np.vstack([np.zeros(ntri, dtype=np.int32),
                                          np.ones(ntri, dtype=np.int32),
-                                          np.ones(ntri, dtype=np.int32)*2]))
+                                          np.full(ntri, 2, dtype=np.int32)]))


Perhaps replace the previous two lines to use np.full too? Looks nicer :)

Also I think the whole thing is just

np.concatenate([np.full(ntri, 0, np.uint32), np.full(ntri, 1, np.uint32), np.full(ntri, 2, np.uint32)])

right?

anntzer · 2019-01-22T15:47:22Z

lib/matplotlib/tri/trirefine.py

-        edge_apexes = np.ravel(np.vstack([np.zeros(ntri, dtype=np.int32),
-                                          np.ones(ntri, dtype=np.int32),
-                                          np.full(ntri, 2, dtype=np.int32)]))
+        edge_elems = np.tile(np.arange(3, dtype=np.int32), ntri)


looks like you got 3 and ntri swapped?

rth · 2019-01-22T15:51:26Z

Thanks for the review @anntzer ! I think I addressed all your comments, waiting for CI now.

looks like you got 3 and ntri swapped?

Yes, thanks! using np.tile / np.repeat in #12379 (comment) should be simpler / faster..

timhoffm · 2019-01-22T19:17:25Z

lib/matplotlib/legend_handler.py

@@ -231,7 +231,7 @@ def create_artists(self, legend, orig_handle,
        xdata, xdata_marker = self.get_xdata(legend, xdescent, ydescent,
                                             width, height, fontsize)

-        ydata = ((height - ydescent) / 2.) * np.ones(xdata.shape, float)
+        ydata = np.full_like(xdata, ((height - ydescent) / 2))


specify dtype?

Copying xdata's dtype (which full_like does) seems normal.

anntzer reviewed Oct 3, 2018

View reviewed changes

anntzer mentioned this pull request Oct 19, 2018

Dropping support for Py3.5 and numpy 1.10 #12358

Closed

rth added 4 commits January 22, 2019 14:24

Avoid memory copies with np.full

5a1e7b3

Explicitly specify the dtype in np.full

de37bc5

Address review comments

2da8562

Don't specify dtype as str

95fa27f

rth force-pushed the use-np-full branch from 1d28727 to 95fa27f Compare January 22, 2019 13:27

anntzer reviewed Jan 22, 2019

View reviewed changes

More review comments

d376dad

anntzer reviewed Jan 22, 2019

View reviewed changes

Fix tiling in trirefine.py

fec1fa0

timhoffm reviewed Jan 22, 2019

View reviewed changes

anntzer approved these changes Jan 22, 2019

View reviewed changes

dopplershift approved these changes Jan 22, 2019

View reviewed changes

dopplershift merged commit 86d961c into matplotlib:master Jan 22, 2019

rth deleted the use-np-full branch January 22, 2019 19:39

QuLogic added this to the v3.1 milestone Jan 22, 2019

Uh oh!

MAINT Use np.full when possible #12379

MAINT Use np.full when possible #12379

Uh oh!

Conversation

rth commented Oct 3, 2018

Uh oh!

anntzer commented Oct 3, 2018

Uh oh!

rth commented Oct 3, 2018

Uh oh!

anntzer Oct 3, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

WeatherGod Oct 3, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

anntzer Oct 3, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

anntzer commented Oct 5, 2018

Uh oh!

rth commented Oct 5, 2018

Uh oh!

anntzer commented Oct 5, 2018

Uh oh!

rth commented Jan 22, 2019

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

anntzer Jan 22, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

rth commented Jan 22, 2019

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

anntzer Oct 3, 2018 •

edited

Loading

WeatherGod Oct 3, 2018 •

edited

Loading

anntzer Oct 3, 2018 •

edited

Loading

anntzer Jan 22, 2019 •

edited

Loading