BUG: Fix padding with large integers #11033

lagru · 2018-05-02T11:04:41Z

The old way of creating the padded array produced wrong pad values for
large integers because the new prepended / appended array was implicitly
created with dtype float64:

>>> (np.zeros(1) + (2 ** 64 - 1)).astype(np.uint64)
array([0], np.uint64)
>>> (np.zeros(1) + (2 ** 63 - 1)).astype(np.int64)
array([-9223372036854775808])
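A minimal sketch of why the detour through float64 loses the value, and of one way to avoid it by creating the pad block directly in the target dtype. The variable names are illustrative only and this is not the code from the patch; it merely assumes that `np.full` with an explicit `dtype` is an acceptable replacement.

```python
import numpy as np

big = 2 ** 64 - 1  # largest value representable in uint64

# Old approach (sketch): a float64 array of zeros plus the pad value.
# float64 has a 53-bit significand, so 2**64 - 1 is rounded to 2**64.
via_float = np.zeros(1) + big
rounded_up = float(via_float[0]) == float(2 ** 64)

# Alternative (sketch): build the block in the integer dtype directly,
# so the value never passes through float64.
direct = np.full(1, big, dtype=np.uint64)
exact = int(direct[0]) == big

print(via_float.dtype, rounded_up, direct.dtype, exact)
```

Casting the rounded float back to `uint64` or `int64` is then out of range, which is where the wrong values in the examples above come from.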

cc @mhvk


mhvk

Looks good; happy to see an even simpler solution than the one I suggested! I'll let it sit for the day just to let other people have a chance to chime in.

eric-wieser · 2018-05-02T15:05:31Z

numpy/lib/arraypad.py

@@ -138,8 +138,8 @@ def _append_const(arr, pad_amt, val, axis=-1):
        return np.concatenate((arr, np.zeros(padshape, dtype=arr.dtype)),


While you're here, you could remove this branch too

Sure. Although np.full(padshape, 0) seems to be a little slower than np.zeros(padshape)...

lagru · 2018-05-02T15:41:43Z

Just to make this comment more visible: Replacing np.zeros(...) with np.full(..., fill_value=0) in a5f94a9 seems to come with a small performance penalty.
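For anyone who wants to check the penalty locally, a rough timing sketch follows. The shape, dtype, and repeat count are arbitrary assumptions, and the numbers will vary by machine and NumPy version, so no expected output is given.

```python
import timeit

import numpy as np

padshape = (1000, 10)  # hypothetical shape of the pad block
dtype = np.uint64      # hypothetical array dtype

# Both calls produce the same block; np.full just fills it explicitly.
zeros_block = np.zeros(padshape, dtype=dtype)
full_block = np.full(padshape, 0, dtype=dtype)
same = np.array_equal(zeros_block, full_block)

# Time repeated allocation with each function.
n = 2000
t_zeros = timeit.timeit(lambda: np.zeros(padshape, dtype=dtype), number=n)
t_full = timeit.timeit(lambda: np.full(padshape, 0, dtype=dtype), number=n)

print(f"identical blocks: {same}")
print(f"np.zeros: {t_zeros:.4f} s, np.full: {t_full:.4f} s for {n} calls")
```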

mhvk · 2018-05-02T15:45:34Z

@lagru - given that pad does unnecessary copies and that concatenating is not that fast anyway, I would not currently worry about performance too much... Overall, I think the improved code clarity is well worth a small loss in performance.

lagru · 2018-05-02T15:50:20Z

@mhvk

Overall, I think the improved code clarity is well worth a small loss in performance.

Sounds reasonable.

given that pad does unnecessary copies

If you don't mind me asking, where exactly are unnecessary copies made? This statement sounds like there would be a faster option to achieve the same thing pad(..., mode="constant") does. If so, how?

mhvk · 2018-05-02T16:30:27Z

From just a quick look, copies are made implicitly by passing any input through np.array at https://github.com/numpy/numpy/blob/master/numpy/lib/arraypad.py#L1300, explicitly at https://github.com/numpy/numpy/blob/master/numpy/lib/arraypad.py#L1369, and then implicitly again by using concatenate.

Some cleanup might be good... (but definitely in a separate PR!)
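The kinds of copies listed above can be observed with `np.shares_memory`. This is only a sketch of the general behaviour of `np.array` and `np.concatenate` on a toy array; it does not reproduce the actual code paths in arraypad.py.

```python
import numpy as np

x = np.arange(5)

# np.array copies its input by default, even if it is already an ndarray.
copied = np.array(x)
array_copies = not np.shares_memory(x, copied)

# np.asarray, by contrast, passes an existing ndarray through unchanged.
passed_through = np.asarray(x)
asarray_shares = np.shares_memory(x, passed_through)

# np.concatenate always allocates a new result and copies the pieces into it.
joined = np.concatenate((x, np.zeros(2, dtype=x.dtype)))
concat_copies = not np.shares_memory(x, joined)

print(array_copies, asarray_shares, concat_copies)
```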

eric-wieser · 2018-05-02T16:44:05Z

I've a couple of patches in the works for np.pad (#11011 + #11012 + some offline) that would likely conflict with changes that remove some redundant copies, so it might be nice to get those merged first before taking that on.

eric-wieser

Looks great! Feel free to merge, @mhvk.

Regarding full being slower - that sounds like a possible optimization that could be made inside full.

mhvk · 2018-05-02T16:58:11Z

OK, merging. Thanks, @lagru!

mhvk approved these changes May 2, 2018


eric-wieser reviewed May 2, 2018


MAINT: Simplify workflow in _append_const and _prepend_const

a5f94a9

eric-wieser approved these changes May 2, 2018


mhvk merged commit b946795 into numpy:master May 2, 2018

lagru deleted the pad branch May 2, 2018 21:21

lagru mentioned this pull request May 21, 2018

ENH: Faster array padding #11126

Closed

lagru mentioned this pull request Jun 16, 2018

MAINT: Rewrite numpy.pad without concatenate #11358

Merged

5 tasks
