-
-
Notifications
You must be signed in to change notification settings - Fork 10.8k
ENH: remove unneeded spaces in float/bool reprs, fixes 0d str #9139
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
numpy/core/arrayprint.py
Outdated
@@ -624,6 +618,7 @@ def fillFormat(self, data): | |||
from . import numeric as _nc | |||
|
|||
with _nc.errstate(all='ignore'): | |||
hassign = self.sign or any(data < 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be _nc.any
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And probably has_sign
not hassign
3d81fd9
to
53cfc62
Compare
This could break doc tests. Doc tests are evil, but some use them anyway... |
@mhvk just for your info as we decide how to make the repr changes, I ran this PR though astropy tests, and it messes up a lot of tests:
Most of them are things like
or
|
@ahaldane - yes, but I've already got a PR that fixes those tests: astropy/astropy#6090; and am happy to fix things further if we go that route, as I think this PR made printing better in general. Just don't want to have to do it for each numpy version! |
@ahaldane - I clearly did not pay much attention to the PR in which you commented... Even up to redoing your work here in #9144 (now closed). In any case, my comment above stands, if slightly differently: I gladly fix the astropy doctests if we end up with more consistent numpy printing -- after all, I think that after this it is not going to change again... But ideally we do make all changes in one go for 1.14! |
If we try not to break too many things, one option for this would be to expose the |
Updated based on discussion in #9143. Also, while we are annoying everyone by changing the spaces in array printing, we should try to do it as thoroughly as possible now to avoid any future changes. Specifically, I'd like to add another commit or two to 1. remove spaces from (I also still need to update the release notes and other little things) |
numpy/core/tests/test_arrayprint.py
Outdated
assert_equal(str(np.array('café', np.unicode_)), 'café') | ||
assert_equal(repr(np.array('café', np.unicode_)), | ||
"array('café',\n dtype='<U4')") | ||
assert_equal(str(np.array('café', np.unicode_)), 'café') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This case is identical to the one two above it, isn't it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmf not sure how that happened.. will fix
numpy/core/tests/test_arrayprint.py
Outdated
|
||
# 0d arrays are *not* affected by printoptions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment is now wrong!
dbec3f1
to
85a8c45
Compare
Updated. I updated the treatment of signed values so the extra whitespace for the sign only appears if the longest elements have a sign. Release note and tests also updated. I haven't updated the longfloat spacing, but I think I want to leave that for another possible PR. I'd like to get this one through so upstream projects can get their tests to stop complaining. |
85a8c45
to
5767ac1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, modulo a few text issues in the changelog, and an unneeded list construction (which was there already, but might as well get rid of it).
doc/release/1.14.0-notes.rst
Outdated
``np.set_string_function``, unlike most other numpy scalars. This is no longer | ||
the case. | ||
|
||
removed an unneeded whitespace in float and bool array printing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Idem, and also delete "an"
numpy/core/arrayprint.py
Outdated
Has no effect, do not use. | ||
|
||
.. deprecated:: 1.14.0 | ||
outputA |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Piece missing here. Change to "output correctly."?
numpy/core/arrayprint.py
Outdated
if len(non_zero): | ||
precision = max([_digits(x, self.precision, format) | ||
for x in non_zero]) | ||
precision = max([_trailing_nonzero(x, self.precision) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to construct a list inside max
-- btw, unrelated to this PR, but this seems a really slow way for large arrays...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah I worried about that too, but then I worked out that data
is going to be a small array.
data
only contains the elements that are actually printed on the screen (ie, not the interior which is replaced by ...
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, that makes sense.
doc/release/1.14.0-notes.rst
Outdated
---------------------------------------------------- | ||
0d arrays now use the array2string formatters to print their elements, like | ||
other arrays. The `style` argument of array2string is now non-functional. | ||
printing of 0d arrays changed to be more consistent with scalars/ndarrays |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Trivial, but capitalize first word.
doc/release/1.14.0-notes.rst
Outdated
``None`` now prints the array using the ``formatter`` specified using | ||
``np.set_printoptions``, otherwise ``style`` should be a function which accepts | ||
a numpy scalar and returns a string, and ``style(a[()])`` is returned. The | ||
default value is now ``style=None``. | ||
|
||
integer scalars are now unaffected by ``np.set_string_function`` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Idem.
numpy/core/arrayprint.py
Outdated
for x in non_zero]) | ||
precision = max([_trailing_nonzero(x, self.precision) | ||
for x in abs_non_zero]) | ||
precision = min(self.precision, precision) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I think this line is unnecessary, will remove.
5767ac1
to
d13d48e
Compare
All right, reworked the |
Looks good to me. |
I'd be quite happy to merge this, as it seems a follow-up of gh-8983, and finishes making the print options much more logical. Obviously, this will break more doctests, but it does seem best to at least do it in one go. @eric-wieser, @charris: any objections to merging this? |
} | ||
return ret; | ||
|
||
item_str = PyObject_Str(item); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I think I said before, this isn't safe, since item
can (and likely will for custom dtypes) return a copy of self
, causing infinite recursion.
What actually falls back to this case anyway?
Seems to me that it would be better to just inherit object.__str__
, making if obvious that custom dtype users have work to do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, you are talking about this comment.
Currently, besides potential user-defined types, only numpy only void types fall back to this. But they no longer will after #8981.
I just tried your suggestion by simply removing gentype_repr
and gentype_str
, in the last commit pushed here.
Locally (not in the pushed commit) I also temporarily modified test_rational.c
to remove the str and repr of rational methods as a check, and can confirm I get reasonable behavior:
>>> from numpy.core.test_rational import rational
>>> a = np.array([1,2,3], dtype=rational)
>>> a
array([<rational object at 0x7f37f1b4e918>,
<rational object at 0x7f37f1b4e918>,
<rational object at 0x7f37f1b4e918>], dtype=rational)
I'm not sure how to implement a test of this without recreating a new test_rational
type missing the str and repr, which seems a bit heavy-handed.
numpy/core/arrayprint.py
Outdated
@@ -641,7 +635,8 @@ def fillFormat(self, data): | |||
|
|||
if self.exp_format: | |||
self.large_exponent = 0 < min_val < 1e-99 or max_val >= 1e100 | |||
self.max_str_len = 8 + self.precision | |||
pad_sign = self.sign or _nc.any(non_zero < 0) | |||
self.max_str_len = 6 + self.precision + pad_sign |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did this decrease by one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The simple answer is that there appeared to be an unnecessary space.
The more complex answer is that actually I'm not sure why that extra space was there... I'll take another look.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After investigation, I still don't understand why the extra space was there.
I count "6" extra characters more than the floating part and optional sign: x.e+xx
where x
are digits.
fd4ea55
to
2cbe4c4
Compare
min_val = minimum.reduce(non_zero) | ||
max_val = np.max(abs_non_zero) | ||
min_val = np.min(abs_non_zero) | ||
min_val_sgn = np.min(non_zero) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems a bit odd that we're importing np
directly for these, yet importing other stuff indirectly. Not something you need to fix in this PR though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, looks like all the indirect imports can be replaced by the single np
import - I just tried it, seems fine.
I won't do that in this PR though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've never understood quite how, or why, that worked. But it generally does.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it works because we never use or call any numpy functions or variables in this file at import time, only at runtime.
At the moment arrayprint
is imported from the main__init__
file, the numpy
module has been created but not populated, so we are allowed to import numpy, but not allowed refer to any of its functions or variables. However, we can use it inside of function definitions since by the time they are called the numpy module will have been populated.
(Actually, I see a few exceptions: At module import time we call multiarray.set_string_function
and we refer to int_
, float_
, complex_
, intc
and longlong
. Therefore these will still need to be directly imported).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, but it might be safer to leave these all the way they are: What if another module calls arrayprint functions at import time? Then we would have a problem. Ideally no other modules would, though.
(And this isn't theoretical: We recently had a problem because getlimits.py
was calling array2string
at module import time, though we've now fixed it so it no longer does (#9113))
numpy/core/arrayprint.py
Outdated
def __init__(self, data, **kwargs): | ||
# add an extra space so " True" and "False" have the same length and | ||
# array elements align nicely when printed, but only for arrays with | ||
# more than one element (0d and nd) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment no longer matches code
2cbe4c4
to
7be5048
Compare
Fixed the comment, plus the two doctrings describing the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thanks for the iteration.
Let's leave this a day or two to collect any final comments
@mhvk: good to go from your perspective? |
Looked over it once more and, yes, good to go! Though we should ensure that |
7be5048
to
5810349
Compare
I just tweaked the release note a little. I changed the title about |
Otherwise, I checked over it again, I don't have more changes. |
@charris, I think it's screaming time |
5810349
to
710e032
Compare
Rebased for conflicts with #9688 |
In it goes. Good job all. |
There look to be four remaining PR relevant to this. It would be helpful if folks could keep Allan's list current and close any PRs that are no longer relevant. |
numpy 1.14 "fixes" the extra space in float array prints.(numpy/numpy#9139) Use char type to keep back/forward compatible.
Before: [[ 0. 0. 0. ] [ 0. 0. 0. ] [ 0. 0. 0. ]] With recent NumPy: [[0. 0. 0.] [0. 0. 0.] [0. 0. 0.]] See numpy/numpy#9139
See discussion in #8983
This removes the leading space in the string representation of arrays of floating-point type if it is not needed (ie, all positive values), eg:
old behavior:
New behavior:
This noticebly improves the appearance of 0d arrays, so that
np.array(1.0)
prints as itself, instead of asnp.array( 1.0)
as happens after #8983.It does the same for boolean arrays with only one element, so 0d boolean arrays print as
array(True, dtype=bool)
instead ofarray( True, dtype=bool)
.(longfloats still have a space I think, but I would rather not tackle that here because I think it requires a big rewrite of longfloat formatter)