Skip to content

BUG: take casting logic with an out= argument #4246

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

jaimefrio
Copy link
Member

When it receives an out argument, np.take converts the out array to
the dtype of the array being taken from. This results in wrong TypeErrors
being raised when out has a larger dtype than the array (and not being
raised when it has a smaller dtype, even though they should):

>>> a = np.arange(3, dtype=np.int16) + 128
>>> b = np.empty((3,), dtype=np.int8)
>>> a.take([0, 1, 2], out=b) # wrong, should raise an error
array([-128, -127, -126], dtype=int8)
>>> b = np.empty((3,), dtype=np.int32)
>>> a.take([0, 1, 2], out=b) # perfectly OK, but raises an error
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Cannot cast array data from dtype('int32') to dtype('int16') according to the rule 'safe'

This PR makes an explicit casting check, then either sets NPY_ARRAY_FORCECAST
or explicitly raises the error.

The tests added check the logic for conversion between all pairs of ints and
floats only.

*/
if (PyArray_CanCastTypeTo(dtype, out_dt, NPY_SAFE_CASTING)) {
flags |= NPY_ARRAY_FORCECAST;
} else {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PEP7

}
else {

@charris
Copy link
Member

charris commented Apr 4, 2014

The logic of this function is strange, the out parameter isn't even used unless no copy is made in producing obj, and I don't see a warning when it isn't used. I guess it is like ravel in that respect.

When it receives an `out` argument, `np.take` converts the `out` array
to the dtype of the array being taken from. This results in wrong
TypeErrors being raised when `out` has a larger dtype than the array
(and not being raised when it has a smaller dtype, even though they
should).

This PR sets NPY_ARRAY_FORCECAST and makes an explicit casting check,
raising an error if it fails.

The tests added check the logic for conversion between all pairs of
ints and floats only.
@jaimefrio
Copy link
Member Author

I have simplified things as much as possible following your suggestions. The comment in the code is still kind of long and unclear. Perhaps it is better to remove it completely and let the code speak for itself?

* is needed, as the constructor of obj would check the opposite
* casting, were it not disabled by the NPY_ARRAY_FORCECAST flag.
*/
if (PyArray_CanCastTypeTo(dtype, PyArray_DESCR(out),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking this check is incorrect. It was previously allowed to cast, say, float64 to int8, and ufuncs also allow such casting. The forcecast going the other way should be the only change needed as it provides a type compatible work array, which is then copied back on success or discarded on failure. The test could check that out remains unchanged when the mode is raise and an indexing error occurs.

@charris
Copy link
Member

charris commented Apr 7, 2014

This looks much cleaner, but I don't think it is quite correct yet.

@jaimefrio
Copy link
Member Author

I just did a little bit of archeology, and we had this same conversation a little over a year ago:

http://mail.scipy.org/pipermail/numpy-discussion/2013-February/065441.html

It is indeed true that ufuncs will allow any type of casting to out, wasn't fully aware of that, so reproducing that by simply adding the NPY_ARRAY_FORCECAST, plus the testing to avoid regressions is probably the right thing to do.

@jaimefrio
Copy link
Member Author

When simply forcing the cast it raises a different error:

======================================================================
FAIL: Ticket #789, changeset 5217.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\WinPython-32bit-np-dev\python-2.7.5\lib\site-packages\numpy\core\test
s\test_regression.py", line 1027, in test_compress_small_type
    raise AssertionError("compress with an out which cannot be " \
AssertionError: compress with an out which cannot be safely casted should not return successfully

Apparently it is expected that compress does not behave like ufuncs, and it was relying on take to enforce that behavior. The test for compress works by pure chance: not because int cannot be cast to single, but because single cannot be cast to int.

Since compress has been broken all this time the same way that take was, and no one else has complained, it is tempting to remove the test...

@seberg
Copy link
Member

seberg commented Apr 7, 2014

Only half true I believe, isn't that behaviour in ufuncs is deprecated I believe.

@charris
Copy link
Member

charris commented Apr 9, 2014

@seberg Hmm, need to check that. I'm concerned about changing functions in the API even if it looks like this function isn't much used ;) Currently, the function only works if the output precision is <= input precision, with this PR it is >= instead.

@seberg
Copy link
Member

seberg commented Apr 9, 2014

Wasn't arguing about just using forced casting here. I think it should be good. Could or could not add deprecations on top of that then, just like the ufuncs have right now. Or similar to them anyway.

@jaimefrio
Copy link
Member Author

The forced cast without type checking raises that error in compress. Apparently someone was expecting compress to not simply force a cast to the out type, but do what this PR originally did . If we go that way, then the test in compress has to be removed. Which is a good idea anyway, because it is testing the wrong thing.

I haven't been able to find "Ticket #789, changeset 5217" to see if there is a deeper reason for that test, any clue?

@seberg
Copy link
Member

seberg commented Apr 9, 2014

Well, that test tests a segfault bug (see gh-1387), so there is no need to see it as someone wanted to fix behaviour there. We are not awfully consistent about what we allow with the out argument anyway. Some allow forced casting, some want exact matching, and for ufuncs there is a plan to make it safe...

@charris
Copy link
Member

charris commented Apr 11, 2014

Looking at scipy, the out parameter is never used in calls to take. I'm thinking it is probably safe (and a good idea) to check the out type. Ideally we would add a casting argument, but that can't be done at the C level. Hmm, How about we get rid of the check here, and add it to the python take function along with the casting option?

@charris
Copy link
Member

charris commented Apr 11, 2014

Actually, array_take in methods.c looks like the best place to add an option.

@charris
Copy link
Member

charris commented Apr 11, 2014

There is PyArray_CastingConverter available to check the conversion and array_astype provides a template for the usage.

@charris
Copy link
Member

charris commented Feb 27, 2015

Maybe time to take another look at this.

@homu
Copy link
Contributor

homu commented Jul 25, 2015

☔ The latest upstream changes (presumably #6047) made this pull request unmergeable. Please resolve the merge conflicts.

@seberg
Copy link
Member

seberg commented Apr 26, 2019

@jaimefrio this seems to still requires some work. Unless you comment/wish to continue, I would probably:

  • close the PR
  • create a new issues to note that there is a start for fixing this in this PR. (Or modify the existing one)

in a few weeks. Of course the PR can be reactivated at any time. (tag:PR-possibly-close)

In general this seems like a fairly straight forward task, although likely requiring some form of Deprecation.

@anirudh2290
Copy link
Member

This has been stale from the point when @seberg last responded. Closing for now. Feel free to reopen @jaimefrio if you still wish to pursue this task.

@eric-wieser
Copy link
Member

I think we should create an issue to continue tracking this behavior

@anirudh2290
Copy link
Member

sure @eric-wieser, will do.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants