-
-
Notifications
You must be signed in to change notification settings - Fork 10.8k
make recarray.attr return ndarray (not chararray) #5454
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Agree with this change as Also, the commit message should begin |
Should we throw in a delectation warning to chararray.init while we're
|
*Deprecation, of course.
|
@njsmith I was considering that but wanted to do a github code search first and need to exclude all the numpy forks, which I've forgotten how to do. I think the people most likely to be affected would be StSci. |
Hmm, |
OK, I've updated the commit message and added a note in Should I post to NumPy-discussion or SciPy-dev? I'm note sure which one is for numpy development. |
numpy-discussion |
Doesn't look like anyone noticed the post :( Could you add tests to check the return type of the two examples you used? |
I've added some tests. Along the way though, I realized my fix still has a problem and so I've made another change. I was expecting that by removing the chararray lines I would end up with a setup where attribute access behaved identically to index access. However this happens instead:
With the new pull request I've made index access behave identically to attribute access to recarrays (as a second git commit if that's OK). That is, accessing a scalar dtype gives an ndarray but accessing a structured datatype gives a recarray, for both attributes and indexes. There is a case to be made for having indexing behave differently from attribute access. Accessing fields using indexing is something that already exists for ndarrays. Recarrays are supposed to add attribute access, so it could also make sense for them to leave the index access alone. In this case indexed access would always return an ndarray, while attribute access would return ndarrays or recarrays depending on whether the data type is structured. On the other hand, it could be confusing that field access by indexes and attributes behaves differently, and it would have to be clearly documented. It might also seems a little strange if indexing an array gives an array of a different type, as would happen if indexing always returns ndarrays. In either case, the current code returns recarrays for scalar types accessed through indexing, which I don't think makes sense. Please also read the updated note in doc/release/1.10.0-notes.rst for another explanation. With this change, I get the following:
By the way, it occured to me that the docs would need to change too, but they don't: recarray's use of chararrays wasn't documented anywhere, nor were the return types in general. I'm thinking of expanding the docs. If any of this seems confusing, please wait a little because I am going to send an email to the numpy list regarding recarray documentation and #3581. Perhaps the issues I raise there should be discussed before this one. |
Correction: I had it a little wrong in my last update. Here's the correct description: Setup:
Previous behavior:
New behavior:
|
Just a heads up: Don't merge this; I have more changes I want to make. I'll also merge the latest commits from origin when the time comes. |
This was automatically closed because of a push I made, but that's just as well. I wanted to open a new (duplicate) pull request anyway, since I mistakenly make these commits to my master branch. |
This fixes #3993
This bug in recarrays has bitten me since chararrays remove whitespace for 'S' dtypes, while ndarrays don't.
As far as I can tell all that was required was removing the explicit checks for string types in the recarray code.