Skip to content

BUG: regression for array([pandas.DataFrame()]) #13663

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
May 30, 2019

Conversation

mattip
Copy link
Member

@mattip mattip commented May 29, 2019

Fixes #13399

Add failing test.
Fix by not converting 0-len object to ndarray, which is compatible with behaviour before #13399

@mattip
Copy link
Member Author

mattip commented May 29, 2019

Fixes #13659

@mattip
Copy link
Member Author

mattip commented May 29, 2019

xref pandas-dev/pandas#26546

@mattip mattip changed the title WIP, BUG: regression for array([pandas.DataFrame()]) BUG: regression for array([pandas.DataFrame()]) May 29, 2019
@charris charris added this to the 1.17.0 release milestone May 29, 2019
Copy link
Member

@seberg seberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems OK to me as a reversion. I am not sure about the DECREF, I was wondering if it should have a comment that this is a bit odd, but then if its a 0 length sequence, the assignment should be empty in any case (of course Eric will create some malicious object which would bring it down in flames).

Fixes the regression hitting Pandas. But does not fix the real (never noticed) but that this gives the incorrect output shape to begin with.

return 0;
}
else {
Py_DECREF(Py_NotImplemented);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be an unconditional Py_DECREF(tmp)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixing

#gh-13659, would raise in broadcasting [x=t for x in result]
np.array([t])

def test_2d__array__shape(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like this whole code block is just duplicated?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

weird. Fixing

@mattip mattip force-pushed the __array__-empty-2d branch from 725a9a5 to af094ef Compare May 29, 2019 19:07
goto fail;
}
else if (tmp != Py_NotImplemented) {
if (PyArray_CopyInto(dst, (PyArrayObject *)tmp) < 0) {
goto fail;
Copy link
Member

@seberg seberg May 29, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
goto fail;
Py_DECREF(tmp);
goto fail;

Something is still off here with the reference counting. Also the return 0 below needs it. Which probably means the whether or not the else clause is there did not matter much. Or we make tmp = NULL; one scope up to make the fail paths nicer.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

refactored

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, looks right now. Will merge as soon as tests finish (or squash merge).

@seberg seberg self-assigned this May 30, 2019
@seberg seberg merged commit ba53a63 into numpy:master May 30, 2019

t = T()
#gh-13659, would raise in broadcasting [x=t for x in result]
np.array([t])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a test here for the shape?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

xref #13671

TomAugspurger added a commit to TomAugspurger/pandas that referenced this pull request May 31, 2019
TomAugspurger added a commit to pandas-dev/pandas that referenced this pull request May 31, 2019
seberg added a commit to seberg/numpy that referenced this pull request Jul 24, 2019
…rray-like"

This reverts commit 71fc59d
which was part of numpygh-13399 and the followup commit
ba53a63 from numpygh-13663.

The issue is that we mix Sequence protocol and `__array__` usage
when coercing arrays. It seems easiest to revert this for the 1.17.x release
and make a larger breaking change in the 1.18.x release cycle.
Since this only occurs for stranger array-likes. This revert is limited
to the 1.17.x branch for now.
charris pushed a commit to charris/numpy that referenced this pull request Nov 26, 2019
…rray-like"

This reverts commit 71fc59d
which was part of numpygh-13399 and the followup commit
ba53a63 from numpygh-13663.

The issue is that we mix Sequence protocol and `__array__` usage
when coercing arrays. It seems easiest to revert this for the 1.17.x release
and make a larger breaking change in the 1.18.x release cycle.
Since this only occurs for stranger array-likes. This revert is limited
to the 1.17.x branch for now.
charris pushed a commit to charris/numpy that referenced this pull request Dec 2, 2019
…rray-like"

This reverts commit 71fc59d
which was part of numpygh-13399 and the followup commit
ba53a63 from numpygh-13663.

The issue is that we mix Sequence protocol and `__array__` usage when
coercing arrays. It seems easiest to revert this for the 1.18.x release
and make a larger breaking change in the 1.19.x release cycle.  Since
this only occurs for stranger array-likes.
seberg added a commit to seberg/numpy that referenced this pull request Feb 6, 2020
``__array__`` was previously not used during dimension discovery,
while bein gused during dtype discovery (if dtype is not given),
as well as during filling of the resulting array.

This would lead to inconsistencies with respect to array likes
that have a shape including a 0 (typically as first dimension).
Thus a shape of ``(0, 1, 1)`` would be found as ``(0,)`` because
a nested list/sequence cannot represent empty shapes, except 1-D.

This uses the `_array_from_array_like` function, which means that
some coercions may be tiny bit slower, at the gain of removing
a lot of complex code.

(this also reverts commit d0d250a
or the related change).

This is a continuation of work by Sergei Lebedev in numpygh-13663
which had to be reverted due to problems with Pandas, and the
general inconsistency. This version may not resolve all issues
with pandas, but does resolve the inconsistency.

Closes numpygh-13958
@mattip mattip deleted the __array__-empty-2d branch June 8, 2020 06:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants