Skip to content

FIX: pandas indexing error #5556

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Dec 30, 2015
Merged

Conversation

tacaswell
Copy link
Member

pd.Series prefer indexing via searching the index to positional
indexing. This method will get the first element of any iterable.

It will advance a generater, but they did not work previously anyway.

Closes #5550

@tacaswell tacaswell added this to the Critical bugfix release (1.5.1) milestone Nov 24, 2015
@@ -1561,7 +1561,7 @@ def default_units(x, axis):
x = x.ravel()

try:
x = x[0]
x = next(iter(x))
except (TypeError, IndexError):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

line 1565 should be modified to

except (TypeError, StopIteration):

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching that!

@tacaswell
Copy link
Member Author

rebased.

@tacaswell
Copy link
Member Author

Also addresses #5557

----------
inp : iterable
ename : str, optional
Name to use is ValueError if can not be normalized
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"is" --> "in"?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I swear I know english...

@WeatherGod
Copy link
Member

besides the nitpicking, this looks good to me

@tacaswell
Copy link
Member Author

@WeatherGod do you want me to drop that (some what superfluous) white space commit?

@WeatherGod
Copy link
Member

Eh, might be better to get rid of that commented out code entirely. I don't see the point of keeping it around.

@tacaswell
Copy link
Member Author

Well, the minimum numpy is now 1.6 so we should do what ever it was that note was telling us to do.

@WeatherGod
Copy link
Member

?

@tacaswell
Copy link
Member Author

The comment below the commented line says that we are handling normed internally until our minimum version of numpy can handle it (which the comment claims in > 1.5). I think it will take a bit of work (at least for met) to understand how what we are currently doing works to make sure that we can just fall over to numpy.

@WeatherGod
Copy link
Member

Ah, yeah, that comment (I must be getting blind...). I suspect that it is quite old and came from before the days that stacked histograms and such were possible. I am not sure it is possible to just hand off the responsibility of norming to numpy any more given the current feature set.

pd.Series prefer indexing via searching the index to positional
indexing.  This method will get the first element of any iterable.

It will advance a generater, but they did not work previously anyway.

Closes matplotlib#5550
using `next(iter(obj))` can raise `TypeError` if obj is not iterable
and `StopIteration` if len(obj) == 0.  As we are no longer explictily
indexing into `obj` it should not raise an `IndexError`
Both iterable and index_of are imported at the top of this import block
@tacaswell
Copy link
Member Author

This is passing and not conflicting, can someone push the button on this before it needs another rebase 😈 .

@WeatherGod
Copy link
Member

I do have another concern with respect to generators. Previously, generators just simply didn't work, now they may sort of work (generator gets partly consumed). This muddles the API a bit. Wouldn't it be better to detect that we have a generator and raise an error? We can later consider how to properly support generators without consuming them.

@tacaswell
Copy link
Member Author

That is a can of worms of handling I am worried about opening and would rather punt on the whole thing. This at least does not break any existing user code.

@WeatherGod
Copy link
Member

isn't it just simply a matter of testing isinstance(x, collections.Iterator)? That would be false for lists, numpy arrays and such, but should be true for generators and iterators, I think.

@tacaswell
Copy link
Member Author

I am more worried about applying that test uniformly across the library.

@WeatherGod
Copy link
Member

That would be nice, but it is not the goal that I am seeking. Simply that we add this test where-ever we have code that would potentially consume an iterator. That way, the API hasn't changed. So, that is just two places here in this PR.

@tacaswell
Copy link
Member Author

fair enough

All versions of python we support have built in `next`
 - moved next(iter(obj)) logic into a cbook function
@tacaswell
Copy link
Member Author

Digging into this a bit (and trying to sort out why plot does not have this problem) it turns out that plot goes through an entirely different code path by using Axis.update_units which delegates to units.Registry.get_convertor which has some non-trivial logic and will do very bad things to generators (I think).

This PR is a whack-a-mole fix which will get 1.5.1 out the door.

@tacaswell
Copy link
Member Author

The test failure is un-related (due to backport of custom RNG).

@WeatherGod
Copy link
Member

seems like it, was the baseline images not updated for the new RNG?

@mdboom
Copy link
Member

mdboom commented Dec 29, 2015

seems like it, was the baseline images not updated for the new RNG?

They were. This looks like a "only on Travis" sort of failure.

@WeatherGod
Copy link
Member

That doesn't give me any sort of warm fuzzies...

@tacaswell
Copy link
Member Author

I can reproduce it locally (but it passed before I pushed upstream (which I assume means my local install had become screwed up)).

@WeatherGod
Copy link
Member

I hope this hasn't become another rabbit-infested dragon cave...

@WeatherGod
Copy link
Member

Looks like you cleared out the rabbits!

WeatherGod added a commit that referenced this pull request Dec 30, 2015
@WeatherGod WeatherGod merged commit aeafb64 into matplotlib:v1.5.x Dec 30, 2015
@WeatherGod
Copy link
Member

By moving a bit of the if-statement into a cbook function, the if-statement starts to be a little bit clearer. Probably would be a good idea to have some cbook tests at some point, but that can come later.

tacaswell pushed a commit that referenced this pull request Mar 12, 2016
Specifically, when creating collection objects,
`linewidths` and `antialiaseds` can be non zero indexed
pandas series.

Related to: PR #5556
@tacaswell tacaswell deleted the fix_pandas_indexing branch May 8, 2016 22:08
tacaswell added a commit to tacaswell/matplotlib that referenced this pull request Jul 9, 2016
This was unintentionally broken by
bdaaf59 which went in as part of
PR matplotlib#5556 which was fixing support for pandas series.
tacaswell added a commit to tacaswell/matplotlib that referenced this pull request Jul 12, 2016
This was unintentionally broken by
bdaaf59 which went in as part of
PR matplotlib#5556 which was fixing support for pandas series.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants