-
-
Notifications
You must be signed in to change notification settings - Fork 7.9k
ENH: plotting methods can unpack labeled data #4829
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: plotting methods can unpack labeled data #4829
Conversation
Note: this PR will fail as the decorated functions probably need some adjustments in the decorator call (see https://github.com/matplotlib/matplotlib/pull/4829/files#diff-0347a3e5b820bc05115b3c9cc6f122f9R49 -> |
👍 Thanks! Either doing this my self and faking the commit author or asking you to do this was on my to-do list. |
c48414a
to
9411f86
Compare
slightly reordered (the .index commit is now first) and updated compile time check to hopefully get less travis failures... Also now includes a better commit message and a .gitignore update for pycharm |
@@ -979,6 +981,7 @@ def hlines(self, y, xmin, xmax, colors='k', linestyles='solid', | |||
|
|||
return coll | |||
|
|||
@unpack_labeled_data() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this probably needs a label_namer="x"
I started to go thought the decorated functions to add the right decorator arguments, but this is slow going, as in some case I don't know the plotting function at all :-( |
Uih, and this is currently not working:
-> the names of the individual args depend on the total number of args BTW: this will currently fail if you call the function with `quiver(U=[...], V, **kw) |
Yes, these sorts of signatures date from early Matlab compatibility On 8/2/15, Jan Schulz notifications@github.com wrote:
|
Ok, I went through all the decorated function. The differences between the APIs for each function are kind of horrible :-( for quite a lot of the functions I had to remove the decorator again (just comemtned out for now), as they take data which cannot easily gotten from a pandas.DataFrame (aka 2d+ data -> I also added a "replace_all_args" kwarg to the decorator, which handles the variable length |
47268fb
to
a71d16e
Compare
Can you leave it in cases where a data frame would not work? There are other data structures (like h5py objects or a dict of ndarrays) which conform to the required signatures which can return 2D data. That feature is actually one of the points that did the most to convince me that this was worth doing. |
In my understanding the main case for this is: (df.pipe(h)
.pipe((plt.plot, "data"), x="a", y="b" )
) I wouldn't use the Maybe one of the original proponents can chime in here: @jakevdp @fperez @mrocklin @ellisonbg @pzwang @mwaskom @jreback @andrewcollette |
If the point was primarily to support The protocol that was agreed on (to my understanding) at pydata Seattle was that @JanSchulz I don't really understand your concern. If anything, you are making the case that we can go back to using a simpler decorator that tries to replace everything and make sure that the auto-labeler works (or just black list replacing |
Sorry I am cranky, I should not leave comments before breakfast. |
Witness @tacaswell, the grumpy plotting custodian. |
@tacaswell No problem :-) I've no problem putting it back in, it's just my (biased?) impression that this functionality will be most used by pandas users (in whatever form: BTW: should this do even more magic and detect if we are called from |
Happy to see this going in. We've been adding plotting to xray recently. Most of our focus was on 2d data. Adding labels to the axes was important for us. |
Ok, seems that we are down to pep8 problems. Yay :-) |
@@ -2236,6 +2251,7 @@ def barh(self, bottom, width, height=0.8, left=None, **kwargs): | |||
bottom=bottom, orientation='horizontal', **kwargs) | |||
return patches | |||
|
|||
# @unpack_labeled_data() # not df["name"] getable... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be uncommented.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done, added with label_namer=None
IPython ships a version of `Signature` and friends that runs on python >=2.7. Fall back to this version of things for python <=3.2.
MNT: use IPython's signature if needed + available
new_sig = None | ||
ver_info = sys.version_info | ||
# py2.6 compatible version, use line below as soon as we can | ||
python_has_signature = ver_info[0] > 2 and ver_info[1] > 2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very minor, but I keep tripping over this:
major
andminor1
are already defined on line 194, so you don't needver_info
- These version checks are incorrect (python 4, anyone?) and confusing. They would be clearer as
python_has_signature = (major == 3 and minor1 >= 3)
python_has_wrapped = (major == 3 and minor1 >= 2)
Then you can remove all the comments in the vicinity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I didn't know we had those variables.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a PR into @JanSchulz 's branch to fix this and to guard against importing IPython in not already imported.
Vetter checks for python and IPython
``pandas.DataFrames`` easier: | ||
|
||
* For plotting methods which understand a ``label`` keyword argument but the | ||
user does not supply such an argument, this is now implicitly set by either |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which of these two methods for looking up a label takes priority? It's not clear to me from the description here....
If the second arguement to `plot` is both in data and a valid style code warn the user.
@JanSchulz I am happy to fix those as we go. |
ENH: plotting methods can unpack labeled data
Arbitrary long args, i.e., plot("x","y","r","x2","y2","b") were problematic if used with a data kwarg and the color spec ("r", "b") was included in data: this made it impossible in some cases to determine what the user wanted to plot: plot("x","y","r","x2","y2","b", data={..., "r":..., "b":...) could be interpreted as plot(x, y, "r") # all points red plot(x2, y2, "b") # all points black or plot(x,y) plot(r,x2) plot(y2,b) This could lead to hard to debug problems if both styles result in a similar plot (e.g. if all are values in the same range). Therefore it was decided to remove this possibility so that the usere gets a proper error message instead: matplotlib#4829 (comment) There is still a case of ambiguity (plot("y", "ro", data={"y":..., "ro":...), which is now detected and a warning is issued. This detection could theoretically be used to detect the above case as well, but there were so many corner cases, that the checks became too horrible and I took that out again. Note that passing in data directly (without a data kwarg) is unaffected, it still accepts arbitrary long args.
Arbitrary long args, i.e., plot("x","y","r","x2","y2","b") were problematic if used with a data kwarg and the color spec ("r", "b") was included in data: this made it impossible in some cases to determine what the user wanted to plot: plot("x","y","r","x2","y2","b", data={..., "r":..., "b":...) could be interpreted as plot(x, y, "r") # all points red plot(x2, y2, "b") # all points black or plot(x,y) plot(r,x2) plot(y2,b) This could lead to hard to debug problems if both styles result in a similar plot (e.g. if all are values in the same range). Therefore it was decided to remove this possibility so that the usere gets a proper error message instead: matplotlib#4829 (comment) There is still a case of ambiguity (plot("y", "ro", data={"y":..., "ro":...), which is now detected and a warning is issued. This detection could theoretically be used to detect the above case as well, but there were so many corner cases, that the checks became too horrible and I took that out again. Note that passing in data directly (without a data kwarg) is unaffected, it still accepts arbitrary long args.
This is an alternative implementation of a labeled data decorator (see #4787)
Todo
doc/users/whats_new
plot()
document the problems with 'color spec or next x' ambiguity when the third argument in *args is in data.plt.func(*args, **args)
as thats the signature of the decorator.label_namer="y"
is ok. In most cases it probably is not...*args, **kwargs
if it would be better to supply argument namesrename_names
list is better (i.e. don't replace everything but only a few arguments)plot(...)
data=data
is passed in?label_namer="y"
tolabel_namer=None
? None is probably much more used than 'y'...*args
positional_parameter_names={(#args: ["names"]}
?