ENH: plotting methods can unpack labeled data #4829

jankatins · 2015-07-30T18:28:57Z

This is an alternative implementation of a labeled data decorator (see #4787)

Todo

jankatins · 2015-07-30T18:32:58Z

Note: this PR will fail, as the decorated functions probably need some adjustments in the decorator call (see https://github.com/matplotlib/matplotlib/pull/4829/files#diff-0347a3e5b820bc05115b3c9cc6f122f9R49 -> test_compiletime_checks for the tested problems). It mostly has to do with label names that are not found.

tacaswell · 2015-07-30T18:35:52Z

👍 Thanks! Either doing this myself and faking the commit author or asking you to do this was on my to-do list.

jankatins · 2015-07-30T20:37:40Z

Slightly reordered (the .index commit is now first) and updated the compile-time check to hopefully get fewer Travis failures... Also now includes a better commit message and a .gitignore update for PyCharm.

jankatins · 2015-07-30T20:39:40Z

lib/matplotlib/axes/_axes.py

@@ -979,6 +981,7 @@ def hlines(self, y, xmin, xmax, colors='k', linestyles='solid',

        return coll

+    @unpack_labeled_data()


this probably needs a label_namer="x"

jankatins · 2015-08-02T20:49:09Z

I started to go through the decorated functions to add the right decorator arguments, but this is slow going, as in some cases I don't know the plotting function at all :-(

jankatins · 2015-08-02T20:57:48Z

Uih, and this is currently not working:

  quiver(U, V, **kw)
  quiver(U, V, C, **kw)
  quiver(X, Y, U, V, **kw)
  quiver(X, Y, U, V, C, **kw)

-> the names of the individual args depend on the total number of args

BTW: this will currently fail if you call the function with `quiver(U=[...], V, **kw)`

efiring · 2015-08-02T21:45:06Z

Yes, these sorts of signatures date from early Matlab compatibility
considerations. There are several functions like this.

On 8/2/15, Jan Schulz notifications@github.com wrote:

Uih, and this is currently not working:
  quiver(U, V, **kw)
  quiver(U, V, C, **kw)
  quiver(X, Y, U, V, **kw)
  quiver(X, Y, U, V, C, **kw)
-> the names of the individual args depend on the total number of args

jankatins · 2015-08-02T23:24:20Z

Ok, I went through all the decorated functions. The differences between the APIs for each function are kind of horrible :-( For quite a lot of the functions I had to remove the decorator again (just commented out for now), as they take data which cannot easily be gotten from a pandas.DataFrame (aka 2d+ data -> df["name"] will return a 1d Series).

I also added a "replace_all_args" kwarg to the decorator, which handles the variable-length *args. After seeing what is used here, I think that adding the possibility to pass in a dict as 'positional_parameter_names' which maps number_of_args -> ["list", "of", "names"] is probably the better solution...
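The dict idea above can be sketched roughly as follows. This is only an illustration: `QUIVER_ARG_NAMES` and `name_positional_args` are hypothetical names, not part of the PR, and the mapping simply restates the four quiver call signatures listed earlier in the thread.

```python
# Hypothetical mapping for quiver: number of positional args -> their names.
QUIVER_ARG_NAMES = {
    2: ["U", "V"],
    3: ["U", "V", "C"],
    4: ["X", "Y", "U", "V"],
    5: ["X", "Y", "U", "V", "C"],
}


def name_positional_args(args, names_by_count):
    """Pair each positional argument with the name implied by the arg count."""
    try:
        names = names_by_count[len(args)]
    except KeyError:
        raise TypeError(
            "unsupported number of positional arguments: %d" % len(args))
    return dict(zip(names, args))


# With three positional args the third one is C, with four it is U.
three = name_positional_args(("u", "v", "c"), QUIVER_ARG_NAMES)
four = name_positional_args(("x", "y", "u", "v"), QUIVER_ARG_NAMES)
```

Once every positional argument has a name, a decorator could decide per name whether a string should be looked up in `data` or left alone.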

tacaswell · 2015-08-03T01:24:40Z

Can you leave it in cases where a data frame would not work? There are other data structures (like h5py objects or a dict of ndarrays) which conform to the required signatures which can return 2D data. That feature is actually one of the points that did the most to convince me that this was worth doing.

jankatins · 2015-08-03T09:25:32Z

In my understanding the main case for this is:

(df.pipe(h)
   .pipe((plt.plot, "data"), x="a", y="b" )
)

I wouldn't use the data=dict(...) case as one which should be optimized (or even catered) for, because then we would need to keep the "replace all" (at least for isinstance(data, dict)), as a dict can also include scalars or ENUMs or whatever fancy argument a function can take. If you construct a dict it's IMO easier to use **dict(...) (or *list(...)). I'm also not sure how to tell the user that in all the "normal" plot calls a DataFrame works as data, but here you have to do some additional construction work.

Maybe one of the original proponents can chime in here: @jakevdp @fperez @mrocklin @ellisonbg @pzwang @mwaskom @jreback @andrewcollette

tacaswell · 2015-08-03T12:28:50Z

If the point was primarily to support df.pipe, then the pandas folks should just implement pipe to be more flexible.

The protocol that was agreed on (to my understanding) at PyData Seattle was that data is any object that supports __getitem__ with a string and returns something that np.asarray works on. From this point of view a DataFrame is just a (restricted) dict of ndarrays and the base target implementation is isinstance(data, dict).

@JanSchulz I don't really understand your concern. If anything, you are making the case that we can go back to using a simpler decorator that tries to replace everything, and make sure that the auto-labeler works (or just blacklist replacing label), as it adds some nice functionality in the simplest case and in cases where users pass custom objects.
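A minimal sketch of the "replace everything" approach under that protocol might look like the following. It is not the decorator from this PR: `unpack_from_data` and `describe` are hypothetical names, only `KeyError` is treated as "not a key", and the `np.asarray` conversion is left to the wrapped function so the sketch stays dependency-free.

```python
import functools


def unpack_from_data(func):
    """Hypothetical, simplified stand-in for a labeled-data decorator."""
    @functools.wraps(func)
    def wrapper(*args, data=None, **kwargs):
        if data is None:
            # No data object given: behave exactly like the wrapped function.
            return func(*args, **kwargs)

        def lookup(value):
            # Only strings are candidate keys; anything else passes through.
            if not isinstance(value, str):
                return value
            try:
                return data[value]
            except KeyError:
                # Not a key in data (e.g. a style code): keep the string.
                return value

        new_args = tuple(lookup(a) for a in args)
        new_kwargs = {k: lookup(v) for k, v in kwargs.items()}
        return func(*new_args, **new_kwargs)
    return wrapper


@unpack_from_data
def describe(x, y, fmt="-"):
    return len(x), len(y), fmt


# A plain dict satisfies the protocol and may hold 2D values as well.
table = {"a": [1, 2, 3], "b": [[1, 2], [3, 4]]}
result = describe("a", "b", fmt="r", data=table)
```

Because the lookup only relies on `__getitem__` with a string, the same wrapper would accept any object following the protocol described above, not just a dict.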

tacaswell · 2015-08-03T13:48:27Z

Sorry I am cranky, I should not leave comments before breakfast.

blink1073 · 2015-08-03T13:58:46Z

Witness @tacaswell, the grumpy plotting custodian.

jankatins · 2015-08-03T16:05:25Z

@tacaswell No problem :-) I've no problem putting it back in, it's just my (biased?) impression that this functionality will be most used by pandas users (in whatever form: pipe or not) and I feel that making that case ("mostly") foolproof is a nice goal in terms of maintenance costs (Stack Overflow Qs, ...).

BTW: should this do even more magic and detect if we are called from pandas.DataFrame.pipe() and in that case accept the first argument as 'data'? That would make it even cleaner for pandas users (no need for the ugly tuple: df.pipe((plt.plot, "data"), args) becomes df.pipe(plt.plot, args)). CC: @TomAugspurger, @shoyer @jreback

clarkfitzg · 2015-08-03T17:13:49Z

Happy to see this going in. We've been adding plotting to xray recently. Most of our focus was on 2d data. Adding labels to the axes was important for us.

jankatins · 2015-08-03T18:47:17Z

Ok, seems that we are down to pep8 problems. Yay :-)

tacaswell · 2015-08-03T19:45:53Z

lib/matplotlib/axes/_axes.py

@@ -2236,6 +2251,7 @@ def barh(self, bottom, width, height=0.8, left=None, **kwargs):
                           bottom=bottom, orientation='horizontal', **kwargs)
        return patches

+    # @unpack_labeled_data() # not df["name"] getable...


I think this should be uncommented.

done, added with label_namer=None

IPython ships a version of `Signature` and friends that runs on python >=2.7. Fall back to this version of things for python <=3.2.

MNT: use IPython's signature if needed + available

efiring · 2015-09-08T18:38:49Z

lib/matplotlib/__init__.py

+        new_sig = None
+        ver_info = sys.version_info
+        # py2.6 compatible version, use line below as soon as we can
+        python_has_signature = ver_info[0] > 2 and ver_info[1] > 2


Very minor, but I keep tripping over this:

major and minor1 are already defined on line 194, so you don't need ver_info

These version checks are incorrect (python 4, anyone?) and confusing. They would be clearer as

  python_has_signature = (major == 3 and minor1 >= 3)
  python_has_wrapped = (major == 3 and minor1 >= 2)

Then you can remove all the comments in the vicinity.
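For illustration, the problem with the original check and one alternative way to write it can be sketched as below. The function names are hypothetical; the only facts relied on are that `inspect.signature` arrived in Python 3.3 and that `functools.wraps` sets `__wrapped__` from Python 3.2 on. Tuple comparison is shown as an option, not as what the PR ended up using.

```python
import sys


def original_check(version_info):
    # The check from the diff: compares major and minor independently.
    return version_info[0] > 2 and version_info[1] > 2


def has_signature(version_info=sys.version_info):
    # inspect.signature exists from Python 3.3 on.
    return tuple(version_info[:2]) >= (3, 3)


def has_wrapped(version_info=sys.version_info):
    # functools.wraps sets __wrapped__ from Python 3.2 on.
    return tuple(version_info[:2]) >= (3, 2)


# A hypothetical Python 4.0 shows the difference between the two styles.
py4_original = original_check((4, 0))   # False: minor 0 is not > 2
py4_tuple = has_signature((4, 0))       # True: (4, 0) >= (3, 3)
```

Comparing `(major, minor)` tuples orders versions lexicographically, so no separate handling of the major version is needed.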

Sorry, I didn't know we had those variables.

I have a PR into @JanSchulz's branch to fix this and to guard against importing IPython if not already imported.

Vetter checks for python and IPython

More fixes

shoyer · 2015-09-11T07:39:16Z

doc/users/whats_new/2015-07-30_unpack_labeled_data.rst

+``pandas.DataFrames`` easier:
+
+* For plotting methods which understand a ``label`` keyword argument but the
+  user does not supply such an argument, this is now implicitly set by either


Which of these two methods for looking up a label takes priority? It's not clear to me from the description here....

tacaswell · 2015-09-12T18:49:23Z

@matplotlib/developers I am planning to fix up the last few issues (comments from me, @pelson, and @shoyer ) and then merge this today.

@pelson We can fix up the testing in later PRs, and I am hoping my claim that get_label is a placeholder for future work is convincing.

If the second argument to `plot` is both in data and a valid style code warn the user.

tacaswell · 2015-09-13T00:28:27Z

@JanSchulz I am happy to fix those as we go.

ENH: plotting methods can unpack labeled data

Arbitrarily long args, i.e., plot("x","y","r","x2","y2","b"), were problematic if used with a data kwarg and the color spec ("r", "b") was included in data: this made it impossible in some cases to determine what the user wanted to plot:

  plot("x","y","r","x2","y2","b", data={..., "r":..., "b":...})

could be interpreted as

  plot(x, y, "r") # all points red
  plot(x2, y2, "b") # all points blue

or

  plot(x,y)
  plot(r,x2)
  plot(y2,b)

This could lead to hard-to-debug problems if both styles result in a similar plot (e.g. if all values are in the same range). Therefore it was decided to remove this possibility so that the user gets a proper error message instead: matplotlib#4829 (comment)

There is still a case of ambiguity (plot("y", "ro", data={"y":..., "ro":...})), which is now detected and a warning is issued. This detection could theoretically be used to detect the above case as well, but there were so many corner cases that the checks became too horrible and I took that out again.

Note that passing in data directly (without a data kwarg) is unaffected; it still accepts arbitrarily long args.
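The remaining ambiguity check could look roughly like this. It is a sketch under stated assumptions, not the code merged in this PR: `looks_like_style_code` is a hypothetical validator that only knows a small subset of matplotlib's format characters, `second_arg_is_ambiguous` is a hypothetical helper, and `data` is assumed to support the `in` operator.

```python
import warnings

# Simplified, hypothetical subset of the format-string alphabet.
_COLORS = set("bgrcmykw")
_MARKERS = set(".,ov^<>sp*+xDd|_")
_LINESTYLES = ("--", "-.", "-", ":")  # two-character styles first


def looks_like_style_code(fmt):
    """Return True if fmt is at most one linestyle, color and marker."""
    if not fmt:
        return False
    rest = fmt
    for linestyle in _LINESTYLES:
        if linestyle in rest:
            rest = rest.replace(linestyle, "", 1)
            break
    seen_color = seen_marker = False
    for char in rest:
        if char in _COLORS and not seen_color:
            seen_color = True
        elif char in _MARKERS and not seen_marker:
            seen_marker = True
        else:
            return False
    return True


def second_arg_is_ambiguous(args, data):
    """Warn if the second of two args is both a data key and a style code."""
    if data is None or len(args) != 2:
        return False
    second = args[1]
    if (isinstance(second, str) and second in data
            and looks_like_style_code(second)):
        warnings.warn(
            "Second argument %r is both a key in data and a valid style "
            "code; pass the style as a keyword to disambiguate." % (second,))
        return True
    return False


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    ambiguous = second_arg_is_ambiguous(("y", "ro"), {"y": [1], "ro": [2]})
    clear = second_arg_is_ambiguous(("y", "ro"), {"y": [1]})
```

Restricting the check to the two-argument call keeps it small; as the commit message notes, extending it to longer argument lists runs into many corner cases.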

tacaswell added the status: needs review label Jul 30, 2015

jankatins mentioned this pull request Jul 30, 2015

ENH: plotting methods can unpack labeled data [MOVED TO #4829] #4787

Closed

jankatins force-pushed the unpack_labeled_data_alternative branch from c48414a to 9411f86 on July 30, 2015 20:35

jankatins reviewed Jul 30, 2015

tacaswell added this to the next point release milestone Jul 31, 2015

jankatins force-pushed the unpack_labeled_data_alternative branch from 47268fb to a71d16e on August 2, 2015 23:29

tacaswell reviewed Aug 3, 2015

tacaswell and others added 2 commits September 8, 2015 12:31

MNT: use IPython's signature if needed + available

2b2092d

IPython ships a version of `Signature` and friends that runs on python >=2.7. Fall back to this version of things for python <=3.2.

Merge pull request #4 from tacaswell/unpack_labeled_data_alternative

572c1e2

MNT: use IPython's signature if needed + available

efiring reviewed Sep 8, 2015

tacaswell and others added 9 commits September 9, 2015 18:51

MNT: use already imported version variables

e2cf264

PRF: only try to use IPython if already imported

67fda44

Merge pull request #5 from tacaswell/unpack_labeled_data_alternative

59f3917

Vetter checks for python and IPython

FIX: guard signature import better

1c9feb8

MNT: use built-in logic for unwrapping

1d00767

MNT: rename get_index_y -> index_of

c454983

MNT: enable unpacking on 2D inputs

0d6cd40

MNT: invert logic of _has_varargs

0734b86

Merge pull request #6 from tacaswell/unpack_labeled_data_alternative

5fbe4b0

More fixes

shoyer reviewed Sep 11, 2015

shoyer mentioned this pull request Sep 11, 2015

pandas v0.17.0rc1 #5050

Closed

tacaswell added 4 commits September 12, 2015 16:35

MNT: simplify identifying valid style codes

f34aed0

If the second argument to `plot` is both in data and a valid style code warn the user.

MNT: rearrange testing helper functions

a051b57

MNT: expose hist(..., weight='key') to data kwarg

8271952

DOC: edits to whats_new entry

0b4fc7c

tacaswell mentioned this pull request Sep 12, 2015

Unpack labeled data alternative #5053

Merged

tacaswell added a commit that referenced this pull request Sep 13, 2015

Merge pull request #4829 from JanSchulz/unpack_labeled_data_alternative

f8ceb85

ENH: plotting methods can unpack labeled data

tacaswell merged commit f8ceb85 into matplotlib:master Sep 13, 2015

tacaswell removed the status: needs review label Sep 13, 2015

jklymak mentioned this pull request Jan 3, 2018

Question on docstring and signature of Axes.stem() #10151

Closed
