Enh mappable remapper #4490
Conversation
Draft of decorators to automatically map DataFrame columns to base objects for plotting.
@jenshnielsen You are a hero for sorting out all of the doc-related formatting issues.
@matplotlib/developers I would like to talk about this in my SciPy talk; any feedback would be appreciated.
def apply_args_mapping(ax, func, data, map_targets, *args, **kwargs):
    args = tuple(data[k] for k in map_targets) + args
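The diff above is an excerpt; a minimal self-contained sketch of how such a helper might be used follows. The call through to `func` and the `fake_plot` stand-in are my assumptions, not code from the PR:

```python
import numpy as np

def apply_args_mapping(ax, func, data, map_targets, *args, **kwargs):
    # Look up each named column in `data` and prepend the resulting
    # sequences to the positional arguments before calling through.
    args = tuple(data[k] for k in map_targets) + args
    return func(ax, *args, **kwargs)

# Hypothetical stand-in for an Axes plotting method.
def fake_plot(ax, x, y):
    return list(zip(x, y))

# A dict of arrays standing in for a DataFrame.
data = {'time': np.arange(3), 'signal': np.array([1.0, 4.0, 9.0])}
pairs = apply_args_mapping(None, fake_plot, data, ('time', 'signal'))
```

Because the lookup is plain `data[k]`, anything mapping-like (DataFrame, dict of arrays) works here.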
Why are you using the .values attribute for the kwargs cases but not for the args cases?
Oversight, but I am not fully sure which way is better. .values gets us a numpy array full-stop, but just [] gets us a Series, which more-or-less behaves like a numpy array.
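To make the distinction concrete, here is a small illustration (the throwaway `df` is my example, not code from the PR):

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3]})

s = df['a']           # plain indexing -> a Series (keeps the index)
arr = df['a'].values  # .values        -> a bare numpy ndarray
```

For most plotting purposes the Series behaves just like the array, which is why either choice works downstream.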
Not using .values also means that these decorators will work with dicts of numpy arrays. It might be better to use np.asarray.
Omitting .values also means it will work with a numpy structured array, but I'm not sure if there would be any point in this. Maybe use np.asanyarray in case something might yield a masked array? I presume with a DataFrame, missing data will end up as NaN after conversion, correct?
I think it is always NaN internally in pandas; they don't do masking/sparse at all.
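A quick check of the point above (my example, not code from the PR): missing data in a Series converts to NaN, not to a masked array, even under np.asanyarray:

```python
import numpy as np
import pandas as pd

# None becomes NaN inside the float Series.
s = pd.Series([1.0, None, 3.0])

# asanyarray would pass a masked array through unchanged,
# but pandas never produces one, so we get a plain ndarray.
arr = np.asanyarray(s)
```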
These decorators + #4488 should make it easy to wrap non-pandas-aware functions to play nicely with their new
Use `np.asarray` to ensure that the contents of the mapping are suitable to pass into mpl functions. This allows these functions to work with DataFrames, with dict-likes of lists/arrays, and (presumably) with xray objects.
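A sketch of what that normalization buys (hypothetical `data` dict; not code from the commit): `np.asarray` turns whatever the mapping holds, lists or arrays alike, into plain ndarrays before they reach mpl.

```python
import numpy as np

# A dict-like "data" mixing a plain list and an ndarray.
data = {'x': [0, 1, 2], 'y': np.array([0.0, 1.0, 4.0])}

# Normalize every column to an ndarray, as the commit message describes.
cols = {k: np.asarray(v) for k, v in data.items()}
```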
Any chance of some tests?
So requests for tests mean people like this idea? ;)
I have no idea, as I don't use pandas. The only concern I have comes from making mpl more difficult to contribute to; it adds an extra layer for developers to test...
These should be stand-alone helper functions; I do not see these getting
Ha! Caught me. I've just finished reading that thread 30 mins down...
My interest is piqued, though. I've not spent 30 seconds thinking about it, but I'm not convinced that the best solution was found in the discussion. A SciPy discussion in and of itself?
Had a chat with @ellisonbg about how to do this; he is suggesting an API more like:
which is what other plotting packages that take labeled data use. The idea is that we can implement a decorator that goes through the args and tries to replace any strings with the columns of the data. This has the advantage of not requiring a hard pandas dependency and will work with things like hdf5/netcdf4 objects or dicts of arrays. I am not a huge fan of this, as it just feels wrong that the (second) most important thing goes last. It also runs the risk of not doing this right due to the constraints of not using the fact that we know the input is a DataFrame, and providing a broken API is worse than not providing one. attn @efiring
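A rough sketch of the string-replacement idea described above. The decorator name, the `data` keyword, and `fake_plot` are my assumptions for illustration, not the PR's code:

```python
import functools
import numpy as np

def with_data(func):
    """When a `data` mapping is given, replace any positional string
    argument with the corresponding column of `data`."""
    @functools.wraps(func)
    def wrapper(*args, data=None, **kwargs):
        if data is not None:
            args = tuple(data[a] if isinstance(a, str) else a
                         for a in args)
        return func(*args, **kwargs)
    return wrapper

@with_data
def fake_plot(x, y):
    # Stand-in for a plotting call; just normalizes its inputs.
    return np.asarray(x), np.asarray(y)

d = {'time': [0, 1, 2], 'value': [1, 4, 9]}
x, y = fake_plot('time', 'value', data=d)
```

Since the lookup only needs `data[key]`, any dict-like (DataFrame, h5py/netCDF4 group, plain dict) works, which is the no-hard-pandas-dependency advantage mentioned above.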
The @ellisonbg suggestion looks good to me, in the sense that I think it provides an interface that will make sense to users and be easy and natural to use. I like the generality of allowing data to be anything that acts like a dict of arrays. Your "feels wrong" sensation is coming from having the
Closing this as cute, but not useful.
This came out of the discussion at pandas-dev/pandas#10129.
I think this will make the lives of pandas users much better (along with buckets and buckets of documentation).