Simplify intro tutorial re: asarray. #15994

Closed
wants to merge 1 commit into from

Contributor

@anntzer anntzer commented Dec 21, 2019

DataFrames can also be converted to numpy arrays using np.asarray; no
need to mention a separate way of converting them.

PR Summary

PR Checklist

  • Has Pytest style unit tests
  • Code is Flake 8 compliant
  • New features are documented, with examples if plot related
  • Documentation is sphinx and numpydoc compliant
  • Added an entry to doc/users/next_whats_new/ if major new feature (follow instructions in README.rst there)
  • Documented in doc/api/api_changes.rst if API changed in a backward-incompatible way

@timhoffm
Member

If it's always just numpy.asarray, IMHO we should put that at the beginning of the relevant functions. Doesn't cost anything if it's already an array. Telling people to always manually convert is a bit pedantic. We should change our policy to

All plotting functions expect `numpy.array` or `numpy.ma.masked_array` input, or objects that can be converted to these using `numpy.asarray` (such as `pandas.DataFrame` and `numpy.matrix`).
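
A minimal sketch of the "call `numpy.asarray` at the top of the function" idea discussed here. The helper name `coerce_input` is hypothetical and is not part of Matplotlib; it only illustrates that the conversion returns an existing array unchanged and turns other inputs into a base `ndarray`.

```python
import numpy as np


def coerce_input(data):
    """Hypothetical helper (not a Matplotlib API): coerce input to a base ndarray."""
    return np.asarray(data)


arr = np.arange(4.0)
same = coerce_input(arr)             # already an ndarray: returned without copying

from_list = coerce_input([1, 2, 3])  # a plain list becomes a 1-D ndarray

mat = np.matrix([[1, 2], [3, 4]])
from_matrix = coerce_input(mat)      # base-class ndarray, no longer an np.matrix
```

Because `np.asarray` hands back the same object when the input is already a base `ndarray` and no dtype is requested, adding the call costs essentially nothing for array input.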

@anntzer
Contributor Author

anntzer commented Dec 21, 2019

Well, we can't launder everything through asarray because of units. Lack of support for e.g. np.matrix was explicitly added in #3394.
Even if we were to change this policy, this part of the docs would need to stay until we have asarray() calls (+ whatever unit handling needs) in place everywhere.
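
The unit-handling case is not shown in this thread. As an analogous illustration only (an assumption on my part, using masked arrays rather than units), the sketch below shows how `np.asarray` can silently discard information carried by an array subclass, which is the general reason a blanket `asarray` call is not always safe.

```python
import numpy as np

masked = np.ma.masked_array([1.0, 2.0, 3.0], mask=[False, True, False])

# np.asarray converts to the base ndarray class, so the mask is dropped.
plain = np.asarray(masked)

# np.asanyarray passes ndarray subclasses through, so the mask survives.
kept = np.asanyarray(masked)
```

Here `plain` is an ordinary `ndarray` with no mask, while `kept` is still a masked array.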

@timhoffm
Member

I see. Then I'd leave this as is. IMHO df.values is more canonical than np.asarray(df). It's shorter to type (not much but still less annoying when working interactively), and people will not necessarily have numpy imported when working with pandas.

@timhoffm
Member

Actually to_numpy is preferred over values since pandas 0.24. We should advertise that.
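
For reference, a short sketch comparing the three spellings mentioned in this thread. The DataFrame contents are made up for illustration; `to_numpy` requires pandas 0.24 or newer.

```python
import numpy as np
import pandas as pd

# Illustrative data only.
df = pd.DataFrame({"x": [0.0, 1.0, 2.0], "y": [0.0, 1.0, 4.0]})

via_asarray = np.asarray(df)   # works with any pandas version
via_values = df.values         # older attribute-based spelling
via_to_numpy = df.to_numpy()   # needs pandas >= 0.24
```

For a homogeneous numeric frame like this one, all three give equal 2-D arrays of shape `(3, 2)`.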

@anntzer
Contributor Author

anntzer commented Dec 21, 2019

But we support older versions of pandas and I don't think we should start having to field questions from confused users using old pandas. (I don't think the usage tutorial should become a pandas tutorial either.)
I personally like asarray() because it always works (and that's what we use internally anyways); I guess we could just remove the examples here. Or just close the whole PR if you prefer, I'm not overly attached to it.

@timhoffm
Member

I'd leave as is; primarily telling "convert to numpy", plus "here is an example" is the right priority.

Anyway, thanks for thinking about how to improve the docs!

@timhoffm timhoffm closed this Dec 21, 2019
@anntzer anntzer deleted the asarray branch December 21, 2019 15:21