
Add PEP484 type hints to the code (For IDE autocompletion / hints) #13798


Closed
ikamensh opened this issue Mar 29, 2019 · 8 comments

Comments

@ikamensh

ikamensh commented Mar 29, 2019

My usage of matplotlib would be much easier if there were types in the code. I am not sure if boilerplate.py supports this, but it would be a big win in usability if implemented.
Return types are especially useful.

tacaswell added this to the unassigned milestone on Mar 29, 2019
@tacaswell
Member

tacaswell commented Mar 29, 2019

There are a number of important technical questions about how to do this:

  • in-line or as a pyi file?
    • I think that visual code can generate a starting point for the pyi file from watching usage
  • what should the types be?
    • It is my understanding that how to mark up numpy arrays as input/output is not yet settled
    • how do you deal with the very flexible input types?
  • is it worth doing just some typing but not all, or do we need to do everything for it to be worth it (à la const correctness in C++)?

The typing should be done in the underlying library, not just in pyplot.

Thinking about what it would take to do this completely, I think it would take 6-9 mo FTE of a senior engineer (the second question is the one that worries me) and would require coordination with {numpy, scipy, pandas} (if we use a different way of specifying the types of ndarrays or dataframes it would lead to minor chaos).

[EDIT, sorry, premature post for those following via email]

@clbarnes
Contributor

clbarnes commented Jun 24, 2019

I am strongly in favour of type hints, but IMO there's no necessity to make a concerted effort to sweep through the entire codebase adding them. The best way forward is to set a policy saying "we prefer new API surface be type-annotated", and encourage people to annotate at least the function signatures of code they would be fixing/refactoring anyway. This way, the development overhead is very small, but with potentially big gains for developers and users alike.

To be clear: matplotlib already uses type annotations: the decision has already been made that it is valuable for developers and users to have access to type information in function signatures. We're just not using the well-defined, tool-accessible annotations built into the language; they're written into docstrings, which is fragile, hard to check for correctness, and useless to many tools.
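
A minimal sketch of that difference, using two hypothetical functions (neither is matplotlib code): the docstring-only version exposes nothing to tooling, while the annotated one can be inspected with the standard library.

```python
from typing import List, get_type_hints


def scale_documented(values, factor=1.0):
    """Scale values by a factor.

    Parameters
    ----------
    values : list of float
    factor : float, default: 1.0

    Returns
    -------
    list of float
    """
    return [v * factor for v in values]


def scale_annotated(values: List[float], factor: float = 1.0) -> List[float]:
    """Scale values by a factor."""
    return [v * factor for v in values]


# Types written only in the docstring are invisible to introspection.
print(get_type_hints(scale_documented))

# Annotated signatures expose parameter and return types to any tool.
print(get_type_hints(scale_annotated))
```

Both functions behave identically at runtime; only the second gives a type checker or IDE something to work with.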

in-line or as a pyi file?

Inline reduces overhead. Pyi files are, IMO, most useful in legacy projects where type annotation syntax is not supported, but new matplotlib code isn't going to see anything below 3.6. It also reduces activation energy - when I am making changes to a codebase I am still learning about, it's helpful for me to annotate function signatures as I need to; I probably wouldn't bother if I had to create a new file and switch between them as I went.
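
For comparison, a small sketch of the two options. The function and the module name `shapes` are hypothetical, chosen only for illustration.

```python
# Inline: the annotations live next to the implementation
# (imagine this in a hypothetical module shapes.py).
def rectangle_area(width: float, height: float = 1.0) -> float:
    """Return the area of a width x height rectangle."""
    return width * height


# Stub: with the pyi approach, shapes.py would stay unannotated and a
# separate shapes.pyi would carry only the signature, for example:
#
#     def rectangle_area(width: float, height: float = ...) -> float: ...
#
# The stub has to be kept in sync with the implementation by hand.
```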

what should the types be?

It is my understanding that how to mark up numpy arrays as input/output is not yet settled

Indeed, numpy array/shape cannot be annotated currently. But that can be documented (as it is currently), and all of the other types can be annotated.

how do you deal with the very flexible input types?

The typing module allows a great deal of flexibility. For example, you can use Iterable or Sequence instead of saying "List but it'll work if it's a tuple, set, dict keys etc". You can give concrete types for items where appropriate, or Any if not. If you accept multiple types, you can use a Union; if you're using the same union over and over, you can give it a name with a type alias. If we did go forward with type hints, common types like colorspecs could go into a module for re-use:

from typing import Iterator, Optional, Sequence, Tuple, Union

RgbTuple = Tuple[float, float, float]
RgbaTuple = Tuple[float, float, float, float]
ColorSpec = Union[RgbTuple, RgbaTuple, str]
MultiColorSpec = Union[ColorSpec, Sequence[ColorSpec]]

def normalize_colors(color: Optional[MultiColorSpec] = None) -> Iterator[RgbaTuple]:
    # some logic
    yield r, g, b, a
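
A runnable variant of the sketch above, under stated assumptions: the `_NAMED` lookup table and the `_is_single` / `_to_rgba` helpers are hypothetical stand-ins for the elided logic, not matplotlib's actual color handling.

```python
from typing import Dict, Iterator, Optional, Sequence, Tuple, Union

RgbTuple = Tuple[float, float, float]
RgbaTuple = Tuple[float, float, float, float]
ColorSpec = Union[RgbTuple, RgbaTuple, str]
MultiColorSpec = Union[ColorSpec, Sequence[ColorSpec]]

# Hypothetical name table, for illustration only.
_NAMED: Dict[str, RgbaTuple] = {
    "red": (1.0, 0.0, 0.0, 1.0),
    "black": (0.0, 0.0, 0.0, 1.0),
}


def _is_single(color: MultiColorSpec) -> bool:
    """True for a color name or a flat sequence of 3 or 4 numbers."""
    if isinstance(color, str):
        return True
    return len(color) in (3, 4) and all(
        isinstance(c, (int, float)) for c in color
    )


def _to_rgba(color: ColorSpec) -> RgbaTuple:
    """Convert one color spec to RGBA, defaulting alpha to 1.0."""
    if isinstance(color, str):
        return _NAMED[color]
    r, g, b, *rest = color
    a = rest[0] if rest else 1.0
    return (float(r), float(g), float(b), float(a))


def normalize_colors(color: Optional[MultiColorSpec] = None) -> Iterator[RgbaTuple]:
    """Yield one RGBA tuple per color spec; yield nothing for None."""
    if color is None:
        return
    specs = [color] if _is_single(color) else color
    for spec in specs:
        yield _to_rgba(spec)
```

Note how much of the body is spent telling a single spec apart from a sequence of specs; the annotation makes that ambiguity visible up front.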

Going forward, overly-flexible APIs are arguably a smell. Another valuable outcome of type-hinting new code is that it forces the developer to think "It's pretty awkward to specify all of the ways this single function can be used - I wonder if that will make it hard to understand, too?". Again, I'm not saying we need to fundamentally change matplotlib's legacy code; just that new code can be different.

is it worth doing just some typing but not all or do we need to do everything for it to be worth it (a-la const correctness in c++)?

IMO, it is absolutely worth it even if we only do it gradually/partially. We've been adding type hints to a project for a few months and have found a whole slew of bugs, documentation errors, and dead code.

There was some discussion on this point in #14278.

@anntzer said they were considering bringing it up at last week's dev call - was there any discussion there?

@clbarnes
Contributor

@ikamensh could you make this issue more discoverable by changing the title? Referencing PEP484 would be helpful, and referring to "type hints" or "type annotations" is more clear - and there's no need to mention pycharm specifically, any IDE worth its salt should be working with type hints by now.

anntzer changed the title from "Add types to the code (For pycharm autocompletion / hints)" to "Add PEP484 type hints to the code (For IDE autocompletion / hints)" on Jun 24, 2019
@anntzer
Contributor

anntzer commented Jun 24, 2019

This was discussed at last week's dev call. The opinion of those present (just myself, @efiring and @story645) was to not use type hints for now. Myself and Eric don't really like type hints to start with, whereas Hannah seemed more open to them (@efiring @story645 feel free to correct me if I'm misrepresenting anyone's position), but we agreed that at least 1) numpy should first standardize how to represent concepts such as n-d array-like, broadcastable things, etc. before we consider adding type hints, and, 2) if anything, type hints should first be added to the most common entry points (subplots(), hist(), etc. (plot() is a bit hopeless anyways...) as these would benefit the most, and would likely exercise quite well all kinds of weird annotations that would be needed).
@tacaswell You may want to comment on this as well?

@ikamensh
Author

@tacaswell

how do you deal with the very flexible input types?

Perhaps a 'flexible type' could be annotated either with something very generic like Iterable / Collection or, if it is something more specific, with an ABC / interface that outlines which methods / fields must be present to work with it. It could also be a Union[numpy.ndarray, pandas.DataFrame, ..., Collection[Number]].
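
A rough sketch of the interface idea, assuming Python 3.8+ for `typing.Protocol`. The names `SupportsValues`, `PlotData`, `as_floats` and `Column` are hypothetical, and the `to_list()` requirement is an arbitrary choice for illustration rather than anything matplotlib, numpy or pandas defines here.

```python
from numbers import Number
from typing import Collection, List, Protocol, Union, runtime_checkable


@runtime_checkable
class SupportsValues(Protocol):
    """Hypothetical interface: any object exposing a to_list() method."""

    def to_list(self) -> list: ...


# Accept either an object satisfying the interface or a plain collection.
PlotData = Union[SupportsValues, Collection[Number]]


def as_floats(data: PlotData) -> List[float]:
    """Flatten any accepted input into a list of floats."""
    if isinstance(data, SupportsValues):
        data = data.to_list()
    return [float(x) for x in data]


class Column:
    """Toy container that satisfies SupportsValues without inheriting from it."""

    def __init__(self, values):
        self._values = list(values)

    def to_list(self) -> list:
        return list(self._values)


print(as_floats(Column([1, 2])))
print(as_floats((1, 2.5)))
```

The protocol is structural: `Column` never mentions `SupportsValues`, yet both the static checker and the runtime `isinstance` check accept it.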

@tacaswell
Member

TL;DR

For now I think that the typing system is still too immature for Matplotlib to adopt in-line; however, if people are interested in writing and supporting pyi files, I am open to matplotlib/matplotlib-typeshed.


@ikamensh you have captured the complexity of some of our allowed input types. Developing that ontology and type system which works uniformly across all of our functions / methods (and across numpy/scipy/pandas!) sounds like a major research project (hence my estimate of 6-9 mo FTE of a senior level person).

@clbarnes I think your example also makes this point, color is probably one of the

There is also the question of whether **kwargs should be typed as well. If we typed them all, in-line, my instinct is that it would be so verbose that it would be harmful to readability.
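
To make the trade-off concrete, a small sketch with two hypothetical signatures (not matplotlib's API): one annotates `**kwargs` as a whole, the other spells every keyword out.

```python
from typing import Any, Optional, get_type_hints


def draw_line(x: float, y: float, **kwargs: Any) -> None:
    """Compact: under PEP 484 every keyword value is typed as Any."""


def draw_line_verbose(
    x: float,
    y: float,
    *,
    color: str = "black",
    linewidth: float = 1.0,
    linestyle: str = "-",
    alpha: Optional[float] = None,
) -> None:
    """Precise, but the signature grows with every accepted keyword."""


# The compact form records a single entry for all keyword arguments.
print(get_type_hints(draw_line))

# The explicit form records one entry per keyword.
print(get_type_hints(draw_line_verbose))
```

The first form stays readable but tells a checker almost nothing about the keywords; the second is checkable but already spans several lines for just four options.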

@tacaswell
Member

I am going to close this issue. Thank you for your work and input, @ikamensh and @clbarnes.

@NeilGirdhar

This was a very reasonable decision last year, but now that numpy will have type hints in 1.20, is there any chance that matplotlib could reconsider type annotations?


7 participants