
Additional docstring recommendations #10225


Closed
timhoffm opened this issue Jan 11, 2018 · 11 comments

@timhoffm
Member

timhoffm commented Jan 11, 2018

As a follow-up to #9786 and, more generally, #7217.

Update: Completed items are struck through.

Proposal: Additional docstring recommendations

While looking at and writing more documentation, I've come across different conventions in several aspects of the docstrings.

While one should not be too pedantic, and the world does not end if there are some variations, a set of recommendations helps.

  • It helps the user: Recognizable conventions throughout the docs. Better readability.
  • It helps me as an author, because I don't always have to think anew about how to write certain patterns in the docs.

Structure of this proposal

For every aspect, I've written a text to be included in the documentation guide, shown as a quote:

Text for documentation.

Everything else is a description of the current state or an explanation. One may consider adding the rationales to the guide as well.

Parameter type descriptions

#### Number type
Proposal:

Use scalar instead of int or float if the type can be any number.

Current usage: Already often done, but not always. Just a hint for new writers.
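
A minimal sketch of how this recommendation could look in a numpydoc-style docstring. The function `scale_marker` and its parameters are hypothetical and only illustrate the type wording:

```python
def scale_marker(size, factor=1.0):
    """
    Return a marker size multiplied by a factor.

    Parameters
    ----------
    size : scalar
        The marker size in points. Any real number is accepted.
    factor : scalar, optional, default: 1.0
        The multiplier applied to *size*.

    Returns
    -------
    scalar
        The scaled marker size.
    """
    # Works for int and float alike, which is why the type reads "scalar".
    return size * factor
```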

#### array-like inputs
For array-like inputs, there are several notations:

  • array_like / array-like
  • sequence, sequence of [type]
  • array
  • iterable of [type]

Additionally, there are many variants to note length or dimensionality:

  • shape(N,) / shape(2,N) / shape 2xN
  • length N / length n
  • 2-tuple / length 2-tuple / tuple of length 2
  • 2 dimensional / 2-dimensional / two-dimensional / 2d / 2D / 2-D
    Note: according to this journal article 3D is more commonly used than 3-D.

suggestion t.b.d.

String values

Use simple single quotes when giving string values.

e.g.

If fmt is 'none', only the errorbars will be plotted.

Rationale: For the rendered docs, a literally-quoted string ``'none'`` would look a bit nicer (similar to 'none'), but that's too difficult to type and too difficult to read in plaintext docs.

Current usage: Already often done, but not always. Just a hint for new writers.
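
As an illustration, the quoting rule might be applied like this. `describe_plot` is a hypothetical helper written for this example, not a Matplotlib function:

```python
def describe_plot(fmt):
    """
    Describe what would be drawn for a given format string.

    Parameters
    ----------
    fmt : str
        The format for the data points. If 'none', only the errorbars
        will be plotted.
    """
    # The string value is written with simple single quotes in the
    # docstring, without double backticks around it.
    if fmt == 'none':
        return 'errorbars only'
    return 'errorbars and data points'
```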

### Citing other types

Do we use ~matplotlib.markers.MarkerStyle or .MarkerStyle?

Proposal:

Use full references `~matplotlib.markers.MarkerStyle` in parameter types. Use abbreviated links `.MarkerStyle` in the text.

e.g.

    norm : `~matplotlib.colors.Normalize`, optional, default: None
        A `.Normalize` instance is used to scale luminance data to 0, 1.

Rationale: The distinction is only relevant for the plaintext string. The full reference is required if I want to find the corresponding doc (e.g. ipython matplotlib.colors.Normalize?). Therefore we write it in the type definition. In a text passage, readability has a higher priority. Therefore, we use abbreviated references.

The displayed links of full references are abbreviated to the last component by prepending a tilde `~matplotlib.markers.MarkerStyle`. The abbreviation is not necessary if the reference consists of only one component anyway. For readability, do not use a tilde in these cases: `.MarkerStyle` not `~.MarkerStyle`.

Use partial references if the shortened form would be too generic, i.e. `.Bbox.ignore` instead of `~.Bbox.ignore` or `~matplotlib.transforms.Bbox.ignore`.

e.g.

This depends on the last value passed to `.Bbox.ignore`.

Rationale: The abbreviated form in the HTML should be easy to read. The link target gives the exact reference.
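
The effect of the tilde and the leading dot on the rendered link text can be sketched as follows. This is a simplified model of the behaviour described above, not Sphinx's actual code, and `displayed_text` is a hypothetical helper name:

```python
def displayed_text(reference):
    """Return the link text shown for a reference (simplified sketch)."""
    # A leading dot only affects how the target is looked up, not the text.
    text = reference.lstrip('.')
    # A leading tilde shortens the text to the last dotted component.
    if text.startswith('~'):
        text = text[1:].rsplit('.', 1)[-1]
    return text


for ref in ('~matplotlib.markers.MarkerStyle', '.MarkerStyle',
            '.Bbox.ignore', '~.Bbox.ignore'):
    print(ref, '->', displayed_text(ref))
```

Under this model `~.Bbox.ignore` is displayed as just "ignore", which is the too-generic case that the partial reference `.Bbox.ignore` avoids.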

### Citing rcParams

Current variants:

  • defaults to ``(lines.linewidth,)``
  • defaults to rc ``image.cmap`` / defaults to rc `image.origin`
  • default is taken from the rcParam ``hist.bins``
  • Default is ``rcParams['lines.markersize'] ** 2``
  • default to rc settings

We should create a label .. _rc-params here.

When citing rcParams use a reference to the label rc-params and the literal role for the parameter name.

e.g.

defaults to :ref:`rcParam<rc-params>` ``lines.linewidth``.

Rationale: A link to the rcParams is helpful, but it should be as short as possible in the plain text. One could also use :ref:`rc-params`, but the link text would then depend on the section title, which may not be good.

Note: If we switched the Sphinx default_role from obj to any, we could even leave out the :ref:.

Note 2: The object rcParams is currently sometimes referenced, but that does not contain a list and description of the values. Also, there are several ways to influence the rcParams. So the above reference is more appropriate than the object. We should add a link from the object.
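
A sketch of the proposed citation inside a docstring. It assumes the label `rc-params` has been created as suggested above; `draw_line` and the value 1.5 are placeholders for illustration, not Matplotlib API or a real default lookup:

```python
def draw_line(linewidth=None):
    """
    Hypothetical function, used only to show the citation style.

    Parameters
    ----------
    linewidth : scalar, optional
        The line width in points. Defaults to
        :ref:`rcParam<rc-params>` ``lines.linewidth``.
    """
    # Stand-in for a real rcParams lookup; 1.5 is an assumed value.
    default_linewidth = 1.5
    return default_linewidth if linewidth is None else linewidth
```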

Spaces after colon

Just noted that sometimes there's one space, sometimes there are two spaces. Not sure if it's worth mentioning.

@anntzer
Contributor

anntzer commented Jan 11, 2018

I have an implementation of a custom :rc: role so that one can write

:rc:`lines.linewidth`

and get this mapped to, well, whatever we want. The nice thing is that roles can be in the postfix position too, so we could even have a construct such as

the `lines.linewidth`:rcparam:

which seems very readable and lightweight in plain text.

The PR depends on #9708 though so I'll wait for that PR to be merged first.

Edit: actually not that hard to split out, see #10226.


For array(-like) parameter type descriptions we should probably use something close-ish to numpy's current effort (python/typing#516 and stuff linked therein).
For other parameters we should use typing-style annotations. There were some concerns that sphinx may or may not be able to support e.g. List[~.Axes], but if that's the case we should try to fix this on sphinx's side...

@timhoffm
Member Author

Note: handling of parameter lists updated: #10280

@anntzer
Contributor

anntzer commented Jan 27, 2018

Also, I think I prefer float to scalar, even though it is a slight abuse of language to make it mean "float-or-int" (I guess if you want to be really pedantic we usually mean https://docs.python.org/3/library/numbers.html#numbers.Real anyways...).

@tacaswell
Member

👍 all of these seem eminently reasonable.

@timhoffm
Member Author

I will recheck what other scientific libraries are using. It makes sense to try and use a common language. Then I will provide a pull request.

@timhoffm
Member Author

timhoffm commented Apr 2, 2018

Proposal for array-like types

There is no strict consensus among or even within other scientific libraries (checked numpy, scipy, pandas). The prevalent form is a variant of "array-like". So, I propose

Use array-like for homogeneous sequences, which could typically be a numpy.array.

If required, this can be amended with dimensions array-like(N,) / array-like(M, N).

float is the implicit default dtype. If required, an alternative type may be specified: array-like of int.
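
Put together, the three rules might read like this in a docstring. `weighted_sum` is a hypothetical function that works on plain Python sequences so the example stays self-contained:

```python
def weighted_sum(values, weights, indices=None):
    """
    Return the weighted sum of selected values.

    Parameters
    ----------
    values : array-like(N,)
        The input data.
    weights : array-like(N,)
        One weight per element of *values*.
    indices : array-like of int, optional
        Positions to include. By default, all positions are used.
    """
    # float is the implied dtype for values and weights; only the
    # integer-valued parameter names its dtype explicitly.
    if indices is None:
        indices = range(len(values))
    return sum(values[i] * weights[i] for i in indices)
```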

@fredrik-1
Contributor

I have continued to edit some docstrings (figure and pyplot) and I believe that, after some confusion, it is now quite clear how to write them. A few questions though.

Is there any convention for how to write True, False and None? I believe that they are written in all possible ways in the docstrings. I think doing nothing special works well, but single back ticks give a link to the python documentation and look good.

How to write example code in the docstrings? With .. code:: or with >>>? I believe that the first one is often the best, but including the result is better done with the latter (though that is probably not that common in matplotlib examples).

Things that have been confusing to me are whether the markups, :class:, :func: etc., should be used or not. I realize now that a section in the documentation was about those, but I thought markup there meant some other markup.

I also got confused that back ticks should not be used in the See Also section.

@timhoffm
Member Author

Personally, I currently use *True* etc. I find it appealing to visually offset these from regular text. I don't see the point in back-linking to the python documentation using single backticks. Also the link color makes these too visually present. In the rendered html, the <code> effect of double back ticks would be optimal. But I find double back ticks too distracting in the rst code ( ``True``). Just personal preference. I don't think we have a definitive rule for that, currently.

.. code:: and >>> are both used. It depends a bit on whether you'd rather have a single line with input and output or a multi-line code block. For .. code:: you should usually use the abbreviated form of just putting :: at the end of a paragraph (unless you need to specify .. code:: attributes).
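
Both forms side by side, as a sketch. `clip` is a hypothetical function chosen only because it has a short, checkable result:

```python
def clip(value, lower, upper):
    """
    Clamp *value* to the interval [*lower*, *upper*].

    A single input/output pair reads well as an interactive example:

    >>> clip(12, 0, 10)
    10

    A multi-line snippet reads better as a literal block, introduced by
    ending the paragraph with a double colon::

        for v in (-5, 3, 12):
            print(clip(v, 0, 10))
    """
    return max(lower, min(value, upper))
```

Only the >>> example is picked up by the standard library's doctest module; the literal block after :: is rendered as code but not executed.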

:class: etc should be left out where possible to improve readability (i.e. where there is not an ambiguity in the name).

I'm not quite sure right now about See Also. I assume that this is something managed within numpydoc and thus works without quotes.

@fredrik-1
Contributor

fredrik-1 commented Jun 14, 2018

Ok, some other comments in regard to kwargs and what to link to.

I found it frustrating that the Axes kwargs are not included in the natural place in the documentation. Wouldn't it be better to just make some kind of hack to get them into the documentation compared to waiting for someone writing a good solution?

I also looked at the Axes string that can be used in dedent_interpd. sharex and sharey were not included, probably because they don't have a setter. It is easy to be confused, though, if not all possible keywords are included in that string.

How to link to the documentation when subclasses are involved? axes are often not axes.Base classes. I believe that the _AxesBase class is not included in the documentation and can't be linked to. The classes and functions in axes._subplots can't be linked to for the same reason; are they not public even though they are the result type of several functions? I guess that the general subplot class can't be linked to because it is dynamic, but the Subplot class should exist.

What about including _AxesBase in the documentation and maybe make it public?
edit: I see now that all axes I know of have axes.Axes as parent, so there is probably no need for _AxesBase.

What is the best way to look at very long docstrings?

edit: Is there really any reason to include the kwargs list in methods? For example in plot.

The table is one click away in the docs and the table often looks bad in a shell and makes the docstring much longer.

@timhoffm
Member Author

timhoffm commented Jun 14, 2018

I cannot really say something on the Axes topic. Didn’t have time to look into it yet.

What is the best way to look at very long docstrings?

Not sure what you mean with that. HTML page?

The kwarg tables are debatable. They’ve been around for a long time. On the plus side, it’s more obvious which values are supported. With a link, you’d still have to follow the link and find the relevant part of the target docstring (the link goes to the top of the docstring, not to the list). That’s not impossible, but a distraction. In the shell (or in jupyter notebooks) you don’t have the link and it’s difficult to get the information there. Therefore, I’m +0.5 on keeping the kwarg lists.

@timhoffm
Member Author

Most of the recommendations have been incorporated into the documentation guide. Currently, I don't see the need for further work.
