Move "howto interpreting box plots" to boxplot docstring #19860

Merged 1 commit on Apr 4, 2021

19 changes: 0 additions & 19 deletions doc/faq/howto_faq.rst
@@ -494,25 +494,6 @@ you're all done issuing commands and you want to draw the figure now.
per script, and harmonized the behavior of interactive mode, across most
backends.

.. _howto-boxplot_violinplot:

Interpreting box plots and violin plots
---------------------------------------

Tukey's :doc:`box plots </gallery/statistics/boxplot_demo>` (Robert McGill,
John W. Tukey and Wayne A. Larsen: "The American Statistician" Vol. 32, No. 1,
Feb., 1978, pp. 12-16) are statistical plots that provide useful information
about the data distribution such as skewness. However, bar plots with error
bars are still the common standard in most scientific literature, and thus, the
interpretation of box plots can be challenging for the unfamiliar reader. The
figure below illustrates the different visual features of a box plot.

.. figure:: ../_static/boxplot_explanation.png

:doc:`Violin plots </gallery/statistics/violinplot>` are closely related to box
plots but add useful information such as the distribution of the sample data
(density trace). Violin plots were added in Matplotlib 1.4.

.. _how-to-threads:

Working with threads
22 changes: 22 additions & 0 deletions lib/matplotlib/axes/_axes.py
@@ -3714,6 +3714,28 @@ def boxplot(self, x, notch=None, sym=None, vert=None, whis=None,
meanprops : dict, default: None
The style of the mean.

Notes
-----
Box plots provide insight into distribution properties of the data.
However, they can be challenging to interpret for the unfamiliar
reader. The figure below illustrates the different visual features of
a box plot.

Review comment by @anntzer (Contributor), Apr 4, 2021:

consider also linking to the website version so that people displaying the docstring at their terminal have somewhere to click

or alternatively just replace the png by some ascii art:

       Q1-1.5IQR   Q1   median  Q3   Q3+1.5IQR
                    |-----:-----|
    .      |--------|     :     |--------|   .  .
                    |-----:-----|
 outlier            <----------->
                         IQR

Reply from the Member Author:

There's a link to wikipedia, which IMHO is sufficient.

Furthermore, this note only gives additional context. I assume that only a minority of people will try to learn from the docstring (it's mostly used for looking up the signature and specific parameters). I don't want to jump through extra hoops here. You are welcome to do more yourself.


.. image:: /_static/boxplot_explanation.png
   :alt: Illustration of box plot features
   :scale: 50 %

The whiskers mark the range of the non-outlier data. The most common
definition of non-outlier is ``[Q1 - 1.5xIQR, Q3 + 1.5xIQR]``, which
is also the default in this function. Other whisker meanings can be
applied via the *whis* parameter.

See `Box plot <https://en.wikipedia.org/wiki/Box_plot>`_ on Wikipedia
for further information.

Violin plots (`~.Axes.violinplot`) add even more detail about the
statistical distribution by plotting the kernel density estimation
(KDE) as an estimation of the probability density function.
"""

# Missing arguments default to rcParams.
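
For readers of this thread, the whisker rule from the new Notes section (``[Q1 - 1.5xIQR, Q3 + 1.5xIQR]``) can be sketched in plain Python. This is only an illustration, not the code Matplotlib runs: the helper names `percentile` and `tukey_summary` are hypothetical, quartiles are assumed to use linear interpolation, and the whisker ends are assumed to snap to the most extreme data points that lie inside the fences.

```python
def percentile(sorted_values, q):
    """Linearly interpolated percentile (q in [0, 100]) of a sorted list."""
    pos = (len(sorted_values) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (pos - lo)


def tukey_summary(data, whis=1.5):
    """Hypothetical helper: box, whisker, and outlier values for one sample."""
    values = sorted(data)
    q1 = percentile(values, 25)
    med = percentile(values, 50)
    q3 = percentile(values, 75)
    iqr = q3 - q1

    # Fences: the non-outlier range [Q1 - whis*IQR, Q3 + whis*IQR].
    lo_fence = q1 - whis * iqr
    hi_fence = q3 + whis * iqr

    # Assumption: whiskers end at the most extreme data inside the fences;
    # everything outside the fences is reported as an outlier ("flier").
    inside = [v for v in values if lo_fence <= v <= hi_fence]
    fliers = [v for v in values if v < lo_fence or v > hi_fence]

    return {
        "q1": q1, "med": med, "q3": q3, "iqr": iqr,
        "whislo": inside[0], "whishi": inside[-1], "fliers": fliers,
    }


summary = tukey_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
print(summary["whislo"], summary["whishi"], summary["fliers"])  # 1 9 [30]
```

Passing a larger `whis` to this sketch widens the fences, so fewer points are reported as outliers; the docstring's *whis* parameter plays the analogous role in `boxplot`.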