MEP12 improvements for statistics plots #7331

Merged
merged 1 commit into from
Oct 25, 2016
10 changes: 8 additions & 2 deletions examples/statistics/boxplot_color_demo.py
@@ -1,10 +1,16 @@
"""Box plots with custom fill colors.
"""
=================================
Box plots with custom fill colors
=================================

This plot illustrates how to create two types of box plots
(rectangular and notched), and how to fill them with custom
colors by accessing the properties of the artists of the
box plots. Additionally, the ``labels`` parameter is used to
provide x-tick labels for each sample.

A good general reference on boxplots and their history can be found
here: http://vita.had.co.nz/papers/boxplots.pdf
"""

import matplotlib.pyplot as plt
@@ -15,7 +21,7 @@
all_data = [np.random.normal(0, std, size=100) for std in range(1, 4)]
labels = ['x1', 'x2', 'x3']

fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(12, 5))
fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(9, 4))

# rectangular box plot
bplot1 = axes[0].boxplot(all_data,
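
For readers skimming this hunk, the technique the new docstring describes (filling rectangular and notched boxes by reaching into the returned artists) can be sketched roughly as follows. This is not the code from the PR: the seed, the color names and the `tick_names` variable are assumptions made for illustration, and the x-tick labels are set with `set_xticklabels` rather than the `labels` keyword so the sketch does not depend on how a given Matplotlib version spells that parameter.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.RandomState(0)  # fixed seed: an assumption, for reproducibility
all_data = [rng.normal(0, std, size=100) for std in range(1, 4)]
tick_names = ['x1', 'x2', 'x3']                    # hypothetical sample names
fill_colors = ['pink', 'lightblue', 'lightgreen']  # illustrative choices

fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(9, 4))

# patch_artist=True makes each box a fillable patch instead of a plain line
rect = axes[0].boxplot(all_data, patch_artist=True)
notched = axes[1].boxplot(all_data, notch=True, patch_artist=True)

for ax, bplot in zip(axes, (rect, notched)):
    # the returned dict maps component names ('boxes', 'medians', ...) to artists
    for patch, color in zip(bplot['boxes'], fill_colors):
        patch.set_facecolor(color)
    # boxplot places the boxes at x = 1..N, so naming those ticks is enough
    ax.set_xticklabels(tick_names)
```

The key design point is that `boxplot` returns a dictionary of artists, so any styling that has no dedicated keyword can still be applied after the call.
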
22 changes: 19 additions & 3 deletions examples/statistics/boxplot_demo.py
@@ -1,5 +1,19 @@
"""
Demo of the new boxplot functionality
=========================================
Demo of artist customization in box plots
=========================================

This example demonstrates how to use the various kwargs
to fully customize box plots. The first figure demonstrates
how to remove and add individual components (note that the
mean is the only value not shown by default). The second
figure demonstrates how the styles of the artists can
be customized. It also demonstrates how to set the limit
of the whiskers to specific percentiles (lower right axes).

A good general reference on boxplots and their history can be found
here: http://vita.had.co.nz/papers/boxplots.pdf

"""

import numpy as np
@@ -23,7 +37,8 @@
axes[0, 2].set_title('showmeans=True,\nmeanline=True', fontsize=fs)

axes[1, 0].boxplot(data, labels=labels, showbox=False, showcaps=False)
axes[1, 0].set_title('Tufte Style \n(showbox=False,\nshowcaps=False)', fontsize=fs)
tufte_title = 'Tufte Style \n(showbox=False,\nshowcaps=False)'
axes[1, 0].set_title(tufte_title, fontsize=fs)

axes[1, 1].boxplot(data, labels=labels, notch=True, bootstrap=10000)
axes[1, 1].set_title('notch=True,\nbootstrap=10000', fontsize=fs)
@@ -62,7 +77,8 @@
showmeans=True)
axes[1, 0].set_title('Custom mean\nas point', fontsize=fs)

axes[1, 1].boxplot(data, meanprops=meanlineprops, meanline=True, showmeans=True)
axes[1, 1].boxplot(data, meanprops=meanlineprops, meanline=True,
                   showmeans=True)
axes[1, 1].set_title('Custom mean\nas line', fontsize=fs)

axes[1, 2].boxplot(data, whis=[15, 85])
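
The `whis=[15, 85]` call above sets the whisker limits to percentiles instead of the default 1.5 * IQR rule. A small sketch of what that means numerically, using `matplotlib.cbook.boxplot_stats` (the helper that computes box plot statistics) on made-up lognormal data; the seed and sample size are assumptions, not values taken from this hunk:

```python
import numpy as np
from matplotlib import cbook

rng = np.random.RandomState(937)   # assumed seed, for reproducibility only
data = rng.lognormal(size=200)     # skewed sample so both whiskers differ

# One dict of statistics per dataset; a 1-D input gives a single dict
stats = cbook.boxplot_stats(data, whis=[15, 85])[0]
p15, p85 = np.percentile(data, [15, 85])

# The whiskers stop at the most extreme data points that still lie
# inside the requested percentile limits; everything beyond is a flier.
print("15th percentile:", p15, "lower whisker:", stats['whislo'])
print("85th percentile:", p85, "upper whisker:", stats['whishi'])
print("number of fliers:", len(stats['fliers']))
```

Because the whiskers snap to actual observations, they sit at or just inside the percentile values rather than exactly on them.
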
26 changes: 18 additions & 8 deletions examples/statistics/boxplot_vs_violin_demo.py
@@ -1,19 +1,29 @@
"""Box plot - violin plot comparison.
"""
===================================
Box plot vs. violin plot comparison
===================================

Note that although violin plots are closely related to Tukey's (1977)
box plots, they add useful information such as the distribution of the
sample data (density trace).

Note that although violin plots are closely related to Tukey's (1977) box
plots, they add useful information such as the distribution of the sample
data (density trace).
By default, box plots show data points outside 1.5 * the inter-quartile
range as outliers above or below the whiskers whereas violin plots show
the whole range of the data.

By default, box plots show data points outside 1.5 x the inter-quartile range
as outliers above or below the whiskers whereas violin plots show the whole
range of the data.
A good general reference on boxplots and their history can be found
here: http://vita.had.co.nz/papers/boxplots.pdf

Violin plots require matplotlib >= 1.4.

For more information on violin plots, the scikit-learn docs have a great
section: http://scikit-learn.org/stable/modules/density.html
"""

import matplotlib.pyplot as plt
import numpy as np

fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(12, 5))
fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(9, 4))

# generate some random test data
all_data = [np.random.normal(0, std, 100) for std in range(6, 10)]
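
The docstring's point about outliers can be made concrete with a few lines of NumPy. This is a sketch of the default 1.5 * IQR rule only, on an invented sample with two planted extreme values; it does not reproduce the plotting code of the example.

```python
import numpy as np

rng = np.random.RandomState(42)  # assumed seed
# 100 standard-normal points plus two deliberately extreme values
sample = np.concatenate([rng.normal(0, 1, 100), [8.0, -9.0]])

q1, q3 = np.percentile(sample, [25, 75])
iqr = q3 - q1                      # inter-quartile range
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# A box plot draws these points individually, beyond the whiskers
outliers = sample[(sample < lower_fence) | (sample > upper_fence)]

# A violin plot instead spans the full range of the data
violin_range = (sample.min(), sample.max())
print("outliers:", outliers)
print("violin range:", violin_range)
```
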
22 changes: 18 additions & 4 deletions examples/statistics/bxp_demo.py
@@ -1,5 +1,17 @@
"""
Demo of the new boxplot drawer function
===================================
Demo of the boxplot drawer function
===================================

This example demonstrates how to pass pre-computed box plot
statistics to the box plot drawer. The first figure demonstrates
how to remove and add individual components (note that the
mean is the only value not shown by default). The second
figure demonstrates how the styles of the artists can
be customized.

A good general reference on boxplots and their history can be found
here: http://vita.had.co.nz/papers/boxplots.pdf
"""

import numpy as np
@@ -21,6 +33,7 @@
stats[n]['mean'] *= 2

print(stats[0].keys())

fs = 10 # fontsize

# demonstrate how to toggle the display of different elements:
@@ -35,7 +48,8 @@
axes[0, 2].set_title('showmeans=True,\nmeanline=True', fontsize=fs)

axes[1, 0].bxp(stats, showbox=False, showcaps=False)
axes[1, 0].set_title('Tufte Style\n(showbox=False,\nshowcaps=False)', fontsize=fs)
tufte_title = 'Tufte Style\n(showbox=False,\nshowcaps=False)'
axes[1, 0].set_title(tufte_title, fontsize=fs)

axes[1, 1].bxp(stats, shownotches=True)
axes[1, 1].set_title('notch=True', fontsize=fs)
@@ -50,7 +64,6 @@
fig.subplots_adjust(hspace=0.4)
plt.show()


# demonstrate how to customize the display of different elements:
boxprops = dict(linestyle='--', linewidth=3, color='darkgoldenrod')
flierprops = dict(marker='o', markerfacecolor='green', markersize=12,
@@ -71,7 +84,8 @@
showmeans=True)
axes[1, 0].set_title('Custom mean\nas point', fontsize=fs)

axes[1, 1].bxp(stats, meanprops=meanlineprops, meanline=True, showmeans=True)
axes[1, 1].bxp(stats, meanprops=meanlineprops, meanline=True,
               showmeans=True)
axes[1, 1].set_title('Custom mean\nas line', fontsize=fs)

for ax in axes.flatten():
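
As a compact illustration of the workflow this file demonstrates (compute the statistics once, optionally edit them, then hand them to the drawer), here is a minimal sketch. The data shape and seed are assumptions; doubling the mean mirrors the `stats[n]['mean'] *= 2` line visible in the hunk above and simply shows that `bxp` draws whatever numbers it is given.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib import cbook

rng = np.random.RandomState(937)       # assumed seed
data = rng.lognormal(size=(37, 4))     # 4 samples (columns) of 37 points each

# One dict per column with keys such as 'med', 'q1', 'q3', 'whislo', 'fliers'
stats = cbook.boxplot_stats(data)

# The statistics are plain dicts, so they can be altered before drawing
for s in stats:
    s['mean'] *= 2

fig, ax = plt.subplots()
artists = ax.bxp(stats, showmeans=True)  # draws from the dicts, not the raw data

drawn_means = [m.get_ydata()[0] for m in artists['means']]
print(sorted(stats[0].keys()))
```

Separating the statistics from the drawing step is useful when the raw data are too large to keep around or when the summary values come from another tool.
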
73 changes: 50 additions & 23 deletions examples/statistics/customized_violin_demo.py
@@ -1,6 +1,18 @@
# Customizing violin plots
#
#
"""
=================================
Demo of violin plot customization
=================================

This example demonstrates how to fully customize violin plots.
The first plot shows the default style by providing only
the data. The second plot first limits what matplotlib draws
with additional kwargs. Then a simplified representation of
a box plot is drawn on top. Lastly, the styles of the artists
of the violins are modified.

For more information on violin plots, the scikit-learn docs have a great
section: http://scikit-learn.org/stable/modules/density.html
"""

import matplotlib.pyplot as plt
import numpy as np
@@ -22,21 +34,28 @@ def percentile(vals, p):
def adjacent_values(vals):
    q1 = percentile(vals, 0.25)
    q3 = percentile(vals, 0.75)
    uav = q3 + (q3-q1)*1.5
    iqr = q3 - q1 # inter-quartile range

    # upper adjacent values
    uav = q3 + iqr * 1.5
    if uav > vals[-1]:
        uav = vals[-1]
    if uav < q3:
        uav = q3
    lav = q1 - (q3-q1)*1.5

    # lower adjacent values
    lav = q1 - iqr * 1.5
    if lav < vals[0]:
        lav = vals[0]
    if lav > q1:
        lav = q1
    return [lav, uav]


# create test data
dat = [np.random.normal(0, std, 100) for std in range(6, 10)]
lab = ['a', 'b', 'c', 'd'] # labels
np.random.seed(123)
dat = [np.random.normal(0, std, 100) for std in range(1, 5)]
lab = ['A', 'B', 'C', 'D'] # labels
med = [] # medians
iqr = [] # inter-quartile ranges
avs = [] # upper and lower adjacent values
@@ -47,31 +66,39 @@ def adjacent_values(vals):
avs.append(adjacent_values(sarr))

# plot the violins
fig, ax = plt.subplots(nrows=1, ncols=1, figsize=(7, 5))
parts = ax.violinplot(dat, showmeans=False, showmedians=False,
                      showextrema=False)
fig, (ax1, ax2) = plt.subplots(nrows=1, ncols=2, figsize=(9, 4),
                               sharey=True)
_ = ax1.violinplot(dat)
parts = ax2.violinplot(dat, showmeans=False, showmedians=False,
                       showextrema=False)

ax1.set_title('Default violin plot')
ax2.set_title('Customized violin plot')

# plot medians and averages
# plot whiskers as thin lines, quartiles as fat lines,
# and medians as points
for i in range(len(med)):
    ax.plot([i+1, i+1], avs[i], '-', c='black', lw=1)
    ax.plot([i+1, i+1], iqr[i], '-', c='black', lw=5)
    ax.plot(i+1, med[i], 'o', mec='none', c='white', ms=6)
    # whiskers
    ax2.plot([i + 1, i + 1], avs[i], '-', color='black', linewidth=1)
    ax2.plot([i + 1, i + 1], iqr[i], '-', color='black', linewidth=5)
    ax2.plot(i + 1, med[i], 'o', color='white',
             markersize=6, markeredgecolor='none')

# customize colors
for pc in parts['bodies']:
    pc.set_facecolor('#D43F3A')
    pc.set_edgecolor('black')
    pc.set_alpha(1)

ax.get_xaxis().set_tick_params(direction='out')
ax.xaxis.set_ticks_position('bottom')
ax.set_xticks([x+1 for x in range(len(lab))])
ax.set_xticklabels(lab)
ax.set_xlim(0.25, len(lab)+0.75)
ax.set_ylabel('ylabel')
ax.set_xlabel('xlabel')
ax.set_title('customized violin plot')
ax1.set_ylabel('Observed values')
for ax in [ax1, ax2]:
    ax.get_xaxis().set_tick_params(direction='out')
    ax.xaxis.set_ticks_position('bottom')
    ax.set_xticks(np.arange(1, len(lab) + 1))
    ax.set_xticklabels(lab)
    ax.set_xlim(0.25, len(lab) + 0.75)
    ax.set_xlabel('Sample name')

plt.subplots_adjust(bottom=0.15)
plt.subplots_adjust(bottom=0.15, wspace=0.05)

plt.show()
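
The customization steps in this file (suppress the default markers, restyle the violin bodies, overlay a simplified box plot) can be condensed into the sketch below. It is an approximation, not the PR's code: `np.percentile` and `np.clip` stand in for the file's hand-written `percentile` helper and `if` chains, `vlines` and `scatter` replace the per-violin `plot` calls, and the `adjacent_values` signature here (taking `q1` and `q3` as arguments) is a hypothetical variant.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np


def adjacent_values(sorted_vals, q1, q3):
    """Clip the 1.5 * IQR fences to the data range (input must be sorted)."""
    iqr = q3 - q1
    upper = np.clip(q3 + 1.5 * iqr, q3, sorted_vals[-1])
    lower = np.clip(q1 - 1.5 * iqr, sorted_vals[0], q1)
    return lower, upper


rng = np.random.RandomState(123)
data = [np.sort(rng.normal(0, std, 100)) for std in range(1, 5)]

# Row-wise quartiles and medians: each result has one entry per sample
q1s, meds, q3s = np.percentile(data, [25, 50, 75], axis=1)
whiskers = np.array([adjacent_values(d, q1, q3)
                     for d, q1, q3 in zip(data, q1s, q3s)])
positions = np.arange(1, len(data) + 1)  # violinplot draws at x = 1..N

fig, ax = plt.subplots(figsize=(4.5, 4))

# Draw only the density bodies, then restyle them through the returned dict
parts = ax.violinplot(data, showmeans=False, showmedians=False,
                      showextrema=False)
for body in parts['bodies']:
    body.set_facecolor('#D43F3A')
    body.set_edgecolor('black')
    body.set_alpha(1)

# Simplified box plot on top: thin whiskers, thick quartile bar, white median
ax.vlines(positions, whiskers[:, 0], whiskers[:, 1], color='black', linewidth=1)
ax.vlines(positions, q1s, q3s, color='black', linewidth=5)
ax.scatter(positions, meds, marker='o', color='white', s=30, zorder=3)
```
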
12 changes: 10 additions & 2 deletions examples/statistics/errorbar_demo.py
@@ -1,12 +1,20 @@
"""
Demo of the errorbar function.
=============================
Demo of the errorbar function
=============================

This exhibits the most basic use of the error bar method.
In this case, constant values are provided for the error
in both the x- and y-directions.
"""

import numpy as np
import matplotlib.pyplot as plt

# example data
x = np.arange(0.1, 4, 0.5)
y = np.exp(-x)

plt.errorbar(x, y, xerr=0.2, yerr=0.4)
fig, ax = plt.subplots()
ax.errorbar(x, y, xerr=0.2, yerr=0.4)
plt.show()
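
For context on what the `ax.errorbar` call returns, the following sketch repeats the same constant-error call and inspects the result. The unpacking into three parts relies on the container behaving as a tuple of (data line, cap lines, bar line collections); the variable names are my own.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(0.1, 4, 0.5)
y = np.exp(-x)

fig, ax = plt.subplots()
# Scalars are broadcast: every point gets +/-0.2 in x and +/-0.4 in y
container = ax.errorbar(x, y, xerr=0.2, yerr=0.4)

# The container unpacks into the data line, the caps and the bar collections
data_line, caplines, barlinecols = container
print(container.has_xerr, container.has_yerr, len(barlinecols))
```

Using the object-oriented `ax.errorbar` form, as the PR does, keeps the returned artists tied to an explicit axes rather than to pyplot's implicit current axes.
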
39 changes: 24 additions & 15 deletions examples/statistics/errorbar_demo_features.py
@@ -1,37 +1,46 @@
"""
Demo of errorbar function with different ways of specifying error bars.
===================================================
Demo of the different ways of specifying error bars
===================================================

Errors can be specified as a constant value (as shown in `errorbar_demo.py`),
or as demonstrated in this example, they can be specified by an N x 1 or 2 x N,
where N is the number of data points.
Errors can be specified as a constant value (as shown in
`errorbar_demo.py`). However, this example demonstrates
how they vary by specifying arrays of error values.

Review comment (Member): So this line was wrapped just fine before?

N x 1:
Error varies for each point, but the error values are symmetric (i.e. the
lower and upper values are equal).
If the raw ``x`` and ``y`` data have length N, there are two options:

2 x N:
Error varies for each point, and the lower and upper limits (in that order)
are different (asymmetric case)
Array of shape (N,):
Error varies for each point, but the error values are
symmetric (i.e. the lower and upper values are equal).

In addition, this example demonstrates how to use log scale with errorbar.
Array of shape (2, N):
Error varies for each point, and the lower and upper limits
(in that order) are different (asymmetric case)

In addition, this example demonstrates how to use log
scale with error bars.
"""

import numpy as np
import matplotlib.pyplot as plt

# example data
x = np.arange(0.1, 4, 0.5)
y = np.exp(-x)

# example error bar values that vary with x-position
error = 0.1 + 0.2 * x
# error bar values w/ different -/+ errors
lower_error = 0.4 * error
upper_error = error
asymmetric_error = [lower_error, upper_error]

fig, (ax0, ax1) = plt.subplots(nrows=2, sharex=True)
ax0.errorbar(x, y, yerr=error, fmt='-o')
ax0.set_title('variable, symmetric error')

# error bar values w/ different -/+ errors that
# also vary with the x-position
lower_error = 0.4 * error
upper_error = error
asymmetric_error = [lower_error, upper_error]

ax1.errorbar(x, y, xerr=asymmetric_error, fmt='o')
ax1.set_title('variable, asymmetric error')
ax1.set_yscale('log')
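
The two array shapes named in the new docstring can be checked directly. The sketch below rebuilds the same error arrays, verifies their shapes, and reads back the horizontal bar endpoints to show that row 0 of a `(2, N)` array is the lower error and row 1 the upper error. Wrapping the list in `np.array` and the `xbars`/`segments` names are additions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(0.1, 4, 0.5)
y = np.exp(-x)

error = 0.1 + 0.2 * x                  # shape (N,): symmetric, varies per point
lower_error = 0.4 * error
upper_error = error
asymmetric_error = np.array([lower_error, upper_error])  # shape (2, N)

fig, (ax0, ax1) = plt.subplots(nrows=2, sharex=True)
ax0.errorbar(x, y, yerr=error, fmt='-o')
ax0.set_title('variable, symmetric error')

asym = ax1.errorbar(x, y, xerr=asymmetric_error, fmt='o')
ax1.set_title('variable, asymmetric error')
ax1.set_yscale('log')                  # y stays positive, so a log axis is safe

# Only x errors were given, so there is a single collection of bar segments
_, _, (xbars,) = asym
segments = xbars.get_segments()        # one [[x_lo, y], [x_hi, y]] per point
print(segments[0])
```
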