Updated violin plot example as per suggestions in issue #7251 #7360

Merged: 9 commits, Oct 31, 2016
Made variable names more meaningful
DrNightmare committed Oct 30, 2016
commit 14cd6cfd72a1e2d30e8e87448ee73be29bc19c30
43 changes: 19 additions & 24 deletions examples/statistics/customized_violin_demo.py
@@ -20,15 +20,14 @@

 def adjacent_values(vals):
     q1, q3 = np.percentile(vals, [25, 75])
-    # inter-quartile range iqr
-    iqr = q3 - q1
-    # upper adjacent values
-    uav = q3 + iqr * 1.5
-    uav = np.clip(uav, q3, vals[-1])
-    # lower adjacent values
-    lav = q1 - iqr * 1.5
-    lav = np.clip(lav, vals[0], q1)
-    return [lav, uav]
+    inter_quartile_range = q3 - q1
+
+    upper_adjacent_value = q3 + inter_quartile_range * 1.5
+    upper_adjacent_value = np.clip(upper_adjacent_value, q3, vals[-1])
+
+    lower_adjacent_value = q1 - inter_quartile_range * 1.5
+    lower_adjacent_value = np.clip(lower_adjacent_value, vals[0], q1)
+    return [lower_adjacent_value, upper_adjacent_value]
Member

Probably doesn't need to be bracketed with []. I wonder if it can be used to remove the zip(*...) stuff in the unpacking...

Contributor Author @DrNightmare, Oct 31, 2016

To avoid the unpacking, I could also make 'whiskers' an np.array and then do:
whiskersMin, whiskersMax = whiskers[:, 0], whiskers[:, 1]

Whether or not I do this, the brackets can be removed.

Member

Then I think it makes sense to do that.

Contributor Author

Yes, updated in the latest commit.
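The suggestion in this thread can be sketched roughly as follows. This is an illustration rather than the code that was finally committed: it assumes `adjacent_values` returns a plain tuple and that the per-group results are collected into a NumPy array. The names `whiskers_min` and `whiskers_max` are hypothetical, snake_case stand-ins for the `whiskersMin` and `whiskersMax` mentioned above.

```python
import numpy as np


def adjacent_values(vals):
    """Return (lower, upper) adjacent values for a sorted 1-D sequence."""
    q1, q3 = np.percentile(vals, [25, 75])
    inter_quartile_range = q3 - q1

    upper_adjacent_value = q3 + inter_quartile_range * 1.5
    upper_adjacent_value = np.clip(upper_adjacent_value, q3, vals[-1])

    lower_adjacent_value = q1 - inter_quartile_range * 1.5
    lower_adjacent_value = np.clip(lower_adjacent_value, vals[0], q1)
    # No list brackets needed: a tuple works just as well here.
    return lower_adjacent_value, upper_adjacent_value


np.random.seed(123)
data = [sorted(np.random.normal(0, std, 100)) for std in range(1, 5)]

# One row per group; columns are (lower, upper) adjacent values.
whiskers = np.array([adjacent_values(sorted_array) for sorted_array in data])

# Column slicing replaces the zip(*...) unpacking.
whiskers_min, whiskers_max = whiskers[:, 0], whiskers[:, 1]
```

Note that `adjacent_values` relies on its input being sorted, since it reads the extremes from `vals[0]` and `vals[-1]`.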



 def set_axis_style(ax, labels):
@@ -42,19 +41,19 @@ def set_axis_style(ax, labels):

 # create test data
 np.random.seed(123)
-dat = [sorted(np.random.normal(0, std, 100)) for std in range(1, 5)]
+data = [sorted(np.random.normal(0, std, 100)) for std in range(1, 5)]

 fig, (ax1, ax2) = plt.subplots(nrows=1, ncols=2, figsize=(9, 4), sharey=True)

 # plot the default violin
 ax1.set_title('Default violin plot')
 ax1.set_ylabel('Observed values')
-ax1.violinplot(dat)
+ax1.violinplot(data)

 # customized violin
Member

stray redundant comment

Contributor Author @DrNightmare, Oct 31, 2016

Hmm, right. I guess the titles we set for the axes are self-explanatory.
Updated with the comments removed, @story645.

 ax2.set_title('Customized violin plot')
 parts = ax2.violinplot(
-        dat, showmeans=False, showmedians=False,
+        data, showmeans=False, showmedians=False,
         showextrema=False)

 # customize colors
Member

probably also unnecessary comment

@@ -63,21 +62,17 @@ def set_axis_style(ax, labels):
     pc.set_edgecolor('black')
     pc.set_alpha(1)

-# medians
-med = [np.percentile(sarr, 50) for sarr in dat]
-# inter-quartile ranges
-iqr = [[np.percentile(sarr, 25), np.percentile(sarr, 75)] for sarr in dat]
-# upper and lower adjacent values
-avs = [adjacent_values(sarr) for sarr in dat]
+medians = np.percentile(data, 50, axis=1)
+inter_quartile_ranges = list(zip(*(np.percentile(data, [25, 75], axis=1))))
Member

Don't need the list(zip(*...)); use .T as noted before.

Contributor Author

Agreed 👍, I have made the change as suggested in the latest commit.
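As a small sketch of why `.T` can stand in for `list(zip(*...))` here: with `axis=1`, `np.percentile` returns one row per requested percentile, so the transpose gives one `(q1, q3)` row per group. The variable names `quartiles` and `zipped` are illustrative only and do not appear in the PR.

```python
import numpy as np

np.random.seed(123)
data = [sorted(np.random.normal(0, std, 100)) for std in range(1, 5)]

# Shape (2, 4): row 0 holds the 25th percentiles, row 1 the 75th.
quartiles = np.percentile(data, [25, 75], axis=1)

# Version from the diff: a list of (q1, q3) tuples, one per group.
zipped = list(zip(*quartiles))

# Reviewer's suggestion: the transpose has shape (4, 2), one row per group.
inter_quartile_ranges = quartiles.T
```

Indexing `inter_quartile_ranges[i]` yields the same pair of values either way, so the plotting loop does not need to change.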

+whiskers = [adjacent_values(sorted_array) for sorted_array in data]

 # plot whiskers as thin lines, quartiles as fat lines,
 # and medians as points
-for i, median in enumerate(med):
-    # whiskers
-    ax2.plot([i + 1, i + 1], avs[i], '-', color='black', linewidth=1)
-    # quartiles
-    ax2.plot([i + 1, i + 1], iqr[i], '-', color='black', linewidth=5)
-    # medians
+for i, median in enumerate(medians):
+    ax2.plot([i + 1, i + 1], whiskers[i], '-', color='black', linewidth=1)
+    ax2.plot(
+        [i + 1, i + 1], inter_quartile_ranges[i], '-', color='black',
+        linewidth=5)
     ax2.plot(
         i + 1, median, 'o', color='white',
         markersize=6, markeredgecolor='none')
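One possible alternative to the per-violin loop, shown only as a sketch and not as the code in this commit, is to draw all whiskers, quartile bars, and medians with vectorized `Axes.vlines` and `Axes.scatter` calls. The whisker computation below is a vectorized restatement of `adjacent_values`; the names `positions`, `whiskers_min`, and `whiskers_max` are assumptions made for this illustration, and the `Agg` backend is selected only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np

np.random.seed(123)
data = np.asarray(
    [sorted(np.random.normal(0, std, 100)) for std in range(1, 5)])

quartile1, medians, quartile3 = np.percentile(data, [25, 50, 75], axis=1)

# Vectorized adjacent values: 1.5 * IQR fences clipped to the data range.
# The rows of `data` are sorted, so the extremes are the first/last columns.
iqr = quartile3 - quartile1
whiskers_min = np.clip(quartile1 - 1.5 * iqr, data[:, 0], quartile1)
whiskers_max = np.clip(quartile3 + 1.5 * iqr, quartile3, data[:, -1])

fig, ax = plt.subplots()
ax.violinplot(list(data), showmeans=False, showmedians=False,
              showextrema=False)

# Violins are drawn at x = 1, 2, ..., n by default.
positions = np.arange(1, len(medians) + 1)

# Whiskers as thin lines, quartiles as fat lines, medians as points.
whisker_lines = ax.vlines(positions, whiskers_min, whiskers_max,
                          color='black', linewidth=1)
quartile_lines = ax.vlines(positions, quartile1, quartile3,
                           color='black', linewidth=5)
median_points = ax.scatter(positions, medians, marker='o', color='white',
                           s=30, zorder=3)
```

The `zorder=3` on the scatter call is there so the white median markers sit on top of the thick quartile bars.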