Skip to content

Histogram range appears to be calculated incorrectly when using "bins='auto'" with 2D data #8778

Closed
@seth-a

Description

@seth-a

Bug report

Bug summary

Using 'auto' to compute the number of bins with 2D data only uses the first column to compute the range. This causes all other columns to be incorrectly displayed.

Calculating histograms using bins='auto' with numpy gives the correct behavior.
Code for reproduction

from pylab import *

# four values to examine
x = randn(1000, 4)*0.1 + arange(4)/7

figure()
ax1 = subplot(211)
plt.hist(x, histtype='step')
legend(['1', '2', '3', '4'])
ax2 = subplot(212)
plt.hist(x, bins='auto', histtype='step')
legend(['1', '2', '3', '4'])

# Match axes so it's clear where the problem is
ax2.set_xlim(ax1.get_xlim())
plt.show()

# Check that numpy doesn't have this bug
_, bins_default = np.histogram(x)
_, bins_auto = np.histogram(x, bins='auto')
assert(np.min(bins_default) == np.min(bins_auto))
assert(np.max(bins_default) == np.max(bins_auto))

Actual outcome
The range of the histogram corresponds to the range of the last column of data for all columns, see second subplot.

Expected outcome
The range of the histogram should correspond to the min/max of all data, as it is in the first subplot of the example code.

Matplotlib version

  • Operating System: Ubuntu
  • Matplotlib Version: 2.0.0
  • Python Version: 3.5.2

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions