Histogram range appears to be calculated incorrectly when using "bins='auto'" with 2D data #8778

Closed
seth-a opened this issue Jun 19, 2017 · 4 comments

@seth-a
seth-a commented Jun 19, 2017

Bug report

Bug summary

Using 'auto' to compute the number of bins with 2D data only uses the first column to compute the range. This causes all other columns to be incorrectly displayed.

Calculating histograms using bins='auto' with numpy gives the correct behavior.
Code for reproduction

import numpy as np
import matplotlib.pyplot as plt

# Four columns of data to examine, each with a different mean
x = np.random.randn(1000, 4) * 0.1 + np.arange(4) / 7

plt.figure()
ax1 = plt.subplot(211)
plt.hist(x, histtype='step')
plt.legend(['1', '2', '3', '4'])
ax2 = plt.subplot(212)
plt.hist(x, bins='auto', histtype='step')
plt.legend(['1', '2', '3', '4'])

# Match axes so it's clear where the problem is
ax2.set_xlim(ax1.get_xlim())
plt.show()

# Check that numpy doesn't have this bug
_, bins_default = np.histogram(x)
_, bins_auto = np.histogram(x, bins='auto')
assert np.min(bins_default) == np.min(bins_auto)
assert np.max(bins_default) == np.max(bins_auto)

Actual outcome
The range of the histogram corresponds to the range of the first column of data for all columns; see the second subplot.

Expected outcome
The range of the histogram should correspond to the min/max of all data, as it is in the first subplot of the example code.
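Until this is fixed, one possible workaround (a sketch, not something confirmed by the maintainers in this thread) is to compute the bin edges once over all of the data with numpy and reuse them explicitly. The variable names `edges` and `counts_per_column` are illustrative only.

```python
import numpy as np

rng = np.random.RandomState(0)
# Same shape of data as in the reproduction code above
x = rng.randn(1000, 4) * 0.1 + np.arange(4) / 7

# np.histogram flattens its input, so 'auto' sees every column at once
_, edges = np.histogram(x, bins='auto')

# Reusing the same explicit edges for each column keeps the histograms
# comparable and covers the min/max of all data
counts_per_column = [np.histogram(col, bins=edges)[0] for col in x.T]
```

Passing the same array to matplotlib, as in `plt.hist(x, bins=edges, histtype='step')`, should then avoid the string handling entirely, since an explicit sequence of edges is used as given.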

Matplotlib version

  • Operating System: Ubuntu
  • Matplotlib Version: 2.0.0
  • Python Version: 3.5.2
@afvincent
Contributor

afvincent commented Jun 19, 2017

It looks to me like the problem may come from this section in axes.hist:

for i in xrange(nx):
    # this will automatically overwrite bins,
    # so that each histogram uses the same bins
    m, bins = np.histogram(x[i], bins, weights=w[i], **hist_kwargs)

If the bins parameter that is initially passed to hist is "auto"¹, then bins becomes a sequence of bin edges after the first iteration, and that sequence is reused in all following iterations. In other words, the edges are computed from the first dataset only. My first idea would be to add some supplementary logic (in the loop?) to handle the case where bins is a string.

¹ : maybe I am wrong, but it seems one can pass any string that is valid for numpy.histogram, right? Is it deliberate that this is not mentioned in the docstring of ax.hist? Otherwise, it looks like a pretty nice but unfortunately hidden feature ^^.
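The effect described above can be reproduced with numpy alone. The following is only a minimal sketch that mimics the loop's reassignment of `bins`; it is not the actual matplotlib code, and the names `edges_seen` and `in_range_last` are made up for the illustration.

```python
import numpy as np

rng = np.random.RandomState(0)
# Four columns with different means, as in the report
x = rng.randn(1000, 4) * 0.1 + np.arange(4) / 7

bins = 'auto'
edges_seen = []
for col in x.T:
    # On the first pass `bins` is the string 'auto'; np.histogram returns
    # an array of edges, which then replaces the string for later passes
    counts, bins = np.histogram(col, bins)
    edges_seen.append(bins)

# `counts` now belongs to the last column, binned with the first column's edges
in_range_last = counts.sum()
print(bins[0], bins[-1])            # range of the shared edges
print(x[:, 0].min(), x[:, 0].max()) # range of the first column
print(in_range_last, "of", len(x), "samples of the last column were counted")
```

Samples that fall outside the shared edges are silently dropped by `np.histogram`, which is consistent with the truncated curves in the second subplot.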

Edit: English...

@WeatherGod
Member

WeatherGod commented Jun 19, 2017 via email

@tacaswell
Member

Closing as a duplicate of #8636

@afvincent
Contributor

Indeed (and the ongoing PR #8638 may fix this).
