
matplotlib.units.ConversionError on scatter of dates with a NaN in the first position #14356


Closed
Dapid opened this issue May 28, 2019 · 16 comments · Fixed by #14429
Labels: Release critical (for bugs that make the library unusable, e.g. segfaults or incorrect plots, and major regressions); topic: units and array ducktypes

@Dapid

Dapid commented May 28, 2019

Bug report

Bug summary

In a scatter plot, when the first y value is NaN and the x values are dates, I get the error:

matplotlib.units.ConversionError: Failed to convert value(s) to axis units: masked_array

Code for reproduction

import numpy as np
import pylab as plt

times = np.arange('2005-02', '2005-03', dtype='datetime64[D]')

y = np.random.random(size=len(times))
y[0] = np.nan
plt.scatter(times, y)

plt.show()

Actual outcome

On Matplotlib 3.0.0, I get no warning.

On 3.1.0, I get an error.

Traceback (most recent call last):
  File "/home/david/.virtualenv/py36/lib/python3.6/site-packages/matplotlib/axis.py", line 1551, in convert_units
    ret = self.converter.convert(x, self.units, self)
  File "/home/david/.virtualenv/py36/lib/python3.6/site-packages/matplotlib/dates.py", line 2011, in convert
    return date2num(value)
  File "/home/david/.virtualenv/py36/lib/python3.6/site-packages/matplotlib/dates.py", line 426, in date2num
    return _to_ordinalf_np_vectorized(d)
  File "/home/david/.virtualenv/py36/lib/python3.6/site-packages/numpy/lib/function_base.py", line 2091, in __call__
    return self._vectorize_call(func=func, args=vargs)
  File "/home/david/.virtualenv/py36/lib/python3.6/site-packages/numpy/lib/function_base.py", line 2161, in _vectorize_call
    ufunc, otypes = self._get_ufunc_and_otypes(func=func, args=args)
  File "/home/david/.virtualenv/py36/lib/python3.6/site-packages/numpy/lib/function_base.py", line 2121, in _get_ufunc_and_otypes
    outputs = func(*inputs)
  File "/home/david/.virtualenv/py36/lib/python3.6/site-packages/matplotlib/dates.py", line 226, in _to_ordinalf
    base = float(dt.toordinal())
AttributeError: 'numpy.float64' object has no attribute 'toordinal'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/david/.virtualenv/py36/lib/python3.6/site-packages/matplotlib/backends/backend_qt5.py", line 501, in _draw_idle
    self.draw()
  File "/home/david/.virtualenv/py36/lib/python3.6/site-packages/matplotlib/backends/backend_agg.py", line 388, in draw
    self.figure.draw(self.renderer)
  File "/home/david/.virtualenv/py36/lib/python3.6/site-packages/matplotlib/artist.py", line 38, in draw_wrapper
    return draw(artist, renderer, *args, **kwargs)
  File "/home/david/.virtualenv/py36/lib/python3.6/site-packages/matplotlib/figure.py", line 1709, in draw
    renderer, self, artists, self.suppressComposite)
  File "/home/david/.virtualenv/py36/lib/python3.6/site-packages/matplotlib/image.py", line 135, in _draw_list_compositing_images
    a.draw(renderer)
  File "/home/david/.virtualenv/py36/lib/python3.6/site-packages/matplotlib/artist.py", line 38, in draw_wrapper
    return draw(artist, renderer, *args, **kwargs)
  File "/home/david/.virtualenv/py36/lib/python3.6/site-packages/matplotlib/axes/_base.py", line 2645, in draw
    mimage._draw_list_compositing_images(renderer, self, artists)
  File "/home/david/.virtualenv/py36/lib/python3.6/site-packages/matplotlib/image.py", line 135, in _draw_list_compositing_images
    a.draw(renderer)
  File "/home/david/.virtualenv/py36/lib/python3.6/site-packages/matplotlib/artist.py", line 38, in draw_wrapper
    return draw(artist, renderer, *args, **kwargs)
  File "/home/david/.virtualenv/py36/lib/python3.6/site-packages/matplotlib/collections.py", line 866, in draw
    Collection.draw(self, renderer)
  File "/home/david/.virtualenv/py36/lib/python3.6/site-packages/matplotlib/artist.py", line 38, in draw_wrapper
    return draw(artist, renderer, *args, **kwargs)
  File "/home/david/.virtualenv/py36/lib/python3.6/site-packages/matplotlib/collections.py", line 257, in draw
    transform, transOffset, offsets, paths = self._prepare_points()
  File "/home/david/.virtualenv/py36/lib/python3.6/site-packages/matplotlib/collections.py", line 229, in _prepare_points
    xs = self.convert_xunits(offsets[:, 0])
  File "/home/david/.virtualenv/py36/lib/python3.6/site-packages/matplotlib/artist.py", line 180, in convert_xunits
    return ax.xaxis.convert_units(x)
  File "/home/david/.virtualenv/py36/lib/python3.6/site-packages/matplotlib/axis.py", line 1554, in convert_units
    f'units: {x!r}') from e
matplotlib.units.ConversionError: Failed to convert value(s) to axis units: masked_array(data=[--, 731979.0, 731980.0, 731981.0, 731982.0, 731983.0,
                   731984.0, 731985.0, 731986.0, 731987.0, 731988.0,
                   731989.0, 731990.0, 731991.0, 731992.0, 731993.0,
                   731994.0, 731995.0, 731996.0, 731997.0, 731998.0,
                   731999.0, 732000.0, 732001.0, 732002.0, 732003.0,
                   732004.0, 732005.0],
             mask=[ True, False, False, False, False, False, False, False,
                   False, False, False, False, False, False, False, False,
                   False, False, False, False, False, False, False, False,
                   False, False, False, False],
       fill_value=1e+20)

Matplotlib version

  • Operating system:
  • Matplotlib version: 3.1.0
  • Matplotlib backend (print(matplotlib.get_backend())): Qt5Agg
  • Python version: 3.6

Matplotlib installed through pip on a virtual environment

@jklymak
Member

jklymak commented May 28, 2019

Minimizing this would be helpful because the following works fine:

import matplotlib.pyplot as plt
import numpy as np
f, ax = plt.subplots()
times = np.arange('2005-02', '2005-03', dtype='datetime64[D]')
y = np.linspace(0, 10, len(times))
y[3] = 1
y[10] = 1
y[21] = 1
ax.scatter(times, [x if x != 1 else np.nan for x in y], color='b', marker='o')
plt.show()

jklymak added the "status: needs clarification" label May 28, 2019
@Dapid
Author

Dapid commented May 29, 2019

I managed to find a minimal example:

import numpy as np
import pylab as plt

times = np.arange('2005-02', '2005-03', dtype='datetime64[D]')

y = np.random.random(size=len(times))
y[0] = np.nan
plt.scatter(times, y)

plt.show()

It fails when the first y value passed to scatter is NaN and the x values are dates.

Dapid changed the title from "matplotlib.units.ConversionError under Matplotib 3.1" to "matplotlib.units.ConversionError on scatter of dates with a NaN in the first position" May 29, 2019
@anntzer
Contributor

anntzer commented May 29, 2019

Looks like

diff --git i/lib/matplotlib/units.py w/lib/matplotlib/units.py
index b4677bdd3..d77ed8c5b 100644
--- i/lib/matplotlib/units.py
+++ w/lib/matplotlib/units.py
@@ -123,34 +123,38 @@ class ConversionInterface:
     @staticmethod
     def is_numlike(x):
         """
         The Matplotlib datalim, autoscaling, locators etc work with scalars
         which are the units converted to floats given the current unit.  The
         converter may be passed these floats, or arrays of them, even when
         units are set.
         """
         if np.iterable(x):
             for thisx in x:
+                if thisx is ma.masked:
+                    continue
                 return isinstance(thisx, Number)
         else:
             return isinstance(x, Number)
 
     @staticmethod
     def is_natively_supported(x):
         """
         Return whether *x* is of a type that Matplotlib natively supports or
         *x* is array of objects of such types.
         """
         # Matplotlib natively supports all number types except Decimal
         if np.iterable(x):
             # Assume lists are homogeneous as other functions in unit system
             for thisx in x:
+                if thisx is ma.masked:
+                    continue
                 return (isinstance(thisx, Number) and
                         not isinstance(thisx, Decimal))
         else:
             return isinstance(x, Number) and not isinstance(x, Decimal)
 
 
 class DecimalConverter(ConversionInterface):
     """
     Converter for decimal.Decimal data to float.
     """

fixes the issue as of master, which makes sense. (No tests provided :p)

There are some more questions, e.g. if thisx is ma.masked, do we continue iterating (as this patch does), or do we just unmask the array to start with and expect the unmasked values to have correct type?

Also only the change to is_natively_supported is necessary here, but I think the change to is_numlike is likely necessary in other cases too.
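
To illustrate why skipping masked entries matters, here is a minimal standalone sketch of the first-element type check, with and without the proposed skip. The names `first_element_check` and `first_unmasked_check` are hypothetical; they only mirror the logic of the diff above and are not Matplotlib API.

```python
from decimal import Decimal
from numbers import Number

import numpy as np
import numpy.ma as ma


def first_element_check(x):
    """Judge an iterable by its first element (logic before the patch)."""
    if np.iterable(x):
        for thisx in x:
            return isinstance(thisx, Number) and not isinstance(thisx, Decimal)
    else:
        return isinstance(x, Number) and not isinstance(x, Decimal)


def first_unmasked_check(x):
    """Same check, but skip masked entries first (logic of the patch)."""
    if np.iterable(x):
        for thisx in x:
            if thisx is ma.masked:
                continue
            return isinstance(thisx, Number) and not isinstance(thisx, Decimal)
    else:
        return isinstance(x, Number) and not isinstance(x, Decimal)


# Floats with the first entry masked, similar to the array in the traceback.
values = ma.masked_array([731978.0, 731979.0, 731980.0],
                         mask=[True, False, False])

print(first_element_check(values))   # decided by the masked constant
print(first_unmasked_check(values))  # decided by the first unmasked float
```

In this sketch the first call evaluates to False, because the masked constant is not a `Number`, while the second evaluates to True. Note that a fully masked input falls through the loop, so `first_unmasked_check` returns None for it.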

anntzer added this to the v3.1.1 milestone May 29, 2019
anntzer added the "Release critical" and "topic: units and array ducktypes" labels and removed the "status: needs clarification" label May 29, 2019
@jklymak
Member

jklymak commented May 30, 2019

This works fine for plot, so yet again it's a case of scatter being special. Why is the y data being masked by scatter in the first place?

@tacaswell
Member

This is part of the work on scatter to make it handle updates to datasets that initially had some missing data more gracefully. With plot we always have exactly 2 vectors to work with; with scatter we have up to 4 (x, y, size, color).
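
A minimal sketch of that idea, assuming a point should be dropped whenever any of its per-point arrays holds an invalid value. `mask_jointly` is a hypothetical helper written for illustration; it is not how Matplotlib implements this internally.

```python
import numpy as np
import numpy.ma as ma


def mask_jointly(*arrays):
    """Return masked copies of *arrays* that all share one combined mask.

    A position is masked in every output if it is NaN or inf in any input.
    """
    data = [np.asarray(a, dtype=float) for a in arrays]
    # One boolean mask per input, True where the value is not finite.
    masks = [~np.isfinite(d) for d in data]
    combined = np.logical_or.reduce(masks)
    return [ma.masked_array(d, mask=combined) for d in data]


x = [731978.0, 731979.0, 731980.0]  # dates already converted to floats
y = [np.nan, 0.4, 0.7]              # first y value is missing
sizes = [20.0, 20.0, np.nan]        # last size is missing

mx, my, ms = mask_jointly(x, y, sizes)
print(mx)
```

Although `x` contains no NaN itself, its first entry ends up masked because `y[0]` is NaN. That is consistent with the masked array of date floats shown in the traceback above.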

@Dapid
Author

Dapid commented Jul 11, 2019

The bug is still there when everything is NaN:

import numpy as np
import pylab as plt

times = np.arange('2005-02', '2005-03', dtype='datetime64[D]')

y = np.random.random(size=len(times))
y[:] = np.nan
plt.scatter(times, y)

plt.show()

@timhoffm
Member

timhoffm commented Jul 11, 2019

What exactly would we expect to happen if all data points are NaN?

Similar open issues for all-NaN: #14439, #14124

@Dapid
Author

Dapid commented Jul 11, 2019

I would expect it to behave as it does for individual nans: not plot anything.

Background of my motivation: I am using NaNs to mask away data points that I want to draw with different alphas. Usually, every plot contains both kinds, so it works, but today I found one where all of the points were of the same kind, and the other series was fully masked. I have added a check to skip it altogether, but it used to work on earlier versions of Matplotlib, so it should still work.
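
A guard of the kind described above could look like the following sketch. `plottable` is a hypothetical helper, and filtering out non-finite y values before calling scatter is an assumption about one possible workaround, not the code actually used here.

```python
import numpy as np


def plottable(times, y):
    """Return only the (times, y) pairs whose y value is finite."""
    times = np.asarray(times)
    y = np.asarray(y, dtype=float)
    keep = np.isfinite(y)
    return times[keep], y[keep]


times = np.arange('2005-02', '2005-03', dtype='datetime64[D]')
y = np.full(len(times), np.nan)  # a series that is masked away entirely

t_ok, y_ok = plottable(times, y)
# Only call plt.scatter(t_ok, y_ok) when something is left to draw.
should_plot = t_ok.size > 0
print(should_plot)
```

When at least one y value is finite, `t_ok` and `y_ok` hold just those points and can be passed to scatter as usual.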

@wardafiaz

fig,ax= plt.subplots(figsize=(15,5))

ax.plot(consum_test.index,consum_test["PJME_MW"],label="Actual")
ax.plot(consum_test.index,consum_test["Prediction"],alpha=.5,zorder=10,label="Predicted")

consum = consum_test["PJME_MW"]
pred = consum_test["Prediction"]

plt.fill_between(consum_test.index, consum,pred, facecolor="green", alpha=.2,label="Difference")
from datetime import datetime

ax.set_ylim(25000, 45000)
ax.set_xbound(lower="2017-03-12 00:00:00", upper="2017-03-12 23:30:00")

plt.xlabel("Date", alpha=0.75, weight="bold")
plt.ylabel("Consumption", alpha=0.75, weight="bold")

plt.xticks(alpha=0.75,weight="bold", fontsize=11)
plt.yticks(alpha=0.75,weight="bold", fontsize=11)

plt.title("Period with the worst hourly prediction", alpha=0.75, weight="bold", fontsize=15, pad=10, loc="left")
plt.legend()

=========================================================================

This is my code, and it gives the following error:


IndexError Traceback (most recent call last)
~\Anaconda3\lib\site-packages\matplotlib\axis.py in convert_units(self, x)
1522 try:
-> 1523 ret = self.converter.convert(x, self.units, self)
1524 except Exception as e:

~\Anaconda3\lib\site-packages\matplotlib\dates.py in convert(value, unit, axis)
1895 """
-> 1896 return date2num(value)
1897

~\Anaconda3\lib\site-packages\matplotlib\dates.py in date2num(d)
424 return d
--> 425 tzi = getattr(d[0], 'tzinfo', None)
426 if tzi is not None:

IndexError: too many indices for array: array is 0-dimensional, but 1 were indexed

The above exception was the direct cause of the following exception:

ConversionError Traceback (most recent call last)
in
12
13 ax.set_ylim(25000, 45000)
---> 14 ax.set_xbound(lower="2017-03-12 00:00:00", upper="2017-03-12 23:30:00")
15
16 plt.xlabel("Date", alpha=0.75, weight="bold")

~\Anaconda3\lib\site-packages\matplotlib\axes\_base.py in set_xbound(self, lower, upper)
3168 upper = old_upper
3169
-> 3170 self.set_xlim(sorted((lower, upper),
3171 reverse=bool(self.xaxis_inverted())),
3172 auto=None)

~\Anaconda3\lib\site-packages\matplotlib\axes\_base.py in set_xlim(self, left, right, emit, auto, xmin, xmax)
3292
3293 self._process_unit_info(xdata=(left, right))
-> 3294 left = self._validate_converted_limits(left, self.convert_xunits)
3295 right = self._validate_converted_limits(right, self.convert_xunits)
3296

~\Anaconda3\lib\site-packages\matplotlib\axes\_base.py in _validate_converted_limits(self, limit, convert)
3206 """
3207 if limit is not None:
-> 3208 converted_limit = convert(limit)
3209 if (isinstance(converted_limit, Real)
3210 and not np.isfinite(converted_limit)):

~\Anaconda3\lib\site-packages\matplotlib\artist.py in convert_xunits(self, x)
173 if ax is None or ax.xaxis is None:
174 return x
--> 175 return ax.xaxis.convert_units(x)
176
177 def convert_yunits(self, y):

~\Anaconda3\lib\site-packages\matplotlib\axis.py in convert_units(self, x)
1523 ret = self.converter.convert(x, self.units, self)
1524 except Exception as e:
-> 1525 raise munits.ConversionError('Failed to convert value(s) to axis '
1526 f'units: {x!r}') from e
1527 return ret

ConversionError: Failed to convert value(s) to axis units: '2017-03-12 00:00:00'

@wardafiaz

How can I solve this? Please help.

@jklymak
Member

jklymak commented Jun 18, 2021

@wardafiaz please take user help to https://discourse.matplotlib.org! (But "2017-03-12 00:00:00" is a string. Convert it to a date to use it as a limit.)
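
As a rough sketch of that conversion, the two limit strings from the post above can be parsed with the standard library before being handed to the axes. The format string `FMT` is an assumption based on how those strings are written.

```python
from datetime import datetime

# Assumed layout of the limit strings used in the post above.
FMT = "%Y-%m-%d %H:%M:%S"

lower = datetime.strptime("2017-03-12 00:00:00", FMT)
upper = datetime.strptime("2017-03-12 23:30:00", FMT)

print(lower, upper)
```

With `ax` being the Axes from that post, the limits would then be set with `ax.set_xbound(lower=lower, upper=upper)` instead of passing the raw strings.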

@Adi07-Nerd

Could anyone provide a solution to this problem? Thanks in advance.

@jklymak
Member

jklymak commented Jan 17, 2022

This issue is closed. Can you post a new issue with a self-contained example? Thanks...

@Adi07-Nerd

Adi07-Nerd commented Jan 18, 2022 via email

@charlie83xt

Hey @Adi07-Nerd, did you find a solution to your problem? I have already spent a good couple of hours trying to fix the issue below:

for dataframe in to_merge:
    
    names = dataframe.columns
    name = names[2]
    x = dataframe['Timestamp']
    y = dataframe['Manual_Count'].fillna(0)
    y[0] = np.nan
    z = dataframe.iloc[:,2].fillna(0)
    z[0] = np.nan
    plt.bar(x, [x if x != 1 else np.nan for x in y], label='Manual Count', width=0.2, color='cyan')
    plt.bar(x, [x if x != 1 else np.nan for x in z], label= 'Sensor data', width=0.2, color='indianred')
    plt.legend()
    ax = plt.gca()
    ax.xaxis.set_major_formatter(mdates.DateFormatter('%d-%m-%Y'))
    plt.gcf().autofmt_xdate() # Rotation
    plt.ylabel('Occupancy Count')
    plt.title(f'Manual Count vs Sensor Count for {name}')
    plt.show()

When plotting within the for loop, only the first DataFrame is plotted. The following error comes after:

ConversionError: Failed to convert value(s) to axis units: 0 2022-10-27 10:30:00
1 2022-10-27 11:00:00
2 2022-10-27 11:30:00
3 2022-10-27 12:00:00
4 2022-10-27 12:30:00
...
167 2022-10-10 13:00:00
168 2022-10-10 14:00:00
169 2022-10-10 15:00:00
170 2022-10-10 16:00:00
171 NaT

Any help would be greatly appreciated.

@tacaswell
Member

@charlie83xt Can you please open a new issue with that example + enough (synthetic) data to reproduce the issue?
