-
-
Notifications
You must be signed in to change notification settings - Fork 7.9k
Fix TypeError when plotting stacked bar chart with decimal #13738
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for looking at this. This is a really hard problem you have chosen because it comes down to converting numbers in matplotlib, and that is messy. I suspect that axis.convert
should be doing the conversion here, and I think it should always return a floating point array. But there could be knock-on effects. In particular you will need to be careful of masked arrays.
Instead of adding type conversion right before alignment calculation, move Decimal-float conversion into the existing conversion block. Create a helper function _decimal_convert(data) to force conversion, which support masked array. Update the related test cases to using check_figures_equal decorator instead of image comparation.
def test_bar_decimal_center(fig_test, fig_ref): | ||
# Test vertical, align-center bar chart with Decimal() input | ||
# No exception should be raised | ||
ax = fig_test.subplots() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just for symmetry, please move this down to # Test image
-
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @timhoffm for advice. Have moved it down.
def test_barh_decimal_center(fig_test, fig_ref): | ||
# Test horizontal, align-center bar chart with Decimal() input | ||
# No exception should be raised | ||
ax = fig_test.subplots() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please move down as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have moved it down.
lib/matplotlib/axes/_axes.py
Outdated
x = np.asarray(self.convert_xunits(x)) | ||
# self.convert_xunits doesn't convert Decimal input, use | ||
# _decimal_convert to force conversion | ||
x = self._decimal_convert(self.convert_xunits(x)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure about the design here. On the one hand, needing a converter around the converter feels wrong. On the other hand, do we want to change the return type of convert_xunits()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that the design seems(?) problematic (is this converter going to end up being needed everywhere?).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for advice @timhoffm @anntzer . But I am not sure whether I should update convert_units(self, x) method in Axis directly.
convert_xunits mainly consumes convert_units(self, x) in Axis class, whose first line is:
# If x is already a number, doesn't need converting
if munits.ConversionInterface.is_numlike(x):
return x
Since Decimal is actually a Number, this method doesn't change the input x if x is a list of Decimal.
One possible solution is like:
# If x is already a number, doesn't need converting
if munits.ConversionInterface.is_numlike(x):
return np.asarray(x, dtype=np.float)
But I am afraid this way may cause side effect when input x is a list of integer or other number type.
If we check whether x contains Decimal to decide whether need to convert x, then I feel the is_numlike(x) function in ConversionInterface doesn't quite make sense here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Besides, I am curious about the following code in is_numlike(x):
if iterable(x):
for thisx in x:
return isinstance(thisx, Number)
This only check the first element in a list. Is this what it supposed to do when input x is a list?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps is_numlike shouldbe changed to isinstance(x, Number) and not isinstance(x, Decimal)
then (dunno).
The unit system assumes that lists are homogeneous. Yes, it's a pretty awkward assumption.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I doubt the precision matters as everything ultimately gets converted to float at some point.
The default converter would be implemented in units.py but I certainly haven't thought about the specifics...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't spent a lot of time to think about it, but perhaps the correct approach may be to remove the
if is_numlike(x): return x
inconvert_units()
and instead always go through the unit conversion system, and then add a converter forNumber
(which will thus be used both for e.g. floats and for Decimal) which casts the input to float.
On first thought, it sounds reasonable to not special-case numbers but just handle them as a united quantity. However, I'm not sure if that would work:
def convert_units(self, x):
# If x is already a number, doesn't need converting
if munits.ConversionInterface.is_numlike(x):
return x
if self.converter is None:
self.converter = munits.registry.get_converter(x)
If self.converter
is not None
, you wouldn't get the proposed NumberFormatter
; and therefore all converters would be required to pass numbers through`.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sasoripathos Sorry that we're quite slow and vague with responses here. You've chosen a tricky issue; and as you can see from the discussion, we do not have a clear idea what the appropriate fix should be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@timhoffm Thanks for reply! That's also a place I am confused about. I still haven't figure out what should be the return value of axisinfo(unit, axis) and default_units(x, axis) if i implemented the proposed NumberFormatter. Please pardon my beginner knowledge about matplotlib. I am still looking into the proposed approach and will give a try next week. I also think my current implementation is not that elegent since it (kind of) "breaks" the Number type, but i think it may be (hopefully) a working walk around.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@timhoffm @anntzer @jklymak Hi! I just pushed a DecimalConverter I implemented. Since originally no convertion will be done if x is a number-like input, I just make axisinfo(unit, axis) return empty AxisInfo and default_units(x, axis) return None. Besides, I use ConversionInterface.is_numlike(x) to avoid requiring all existing converters pass Numbers.
Could you please have a review and give me some feedback? Thanks very much!
Instead of creating a Decimal converter function in Axes, update the convert_units in Axis to convert x when x is a Decimal or a list of Decimal. Update the comments in test_axes.py for symmetry.
In axis.py, instead of first checking whether input is number like, direct handle them as a united quantity. Since Decimal is a kind of Number, so DecimalConvert doesn't need to provide extra AxisInfo or default unit.
ax.bar(x0, y0, align='center') | ||
|
||
|
||
@check_figures_equal(extensions=["png"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think testing both bar and barh is particularly necessary and would just keep one of them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what happens if width (or height) is a Decimal?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there will be no problem since width will also be converted. In the following code, if width is of type Decimal, it will be converted to floats by convert_xunits in _convert_dx eventually.
if self.xaxis is not None:
x0 = x
x = np.asarray(self.convert_xunits(x))
width = self._convert_dx(width, x0, x, self.convert_xunits)
......
Similarly for height.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor comment re: testing left, but other than that it looks good.
Congrats for tackling a fairly hairy problem :)
value : decimal.Decimal or iterable | ||
Decimal or list of Decimal need to be converted | ||
""" | ||
# If value is a Decimal |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can be left out. The following code already states exactly that.
lib/matplotlib/units.py
Outdated
# If value is a Decimal | ||
if isinstance(value, Decimal): | ||
return np.float(value) | ||
# else x is a list of Decimal |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would change this to
else:
# assume x is a list of Decimal
@@ -1571,10 +1571,6 @@ def have_units(self): | |||
return self.converter is not None or self.units is not None | |||
|
|||
def convert_units(self, x): | |||
# If x is already a number, doesn't need converting | |||
if munits.ConversionInterface.is_numlike(x): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it really necessary this way? The purpose of converters is to convert to something number-like matplotlib can work with. So the original logic "if is number return else convert" seems more clear than "convert; if that fails, did we already have a number? Ok then we just use it".
Maybe munits.ConversionInterface.is_numlike()
is just not the right function/name. I'd prefer this to stay and be
if not munits.ConversionInterface.needs_conversion(x):
return x
That will allow to cleanly pass through Decimal
s.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, not sure on the sign of the logic is_natively_supported_numlike(x)
may be better than not needs_conversion(x)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The original cause of this problem is the operations like plus between Decimal and float is not supported by matplotlib. If there is not such operations in the logic, Decimal input will not cause any error. As we discussed before, it doesn't quite make sense to distinguish Decimal and other types of number since Decimal is a type of Number.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's why I'm saying "is_numlike" is not the right check. Semantically, we want
if matplotlib_can_natively_work_with(x):
return x
# try to convert x
It's not about number or not, it's about "do we need a converter". So, if matplotlib cannot natively handle Decimal
that can fall through to the converter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How can we define natively handle? For Decimal input, matplotlib supports it if there is no operation between Decimal and float. For example:
x = [1, 2, 3, 4]
y = [Decimal(1.5), Decimal(2.5), Decimal(3.5), Decimal(4.5)]
ax.bar(x, y, align='center')
This piece of code works fine. I am not sure about other kinds of plot, but based on this, I think matplotlib does not totally support/unsupport Decimal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, operationally, "can natively handle" is currently is_numlike(x) and not isinstance(x, Decimal)
. I mean, that's what your proposed code does as well.
It's a bit clumsy to define is that way, but that's the current logic. And I prefer to have that logic explicitly written down in a single place in a function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see currently is_numlike is used in some places like StrCategoryConverter.convert() in category.py. And from the code (as below), it looks like the Decimal input will not cause any problems here (since Decimal will be converted to floats).
# pass through sequence of non binary numbers
if all((units.ConversionInterface.is_numlike(v) and
not isinstance(v, (str, bytes))) for v in values):
return np.asarray(values, dtype=float)
So I think maybe instead of change is_numlike(x), which cause API changes, we can add a new function is_natively_supported_numlike(x) like proposed in ConversionInterface. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's a good idea to leave is_numlike(x)
as is and introduce and use ConversionInterface.is_natively_supported_numlike(x)
Naming: Not sure if we need the _numlike
suffix in the name. On the one hand, this is exactly meant as is_natively_supported
- no artificial restriction to numlikes necessary (even though it's unlikely that we'll natively support non-numlikes). On the other hand, the _numlike
may give additional context to the reader, but I would see this more as a job of the is_natively_supported*
docstring.
is_numlike(x)
may become deprecated in the future, but we can handle this later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added a function named is_natively_supported() in ConversionInterface. Since the current unit system assume lists are homogeneous, I followed the same style as is_numlik(). Now I only consider all numbers except Decimal are natively supported.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is OK, though I'm somewhat leery of running this through a converter like this.
Can you check width and length of the bars because we have issues with dx and dy-like...
ax.bar(x0, y0, align='center') | ||
|
||
|
||
@check_figures_equal(extensions=["png"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what happens if width (or height) is a Decimal?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should get a few more tests of different drawing methods. Maybe just scatter
and plot
are good enough.
But otherwise I think it looks good.
Add one test case for plot with Decimal input Add one test case for scatter plot with Decimal input Add two test cases for bar plot with Decimal height/width
@jklymak I added a few more test cases. Please have a check! |
Thanks a lot @sasoripathos ; nice contribution on a tricky problem. Your persistence is much appreciated. |
PR Summary
Hi, I'm opening this as part of coursework from University of Toronto Scarborough's CSCD01 taught by Dr. Anya Tafliovich @atafliovich. (https://www.utsc.utoronto.ca/~atafliovich/cscd01/index.html)
Fixes #10788
Bug Analysis and Technical Commentary:
When try to plot a bar graph with, a TypeError exception will be raised if values for bar bases are of type Decimal.
The problem is the width and height used to calculate the bar location are always of type float or int. There is no operation defined between float/int and Decimal.
The fix is adding a type conversion right before starting calculate the bar location according to align and orientation. If the input values for bar bases are of type Decimal, then convert all elements in the width/height array into Decimal. With further investigation, we cannot postpone the type conversion until the calculation time since the height/width array will later be used to create Patches (Artist).
Acceptance Test:
create a bar chart with Decimal values as input
using plt.show() to view the generaed chart
Result:
A bar chart is successfully rendered without TypeError
Example code:
PR Checklist