[Bug]: Confusing error messages #23083

Closed
ben-briscoe opened this issue May 21, 2022 · 2 comments · Fixed by #23088

@ben-briscoe

Bug summary

Plotting from a DataFrame failed because of a KeyError, but the message I received was about a format string. The key lookup failed silently, so I spent over an hour tracking down a typo because I had no clue where to start.

Code for reproduction

>>> import pandas as pd
>>> import matplotlib.pyplot as plt
>>> data  = [ [1,1], [2,2], [3,3] ]
>>> df = pd.DataFrame(data, columns = ['header','mispelledHeader'])
>>> figure, axes = plt.subplots()
>>> line = axes.plot('header','correctlySpelledHeader',data = df)

Actual outcome

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/b_briscoe/thirdparty/phel-1.2.0/linux_x86_64_9.4.0/miniconda3-4.9.2/lib/python3.9/site-packages/matplotlib/axes/_axes.py", line 1605, in plot
    lines = [*self._get_lines(*args, data=data, **kwargs)]
  File "/home/b_briscoe/thirdparty/phel-1.2.0/linux_x86_64_9.4.0/miniconda3-4.9.2/lib/python3.9/site-packages/matplotlib/axes/_base.py", line 315, in __call__
    yield from self._plot_args(this, kwargs)
  File "/home/b_briscoe/thirdparty/phel-1.2.0/linux_x86_64_9.4.0/miniconda3-4.9.2/lib/python3.9/site-packages/matplotlib/axes/_base.py", line 452, in _plot_args
    linestyle, marker, color = _process_plot_format(fmt)
  File "/home/b_briscoe/thirdparty/phel-1.2.0/linux_x86_64_9.4.0/miniconda3-4.9.2/lib/python3.9/site-packages/matplotlib/axes/_base.py", line 188, in _process_plot_format
    raise ValueError(
ValueError: Illegal format string "correctlySpelledHeader"; two color symbols

Expected outcome

The actual failure happens when the DataFrame and the key are passed into this function as data and value, respectively.

def _replacer(data, value):  # mpl._replacer
    try:
        # if key isn't a string don't bother
        if isinstance(value, str):
            # try to use __getitem__
            value = data[value]  # <-- KeyError here because of the typo
    except Exception:
        # key does not exist, silently fall back to key
        pass
    return sanitize_sequence(value)

As the comment says, this happens silently, and as the traceback shows, the error you finally receive is about a format string. This caused quite a bit of confusion, because I was looking everywhere except at my header spellings. I feel this shouldn't happen silently; it at least deserves a warning, perhaps:

    except Exception:
        warnings.warn('KeyError generated when attempting to access data using provided str')

Side note: the docstring says the function returns data[value] or data. In reality it returns data[value] or value. I'm not sure why the key is allowed to slide through, but either the behavior or the docstring is wrong.
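The silent fallback described above can be reproduced without Matplotlib. The following is a minimal sketch, not Matplotlib's code: `replacer_sketch` and `sanitize` are hypothetical stand-ins, and a plain dict plays the role of the DataFrame.

```python
def sanitize(value):
    # Hypothetical stand-in for Matplotlib's sanitize_sequence; a no-op here.
    return value


def replacer_sketch(data, value):
    """Mimic the lookup-or-fallback logic quoted in this issue."""
    try:
        # Only strings are treated as possible keys.
        if isinstance(value, str):
            value = data[value]
    except Exception:
        # The key does not exist: the string itself is passed along unchanged.
        pass
    return sanitize(value)


# A plain dict stands in for the DataFrame (an assumption for this sketch).
data = {"header": [1, 2, 3], "mispelledHeader": [1, 2, 3]}

found = replacer_sketch(data, "header")
fallback = replacer_sketch(data, "correctlySpelledHeader")

print(found)     # the column values
print(fallback)  # the unchanged string, later interpreted as a format string
```

Because the misspelled key comes back as an ordinary string, nothing downstream can tell it was ever meant to be a column name.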

Additional information

No response

Operating system

Ubuntu 20.04

Matplotlib Version

3.4.2

Matplotlib Backend

Qt5Agg

Python version

3.9.1

Jupyter version

No response

Installation

No response

@jklymak
Member

jklymak commented May 21, 2022

I'm not really sure what we can do here. Unfortunately we also support ax.plot('boo', '-r', data=df) to plot with df['boo'] as the y data (x data is assumed np.arange(len(y))), and '-r' as the format string. We definitely don't want to warn on data['-r'] - instead we just assume '-r' is a format string and carry on.

Given this flexibility, I think the error message is as helpful as it can be: Matplotlib is treating your second string as a format string, and it is illegal. When in doubt, or when debugging, users are always encouraged to be explicit: ax.plot(df['header'], df['correctlySpelledHeader']) raises:

KeyError: 'correctlySpelledHeader'

Someone could try to fix this by checking if the second string is a valid formatting string, but then it is ambiguous whether the error should be a KeyError or a ValueError. I'd vote that this is a "can't fix".
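One way to picture the check discussed above is the rough sketch below. It is not Matplotlib's implementation: `looks_like_format` is a deliberately simplified, hypothetical validator covering only single-letter colors, common markers, and the basic line styles, and `classify_string_arg` is a hypothetical helper that names both interpretations in its error instead of choosing between KeyError and ValueError.

```python
COLORS = set("bgrcmykw")
MARKERS = set(".,ov^<>1234sp*hH+xDd|_")
LINESTYLES = ("--", "-.", "-", ":")  # two-character styles are tried first


def looks_like_format(fmt):
    """Rough check: at most one color, one marker, and one line style."""
    seen = {"color": 0, "marker": 0, "line": 0}
    i = 0
    while i < len(fmt):
        for style in LINESTYLES:
            if fmt.startswith(style, i):
                seen["line"] += 1
                i += len(style)
                break
        else:
            ch = fmt[i]
            if ch in COLORS:
                seen["color"] += 1
            elif ch in MARKERS:
                seen["marker"] += 1
            else:
                return False  # character has no meaning in a format string
            i += 1
    return all(count <= 1 for count in seen.values())


def classify_string_arg(data, value):
    """Return 'key' or 'format'; raise if the string is neither."""
    if value in data:
        return "key"
    if looks_like_format(value):
        return "format"
    raise ValueError(
        f"{value!r} is neither a data key nor a valid format string"
    )


data = {"boo": [1, 2, 3]}
kind_key = classify_string_arg(data, "boo")
kind_fmt = classify_string_arg(data, "-r")
print(kind_key, kind_fmt)
```

A message that mentions both possibilities sidesteps the ambiguity: the caller learns that the string failed as a key and as a format string, whichever one was intended.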

@ben-briscoe
Author

I was unaware of why, but assumed there was a reason this was allowed to pass through. I guess my comment is that understanding that the format string error message could be related to a key error in your data is pretty esoteric and hostile to a new user trying to do a basic task and making a very common error.

For example, I ran into this while using a simulation framework that uses Matplotlib to plot results. A user writes a model that registers an output (the typo happens here), which lets the framework write the data to a CSV. The framework calls pandas to build the DataFrame, then calls Matplotlib to plot using the correct spelling, and then the error occurs.

So I have a smart user out there with a wealth of knowledge, just no Matplotlib knowledge. But because the typo occurred miles away, and the message has nothing to do with the actual error, that user has to spend an hour or more learning Matplotlib's various features to hunt this down.

I will now suggest that the framework adopt the approach:

ax.plot( df[ ]…

as opposed to:

ax.plot( ….data=df…

That is fine: framework developers can be expected to have pretty decent knowledge of a library they use. But again, I think this behavior is hostile to new users.
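A framework that prefers to keep the data= call style could also validate keys before plotting. The sketch below assumes a hypothetical helper, `require_keys`, and uses a plain dict in place of the DataFrame; it is one possible approach, not something Matplotlib or pandas provides.

```python
def require_keys(data, *keys):
    """Raise a KeyError naming every requested key that is absent from data."""
    missing = [key for key in keys if key not in data]
    if missing:
        raise KeyError(
            f"missing data keys {missing}; available keys: {list(data)}"
        )


# A plain dict stands in for the DataFrame (an assumption for this sketch).
data = {"header": [1, 2, 3], "mispelledHeader": [1, 2, 3]}

# Passes silently because both keys exist.
require_keys(data, "header", "mispelledHeader")

# Fails early with a message that points at the key, not at a format string.
try:
    require_keys(data, "header", "correctlySpelledHeader")
except KeyError as err:
    message = str(err)
    print(message)
```

With a pandas DataFrame, `in` and `list()` both operate on column labels, so the same helper should work unchanged when called just before ax.plot(..., data=df).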
