DOC: explain too many ticks #21943

Merged
merged 8 commits into from
Jan 10, 2022

53 changes: 53 additions & 0 deletions doc/users/faq/howto_faq.rst
@@ -9,6 +9,59 @@ How-to
.. contents::
   :backlinks: none


.. _how-to-too-many-ticks:

Why do I have so many ticks, and/or why are they out of order?
--------------------------------------------------------------

One common cause for unexpected tick behavior is passing a *list of strings
instead of numbers or datetime objects*. This can easily happen without notice
when reading in a comma-delimited text file. Matplotlib treats lists of strings
as *categorical* variables
(:doc:`/gallery/lines_bars_and_markers/categorical_variables`), and by default
puts one tick per category, and plots them in the order in which they are
supplied.
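
One way to confirm that an axis has become categorical is to inspect its tick positions and labels. The following is a minimal sketch, assuming the non-interactive ``Agg`` backend and made-up data; the variable names are illustrative only.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
x = ['5', '20', '1', '9']  # strings, e.g. as read from a text file
y = [5, 20, 1, 9]
ax.plot(x, y, 'd')
fig.canvas.draw()  # make sure the tick labels are populated

# Each string becomes one category with its own tick.
tick_positions = [float(t) for t in ax.get_xticks()]
tick_labels = [t.get_text() for t in ax.get_xticklabels()]
print(tick_positions)
print(tick_labels)
```

Each string is mapped to an integer position in the order supplied, so the labels appear as ``5, 20, 1, 9`` rather than in numeric order.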

.. plot::
Member

I find the examples a bit extensive. Would the following be sufficient?

import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots(1, 2, constrained_layout=True, figsize=(6, 2))

ax[0].set_title('Ticks seem out of order / misplaced')
x = ['5', '20', '1', '9']  # strings
y = [5, 20, 1, 9]
ax[0].plot(x, y, 'd')
ax[0].tick_params(axis='x', labelcolor='red', labelsize=14)

ax[1].set_title('Many ticks')
x = [str(xx) for xx in np.arange(100)]  # strings
y = np.arange(100)
ax[1].plot(x, y)
ax[1].tick_params(axis='x', labelcolor='red', labelsize=14)

[Image: rendered output of the proposed plot]

Member

I see that you have updated the examples, and they are more clear now.

Still, do we need more than a screen full of figures and code to illustrate the relatively simple fact "passing string data plots all values as ticks in the given order"?

Motivation for the proposed plot above:

  • I've limited this to the error cases. Plotting them is helpful to see how strings typically show up as "wrong" ticks
  • I've left out the respective "correct" plots with numbers for brevity. It should be quite clear from highlighted ticks that these are undesired. How the correct plot looks does not add much information IMHO.
  • I've chosen two plots which show the characteristic error signatures "wrong order" and "too many ticks"
    "Dates out of order" is essentially the same as "numbers out of order".

I favor this to reduce clutter and keep the information density high. But I won't fight over it. If you feel strongly on having the correct plot as well, I suggest to at least remove the dates and use the same layout in both plots to make them equal in size.

Member Author

My apologies, I wrote a reply, but must have closed it before hitting save....

a) datetime-as-string is a very common failure mode, so I think it should be in the FAQ
b) for beginners "turn your strings into floats" is going to require a google search; half the time, even I have to look up how to do datetimes. I felt it was important to offer a basic solution, and also to show the "expected" output so the "unexpected" is crystal clear.

This section is for folks who are trying to do their first plot, reading their Excel data in as a csv. While I agree we should be as concise as possible, I think here the extra guidance is worth the virtual real estate to get them on the right path.

Member

> This section is for folks who are trying to do their first plot, reading their Excel data in as a csv.

IMHO this can happen with all levels of experience.

Anyway, this is a judgement call on the audience and the amount of detail we think is necessary. I appreciate your position, but cannot sign-off on it. If others approve and merge I'm fine with that.

Member Author

I mean it's a How-To - I would argue we should provide concrete solutions.

Member

It's not really a how-to, it's only in how-to because we don't have the proper category yet. #21943 (comment)

Member Author

Let me rephrase: whatever we call it, I think it should have a solution in it, not just the problem.

Member
@timhoffm, Jan 8, 2022

To me the solution is "don't pass in strings". How to get from strings to numerical values depends on the data source and values, and is beyond Matplotlib's business. Note for example that the canonical solution for pandas.read_csv would be to pass converters for those values and not create a numpy array from the data after loading.

I'm fine that some people may have to do an additional google search. As said, it's a judgement call.
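
For readers unfamiliar with the pandas route mentioned in this comment, here is a minimal sketch, assuming pandas is installed. The CSV contents and the column names ``date`` and ``value`` are hypothetical.

```python
import io

import pandas as pd

# Hypothetical file contents; a real script would pass a file path instead.
csv_text = (
    "date,value\n"
    "2021-10-01,5\n"
    "2021-11-02,20\n"
    "2021-12-03,1\n"
    "2021-09-01,9\n"
)

df = pd.read_csv(
    io.StringIO(csv_text),
    converters={'value': float},  # convert each field while parsing
    parse_dates=['date'],         # parse this column as datetimes
)
print(df.dtypes)
```

``read_csv`` often infers numeric columns on its own; passing ``converters`` makes the conversion explicit and leaves room for custom parsing where a column needs it.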

Member Author

OK, well, let's move the full example into an example.

    :include-source:
    :align: center

    import matplotlib.pyplot as plt
    import numpy as np

    fig, ax = plt.subplots(1, 2, constrained_layout=True, figsize=(6, 2))

    ax[0].set_title('Ticks seem out of order / misplaced')
    x = ['5', '20', '1', '9']  # strings
    y = [5, 20, 1, 9]
    ax[0].plot(x, y, 'd')
    ax[0].tick_params(axis='x', labelcolor='red', labelsize=14)

    ax[1].set_title('Many ticks')
    x = [str(xx) for xx in np.arange(100)]  # strings
    y = np.arange(100)
    ax[1].plot(x, y)
    ax[1].tick_params(axis='x', labelcolor='red', labelsize=14)

The solution is to convert the list of strings to numbers or
datetime objects (often ``np.asarray(numeric_strings, dtype='float')`` or
``np.asarray(datetime_strings, dtype='datetime64[s]')``).

For more information see :doc:`/gallery/ticks/ticks_too_many`.
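
The two conversions named above can be sketched as follows. This is an illustrative example under stated assumptions: the ``Agg`` backend, made-up data, and ISO-formatted date strings (other date formats may need different parsing).

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

numeric_strings = [str(n) for n in range(100)]
datetime_strings = ['2021-10-01', '2021-11-02', '2021-12-03', '2021-09-01']

# Convert before plotting so Matplotlib sees numbers and dates, not categories.
numbers = np.asarray(numeric_strings, dtype='float')
dates = np.asarray(datetime_strings, dtype='datetime64[s]')

fig, (ax_num, ax_date) = plt.subplots(
    1, 2, constrained_layout=True, figsize=(6, 2.5))
ax_num.plot(numbers, numbers)
ax_date.plot(dates, [0, 2, 3, 1], 'd')
ax_date.tick_params(axis='x', labelrotation=90)
fig.canvas.draw()

n_numeric_ticks = len(ax_num.get_xticks())
print(numbers.dtype, dates.dtype, n_numeric_ticks)
```

After conversion the default locators choose a small number of ticks instead of one per data point, and the dates are placed in chronological order.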

.. _howto-determine-artist-extent:

Determine the extent of Artists in the Figure
---------------------------------------------

Sometimes we want to know the extent of an Artist. Matplotlib `.Artist` objects
have a method `.Artist.get_window_extent` that will usually return the extent of
the artist in pixels. However, some artists, in particular text, must be
rendered at least once before their extent is known. Matplotlib supplies
`.Figure.draw_without_rendering`, which should be called before calling
``get_window_extent``.
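
A minimal sketch of that call order, assuming Matplotlib 3.5 or later (where ``Figure.draw_without_rendering`` is available) and the ``Agg`` backend. The text label is made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
label = ax.text(0.5, 0.5, 'hello', fontsize=20)

# Run the draw machinery without producing output, then query the extent.
fig.draw_without_rendering()
bbox = label.get_window_extent()  # bounding box in display space (pixels)
print(bbox.width, bbox.height)
```

The returned bounding box is in pixels, so its values depend on the figure size and DPI.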
Comment on lines +58 to +63
Member

Suggested change
- Sometimes we want to know the extent of an Artist. Matplotlib `.Artist` objects
- have a method `.Artist.get_window_extent` that will usually return the extent of
- the artist in pixels. However, some artists, in particular text, must be
- rendered at least once before their extent is known. Matplotlib supplies
- `.Figure.draw_without_rendering`, which should be called before calling
- ``get_window_extent``.
+ The method `.Artist.get_window_extent` returns the extent of the artist in
+ display space, i.e. in pixels. However, some artists, in particular text, must
+ be drawn at least once before their extent is known. Call
+ `.Figure.draw_without_rendering` before using `.Artist.get_window_extent`.

I'd like to be as specific as possible: Which artists aside from texts need this?
If we cannot be more specific here, we should write "Always call ..."

Member Author

I guess I largely know of texts. I guess there may be some artists that are proportional to the figure or axes size?

Member

I feel that vague statements are rather confusing to the user. We should do either

  • dig into this deeper and explicitly write out for which artists this is necessary. - This is some work.
  • State that users should always call draw_without_rendering before using get_window_extent(). - I'm a bit uneasy with that because we're forcing additional code and overhead on the user also in cases that would work without it.

I suggest to move this to a separate PR to not block the ticks stuff.

Note also that #22122 now mentions draw_without_rendering() in the Text.get_window_extent() exception, so that discussing draw_without_rendering() here for that case is less important now. If the error happens, the user has the solution at hand.

Member Author

I don't think people are coming to a "How to" for the exact reason their artist extent is wrong - they just want to fix it. I don't think we need to enumerate all the troublesome artists, or interactions with layout managers etc here - we just want to tell them to call a fake draw and see if that helps.

Member

Ok.


.. _howto-figure-empty:

Check whether a figure is empty
76 changes: 76 additions & 0 deletions examples/ticks/ticks_too_many.py
@@ -0,0 +1,76 @@
"""
=====================
Fixing too many ticks
=====================

One common cause for unexpected tick behavior is passing a list of strings
instead of numbers or datetime objects. This can easily happen without notice
when reading in a comma-delimited text file. Matplotlib treats lists of strings
as *categorical* variables
(:doc:`/gallery/lines_bars_and_markers/categorical_variables`), and by default
puts one tick per category, and plots them in the order in which they are
supplied. If this is not desired, the solution is to convert the strings to
a numeric type as in the following examples.

"""

############################################################################
# Example 1: Strings can lead to an unexpected order of number ticks
# ------------------------------------------------------------------

import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots(1, 2, constrained_layout=True, figsize=(6, 2.5))
x = ['1', '5', '2', '3']
y = [1, 4, 2, 3]
ax[0].plot(x, y, 'd')
ax[0].tick_params(axis='x', color='r', labelcolor='r')
ax[0].set_xlabel('Categories')
ax[0].set_title('Ticks seem out of order / misplaced')

# convert to numbers:
x = np.asarray(x, dtype='float')
ax[1].plot(x, y, 'd')
ax[1].set_xlabel('Floats')
ax[1].set_title('Ticks as expected')

############################################################################
# Example 2: Strings can lead to very many ticks
# ----------------------------------------------
# If *x* has 100 elements, all strings, then we would have 100 (unreadable)
# ticks, and again the solution is to convert the strings to floats:

fig, ax = plt.subplots(1, 2, figsize=(6, 2.5))
x = [f'{xx}' for xx in np.arange(100)]
y = np.arange(100)
ax[0].plot(x, y)
ax[0].tick_params(axis='x', color='r', labelcolor='r')
ax[0].set_title('Too many ticks')
ax[0].set_xlabel('Categories')

ax[1].plot(np.asarray(x, float), y)
ax[1].set_title('x converted to numbers')
ax[1].set_xlabel('Floats')

############################################################################
# Example 3: Strings can lead to an unexpected order of datetime ticks
# --------------------------------------------------------------------
# A common case is dates read from a CSV file: they need to be
# converted from strings to datetime objects to get the proper date locators
# and formatters.

fig, ax = plt.subplots(1, 2, constrained_layout=True, figsize=(6, 2.75))
x = ['2021-10-01', '2021-11-02', '2021-12-03', '2021-09-01']
y = [0, 2, 3, 1]
ax[0].plot(x, y, 'd')
ax[0].tick_params(axis='x', labelrotation=90, color='r', labelcolor='r')
ax[0].set_title('Dates out of order')

# convert to datetime64
x = np.asarray(x, dtype='datetime64[s]')
ax[1].plot(x, y, 'd')
ax[1].tick_params(axis='x', labelrotation=90)
ax[1].set_title('x converted to datetimes')

plt.show()