DOC: explain too many ticks #21943


Merged
merged 8 commits into from
Jan 10, 2022

Conversation

jklymak
Member

@jklymak jklymak commented Dec 14, 2021

PR Summary

Users often get confused when they get too many ticks, or ticks that are out of order, because we plot lists of strings as categoricals. This PR seeks to at least add a FAQ entry that addresses the problem and encourages them to convert the strings to numbers of some sort.
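For readers of this thread, a minimal sketch of the behavior described above. It is not part of the PR; the variable names are made up for illustration, and the Agg backend is selected only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, only so the sketch runs anywhere
import matplotlib.pyplot as plt

x_str = ['5', '20', '1', '9']  # strings, e.g. as read from a text file
y = [5, 20, 1, 9]

fig, (ax_cat, ax_num) = plt.subplots(1, 2, figsize=(6, 2))

# Strings are treated as categories: one tick per unique string, in list order.
ax_cat.plot(x_str, y, 'd')
fig.canvas.draw()  # draw once so the tick label texts are populated
cat_labels = [t.get_text() for t in ax_cat.get_xticklabels()]
cat_positions = [float(p) for p in ax_cat.get_xticks()]

# Converting to floats first gives an ordinary numeric axis.
x_num = [float(s) for s in x_str]
(line,) = ax_num.plot(x_num, y, 'd')
num_xdata = [float(v) for v in line.get_xdata()]

print(cat_labels)     # labels follow the input order, not numeric order
print(cat_positions)  # categories sit at consecutive integer positions
```

The categorical axis places the strings at positions 0, 1, 2, ... in the order they were passed, which is why the ticks look "out of order" to someone expecting numbers.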

PR Checklist

Tests and Styling

  • Has pytest style unit tests (and pytest passes).
  • Is Flake 8 compliant (install flake8-docstrings and run flake8 --docstring-convention=all).

Documentation

  • New features are documented, with examples if plot related.
  • New features have an entry in doc/users/next_whats_new/ (follow instructions in README.rst there).
  • API changes documented in doc/api/next_api_changes/ (follow instructions in README.rst there).
  • Documentation is sphinx and numpydoc compliant (the docs should build without error).

@jklymak jklymak added Documentation: website layout/behavior/styling changes topic: categorical labels Dec 14, 2021
@jklymak jklymak force-pushed the doc-explain-too-many branch from 53fdcdf to 7206d0c Compare December 14, 2021 11:01
@story645
Member

I think it's a good idea in theory but dunno that the FAQ (which is hard to find & I think @timhoffm has been moving stuff out of) is the best place for it. Is there a troubleshooting section on one of the date tutorials?

@jklymak jklymak force-pushed the doc-explain-too-many branch from 7206d0c to 64c0c94 Compare December 14, 2021 11:40
@jklymak
Member Author

jklymak commented Dec 14, 2021

It could go in troubleshooting versus how_to, for sure. I do not think "examples" or "tutorials" is a good place for this in particular - though we certainly should point out troubleshooting issues in the relevant place as well.

Certainly if I type "matplotlib faq" into google I get https://matplotlib.org/stable/users/faq/index.html as the first entry...

I also don't agree that our users guide should be completely empty, and that we should move everything to tutorials. Probably somewhat the opposite - we could consider the tutorials the rough draft of what should go in the proper user manual.

@jklymak jklymak force-pushed the doc-explain-too-many branch from 64c0c94 to e9179c3 Compare December 14, 2021 12:22
@timhoffm
Member

We don't have a proper category for this currently. It's not a "How-to", it's also not "Troubleshooting", which is only technical stuff like finding out your version and reporting a bug.

I can see the need for a real FAQ-style "Matplotlib behaves unexpectedly" section (better name t.b.d.). That could collect a couple of common errors that are otherwise hard to document. Aside from this one:

  • "Why are positions of artists not correct when I read them?" --> need to draw/layout first
  • "Why do you have multiple ways to do the same thing (APIs)?"
  • "Why does my figure not show?" (if we can give helpful information on this general problem)

@jklymak
Member Author

jklymak commented Dec 15, 2021

We could have a "Common Issues" page/section. OTOH, I think this fits just as well in the "how to" section, e.g. "How to fix this problem...". I'd personally prefer a one-stop page for all the things I never remember (let's add setting tick label properties to the list, which I have to look up every time, and the right way is always last in whatever I find on Stack Overflow).

However, I could easily imagine this "how-to" page getting out of hand and needing its own sub headings.

@jklymak
Member Author

jklymak commented Dec 15, 2021

Newest commit adds a section on artist extents.

BTW, I'm not sure what to do about "Why does my figure not show". Usually boils down to "you are on a headless machine" doesn't it?

@jklymak
Member Author

jklymak commented Dec 16, 2021

@jklymak
Member Author

jklymak commented Jan 4, 2022

Can I ping for this PR and ask that any renaming/reorg get moved to an issue?

Comment on lines +79 to +84
Sometimes we want to know the extent of an Artist. Matplotlib `.Artist` objects
have a method `.Artist.get_window_extent` that will usually return the extent of
the artist in pixels. However, some artists, in particular text, must be
rendered at least once before their extent is known. Matplotlib supplies
`.Figure.draw_without_rendering`, which should be called before calling
``get_window_extent``.
Member

Suggested change
- Sometimes we want to know the extent of an Artist. Matplotlib `.Artist` objects
- have a method `.Artist.get_window_extent` that will usually return the extent of
- the artist in pixels. However, some artists, in particular text, must be
- rendered at least once before their extent is known. Matplotlib supplies
- `.Figure.draw_without_rendering`, which should be called before calling
- ``get_window_extent``.
+ The method `.Artist.get_window_extent` returns the extent of the artist in
+ display space, i.e. in pixels. However, some artists, in particular text, must
+ be drawn at least once before their extent is known. Call
+ `.Figure.draw_without_rendering` before using `.Artist.get_window_extent`.

I'd like to be as specific as possible: Which artists aside from texts need this?
If we cannot be more specific here, we should write "Always call ..."

Member Author

I guess I largely know of texts. I guess there may be some artists that are proportional to the figure or axes size?

Member

I feel that vague statements are rather confusing to the user. We should do either

  • dig into this deeper and explicitly write out for which artists this is necessary. - This is some work.
  • State that users should always call draw_without_rendering before using get_window_extent(). - I'm a bit uneasy with that because we're forcing additional code and overhead on the user also in cases that would work without it.

I suggest to move this to a separate PR to not block the ticks stuff.

Note also that #22122 now mentions draw_without_rendering() in the Text.get_window_extent() exception, so that discussing draw_without_rendering() here for that case is less important now. If the error happens, the user has the solution at hand.

Member Author

I don't think people are coming to a "How to" for the exact reason their artist extent is wrong - they just want to fix it. I don't think we need to enumerate all the troublesome artists, or interactions with layout managers etc here - we just want to tell them to call a fake draw and see if that helps.
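For context, the "fake draw" being discussed might look like the sketch below. It assumes Matplotlib 3.5 or later (where `Figure.draw_without_rendering` is available); the text artist and the Agg backend are illustrative choices, not taken from the PR.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, only so the sketch runs anywhere
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
txt = ax.text(0.5, 0.5, "some label")

# Run the draw machinery once without producing output, so that
# layout-dependent artists such as text know their final size.
fig.draw_without_rendering()

# The extent is returned in display space, i.e. in pixels.
bbox = txt.get_window_extent()
print(bbox.width, bbox.height)
```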

Member

Ok.

to Matplotlib. In the lower row the data is converted to either floats or
datetime64; note that the ticks are now ordered and spaced numerically.

.. plot::
Member

I find the examples a bit extensive. Would the following be sufficient?

import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots(1, 2, constrained_layout=True, figsize=(6, 2))

ax[0].set_title('Ticks seem out of order / misplaced')
x = ['5', '20', '1', '9']  # strings
y = [5, 20, 1, 9]
ax[0].plot(x, y, 'd')
ax[0].tick_params(axis='x', labelcolor='red', labelsize=14)

ax[1].set_title('Many ticks')
x = [str(xx) for xx in np.arange(100)]  # strings
y = np.arange(100)
ax[1].plot(x, y)
ax[1].tick_params(axis='x', labelcolor='red', labelsize=14)

[Image: rendered output of the example above]

Member

I see that you have updated the examples, and they are more clear now.

Still, do we need more than a screen full of figures and code to illustrate the relatively simple fact "passing string data plots all values as ticks in the given order"?

Motivation for the proposed plot above:

  • I've limited this to the error cases. Plotting them is helpful to see how strings typically show up as "wrong" ticks
  • I've left out the respective "correct" plots with numbers for brevity. It should be quite clear from highlighted ticks that these are undesired. How the correct plot looks does not add much information IMHO.
  • I've chosen two plots which show the characteristic error signatures "wrong order" and "too many ticks"
    "Dates out of order" is essentially the same as "numbers out of order".

I favor this to reduce clutter and keep the information density high. But I won't fight over it. If you feel strongly on having the correct plot as well, I suggest to at least remove the dates and use the same layout in both plots to make them equal in size.

Member Author

My apologies, I wrote a reply, but must have closed it before hitting save....

a) datetime-as-string is a very common failure mode, so I think it should be in the FAQ
b) for beginners "turn your strings into floats" is going to require a google search; half the time, even I have to look up how to do datetimes. I felt it was important to offer a basic solution, and also to show the "expected" output so the "unexpected" is crystal clear.

This section is for folks who are trying to do their first plot, reading their Excel data in as a csv. While I agree we should be as concise as possible, I think here the extra guidance is worth the virtual real estate to get them on the right path.
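A sketch of the kind of basic conversion meant here, assuming the strings are plain decimal numbers or ISO 8601 dates. The variable names and the Agg backend are illustrative only and are not taken from the PR's example.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, only so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

# Numbers that arrived as strings
x_str = [str(i) for i in range(100)]
x_float = np.asarray(x_str, dtype=float)  # parse each string as a float
y = np.arange(100)

# Dates that arrived as ISO 8601 strings, deliberately out of order
dates_str = ['2021-12-16', '2021-12-14', '2022-01-10']
dates = np.asarray(dates_str, dtype='datetime64[D]')
values = [1, 5, 20]

fig, (ax_num, ax_date) = plt.subplots(1, 2, figsize=(7, 2.5))
ax_num.plot(x_float, y)          # numeric axis: the locator picks a few ticks
ax_date.plot(dates, values, 'd')  # date axis: points are placed chronologically
fig.canvas.draw()

n_ticks = len(ax_num.get_xticks())
print(n_ticks)  # far fewer than the 100 ticks a categorical axis would show
```

Strings in other formats (units, decimal commas, non-ISO dates) need their own parsing step before this works.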

Member

> This section is for folks who are trying to do their first plot, reading their Excel data in as a csv.

IMHO this can happen with all levels of experience.

Anyway, this is a judgement call on the audience and the amount of detail we think is necessary. I appreciate your position, but cannot sign-off on it. If others approve and merge I'm fine with that.

Member Author

I mean it's a How-To - I would argue we should provide concrete solutions.

Member

It's not really a how-to, it's only in how-to because we don't have the proper category yet. #21943 (comment)

Member Author

Let me rephrase: whatever we call it, I think it should have a solution in it, not just the problem.

Member

@timhoffm timhoffm Jan 8, 2022

To me the solution is "don't pass in strings". How to get from strings to numerical values depends on the data source and values, and is beyond Matplotlib's business. Note for example that the canonical solution for pandas.read_csv would be to pass converters for those values and not create a numpy array from the data after loading.

I'm fine that some people may have to do an additional google search. As said, it's a judgement call.
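To illustrate the pandas route mentioned in this comment, a hedged sketch: the CSV content, the column names and the `strip_unit` helper are hypothetical, and it assumes pandas is installed.

```python
import io
import pandas as pd

# Hypothetical CSV content: a date column and a value column carrying a unit.
csv_text = (
    "when,value\n"
    "2021-12-14,5 kg\n"
    "2022-01-10,20 kg\n"
    "2021-12-16,1 kg\n"
)

def strip_unit(cell):
    """Hypothetical helper: turn a cell like '5 kg' into the float 5.0."""
    return float(cell.split()[0])

df = pd.read_csv(
    io.StringIO(csv_text),
    converters={"value": strip_unit},  # convert while loading
    parse_dates=["when"],              # parse the date column as datetimes
)
print(df.dtypes)
```

With the columns already numeric and datetime-typed, passing `df["when"]` and `df["value"]` to a plotting call would not go through the categorical path.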

Member Author

OK, well, let's move the full example into an example.

jklymak and others added 3 commits January 6, 2022 09:28
Co-authored-by: Tim Hoffmann <2836374+timhoffm@users.noreply.github.com>
Co-authored-by: Tim Hoffmann <2836374+timhoffm@users.noreply.github.com>
Comment on lines 26 to 29
In the example below, the upper row plots are plotted using strings for *x*;
note that each string gets a tick, and they are in the order of the list passed
to Matplotlib. If this is not desired, we need to change *x* to an array of
numbers.
Member

Needs updating for the changed example code below.

@jklymak jklymak force-pushed the doc-explain-too-many branch from 00eae26 to 6355d1b Compare January 9, 2022 10:39
Member

@timhoffm timhoffm left a comment

I'm 👍 on this conceptually now. Only some minor cosmetic remarks.

jklymak and others added 2 commits January 10, 2022 11:07
Co-authored-by: Tim Hoffmann <2836374+timhoffm@users.noreply.github.com>
Co-authored-by: Tim Hoffmann <2836374+timhoffm@users.noreply.github.com>
@timhoffm timhoffm added this to the v3.5.2 milestone Jan 10, 2022
@timhoffm timhoffm merged commit 5b2bcab into matplotlib:main Jan 10, 2022
meeseeksmachine pushed a commit to meeseeksmachine/matplotlib that referenced this pull request Jan 10, 2022
@jklymak jklymak deleted the doc-explain-too-many branch January 10, 2022 20:51
@jklymak
Member Author

jklymak commented Jan 10, 2022

Thanks for your help @timhoffm !

@timhoffm
Member

Thanks for enduring my review @jklymak! 😄

timhoffm added a commit that referenced this pull request Jan 10, 2022
…943-on-v3.5.x

Backport PR #21943 on branch v3.5.x (DOC: explain too many ticks)
3 participants