Skip to content

Add support for multiple hatches, edgecolors and linewidths in histograms #28073

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Jul 20, 2024

Conversation

Impaler343
Copy link
Contributor

@Impaler343 Impaler343 commented Apr 13, 2024

PR summary

Closes #26718 Distributes keyword args passed to each Patch using a cycler. Probably not the best way to do this?

PR checklist

@jklymak
Copy link
Member

jklymak commented Apr 13, 2024

I'd suggest showing what this does with an example either in the GitHub pr description, or ideally in the gallery

@github-actions github-actions bot added the Documentation: examples files in galleries/examples label Apr 14, 2024
@Impaler343
Copy link
Contributor Author

I'm not really sure if I need to add a new test or just modify an exisiting one(test_hist_stacked_bar) in test_axes.py

@oscargus
Copy link
Member

I guess there are two things here:

  1. Add/modify an example in the gallery to illustrate how it is used. (Will be easier for the reviewers to get an idea of how it is used etc.)
  2. Add a test to get the code coverage up and make sure no one breaks it later. Maybe there is some old example that one can "hi-jack", but otherwise create a new test with all the bells and whistles turned on.

@Impaler343
Copy link
Contributor Author

pinging @story645 for review. The failing tests are unrelated

Copy link
Member

@story645 story645 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also needs updated documentation that the patch properties are now vectorized (@timhoffm any concerns here?)

@timhoffm
Copy link
Member

timhoffm commented May 19, 2024

also needs updated documentation that the patch properties are now vectorized (@timhoffm any concerns here?)

What should we be concerned about? We already have vectorized label and color. It's only fair to vectorize these inside hist() as well. This should be documented with **kwargs (can't immediately suggest, because that part has not been touched in the PR). Something like:

        **kwargs
            `~matplotlib.patches.Patch` properties. The following properties additionally
            accept lists of property values, one element for each dataset:
            *edgecolors*, *linewidths*, *linestyles*, *hatches*.

This should also get a what's new entry.

@story645
Copy link
Member

What should we be concerned about? We already have vectorized label and color.

My bias is vectorize everything so I don't have concerns, but in the past for some vectorization discussions there have been concerns about the tradeoffs. But if there isn't opposition, awesome!

@timhoffm
Copy link
Member

I don't see any drawbacks for hist()

@Impaler343
Copy link
Contributor Author

So we are planning to vectorize all parameters of Patches? Like joinstyle, capstyle etc.

@story645
Copy link
Member

So we are planning to vectorize all parameters of Patches? Like joinstyle, capstyle etc

Not at this time w/ the current architecture, especially because nobody has asked for those.

@Impaler343 Impaler343 force-pushed the histogram branch 2 times, most recently from 150c63b to 2351cbd Compare May 29, 2024 07:34
@Impaler343
Copy link
Contributor Author

Codecov is acting fishy, it passed once and failed again after squashing. Anything else to add/change?

@Impaler343
Copy link
Contributor Author

Impaler343 commented Jun 5, 2024

As of the last commit, this behavior is consistent for all histtypes unless mentioned otherwise:

  • If edgecolor, facecolor and color are mentioned, color is completely ignored.
  • If edgecolor and color are mentioned, color is applied to only the faces and edgecolor to the edges.
  • If facecolor and color are mentioned, color is applied to only the edges and facecolor to the faces.
  • If facecolor and edgecolor are mentioned, edgecolor is applied to edges, and facecolor is applied to faces.
  • If only color is mentioned, the face is colored for all filled histograms, and only the edge is colored for unfilled histograms with the values in color.
  • If only facecolor is mentioned, the face is colored for all filled histograms with the value in facecolor, and the edges are colored with the default color cycle for unfilled histograms.
  • If only edgecolor is mentioned, the face is colored by the default color cycle for all filled histograms, and the edge is colored with the values in edgecolor for all histograms.
    I feel this is the required behavior, but I am unsure if the color setting for the kwargs can be written more concisely.
    Let me know if I should start writing tests based on this structure.

@story645
Copy link
Member

story645 commented Jun 6, 2024

@Impaler343 I started trying to flow chart that and then realized it's probably clearer as a table. Let me know if this jives with what you're saying:

hist type color facecolor edgecolor patch face color patch edge color
all set set set facecolor edgecolor
all set None set color edgecolor
all set set None facecolor color
all None set set facecolor edgecolor
bar, barstacked, stepfilled set None None color 'none'
step set None None 'none' color
bar, barstacked, stepfilled None set None color 'none'
step None set None 'none' default color cycle
bar, barstacked, stepfilled None None set default color cycle 'none'
step None None set 'none' edgecolor
bar, barstacked, stepfilled None None None
step None None None

@Impaler343
Copy link
Contributor Author

hist type color facecolor edgecolor patch face color patch edge color
all set set set facecolor edgecolor
all set None set color edgecolor
all set set None facecolor color
all None set set facecolor edgecolor
bar, barstacked, stepfilled set None None color 'none'
step set None None 'none' color
bar, barstacked, stepfilled None set None facecolor 'none'
step None set None 'none' default color cycle
bar, barstacked, stepfilled None None set default color cycle edgecolor
step None None set 'none' edgecolor
bar, barstacked, stepfilled None None None default color cycle 'none'
step None None None 'none' default color cycle

Have bolded the corrected ones

@Impaler343
Copy link
Contributor Author

I'm unable to fix CircleCI errors for docs. Could someone help me out?

@story645
Copy link
Member

story645 commented Jun 7, 2024

Hi, so the error is in your what's new:

/home/circleci/project/doc/users/next_whats_new/histogram_vectorized_parameters.rst:47: WARNING: Explicit markup ends without a blank line; unexpected unindent.
/home/circleci/project/doc/users/next_whats_new/histogram_vectorized_parameters.rst:51: ERROR: Unexpected indentation.
/home/circleci/project/doc/users/next_whats_new/histogram_vectorized_parameters.rst:64: WARNING: Explicit markup ends without a blank line; unexpected unindent.
/home/circleci/project/doc/users/next_whats_new/histogram_vectorized_parameters.rst:68: ERROR: Unexpected indentation.

error: https://app.circleci.com/pipelines/github/matplotlib/matplotlib/31532/workflows/2533c013-7833-4d0c-ae72-dead7c7fbc76/jobs/83481?invite=true#step-113-207133_109
rendering: https://output.circle-artifacts.com/output/job/2c962d26-1c66-49dd-a987-e5bf25c8fcbe/artifacts/0/doc/build/html/users/next_whats_new/histogram_vectorized_parameters.html

@Impaler343
Copy link
Contributor Author

Impaler343 commented Jun 8, 2024

Ok it turns out the vectorization of colors is somewhat incomplete:

  • When we pass None, calls the color cycle, which is fine.
  • When we pass a list of valid colors, it uses them, which is also fine.
  • When we pass a list of Nones or even if one value in the list is None, it throws an error saying invalid RBGA argument. What can we do in this situation? Should we say that we use the colour cycle even if one value in the list is None and raise a warning? Or call the color cycle's next color whenever a None is encountered in a list? Or simply assume that the user will not pass a list like color= [None, "green", None] and ignore this check?

@story645
Copy link
Member

story645 commented Jun 9, 2024

or simply assume that the user will not pass a list like color= [None, "green", None]

I think this input is ambiguous cause it can mean either:

  1. opt those two lines into the color cycle such that it maps to ['C0', "green", 'C1']
  2. set both lines to the same default ['C0', "green", 'C0']

So I think it's ok to error out (unless/until someone comes around expecting 1 or 2) but we should maybe special case and have a better error message, like if not all(colors) (if any are NaN) "Ambiguous color specification: colors in the list may not be None"

Or something like that.

@Impaler343
Copy link
Contributor Author

The added lines have all been covered, codecov seems to still fail

@@ -4614,6 +4672,10 @@ def test_hist_emptydata():
ax.hist([[], range(10), range(10)], histtype="step")


def test_hist_none_patch():
plt.hist([1, 2], label=["First", "Second"])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
plt.hist([1, 2], label=["First", "Second"])
plt.hist([ ], label=["First", "Second"])

None patch is the empty list/array case

Copy link
Member

@story645 story645 Jun 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And what you probably want to test is

assert len(patches) == 0

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This too works, basically the None condition is hit when we don't have enough datasets to cover each label. This is the doing of itertools.zip_longest

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should i still change it up? As the previous one seems clearer

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I thought that zip_longest would copy the patch to both labels. I think there needs to be some sort of assert that you're test is doing what you think it is - maybe then checking that len(patches) != len(labels)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is len(lbs) here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The number of labels that have been used in the plot

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I meant what were the actual numbers/does this test pass.

But yes if this test passes and coverage improves (add a comment that this test is to cover the if not patch case), then I'll let this concern go.

Copy link
Member

@story645 story645 Jul 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But also assert len(patches) < len(labels) too I think?

Copy link
Contributor Author

@Impaler343 Impaler343 Jul 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes the tests pass, and yes we can add the length(patches) < len(labels) one as well, does the same comparison.
The actual numbers are 1 < 2

@story645
Copy link
Member

story645 commented Jul 1, 2024

Cause I think this maybe got lost, I was suggesting that this construction makes the skip case clearer:

        for patch, lbl in itertools.zip_longest(patches, labels):
            
            if not patch: continue 

            p = patch[0]
            kwargs.update({
                'hatch': next(hatches),
                'linewidth': next(linewidths),
                'linestyle': next(linestyles),
                'edgecolor': next(edgecolors),
                'facecolor': next(facecolors),
            })
            p._internal_update(kwargs)
            if lbl is not None:
                p.set_label(lbl)
            for p in patch[1:]:
                p._internal_update(kwargs)
                p.set_label('_nolegend_')

Copy link
Member

@story645 story645 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all your patience and extra thanks for tracing through and figuring out and testing our histogram color semantics/precedence 😅

I think the plot tests are now at a state where it's easy to see which parameter failed but there's also now only six image tests.

I'm pretty sure that the color semantics test doesn't try to draw anything but there might be a "build only artist tree" function that I'm missing.

Comment on lines 4675 to 4682
def test_hist_none_patch():
# To cover None patches when excess labels are provided
labels = ["First", "Second"]
patches = [[1, 2]]
fig, ax = plt.subplots()
ax.hist(patches, label=labels)
_, lbls = ax.get_legend_handles_labels()
assert (len(lbls) < len(labels) and len(patches) < len(labels))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please elaborate what this tests? What are None patches? Does this express what you want to test?

Suggested change
def test_hist_none_patch():
# To cover None patches when excess labels are provided
labels = ["First", "Second"]
patches = [[1, 2]]
fig, ax = plt.subplots()
ax.hist(patches, label=labels)
_, lbls = ax.get_legend_handles_labels()
assert (len(lbls) < len(labels) and len(patches) < len(labels))
def test_hist_unused_labels():
# When a list with one dataset and N elements is provided and N labels, ensure
# hat the first label is used for the dataset and all other labels are ignored
fig, ax = plt.subplots()
ax.hist([[1, 2, 3]], label=["values", "unused", "also unused"])
_, labels = ax.get_legend_handles_labels()
assert labels == ["values"]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

None patches are the ones generated when the number of labels specified are more than the number of patches(Filled in by zip_longest). The suggested test also covers the line and has clearer labels, ill change it up

@story645
Copy link
Member

story645 commented Jul 9, 2024

#28533 reminded me that this should get a versionadded directive

https://matplotlib.org/devdocs/devel/api_changes.html#announce-new-and-deprecated-api

@QuLogic
Copy link
Member

QuLogic commented Jul 20, 2024

The doc build failure could be fixed by rebasing on main to get fixed CircleCI, but I built it locally and confirmed it builds, so I'm not going to wait for that, and just merge.

@QuLogic QuLogic merged commit ed56f95 into matplotlib:main Jul 20, 2024
36 of 37 checks passed
@QuLogic QuLogic modified the milestones: future releases, v3.10.0 Jul 20, 2024
@Impaler343
Copy link
Contributor Author

I'm so glad that this finally got merged! Thank you for the timely reviews

@QuLogic
Copy link
Member

QuLogic commented Jul 20, 2024

Thanks for keeping up with all the reviews!

timhoffm added a commit to timhoffm/matplotlib that referenced this pull request Aug 2, 2024
- [Hatch filled histogram](https://matplotlib.org/stable/gallery/lines_bars_and_markers/filled_step.html)
  Hatching in histograms is fully supported through matplotlib#28073. This is [now simple](https://matplotlib.org/devdocs/gallery/statistics/histogram_multihist.html#hatch) and does warrant a dedicated example.

- [Percentiles in horizontal bar charts](https://matplotlib.org/stable/gallery/statistics/barchart_demo.html)
  This is a very complex example. But in the end, it's just a bar plot with annotations. While it is statistics-themed through the use of perecntiles, the plot itself has no statistics-related characteristics.
timhoffm added a commit to timhoffm/matplotlib that referenced this pull request Aug 3, 2024
- [Hatch filled histogram](https://matplotlib.org/stable/gallery/lines_bars_and_markers/filled_step.html)
  Hatching in histograms is fully supported through matplotlib#28073. This is [now simple](https://matplotlib.org/devdocs/gallery/statistics/histogram_multihist.html#hatch) and does warrant a dedicated example.

- [Percentiles in horizontal bar charts](https://matplotlib.org/stable/gallery/statistics/barchart_demo.html)
  This is a very complex example. But in the end, it's just a bar plot with annotations. While it is statistics-themed through the use of perecntiles, the plot itself has no statistics-related characteristics.
@Impaler343 Impaler343 deleted the histogram branch August 8, 2024 17:59
@Impaler343 Impaler343 restored the histogram branch August 8, 2024 17:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Documentation: examples files in galleries/examples New feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug]: stacked histogram does not properly handle edgecolor and hatches
7 participants