ENH: Add grouped_bar() method #28560

timhoffm · 2024-07-13T08:58:18Z

This is a WIP to implement #24313. It will be updated incrementally. As a first step, I've designed the data and label input API.

Please check the included example, which explains the API variants and motivations/consistency considerations. It's a WIP for the illustration and discussion of the API, and will not go into the final PR.

I've flagged this as "work in progress" because many aspects are not implemented yet. **But please comment, as I consider the implemented parts a complete proposal for those aspects.**

The API proposal is complete. Please check whether it is reasonable. You can get an overview from the preliminary overview example and from the API docs for grouped_bar. Design decisions and background are listed right below in the todos:

👉 Ready for review

What's new
API docs
Updated example, which is now much simpler.

Todos:

story645 · 2024-07-15T18:51:47Z

For now, only categorical labels are supported as x. Should we also support numbers as with bar?

Am thinking of a bunch of situations (date times, coded data, machine learning labels) where the numbers are really categoricals but the dtype isn't string and I think folks will expect it to just work. Also the categorical remapping mechanism isn't terribly robust 😓 so I could see folks wanting to just use numbers + formatter if they're trying to do something like animated/interactive sorting of bars (something like bar chart race) where the position of the bar is actually independent of its label as categorical will maintain the same cat->int mapping as stuff is added to the chart. Which tldr, yes numbers b/c that allows for a lot of fine tuning of bar position.

timhoffm · 2024-07-15T20:13:43Z

Added support for numeric values for x. I've required that x is equidistant, because otherwise positioning will be quite messy.

if they're trying to do something like animated/interactive sorting of bars (something like bar chart race) where the position of the bar is actually independent of its label as categorical will maintain the same cat->int mapping as stuff is added to the chart.

I've not thought this case through, but remember that grouped_bar() is just a convenience interface for positioning multiple sets of data. One boundary condition is that each group (=bars at one category/tick position) has the same datasets in the same order. If you need something exotic, you'll have to go back to drawing individual bars.
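
To make the positioning idea concrete, here is a minimal, library-independent sketch of how bar centers within each group could be computed, including an equidistance check on numeric x. The function name `grouped_bar_positions` and the `group_spacing` parameter are assumptions made for illustration; this is not the implementation in this PR.

```python
def grouped_bar_positions(x, num_datasets, group_spacing=1.5):
    """Return (bar_width, positions) for a grouped bar layout.

    positions[i][j] is the center of the bar of dataset i in group j.
    group_spacing is the gap between neighboring groups, measured in
    bar widths (a hypothetical parameter for this sketch).
    """
    if len(x) > 1:
        steps = [b - a for a, b in zip(x, x[1:])]
        tolerance = 1e-9 * max(1.0, abs(steps[0]))
        if any(abs(s - steps[0]) > tolerance for s in steps):
            raise ValueError("x must be equidistant")
        group_distance = steps[0]
    else:
        # A single group has no neighbor; assume unit distance.
        group_distance = 1.0

    # Each group holds num_datasets bars plus the gap to the next group.
    bar_width = group_distance / (num_datasets + group_spacing)
    # Offsets are symmetric around the group center.
    offsets = [(i - (num_datasets - 1) / 2) * bar_width
               for i in range(num_datasets)]
    positions = [[xj + offset for xj in x] for offset in offsets]
    return bar_width, positions
```

Every group uses the same offsets in the same dataset order, which is the boundary condition described above.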

story645 · 2024-07-15T21:17:04Z

If you need something exotic, you'll have to go back to drawing individual bars

I don't think this is technically feasible, but this has me thinking it'd be nice if the return object was a 2D bar container that supported slicing out all the segments at a given X or all the segments of a group for easier consistent updates.
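
A rough sketch of what such a container could look like, purely to illustrate the slicing idea. `BarGrid`, `at_x`, and `dataset` are hypothetical names, not existing Matplotlib API, and plain strings stand in for the bar artists. Reading "group" in the comment above as "one dataset" is an interpretation.

```python
class BarGrid:
    """Hypothetical 2D view over bars: rows are datasets, columns are x positions."""

    def __init__(self, rows):
        # rows[i][j] is the bar of dataset i at x position j.
        self._rows = [list(row) for row in rows]
        if len({len(row) for row in self._rows}) > 1:
            raise ValueError("all datasets need the same number of bars")

    def dataset(self, i):
        """All bars of one dataset, across every x position."""
        return list(self._rows[i])

    def at_x(self, j):
        """All bars at one x position, one per dataset."""
        return [row[j] for row in self._rows]


# Strings stand in for bar artists in this sketch.
grid = BarGrid([["a0", "a1", "a2"], ["b0", "b1", "b2"]])
```

With this layout, `grid.at_x(1)` collects the bars at the second x position and `grid.dataset(0)` collects every bar of the first dataset.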

timhoffm · 2024-09-11T13:59:21Z

The API proposal is complete. Please check whether it is reasonable. You can get an overview from the preliminary overview example and from the API docs for grouped_bar. Design decisions and background are listed in the first comment of this PR. Feedback is welcome!

tacaswell · 2024-09-13T02:29:45Z

A couple of quick comments:

thoughts on how this could be extended if (when) we implement hierarchical tick labels?
for the return did you consider a dict? It might be worth considering a minimal custom class (__getitem__ by the labels and obj.remove()) as well? The two things I think are important are the by-label access (particularly when user passed us a dict to begin with) and the ability to remove the whole lot in one shot. Do we want to return a new compound artist (that legit lives in the tree)?
is there a way to control styling by label (e.g. put a different hatch on each group)? I'm not sure this is useful, but also not sure it is not useful. "grab the returned object and do it your self" is also a good answer for now.
the first example in the docs should be turned into a figure comparison test to get at least some coverage?

timhoffm · 2024-09-13T07:11:42Z

thoughts on how this could be extended if (when) we implement hierarchical tick labels?

Not thought about it. But the fundamental API grouped_bar(x, heights) should be able to bear this. Likely with a hierarchical x and with or without substructuring in heights.

Side note: There is one aspect I've been pondering: We could reverse the positional argument order to grouped_bar(heights, x). We have kind of a precedent for that in stairs. This would allow us to (i) leave out x entirely for a quick look (use range() as default) and (ii) more importantly make it possible to pass a complex object grouped_bar(dataframe), which contains all the labelling information; this could then also be hierarchical. I've refrained from it to keep more analogy to bar. After all, our API is mostly low-level and component-based. I believe that to change that, it would be better to add a data preprocessing decorator that takes complex objects and decomposes them into the data components.
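
As an illustration of the preprocessing-decorator idea, here is a minimal sketch. Everything in it is hypothetical: `decompose_table` and `low_level_grouped_bar` are made-up names, and a plain dict of dicts (`{label: {x: height}}`) stands in for a dataframe so the example has no dependencies. It is not a proposal for the actual API.

```python
import functools


def decompose_table(func):
    """Hypothetical decorator: accept one mapping {label: {x: height}}
    in place of the separate (x, heights) arguments."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        if len(args) == 1 and isinstance(args[0], dict):
            table = args[0]
            labels = list(table)
            # Take the x values from the first dataset; this sketch
            # assumes all datasets share the same x values.
            x = list(table[labels[0]])
            heights = [[table[label][xi] for xi in x] for label in labels]
            kwargs.setdefault("labels", labels)
            return func(x, heights, **kwargs)
        # Anything else is passed through to the low-level function.
        return func(*args, **kwargs)

    return wrapper


@decompose_table
def low_level_grouped_bar(x, heights, labels=None):
    # Stand-in for a component-based plotting function; it only echoes
    # its inputs so the decomposition can be inspected.
    return x, heights, labels
```

The low-level signature stays component-based, and the complex-object handling lives entirely in the wrapper.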

for the return did you consider a dict? It might be worth considering a minimal custom class (__getitem__ by the labels and obj.remove()) as well? The two things I think are important are the by-label access (particularly when user passed us a dict to begin with) and the ability to remove the whole lot in one shot. Do we want to return a new compound artist (that legit lives in the tree)?

We have no precedent for plotting functions returning dicts. Additionally, one can plot without passing labels, which would require us to invent keys. So dict is not an option to me. Mid-term, I'd like to have better return objects, but this is a larger topic, which I don't want to figure out in this PR. I see the list of Container as a temporary approach - one of the main reasons to make this provisional. It is certainly not the most elegant approach, but it's consistent (cf. stackplot()), simple and gives users a grip on the underlying artists. Given that re-using the Artists is quite rare, that's good enough for now. I will add an explicit note that this is likely to change.

is there a way to control styling by label (e.g. put a different hatch on each group)? I'm not sure this is useful, but also not sure it is not useful. "grab the returned object and do it your self" is also a good answer for now.

No. I've never seen this. Generally, groups are distinguished by location (and identified by x tick labels), and there's a point in that the bars of one dataset look the same over all groups to help make the connection. Per-group settings is not something I would want to encourage through API (though we could later introduce a parameter dict group_props if you convince me that's really a valid use case 😃). For now: Do it yourself.

the first example in the docs should be turned into a figure comparison test to get at least some coverage?

Yes, as the todo in the initial PR shows, tests are still open. I'm collecting feedback first so that I don't have to rewrite all tests in case I have to overhaul the design.

tacaswell · 2024-09-13T13:46:56Z

We could reverse the positional argument order to grouped_bar(heights, x). We have kind of a precedent for that in stairs.

I agree that is interesting, but as you point out could be solved by layers on top of this.

Additionally, one can plot without passing labels, which would require us to invent keys.

We could fall back to range() or [f"dataset_{i}" for i in range()], but I agree that is less than great. That would probably mean we would also force those invented labels into the legend, which is not great. I'm convinced we should not return a dict (but a bit sad about it).

Rather than a list could we do a custom object like:

class GroupedBarReturn:
    def __init__(self, bars):
        # bars: the bar containers created by the call
        self.bars = bars

    def remove(self):
        # Remove all bars in one shot.
        for b in self.bars:
            b.remove()

? This would also give us a reasonable expansion path if this API grew to add errorbars or annotations to the bars. Also a path to obj.by_labels[key]
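
One way the by-label expansion path could look, as a hedged sketch only. `GroupedBarReturnSketch` and `FakeContainer` are hypothetical names, and `FakeContainer` merely stands in for a real bar container so the example runs without Matplotlib.

```python
class GroupedBarReturnSketch:
    """Hypothetical extension of the class above with by-label access."""

    def __init__(self, bars, labels=None):
        self.bars = list(bars)
        # Build the lookup only when labels were passed, so no keys
        # have to be invented for unlabeled datasets.
        self.by_labels = (
            dict(zip(labels, self.bars)) if labels is not None else {}
        )

    def remove(self):
        # "Make it go away" for everything the call created.
        for b in self.bars:
            b.remove()


class FakeContainer:
    """Stand-in for a bar container; records whether remove() was called."""

    def __init__(self):
        self.removed = False

    def remove(self):
        self.removed = True
```

Keeping `bars` as the primary attribute means positional access keeps working when no labels are given, while `by_labels` is purely additive.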

Even though it is provisional, I'm a bit worried about returning an object with an API that would be annoying to replicate (e.g. not fall into the trap of ax.errorbar).

This may be a bigger discussion, but I think we should try to make everything we return from plotting methods to have a .remove method so "make it go away" is always the same.

That said, I'm not going to push this further and returning a list is acceptable to merge.

No. I've never seen this....For now: Do it yourself.

sounds reasonable.

timhoffm · 2024-09-13T14:24:49Z

Edit: I think I've reasonably addressed the doc build failure.

~~Technical topic: I don't understand the doc failure:~~

/home/circleci/project/doc/api/axes_api.rst:294:<autosummary>:1: WARNING: py:obj reference target not found: Axes.dataLim [ref.obj]
/home/circleci/project/doc/api/axes_api.rst:449:<autosummary>:1: WARNING: py:obj reference target not found: AxesBase [ref.obj]

The call stack goes through the .. autosummary:: sections of axes_api.rst, ending at the stated line numbers. They try to autogenerate the doc pages for update_datalim and add_child_axes but fail to resolve the following object references:

matplotlib/lib/matplotlib/axes/_base.py

Line 2527 in 9365c97

Extend the `~.Axes.dataLim` Bbox to include the given points.

and

matplotlib/lib/matplotlib/axes/_base.py

Line 2268 in 9365c97

Add an `.AxesBase` to the Axes' children; return the child Axes.

What's funny is that this PR does not touch the respective docstrings or targeted objects. Also, the only modification of axes_api.rst itself is the addition of grouped_bar (in another autosummary).

It seems that the failure is not related to the PR but to some external influence (change of sphinx?). The failure may even be expected, because both references use AxesBase, which is not in our documentation, and the stable/devdocs render them as unresolved references. But why does it happen here and now? Other PRs do not see the problem. Do we have caching of the doc builds, and I'm the first to touch axes_api.rst to trigger regeneration?

story645 · 2024-09-13T16:39:11Z

more importantly make it possible to pass a complex object grouped_bar(dataframe), which contains all the labelling information

Also would make it trivial to transpose, which honestly is one of the best parts of the pandas API.

Which also, I think my request for a BarContainerArray and Tom's for a dict-based return are both pointing towards a data-frame-type artist container - which yes, I can see being out of scope here. Also, I think you proposed something like this once for Axes so you could do both mosaic-label-based and index-based access.

These were not rendered as links in the current docs, because the associated code objects do not exist as targets in the docs. For some reason, sphinx did not complain about it. But it does in some recent PRs (extracted from matplotlib#28560) - I'm unclear why, but anyway the correct solution is to change the references:

- `AxesBase` is not documented itself and in the given context, we don't have to make the specific distinction, so just link `Axes`
- The datalim attribute is documented as part of the Axes class, but does not have a dedicated target to link to. So we simply use a literal instead of a link.

timhoffm · 2024-09-16T09:43:58Z

Changed to a provisional custom return object as proposed in #28560 (comment) to ease possible future migration.

timhoffm · 2025-01-24T00:26:39Z

@story645 A big thank you for a thorough round of review 🎉.

I've either changed the commented positions or replied why I think they should be kept as is. Please mark as resolved the comments where you are ok with my reply, so that only the open topics remain visible.

story645

take or leave the table tweak

lib/matplotlib/axes/_axes.py

lib/matplotlib/tests/test_axes.py

lib/matplotlib/axes/_axes.py

lib/matplotlib/axes/_axes.pyi

Co-authored-by: hannah <story645@gmail.com>

timhoffm added the status: work in progress label Jul 13, 2024

github-actions bot added the Documentation: examples files in galleries/examples label Jul 13, 2024

timhoffm force-pushed the grouped_bar branch from 57a38c7 to 75d8852 Compare July 13, 2024 09:29

timhoffm force-pushed the grouped_bar branch from 4e4c012 to 7241d40 Compare July 17, 2024 06:49

timhoffm force-pushed the grouped_bar branch 3 times, most recently from 93c7651 to 5bab18d Compare September 11, 2024 11:42

github-actions bot added the topic: pyplot API label Sep 11, 2024

timhoffm force-pushed the grouped_bar branch from 5bab18d to 128b509 Compare September 11, 2024 11:46

github-actions bot added the Documentation: API files in lib/ and doc/api label Sep 11, 2024

timhoffm removed the status: work in progress label Sep 11, 2024

tacaswell added this to the v3.10.0 milestone Sep 13, 2024

github-actions bot added the topic: axes label Sep 13, 2024

timhoffm force-pushed the grouped_bar branch from b0f2f34 to 08e154d Compare September 13, 2024 12:55

github-actions bot removed the topic: axes label Sep 13, 2024

github-actions bot added the topic: axes label Sep 13, 2024

timhoffm force-pushed the grouped_bar branch from e12c91f to c62213f Compare September 13, 2024 15:04

timhoffm mentioned this pull request Sep 16, 2024

DOC: Fix non-working code object references #28825

Merged

timhoffm force-pushed the grouped_bar branch from 6acdf0e to 4cc0e31 Compare January 24, 2025 00:19

timhoffm force-pushed the grouped_bar branch 3 times, most recently from 162ce79 to dfcd52f Compare January 24, 2025 22:28

timhoffm force-pushed the grouped_bar branch from 41650b0 to 6a9dfc8 Compare February 5, 2025 18:46

story645 approved these changes Feb 5, 2025

lib/matplotlib/axes/_axes.py Outdated

timhoffm added New feature and removed topic: pyplot API Documentation: examples files in galleries/examples Documentation: API files in lib/ and doc/api labels Feb 11, 2025

timhoffm added the status: needs review label Apr 8, 2025

QuLogic reviewed May 30, 2025

timhoffm and others added 7 commits June 2, 2025 15:50

ENH: Add grouped_bar() method

fea06c9

Add tests for grouped_bar()

556895d

Simplify "Grouped bar chart with labels" using grouped_bar()

900ace5

Apply suggestions from code review

7fa82d7

Co-authored-by: hannah <story645@gmail.com>

Docstring wording

8cf06c4

Update lib/matplotlib/axes/_axes.py

a4ed768

Co-authored-by: hannah <story645@gmail.com>

Add test for grouped_bar() return value

d9aa5f6

timhoffm force-pushed the grouped_bar branch from 9e17cad to d9aa5f6 Compare June 2, 2025 14:15

github-actions bot added topic: pyplot API Documentation: examples files in galleries/examples Documentation: API files in lib/ and doc/api labels Jun 2, 2025

Apply suggestions from code review

e0afe74

QuLogic approved these changes Jun 3, 2025

QuLogic merged commit 17653a6 into matplotlib:main Jun 5, 2025
40 checks passed

QuLogic removed the status: needs review label Jun 5, 2025

timhoffm deleted the grouped_bar branch June 5, 2025 20:32

QuLogic mentioned this pull request Jun 21, 2025

[ENH]: API discussion for grouped bar charts #24313

Closed
