-
-
Notifications
You must be signed in to change notification settings - Fork 7.9k
ENH: Add grouped_bar() method #28560
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Am thinking of a bunch of situations (date times, coded data, machine learning labels) where the numbers are really categoricals but the dtype isn't string and I think folks will expect it to just work. Also the categorical remapping mechanism isn't terribly robust 😓 so I could see folks wanting to just use numbers + formatter if they're trying to do something like animated/interactive sorting of bars (something like bar chart race) where the position of the bar is actually independent of its label as categorical will maintain the same cat->int mapping as stuff is added to the chart. Which tldr, yes numbers b/c that allows for a lot of fine tuning of bar position. |
Added numeric values support for x. I've required that x ist equidistant, becuause otherwise positioning will be quite messy.
I've not thought this case though, but remember that |
I don't think this is technically feasible, but this has me thinking it'd be nice if the return object was a 2D bar container that supported slicing out all the segments at a given X or all the segments of a group for easier consistent updates. |
93c7651
to
5bab18d
Compare
5bab18d
to
128b509
Compare
The API proposal is complete. Please check whether it is reasonable. You can get an overview the preliminary overview example |
A couple of quick comments:
|
Not thought about it. But the fundamental API Side note: There is one aspect, I've been pondering: We could revese the positional argument order to
We have no precedence that plotting functions return dicts. Additionally, one can plot without passing
No. I've never seen this. Generally, groups are distinguished by location (and identified by x tick labels), and there's a point in that the bars of one dataset look the same over all groups to help make the connection. Per-group settings is not something I would want to encourage through API (though we could later introduce a parameter dict
Yes, as the todo in the intial PR shows, tests are still open. I'm collecting feedback before so that I don't have to rewrite all tests in case I would have to overhaul the design. |
b0f2f34
to
08e154d
Compare
I agree that is interesting, but as you point out could be solved by layers on top of this.
We could fallback to Rather than a class GroupedBarReturn:
def __init__(self, bars):
self.bars = bars
def remove(self):
[b.remove() for b in self.bars] ? This would also give us a reasonable expansion path if this API grew to add errorbars or annotations to the bars. Also a path to Even though it is provisional, I'm a bit worried about returning an object with an API that would be annoying to replicate (e.g. not fall into the trap of This may be a bigger discussion, but I think we should try to make everything we return from plotting methods to have a That said, I'm not going to push this further and returning a list is acceptable to merge.
sounds reasonable. |
Edit: I think I've reasonbly addressed the doc build failure.
The call stacks goes through the matplotlib/lib/matplotlib/axes/_base.py Line 2527 in 9365c97
and matplotlib/lib/matplotlib/axes/_base.py Line 2268 in 9365c97
What's funny is that this PR does not touch the respective docstrings or targeted objects. Also, the only modification of
|
e12c91f
to
c62213f
Compare
Also would make it trivial to transpose, which honestly is one of the best parts of the pandas API. Which also, I think my request for an BarContainerArray and Tom's for a dict based return are both pointing towards a data frame type artist container - which yes I can see being out of scope here and also I think you proposed something like this once for Axes so you could do both mosaic label based and index based |
These were not rendered as links in the current docs, because the associated code objects do not exist as targets in the docs. For some reason, sphinx did not complain about it. But it does in some recent PRs (extracted from matplotlib#28560) - I'm unclear why, but anyway the correct solution is to change the references: - `AxesBase` is not documented itself and in the given context, we don't have to make the specific distinction, so just link `Axes` - The datalim attribute is documented as part of the Axes class, but does not have a dedicated target to link to. So we simply use a literal instead of a link.
Changed to a provisional custom return object as proposed in #28560 (comment) to ease possible future migration. |
These were not rendered as links in the current docs, because the associated code objects do not exist as targets in the docs. For some reason, sphinx did not complain about it. But it does in some recent PRs (extracted from matplotlib#28560) - I'm unclear why, but anyway the correct solution is to change the references: - `AxesBase` is not documented itself and in the given context, we don't have to make the specific distinction, so just link `Axes` - The datalim attribute is documented as part of the Axes class, but does not have a dedicated target to link to. So we simply use a literal instead of a link.
dcd253e
to
3825074
Compare
From my perspective, this is ready for review.
|
7cf1e6b
to
5e82fbb
Compare
@story645 A big thank you for a though round of review 🎉. I've either changed the commented positions or replied why I think they should be kept as is. Please mark the comments as resolved on which you are ok with my reply so that only the open topics remain visible. |
e023e0e
to
162ce79
Compare
Co-authored-by: hannah <story645@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
take or leave the table tweak
Co-authored-by: hannah <story645@gmail.com>
This is a WIP to implement #24313. It will be updated incrementally. As a first step, I've designed the data and label input API.
Please check the included example, which explains the API variants and motivations/consistency considerations. It's a WIP for the illustration and discussion of the API, and will not go into the final PR
I've flagged as "work in progress" because many aspects are not implemented yet. **But please give comments as I consider the implemented parts as complete proposal for that aspects.**The API proposal is complete. Please check whether it is reasonable. You can get an overview the preliminary overview example and in the API docs forgrouped_bar
. Design decisions and background are listed in right below in the todos:👉 Ready for review
Todos:
API: How to structure and interpret the two data dimensions and the associated labels.
For now, only categorical labels are supported as x.Should we also support numbers as with bar?Numbers are now also supported.
How to parametrize bar width and group spacing?
A bar group takes horizontal space w (default: 1).
I believe explicit bar widths are not helpful because they can easily overflow if enough bars are in a group, and "reasonable" values need fine-tuning for per number of bars in a group. I therefore suggest to define the spacing (margin / padding; name t.b.d.) in relative units. Options:
pad=0.1
would mean a bar group targeting the interval(x-0.5*w, x+0.5*w)
would have the outer bar edges at(x-(0.5-pad*w), x+(0.5-pad*w) =
(x-0.4w, x+0.4w)`.pad=1
would mean that the bar width is chosen so that there's a padding of 1-bar width on both sides of the group.I believe, we can choose reasonable defaults for both cases. It's possibly easier to tune a desired look with the second approach, but harder if one wants to fully control positions.
Decision: Implemented as multiples of bar width
Do we want to support spacing between the bars in a group?
Yes. Spacing should follow the same relative conventions as group padding.
Since we have num_bars times the spacing "spacing in units of w will require tuning depending on the number of bars.
Spacing would be universal when given as "fraction of bar width".
When defaulting to 0 (my slight preference) both variants will work for any number of bars by default.
How to support horizontal bars?
Use
orientation
parameter. Replicating to a separate functiongrouped_barh
is overly cumbersome.Do we want to support stacked bars? And grouped stacked bars? - Not for now/here.
While stacked bars could use the same input datastructures, other plotting parameters like width and spacing settings do not match, and some
bar()
flexibility like varying width will not be available viagrouped_bar()
. I thus recommend not to try and squeeze stacked bars in togrouped_bar()
. Stacking bars is already much simpler with the existingbar()
methods compared to grouping bars.bar()
is thus more sufficient. If we want a higher level interface, a separatestacked_bar()
would be a thin wrapper aroundbar()
. I don't see a reasonable function for stacked and grouped bars, because the addtional dimension of data and labelling would make the interface very difficult. If you need stacked and grouped bars, you should use a loop ofgrouped_bar()
with adjustedbottom
.support kwargs (Rectangle properties)
support vectorized style parameters - at least colors
optional: support
bottom
values, so that one could stack (but could be added later)dataset labels naming?
label
as inbar()
, orlabels
ordataset_labels
or …Note:
stackplot()
useslabels
. This would be an argument for consistency. What's different:We potentially have two labels: dataset labels and group labels. This may warrant deviating from the simple
labels
apporach to disambiguate them. OTOH, I feel "dataset" is quite a generic name (but I don't have any better) so that
it does not make things much clearer. The group labels are implicit in x (or historically in bar(), called
tick_labels
.I'm going with
labels
because itsstackplot()
andbar()
(which useslabel
)label
add pyplot wrapper
add mypy stubs
improve documentation
add tests
Should we mark this as provisional? This would allow to react to user feedback. -> Yes.
What is the return type? List of
BarContainer
, a newBarGroupContainer
, something else, nothing for a start (because we can't decide yet)?Going with
list[BarContainer}
. This is most simple and in analogy to whatstackplot()
does. Since the API is provisional, we can still reconsider.