Skip to content

Allow user to specify colors in violin plots with constructor method #27304

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Jan 30, 2025
Merged

Allow user to specify colors in violin plots with constructor method #27304

merged 13 commits into from
Jan 30, 2025

Conversation

landoskape
Copy link
Contributor

@landoskape landoskape commented Nov 10, 2023

PR summary

This PR adds the option of specifying the fillcolor and linecolor of violinplots. It addresses
issue #27298 (and other discusssions and requests elsewhere).

PR checklist

@landoskape landoskape changed the title Add color specification and pyi docs Allow user to specify colors in violin plots with constructor method Nov 10, 2023
@oscargus
Copy link
Member

oscargus commented Nov 10, 2023

Thanks!

Can you please add a test for this? I think either an image test, e.g. with a number of subplots using different color arguments, or an image comparison test where one check that using the arguments and setting the colors afterwards result in identical images.

Edit: see https://matplotlib.org/stable/devel/testing.html#writing-an-image-comparison-test if more info on the testing is required.

@oscargus oscargus added this to the v3.9.0 milestone Nov 10, 2023
@oscargus
Copy link
Member

I couldn't find a doc-page for that, but in https://github.com/matplotlib/matplotlib/tree/main/doc/users/next_whats_new there are small notes to be included in the next What's New. Please add one for this feature, ideally with an example showing how it works to highlight this new and useful feature.

@landoskape
Copy link
Contributor Author

Thanks!

Can you please add a test for this? I think either an image test, e.g. with a number of subplots using different color arguments, or an image comparison test where one check that using the arguments and setting the colors afterwards result in identical images.

Edit: see https://matplotlib.org/stable/devel/testing.html#writing-an-image-comparison-test if more info on the testing is required.

Test added, I have never written one but I believe this should work as expected.

@landoskape
Copy link
Contributor Author

And now there's a whats new page. I think that covers your requests. Let me know if there's anything else to do! Thanks for your feedback and support.

@story645
Copy link
Member

story645 commented Nov 10, 2023

You might want to install the pre-commit hooks to help with the linting/formatting errors
of the https://matplotlib.org/devdocs/devel/development_setup.html#install-pre-commit-hooks

ETA: Also, thanks for jumping headfirst into this, this is a good feature to have.

@landoskape
Copy link
Contributor Author

I think that's everything @oscargus and @story645. Thanks for your help!

Copy link
Member

@oscargus oscargus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor comments.

Maybe one should change both face- and edgecolor (to different values) in one of the subplots to confirm that it also works?

@landoskape
Copy link
Contributor Author

Maybe one should change both face- and edgecolor (to different values) in one of the subplots to confirm that it also works?

Good point. I added two new subplot axes to the test - one that changes facecolor and edgecolor to different things, and one that checks if color overwrites the other two.

Copy link
Member

@story645 story645 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand if you feel this is out of scope, but since you're already implementing these parameters how do you feel about implementing them vectorized? Meaning that you can also pass in a list of colors/facecolor/edgecolor like in bar?

@landoskape
Copy link
Contributor Author

I understand if you feel this is out of scope, but since you're already implementing these parameters how do you feel about implementing them vectorized? Meaning that you can also pass in a list of colors/facecolor/edgecolor like in bar?

Good idea! That's now implemented with documentation, a test for this feature, and an example for the gallery. It's implemented the same way as it is in the bar method, where a sequence of colors can be passed that has a different length than the data, but the violin method will repeat the colors.

@landoskape landoskape requested a review from story645 November 13, 2023 16:52
@@ -8226,6 +8227,21 @@ def violinplot(self, dataset, positions=None, vert=True, widths=0.5,
its only parameter and return a scalar. If None (default), 'scott'
is used.

facecolor : color or list of colors or None; see :ref:`colors_def`
If provided, will set the face color(s) of the violin plots. The alpha
value is automatically set to 0.3.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying to figure out what the semantics should be to remain consistent with current behavior... It feels wrong to override the alpha value of a user provided color...

Though I will also note that the 0.3 alpha is actually applied to the patch object as a whole, not its color values...

Makes me wonder if the "better" solution would be to just pass **kwargs to lower level artists (But I think that gets complicated quickly because some of them take the "edgecolor" as "color", for instance..., and other values would have to be filtered out likely...)

Copy link
Member

@story645 story645 Nov 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it too much trouble to do a behavior change to 'alpha is now set via the color/must be set explicitly/behaves like alpha in bar/pie/etc?' -> it's a little pain now, but it pulls towards more overall cohesiveness

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point. I think that if the user doesn't specify the alpha, it makes sense to set it to 0.3. That just looks good, and honestly I'd prefer that as someone who wants to use the violinplot method with these new features.

I added a commit that changes the behavior:
If the user specifies the alpha, then that is the alpha that is used. If the user doesn't specify the alpha, then it defaults to 0.3.

Copy link
Member

@timhoffm timhoffm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for being late to the party.

Up to now, styling was not configurable at all. That was simple. When we now start to add options, we should be extremely careful with the API. To be consistent, we need to look towards bar() as well as boxplot().

bar() doc:

  • color for the bar faces
  • edgecolor for the bar edges
  • ecolors for the errorbars
    but in kwargs, we have the standard color, edgecolor, facecolor behavior, where color sets both edge and face. This conflicts with the color doc above. We need to check what's actually happening.

boxplot()

Styling is configured via dicts for the individual parts

  • boxprops
  • capprops
  • whiskerprops
  • flierprops
  • medianprops
  • meanprops
    This allows very detailed configuration, but is cumbersome to use.

violinplot()

We have the following parts

  • bodies
  • cmeans
  • cmins
  • cmaxes
  • cbars
  • cmedians
  • cquantiles

To follow the concept of boxplot(), we would need to make them available via *props, but that's quite a lift, and I'm unsure on the usefulness.

Generally, the proposed global color settings seem useful (and that doesn't mean we cannot add *props later or add color kwargs to boxplot()).

For now the two minimal requirements if have on API consistency are

  • Check the color behavior of bar() and be consistent with it or argue why we should behave differently here.
  • Is edgecolor the correct name? Does it color the bodies edges (if they have a finite width)? Should it?

@landoskape
Copy link
Contributor Author

@timhoffm I can tell you from my perspective as someone who thinks about matplotlib primarily as a user (this is my first PR) I'm very happy to have a plotting function that allows me to make something look good very quickly, but that has the option of more detailed customization if required. The way the violinplot works now (after this PR) seems to fit that criteria. You can make a violin or set of violins with a color scheme that works, in one line of code, then go and edit individual objects later if you need to.

--

For now the two minimal requirements if have on API consistency are

  • Check the color behavior of bar() and be consistent with it or argue why we should behave differently here.

The color behavior between the new violinplot() and bar() are identical in terms of how the colors are handled within the function. However, bar() only has edgecolor and color as inputs. I changed the violinplot() to have edgecolor, facecolor, and color (which overwrites the other two) in response to @ksunden's point in issue #27298. This makes it consistent with patches.

In my opinion, this makes a lot of sense, because a user might want specific control over the lines and patch within the violin plot (i.e. using edgecolor/facecolor), or might want to make everything look the same (i.e. using color). That seems like a perfect compromise between customizability and simplicity. So if anything, maybe bar() should copy this behavior?

Extra point: I think this edgecolor/facecolor/color schema additionally makes sense given the way I plan to handle alpha values, which I'll explain in the response to @ksunden and @story645 in the above thread.

--

  • Is edgecolor the correct name? Does it color the bodies edges (if they have a finite width)? Should it?

I changed the name to edgecolor from linecolor because @ksunden and @story645 pointed out that that's the typical name for the rest of the library. I agree with that reasoning.

As it is, the bodies by default do not have an edgeline for violinplots. This is how it was before and still is. I think this is great, because it creates a very clean looking plot, and if the user wants to, they can go in and add an edge to the bodies. We can add this as an example to the customized violin plot in the gallery for the interested user if you think that's a good idea.

@ksunden
Copy link
Member

ksunden commented Nov 15, 2023

I guess I had it in my head when I made the original suggestion that the main shape were by default drawn with edges, which isn't actually true.

I think Tim is right that bar is the closer analog here...

I do think that the most full solution would be to do something like bodyprops=dict(...), cmeanprops=dict(...), etc... (naming up to debate, as they are pluralized in the returns)... but I also agree that it is probably a bigger lift than we really want.) That style is explicit and flexible (and something we do use in other parts of the library, particularly when there are multiple places **kwargs could be routed to, including boxplot, which is the closest analog to violinplot), but not necessarily the most friendly to use.

box has color (defaults to next patch color), edgecolor (defaults to None, which translates to the same as facecolor, I think) as well as ecolor (defaults to black, used for error bars).

color is either a single color or a list of colors which is cycled.

@landoskape
Copy link
Contributor Author

landoskape commented Nov 15, 2023

Not to complicate things further but regarding the earlier discussion it also may be helpful to include an argument like facealpha. I'm happy to keep working on it (although I think I may tap out if the lift to bodyprops=dict(...) is requested). But I think these decisions are above my pay grade so let me know what you all think.

Also, I'm not sure I understand why the tests failed.
The python 3.9 on ubuntu-20.04 (with and without Minimum Versions) failed because:
FAILED lib/matplotlib/tests/test_backends_interactive.py::test_figure_leak_20490[time_mem0-{'MPLBACKEND': 'wxagg'}]

And the Main Pytest Windows_py39 failed because:
FAILED lib/matplotlib/tests/test_pickle.py::test_pickle_load_from_subprocess[png]

But I'm not sure why that is happening or how it relates to the commits I've made.

@story645
Copy link
Member

it also may be helpful to include an argument like facealpha.

We added support for (color, alpha) as a valid color specification precisely to support specifying alpha on specific components w/o growing kwargs #24691

@oscargus
Copy link
Member

how it relates to the commits I've made.

It shouldn't be directly related. While it is weird that it seems to happen consistently on this PR (and some others), there is no obvious reason why it happens.

@landoskape
Copy link
Contributor Author

it also may be helpful to include an argument like facealpha.

We added support for (color, alpha) as a valid color specification precisely to support specifying alpha on specific components w/o growing kwargs #24691

Yep. I now removed the alpha handling so the alpha will just directly be controlled by facecolor without defaults to alternative values.

@landoskape
Copy link
Contributor Author

I just wanted to ping the reviewers in case there is continued interest in this PR. I'm happy to continue working on the changes to the violinplot constructor, but there was no clear agreement on how best to manage the additional color arguments.

It seems to me like there is tension between which other plotting method violinplot should be consistent with -- should it be similar to barplot or boxplot (or something else)? And -- should the kwargs be expanded to include parameter control over each feature of the violinplot -- e.g. bodies, cmeans, cmins, etc. or just have a single set of kwargs to control each feature of the plot.

In my opinion as a user, I'm very happy to have a few simple kwargs (like in barplot, i.e. just having color, edgecolor, facecolor) and letting matplotlib apply them to each feature of the plot so it looks good with minimal effort. The output of the constructor still provides the option of using the feature dictionaries for further specification if desired by a user.

@story645
Copy link
Member

story645 commented Jan 18, 2024

@landoskape thanks for following up. We've got a developers call Thursday at I think 8:00PM UTC) where I think it might be good to discuss this - please join if you can make it: https://scientific-python.org/calendars/

I tend to agree w/ @landoskape on keeping it simple: color/face/edge & if folks want more complicated then they can pop into the artist layer.

@story645
Copy link
Member

story645 commented Jan 18, 2024

So looking at how seaborn just exposes body color, @mwaskom is one proposal (bar plot vs box plot) preferable?

seaborn.violinplot(data=None, *, x=None, y=None, hue=None, order=None, hue_order=None, orient=None, color=None, palette=None, saturation=0.75, fill=True, inner='box', split=False, width=0.8, dodge='auto', gap=0, linewidth=None, linecolor='auto', cut=2, gridsize=100, bw_method='scott', bw_adjust=1, density_norm='area', common_norm=False, hue_norm=None, formatter=None, log_scale=None, native_scale=False, legend='auto', scale=<deprecated>, scale_hue=<deprecated>, bw=<deprecated>, inner_kws=None, ax=None, **kwargs)

@story645 story645 mentioned this pull request Feb 14, 2024
@landoskape
Copy link
Contributor Author

How does this look?

# Ensures that setting colors in violinplot constructor works
# the same way as setting the color of each object manually
np.random.seed(19680801)
data = [sorted(np.random.normal(0, std, 100)) for std in range(1, 5)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
data = [sorted(np.random.normal(0, std, 100)) for std in range(1, 5)]
data = [sorted(np.random.normal(0, std, 100)) for std in range(1, 2)]

Two violins should be enough. We don't get more information out of more violins. It only takes more computation and the image is more cramped.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fair point. I made it 3 so that we can set facecolor&linecolor, just facecolor, or just linecolor

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. please have a quick visual check of the generated figure to ensure the violins have still a decent width (you have 5 subplots next to each other).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks good with just 3 subplots (you can clearly see the details of each violin)

@timhoffm
Copy link
Member

The code is good. Left some comments on documentation and tests.

landoskape and others added 2 commits January 28, 2025 08:16
Co-authored-by: Tim Hoffmann <2836374+timhoffm@users.noreply.github.com>
Co-authored-by: Tim Hoffmann <2836374+timhoffm@users.noreply.github.com>
@timhoffm timhoffm dismissed stale reviews from story645 and themself January 28, 2025 08:38

Changes addressed

@timhoffm timhoffm removed the status: needs comment/discussion needs consensus on next step label Jan 28, 2025
Minor formatting and code cleanup
@timhoffm
Copy link
Member

I've added some minor formatting and code cleanup. See last commit.

@landoskape
Copy link
Contributor Author

Looks good, thanks for the cleanup! that facecol variable name was there from before I didn't have the heart to change it :)

@timhoffm timhoffm merged commit 96f5603 into matplotlib:main Jan 30, 2025
41 checks passed
@timhoffm
Copy link
Member

@landoskape Thanks for the patience to bring this through!

@landoskape
Copy link
Contributor Author

Woot! Thanks for working with me on it. Glad it's part of matplotlib now

@story645
Copy link
Member

@timhoffm I was in the middle of reviewing this, should I just do a follow up PR?

@timhoffm
Copy link
Member

@story645 sorry that was not obvious to me. Please make a followup PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Waiting for author
Development

Successfully merging this pull request may close these issues.

[ENH]: Add color argument to violinplot constructor
8 participants