Skip to content

[MNT]: #28701 separate the generation of polygon vertices in fill_between to enable resampling #28702

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

cmp0xff
Copy link
Contributor

@cmp0xff cmp0xff commented Aug 10, 2024

PR summary

closes #28701:

  • In https://matplotlib.org/stable/gallery/event_handling/resample.html, it is shown that one can resample lines by inserting the underlying line coordinates.
  • If we want to do the same resampling for fill_between, we can extract functionalities to generate the polygon coordinates only.
  • We implemented a new class FillBetweenPolyCollection to collect related functionalities.
  • We refactored the code related to fill_between to make it easier to understand.

PR checklist

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for opening your first PR into Matplotlib!

If you have not heard from us in a week or so, please leave a new comment below and that should bring it to our attention. Most of our reviewers are volunteers and sometimes things fall through the cracks.

You can also join us on gitter for real-time discussion.

For details on testing, writing docs, and our review process, please see the developer guide

We strive to be a welcoming and open project. Please follow our Code of Conduct.

@cmp0xff cmp0xff changed the title feat: #28701 separate the generation of polygon vertices feat: #28701 separate the generation of polygon vertices in fill_between to enable resampling Aug 10, 2024
@jklymak
Copy link
Member

jklymak commented Aug 11, 2024

I'm somewhat skeptical we would expose this publicly. The point here is to want to be able to change the x and y data. If I were to do this I'd change the polygon patch to a polygon patch subclass and have set_x/ydata methods that do the "right thing".

@cmp0xff
Copy link
Contributor Author

cmp0xff commented Aug 11, 2024

Hi @jklymak thank you for the comment.

change the polygon patch to a polygon patch subclass and have set_x/ydata methods that do the "right thing".

I am not sure what you mean here, especially the "polygon patch". Could you elaborate?

Furthermore, would it be more acceptable if we make separate methods of make_verts_x and make_verts_y, just like there have been separate methods of fill_between and fill_betweenx?

@rcomer
Copy link
Member

rcomer commented Aug 11, 2024

Currently fill_between returns a PolyCollection. If you subclass the PolyCollection then that subclass could have an equivalent of make_verts on it which could be used both in its instantiation and in new set_xdata, set_ydata methods. Then fill_between could return this new artist.

I think that is what @jklymak meant, and he just temporarily forgot that it's a collection rather than an single patch.

@cmp0xff
Copy link
Contributor Author

cmp0xff commented Aug 11, 2024

Hi @rcomer thank you for the reply.

  • Currently there is a set_verts that can be used to update collections, and plays the role of set_data in the example. set_verts seems to accept numpy arrays or lists.
  • If we do not add a subclass, there can be two advantages:
    1. We can minimise the effort and change of code, while it still works.
    2. The interfaces are in parallel for resampling Line2D and resampling PolyCollection.
    • for Line2D, we use set_data and insert a Sequence or np.ndarray of x and y coordinates
    • for PolyCollection, we use set_vert and insert a Sequence or np.ndarray of vertices
  • What I am doing here is to provide an additional interface, to generate an input for set_vert, from the input of fill_between.

@rcomer
Copy link
Member

rcomer commented Aug 11, 2024

Line2D.set_data call its set_xdata and set_ydata methods

def set_data(self, *args):
"""
Set the x and y data.
Parameters
----------
*args : (2, N) array or two 1D arrays
See Also
--------
set_xdata
set_ydata
"""
if len(args) == 1:
(x, y), = args
else:
x, y = args
self.set_xdata(x)
self.set_ydata(y)

So for API consistency it would make sense to have a collection that implements all three (set_data, set_xdata and set_ydata). I would expect those to ultimately call the existing set_verts.

I agree that it would be more work to implement a new subclass, but I do not think that is more important than providing a consistent API for users. Are there any current axes methods that purely translate data to be used in a specific type of plot?

@rcomer
Copy link
Member

rcomer commented Aug 11, 2024

Perhaps another option would be to put something in cbook? That has e.g. boxplot_stats which is factored out of boxplot so you can separately do calculations and plotting.

@cmp0xff cmp0xff marked this pull request as draft August 11, 2024 14:36
@jklymak
Copy link
Member

jklymak commented Aug 11, 2024

Yes there is no way we would allow a helper like this to be a method on Axes. A subclass is relatively trivial and could admit an API similar to the fill_between API

@cmp0xff
Copy link
Contributor Author

cmp0xff commented Aug 11, 2024

Hi, thank you for the discussions. I would like to have some further assistance before continue.

  1. As far as I understand, the point is to move make_verts to a new subclass of PolyCollection. Am I right about this? Let us call the new subclass PolyCollectionForFillBetween for now.
  2. In my current implementation, there is an _normalise_fill_between_params, which is used by both make_verts and _fill_between_x_or_y. This method calls helper functions from _AxesBase, _get_patches_for_fill and _process_unit_info. I guess we have a few options:
    1. either PolyCollectionForFillBetween needs to inherit also from _AxesBase
    2. or _normalise_fill_between_params stays in Axes
    3. or more functions are to be migrated
  3. Now make_verts is to be migrated to PolyCollectionForFillBetween, while _normalise_fill_between_params are needed by both PolyCollectionForFillBetween.make_verts and Axes._fill_between_x_or_y. How can we make this work?

@cmp0xff cmp0xff changed the title feat: #28701 separate the generation of polygon vertices in fill_between to enable resampling [MNT]: #28701 separate the generation of polygon vertices in fill_between to enable resampling Aug 11, 2024
@rcomer
Copy link
Member

rcomer commented Aug 11, 2024

I think

  • _get_patches_for_fill is only setting a default color if the user doesn't pass it. So that functionality is not needed within make_verts.
  • _process_unit_info is both telling the axes what sort of data is passed so it can make conversions (e.g. mapping datetimes to floats, because ultimately we plot in floats) and making that conversion on the x/y values. We only need the second of these within make_verts and the new collection will be inheriting convert_xunits/convert_yunits from artist, so it can be done that way.

@cmp0xff cmp0xff marked this pull request as ready for review August 11, 2024 20:01
@jklymak
Copy link
Member

jklymak commented Aug 11, 2024

The end goal, I think, is

fb = ax.fill_between(x, y1, y2, where=where)
...
fb.set_data(newx, newy1, newy2, where=newwhere)

or

fb = ax.fill_between(y, x1, x2, where=where)
...
fb.set_data(newy, newx1, newx2, where=newwhere)

I appreciate that this is more complicated than a helper to generate the vertices of the polygons and then passing that to a PolygonCollection.set_verts, but it is quite a bit more friendly to the end user who need not know anything more about the PolygonCollection.

A lot of fill_between is now in _fill_between_x_or_y - I'd expect most of this would end up in the new FillBetweenPolygon's set_data method. If it were me, I'd probably instantiate that class with the signature of _fill_between_x_y and then store the things like ind_dir, and then have set_data do the right thing.

@cmp0xff
Copy link
Contributor Author

cmp0xff commented Aug 12, 2024

Hi @jklymak, thank you for giving an outlook.

fb = ax.fill_between(x, y1, y2, where=where)
...
fb.set_data(newx, newy1, newy2, where=newwhere)

or

fb = ax.fill_between(y, x1, x2, where=where)
...
fb.set_data(newy, newx1, newx2, where=newwhere)

The code already looks like this for now. Maybe I can make a set_data_x and another set_data_y, just like fill_between and fill_between_y?

A lot of fill_between is now in _fill_between_x_or_y - I'd expect most of this would end up in the new FillBetweenPolygon's set_data method. If it were me, I'd probably instantiate that class with the signature of _fill_between_x_y and then store the things like ind_dir, and then have set_data do the right thing.

I also believe that the code already looks like this now. What still remains in _fill_between_x_or_y is

    # now update the datalim and autoscale
    self.update_datalim(collection._pts, updatex=up_x, updatey=up_y)

    self.add_collection(collection, autolim=False)
    self._request_autoscale_view()

that is not needed when we set_data. There is also no autoscale in set_data of Line2D.

@jklymak
Copy link
Member

jklymak commented Aug 12, 2024

At a quick glance, this looks great, but others should chip in before polishing up. Perhaps @timhoffm and @antzer as API and tech leads?

Can I suggest you add an example that uses set_data - either as a static example (which is a bit silly since the original plot would get overwritten, but...), or as an animation? That would help make this easier to see if it is working. Also tests will be needed.

@timhoffm
Copy link
Member

timhoffm commented Aug 12, 2024

High-level feedback:

  • correct: PolyCollection itself does not have the capability to provide a high-level interface for manipulating fill_between results after creation.

  • correct: Subclassing PolyCollection is one approach to provide such a high-level interface, but there are others - see below (Minor naming: I'd call this FillBetweenPolyCollection not PolyCollectionForFillBetween).

  • There's a danger that FillBetweenPolyCollection won't live up to its expectations. With just PolyCollection we're not claiming that the result still contains all additional information. This is basically the right approach:

    If it were me, I'd probably instantiate that class with the signature of _fill_between_x_y and then store the things like ind_dir, and then have set_data do the right thing.

    Move all logic into FillBetweenPolyCollection. If something is missing one can fix that later. So danger is not too high. OTOH, the derieved class may have functionality that does not make sense in the specific context. Maybe not the end of the world but think a little how much/little functionality of the base class is needed. It might be that creating a separate class not deriving from PolyCollection` is more suitable (I haven't looked into the details).

  • This is a fundamental paradigm shift from plotting functions creating simple generic geometric Artists to creating semantic Artists that contain all aspects of the plot type. Thinking this further, we will end up with a dedicated Artist per plot function. This would have merits, e.g. also for scatter it's not trivial to manipulate the data afterwards, but consequently changing this would be a significant effort. We should be very conscious whether we want to go that direction (and whether partial implementations are ok because migrating would be a longer-term effort). Note that we have precedence for this with StepPatch.

  • A possibility to consider (also not investigated in detail): Would it make more sense to create semantic containers that provide the high-level interfaces and contain the geometric artists? (-> favor composition over inheritance)

  • detail: I would leave out in_dir from set_data(). It's not a parameter of fill_between[x] and thus unintuitive. For now, I'd not expose changing in_dir after class creation. It should be rare that a user needs this (for the original creation they would have to exchange functions) and makes for a more straight forward interface.

I'm overall +0.5 on this, but would like to get feedback on the paradigm shift towards semantic artists.

Copy link
Member

@timhoffm timhoffm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe, this is ready after handling this round of review comments. 🎉🎉🎉

@timhoffm
Copy link
Member

Thanks for digging this up. That convinces me that we can use it here (even though I feel we need to reconsider this pattern mid-term).

@cmp0xff cmp0xff requested review from anntzer and timhoffm September 23, 2024 17:15
@anntzer anntzer removed their request for review September 23, 2024 19:19
Update doc/users/next_whats_new/fill_between_poly_collection.rst

Co-authored-by: Tim Hoffmann <2836374+timhoffm@users.noreply.github.com>
@timhoffm
Copy link
Member

timhoffm commented Oct 1, 2024

@cmp0xff please change the line numbers in the following places from 120 to 121:

"lib/matplotlib/axes/_axes.py:docstring of matplotlib.axes._axes.Axes.fill_between:120",
"lib/matplotlib/axes/_axes.py:docstring of matplotlib.axes._axes.Axes.fill_betweenx:120",

and

"lib/matplotlib/pyplot.py:docstring of matplotlib.pyplot.fill_between:120",
"lib/matplotlib/pyplot.py:docstring of matplotlib.pyplot.fill_betweenx:120",

Also add "matplotlib.artist.FillBetweenPolyCollection.set:45" to the same context.

This should fix the doc build. Having to track the missing references by line number is an unsatisfactory workaround, but a fundamental fix for the issue is somewhat involved.

@tacaswell tacaswell merged commit 5503774 into matplotlib:main Oct 11, 2024
41 of 43 checks passed
@tacaswell
Copy link
Member

Thank you @cmp0xff and congratulations on your first merged Matplotlib PR 🎉

I hope we hear from you again!

@QuLogic QuLogic added this to the v3.10.0 milestone Oct 11, 2024
@cmp0xff cmp0xff deleted the feature/cmp0xff/28701-separate-generating-vertices branch October 11, 2024 21:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Waiting for author
Development

Successfully merging this pull request may close these issues.

[MNT]: Separate the generation of polygon vertices from _fill_between_x_or_y
7 participants