Skip to content

Compute area of path #16859

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open

Conversation

brunobeltran
Copy link
Contributor

@brunobeltran brunobeltran commented Mar 20, 2020

PR Summary

Add method to .Path that can compute the exact (signed) area of the path "fast" (i.e. without any integrations or discretization required) using the simple analytical formula for general Bezier curves.

Required for implementation of "equal area" markers (my current proposal for dealing with #15703).

Roadmap:
#16812 (*) <- #16832 (*) <- (This PR) (*) <- #16891 (MarkerStyle improvements!)

"<-" means "depends on", and "(*)" marks PRs whose implementations are complete and fully ready for review.

PR Checklist

  • Has Pytest style unit tests
  • Code is Flake 8 compliant
  • New features are documented, with examples if plot related
  • Documentation is sphinx and numpydoc compliant
  • Added an entry to doc/users/next_whats_new/ if major new feature (follow instructions in README.rst there)
  • Documented in doc/api/api_changes.rst if API changed in a backward-incompatible way

Copy link
Contributor Author

@brunobeltran brunobeltran left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments that might be useful to a reviewer.

@brunobeltran brunobeltran force-pushed the path_area branch 2 times, most recently from 296baeb to 37d533f Compare March 22, 2020 03:21
@brunobeltran brunobeltran force-pushed the path_area branch 2 times, most recently from cb14bb2 to aebcbc7 Compare March 23, 2020 19:02
@brunobeltran brunobeltran force-pushed the path_area branch 4 times, most recently from 4c80f38 to 08e61bc Compare March 30, 2020 21:35
@brunobeltran brunobeltran force-pushed the path_area branch 3 times, most recently from 1c30420 to 4a41a2f Compare March 31, 2020 01:23
dr = dB(t).T
n = np.array([dr[1], -dr[0]])
return (B(t).T @ n) / 2
return scipy.integrate.quad(integrand, 0, 1)[0]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Testing our area formula (bezier.BezierSegment.arc_area) is most naturally done by comparing to the naive integral implementation. However, this required me to add scipy to the testing dependencies.

I already test this formula (indirectly) in test_path.test_signed_area_curve and test_path.test_signed_area_unit_rectangle, so if we prefer to not add any dependency on scipy at all, I can simply remove this test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was agreed on a call back in June that it's okay to add this Scipy dependency as long as it's just an optional testing dependency.

@brunobeltran brunobeltran force-pushed the path_area branch 3 times, most recently from 4cf12f3 to 0037abd Compare March 31, 2020 01:40
@brunobeltran
Copy link
Contributor Author

brunobeltran commented Sep 1, 2020

I didn't fully read all the math yet, but I think some of them could be \bm{B} or \vec{B} to differentiate the vectors from the x/y components.

@QuLogic when you get tired of working with the Mac stuff, this is rebased and the math should now render fully (making it marginally easier to review), and I incorporated all of your recommendations.

@github-actions
Copy link

Since this Pull Request has not been updated in 60 days, it has been marked "inactive." This does not mean that it will be closed, though it may be moved to a "Draft" state. This helps maintainers prioritize their reviewing efforts. You can pick the PR back up anytime - please ping us if you need a review or guidance to move the PR forward! If you do not plan on continuing the work, please let us know so that we can either find someone to take the PR over, or close it.

@github-actions github-actions bot added the status: inactive Marked by the “Stale” Github Action label Jul 14, 2023
@greglucas greglucas marked this pull request as ready for review October 29, 2024 22:02
@greglucas
Copy link
Contributor

This was a lot of good work that seems useful to get in. I just rebased this PR and replaced the manual combinatorial function with math.comb now that we are beyond 3.7 support.

@greglucas greglucas force-pushed the path_area branch 2 times, most recently from 7054e98 to 40275b8 Compare October 29, 2024 22:22
Copy link
Member

@timhoffm timhoffm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, I'm +/-0 on adding this. Do we really need the area? It was pitched for equal-area markers. But I'm not sure on the cost/benefit. As mentioned above there are some potential rabbit holes if we provide this as a public interface. If the primary use would be equal-area markers, I suggest to make this private for now.


Notes
-----
An analytical formula is possible for arbitrary bezier curves.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there some external resource we can link to that describes the formula?
While it's reasonable to have it documented here, I'd also anchor to some public description if possible.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a link to a publicly accessible course notes on computer graphics. The terminology is slightly different, but the approach looks the same to me.
https://scholarsarchive.byu.edu/facpub/1/

prev_point = None
prev_code = None
start_point = None
for B, code in self.iter_bezier():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be reasonable to calculate the area on a shifted path, which has it's "center" in the origin? If we are far away from the origin, we add and subtract a lot of long-and-small areas. I can imagine that this could significantly reduce numerical precision.

# a point
Path([[1, 2]], [Path.MOVETO]),
_ExampleCurve(Path([[1, 2]], [Path.MOVETO]), extents=(1., 2., 1., 2.),
area=0.),
]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test a simple non-curved polygon such as a triangle:

Suggested change
]
_ExampleCurve(Path([[1, 1], [2, 1], [1.5, 2]]), extents=(1, 1, 2, 2), area=1/3),
]

Side-remark: Not an expert, but I believe there are more efficient formulas for simple polygons without Bezier segments.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, but looking through this it is all vectorized and over the degree-1 shapes I'm guessing it isn't that expensive. I think this should also be a good way for someone to later come in and add a fast-path for those cases: if CURVE == ...: simple formula. Starting with this seems reasonable to me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Digression, but this is apparently messing up mypy stubtest which I think is due to the float in here vs int in all the other paths. Changing this to tuple(point) instead satisfies mypy. Seems like typing gone wrong here.

Do we really need to be running mypy over the test suite?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a strong opinion on mypy on tests. I assume the added value would be that we check for reasonable typing of our interfaces by exercising calls to these interfaces.

@greglucas greglucas force-pushed the path_area branch 3 times, most recently from cdf05fe to df8ea60 Compare October 30, 2024 15:07
@greglucas
Copy link
Contributor

@timhoffm, I share some of your concern, but overall feels like this is an improvement worth adding. If someone wants more complex geometry handling they should likely move over to Shapely and GEOS. But this has the work put into it and I could see people not wanting to have to convert a Matplotlib Path over to Shapely and go back and forth if they can avoid it.

For the equal area markers: I think that is one of the most requested scatter marker "features", so I think people definitely would want it.

@greglucas greglucas force-pushed the path_area branch 2 times, most recently from 70f327d to a489cbe Compare October 30, 2024 15:55
@timhoffm
Copy link
Member

timhoffm commented Oct 30, 2024

@timhoffm, I share some of your concern, but overall feels like this is an improvement worth adding. If someone wants more complex geometry handling they should likely move over to Shapely and GEOS. But this has the work put into it and I could see people not wanting to have to convert a Matplotlib Path over to Shapely and go back and forth if they can avoid it.

We have to be careful what we offer. If we provide a public functionality it should "just work" and we need to document it's scope and limitations. This is more work than just writing it for internal usage.

For example, the current contains() implementation depends on the visibility #23875. I assume this has been implemented for event handling, but is very surprising if a user uses contains as an abstract concept. While this could be labelled a design issue, having it public surfaces the problem to the user.

Similarly, marker paths are typically defined in the vicinity of the origin, so we would not have to care for the possible numeric issue with far-away paths. Also, the precision requirements for reasonable equal-area markers are low. A few percent deviation are tolerable for our use case, but it may not in user applications.
Note also that we could decide to go other routes for equal-area markers. At least for the builtins, we could pre-calculate size-to-area ratios (if need be even through counting covered pixels for a large rendered marker), which would get us 90% the way for equal-area markers without any geometry code.

Generally, geometric calculations are not trivial. I'm quite afraid we provide half-baked solutions, that do not satisfy user needs and will result in follow-up requests to improve on that functionality, which is not our core business.

@github-actions github-actions bot removed the status: inactive Marked by the “Stale” Github Action label Oct 31, 2024
@story645
Copy link
Member

This is more work than just writing it for internal usage.

The ROSES 2024 grant has path transformations as an area of focus.

@greglucas if this issue is part of the work that will help w/ that, could you create a roadmap issue of what should be implemented in Matplotlib to facilitate that work?

@greglucas
Copy link
Contributor

@greglucas if this issue is part of the work that will help w/ that, could you create a roadmap issue of what should be implemented in Matplotlib to facilitate that work?

No this isn't related to that. The PR after this (path_length) would be more related, but not directly as that work is all done within Cartopy.

I don't personally have a direct benefit from this at this point in time, so I will just leave the rebased version here in case someone else wants to advocate for it again in the future or use it as a reference, I don't feel strongly enough to advocate for this right now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants