Skip to content

Bezier/Path API Cleanup: fix circular import issue #16812

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Mar 31, 2020

Conversation

brunobeltran
Copy link
Contributor

@brunobeltran brunobeltran commented Mar 17, 2020

PR Summary

Roadmap:

(This PR) (*) <- #16832 (*) <- #16859 (*) <- #16888 <- #16889 (*) <- #16891 (MarkerStyle improvements!)

"<-" means "depends on", and "(*)" marks PRs whose implementations are complete and fully ready for review.

Currently, bezier.py depends on path.py. While it's easy to see how this could happen, this dependency should pretty obviously be reversed. bezier.py contains helper functions that are designed to help compute things about bezier curves. path.py defines a Path, and all computations that are "about paths" should be in class Path.

This became a show-stopper in #16607, #16832 and #16859) when I wanted to write methods that could compute things about paths (like their extrema, area, center of mass...all required for the marker size fixes proposed in #15703). I want to be able to define a member of Path that uses helper functions in bezier.py to help it compute bounds, etc., but currently cannot because that would cause a circular import (path.py - imports -> bezier.py' - imports -> path.py').

This is fixed here by moving a couple of functions from bezier.py that really compute things about paths (not about bezier curves necessarily) into path.py.

EDIT: original list of functions to deprecate is now shorter, since @Annzter deprecated make_path_regular and concatenate_paths in his own PR after my initial submission.

  1. bezier.split_path_inout is now Path.split_path_inout, since its first argument is more naturally "self". This allows removing the import.
  2. Bezier's "inside_circle" is only used by patches.py (not really anything to do with bezier curves) moved there.

The changes to the API I made seem likely to be internal-facing only, since bezier.py was obviously intended to be used internally by matplotlib to help with implementing patches.py, path.py, etc.

Per @tacaswell's suggestion, the original functions were kept around (as pass-thru/wrappers) and deprecated instead of removed.

No tests were added since no new code was added, only moved around.

PR Checklist

  • Has Pytest style unit tests
  • Code is Flake 8 compliant
  • New features are documented, with examples if plot related
  • Documentation is sphinx and numpydoc compliant
  • Added an entry to doc/users/next_whats_new/ if major new feature (follow instructions in README.rst there)
  • Documented in doc/api/api_changes.rst if API changed in a backward-incompatible way

@brunobeltran
Copy link
Contributor Author

brunobeltran commented Mar 17, 2020

@anntzer: as you suggested, this PR is the first step in breaking up #16607 into several easier-to-merge chunks :)

@tacaswell tacaswell added this to the v3.3.0 milestone Mar 18, 2020
Copy link
Member

@tacaswell tacaswell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have concerns about moving public functions from public module. While I'm not sure I disagree with the analysis of what the organization should be (also not sure if I agree), our user base is very large and very clever. The probability approaches one that we have users who are using these functions and we can not move them with no warning.

It looks like the minimal fix to prevent the circular imports is to move the from .path import Path to be local to make_path_regular and concatenate_paths

@tacaswell
Copy link
Member

I am 👍 on the enthusiasm! However, I have concerns about the unintended consequences.

@brunobeltran
Copy link
Contributor Author

brunobeltran commented Mar 18, 2020

Thanks for the positive feedback, @tacaswell !

It looks like the minimal fix to prevent the circular imports is to move the from .path import Path to be local to make_path_regular and concatenate_paths

Even easier, I made them just wrap the appropriate method calls (of the .Path object). This technically could be a breaking change if a user was abusing duck typing to use these functions for a non-Path-subclass that happened to implement similarly-named methods....but I assume there is less concern about impacting users that cross that threshold of cleverness?

we can not move them with no warning.

Would you be more in favor of slowly deprecating these functions? I obviously don't mind them being perpetually duplicated in bezier.py personally, but I went ahead and pushed up a version with some deprecation warnings.

@anntzer
Copy link
Contributor

anntzer commented Mar 18, 2020

Actually, I am deprecating make_path_regular and concatenate_paths in #14199; can that get reviewed first? Both functions actually have relatively simple alternatives available.

@brunobeltran
Copy link
Contributor Author

@tacaswell , what is the expectation before merging, should this branch be rebased to be a single commit? Or do you prefer to preserve the whole commit history?

@brunobeltran brunobeltran force-pushed the bezier-refactor branch 3 times, most recently from 8f67987 to 1a7b35b Compare March 18, 2020 19:41
@QuLogic
Copy link
Member

QuLogic commented Mar 18, 2020

If you make 50 commits where one would do (e.g., if there are multiple test image changes), we'd ask to squash, but otherwise we don't bother.

@brunobeltran brunobeltran force-pushed the bezier-refactor branch 2 times, most recently from fea08e8 to d533912 Compare March 19, 2020 09:57
@brunobeltran
Copy link
Contributor Author

This PR is now ready for re-review pending merge of #14199.

After that #14199 is merged, this PR will basically just:

  1. deprecate bezier.split_path_inout in favor of path.Path.split_path_inout
  2. deprecate bezier.inside_circle (only used by patches.py)
  3. various doc cleanups I found while documenting these deprecations

This PR is required to begin fixing #16606, #16830, and #7184.

@brunobeltran
Copy link
Contributor Author

Squashed all commits into one, since a bunch of other PRs depend on this, to make it easier to review them.



@cbook.deprecated("3.2", alternative="Path.make_compound_path()")
def split_path_inout(path, inside, tolerance=0.01, reorder_inout=False):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can just reimport this function at the top of the module (from .path import split_path_inout with a note that this needs to remain available in this module for API stability reasons) and then not bother with the deprecation (if the implementation stays around anywhere in the library, the disruption of removing it isn't really worth it).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea behind this pull request is to remove from .path import ... from the top of bezier.py. This is because I implemented all the functions to do e.g. MarkerStyle renormalization (and other things), but they are all methods of Path, and they need access to BezierSegment. Currently, I cannot from .bezier import ... in path.py because bezier.py already imports from .path .... So reimporting this function at the top of the module would defeat the purpose.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah. OK, but then at the end of path.py you can do bezier.split_path_inout = Path.split_path_inout (with a comment re: conserving old API)? This way people doing from bezier import split_path_inout won't be affected?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I'm missing something, but as in other comment, it seems like this patch would only work if it was applied at the package level? For example in lib/matplotlib/__init__.py I could add

import matplotlib.patches
import matplotlib.path
import matplotlib.bezier
bezier.split_path_inout = Path.split_path_inout
bezier.inside_circle = patches._inside_circle

but I don't see how that is better than my current approach, of just keeping the functions in bezier.py and calling their new implementations. My approach also doesn't have any repeated code, and doesn't affect anybody doing from bezier import split_path_inout.

If your argument is that we shouldn't deprecate these functions at all, I am happy to remove the @deprecated statements, since the real point of this PR was to remove the from .path import Path statement from bezier.py.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is better to keep these sort of details contained inside of the modules that affect them. Having path.py monkey-patch bezier is less bad than doing the monkey patching in the top level __init__.py

It looks like you are moving all of the helper methods from bezier.py -> path.py then it is OK to keep a .path import in bezier.py?

Copy link
Contributor Author

@brunobeltran brunobeltran Mar 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tacaswell, I agree. That's why I didn't do it in the first place.

Maybe I'm missing something, but the proposed monkey patch doesn't solve the stated problem (of not changing the public API)?

I think I have done a poor job explaining the purpose of this PR: the point is to remove the .path import from bezier.py. Why? Because many things that you might want to compute about Paths (bounding box, area, center of mass, see e.g. #16832, #16859) need to call functions of bezier.py (and these are crucial for improvements that you guys seem to largely be behind, such as the equal area marker size fix). If we leave the .path import in bezier.py, it is not possible to implement those new features in a reasonable way.

Conversely, the current reason for the existence of the import is that a bunch of functions that have no business being in bezier.py in the first place.

So one solution is to move the .path import into each of those functions, and leave everything else as is. I wouldn't prefer this, but I'm also not responsible for maintaining this code, so I am happy to do this instead.

Another, better solution (this PR), is to move the functions where they ought to have been in the first place and have bezier.whatever just pass thru to the new version of the function. (And optionally deprecate the old versions).

My understanding is that instead of this pass-thru approach, @anntzer would prefer that I monkey patch the functions back in. But this breaks the public API.

With my current approach:

from matplotlib import bezier
bezier.split_path_inout

Works.

With @anntzer's monkey patch approach as I understand it, you need to do

from matplotlib import bezier
bezier.split_path_inout # error
from matplotlib import path
bezier.split_path_inout # works

Sorry if reply sounds terse, I'm on mobile.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But if you have moved all of the useful functions to path.py it no longer has to import bezier so it is no longer circular. Are there more functions than these that you want to move?

We could also pull the common code out to a new module that both path.py and bezier.py import from.

As a general rule, we try to avoid both API changes and code-churn for the sake of aesthetics changes. The question I have is: Is this is the minimal API and code change that we can do to get to the desired outcome?

Copy link
Contributor Author

@brunobeltran brunobeltran Mar 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see where you're coming from now, it was just me not doing a good job explaining why this PR exists in the first place. I'm 100% on-board with doing my best to minimize API changes and code-churn. (For example, I am happy to leave inside_circle where it is, I am really only after the moving of split_path_inout out of bezier.py so that it no longer has to import from .path).

However, this is already the minimum API change required so that I can do #16832, #16859 (and more), which are not-just-aesthetic. Basically the confusion seems to be that since those new functions aren't in this PR, the PR doesn't seem justified.

In short, I moved all the (currently) useful functions to path.py. There not more functions that I want to move, but there are a half-dozen functions that I have written (with the end-goal of equal-area markers) that depend on .path being able to from bezier import BezierSegment (again, that class is empty in this PR, but that's only because I wanted to split the PRs into manageable chunks to make review easier).

Copy link
Contributor Author

@brunobeltran brunobeltran Mar 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it makes sense for me to make a "BIG" PR of my "equal area markers" branch (which I currently have split into #16812, #16832, #16859, (and one more that I haven't pushed up yet)).

That way we can discuss the top-level planning of how to go about the equal area markers thing? Given my current strategy (writing code to compute the area of an arbitrary path, which turns out to be quite simple, see #16859), this PR is a necessary first step.



@cbook.deprecated("3.2")
def inside_circle(cx, cy, r):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As for split_path_inout I would just leave this as public API reexported into this module, for now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Public API where? Currently I've moved it into patches.py, which is the only place where it is used. I hesitate to make inside_circle a part of the public API of that module, since it's such a non-public-facing part of the module's internals.

In addition, we can't REALLY just re-exporting it into this module from there, since patches.py already does from .bezier import .... I fixed this now so it doesn't explicitly repeat the code, which I figure is better so that the implementations don't diverge.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above, you can patch bezier.py to add this function to it at the end of patches.py? In fact it can stay private in patches, and be "public" in bezier (bezier.inside_circle = _inside_circle at the end of patches.py).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but then the API will only remain unchanged as long as the user imports patches. If they import bezier only, the patch will not be applied and the functions will be missing?

@brunobeltran brunobeltran force-pushed the bezier-refactor branch 2 times, most recently from e805071 to 4943d33 Compare March 22, 2020 01:10
@tacaswell tacaswell dismissed their stale review March 22, 2020 22:18

significantly stale

@tacaswell
Copy link
Member

I agree we need to step back and get a broader view of what we are doing here.

While it's easy to see how this could happen, this dependency should pretty obviously be reversed. bezier.py

I do not agree it is obvious, bezier can be seen as specialized version of path.

and all computations that are "about paths" should be in class Path.

Again, I'm not sure that this is obvious. I am assuming that these helpers are things that only need to use the public API of Path to do their job? If that is the case, we may want to have many more helpers that work on Path objects than we want to put as part of the public API of Path.

I think we could we avoid the need for this re-arrangement if we make a _marker_path_helpers.py that can import both .path and .bezier and in markers import .path (as we do now) and _marker_path_helpers. Once that is all done, then we can circle back to decide which of those helpers we to promote to being methods on the Path object.

In addition to avoiding this code churn, starting with the helpers in a private module lets us be less careful about merging things and in changing what is there independent of how the totality of this work gets relative to version releases etc.

Again I am 👍 on this work in principle and looking for the path of least resistance to getting this code in.

@brunobeltran
Copy link
Contributor Author

I agree we need to step back and get a broader view of what we are doing here.

Thanks as usual for the positive feedback! I have opened a PR (#16891) with one proposed architecture for the new changes. I think a lot of this discussion would best take place there, and this PR can be revisited once we're all on the same page as to the best top-level architectural decisions for how to add this new code.

I'll answer your comments here for the record, but will reproduce these points in #16891, so we can continue the discussion there.

While it's easy to see how this could happen, this dependency should pretty obviously be reversed. bezier.py

I do not agree it is obvious, bezier can be seen as specialized version of path.

You're right, obvious was the wrong word. There are two equally-valid interpretations of the relationship of bezier and path, which imply opposite directions of dependency:
1) a Path's are made up of BezierSegments, so path should from .bezier import any functions that it needs for computing things about Paths.
2) a BezierSegment is just a Path with one code. So bezier.py should contain...helper functions for simple paths that are commonly found in the bezier literature, but should rely on from .path import'ing anything that Path already does (?)

What this PR is implicitly proposing is that the most natural way to architect this library going forward is to commit to (1). This is because anything you want to compute about a path (arc length, area, bounding box, etc) you do it by computing it for each BezierSegment making up the path. And if you go with (2) instead, then you have to end up polluting path.py with a bunch of code that should by all rights be methods of BezierSegment (e.g. if we go with (2), we have functions like path.bezier_length instead of bezier.BezierSegment.length because path.py can't import from bezier.py).

and all computations that are "about paths" should be in class Path.

Again, I'm not sure that this is obvious. I am assuming that these helpers are things that only need to use the public API of Path to do their job? If that is the case, we may want to have many more helpers that work on Path objects than we want to put as part of the public API of Path.

I think we could we avoid the need for this re-arrangement if we make a _marker_path_helpers.py that can import both .path and .bezier and in markers import .path (as we do now) and _marker_path_helpers. Once that is all done, then we can circle back to decide which of those helpers we to promote to being methods on the Path object.

I definitely agree that all of this extra code could for now be put into a private module, like _marker_path_helpers. Let's discuss in #... what should eventually be private and public. What this PR was implicitly proposing was that all of these new functions I'm writing (e.g. Path.signed_area, Path.length, Path.center_of_mass, etc), while being useful for markers.py, were also performant enough and broadly applicable enough that they might be worth having in the public API of Path.

If we decide that none of these functions eventually need to be public, then I will (reluctantly lol) close this PR and move all of my new code into a private helper module.

@QuLogic
Copy link
Member

QuLogic commented Mar 26, 2020

Given that this enables #16832 which has a concrete use to fix a bug, and there was mostly agreement on the last call for this one, I'm not sure we need to hold this up for the more abstract MarkerStyle changes. But we can leave it for the call to hash out again.

@brunobeltran
Copy link
Contributor Author

Based on feedback from the call today, this PR has been changed to not actually perform any API changes. A new PR will be created with proposed API changes independently of this PR stack, which will now only contain feature improvements.

@jklymak jklymak requested a review from QuLogic March 30, 2020 21:34
@jklymak
Copy link
Member

jklymak commented Mar 30, 2020

... since the deprecations went in, this is a trivial re-arrangements of the imports in the two deprecated methods.

@anntzer
Copy link
Contributor

anntzer commented Mar 30, 2020

just waiting for ci

@QuLogic QuLogic merged commit e702edd into matplotlib:master Mar 31, 2020
@QuLogic
Copy link
Member

QuLogic commented Mar 31, 2020

The commit message was a bit weird, since it's not a backport of anything. I used a squash merge so I could change the message without having to wait for all the CI to re-run needlessly. Unfortunately, this does mean the other PRs need rebasing.

@brunobeltran
Copy link
Contributor Author

@QuLogic, thanks for fixing that, the commit message only made sense in the context of my personal branches.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants