Skip to content

bugfix for PathSimplifier #28478

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Oct 22, 2024
Merged

bugfix for PathSimplifier #28478

merged 6 commits into from
Oct 22, 2024

Conversation

r3kste
Copy link
Contributor

@r3kste r3kste commented Jun 27, 2024

PR summary

closes #17914

PathSimplifier should ignore vertices corresponding to Path.CLOSEPOLY, and rather replace them with the most recent Path.MOVETO vertex.

Code for reproducing the bug / Actual outcome

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.path import Path
from matplotlib.patches import PathPatch

p = Path([(0, 0), (1, 0), (1, 1), (np.nan, np.nan)],
         [Path.MOVETO, Path.LINETO, Path.LINETO, Path.CLOSEPOLY])

# Remove this to see expected output
p = p.cleaned(simplify=True)

fig, ax = plt.subplots()
patch = PathPatch(p, facecolor="none", edgecolor="blue")
ax.add_patch(patch)
ax.set_xlim(-1, 2)
ax.set_ylim(-1, 2)

plt.show()

Expected Outcome

The Path should be closed.

Actual Outcome

PR checklist

@r3kste
Copy link
Contributor Author

r3kste commented Jun 27, 2024

As additional information, if the path had finite values instead on np.nan like:

p = Path([(0, 0), (1, 0), (1, 1), (0.5, 1.5)],
         [Path.MOVETO, Path.LINETO, Path.LINETO, Path.CLOSEPOLY])

Expected Outcome

The same as above.

Actual Outcome

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for opening your first PR into Matplotlib!

If you have not heard from us in a week or so, please leave a new comment below and that should bring it to our attention. Most of our reviewers are volunteers and sometimes things fall through the cracks.

You can also join us on gitter for real-time discussion.

For details on testing, writing docs, and our review process, please see the developer guide

We strive to be a welcoming and open project. Please follow our Code of Conduct.

@tacaswell
Copy link
Member

Thank you for your work on this, but I do not this is the right solution. It does make it look better, but simplify is still not handling the CLOSEPOLY correctly and is adding multiple LINETOs instead

In [2]: p.cleaned(simplify=True).codes
Out[2]: array([1, 2, 2, 2, 2, 0], dtype=uint8)

In [3]: p.codes
Out[3]: array([ 1,  2,  2, 79], dtype=uint8)


Copy link
Member

@tacaswell tacaswell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think that this is addressing the actual underlying bug.

@r3kste
Copy link
Contributor Author

r3kste commented Jun 27, 2024

but simplify is still not handling the CLOSEPOLY correctly and is adding multiple LINETOs instead

I think that this is not an issue with this handling of CLOSEPOLY, but rather that PathSimplifier repeats the last vertex twice, even if it was Path.LINETO.

>>> p         
Path(array([[0. , 0. ],
       [1. , 0. ],
       [1. , 1. ],
       [0.5, 1.5]]), array([1, 2, 2, 2], dtype=uint8))
>>> p.cleaned(simplify=True)
Path(array([[0. , 0. ],
       [1. , 0. ],
       [1. , 1. ],
       [0.5, 1.5],
       [0.5, 1.5],
       [0. , 0. ]]), array([1, 2, 2, 2, 2, 0], dtype=uint8))

I do not think that this is addressing the actual underlying bug.

Do you mean that it should be handling nan values separately (as mentioned in the Issue discussion)?

@r3kste
Copy link
Contributor Author

r3kste commented Jul 20, 2024

As I mentioned earlier, PathSimplifier adds the last vertex twice. It seems like this queue_push is what adds the vertex a second time.

--- a/src/path_converters.h
+++ b/src/path_converters.h
@@ -907,9 +907,6 @@ class PathSimplifier : protected EmbeddedQueue<9>
                 }
                 m_moveto = false;
             }
-            queue_push((m_moveto || m_after_moveto) ? agg::path_cmd_move_to : agg::path_cmd_line_to,
-                       m_lastx,
-                       m_lasty);
             m_moveto = false;
             queue_push(agg::path_cmd_stop, 0.0, 0.0);
         }

Removing these lines produces the "intended" behavior of PathSimplifier

>>> p
Path(array([[0., 0.],
       [1., 0.],
       [1., 1.]]), array([1, 2, 2], dtype=uint8))
>>> p.cleaned(simplify=True)
Path(array([[0., 0.],
       [1., 0.],
       [1., 1.],
       [0., 0.]]), array([1, 2, 2, 0], dtype=uint8))

However, I am not sure what would break if these lines are removed. Moreover, I think that it is better to address this issue in a different PR, because I feel that the underlying issue in #17914 was that PathSimplifier did not have logic for considering CLOSEPOLY vertices.

And for special casing of nan vertices, I think that this logic is already present in PathNanRemover, so I think including that in PathSimplifier might be unnecessary.

I would like to hear your thoughts on this @tacaswell

@r3kste r3kste requested a review from tacaswell July 20, 2024 12:14
Copy link
Contributor

@greglucas greglucas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems reasonable to me, but I think there could be a fair number of edge cases here that we could try to hit in the test to make it a bit more robust.

The extra LINETOs being added was noticed in the original issue: #17914 (comment)

Did you try running the test suite with your proposed change to remove that extra LINETO to see if it breaks anything?

@r3kste
Copy link
Contributor Author

r3kste commented Jul 21, 2024

Thanks for the response.

This seems reasonable to me, but I think there could be a fair number of edge cases here that we could try to hit in the test to make it a bit more robust.

I agree. I have modified the tests according to your suggestions.

Did you try running the test suite with your proposed change to remove that extra LINETO to see if it breaks anything?

A significant number of tests in test_simplification.py fails. Most of these tests check for the size/number of vertices in the simplified path. These tests could be made to pass by fixing the off by one errors.

The only test that was not off by one was test_start_with_moveto(), which checks that the path starts with a Path.MOVETO. To account for this, I think a valid solution would be:

@@ -907,9 +907,10 @@ class PathSimplifier : protected EmbeddedQueue<9>
                 }
                 m_moveto = false;
             }
-            queue_push((m_moveto || m_after_moveto) ? agg::path_cmd_move_to : agg::path_cmd_line_to,
-                       m_lastx,
-                       m_lasty);
+
+            if (m_moveto || m_after_moveto) {
+                queue_push(agg::path_cmd_move_to, m_lastx, m_lasty);
+            }
             m_moveto = false;
             queue_push(agg::path_cmd_stop, 0.0, 0.0);
         }

This would guarantee that there is at least one vertex in the path, which would be a MOVETO to (0, 0).

Nevertheless, I couldn't find any tests where removing that extra LINETO produces a logical error.

Comment on lines 567 to 568
expected_path = Path([(1, 1), (2, 1), (2, 2), (1, 1),
(-1, 0), (-2, 0), (-2, 1), (-1, 0), (-1, 0), (0, 0)],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is interesting that only the final CLOSEPOLY simplification gets the extra LINETO section.

I personally think it'd be nice to apply your fixes here and not assert against the extra LINETO since you've gone through the work of investigating them and your fix looks good to me at first glance.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is interesting that only the final CLOSEPOLY simplification gets the extra LINETO section.

The extra LINETO is not related to CLOSEPOLY simplification. Even a simple path has it's final LINETO vertex duplicated.

>>> p
Path(array([[1., 1.],
       [2., 1.],
       [2., 0.],
       [0., 0.]]), array([1, 2, 2, 0], dtype=uint8))
>>> p.cleaned(simplify=True)
Path(array([[1., 1.],
       [2., 1.],
       [2., 0.],
       [2., 0.],
       [0., 0.]]), array([1, 2, 2, 2, 0], dtype=uint8))

For now, the fix that I mentioned is just to remove the part of code which adds that LINETO.

I personally think it'd be nice to apply your fixes here and not assert against the extra LINETO since you've gone through the work of investigating them and your fix looks good to me at first glance.

It is good to hear that. However, what about the failing pytests in path_simplification.py
as mentioned in #28478 (comment). Should I also modify them so as to fix the off by one errors?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, what about the failing pytests in path_simplification.py
as mentioned in #28478 (comment). Should I also modify them so as to fix the off by one errors?

I forgot about needing to update those as well. Maybe it is best to just leave the assertion with the extra expected LINETO for now and then update all the off-by-one issues in a follow-up PR.

Copy link
Contributor

@greglucas greglucas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tacaswell it would be good to get your eyes on this again since you're currently requesting changes.

@r3kste
Copy link
Contributor Author

r3kste commented Jul 23, 2024

Thanks for reviewing the changes.

I was thinking that moving the test from test_path.py to test_simplification.py would make more sense. I also have to refactor m_last_startx to m_initX.

These are small changes, so can I go ahead with them?

Comment on lines +794 to +796
/* If we don't have a valid initial vertex, then
we can't close the path, so we skip the vertex */
continue;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a test for this case as well?

What is the expected result if the second MOVETO of a compound path is (nan, nan)? Here it looks like we are still adding the inbetween LINETOs and just not starting/closing the path. Does it therefore merge into the first path in this case?

Copy link
Contributor Author

@r3kste r3kste Jul 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For a compound path such as:

p = Path([(1, 0), (1, -1), (2, -1),
          (np.nan, np.nan), (-1, -1), (-2, 1), (-1, 1),
          (2, 2), (0, -1)],
         [Path.MOVETO, Path.LINETO, Path.LINETO,
          Path.MOVETO, Path.LINETO, Path.LINETO, Path.LINETO,
          Path.CLOSEPOLY, Path.LINETO])

Here the second MOVETO is (nan, nan). I plotted this path without simplification to see what the intended plot should be like.

So it seems like, when plotting, if CLOSEPOLY is encountered without having a valid MOVETO vertex, it plots the point corresponding to the CLOSEPOLY (in this case, it is point 8 at (2, 2)) but doesn't close the polygon. After this, the LINETO to (0, -1) is continuous with the previously existing path.

Here it looks like we are still adding the inbetween LINETOs and just not starting/closing the path. Does it therefore merge into the first path in this case?

So yes, you are correct, and it seems to be the expected result as well.

Therefore, the simplified Path would be:

>>> p.cleaned(simplify=True)
Path(array([[ 1.,  0.],
       [ 1., -1.],
       [ 2., -1.],
       [nan, nan],
       [-1., -1.],
       [-2.,  1.],
       [-1.,  1.],
       [ 0., -1.],
       [ 0., -1.],
       [ 0.,  0.]]), array([1, 2, 2, 1, 2, 2, 2, 2, 2, 0], dtype=uint8))

Could you add a test for this case as well?

I think that the compound path discussed above will be good for the test.

@greglucas
Copy link
Contributor

@tacaswell pinging you for a review on this since you're blocking. We are hitting some of these edge cases in Cartopy I think where we are currently working around them manually ourselves.

@r3kste do you have a separate update/PR with the removal of the extra LINETO you mentioned?

@r3kste
Copy link
Contributor Author

r3kste commented Oct 21, 2024

@r3kste do you have a separate update/PR with the removal of the extra LINETO you mentioned?

I haven't yet opened a separate PR for the removal of the extra LINETO, but I would like to do so.
Should I go ahead with opening one?

@greglucas
Copy link
Contributor

@r3kste do you have a separate update/PR with the removal of the extra LINETO you mentioned?

I haven't yet opened a separate PR for the removal of the extra LINETO, but I would like to do so.
Should I go ahead with opening one?

Yes, please feel free to open a PR with that update, I just wanted to make sure I hadn't missed it coming in.

@tacaswell tacaswell added this to the v3.10.0 milestone Oct 21, 2024
Copy link
Member

@tacaswell tacaswell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not remember the previous code, but this seems correct to me now.

@greglucas greglucas merged commit 235bf97 into matplotlib:main Oct 22, 2024
35 of 38 checks passed
@greglucas
Copy link
Contributor

Thanks, @r3kste and congratulations on your first merged Matplotlib PR 🎉 Hope to hear from you again!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Waiting for author
Development

Successfully merging this pull request may close these issues.

PathSimplifier fails to ignore CLOSEPOLY vertices
3 participants