[Bug]: Changing the array returned by get_data affects future calls to get_data but does not change the plot on canvas.draw. #29760

Closed
Rainald62 opened this issue Mar 15, 2025 · 3 comments

Comments

@Rainald62

Bug summary

For simplicity I use get_ydata. I call it twice, once with orig=True and once with orig=False, store the returned arrays as yT and yF, and then modify both in place. The marker is still drawn at the original position, which seems to come from a third entity, maybe a cache.
Calling set_ydata moves the marker but does not touch the arrays yT and yF.
If this is not a bug, it is at least a usability issue: my application calls set_data only conditionally after changing the arrays. When it did not, the results of later calls to get_data no longer reflected the marker position, which took some time to debug.

Code for reproduction

import numpy as np
from matplotlib.pyplot import subplots

fig, ax = subplots()
h = ax.plot(1., 1., 'bo')[0]
ax.set_ylim(-2, 4)
fig.show()
yT = h.get_ydata(orig=True)
yF = h.get_ydata(orig=False)
print(yT, yF) # [1.] [1.]
yT += 1
yF -= 1
print(h.get_ydata(orig=True),  # [2.], instead of the original value.
      h.get_ydata(orig=False)) # [0.], as expected.
fig.canvas.draw() # Still at the truly original position.
# 1/0 # "breakpoint" to verify the above comment.
if False:
    h.set_ydata((3,))
    print(yT, yF) # set_ydata had no effect on yF, yT, still [2.] [0.]
    fig.canvas.draw() # but on the marker.
    print(h.get_ydata(orig=True),  # (3,), the tuple passed to set_ydata.
          h.get_ydata(orig=False)) # [3.], a new array object.
else:
    yT = h.get_ydata(orig=True)
    yF = h.get_ydata(orig=False)
    yT += 1
    yF -= 1
    print(yT, yF) # [3.] [-1.], while the marker would still be drawn at 1.

Actual outcome

Given as comments in the code.

Expected outcome

Either get_data should return a copy, which can be changed independently of the artist, or the canvas should reflect changes to the artist's state.
The documentation of the orig parameter is insufficient.

Additional information

#24790 links a pull request mentioning caching.

Operating system

Windows 10

Matplotlib Version

3.10.0

Matplotlib Backend

tkagg

Python version

3.13.1

Jupyter version

No response

Installation

pip

@tacaswell
Member

tacaswell commented Mar 17, 2025

To start with a joke: There are 2 hard problems in computer science: naming things, cache invalidation, and off-by-one bugs.


The issue is that, for performance reasons, we do some caching in Line2D.recache, which makes a second (possibly modified) internal copy of the data the user passed us (both because we may need to mutate the values and because we need the x and y data stacked into a 2D array). The orig keyword of the get_*data functions controls whether you get our copy of what the user passed in or the possibly modified copy. We return the object itself from get_*data, so if you mutate it and call the get method again you see the mutation (because it is the same object!). However, if you have drawn at least once before, we already have our cache and so do not consult the mutated version.

When you call set_*data we make a new copy of the data passed (to avoid exactly this sort of issue and to normalize the behavior between passing in a list and an array), so the previously returned object (that you are now holding) is no longer held by the Line2D object.
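
As a minimal sketch of the two copies described above: the snippet below builds a bare Line2D from plain lists (the data and variable names are illustrative and not part of the report) and compares what orig=True and orig=False return. The exact return types may differ between Matplotlib versions.

```python
from matplotlib.lines import Line2D

# Build a line directly from Python lists (illustrative data).
line = Line2D([0, 1], [10, 20])

raw = line.get_ydata(orig=True)      # the line's copy of what was passed in
cached = line.get_ydata(orig=False)  # the processed float array from the cache

# get_ydata hands back the stored object itself, not a fresh copy,
# so asking again yields the very same object.
same_object = line.get_ydata(orig=True) is raw

print(type(raw).__name__, type(cached).__name__, same_object)
```

Because the same object is handed out each time, an in-place change to it is visible on the next get_ydata call, even though the cached path used for drawing is not updated.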

If you call h.recache(always=True), the in-place mutation will propagate to the cache.
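
A minimal sketch of that workaround, assuming the non-interactive Agg backend so it runs without a display (the names stale_y and fresh_y are illustrative). It inspects the cached path through get_path() instead of looking at the rendered figure.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no window is needed
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
(line,) = ax.plot([1.0], [1.0], "bo")
fig.canvas.draw()  # make sure the line has been drawn (and cached) once

y = line.get_ydata(orig=True)
y += 1  # in-place mutation of the array held by the line

# The cached path still holds the old y value.
stale_y = line.get_path().vertices[0, 1]

# Force the cache to be rebuilt from the (mutated) original data.
line.recache(always=True)
fresh_y = line.get_path().vertices[0, 1]

fig.canvas.draw()  # an explicit redraw now uses the rebuilt cache
print(stale_y, fresh_y)
```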

I hope that explains how we got here and why the current behavior is the way it is.


Paths out:

  1. no behavior change but improve docs: I suspect this behavior is 15+ years old, so while confusing it is long-standing and I think defensible.
  2. make get_*data return copies, which avoids this problem but at the cost of more memory (where the intended use is something like "get -> update -> set"), which is less than ideal

I'm inclined to go with option 1 and better document how to force a re-cache of the Line. I am not sure it is a good idea to mutate data arrays in place and recache (calling set_*data seems simpler), but we already have all the public API to do it. I do not think we should rip it out, so we should document it.
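
For comparison, the "get -> update -> set" pattern mentioned above could look like the following sketch. It again assumes the Agg backend, and the variable names are illustrative. The update builds a new array instead of mutating the one returned by get_ydata.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no window is needed
import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots()
(line,) = ax.plot([1.0], [1.0], "bo")
fig.canvas.draw()

# get: read the current data without modifying it
old_y = np.asarray(line.get_ydata(orig=True), dtype=float)

# update: compute a new array, leaving the stored data untouched
new_y = old_y + 1.0

# set: hand the new data to the line, which invalidates its cache
line.set_ydata(new_y)
fig.canvas.draw()

drawn_y = line.get_path().vertices[0, 1]
print(old_y, new_y, drawn_y)
```

With this pattern the getter, the cache, and the drawn marker stay in agreement without calling recache by hand.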

[edited to fix a typo]

@anntzer
Contributor

anntzer commented Mar 17, 2025

Also somewhat related to #21147 and #18841 (not directly as these issues are on the input side, but some common documentation on what gets copied and what doesn't would be nice).

@Rainald62
Author

> (because it is the same object!)

Thank you. I will use immutable objects together with orig=True to keep the code readable.
