Skip to content

Replace use of renderer._uid by weakref. #9070

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Aug 28, 2017

Conversation

anntzer
Copy link
Contributor

@anntzer anntzer commented Aug 22, 2017

This avoids the need for a note regarding issues with renderer classes
that do not inherit from RendererBase.

Remember that the goal of _uid was to prevent two renderers from sharing
the same id() even though they were actually different (if one has been
GC'd and another is created at the same address). Maintaining a weakref
to the renderer works as well, as weakref equality is thusly defined:

If the referents are still alive, two references have the same
equality relationship as their referents (regardless of the
callback). If either referent has been deleted, the references are
equal only if the reference objects are the same object.

Alternate for #8708, which it reverts. #9069 should get modified accordingly if this gets merged.
attn @QuLogic @tacaswell

PR Summary

PR Checklist

  • Has Pytest style unit tests
  • Code is PEP 8 compliant
  • New features are documented, with examples if plot related
  • Documentation is sphinx and numpydoc compliant
  • Added an entry to doc/users/next_whats_new/ if major new feature (follow instructions in README.rst there)
  • Documented in doc/api/api_changes.rst if API changed in a backward-incompatible way

This avoids the need for a note regarding issues with renderer classes
that do not inherit from RendererBase.

Remember that the goal of _uid was to prevent two renderers from sharing
the same id() even though they were actually different (if one has been
GC'd and another is created at the same address).  Maintaining a weakref
to the renderer works as well, as weakref equality is thusly defined:

    If the referents are still alive, two references have the same
    equality relationship as their referents (regardless of the
    callback). If either referent has been deleted, the references are
    equal only if the reference objects are the same object.
@tacaswell tacaswell added this to the 2.1 (next point release) milestone Aug 23, 2017
@tacaswell
Copy link
Member

Does this trigger the edge case you fix in #9074?

@anntzer
Copy link
Contributor Author

anntzer commented Aug 23, 2017

#9074 is actually totally unrelated (that came up when I was trying to build a weak cache of FT2Font -> FT_Face for mpl_cairo some time ago).
This only weakrefs renderers, which are Python-level objects (not extension types) so they're always weakrefable (for example, mpl_cairo's renderer class inherits from both the extension class and RendererBase).

@WeatherGod
Copy link
Member

Question... this all hinges on the definition of the equality of weakrefs and to not hold hard references to large python objects. The property tuple that contains the weakref here is meant to get hashed, which means that the weakrefs aren't directly compared, right? I don't know what the rules are with weakref hashes.

@tacaswell
Copy link
Member

Weak references are hashable if the object is hashable. They will maintain their hash value even after the object was deleted. If hash() is called the first time only after the object was deleted, the call will raise TypeError.

@WeatherGod
Copy link
Member

Ah, so the hashes of two weakrefs that point to the same object will be the same?

@anntzer
Copy link
Contributor Author

anntzer commented Aug 24, 2017

Yes.

@tacaswell
Copy link
Member

Doing a bit more digging, I think we can simplify this even more and just use hash(renderer).

I think there is a (probably impossible) race condition here is that if you try to take the hash for the first time after the object has gone away.

In [108]: class foo(): pass

In [109]: (weakref.ref(foo()), ) in {}
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-109-32cf9cf15e2e> in <module>()
----> 1 (weakref.ref(foo()), ) in {}

TypeError: weak object has gone away

The docs from__hash__ say

User-defined classes have eq() and hash() methods by default; with them, all objects compare unequal (except with themselves) and x.hash() returns an appropriate value such that x == y implies both that x is y and hash(x) == hash(y).

@WeatherGod
Copy link
Member

Isn't that what the original code did (before the uid change)? The uid approach tried to fix an edge case where id() of two renderers created at different times could collide, right?

@tacaswell
Copy link
Member

no, the original code was using id(obj)

@anntzer
Copy link
Contributor Author

anntzer commented Aug 24, 2017

If ids can collide, hashs can collide too.

In [1]: class C:
   ...:     def __init__(self, a): self.a = a
   ...: 

In [2]: id(C(1)) == id(C(2))  # relying on the fact that the first C gets gc'd immediately
Out[2]: True

In [3]: hash(C(1)) == hash(C(2))
Out[3]: True

The only use of get_prop_tup in the codebase immediately uses the tuple for a dict lookup (in _get_layout), while the renderer is definitely still alive, so there won't be a race condition.
I'm not convinced we should bother with other use cases.

@tacaswell tacaswell added the Release critical For bugs that make the library unusable (segfaults, incorrect plots, etc) and major regressions. label Aug 26, 2017
@tacaswell
Copy link
Member

Hmm, I wonder if that is an edge case that should go up to cpython?

I think this should go in as it eliminates a epi-cycle that we added this cycle.

@anntzer
Copy link
Contributor Author

anntzer commented Aug 26, 2017

id(C()) == id(C()) makes sense because id just returns the pointer address and the object is being reused (there's probably a free list somewhere).
There is no guarantee whatsoever that different objects have different hashes (and that's even more true when the two objects are not even alive at the same time). The only guarantee is that two different objects have different hashes.
So I don't see anything wrong with cpython here.

@tacaswell
Copy link
Member

In [122]: class foo():...

In [123]: foo() == foo()
Out[123]: False

In [125]: id(foo()) == id(foo())
Out[125]: True

In [126]: hash(foo()) == hash(foo())
Out[126]: True

In [127]: foo() == foo()
Out[127]: False

In [128]: foo() is foo()
Out[128]: False

violates the docs on hash. I suspect the docs should just needs a caveat about objects must exist for all the conditions to hold.

@anntzer
Copy link
Contributor Author

anntzer commented Aug 26, 2017

Ah, I see. In https://docs.python.org/3/reference/datamodel.html#object.__hash__

x.hash() returns an appropriate value such that x == y implies both that x is y and hash(x) == hash(y).

should be

x.hash() returns an appropriate value such that x == y and x is y both imply that hash(x) == hash(y).

right?

@dopplershift dopplershift merged commit 9d3b14c into matplotlib:master Aug 28, 2017
@tacaswell
Copy link
Member

I think the sentence is ok as it is, it may need some clarity that it is 'if' not 'iff' though.

@anntzer anntzer deleted the no-renderer-uid branch August 28, 2017 19:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Release critical For bugs that make the library unusable (segfaults, incorrect plots, etc) and major regressions.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants