Skip to content

Conversation

scoder
Copy link
Member

@scoder scoder commented Aug 23, 2025

No description provided.

scoder added 30 commits June 9, 2025 10:33
…read-only proxy search. Once we know that there are no further proxies referring to the subtree. we don't need locking anymore.
…rite_with()" order and avoid repeated locking of the same source document (since most appended elements will probably come from the same one).
@scoder
Copy link
Member Author

scoder commented Aug 23, 2025

The current state compiles and runs most tests. But it crashes in the threading tests under Py3.14t, e.g. in "lxml.tests.test_threading.ThreadingTestCase.test_concurrent_proxies":

AssertionError
Python/generated_cases.c.h:5288: _Py_NegativeRefcount: Assertion failed: object has negative ref count
<object at 0x200140c0d90 is freed>
Fatal Python error: _PyObject_AssertFailed: _PyObject_AssertFailed
Python runtime state: initialized

Stack (most recent call first):
  File "src/lxml/tests/test_threading.py", line 384 in testrun
  File "src/lxml/tests/test_threading.py", line 42 in sync_start
  File "python3.14t/lib/python3.14t/threading.py", line 1023 in run
  File "python3.14t/lib/python3.14t/threading.py", line 1081 in _bootstrap_inner
  File "python3.14t/lib/python3.14t/threading.py", line 1043 in _bootstrap

Extension modules: lxml._elementpath, lxml.etree (total: 2)
Exception ignored in: 'lxml.etree._Element.__del__'
Traceback (most recent call last):
  File "src/lxml/tests/test_threading.py", line 384, in testrun
Objects/object.c:595: PyObject_CallFinalizerFromDealloc: Assertion failed: PyObject_CallFinalizerFromDealloc called on object with a non-zero refcount
Memory block allocated at (most recent call first):
  File "src/lxml/tests/test_threading.py", line 383

object address  : 0x200160c01f0
object refcount : 0
object type     : 0x7ffff6d48f20
object type name: lxml.etree._Element
object repr     : <refcnt 0 at 0x200160c01f0>

Objects/object.c:595: PyObject_CallFinalizerFromDealloc: Assertion failed: PyObject_CallFinalizerFromDealloc called on object with a non-zero refcount
Memory block allocated at (most recent call first):
  File "src/lxml/tests/test_threading.py", line 383

object address  : 0x200120c0850
object refcount : 0
object type     : 0x7ffff6d48f20
object type name: lxml.etree._Element
object repr     : <refcnt 0 at 0x200120c0850>

python3: ./Include/internal/pycore_object.h:705: _PyObject_ResurrectStart: Assertion `Py_REFCNT(op) == 0' failed.
Exception in thread Thread-5 (sync_start):
    del el
AssertionError: 

Thread 3 "Thread-2 (sync_" received signal SIGABRT, Aborted.
[Switching to Thread 0x7ffff37fd6c0 (LWP 428824)]
__pthread_kill_implementation (no_tid=0, signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:44
Warnung: 44     ./nptl/pthread_kill.c: Datei oder Verzeichnis nicht gefunden
(gdb) bt 20
#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:44
#1  __pthread_kill_internal (signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:78
#2  __GI___pthread_kill (threadid=<optimized out>, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3  0x00007ffff7c4527e in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4  0x00007ffff7c288ff in __GI_abort () at ./stdlib/abort.c:79
#5  0x00005555555df1ae in fatal_error_exit (status=-1) at Python/pylifecycle.c:3155
#6  fatal_error (fd=<optimized out>, header=header@entry=1, prefix=prefix@entry=0x555555b38cc0 <__func__.11> "_PyObject_AssertFailed", msg=msg@entry=0x555555ac5f01 "_PyObject_AssertFailed", status=status@entry=-1) at Python/pylifecycle.c:3301
#7  0x000055555599ee18 in _Py_FatalErrorFunc (func=func@entry=0x555555b38cc0 <__func__.11> "_PyObject_AssertFailed", msg=msg@entry=0x555555ac5f01 "_PyObject_AssertFailed") at Python/pylifecycle.c:3387
#8  0x0000555555782d4d in _PyObject_AssertFailed (obj=obj@entry=0x200140c0d90, expr=expr@entry=0x0, msg=msg@entry=0x555555ac5f2f "object has negative ref count", file=file@entry=0x555555acb72a "Python/generated_cases.c.h", line=line@entry=5288, function=function@entry=0x555555b39240 <__func__.63> "_Py_NegativeRefcount") at Objects/object.c:3031
#9  0x0000555555782ed6 in _Py_NegativeRefcount (filename=filename@entry=0x555555acb72a "Python/generated_cases.c.h", lineno=lineno@entry=5288, op=op@entry=0x200140c0d90) at Objects/object.c:272
#10 0x000055555578256a in _Py_DecRefSharedIsDead (lineno=5288, filename=<optimized out>, o=0x200140c0d90) at Objects/object.c:400
#11 _Py_DecRefSharedDebug (o=0x200140c0d90, filename=<optimized out>, lineno=5288) at Objects/object.c:422
#12 0x00005555555ffbf1 in Py_DECREF (op=0x200140c0d90, lineno=5288, filename=0x555555acb72a "Python/generated_cases.c.h") at ./Include/refcount.h:355
#13 _PyEval_EvalFrameDefault (tstate=<optimized out>, frame=<optimized out>, throwflag=<optimized out>) at Python/generated_cases.c.h:5288

This stress test concurrently creates Element proxy objects for the same C XML tree node, reusing the same instance if it already exists (potentially created in another thread) and creating a new one if not. The C node keeps a pointer (not an owned reference) to its current proxy object, which enables the reuse but also complicates the concurrent deallocation. The first implementation in lxml, twenty years ago, used weak references, but they were waaaayyyy to slow for lxml's need and were quickly replaced by the backpointer scheme.

Previously, the instantiation was guarded by the GIL, now I use a PyMutex on the XML Document instance to guard against concurrent proxy access.

https://github.com/scoder/lxml/blob/c2df5dbfa6f7cf3cc9efd4cc4a44b02d15263ac6/src/lxml/etree.pyx#L2104-L2148

The proxy deallocation is really tricky then:
https://github.com/scoder/lxml/blob/c2df5dbfa6f7cf3cc9efd4cc4a44b02d15263ac6/src/lxml/etree.pyx#L920-L969

EDIT: I actually only recently switched from a Cython's __dealloc__ to a __del__ finaliser to more safely allow concurrent reinstantiations. But they don't seem entirely safe.

@ngoldbaum, is there a guard in CPython against concurrent object finalisation and deallocation?

@Fidget-Spinner
Copy link

@scoder Btw if the check for reusing object code is Py_RECNT(op) == 1, that check is wrong and needs updating to _PyObject_IsUniquelyReferenced

@ngoldbaum
Copy link

is there a guard in CPython against concurrent object finalisation and deallocation?

I'm not sure, but I shared this PR elsewhere to some core devs who might know.

@scoder
Copy link
Member Author

scoder commented Aug 24, 2025

if the check for reusing object code is Py_REFCNT(op) == 1, that check is wrong and needs updating to _PyObject_IsUniquelyReferenced

Thanks for the hint, but I'm not quite sure that's the right condition. _PyObject_IsUniquelyReferenced() (called by its public counterpart PyUnstable_Object_IsUniquelyReferenced) uses this code in the FT build:

    return (_Py_IsOwnedByCurrentThread(ob) &&
            _Py_atomic_load_uint32_relaxed(&ob->ob_ref_local) == 1 &&
            _Py_atomic_load_ssize_relaxed(&ob->ob_ref_shared) == 0);

IIUC from the PR that added it, _Py_IsOwnedByCurrentThread(ob) means that the object was created in the current thread and is called here because the intention is to modify the object in the current thread. That's not the case for lxml. Instead, lxml wants to know if the object has been resurrected by a concurrent thread during its finalisation in the current thread, regardless of where it was created and who owns it. The check is guarded by a lock, and user code cannot hold a reference to the object any more (because it is being finalised), so there is not risk of concurrent refcount modification while we're evaluating the refcount check.

So, if Py_REFCNT(op) was not adapted to report the correct (and expected) refcount, how can I calculate the correct global refcount in lxml?

@scoder
Copy link
Member Author

scoder commented Aug 24, 2025

Hmm, but the _Py_REFCNT implementation looks correct to me. Why do you advise against it?

@scoder
Copy link
Member Author

scoder commented Aug 24, 2025

@scoder Btw if the check for reusing object code is Py_RECNT(op) == 1, that check is wrong and needs updating to _PyObject_IsUniquelyReferenced

Ah, I think what you meant was not the place where the proxy instance is deallocated but the place where the decision is taken whether to use an existing proxy instance as proxy for a C node.

In that case as well, the condition is wrong because it ensures a refcount of 1. A proxy instance can also be reused if the refcount is higher than that. (And I really don't care about how many threads create, use and throw away a proxy object – there can only ever be one, globally, for a given C node.)

The problem I've been fighting for a while now is that CPython can decide at any time to start the finalisation of a proxy object, and lxml will only learn about it when __del__ is called. That's late because other threads might already have started to use the proxy while CPython is still preparing its deallocation. So __del__ must be able to detect concurrent resurrection and stop the deallocation in that case.

I already added a refcount check to hasProxy(), which is used to determine if a proxy instance can be reused:

cdef inline bint hasProxy(xmlNode* c_node) noexcept:
    proxy = <cpython.object.PyObject*> c_node._private
    return (
        # Node has a proxy reference?
        proxy is not NULL and
        # Proxy reference is not currently deallocating?
        cpython.ref._Py_REFCNT(proxy) > 0
    )

https://github.com/scoder/lxml/blob/ft_python/src/lxml/proxy.pxi#L18-L27

But that still bares a (small but non-zero) risk of the refcount going to zero after checking the refcount with hasProxy() and before increfing it to use the proxy, so that CPython could still trigger the finalisation of an object that is used, which must then again be caught and aborted by __del__.

But, there's apparently a refcount check in PyObject_CallFinalizerFromDealloc(), which fails hard with a fatal C assertion if the refcount has been increased from 0 between the time it went down to 0 so that CPython started the finalisation and the time when something calls PyObject_CallFinalizerFromDealloc(). That is very annoying because it can happen and simply means that the object has been resurrected before it learned about its finalisation. And that seems perfectly ok to me, give that the finaliser is already allowed to resurrect to object.
I created a CPython ticket for this: python/cpython#138101

@Fidget-Spinner
Copy link

Fidget-Spinner commented Aug 24, 2025

Sorry I pointed to the wrong function. It's PyUnstable_Object_IsUniqueReferencedTemporary, not _PyObject_IsUniquelyReferenced.

The difference is that if you have something on the stack, the refcount is no longer guaranteed to be the "true" refcount by CPython. CPython elides reference counts for certain temporaries on the stack now. PyUnstable_Object_IsUniqueReferencedTemporary scans the topmost frame's stack to check the real refcount.

@scoder
Copy link
Member Author

scoder commented Aug 24, 2025

The first thing that PyUnstable_Object_IsUniqueReferencedTemporary does is call _PyObject_IsUniquelyReferenced, which, as I argumented above, is already too restrictive.
https://github.com/python/cpython/blob/6fcac09401e336b25833dcef2610d498e73b27a1/Objects/object.c#L2745-L2766

@Fidget-Spinner
Copy link

Yeah I get your point, I was answering this question:

So, if Py_REFCNT(op) was not adapted to report the correct (and expected) refcount, how can I calculate the correct global refcount in lxml?

Basically, Py_REFCNT is no longer "correct" on 3.14 and later.

@Fidget-Spinner
Copy link

Though I don't think Py_REFCNT is the problem here.

@scoder
Copy link
Member Author

scoder commented Aug 24, 2025

Ah, TryIncRef seems to do the trick. It gives atomic increfs in safe cases. I can easily change the code to make use of that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants