Skip to content

gh-133465: Efficient signal checks with detached thread state. #135358

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

zackw
Copy link

@zackw zackw commented Jun 10, 2025

Add new C-API functions PyErr_CheckSignalsDetached and PyErr_AreSignalsPending.

PyErr_CheckSignalsDetached can only be called by threads that don’t have an attached thread state. It does the same thing as PyErr_CheckSignals, except that it guarantees it will only reattach the supplied thread state if necessary in order to run signal handlers. (Also, it never runs the cycle collector.)

PyErr_AreSignalsPending can be called with or without an attached thread state. It reports to its caller whether signals are pending, but never runs any handlers itself.

Rationale: Compiled-code modules that implement time-consuming operations that don’t require manipulating Python objects, are supposed to call PyErr_CheckSignals frequently throughout each such operation, so that if the user interrupts the operation with control-C, it is canceled promptly.

In the normal case where no signals are pending, PyErr_CheckSignals is cheap (two atomic memory operations); however, callers must have an attached thread state, and compiled-code modules that implement time-consuming operations are also supposed to detach their thread state during each such operation. The overhead of re-attaching a thread state in order to call PyErr_CheckSignals, and then releasing it again, sufficiently often for reasonable user responsiveness, can be substantial, particularly in traditional (non-free-threaded) builds. These new functions permit compiled-code modules to avoid that extra overhead.


📚 Documentation preview 📚: https://cpython-previews--135358.org.readthedocs.build/en/135358/c-api/exceptions.html#c.PyErr_CheckSignalsDetached

Add new C-API functions `PyErr_CheckSignalsDetached` and
`PyErr_AreSignalsPending`.

`PyErr_CheckSignalsDetached` can *only* be called by threads
that *don’t* have an attached thread state.  It does the same thing as
`PyErr_CheckSignals`, except that it guarantees it will only reattach
the supplied thread state if necessary in order to run signal
handlers.  (Also, it never runs the cycle collector.)

`PyErr_AreSignalsPending` can be called with or without an attached
thread state.  It reports to its caller whether signals are pending,
but never runs any handlers itself.

Rationale: Compiled-code modules that implement time-consuming
operations that don’t require manipulating Python objects, are
supposed to call PyErr_CheckSignals frequently throughout each such
operation, so that if the user interrupts the operation with
control-C, it is canceled promptly.

In the normal case where no signals are pending, PyErr_CheckSignals
is cheap (two atomic memory operations); however, callers must have
an attached thread state, and compiled-code modules that implement
time-consuming operations are also supposed to detach their thread
state during each such operation.  The overhead of re-attaching a
thread state in order to call PyErr_CheckSignals, and then releasing
it again, sufficiently often for reasonable user responsiveness, can
be substantial, particularly in traditional (non-free-threaded) builds.
These new functions permit compiled-code modules to avoid that extra
overhead.
@zackw
Copy link
Author

zackw commented Jun 10, 2025

This pull request semi-supersedes #133466. Given the substantial changes requested in discussion there, I found it easier to start over from scratch. As requested, it has been minimized down to the new API change. Once we are all 100% happy with the new API, then I will make follow-up PRs for the more elaborate documentation changes and the stdlib uses of the new API that were also included in #133466.

This is also semi-draft; I would appreciate code review but the API design discussion in capi-workgroup/decisions#68 has not been finalized yet.

Please note that I will be offline from Friday of this week (that is, 2025 June 13) through the 22nd.

@ZeroIntensity
Copy link
Member

Marking as DO-NOT-MERGE until the C API WG comes to a decision.

@ZeroIntensity ZeroIntensity self-requested a review June 11, 2025 11:15
zackw added a commit to MillionConcepts/cpython-patches that referenced this pull request Jun 11, 2025
…n the stdlib.

This patch demonstrates how `PyErr_CheckSignalsDetached` (python#135358; see
also python#133465 and capi-workgroup/decisions#68) might be used for some
(small) efficiency improvements in the stdlib.  Almost all existing
uses of `PyErr_CheckSignals` in the stdlib appear in loops with this
general structure:

```c
int res = 0;
int async_err = 0;
do {
    Py_BEGIN_ALLOW_THREADS
    res = system_call(arg1, arg2);
    Py_END_ALLOW_THREADS
} while (res < 0 && errno == EINTR
         && !(async_err = PyErr_CheckSignals()));
if (res < 0) {
    return !async_err ? PyErr_SetFromErrno(PyExc_OSError) : NULL;
}
Py_RETURN_NONE;
```

With the addition of `PyErr_CheckSignalsDetached`, this can become

```c
int res = 0;
int async_err = 0;
PyThreadState *tstate = PyEval_SaveThread();
do {
    res = system_call(arg1, arg2);
} while (res < 0 && errno == EINTR
         && !(async_err = PyErr_CheckSignalsDetached(tstate)));
PyEval_RestoreThread(tstate);
if (res < 0) {
    return !async_err ? PyErr_SetFromErrno(PyExc_OSError) : NULL;
}
Py_RETURN_NONE;
```

There were also a small number of places where

```
Py_BLOCK_THREADS
sig = PyErr_CheckSignals();
Py_UNBLOCK_THREADS
```

could become just

```
sig = PyErr_CheckSignalsDetached(tstate);
```

I chose to use explicit `PyErr_{Save,Restore}Thread` throughout this
patch, in order to give the thread-state parameter to
`PyErr_CheckSignalsDetached` a visible name in the calling code.

This is possibly an argument for adding a wrapper macro
(`PyErr_CHECK_SIGNALS_DETACHED()`, perhaps) that is part of the
`Py_BEGIN_ALLOW_THREADS` family, and is therefore entitled to use the
hidden `_state` variable.  Or it could be an argument for adding a new
construct like

```
Py_WITH_DETACHED_THREAD_STATE (tstate) {
    do {
        res = system_call(arg1, arg2);
    } while (res < 0 && errno == EINTR
             && !(async_err = PyErr_CheckSignalsDetached(tstate)));
}
```

(This could be implemented in plain C by off-label use of a for loop.
I’m not sure if that’s a good idea or not.  In particular, `break`
and `continue` at the top level of a `Py_WITH_DETACHED_THREAD_STATE`
block would have surprising effects.)

The patch only tackles the most straightforward conversions.  Places
in the stdlib that could probably benefit from a conversion, but that
I did not touch, are:

* `signal.sigtimedwait`, `select.select`, `select.poll`, `select.devpoll`,
  `select.epoll`, `select.kevent`, `time.sleep`, `_multiprocessing.semaphore`,
  and the ssl and socket modules.  All of these places need to do arithmetic
  on some combination of `struct timespec`, `struct timeval`, and `Py_time_t`
  in their loops, and the APIs for doing so are not currently safe to use
  without an attached thread state (in particular, they may set the error
  indicator).

* The ssl and socket modules also do a bunch of other work in the
  same loops that detach the thread state and I’m not familiar enough
  with those APIs to know if it’s safe to expand the regions with
  detached thread state.  This is unfortunate as I think these are the
  loops with the biggest chance of _substantial_ efficiency gains from
  enlarging those regions.

* `time.sleep` is also a horrible maze of `#ifdef`s and I think I see
  a race bug in the Windows-specific code (filed separately as python#135407).

* The tkinter module goes back and forth between holding a Python
  thread state and holding a _Tcl_ thread state and I didn’t want to
  mess with that.

There are also many places where no conversion is necessary:

* `signal.pause`, `signal.raise`, `signal.signal`,
  `signal.pthread_sigmask`, `signal.pthread_kill`, `os.kill`,
  `os._getdiskusage`, `PyObject_Print`, `PyObject_Repr`,
  `PyObject_Str`, `PyErr_SetFromErrnoWithFilenameObjects`,
  `builtin_input_impl`:
  These do not contain a loop, so there’s no reason to mess with them.

* The `_sre` module, `longobject.c`, `faulthandler.c`, `traceback.c`:
  These don’t detach the thread state at all, so again there’s no
  reason to mess with them.

As a final note, it would probably be a good idea to make it a
documented guarantee that `PyEval_{Save,Restore}Thread` and
`PyErr_CheckSignals{,Detached}` preserve the value of `errno`
and all the Windows-specific errno analogues.  A lot of this code
is already assuming that this is the case, and the code that isn’t
making that assumption is substantially clunkier for it.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants