Skip to content

Potential Windows-specific lost event race in the implementation of time.sleep #135407

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
zackw opened this issue Jun 11, 2025 · 0 comments
Open
Labels
OS-windows type-bug An unexpected behavior, bug, or error

Comments

@zackw
Copy link

zackw commented Jun 11, 2025

Bug description:

In the Windows-specific implementation of time.sleep there appears this code:

cpython/Modules/timemodule.c

Lines 2341 to 2345 in b706ff0

// Check for pending SIGINT signal before resetting the event
if (PyErr_CheckSignals()) {
goto error;
}
ResetEvent(sigint_event);

If a control-C event happens after PyErr_CheckSignals returns, but before ResetEvent is called, I believe that event may be lost.

This is probably a hard race window to hit, and I don't have a Windows development environment, so I have not confirmed that it is possible.

CPython versions tested on:

CPython main branch

Operating systems tested on:

Windows

@zackw zackw added the type-bug An unexpected behavior, bug, or error label Jun 11, 2025
zackw added a commit to MillionConcepts/cpython-patches that referenced this issue Jun 11, 2025
…n the stdlib.

This patch demonstrates how `PyErr_CheckSignalsDetached` (python#135358; see
also python#133465 and capi-workgroup/decisions#68) might be used for some
(small) efficiency improvements in the stdlib.  Almost all existing
uses of `PyErr_CheckSignals` in the stdlib appear in loops with this
general structure:

```c
int res = 0;
int async_err = 0;
do {
    Py_BEGIN_ALLOW_THREADS
    res = system_call(arg1, arg2);
    Py_END_ALLOW_THREADS
} while (res < 0 && errno == EINTR
         && !(async_err = PyErr_CheckSignals()));
if (res < 0) {
    return !async_err ? PyErr_SetFromErrno(PyExc_OSError) : NULL;
}
Py_RETURN_NONE;
```

With the addition of `PyErr_CheckSignalsDetached`, this can become

```c
int res = 0;
int async_err = 0;
PyThreadState *tstate = PyEval_SaveThread();
do {
    res = system_call(arg1, arg2);
} while (res < 0 && errno == EINTR
         && !(async_err = PyErr_CheckSignalsDetached(tstate)));
PyEval_RestoreThread(tstate);
if (res < 0) {
    return !async_err ? PyErr_SetFromErrno(PyExc_OSError) : NULL;
}
Py_RETURN_NONE;
```

There were also a small number of places where

```
Py_BLOCK_THREADS
sig = PyErr_CheckSignals();
Py_UNBLOCK_THREADS
```

could become just

```
sig = PyErr_CheckSignalsDetached(tstate);
```

I chose to use explicit `PyErr_{Save,Restore}Thread` throughout this
patch, in order to give the thread-state parameter to
`PyErr_CheckSignalsDetached` a visible name in the calling code.

This is possibly an argument for adding a wrapper macro
(`PyErr_CHECK_SIGNALS_DETACHED()`, perhaps) that is part of the
`Py_BEGIN_ALLOW_THREADS` family, and is therefore entitled to use the
hidden `_state` variable.  Or it could be an argument for adding a new
construct like

```
Py_WITH_DETACHED_THREAD_STATE (tstate) {
    do {
        res = system_call(arg1, arg2);
    } while (res < 0 && errno == EINTR
             && !(async_err = PyErr_CheckSignalsDetached(tstate)));
}
```

(This could be implemented in plain C by off-label use of a for loop.
I’m not sure if that’s a good idea or not.  In particular, `break`
and `continue` at the top level of a `Py_WITH_DETACHED_THREAD_STATE`
block would have surprising effects.)

The patch only tackles the most straightforward conversions.  Places
in the stdlib that could probably benefit from a conversion, but that
I did not touch, are:

* `signal.sigtimedwait`, `select.select`, `select.poll`, `select.devpoll`,
  `select.epoll`, `select.kevent`, `time.sleep`, `_multiprocessing.semaphore`,
  and the ssl and socket modules.  All of these places need to do arithmetic
  on some combination of `struct timespec`, `struct timeval`, and `Py_time_t`
  in their loops, and the APIs for doing so are not currently safe to use
  without an attached thread state (in particular, they may set the error
  indicator).

* The ssl and socket modules also do a bunch of other work in the
  same loops that detach the thread state and I’m not familiar enough
  with those APIs to know if it’s safe to expand the regions with
  detached thread state.  This is unfortunate as I think these are the
  loops with the biggest chance of _substantial_ efficiency gains from
  enlarging those regions.

* `time.sleep` is also a horrible maze of `#ifdef`s and I think I see
  a race bug in the Windows-specific code (filed separately as python#135407).

* The tkinter module goes back and forth between holding a Python
  thread state and holding a _Tcl_ thread state and I didn’t want to
  mess with that.

There are also many places where no conversion is necessary:

* `signal.pause`, `signal.raise`, `signal.signal`,
  `signal.pthread_sigmask`, `signal.pthread_kill`, `os.kill`,
  `os._getdiskusage`, `PyObject_Print`, `PyObject_Repr`,
  `PyObject_Str`, `PyErr_SetFromErrnoWithFilenameObjects`,
  `builtin_input_impl`:
  These do not contain a loop, so there’s no reason to mess with them.

* The `_sre` module, `longobject.c`, `faulthandler.c`, `traceback.c`:
  These don’t detach the thread state at all, so again there’s no
  reason to mess with them.

As a final note, it would probably be a good idea to make it a
documented guarantee that `PyEval_{Save,Restore}Thread` and
`PyErr_CheckSignals{,Detached}` preserve the value of `errno`
and all the Windows-specific errno analogues.  A lot of this code
is already assuming that this is the case, and the code that isn’t
making that assumption is substantially clunkier for it.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
OS-windows type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

No branches or pull requests

2 participants