Skip to content

Crash when calling textiowrapper_iternext and writing to a text file simultaneously in ft build #134908

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
duaneg opened this issue May 30, 2025 · 12 comments
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-free-threading type-crash A hard crash of the interpreter, possibly with a core dump

Comments

@duaneg
Copy link
Contributor

duaneg commented May 30, 2025

Running this script under free-threading build:

import os
import tempfile
import threading

N=2
COUNT=100

def writer(file, barrier):
    barrier.wait()
    for _ in range(COUNT):
        f.write("x")

def reader(file, stopping):
    while not stopping.is_set():
        for line in file:
            assert line == ""

stopping = threading.Event()
with tempfile.NamedTemporaryFile("w+") as f:
    reader = threading.Thread(target=reader, args=(f, stopping))
    reader.start()

    barrier = threading.Barrier(N)
    writers = [threading.Thread(target=writer, args=(f, barrier))
               for _ in range(N)]
    for t in writers:
        t.start()
    for t in writers:
        t.join()
    stopping.set()
    reader.join()

    f.flush()
    assert(os.stat(f.name).st_size == COUNT * N)

...results in a crash:

python: ./Modules/_io/textio.c:1751: _io_TextIOWrapper_write_impl: Assertion `self->pending_bytes_count == 0' failed.
Aborted (core dumped)

The textiowrapper_iternext method is not protected by a critical section and calls _textiowrapper_readline, which calls _textiowrapper_writeflush, which relies on the GIL or critical section to synchronise access to its internal data. In a free-threading build this means iterating over lines in a text file is not thread-safe and can crash if it races with writes or other operations.

It looks like all other entry points that could call _textiowrapper_writeflush are protected by the critical section, so the easiest fix is probably to just likewise protect textiowrapper_iternext.

CPython versions tested on:

CPython main branch

Operating systems tested on:

Linux

Output from running 'python -VV' on the command line:

No response

Linked PRs

@duaneg duaneg added the type-crash A hard crash of the interpreter, possibly with a core dump label May 30, 2025
duaneg added a commit to duaneg/cpython that referenced this issue May 30, 2025
…tion

``textiowrapper_iternext`` calls ``_textiowrapper_writeflush`` but does not
take the critical section, making it racy in free-threaded builds.
@duaneg
Copy link
Contributor Author

duaneg commented May 30, 2025

Note that I found this while investigating #118138, however I think it is a separate bug that should be addressed under a separate issue.

CC @cmaloney who seems to be looking at related issues in this area already 👋

@ZeroIntensity
Copy link
Member

ZeroIntensity commented May 30, 2025

I'm pretty sure we don't support concurrent calls to __next__. You need to manually serialize the calls yourself. See gh-124397.

@duaneg
Copy link
Contributor Author

duaneg commented May 30, 2025

See gh-124397.

That is a very interesting discussion, thanks for linking!

I'm pretty sure we don't support concurrent calls to __next__. You need to manually serialize the calls yourself.

I totally agree that in general iterators should not be assumed to be thread-safe, in the broader sense. However they should not crash, which is happening here. In the description I reported an assertion failure, but the repro script will crash with assertions disabled too, just in more "interesting" ways.

@ZeroIntensity
Copy link
Member

You can get similar crashes from pure-Python by concurrently calling an object's __init__. I don't think there's much we can do without substantially hurting performance.

@duaneg
Copy link
Contributor Author

duaneg commented May 30, 2025

I'm sure there are places where fixing crashes without substantially hurting performance might be difficult or impossible, but I don't think this is one of them. It should have no impact on GIL performance and minimal in free-threading.

The third bullet point in #124397 is:

Other iterators implemented in C will get only the minimal changes necessary to cause them to not crash in a free-threaded build. The edits should be made in a way that does not impact existing semantics or performance (i.e. do not damage the standard GIL build).

I think this fix is exactly that...or at least, that was exactly what I intended when I wrote it. Of course, I may well have missed something!

@picnixz picnixz added the interpreter-core (Objects, Python, Grammar, and Parser dirs) label May 30, 2025
@ZeroIntensity
Copy link
Member

I'll let others weigh in here, but my personal stance is that concurrent iteration without synchronization by the caller is not a real use-case. The example you sent would suffer from so much lock contention that it would be no better than calling it in a single thread :(

I think that last point is in regards to making the iterator thread-safe when others are calling separate methods on it, not necessarily concurrent calls to __next__. I think this needs clarification, though.

@colesbury
Copy link
Contributor

From looking at the repro code, this looks like something we should fix.

@ZeroIntensity
Copy link
Member

Ok, thanks Sam. For clarity: when should we fix these cases?

@colesbury
Copy link
Contributor

This looks like concurrent operations on a NamedTemporaryFile. I think I/O should generally be internally synchronized. That's mostly true for Python already, it's usually true at the OS level, as well as in other language runtimes.

I get that there is a crash in iteration, but it doesn't look like there's an iterator shared between threads.

@ZeroIntensity
Copy link
Member

The NamedTemporaryFile is the iterator, and that's shared between threads.

@colesbury
Copy link
Contributor

Thanks. That still feels pretty different from list and dict iterators. For one thing, it'd be difficult to make list and dict iterators thread-safe without slowing down the common case substantially. I don't think that's the case here. The iteration interface just provides access to readline, which is already synchronized.

I think there's a difference between objects like NamedTemporaryFile that provide an iteration interface and objects like list_iterator, which are usually temporary objects that you don't think about and rarely store to variables.

@ZeroIntensity
Copy link
Member

Ok, that makes sense to me.

colesbury pushed a commit that referenced this issue Jun 2, 2025
…-134910)

The `textiowrapper_iternext` function called `_textiowrapper_writeflush`, but did not
use a critical section, making it racy in free-threaded builds.
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Jun 2, 2025
…on (pythongh-134910)

The `textiowrapper_iternext` function called `_textiowrapper_writeflush`, but did not
use a critical section, making it racy in free-threaded builds.
(cherry picked from commit 44fb7c3)

Co-authored-by: Duane Griffin <duaneg@dghda.com>
colesbury pushed a commit to colesbury/cpython that referenced this issue Jun 2, 2025
…l section (pythongh-134910)

The `textiowrapper_iternext` function called `_textiowrapper_writeflush`, but did not
use a critical section, making it racy in free-threaded builds.
(cherry picked from commit 44fb7c3)

Co-authored-by: Duane Griffin <duaneg@dghda.com>
colesbury pushed a commit that referenced this issue Jun 2, 2025
…ion (gh-134910) (gh-135039)

The `textiowrapper_iternext` function called `_textiowrapper_writeflush`, but did not
use a critical section, making it racy in free-threaded builds.
(cherry picked from commit 44fb7c3)

Co-authored-by: Duane Griffin <duaneg@dghda.com>
colesbury pushed a commit to colesbury/cpython that referenced this issue Jun 2, 2025
…l section (pythongh-134910)

The `textiowrapper_iternext` function called `_textiowrapper_writeflush`, but did not
use a critical section, making it racy in free-threaded builds.
(cherry picked from commit 44fb7c3)

Co-authored-by: Duane Griffin <duaneg@dghda.com>
colesbury added a commit that referenced this issue Jun 2, 2025
…ion (gh-134910) (gh-135040)

The `textiowrapper_iternext` function called `_textiowrapper_writeflush`, but did not
use a critical section, making it racy in free-threaded builds.
(cherry picked from commit 44fb7c3)

Co-authored-by: Duane Griffin <duaneg@dghda.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-free-threading type-crash A hard crash of the interpreter, possibly with a core dump
Projects
None yet
Development

No branches or pull requests

4 participants