Skip to content

gh-134908: protect textiowrapper_iternext with critical section #134910

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Jun 2, 2025

Conversation

duaneg
Copy link
Contributor

@duaneg duaneg commented May 30, 2025

textiowrapper_iternext calls _textiowrapper_writeflush but does not take the critical section, making it racy in free-threaded builds.

…tion

``textiowrapper_iternext`` calls ``_textiowrapper_writeflush`` but does not
take the critical section, making it racy in free-threaded builds.
Copy link
Contributor

@cmaloney cmaloney left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not familiar enough with the python critical section implementation bits to be confident this is the right way to do this. Currently on vacation but reviewed some high level pieces as a first pass. Will be back in town next week and can look more thoroughly then. Also happy for others to work on / review / land in the mean time, especially people with more free threading experience.

@duaneg
Copy link
Contributor Author

duaneg commented May 30, 2025

Currently on vacation but reviewed some high level pieces as a first pass

Thanks very much, but please don't feel any obligation to look at this while you are on holiday! There is no particular hurry, it can all wait 😄

Copy link
Member

@ZeroIntensity ZeroIntensity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I said in the issue, concurrent iteration generally isn't worth fixing on our end. It doesn't look like we do this for any of the other _io types.

From what I can tell, the main issue with supporting concurrent iteration is that it's normally not that useful. At the __next__ level, threads will skip over one another, which leaves holes in the data that each thread reads. If you're trying to exhaust a large iterator as fast as possible, like a file, the lock contention will probably be so bad that iterating from multiple threads won't be any more efficient than doing it with one.

…e-134908.3a7PxM.rst

Co-authored-by: Peter Bierma <zintensitydev@gmail.com>
@duaneg
Copy link
Contributor Author

duaneg commented May 30, 2025

As I said in the issue, concurrent iteration generally isn't worth fixing on our end.

Mentioned there also, but FTR this commit is just trying to fix the crash, not make the iterator thread-safe in any wider sense. I completely agree that is not a sensible goal. You will note there is no concurrent access of iterators in the reproduction script or test code!

It doesn't look like we do this for any of the other _io types.

If any of the other _io types are crashing with multi-threaded use...well, I'd be happy to try and fix them next 😉

@ZeroIntensity
Copy link
Member

You will note there is no concurrent access of iterators in the reproduction script or test code!

It concurrently iterates over f.

If any of the other _io types are crashing with multi-threaded use...well, I'd be happy to try and fix them next 😉

I'm pretty sure nearly all extension types will crash with concurrent iteration right now, because very few of them have locking. I think it's intentional for performance reasons.

Copy link
Member

@ZeroIntensity ZeroIntensity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, it looks like we're going to go with this per the discussion in the issue.

That said, I think we can make this thread-safe without a critical section. AFAICT, the only thread-unsafe thing here is the telling field. Let's switch that to an atomic. I'm not worried about torn or inconsistent reads because the data will be pretty screwed up for concurrent readers anyway.

@cmaloney
Copy link
Contributor

That said, I think we can make this thread-safe without a critical section. AFAICT, the only thread-unsafe thing here is the telling field. Let's switch that to an atomic. I'm not worried about torn or inconsistent reads because the data will be pretty screwed up for concurrent readers anyway.

I think the direct readline call needs a critical section around it? (_io_TextIOWrapper_readline_impl which exposes it more directly has one)

    if (Py_IS_TYPE(self, self->state->PyTextIOWrapper_Type)) {
        /* Skip method call overhead for speed */
        line = _textiowrapper_readline(self, -1);
    }

Thoughts on doing: _Py_AssertHoldsTstate() to ensure/validate that it's locked in _textiowrapper_readline, removing the "skip method call overhead" shortcut (so it gets the critical section without more copy/paste), and the telling atomic?

(Probably would/should need to do a single-threaded perf test to validate removing the shortcut path isn't highly detrimental in text line iteration...)

@ZeroIntensity
Copy link
Member

Thoughts on doing: _Py_AssertHoldsTstate() to ensure/validate that it's locked in _textiowrapper_readline, removing the "skip method call overhead" shortcut (so it gets the critical section without more copy/paste), and the telling atomic?

_Py_AssertHoldsTstate will always pass here, that's for whether or not the thread has an attached thread state. I think you're right that we need a critical section around that call. Let's just call _io_TextIOWrapper_readline_impl to let AC do the work.

@duaneg
Copy link
Contributor Author

duaneg commented May 31, 2025

That said, I think we can make this thread-safe without a critical section. AFAICT, the only thread-unsafe thing here is the telling field. Let's switch that to an atomic.

I'm afraid that won't work: atomicity of that field by itself is not sufficient. The other methods that check or change it are written with the assumption it cannot change value while they are running. A critical section is specifically required here: that is what the class is using to synchronise access to its internal data.

@ZeroIntensity
Copy link
Member

ZeroIntensity commented May 31, 2025

Yeah, I'm suggesting you switch to atomics there too. This is only for telling, right?

@duaneg
Copy link
Contributor Author

duaneg commented May 31, 2025

Yeah, I'm suggesting you switch to atomics there too. This is only for telling, right?

I'm a bit worried we may be talking at cross-purposes 😅 Let me recap a bit, to make sure we are all talking about the same things...

As I understand it, the proposal is to call the argument clinic-generated readline function from textiowrapper_iternext, thereby ensuring it takes the critical section, and to make access to telling atomic (required since it is set inside textiowrapper_iternext and hence would not be protected by the critical section).

If I've misunderstood anything there, apologies, and please correct me.

However, to explain why I don't think the approach above will work: we don't need to just make one read/write safe, we need to protect a whole sequence of operations. I.e. we need a lock, which in this case should be the critical section. To illustrate, consider textiowrapper_read_chunk. It checks if (telling) twice. The first time it gets some state from the decoder and unpacks it. The second time it uses that data to update its own internal state following the read. If this function can race with textiowrapper_iternext then it doesn't matter if reading and writing to telling is atomic: if it changes halfway through bad things will happen. Similar issues arise elsewhere telling is accessed.

@ZeroIntensity
Copy link
Member

If this function can race with textiowrapper_iternext then it doesn't matter if reading and writing to telling is atomic: if it changes halfway through bad things will happen.

I don't think we should care about this case. As I said, concurrently reading from an iterator isn't something that's generally useful. You need your own synchronization anyway, so all we need to do is make sure that it doesn't crash. Without the caller holding a lock, threads will get torn reads regardless of how we make it thread-safe.

What's the actual practicality of concurrent reading from a file while randomly writing to it? My main concern here is that we're slowing down single-threaded code for a use-case that doesn't exist.

@duaneg
Copy link
Contributor Author

duaneg commented May 31, 2025

I don't think we should care about this case. As I said, concurrently reading from an iterator isn't something that's generally useful. You need your own synchronization anyway, so all we need to do is make sure that it doesn't crash. Without the caller holding a lock, threads will get torn reads regardless of how we make it thread-safe.

It will crash. Sorry for not being clearer. The possibilities under "bad things" in that function certainly include leaking object references; I am not sure if anything worse can happen there. Elsewhere assertion failures, use-after-free, and general arbitrary memory corruption are possible. For example, the Py_CLEAR(self->snapshot) at the bottom of textiowrapper_iternext_locked will certainly crash if it hits the right window racing with tell.

What's the actual practicality of concurrent reading from a file while randomly writing to it? My main concern here is that we're slowing down single-threaded code for a use-case that doesn't exist.

I'm not suggesting there is any legit reason to do this. It sounds like a really bad idea to me. As I said before, I don't really care if the result is gibberish. However, it should not crash the interpreter!

As for performance: my proposed fix takes a critical section on entering textiowrapper_iternext. This will have no impact on single-threaded performance on GIL builds and de minimis on free-threaded builds. Note that the alternative of calling _io_TextIOWrapper_readline_impl also takes the same critical section and needs to marshal arguments into PyObjects and back out again.

@duaneg
Copy link
Contributor Author

duaneg commented May 31, 2025

Probably would/should need to do a single-threaded perf test to validate removing the shortcut path isn't highly detrimental in text line iteration...

A good suggestion. A quick and very dirty benchmark shows no significant performance difference between this branch and where it diverged:

import io
import timeit

LINES = 1_000_000
bytes = b"\n" * LINES

def test():
    buffer = io.BytesIO(b"\n" * LINES)
    wrapper = io.TextIOWrapper(buffer)
    for _ in wrapper:
        pass

if __name__ == '__main__':
    timer = timeit.Timer(test)
    elapsed = timer.timeit(number=100)
    print(f"{elapsed:.2f}")

On my system, with an optimised free threading build, that is ~3.7ns per call, with or without the fix.

@ZeroIntensity
Copy link
Member

For example, the Py_CLEAR(self->snapshot) at the bottom of textiowrapper_iternext_locked will certainly crash if it hits the right window racing with tell.

That should get a critical section around it, but it's not the common path.

de minimis on free-threaded builds.

Yeah, the acquisition of a critical section itself is efficient. But critical sections come with a lot more bells and whistles than you'd think. In particular, every time the thread state is detached, the thread has to drop the lock, and then re-acquire that lock upon reattaching. The thread state is detached more often than you'd think; it happens between bytecode instructions, and also happens every time a mutex or other critical section is acquired.

needs to marshal arguments into PyObjects and back out again.

Hm? There's no marshalling there, AFAICT. Using the AC version is just a simpler way to automatically wrap the critical section. I don't have much of a preference whether we're explicit or implicit there.


Basically, my suggestion is to keep the locking in places where we only really need it to prevent crashes. iternext has a lot of paths that don't require locking, and it's something that will only be called concurrently when fuzzing Python. This fix works for simplicity, but it also sets a bad example, because iternext generally isn't something that should have to worry about concurrency.

Copy link
Member

@ZeroIntensity ZeroIntensity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, after some more pondering, I retract my previous suggestion. It's probably too much additional complexity to make telling atomic, but I want to be extremely cautious about precedent here. Please add a comment explaining why we're going with a critical section, and that it's a special case.

I'm not super excited about adding locks here, but there's not too many great options:

  • Leave it as is, causing crashes when fuzzing.
  • Use some (relaxed) atomics to keep things generally fast for single-threaded code, at the cost of being significantly more complex.
  • Add locking around the __next__ call, which hurts performance, especially considering the frequent I/O jumping through extra hoops with the implicit acquiring and releasing.

@@ -3210,6 +3213,16 @@ textiowrapper_iternext(PyObject *op)
return line;
}

static PyObject *
textiowrapper_iternext(PyObject *op)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a comment regarding fuzzing here?

@colesbury colesbury self-requested a review June 2, 2025 14:34
Copy link
Contributor

@colesbury colesbury left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with some minor edits. I'll make the changes directly to the PR and merge it.

colesbury and others added 2 commits June 2, 2025 12:12
@colesbury colesbury enabled auto-merge (squash) June 2, 2025 16:14
@colesbury
Copy link
Contributor

Thanks @duaneg for the bug report and fix and everyone else for the code reviews!

@colesbury colesbury merged commit 44fb7c3 into python:main Jun 2, 2025
39 checks passed
@miss-islington-app
Copy link

Thanks @duaneg for the PR, and @colesbury for merging it 🌮🎉.. I'm working now to backport this PR to: 3.13, 3.14.
🐍🍒⛏🤖 I'm not a witch! I'm not a witch!

@ambv ambv added needs backport to 3.13 bugs and security fixes and removed needs backport to 3.13 bugs and security fixes labels Jun 2, 2025
@miss-islington-app
Copy link

Thanks @duaneg for the PR, and @colesbury for merging it 🌮🎉.. I'm working now to backport this PR to: 3.13.
🐍🍒⛏🤖 I'm not a witch! I'm not a witch!

@miss-islington-app
Copy link

Sorry, @duaneg and @colesbury, I could not cleanly backport this to 3.13 due to a conflict.
Please backport using cherry_picker on command line.

cherry_picker 44fb7c361cb24dcf9989a7a1cfee4f6aad5c81aa 3.13

@ambv ambv added needs backport to 3.14 bugs and security fixes and removed needs backport to 3.14 bugs and security fixes labels Jun 2, 2025
@miss-islington-app
Copy link

Thanks @duaneg for the PR, and @colesbury for merging it 🌮🎉.. I'm working now to backport this PR to: 3.14.
🐍🍒⛏🤖

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Jun 2, 2025
…on (pythongh-134910)

The `textiowrapper_iternext` function called `_textiowrapper_writeflush`, but did not
use a critical section, making it racy in free-threaded builds.
(cherry picked from commit 44fb7c3)

Co-authored-by: Duane Griffin <duaneg@dghda.com>
@bedevere-app
Copy link

bedevere-app bot commented Jun 2, 2025

GH-135039 is a backport of this pull request to the 3.14 branch.

@bedevere-app bedevere-app bot removed the needs backport to 3.14 bugs and security fixes label Jun 2, 2025
colesbury pushed a commit to colesbury/cpython that referenced this pull request Jun 2, 2025
…l section (pythongh-134910)

The `textiowrapper_iternext` function called `_textiowrapper_writeflush`, but did not
use a critical section, making it racy in free-threaded builds.
(cherry picked from commit 44fb7c3)

Co-authored-by: Duane Griffin <duaneg@dghda.com>
@bedevere-app
Copy link

bedevere-app bot commented Jun 2, 2025

GH-135040 is a backport of this pull request to the 3.13 branch.

@bedevere-app bedevere-app bot removed the needs backport to 3.13 bugs and security fixes label Jun 2, 2025
colesbury pushed a commit that referenced this pull request Jun 2, 2025
…ion (gh-134910) (gh-135039)

The `textiowrapper_iternext` function called `_textiowrapper_writeflush`, but did not
use a critical section, making it racy in free-threaded builds.
(cherry picked from commit 44fb7c3)

Co-authored-by: Duane Griffin <duaneg@dghda.com>
colesbury pushed a commit to colesbury/cpython that referenced this pull request Jun 2, 2025
…l section (pythongh-134910)

The `textiowrapper_iternext` function called `_textiowrapper_writeflush`, but did not
use a critical section, making it racy in free-threaded builds.
(cherry picked from commit 44fb7c3)

Co-authored-by: Duane Griffin <duaneg@dghda.com>
colesbury added a commit that referenced this pull request Jun 2, 2025
…ion (gh-134910) (gh-135040)

The `textiowrapper_iternext` function called `_textiowrapper_writeflush`, but did not
use a critical section, making it racy in free-threaded builds.
(cherry picked from commit 44fb7c3)

Co-authored-by: Duane Griffin <duaneg@dghda.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants