-
-
Notifications
You must be signed in to change notification settings - Fork 32.1k
gh-134908: protect textiowrapper_iternext
with critical section
#134910
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…tion ``textiowrapper_iternext`` calls ``_textiowrapper_writeflush`` but does not take the critical section, making it racy in free-threaded builds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not familiar enough with the python critical section implementation bits to be confident this is the right way to do this. Currently on vacation but reviewed some high level pieces as a first pass. Will be back in town next week and can look more thoroughly then. Also happy for others to work on / review / land in the mean time, especially people with more free threading experience.
Misc/NEWS.d/next/Core_and_Builtins/2025-05-30-15-56-19.gh-issue-134908.3a7PxM.rst
Outdated
Show resolved
Hide resolved
Thanks very much, but please don't feel any obligation to look at this while you are on holiday! There is no particular hurry, it can all wait 😄 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I said in the issue, concurrent iteration generally isn't worth fixing on our end. It doesn't look like we do this for any of the other _io
types.
From what I can tell, the main issue with supporting concurrent iteration is that it's normally not that useful. At the __next__
level, threads will skip over one another, which leaves holes in the data that each thread reads. If you're trying to exhaust a large iterator as fast as possible, like a file, the lock contention will probably be so bad that iterating from multiple threads won't be any more efficient than doing it with one.
Misc/NEWS.d/next/Core_and_Builtins/2025-05-30-15-56-19.gh-issue-134908.3a7PxM.rst
Outdated
Show resolved
Hide resolved
…e-134908.3a7PxM.rst Co-authored-by: Peter Bierma <zintensitydev@gmail.com>
Mentioned there also, but FTR this commit is just trying to fix the crash, not make the iterator thread-safe in any wider sense. I completely agree that is not a sensible goal. You will note there is no concurrent access of iterators in the reproduction script or test code!
If any of the other |
It concurrently iterates over
I'm pretty sure nearly all extension types will crash with concurrent iteration right now, because very few of them have locking. I think it's intentional for performance reasons. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, it looks like we're going to go with this per the discussion in the issue.
That said, I think we can make this thread-safe without a critical section. AFAICT, the only thread-unsafe thing here is the telling
field. Let's switch that to an atomic. I'm not worried about torn or inconsistent reads because the data will be pretty screwed up for concurrent readers anyway.
I think the direct readline call needs a critical section around it? ( if (Py_IS_TYPE(self, self->state->PyTextIOWrapper_Type)) {
/* Skip method call overhead for speed */
line = _textiowrapper_readline(self, -1);
} Thoughts on doing: (Probably would/should need to do a single-threaded perf test to validate removing the shortcut path isn't highly detrimental in text line iteration...) |
|
I'm afraid that won't work: atomicity of that field by itself is not sufficient. The other methods that check or change it are written with the assumption it cannot change value while they are running. A critical section is specifically required here: that is what the class is using to synchronise access to its internal data. |
Yeah, I'm suggesting you switch to atomics there too. This is only for |
I'm a bit worried we may be talking at cross-purposes 😅 Let me recap a bit, to make sure we are all talking about the same things... As I understand it, the proposal is to call the argument clinic-generated readline function from If I've misunderstood anything there, apologies, and please correct me. However, to explain why I don't think the approach above will work: we don't need to just make one read/write safe, we need to protect a whole sequence of operations. I.e. we need a lock, which in this case should be the critical section. To illustrate, consider |
I don't think we should care about this case. As I said, concurrently reading from an iterator isn't something that's generally useful. You need your own synchronization anyway, so all we need to do is make sure that it doesn't crash. Without the caller holding a lock, threads will get torn reads regardless of how we make it thread-safe. What's the actual practicality of concurrent reading from a file while randomly writing to it? My main concern here is that we're slowing down single-threaded code for a use-case that doesn't exist. |
It will crash. Sorry for not being clearer. The possibilities under "bad things" in that function certainly include leaking object references; I am not sure if anything worse can happen there. Elsewhere assertion failures, use-after-free, and general arbitrary memory corruption are possible. For example, the
I'm not suggesting there is any legit reason to do this. It sounds like a really bad idea to me. As I said before, I don't really care if the result is gibberish. However, it should not crash the interpreter! As for performance: my proposed fix takes a critical section on entering |
A good suggestion. A quick and very dirty benchmark shows no significant performance difference between this branch and where it diverged: import io
import timeit
LINES = 1_000_000
bytes = b"\n" * LINES
def test():
buffer = io.BytesIO(b"\n" * LINES)
wrapper = io.TextIOWrapper(buffer)
for _ in wrapper:
pass
if __name__ == '__main__':
timer = timeit.Timer(test)
elapsed = timer.timeit(number=100)
print(f"{elapsed:.2f}") On my system, with an optimised free threading build, that is ~3.7ns per call, with or without the fix. |
That should get a critical section around it, but it's not the common path.
Yeah, the acquisition of a critical section itself is efficient. But critical sections come with a lot more bells and whistles than you'd think. In particular, every time the thread state is detached, the thread has to drop the lock, and then re-acquire that lock upon reattaching. The thread state is detached more often than you'd think; it happens between bytecode instructions, and also happens every time a mutex or other critical section is acquired.
Hm? There's no marshalling there, AFAICT. Using the AC version is just a simpler way to automatically wrap the critical section. I don't have much of a preference whether we're explicit or implicit there. Basically, my suggestion is to keep the locking in places where we only really need it to prevent crashes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, after some more pondering, I retract my previous suggestion. It's probably too much additional complexity to make telling
atomic, but I want to be extremely cautious about precedent here. Please add a comment explaining why we're going with a critical section, and that it's a special case.
I'm not super excited about adding locks here, but there's not too many great options:
- Leave it as is, causing crashes when fuzzing.
- Use some (relaxed) atomics to keep things generally fast for single-threaded code, at the cost of being significantly more complex.
- Add locking around the
__next__
call, which hurts performance, especially considering the frequent I/O jumping through extra hoops with the implicit acquiring and releasing.
@@ -3210,6 +3213,16 @@ textiowrapper_iternext(PyObject *op) | |||
return line; | |||
} | |||
|
|||
static PyObject * | |||
textiowrapper_iternext(PyObject *op) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a comment regarding fuzzing here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with some minor edits. I'll make the changes directly to the PR and merge it.
Misc/NEWS.d/next/Core_and_Builtins/2025-05-30-15-56-19.gh-issue-134908.3a7PxM.rst
Outdated
Show resolved
Hide resolved
Co-authored-by: Peter Bierma <zintensitydev@gmail.com>
Thanks @duaneg for the bug report and fix and everyone else for the code reviews! |
Thanks @duaneg for the PR, and @colesbury for merging it 🌮🎉.. I'm working now to backport this PR to: 3.13, 3.14. |
Thanks @duaneg for the PR, and @colesbury for merging it 🌮🎉.. I'm working now to backport this PR to: 3.13. |
Sorry, @duaneg and @colesbury, I could not cleanly backport this to
|
Thanks @duaneg for the PR, and @colesbury for merging it 🌮🎉.. I'm working now to backport this PR to: 3.14. |
…on (pythongh-134910) The `textiowrapper_iternext` function called `_textiowrapper_writeflush`, but did not use a critical section, making it racy in free-threaded builds. (cherry picked from commit 44fb7c3) Co-authored-by: Duane Griffin <duaneg@dghda.com>
GH-135039 is a backport of this pull request to the 3.14 branch. |
…l section (pythongh-134910) The `textiowrapper_iternext` function called `_textiowrapper_writeflush`, but did not use a critical section, making it racy in free-threaded builds. (cherry picked from commit 44fb7c3) Co-authored-by: Duane Griffin <duaneg@dghda.com>
GH-135040 is a backport of this pull request to the 3.13 branch. |
…l section (pythongh-134910) The `textiowrapper_iternext` function called `_textiowrapper_writeflush`, but did not use a critical section, making it racy in free-threaded builds. (cherry picked from commit 44fb7c3) Co-authored-by: Duane Griffin <duaneg@dghda.com>
textiowrapper_iternext
calls_textiowrapper_writeflush
but does not take the critical section, making it racy in free-threaded builds.textiowrapper_iternext
and writing to a text file simultaneously in ft build #134908