Skip to content

Data race in intern_common when interning str objects in the free threading build #129701

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
colesbury opened this issue Feb 5, 2025 · 8 comments
Assignees
Labels
3.13 bugs and security fixes 3.14 new features, bugs and security fixes topic-free-threading type-bug An unexpected behavior, bug, or error

Comments

@colesbury
Copy link
Contributor

colesbury commented Feb 5, 2025

Bug report

When running ./python -m test test_exceptions --parallel-threads=10 in a TSAN build:

WARNING: ThreadSanitizer: data race (pid=763025)
  Atomic read of size 4 at 0x7fffbe0718cc by thread T190:
    #0 _Py_atomic_load_uint32_relaxed Include/cpython/pyatomic_gcc.h:367 (python+0x1d386e) (BuildId: 5612db6eff0f51c7fd99ee4409b2ceafceea484c)
    #1 Py_INCREF Include/refcount.h:261 (python+0x1d386e)
    #2 _Py_NewRef Include/refcount.h:518 (python+0x1d386e)
    #3 dict_setdefault_ref_lock_held Objects/dictobject.c:4386 (python+0x1e6817) (BuildId: 5612db6eff0f51c7fd99ee4409b2ceafceea484c)
    #4 PyDict_SetDefaultRef Objects/dictobject.c:4403 (python+0x1e6a37) (BuildId: 5612db6eff0f51c7fd99ee4409b2ceafceea484c)
    #5 intern_common Objects/unicodeobject.c:15820 (python+0x2a7a8d) (BuildId: 5612db6eff0f51c7fd99ee4409b2ceafceea484c)
    #6 _PyUnicode_InternImmortal Objects/unicodeobject.c:15874 (python+0x2e92d1) (BuildId: 5612db6eff0f51c7fd99ee4409b2ceafceea484c)
    #7 _PyPegen_new_identifier Parser/pegen.c:549 (python+0xad61f) (BuildId: 5612db6eff0f51c7fd99ee4409b2ceafceea484c)
....

  Previous write of size 4 at 0x7fffbe0718cc by thread T182:
    #0 Py_SET_REFCNT Include/refcount.h:176 (python+0x2a7c9a) (BuildId: 5612db6eff0f51c7fd99ee4409b2ceafceea484c)
    #1 Py_SET_REFCNT Include/refcount.h:145 (python+0x2a7c9a)
    #2 intern_common Objects/unicodeobject.c:15848 (python+0x2a7c9a)
    #3 _PyUnicode_InternImmortal Objects/unicodeobject.c:15874 (python+0x2e92d1) (BuildId: 5612db6eff0f51c7fd99ee4409b2ceafceea484c)
    #4 _PyPegen_new_identifier Parser/pegen.c:549 (python+0xad61f) (BuildId: 5612db6eff0f51c7fd99ee4409b2ceafceea484c)
...

There are a few thread-safety issues with intern_common:

  • It can return a string that's not marked as interned because we insert into interned before we mark the string as interned. This can be fixed with some additional locking.
  • The Py_SET_REFCNT(s, Py_REFCNT(s) - 2) modification is not thread-safe with respect to other reference count modifications in the free threading build
  • _Py_SetImmortal is not thread-safe in some circumstances (see _Py_SetImmortal must be run on allocating thread (no-gil) #113956)

The _Py_SetImmortal() issue is tricky and I think it's unlikely to cause problems in practice, so I think we can defer dealing with that for now.

Linked PRs

@colesbury colesbury added 3.13 bugs and security fixes 3.14 new features, bugs and security fixes topic-free-threading type-bug An unexpected behavior, bug, or error labels Feb 5, 2025
@colesbury colesbury self-assigned this Feb 6, 2025
@lysnikolaou
Copy link
Member

I hit this while testing PyYAML (though the TSAN stack trace was slightly different). Are you going to work on this @colesbury?

@colesbury
Copy link
Contributor Author

Yes, I'll probably put up a PR today.

Would you please share the TSAN stack trace?

colesbury added a commit to colesbury/cpython that referenced this issue Feb 13, 2025
…ded build

* Use a mutex to avoid potentially returning a non-immortalized string,
  because immortalization happens after the insertion into the interned
  dict.

* Use `Py_DECREF()` calls instead of `Py_SET_REFCNT(s, Py_REFCNT(s) - 2)`
  for thread-safety. This code path isn't performance sensistive, so
  just use `Py_DECREF()` unconditionally for simplicity.
@lysnikolaou
Copy link
Member

Sure, here it is.

  Read of size 4 at 0x0001202900c0 by thread T913:
    #0 unicode_eq eq.h:13 (python3.13:arm64+0x10017fec0)
    #1 unicodekeys_lookup_unicode dictobject.c:1151 (python3.13:arm64+0x100168360)
    #2 _Py_dict_lookup dictobject.c:1262 (python3.13:arm64+0x1001687d4)
    #3 dict_setdefault_ref_lock_held dictobject.c:4282 (python3.13:arm64+0x100174e0c)
    #4 PyDict_SetDefaultRef dictobject.c:4332 (python3.13:arm64+0x100174874)
    #5 intern_common unicodeobject.c:15277 (python3.13:arm64+0x10026b444)
    #6 _PyUnicode_InternMortal unicodeobject.c:15338 (python3.13:arm64+0x10026b9f8)
    #7 _PyCode_New codeobject.c:688 (python3.13:arm64+0x1000ef704)
    #8 PyCode_NewEmpty codeobject.c:936 (python3.13:arm64+0x1000f0c9c)
    ...

  Previous write of size 4 at 0x0001202900c0 by thread T915:
    #0 intern_common unicodeobject.c:15312 (python3.13:arm64+0x10026b784)
    #1 _PyUnicode_InternMortal unicodeobject.c:15338 (python3.13:arm64+0x10026b9f8)
    #2 _PyCode_New codeobject.c:688 (python3.13:arm64+0x1000ef704)
    #3 PyCode_NewEmpty codeobject.c:936 (python3.13:arm64+0x1000f0c9c)
    ...

@colesbury
Copy link
Contributor Author

Thanks. Is this is with 3.13? I think that one was fixed in #128196, but it's not backported to 3.13. That particular PR changes the data types of public fields, so I don't think we can backport it.

It might be worth doing the thread sanitizer tests on 3.14. cc @ngoldbaum

@ngoldbaum
Copy link
Contributor

I think I've seen similar tracebacks on 3.13 with PyO3 about interning strings. See this line in the gist from this PyO3 issue.

I want to re-do the TSAN PyO3 tests with 3.14, but I need to update the FFI definitions first. Generally PyO3 doesn't add support for new python versions until beta 1 to avoid dealing with churn.

colesbury added a commit to colesbury/cpython that referenced this issue Feb 13, 2025
…ded build

* Use a mutex to avoid potentially returning a non-immortalized string,
  because immortalization happens after the insertion into the interned
  dict.

* Use `Py_DECREF()` calls instead of `Py_SET_REFCNT(s, Py_REFCNT(s) - 2)`
  for thread-safety. This code path isn't performance sensistive, so
  just use `Py_DECREF()` unconditionally for simplicity.
@lysnikolaou
Copy link
Member

Is this is with 3.13?

Yes, it's with 3.13.2. I'll try 3.14 later today or early tomorrow.

@lysnikolaou
Copy link
Member

FWIW I did verify that the unicode_eq stack trace does not appear under 3.14.

encukou pushed a commit that referenced this issue Feb 17, 2025
…ild (GH-130089)

* gh-129701: Fix a data race in `intern_common` in the free threaded build

* Use a mutex to avoid potentially returning a non-immortalized string,
  because immortalization happens after the insertion into the interned
  dict.

* Use `Py_DECREF()` calls instead of `Py_SET_REFCNT(s, Py_REFCNT(s) - 2)`
  for thread-safety. This code path isn't performance sensistive, so
  just use `Py_DECREF()` unconditionally for simplicity.
@colesbury
Copy link
Contributor Author

I'm going to close this as the only remaining issue is that _Py_SetImmortal is not thread-safe in some circumstances, which is tracked in:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.13 bugs and security fixes 3.14 new features, bugs and security fixes topic-free-threading type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

No branches or pull requests

3 participants