test_pendingcalls_threaded times out on Windows free-threading builds #114746

Closed
encukou opened this issue Jan 30, 2024 · 5 comments
Comments

@encukou
Member

encukou commented Jan 30, 2024

test.test_capi.test_misc.TestPendingCalls.test_pendingcalls_threaded usually (but not always) times out on the AMD64 Windows Server 2022 NoGIL 3.x buildbot, e.g. here: https://buildbot.python.org/all/#/builders/1241/builds/1116/steps/4/logs/stdio

It started 5 days ago, after #114262 (Implement GC for free-threaded builds) and #114479 (Make threading.Lock a real class, not a factory function), and it looks like it got more frequent after #113412 (Use pointer for interp->obmalloc state). (This is just from looking at the buildbot logs, I haven't bisected.)

@encukou
Member Author

encukou commented Jan 30, 2024

I cannot reproduce on my Windows 11 (10.0.22621.3007) VM :(

Faulthandler tracebacks (with source lines added):

Thread 0x00001f80 (most recent call first):
  File "C:\Users\Administrator\buildarea\3.x.itamaro-win64-srv-22-aws.x64.nogil\build\Lib\test\test_capi\test_misc.py", line 1344 in pendingcalls_thread
    context.event.set()
  File "C:\Users\Administrator\buildarea\3.x.itamaro-win64-srv-22-aws.x64.nogil\build\Lib\threading.py", line 1027 in run
...

Thread 0x000032d0 (most recent call first):
  File "C:\Users\Administrator\buildarea\3.x.itamaro-win64-srv-22-aws.x64.nogil\build\Lib\test\test_capi\test_misc.py", line 1304 in pendingcalls_wait
    a = i*i
  File "C:\Users\Administrator\buildarea\3.x.itamaro-win64-srv-22-aws.x64.nogil\build\Lib\test\test_capi\test_misc.py", line 1332 in test_pendingcalls_threaded
    self.pendingcalls_wait(context.l, n, context)
...

@terryjreedy
Member

On my Win10 fresh debug nogil build, test.test_capi paused several minutes on
test_trashcan_python_class1 (test.test_capi.test_misc.CAPITest.test_trashcan_python_class1)
I went away to eat and upon return it had finished ok in 14 minutes, which I believe is a few times as long as normal.

@encukou
Member Author

encukou commented Jan 31, 2024

@colesbury Is this a high-priority issue for you, or normal behaviour at this stage of the feature?
We might want to increase the timeout, and revisit GC performance later.

@colesbury
Contributor

@encukou I'm planning to put out a PR to fix it today

@colesbury colesbury self-assigned this Jan 31, 2024
colesbury added a commit to colesbury/cpython that referenced this issue Jan 31, 2024
The free-threaded build's GC implementation is non-generational, but was
scheduled as if it were collecting a young generation leading to
quadratic behavior. This increases the minimum threshold and scales it
to the number of live objects as we do for the old generation in the
default build.

Note that the scheduling is still not thread-safe without the GIL. Those
changes will come in later PRs.

A few tests, like "test_sneaky_frame_object" rely on prompt scheduling
of the GC. For now, to keep that test passing, we disable the scaled
threshold after calls like `gc.set_threshold(1, 0, 0)`.
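The quadratic behavior described in the commit message can be illustrated with a toy model. This is only a sketch, not CPython's scheduler: the function `simulate_gc_work`, its parameters, and the numbers 700, 7000, and 25 are assumptions chosen for illustration. It assumes every allocated object stays alive and that each collection scans all live objects, as a non-generational collector would.

```python
def simulate_gc_work(total_allocations, threshold, scale_percent=0):
    """Toy model of a non-generational GC (hypothetical, not CPython code).

    A collection runs once the allocations since the previous collection
    reach max(threshold, scale_percent % of the objects that were live at
    the previous collection). Every collection scans all live objects.
    Returns (number_of_collections, total_objects_scanned).
    """
    live = 0             # objects currently alive (nothing is ever freed here)
    since_last = 0       # allocations since the previous collection
    live_at_last_gc = 0  # live objects seen by the previous collection
    collections = 0
    scanned = 0
    for _ in range(total_allocations):
        live += 1
        since_last += 1
        limit = max(threshold, live_at_last_gc * scale_percent // 100)
        if since_last >= limit:
            scanned += live  # a full collection visits every live object
            collections += 1
            live_at_last_gc = live
            since_last = 0
    return collections, scanned


if __name__ == "__main__":
    for n in (100_000, 200_000):
        # Fixed, small threshold: scheduled like a young generation.
        fixed = simulate_gc_work(n, threshold=700)
        # Larger minimum threshold that also scales with the live heap.
        scaled = simulate_gc_work(n, threshold=7000, scale_percent=25)
        print(f"n={n}: fixed={fixed} scaled={scaled}")
```

In this model, doubling the number of allocations roughly quadruples the scanning work under the fixed threshold, while under the scaled threshold the work grows far more slowly because the interval between collections grows with the heap.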
encukou pushed a commit that referenced this issue Feb 1, 2024
@colesbury
Contributor

The six builds since the PR merged are all green: https://buildbot.python.org/all/#/builders/1241

aisk pushed a commit to aisk/cpython that referenced this issue Feb 11, 2024
…GH-114817)
