-
Notifications
You must be signed in to change notification settings - Fork 24.9k
[Profiler] Fix unexpected C return events #159574
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/159574
Note: Links to docs will display an error until the docs builds have been completed. ✅ You can merge normally! (1 Unrelated Failure)As of commit 66c37ca with merge base d2e0258 ( UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:
This comment was automatically generated by Dr. CI and updates every 15 minutes. |
This seems reasonable to me. Lets try seeing if it passes CI |
@pytorchbot label "topic: not user facing" |
@pytorchbot merge |
This PR needs to be approved by an authorized maintainer before merge. |
@pytorchbot merge |
Merge startedYour change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team |
The fix in #155446 addressed the "stack empty" issue that's easily reproducible on CPython 3.12.0-4. While this issue can also appear in other versions, it's not as easy to reproduce there.
I recently found a new cause for this problem.
https://github.com/python/cpython/blob/1df5d0014578be7fe7ae25e2cc60c50c8b5cc0f7/Python/ceval.c#L5807-L5836
In the CPython 3.10 implementation, PyTrace_C_CALL and PyTrace_C_RETURN/PyTrace_C_EXCEPTION are supposed to appear in pairs. However, when c_profilefunc is changed, unexpected PyTrace_C_RETURN/PyTrace_C_EXCEPTION events can occur.
Here is the code to reproduce this problem.
To fix this problem, we can record active_frames_ and remaining_start_frames_ for each thread, and when the PyTrace_C-RETURN/PyTrace_CEXT CEPTION event occurs, we can determine whether to record this event based on these two fields.
In reality, even without this fix, the final data appears to be right since the match process can handle this case (it would just result in an exception log being printed).
Do you think the fix is necessary?