Skip to content

[Graph Partition] Pass all OSS unit tests #154667

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 28 commits into from
Closed

Conversation

BoyuanFeng
Copy link
Contributor

@BoyuanFeng BoyuanFeng commented May 29, 2025

Graph partition leads to 6.2% speedup on vision_maskrcnn, 5.8% speedup on yolov3. P1819700563, 39.5% speedup on speech_transformer inference P1830602200, 85% speedup on speech_transformer training P1831115315.

Run the same diff on two days and both show speedup on average.

first TorchInductor Benchmark ci run
image

second TorchInductorBenchmark ci run
image

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov @coconutruben @Lucaskabela

Copy link

pytorch-bot bot commented May 29, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/154667

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit df77c93 with merge base ee89cc7 (image):

FLAKY - The following job failed but was likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@BoyuanFeng BoyuanFeng force-pushed the bf/partition-turn-on branch from 7acb21e to 880a8ca Compare June 5, 2025 17:14
@BoyuanFeng BoyuanFeng mentioned this pull request Jun 13, 2025
20 tasks
@BoyuanFeng BoyuanFeng changed the title [Graph Partition] Turn-on in OSS by default [Graph Partition] Pass all OSS unit tests Jun 27, 2025
@eellison
Copy link
Contributor

What is the next steps for this ? Would you file issues related to the cudagraph partition slowdowns in torchbench ? This was a good amount of work - would be great to get it rolled out automatically to users.

@BoyuanFeng
Copy link
Contributor Author

yes we should turn on by default in oss. let me check more on torchbench perf.

@BoyuanFeng
Copy link
Contributor Author

@pytorchbot merge

@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@pytorchmergebot
Copy link
Collaborator

Merge failed

Reason: 1 jobs have failed, first few of them are: Meta Internal-Only Changes Check

Details for Dev Infra team Raised by workflow job

@facebook-github-bot
Copy link
Contributor

@BoyuanFeng has imported this pull request. If you are a Meta employee, you can view this in D79104472.

@BoyuanFeng
Copy link
Contributor Author

@pytorchbot merge

@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@clee2000
Copy link
Contributor

@pytorchbot revert -m "broke inductor/test_memory.py::TestOperatorReorderForPeakMemory::test_reorder_peak_memory_lpmf GH job link HUD commit link note to self: bad TD" -c nosignal

@pytorchmergebot
Copy link
Collaborator

@pytorchbot successfully started a revert job. Check the current status here.
Questions? Feedback? Please reach out to the PyTorch DevX Team

@pytorchmergebot
Copy link
Collaborator

@BoyuanFeng your PR has been successfully reverted.

pytorchmergebot added a commit that referenced this pull request Aug 11, 2025
This reverts commit ca7315c.

Reverted #154667 on behalf of https://github.com/clee2000 due to broke inductor/test_memory.py::TestOperatorReorderForPeakMemory::test_reorder_peak_memory_lpmf [GH job link](https://github.com/pytorch/pytorch/actions/runs/16885961204/job/47836769279) [HUD commit link](https://hud.pytorch.org/pytorch/pytorch/commit/ca7315c17162ea21b1ca5ba23f4bf6168766c7b9) note to self: bad TD ([comment](#154667 (comment)))
@pytorchmergebot pytorchmergebot added Reverted ci-no-td Do not run TD on this PR labels Aug 11, 2025
@BoyuanFeng
Copy link
Contributor Author

@pytorchbot merge

@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants