[ROCm] [CK] Composable Kernel integration for ROCm #158747

Draft
wants to merge 17 commits into main

iupaikov-amd
Collaborator

@iupaikov-amd iupaikov-amd commented Jul 21, 2025

This is part of our effort to integrate the Composable Kernel (CK) library into the Inductor backend. CK is currently pulled in as a submodule, but we would prefer to control its version through a commit pin, as is done for Triton.

The idea is to have CK in the 2.8 release so that people can use it with Inductor and AOT Inductor, and then gradually move away from the submodule. Right now SDPA is tied to the submodule files. We would like to avoid putting all of the installation logic in CI scripts, so that locally built versions also get this functionality.
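
For readers unfamiliar with the commit-pin pattern mentioned above, here is a minimal sketch of the general idea. It is not the code in this PR. The pin-file name, the `CK_REPO_URL` value, and the helper names `read_commit_pin` and `pinned_requirement` are all assumptions made for illustration. A one-line text file stores a commit hash, and the build logic turns it into a requirement locked to that commit.

```python
import re
import tempfile
from pathlib import Path

# Assumption: this URL and the pin-file name used below are illustrative
# only; neither is taken from this PR.
CK_REPO_URL = "https://github.com/ROCm/composable_kernel.git"

# A pin is a full or abbreviated lowercase hexadecimal commit hash.
_SHA_RE = re.compile(r"[0-9a-f]{7,40}")


def read_commit_pin(pin_file: Path) -> str:
    """Return the commit hash stored in a one-line pin file."""
    sha = pin_file.read_text().strip()
    if not _SHA_RE.fullmatch(sha):
        raise ValueError(f"{pin_file} does not contain a commit hash: {sha!r}")
    return sha


def pinned_requirement(repo_url: str, sha: str) -> str:
    """Build a pip-style VCS requirement locked to a single commit."""
    return f"git+{repo_url}@{sha}"


if __name__ == "__main__":
    # Demonstrate with a throwaway pin file holding a made-up hash.
    with tempfile.TemporaryDirectory() as tmp:
        pin = Path(tmp) / "ck.txt"
        pin.write_text("0123456789abcdef0123456789abcdef01234567\n")
        print(pinned_requirement(CK_REPO_URL, read_commit_pin(pin)))
```

With this layout, bumping the library is a one-line change to the pin file, and CI and local builds resolve the same commit because they read the same file.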

This PR is a remake of #156192, made necessary by a branch error.

cc @jeffdaily @sunway513 @jithunnair-amd @pruthvistony @ROCmSupport @dllehr-amd @jataylo @hongxiayang @naromero77amd @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov @coconutruben

pytorch-bot bot commented Jul 21, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/158747

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 1f6208b with merge base a53d14d:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot bot added module: inductor module: rocm AMD GPU support for Pytorch release notes: releng release notes category labels Jul 21, 2025
@iupaikov-amd
Collaborator Author

@pytorchbot rebase

@pytorchmergebot
Collaborator

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

@pytorchmergebot
Collaborator

Rebase failed due to Command git -C /home/runner/work/pytorch/pytorch rebase refs/remotes/origin/viable/strict pull/158747/head returned non-zero exit code 1

Rebasing (1/6)
Auto-merging setup.py
CONFLICT (content): Merge conflict in setup.py
error: could not apply f61030db195... Implemented CK installation for ROCm builds
hint: Resolve all conflicts manually, mark them as resolved with
hint: "git add/rm <conflicted_files>", then run "git rebase --continue".
hint: You can instead skip this commit: run "git rebase --skip".
hint: To abort and get back to the state before "git rebase", run "git rebase --abort".
hint: Disable this message with "git config set advice.mergeConflict false"
Could not apply f61030db195... # Implemented CK installation for ROCm builds

Raised by https://github.com/pytorch/pytorch/actions/runs/16421942414

@iupaikov-amd iupaikov-amd force-pushed the iupaikov_ck_integration_upstream2 branch from 36cb691 to 41deb6e Compare July 21, 2025 16:13
@iupaikov-amd iupaikov-amd added ciflow/trunk Trigger trunk jobs on your pull request ciflow/inductor keep-going Don't stop on first failure, keep running tests until the end ciflow/rocm Trigger "default" config CI on ROCm ciflow/rocm-mi300 Trigger "default" config CI on ROCm MI300 labels Jul 21, 2025
@iupaikov-amd
Collaborator Author

@pytorchbot rebase

@pytorchmergebot
Collaborator

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

@pytorchmergebot
Collaborator

Successfully rebased iupaikov_ck_integration_upstream2 onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout iupaikov_ck_integration_upstream2 && git pull --rebase)

@pytorchmergebot pytorchmergebot force-pushed the iupaikov_ck_integration_upstream2 branch from 41deb6e to 784166a Compare July 22, 2025 10:10
@pytorch-bot pytorch-bot bot removed ciflow/trunk Trigger trunk jobs on your pull request ciflow/inductor ciflow/rocm Trigger "default" config CI on ROCm ciflow/rocm-mi300 Trigger "default" config CI on ROCm MI300 labels Jul 22, 2025
@iupaikov-amd iupaikov-amd added ciflow/trunk Trigger trunk jobs on your pull request ciflow/inductor ciflow/rocm Trigger "default" config CI on ROCm ciflow/rocm-mi300 Trigger "default" config CI on ROCm MI300 ciflow/periodic-rocm-mi300 Trigger "distributed" config CI on ROCm MI300 labels Jul 22, 2025
@pytorch-bot pytorch-bot bot removed ciflow/trunk Trigger trunk jobs on your pull request ciflow/inductor labels Jul 22, 2025
@jataylo jataylo added ciflow/inductor-rocm Trigger "inductor" config CI on ROCm ciflow/rocm-mi300 Trigger "default" config CI on ROCm MI300 labels Jul 31, 2025
@pytorch-bot pytorch-bot bot removed ciflow/trunk Trigger trunk jobs on your pull request ciflow/rocm Trigger "default" config CI on ROCm ciflow/inductor-rocm Trigger "inductor" config CI on ROCm ciflow/rocm-mi300 Trigger "default" config CI on ROCm MI300 labels Aug 1, 2025
@jataylo jataylo added ciflow/inductor ciflow/rocm Trigger "default" config CI on ROCm ciflow/inductor-rocm Trigger "inductor" config CI on ROCm ciflow/rocm-mi300 Trigger "default" config CI on ROCm MI300 ciflow/trunk Trigger trunk jobs on your pull request ci-no-td Do not run TD on this PR and removed ciflow/inductor ciflow/rocm Trigger "default" config CI on ROCm ciflow/inductor-rocm Trigger "inductor" config CI on ROCm ciflow/rocm-mi300 Trigger "default" config CI on ROCm MI300 labels Aug 1, 2025
@pytorch-bot pytorch-bot bot removed the ciflow/trunk Trigger trunk jobs on your pull request label Aug 7, 2025
@jataylo jataylo added ciflow/trunk Trigger trunk jobs on your pull request ciflow/rocm-mi300 Trigger "default" config CI on ROCm MI300 labels Aug 7, 2025
@tenpercent
Collaborator

@pytorchbot rebase -s

@pytorchmergebot
Collaborator

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

@pytorchmergebot
Collaborator

Rebase failed due to Command git -C /home/runner/work/pytorch/pytorch rebase refs/remotes/origin/viable/strict pull/158747/head returned non-zero exit code 1

Rebasing (1/14)
Rebasing (2/14)
Rebasing (3/14)
Rebasing (4/14)
Rebasing (5/14)
Rebasing (6/14)
Rebasing (7/14)
Rebasing (8/14)
Rebasing (9/14)
Rebasing (10/14)
Rebasing (11/14)
Rebasing (12/14)
Rebasing (13/14)
Rebasing (14/14)
Auto-merging .github/workflows/rocm-mi300.yml
CONFLICT (content): Merge conflict in .github/workflows/rocm-mi300.yml
error: could not apply b1b38a380bd... Temporary hack to only run target UT file (mi300)
hint: Resolve all conflicts manually, mark them as resolved with
hint: "git add/rm <conflicted_files>", then run "git rebase --continue".
hint: You can instead skip this commit: run "git rebase --skip".
hint: To abort and get back to the state before "git rebase", run "git rebase --abort".
hint: Disable this message with "git config set advice.mergeConflict false"
Could not apply b1b38a380bd... # Temporary hack to only run target UT file (mi300)

Raised by https://github.com/pytorch/pytorch/actions/runs/16814222730

@pytorch-bot pytorch-bot bot removed ciflow/trunk Trigger trunk jobs on your pull request ciflow/rocm-mi300 Trigger "default" config CI on ROCm MI300 labels Aug 7, 2025
Labels
ci-no-td Do not run TD on this PR keep-going Don't stop on first failure, keep running tests until the end module: inductor module: rocm AMD GPU support for Pytorch open source release notes: releng release notes category release notes: rocm mandatorylabel
6 participants