[CUDA][MAGMA][Linalg][WIP] Remove MAGMA #155694

Draft
wants to merge 1 commit into
base: main
from

Conversation

Collaborator

@eee4017 eee4017 commented Jun 11, 2025

This PR removes support for the MAGMA library from PyTorch's linear algebra operations on CUDA and ROCm devices. Going forward, all such operations will rely exclusively on cuSOLVER/hipSOLVER.

cc @ptrblck @msaroufim @eqy @jerryzh168 @jeffdaily @sunway513 @jithunnair-amd @pruthvistony @ROCmSupport @dllehr-amd @jataylo @hongxiayang @naromero77amd @jianyuh @nikitaved @mruberry @walterddr @xwang233 @lezcano

linux-foundation-easycla bot commented Jun 11, 2025

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: eee4017 / name: Frank Lin (f813f58)

pytorch-bot bot commented Jun 11, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/155694

Note: Links to docs will display an error until the docs builds have been completed.

❌ 9 New Failures, 4 Cancelled Jobs, 1 Unrelated Failure

As of commit f813f58 with merge base 3040ca6:

NEW FAILURES - The following jobs have failed:

CANCELLED JOBS - The following jobs were cancelled. Please retry:

BROKEN TRUNK - The following job failed but was also failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot bot added the release notes: releng release notes category label Jun 11, 2025
@facebook-github-bot facebook-github-bot added the module: rocm AMD GPU support for Pytorch label Jun 11, 2025
Collaborator Author

eee4017 commented Jun 11, 2025

@pytorchbot label labels "module: linalg" "module: magma"

pytorch-bot bot commented Jun 11, 2025

Didn't find following labels among repository labels: labels,module: linalg

@pytorch-bot pytorch-bot bot added the module: magma related to magma linear algebra cuda support label Jun 11, 2025
Collaborator Author

eee4017 commented Jun 11, 2025

@pytorchbot label "module: linear algebra"

@pytorch-bot pytorch-bot bot added the module: linear algebra Issues related to specialized linear algebra operations in PyTorch; includes matrix multiply matmul label Jun 11, 2025
@Aidyn-A Aidyn-A added the module: cuda (Related to torch.cuda, and CUDA support in general), ciflow/trunk (Trigger trunk jobs on your pull request), and ciflow/rocm (Trigger "default" config CI on ROCm) labels Jun 11, 2025
Collaborator

ptrblck commented Jun 11, 2025

Functionality and performance data are missing. It is also unclear whether, and why, you want to remove MAGMA from ROCm as well.

@eee4017 eee4017 marked this pull request as draft June 11, 2025 23:38
@eee4017 eee4017 changed the title [CUDA][MAGMA][Linalg] Remove MAGMA [CUDA][MAGMA][Linalg][WIP] Remove MAGMA Jun 11, 2025
@syed-ahmed
Collaborator

@eee4017 We should split this into a stack of PRs so that it is easier to review: the 1st PR removes the MAGMA kernel invocations, the 2nd cleans up the unused kernels, the 3rd cleans up the CMake build, the 4th cleans up .ci, and so on. I would suggest starting with just the removal of the kernel invocations; once that PR is in, we can use ghstack (https://github.com/ezyang/ghstack) to post the rest of the PRs. For example, the first PR would remove dispatch cases like this one:

// remove
case at::LinalgBackend::Magma:
  return _cholesky_solve_helper_cuda_magma(self, A, upper);
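
As a rough illustration of that first step (this is not code from the PR), the self-contained C++ sketch below shows what a backend dispatch might look like once the MAGMA branch is gone. The `LinalgBackend` enum here, the stub helper, and the decision to route a leftover MAGMA preference to cuSOLVER are all assumptions made for the example; the real ATen dispatch and its fallback behavior may differ.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for illustration only; this is NOT at::LinalgBackend.
enum class LinalgBackend { Default, Cusolver, Magma };

// Hypothetical stub for a cuSOLVER-backed helper.
// It returns the name of the code path taken instead of a tensor.
std::string cholesky_solve_cusolver_stub() { return "cusolver"; }

// Dispatch as it might look after the MAGMA branch is removed.
// ASSUMPTION: a user who still prefers Magma is silently routed to cuSOLVER;
// the PR discussion does not say whether this should warn or raise instead.
std::string cholesky_solve_dispatch(LinalgBackend preferred) {
  switch (preferred) {
    case LinalgBackend::Cusolver:
      return cholesky_solve_cusolver_stub();
    // The former `case LinalgBackend::Magma:` branch is deleted, so Magma
    // now falls into the default path together with LinalgBackend::Default.
    default:
      return cholesky_solve_cusolver_stub();
  }
}

// Small usage check: every preference resolves to the same remaining backend.
void check_dispatch() {
  assert(cholesky_solve_dispatch(LinalgBackend::Default) == "cusolver");
  assert(cholesky_solve_dispatch(LinalgBackend::Cusolver) == "cusolver");
  assert(cholesky_solve_dispatch(LinalgBackend::Magma) == "cusolver");
}
```

Keeping the enum value while deleting only the branch is one way to keep the first PR in the stack small; whether the enum value itself should later be removed or deprecated is a separate decision that the thread leaves open.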

Contributor

Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as Stale.
Feel free to remove the Stale label if you feel this was a mistake.
If you are unable to remove the Stale label please contact a maintainer in order to do so.
If you want the bot to never mark this PR stale again, add the no-stale label.
Stale pull requests will automatically be closed after 30 days of inactivity.

@github-actions github-actions bot added the Stale label Aug 12, 2025
Labels
  • ciflow/rocm (Trigger "default" config CI on ROCm)
  • ciflow/trunk (Trigger trunk jobs on your pull request)
  • module: cuda (Related to torch.cuda, and CUDA support in general)
  • module: linear algebra (Issues related to specialized linear algebra operations in PyTorch; includes matrix multiply matmul)
  • module: magma (related to magma linear algebra cuda support)
  • module: rocm (AMD GPU support for Pytorch)
  • open source
  • release notes: releng (release notes category)
  • Stale
6 participants