[ATen][CPU][Sparse] Use Third-Party Eigen for sparse add and addmm #155357

Status: Open. Wants to merge 7 commits into base: main.
Conversation

@Aidyn-A (Collaborator) commented Jun 6, 2025

This pull request adds the following ops for sparse matrices using the Eigen library:

    add(a_csr, b_csr)
    add(a_csc, b_csc)

    addmm(c_csr, a_csr, b_csr)
    addmm(c_csr, a_csr, b_csc)
    addmm(c_csr, a_csc, b_csc)
    addmm(c_csr, a_csc, b_csr)

    addmm(c_csc, a_csr, b_csr)
    addmm(c_csc, a_csr, b_csc)
    addmm(c_csc, a_csc, b_csc)
    addmm(c_csc, a_csc, b_csr)

Currently, these operations for sparse matrices on CPU are available through MKL only. Since MKL does not exist on AArch64, these ops are unavailable on any machine with an ARM-based CPU, including Apple Silicon, AWS Graviton, and NVIDIA Grace. This PR addresses the issue by using Eigen as a backend for the ops above.

This is a refactored version of my previous PR #101814. The main difference from the old one is that this PR does not enable Eigen by default.
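For illustration, the semantics of these ops can be sketched with SciPy sparse matrices (SciPy is used here only as a stand-in for the ATen kernels; all variable names are hypothetical):

```python
import numpy as np
from scipy.sparse import csr_matrix, csc_matrix

# Two sparse operands in CSR layout.
a_csr = csr_matrix(np.array([[1.0, 0.0], [0.0, 2.0]]))
b_csr = csr_matrix(np.array([[0.0, 3.0], [4.0, 0.0]]))

# add(a_csr, b_csr): elementwise sum, result stays sparse.
sum_csr = a_csr + b_csr

# addmm(c, a, b) computes c + a @ b; the PR allows mixed CSR/CSC layouts.
c_csr = csr_matrix(np.eye(2))
b_csc = csc_matrix(np.array([[0.0, 3.0], [4.0, 0.0]]))  # same values, CSC layout
out = c_csr + a_csr @ b_csc

print(sum_csr.toarray())  # values: [[1, 3], [4, 2]]
print(out.toarray())      # values: [[1, 3], [8, 1]]
```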

pytorch-bot bot commented Jun 6, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/155357

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit 93aa73b with merge base 2507ae6:

UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:

  • pull / linux-jammy-py3_9-clang9-xla / test (xla, 1, 1, lf.linux.12xlarge, unstable) (gh) (#158876)
    /var/lib/jenkins/workspace/xla/torch_xla/csrc/runtime/BUILD:476:14: Compiling torch_xla/csrc/runtime/xla_util_test.cpp failed: (Exit 1): gcc failed: error executing CppCompile command (from target //torch_xla/csrc/runtime:xla_util_test) /usr/bin/gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer -g0 -O2 '-D_FORTIFY_SOURCE=1' -DNDEBUG -ffunction-sections ... (remaining 229 arguments skipped)

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot bot added the release notes: sparse release notes category label Jun 6, 2025
@Aidyn-A Aidyn-A changed the title sparse add and addmm [ATen][CPU][Sparse] Use Third-Party Eigen for sparse add and addmm Jun 6, 2025
@Aidyn-A Aidyn-A requested review from cpuhrsch and malfet June 6, 2025 18:55
@mikaylagawarecki mikaylagawarecki requested a review from pearu June 6, 2025 22:02
@mikaylagawarecki mikaylagawarecki added the triaged label (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module) Jun 6, 2025
@Aidyn-A (Collaborator, Author) commented Jun 7, 2025

@pytorchbot rebase

@pytorchmergebot (Collaborator)

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

@pytorchmergebot (Collaborator)

Successfully rebased eigen_for_sparse_addmm_v2 onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout eigen_for_sparse_addmm_v2 && git pull --rebase)

@pytorchmergebot pytorchmergebot force-pushed the eigen_for_sparse_addmm_v2 branch from 71d7617 to 8db7338 Compare June 7, 2025 06:20
@Aidyn-A Aidyn-A force-pushed the eigen_for_sparse_addmm_v2 branch from 8db7338 to df77628 Compare June 7, 2025 18:03
@pearu (Collaborator) left a comment

Thanks for the PR, it looks good to me!

I have a number of nits and one question, see #155357 (comment)

@Aidyn-A Aidyn-A force-pushed the eigen_for_sparse_addmm_v2 branch from df77628 to 36d5c43 Compare July 14, 2025 08:51
@Aidyn-A Aidyn-A requested a review from pearu July 14, 2025 16:53
@Aidyn-A (Collaborator, Author) commented Jul 23, 2025

@pytorchbot rebase

@pytorchmergebot (Collaborator)

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

@pytorchmergebot (Collaborator)

Successfully rebased eigen_for_sparse_addmm_v2 onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout eigen_for_sparse_addmm_v2 && git pull --rebase)

@pytorchmergebot pytorchmergebot force-pushed the eigen_for_sparse_addmm_v2 branch from 1afd750 to 2ea6b20 Compare July 23, 2025 23:52
@pearu (Collaborator) left a comment

Here is another review iteration that removes duplicated code.

Question: there exist multiple paths for addmm depending on the layout combinations of the input and output tensors. Are all these paths covered by existing unit tests?
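The layout coverage asked about here can be checked mechanically. A hypothetical sketch of such a test, with SciPy standing in for the ATen ops (the loop structure over layout combinations, not the API, is the point):

```python
import itertools
import numpy as np
from scipy.sparse import csr_matrix, csc_matrix

layouts = {"csr": csr_matrix, "csc": csc_matrix}
a = np.array([[1.0, 0.0], [0.0, 2.0]])
b = np.array([[0.0, 3.0], [4.0, 0.0]])
c = np.eye(2)
expected = c + a @ b  # dense reference result

# Two layouts for each of c, a, and b give the eight addmm paths listed in the PR
# (here c's layout loosely models the output layout of the ATen op).
for c_l, a_l, b_l in itertools.product(layouts, repeat=3):
    out = layouts[c_l](c) + layouts[a_l](a) @ layouts[b_l](b)
    assert np.array_equal(out.toarray(), expected), (c_l, a_l, b_l)
print("all layout combinations agree")
```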

if (mat2.layout() == kSparseCsr) {
  const Eigen::Map<EigenCsrMatrix> mat2_eigen =
      Tensor_to_EigenCsr<scalar_t, index_t>(mat2);
  const EigenCscMatrix mat1_mat2_eigen = (mat1_eigen * mat2_eigen);
A collaborator replied:
Thanks for the link!

I think this inefficiency issue can be resolved by a comment in the source code.

@Aidyn-A Aidyn-A force-pushed the eigen_for_sparse_addmm_v2 branch from 2ea6b20 to 2c69b71 Compare August 7, 2025 14:12
@pearu (Collaborator) left a comment

This looks good to me; however, see my comment below about a suspected bug in the addmm_out_sparse_eigen logic that should be revealed by tests.

Comment on lines +232 to +234:

  // Out_csc = M1_csc * M2_csc
  const auto mat2_eigen = Tensor_to_Eigen<scalar_t, Eigen::ColMajor, index_t>(mat2);
  const auto mat1_mat2_eigen = (mat1_eigen * mat2_eigen);
A collaborator replied:

Notice that this code is identical to the code above under // Out_csr = M1_csc * M2_csc (line 200); however, the expected mat1_mat2_eigen storage orders ought to be different here and above.

Unless a compiler is able to infer different types for auto mat1_mat2_eigen based on the following Eigen_to_Tensor call below (I would be surprised if this is the case), calling Eigen_to_Tensor ought to fail either here or above.

Hence, I think the question raised in #155357 (review) needs to be answered, either by confirming that all branches in this function are tested or by implementing the missing tests (which I suspect will fail due to the check in the Tensor_to_Eigen function here or above).
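To make the storage-order concern concrete: the same sparse product can be materialized in row-major (CSR-like) or column-major (CSC-like) form, and a conversion routine that asserts one layout will reject the other. A rough SciPy analogue (illustrative only, not the Eigen code under review):

```python
import numpy as np
from scipy.sparse import csc_matrix

m1 = csc_matrix(np.array([[1.0, 0.0], [5.0, 2.0]]))
m2 = csc_matrix(np.array([[0.0, 3.0], [4.0, 0.0]]))

prod = m1 @ m2            # the math is layout-independent
out_csr = prod.tocsr()    # row-major materialization (the Out_csr branch)
out_csc = prod.tocsc()    # column-major materialization (the Out_csc branch)

# Identical values, different storage; code that inspects the format
# (as Tensor_to_Eigen checks the Eigen storage order) distinguishes them.
assert out_csr.format == "csr" and out_csc.format == "csc"
print(out_csr.toarray())  # values: [[0, 3], [8, 15]]
```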

Labels: open source, release notes: sparse, triaged

6 participants