[ATen][CPU][Sparse] Use Third-Party Eigen for sparse add and addmm #155357
base: main
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/155357
Note: Links to docs will display an error until the docs builds have been completed.
⏳ No Failures, 22 Pending as of commit 6f01edd with merge base 01bcf9a.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@pytorchbot rebase
@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here.
Successfully rebased. Force-pushed from 71d7617 to 8db7338.
Force-pushed from 8db7338 to df77628.
Thanks for the PR, it looks good to me!
I have a number of nits and one question, see #155357 (comment)
Force-pushed from df77628 to 36d5c43.
@pytorchbot rebase
@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here.
Successfully rebased. Force-pushed from 1afd750 to 2ea6b20.
Here is another review iteration that removes duplicated code.
Question: there exist multiple paths for addmm depending on the layout combinations of the input and output tensors. Are all of these paths covered by existing unit tests?
if (mat2.layout() == kSparseCsr) {
  const Eigen::Map<EigenCsrMatrix> mat2_eigen =
      Tensor_to_EigenCsr<scalar_t, index_t>(mat2);
  const EigenCscMatrix mat1_mat2_eigen = (mat1_eigen * mat2_eigen);
Thanks for the link!
I think this inefficiency issue can be resolved by a comment in the source code.
Force-pushed from 2ea6b20 to 2c69b71.
This looks good to me; however, see my comment below about a suspected bug in the addmm_out_sparse_eigen logic that should be revealed in tests.
// Out_csc = M1_csc * M2_csc
const auto mat2_eigen = Tensor_to_Eigen<scalar_t, Eigen::ColMajor, index_t>(mat2);
const auto mat1_mat2_eigen = (mat1_eigen * mat2_eigen);
Notice that this code is identical to the code above under // Out_csr = M1_csc * M2_csc (line 200); however, the expected mat1_mat2_eigen storage orders ought to be different here and above. Unless the compiler is able to infer different types for auto mat1_mat2_eigen based on the Eigen_to_Tensor call that follows (I would be surprised if that were the case), calling Eigen_to_Tensor ought to fail either here or above.
Hence, I think the question raised in #155357 (review) needs to be answered, either by confirming that all branches in this function are tested or by implementing the missing tests (which I suspect will fail due to the check in the Tensor_to_Eigen function here or above).
You are right, I think it is better to use explicit type declarations instead of auto.
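To illustrate the point about auto, here is a minimal standalone Eigen sketch (not the PR's code; the EigenCsrMatrix/EigenCscMatrix aliases are stand-ins for the ones defined in the patch). With auto, the product is a lazy expression whose storage order is fixed only when it is assigned to a concrete SparseMatrix type, so an explicit result type makes the intended output order unambiguous:

#include <Eigen/Sparse>

// Illustrative aliases; the PR defines its own CSR/CSC matrix typedefs.
using EigenCsrMatrix = Eigen::SparseMatrix<double, Eigen::RowMajor>;
using EigenCscMatrix = Eigen::SparseMatrix<double, Eigen::ColMajor>;

void storage_order_example(const EigenCscMatrix& mat1_eigen,
                           const EigenCscMatrix& mat2_eigen) {
  // With auto, the product is captured as a lazy expression template; the
  // storage order of the result is decided only where it is materialized.
  const auto lazy_product = mat1_eigen * mat2_eigen;

  // Explicit result types remove the ambiguity:
  // Out_csr = M1_csc * M2_csc, evaluated into a row-major (CSR-like) matrix.
  const EigenCsrMatrix out_csr = lazy_product;
  // Out_csc = M1_csc * M2_csc, evaluated into a column-major (CSC-like) matrix.
  const EigenCscMatrix out_csc = lazy_product;

  (void)out_csr;
  (void)out_csc;
}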
Question: there exist multiple paths for addmm depending on the layout combinations of the input and output tensors. Are all of these paths covered by existing unit tests?
No, those are not covered by the existing unit tests, as the MKL backend does not support these layouts:
addmm(c_csc, a_csr, b_csr)
addmm(c_csc, a_csr, b_csc)
addmm(c_csc, a_csc, b_csc)
addmm(c_csc, a_csc, b_csr)
Here is the short script I run to test the changes in this PR: test_addmm_sparse.py.txt
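For reference, a hypothetical ATen C++ sketch (not the PR's test script, which is Python; the function name and shapes are illustrative only) of how one of these previously uncovered combinations, addmm(c_csc, a_csr, b_csr), could be exercised on CPU:

#include <ATen/ATen.h>

// Hypothetical sketch: exercise the addmm(c_csc, a_csr, b_csr) path.
at::Tensor addmm_csc_csr_csr_example() {
  // Small dense operands converted to the sparse layouts under test.
  const auto a = at::eye(4).to_sparse_csr();
  const auto b = at::eye(4).to_sparse_csr();
  const auto c = at::eye(4).to_sparse_csc();
  // addmm computes beta * c + alpha * (a @ b); per the comment above, this
  // layout combination has no MKL path and would rely on the Eigen backend.
  return at::addmm(c, a, b);
}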
Force-pushed from 93aa73b to 6f01edd.
This pull request adds sparse add and addmm ops using the Eigen library.
Currently, these operations for sparse matrices on CPU are available only through MKL. Since MKL does not exist on aarch64, these ops are unavailable on any machine with an ARM-based CPU, including Apple Silicon, AWS Graviton, and NVIDIA Grace. This PR addresses the issue by using Eigen as a backend for the above ops.
This is a refactored version of my previous PR #101814. The main difference from the old one is that this PR does not enable Eigen by default.
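As a usage illustration (a hedged sketch: the function name and shapes are mine, and, per the description above, the Eigen backend has to be enabled explicitly rather than being on by default), this is the kind of CPU call the PR unblocks on aarch64 builds that lack MKL:

#include <ATen/ATen.h>

// Sketch: elementwise add of two sparse CSR matrices on CPU, one of the ops
// this PR routes through Eigen where MKL is unavailable (e.g. aarch64).
at::Tensor sparse_add_example() {
  const auto a = at::eye(8).to_sparse_csr();
  const auto b = at::eye(8).to_sparse_csr();
  // With a sparse CPU backend available, this should return a sparse CSR
  // tensor; without one (no MKL, Eigen disabled), the op is unsupported.
  return at::add(a, b);
}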