Skip to content

FIX Add input array check to randomized_range_finder #30819

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
Apr 17, 2025

Conversation

clane9
Copy link
Contributor

@clane9 clane9 commented Feb 12, 2025

Reference Issues/PRs

Fixes #30736. See also #30737.

What does this implement/fix? Explain your changes.

Adds a check of the input array to randomized_range_finder so that complex valued inputs are rejected. As a result, complex inputs are also rejected from randomized_svd, which calls randomized_range_finder.

@clane9 clane9 changed the title Add input array check to randomized_range_finder FIX Add input array check to randomized_range_finder Feb 12, 2025
Copy link

github-actions bot commented Feb 12, 2025

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: d032446. Link to the linter CI: here

Comment on lines 276 to 277
A = check_array(A, accept_sparse=True)
xp, is_array_api_compliant = get_namespace(A)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is going to basically cancel the array API work I think.

I wonder if explicitly checking for complex values is the only way.

cc @OmarManzoor maybe

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, you're right. Thanks for the catch. What if we just call check_array but not reassign A?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if we don't assign the new array to A, it should work like before while handling the exceptions we need to catch. By the way I think check_array supports the array api itself.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, made the change to not reassign A after check_array.

I also tried check_array on a torch Tensor. The check runs and passes but it does convert to a numpy ndarray.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

check_array is array-api compliant so it can be used as usual.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jeremiedbb, I see, when I tried out check_array on a torch Tensor previously, I didn't have the array api enabled.

@jeremiedbb
Copy link
Member

randomized_svd was supposed to be array-api compliant but was not because of its param validation expecting a numpy array. I fixed it and added array-api compliance tests for both randomized_svd and randomized_range_finder which are both public.

Copy link
Member

@jeremiedbb jeremiedbb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -0,0 +1,2 @@
- :func:`sklearn.utils.extmath.randomized_svd` now support Array API compatible inputs.
By :user:`Connor Lane <clane9>`.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generous attribution to user @clane9 😂

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe @jeremiedbb you could add yourself as a co-author of this contribution ;)

@clane9
Copy link
Contributor Author

clane9 commented Apr 16, 2025

Thanks for the help with this PR @jeremiedbb and @ogrisel!

Copy link
Member

@ogrisel ogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@ogrisel
Copy link
Member

ogrisel commented Apr 17, 2025

@jeremiedbb I let you credit your work in the changelog and merge if you wish.

@jeremiedbb
Copy link
Member

Thanks @clane9 and sorry for the late feedback on this PR.

@jeremiedbb jeremiedbb enabled auto-merge (squash) April 17, 2025 08:35
Copy link
Member

@lesteve lesteve left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to add a test to check that randomized_svd and randomized_range_finder raise a meaninful error on invalid data (e.g. complex data like in the original issue) or is it overkill?

@jeremiedbb
Copy link
Member

The thing is that we don't have checks for complex input for any other function so I would not add it here and trust the check_array dedicated unit tests

@jeremiedbb jeremiedbb merged commit 9f3ca07 into scikit-learn:main Apr 17, 2025
34 checks passed
lucyleeow pushed a commit to EmilyXinyi/scikit-learn that referenced this pull request Apr 23, 2025
…inder` (scikit-learn#30819)

Co-authored-by: Adrin Jalali <adrin.jalali@gmail.com>
Co-authored-by: Jérémie du Boisberranger <jeremie@probabl.ai>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

randomized_svd incorrect for complex valued matrices
6 participants