Enable config setting sparse_interface to control sparray and spmatrix creation #31177

Draft
wants to merge 1 commit into
base: main
from

Conversation

dschult
Contributor

@dschult dschult commented Apr 11, 2025

This PR sets up a config parameter sparse_interface to indicate "sparray" or "spmatrix" outputs, as suggested in #26418.

The first commit sets everything up and implements the system for a few modules. Please take a look and give feedback on whether this is the way to proceed. The next commit(s) will apply the same style of changes throughout the library. If you would prefer they be in separate PRs, let me know. (I'll keep Draft status until there is feedback and the full library is covered.)

More specifically, this PR does the following:

  • adds sparse_interface to the config parameters. (I think this name is better than sparse_format because "format" means csr/coo/lil, etc. in the sparse world.) The values it can hold are "sparray" or "spmatrix". Updates the config tests accordingly.
  • adds utils/_sparse.py with (private) helper functions. The helpers differ in how much input checking they do. Tests added too.
    • _as_sparse(x_sparse): raises unless the input is sparse; converts it to the interface chosen by the config.
    • _select_interface_if_sparse(x): passes dense input through unchanged; sparse input goes through _as_sparse.
    • one-line convenience functions: _convert_from_spmatrix_to_sparray(x) and _convert_from_sparray_to_spmatrix(x)
  • updates the following modules to use this helper utility to return or store newly created sparse objects.
    • sklearn/feature_extraction/text.py, with its tests adapted.
    • sklearn/linear_model/_coordinate_descent.py, no test changes needed.
    • sklearn/manifold/_locally_linear.py, no test changes needed.

@thomasjpfan can you see if this does what you had in mind? I tried to pick modules that cover returning sparse, setting estimators to hold sparse, and transforming to sparse, so you can see how this would work.

Let me know if you think _as_sparse should be a public function, and whether my approach aligns with what you want. The next step for this PR is to repeat these types of changes throughout the library.


github-actions bot commented Apr 11, 2025

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 251e2e2. Link to the linter CI: here

@dschult
Contributor Author

dschult commented Apr 11, 2025

Another note:
The new sparse construction function eye_array(n), the sparray version of eye(n), was released in SciPy v1.12 along with other construction functions such as diags_array. They are therefore unavailable as long as the oldest supported SciPy version is v1.8.

We can work around it for now with e.g. _as_sparse(eye(n)), but it will need to be updated later (before spmatrix is removed).

The recent features for sparse by version are:

  • v1.12 added construction functions, e.g. eye_array, diags_array, etc.
  • v1.14 added 1D sparray support
  • v1.15 added indexing for sparray that returns 1D objects (as numpy arrays do), e.g. A[3,:] -> 1D array
  • goals: nD support in v1.16, broadcasting of binary operations in v1.17.

This info might help us decide when to support which versions. I think the construction functions are all that is currently needed. If/when we start using the same indexing code for both sparse and dense, we will likely want v1.15. If/when we want nD sparse we will need v1.16, and v1.17 for broadcasting binary operations. But for now, staying on v1.8 only rules out the construction functions in current code.

@dschult dschult force-pushed the impl_as_sparse_function branch from e2b7d8b to 251e2e2 Compare May 3, 2025 19:19