Array API: integrating array-api-extra #30367

Closed
@lucascolley

Following on from our discussion in the array API standard community meeting @betatim:

https://data-apis.org/array-api-extra/ is a library which I have authored. Its main purpose is to give a public API 'home' to the array-agnostic functions which consumer libraries otherwise end up writing and keeping in private modules. Various functions in https://github.com/scikit-learn/scikit-learn/blob/main/sklearn/utils/_array_api.py could find a home in array-api-extra.

SciPy has offloaded its functions of this sort to array-api-extra, which it now vendors via a git submodule. The main reason for scikit-learn to do the same is to share the work already done with other consumer libraries through a clear API Reference, so that they do not have to dig through the source code and copy & paste (or reinvent the wheel). There are other benefits too: the functions in array-api-extra get proper documentation, tests, and static type hints.

The question is then how to adopt array-api-extra in scikit-learn. From the discussion with @betatim, we concluded that array-api-extra is not well suited to being an optional dependency: boilerplate would be needed to handle the case where it isn't installed, either by raising an error or by replicating the functions with NumPy.
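To make the boilerplate concern concrete, here is a minimal sketch of what an optional-dependency guard might look like. The helper `optional_import` and the wrapper `atleast_nd` are hypothetical names for illustration, not existing scikit-learn code. `xpx.atleast_nd` is a function provided by array-api-extra; the NumPy fallback is my assumption of how one might replicate it.

```python
import importlib
from types import ModuleType
from typing import Any, Optional


def optional_import(name: str) -> Optional[ModuleType]:
    """Return the named module, or None if it cannot be imported."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# None when array-api-extra is not installed; every call site has to handle that.
xpx = optional_import("array_api_extra")


def atleast_nd(x: Any, ndim: int) -> Any:
    """Hypothetical wrapper: prepend axes until `x` has at least `ndim` dimensions."""
    if xpx is not None:
        # Array-agnostic path provided by array-api-extra.
        return xpx.atleast_nd(x, ndim=ndim)

    # Fallback path (assumption): replicate the behaviour with NumPy only.
    import numpy as np

    x = np.asarray(x)
    while x.ndim < ndim:
        x = np.expand_dims(x, axis=0)
    return x
```

Each function used from array-api-extra would need a guard and a fallback of this kind. The two branches can also drift apart: here the fallback always returns a NumPy array, whereas the array-api-extra path keeps the input's array type.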

Does scikit-learn have an established process for vendoring code from other repos? In gh-30340 I tried to wrangle CI to get a git submodule to work, but there may be alternatives to that.

cc @ogrisel @thomasjpfan @OmarManzoor


As a side note, scikit-learn may want to look into vendoring array-api-compat itself rather than keeping it as an optional dependency. In the meeting, it wasn't clear why it was made optional in the first place.
