
Add array api support for jaccard score #31204


Open
wants to merge 7 commits into main

Conversation

OmarManzoor
Contributor

Reference Issues/PRs

Towards #26024

What does this implement/fix? Explain your changes.

  • Adds array API support for `jaccard_score` (a usage sketch follows below)
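
For context, a minimal usage sketch (not part of the PR itself): it assumes PyTorch and the array-api-compat package are installed and relies on scikit-learn's existing `config_context(array_api_dispatch=True)` mechanism; the toy labels are illustrative only.

```python
import numpy as np
import torch

import sklearn
from sklearn.metrics import jaccard_score

y_true = np.array([0, 1, 1, 0, 1, 0])
y_pred = np.array([0, 1, 0, 0, 1, 1])

# With array API dispatch enabled, the metric accepts array API compliant
# inputs such as torch tensors and is expected (with this PR) to perform
# the computation with the torch namespace on the input device.
with sklearn.config_context(array_api_dispatch=True):
    score = jaccard_score(
        torch.asarray(y_true), torch.asarray(y_pred), average="macro"
    )

print(float(score))
```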

Any other comments?


github-actions bot commented Apr 15, 2025

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: c403043. Link to the linter CI: here

@OmarManzoor
Contributor Author

OmarManzoor commented Apr 15, 2025

Some benchmarks

data size = 1e7
dtype = np.int64

| average  | Original flow | PyTorch CPU | PyTorch CUDA |
|----------|---------------|-------------|--------------|
| micro    | 1.0681        | 2.63489     | 0.08644      |
| macro    | 2.12834       | 5.26227     | 0.15068      |
| weighted | 3.19726       | 7.89072     | 0.21451      |
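
For reference, a rough sketch of how timings like these could be reproduced; the label distribution, timing harness, and device handling are assumptions rather than the exact script used, and accurate CUDA numbers would additionally need `torch.cuda.synchronize()` around the timed call.

```python
from time import perf_counter

import numpy as np
import torch

import sklearn
from sklearn.metrics import jaccard_score

rng = np.random.default_rng(0)
n = 10_000_000  # data size = 1e7
y_true = rng.integers(0, 3, size=n, dtype=np.int64)
y_pred = rng.integers(0, 3, size=n, dtype=np.int64)


def timed(yt, yp, average):
    tic = perf_counter()
    jaccard_score(yt, yp, average=average)
    return perf_counter() - tic


for average in ("micro", "macro", "weighted"):
    # Original flow: plain NumPy inputs, no dispatch.
    t_numpy = timed(y_true, y_pred, average)

    with sklearn.config_context(array_api_dispatch=True):
        # Same inputs as torch tensors on CPU and, if available, on CUDA.
        t_cpu = timed(torch.asarray(y_true), torch.asarray(y_pred), average)
        t_cuda = float("nan")
        if torch.cuda.is_available():
            t_cuda = timed(
                torch.asarray(y_true, device="cuda"),
                torch.asarray(y_pred, device="cuda"),
                average,
            )

    print(f"{average}: numpy={t_numpy:.5f}s cpu={t_cpu:.5f}s cuda={t_cuda:.5f}s")
```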

@OmarManzoor
Contributor Author

CC: @ogrisel @betatim @adrinjalali for reviews.

It seems that PyTorch on CPU degrades performance compared to the original NumPy flow.

Member

@virchan left a comment


Thanks for the PR, @OmarManzoor!

Just one small nitpick from my side—otherwise, LGTM!

Edit: I forgot we haven't checked with MPS. 😅

@adrinjalali
Member

I guess we're okay that PyTorch CPU has a subpar implementation?

@OmarManzoor
Contributor Author

> I guess we're okay that PyTorch CPU has a subpar implementation?

I think so. To be clear, I tested on a Kaggle kernel, which sometimes involves differing and conflicting package versions, but I think that is the best we can get for publicly available free GPUs. Since metrics do not involve much computation, using PyTorch on CPU doesn't really offer a benefit, and we are better off with the original NumPy implementation when running on a CPU.

But let's get an opinion from @ogrisel as well.
