Skip to content

ENH Allow 0<p<1 for Minkowski metric #26760

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

Shreesha3112
Copy link
Contributor

Reference Issues/PRs

Fixes #26548

What does this implement/fix? Explain your changes.

Currently MinkowskiDistance metric raises ValueError("p must be greater than 1") if p<1. This PR allows 0<p<1.

Any other comments?

@github-actions
Copy link

github-actions bot commented Jul 4, 2023

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 6750028. Link to the linter CI: here

@Shreesha3112
Copy link
Contributor Author

Shreesha3112 commented Jul 4, 2023

CI failing since Minkowski metirc of scipy raises ValueError("p must be greater than 1") in 1.5.0 release. 0<p<1 allowed for Minkowski in scipy from 1.8.0. To allow p>0 requires updating scipy minimum dependency to 1.8.

Scipy issue for reference: scipy/scipy#15055

@Shreesha3112 Shreesha3112 marked this pull request as ready for review July 4, 2023 12:52
@Shreesha3112 Shreesha3112 changed the title Fix Allow 0<p<1 for Minkowski metric ENH Allow 0<p<1 for Minkowski metric Jul 4, 2023
@Micky774
Copy link
Contributor

Micky774 commented Jul 4, 2023

CI failing since Minkowski metirc of scipy raises ValueError("p must be greater than 1") in 1.5.0 release. 0<p<1 allowed for Minkowski in scipy from 1.8.0. To allow p>0 requires updating scipy minimum dependency to 1.8.

Rather than bumping the minimum dependency, we can instead modify the tests based on what the installed version of scipy is. For example, see:

if sp_version >= parse_version("1.8.0.dev0"):

@Shreesha3112
Copy link
Contributor Author

Side note: As per scipy release notes, 0<p<1 allowed for Minkowski from 1.8.0. Testing with different versions of scipy revealed, this is allowed from 1.7.0.

@Micky774
Copy link
Contributor

Micky774 commented Jul 7, 2023

We should update the documentation of MinkowskiDistance to mention the fact that $p\in(0, 1)$ implies that it is no longer a metric. I think that would be a more appropriate place than the changelog.

Copy link
Member

@jeremiedbb jeremiedbb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR @Shreesha3112. Just a few comments, looks good otherwise

Comment on lines 84 to 85
- |Enhancement| The `"Minkowski"` metric of :class:`metrics.DistanceMetric` now accepts
parameter `p` values in the open interval (0,1).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather have a higher level message more related to the original issue, like say that NearestNeighbors now works with p < 1 regardless of the dtype of X, and mark this entry as a |Fix|

shreesha3112 and others added 3 commits July 12, 2023 20:53
Co-authored-by: Jérémie du Boisberranger <34657725+jeremiedbb@users.noreply.github.com>
@jeremiedbb
Copy link
Member

One last thing that I forgot to ask yesterday: please add a non-regression test for this; something like the reproducer given in the linked issue #26548 (comment).

@Shreesha3112
Copy link
Contributor Author

One last thing that I forgot to ask yesterday: please add a non-regression test for this; something like the reproducer given in the linked issue #26548 (comment).

Added non-regression test

Copy link
Contributor

@Micky774 Micky774 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also impacts other estimators -- really any that directly use MinkowskiDistance{32, 64} -- should we document this change for them as well in this PR? What do you think @jeremiedbb?

Otherwise looks good, thanks!

Co-authored-by: Meekail Zain <34613774+Micky774@users.noreply.github.com>
@Micky774 Micky774 added Quick Review For PRs that are quick to review Waiting for Second Reviewer First reviewer is done, need a second one! labels Jul 24, 2023
Copy link
Member

@jeremiedbb jeremiedbb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks @Shreesha3112

@jeremiedbb
Copy link
Member

should we document this change for them as well in this PR?

I updated the message to be more generic about neighbors based estimators

@jeremiedbb jeremiedbb enabled auto-merge (squash) July 26, 2023 15:33
@jeremiedbb jeremiedbb merged commit 44d4cd4 into scikit-learn:main Jul 26, 2023
@Shreesha3112 Shreesha3112 deleted the fix/MinkowskiDistance_when_p_less_than_1 branch July 26, 2023 17:11
punndcoder28 pushed a commit to punndcoder28/scikit-learn that referenced this pull request Jul 29, 2023
…learn#26760)

Co-authored-by: Jérémie du Boisberranger <34657725+jeremiedbb@users.noreply.github.com>
Co-authored-by: Meekail Zain <34613774+Micky774@users.noreply.github.com>
glemaitre pushed a commit to glemaitre/scikit-learn that referenced this pull request Sep 18, 2023
…learn#26760)

Co-authored-by: Jérémie du Boisberranger <34657725+jeremiedbb@users.noreply.github.com>
Co-authored-by: Meekail Zain <34613774+Micky774@users.noreply.github.com>
jeremiedbb added a commit that referenced this pull request Sep 20, 2023
Co-authored-by: Jérémie du Boisberranger <34657725+jeremiedbb@users.noreply.github.com>
Co-authored-by: Meekail Zain <34613774+Micky774@users.noreply.github.com>
REDVM pushed a commit to REDVM/scikit-learn that referenced this pull request Nov 16, 2023
…learn#26760)

Co-authored-by: Jérémie du Boisberranger <34657725+jeremiedbb@users.noreply.github.com>
Co-authored-by: Meekail Zain <34613774+Micky774@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cython module:metrics Quick Review For PRs that are quick to review Waiting for Second Reviewer First reviewer is done, need a second one!
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Using NearestNeighbors with p < 1 and floats raises an error
3 participants