-
-
Notifications
You must be signed in to change notification settings - Fork 26k
FIX wminkowski removed in 1.8.0.dev0 #21741
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FIX wminkowski removed in 1.8.0.dev0 #21741
Conversation
There is no need for backport because it will only impact the nightly build before the release of 1.8.0 final, and only the scikit-learn |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Does it mean that parse_version
does not parse dev and rc numbers?
parse_version("1.7.2") < parse_version("1.8.0.dev0") < parse_version("1.8.0.dev0.whatever") < parse_version("1.8.0") as expected. It's just our code that was wrong since the |
Right. Is |
It'd be nice to do a scipy dev commit before merging |
Forgot we could do this ;) |
Hum I realize that there are plenty of other failures in the original log caused by a new
|
There is another failure in |
I pushed a new commit to make the code work consistently with scipy 1.8 while still being backward compatible with the scipy 1.7 and earlier parametrization. I think we need to backport this 1.0.2 to make this release compatible with 1.8 that might be released approximately at the same time and very probably before we release 1.1. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Otherwise LGTM.
This error is making it hard to see if there's any relevant error in the logs though:
E DeprecationWarning: the `interpolation=` argument to percentile was renamed to `method=`, which has additional options.
E Users of the modes 'nearest', 'lower', 'higher', or 'midpoint' are encouraged to review the method they. (Deprecated NumPy 1.22)
# WMinkowskiDistance in sklearn implements the weighting | ||
# scheme of the old 'wminkowski' in scipy < 1.8 and the | ||
# following adaptation: | ||
return WMinkowskiDistance(p, w ** (1/p), **kwargs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue then is that the user can create a WMinkowskiDistance(p, q, **kwargs)
themself, which would be different than get_metric("wminkowski", p, w)
, and kinda similar to get_metric("minkowski", p, w)
, but with a transformation on w
. I think this can cause quite a bit of confusion, and I'm not sure how to solve it.
If we remove "wminkowski"
, then it'd be more consistent I guess
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can rewrite MinkowskiDistance
and WMinkowskiDistance
to implement the new scheme in MinkowskiDistance
directly and make WMinkowskiDistance
inherit from MinkowskiDistance
with the transformation of the weight in its constructor and leave DistanceMetric.get_metric
to be more straightforward.
And then later we can deprecate "wminkowski"
(but probably after a year or two, once scipy 1.8 is commonly installed).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it will be easier to do this change once we no longer support versions of scipy (pre-1.6) where wminkowski
is not deprecated.
I think this PR is good enough for now. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this PR is good enough for now. WDYT?
I agree with you and believe the steps you gave in the first comment are proper ones to take in another PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm looking at the metric across the board, and here's what I see:
_dist_metrics.pyx:558-594
# W-Minkowski Distance
# d = sum(w_i^p * (x_i^p - y_i^p)) ^ (1/p)
cdef class WMinkowskiDistance(DistanceMetric):
r"""Weighted Minkowski Distance
.. math::
D(x, y) = [\sum_i |w_i * (x_i - y_i)|^p] ^ (1/p)
the comment and the docstring are inconsistent, but the docstring seems to be consistent with the implementation:
for j in range(size):
d += pow(self.vec[j] * fabs(x1[j] - x2[j]), self.p)
Now this PR is making this change (on the minkowski
part):
"minkowski" MinkowskiDistance p, w ``sum(w * |x - y|^p)^(1/p)``
"wminkowski" WMinkowskiDistance p, w ``sum(|w * (x - y)|^p)^(1/p)``
and the two are just different and not consistent with one another.
Since scipy has deprecated wminkowski
already for a while, and now it's removed, I'd say it makes sense for this PR to deprecate WMinkowskiDistance
, and make everything consistent with scipy? Especially since from what I see, it seems to be the case that we also have been a little bit flaky on how we incorporate w
there, and now is a good time to just stick to one model.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, if one of the component of w
is negative, WMinkowskiDistance
's call to pow
might crash.
Edit: actually, this gives nan
on main
:
from sklearn.neighbors import DistanceMetric
import numpy as np
rng = np.random.RandomState(1)
X = rng.rand(10, 100)
w = rng.rand(X.shape[1],)
w[0] = -1337
dist = DistanceMetric.get_metric("wminkowski", p=3, w=w)
# This produces nan
dm = dist.pairwise(X)
print(dm)
[[ 0. nan nan nan nan nan nan nan nan nan]
[nan 0. nan nan nan nan nan nan nan nan]
[nan nan 0. nan nan nan nan nan nan nan]
[nan nan nan 0. nan nan nan nan nan nan]
[nan nan nan nan 0. nan nan nan nan nan]
[nan nan nan nan nan 0. nan nan nan nan]
[nan nan nan nan nan nan 0. nan nan nan]
[nan nan nan nan nan nan nan 0. nan nan]
[nan nan nan nan nan nan nan nan 0. nan]
[nan nan nan nan nan nan nan nan nan 0.]]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since scipy has deprecated wminkowski already for a while, and now it's removed, I'd say it makes sense for this PR to deprecate WMinkowskiDistance, and make everything consistent with scipy? Especially since from what I see, it seems to be the case that we also have been a little bit flaky on how we incorporate w there, and now is a good time to just stick to one model.
I intended this PR to be minimal so as to be backportable to 1.0.2.
I would rather not deprecate a scikit-learn component in a scikit-learn minor release, even if that component maps more or less directly to a deprecated components in a dependency.
I can do the proposed change but I think it should target 1.1 and then 1.0.2 will have 2 broken tests under scipy 1.8 but maybe we don't care. Or maybe we care because it can make the release process automation fail...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, if one of the component of w is negative, WMinkowskiDistance's call to pow might crash.
I don't think we care about negative weights. We could raise a better error message but this out of the scope of this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the comment and the docstring are inconsistent, but the docstring seems to be consistent with the implementation
The inline comment is just plain wrong. Let me delete it, it's just useless.
Now this PR is making this change (on the minkowski part):
"minkowski" MinkowskiDistance p, wsum(w * |x - y|^p)^(1/p)
"wminkowski" WMinkowskiDistance p, wsum(|w * (x - y)|^p)^(1/p)
and the two are just different and not consistent with one another.
This is correct. DistanceMetric.get_metric("minkowski", p, w)
is not the same as DistanceMetric.get_metric("minkowski", p, w)
as you need to take the p-root or p-power of one w to get the equivalent w
of the other. This is all consistent with what scipy does.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thank you, @ogrisel.
Some minor suggestions.
# WMinkowskiDistance in sklearn implements the weighting | ||
# scheme of the old 'wminkowski' in scipy < 1.8 and the | ||
# following adaptation: | ||
return WMinkowskiDistance(p, w ** (1/p), **kwargs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this PR is good enough for now. WDYT?
I agree with you and believe the steps you gave in the first comment are proper ones to take in another PR.
def test_cdist(metric_param_grid, X1, X2): | ||
metric, param_grid = metric_param_grid | ||
keys = param_grid.keys() | ||
for vals in itertools.product(*param_grid.values()): | ||
kwargs = dict(zip(keys, vals)) | ||
if metric == "mahalanobis": | ||
# See: https://github.com/scipy/scipy/issues/13861 | ||
pytest.xfail("scipy#13861: cdist with 'mahalanobis' fails onmemmap data") | ||
elif metric == "wminkowski": | ||
if sp_version >= parse_version("1.8.0"): | ||
pytest.skip("wminkowski will be removed in SciPy 1.8.0") | ||
|
||
# wminkoski is deprecated in SciPy 1.6.0 and removed in 1.8.0 | ||
ExceptionToAssert = None | ||
if sp_version >= parse_version("1.6.0"): | ||
ExceptionToAssert = DeprecationWarning | ||
with pytest.warns(ExceptionToAssert): | ||
D_true = cdist(X1, X2, metric, **kwargs) | ||
else: | ||
D_true = cdist(X1, X2, metric, **kwargs) | ||
|
||
check_cdist(metric, kwargs, D_true) | ||
# Possibly caused by: https://github.com/joblib/joblib/issues/563 | ||
pytest.xfail( | ||
"scipy#13861: cdist with 'mahalanobis' fails on joblib memmap data" | ||
) | ||
check_cdist(metric, kwargs, X1, X2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test is way clearer now: thanks 👍
Co-authored-by: Julien Jerphanion <git@jjerphan.xyz>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that this is suitable for the minor release, but it'd be nice if you could do a follow up PR for 1.1 :)
… MinkowskiDistance's docstring
I opened #21765 ;) |
Co-authored-by: Julien Jerphanion <git@jjerphan.xyz>
Co-authored-by: Julien Jerphanion <git@jjerphan.xyz>
Co-authored-by: Julien Jerphanion <git@jjerphan.xyz>
Co-authored-by: Julien Jerphanion <git@jjerphan.xyz>
This follow-ups with scikit-learn#21741
This is follow-up for scikit-learn#21741.
This is follow-up for scikit-learn#21741.
This should fix the scipy-dev nightly build failure:
https://dev.azure.com/scikit-learn/scikit-learn/_build/results?buildId=35203&view=logs&j=dfe99b15-50db-5d7b-b1e9-4105c42527cf&t=ef785ae2-496b-5b02-9f0e-07a6c3ab3081
I tested locally and it seems to work as expected.
EDIT: to be consistent with scipy 1.8 overall, it was also necessary to make it possible to pass a
w
parameter whenmetric="minkowski"
in which case the weights are to be raised to the powerp
to be consistent with scipy 1.8.