-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
MNT Updated DistanceMetric
API with new ABC/interface
#26471
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a quick look and I like the general idea although it's breaking the build at the moment:
Error compiling Cython file:
------------------------------------------------------------
...
next_node_min_reach = min_reachability[j]
next_node_source = current_sources[j]
pair_distance = dist_metric.dist(
&raw_data[current_node, 0],
^
------------------------------------------------------------
sklearn/cluster/_hdbscan/_linkage.pyx:182:16: Cannot convert 'const float64_t *' to Python object
Error compiling Cython file:
------------------------------------------------------------
...
next_node_min_reach = min_reachability[j]
next_node_source = current_sources[j]
pair_distance = dist_metric.dist(
&raw_data[current_node, 0],
&raw_data[j, 0],
^
------------------------------------------------------------
sklearn/cluster/_hdbscan/_linkage.pyx:183:16: Cannot convert 'const float64_t *' to Python object
Thanks for pointing that out @ogrisel! I forgot to adjust for the newly-merged HDBSCAN. Should be fixed now :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, Meekail.
Just a few comments.
Could you also add tests to check that the dispatch on np.float{32,64}
data return correct instances of respectively DistanceMetric{32,64}
and that it raise an error otherwise?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM once Julien's comments are addressed.
@jjerphan All concerns should be addressed now 😄 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Merci!
…it-learn#26471)" This reverts commit 314f7ba.
As of 0e253d9, we have: In [1]: from sklearn.metrics import DistanceMetric
In [2]: DistanceMetric.get_metric("euclidean")
Out[2]: <sklearn.metrics._dist_metrics.EuclideanDistance64 at 0x7f7785681b60> As mentioned in #25914 (comment), we might want to hide This PR is not required at all for 1.3, but might be misleading if it is released now. Hence, I am for now pinning this PR for 1.4 not to have users adhere to this change of behavior. What do you think, @Micky774? |
Just for visibility sake, copying my comment from the linked discussion:
|
Actually, the consensus is to keep the current behavior in main. I reverted the milestone to 1.3. |
Reference Issues/PRs
Related to #26267
Addresses #26267 (comment)
What does this implement/fix? Explain your changes.
DistanceMetric-->DistanceMetric64
in line withDistanceMetric32
DistanceMetric
as an abstract base class forDistanceMetric{32, 64}
get_metric
method toDistanceMetric
which accepts adtype
argument to decide on metric specialization internally, without requiring explicit use ofDistanceMetric{32, 64}.get_metric
Somewhat optional:
sklearn.neighbors.DistanceMetric
as a completion of its due deprecation (clears otherwise confusing namespace)Any other comments?
The removal of
sklearn.neighbors.DistanceMetric
is not strictly necessary for this, however it seems appropriate.