Skip to content

DOC Add comment about input checking in pairwise_distances #31170

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Apr 14, 2025

Conversation

lucyleeow
Copy link
Member

Reference Issues/PRs

ref: #31162

Add comment that input dim checking is not done when user provides callable metric

What does this implement/fix? Explain your changes.

Any other comments?

cc @jeremiedbb

Copy link

github-actions bot commented Apr 10, 2025

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: fb793c4. Link to the linter CI: here

@jeremiedbb
Copy link
Member

I just realized that _pairwise_callable actually calls check_pairwise_array, so it is done for callables as well. It's just that we don't force 2D arrays here to allow things like distances between strings.

@lucyleeow
Copy link
Member Author

Yup I just said dim checking, it doesn't enforce 2d or check x and y have the same 2nd dim.

Do you think comment is not needed?

@lucyleeow
Copy link
Member Author

It is essentially the same comment as:

elif ensure_2d and X.shape[1] != Y.shape[1]:
# Only check the number of features if 2d arrays are enforced. Otherwise,
# validation is left to the user for custom metrics.

but I thought it would be useful to have it in pairwise_distances, especially as we don't mention this in the docstring.

@jeremiedbb
Copy link
Member

I'm okay with the comment but I feel that it should instead be in _pairwise_callable right before the call to check_pairwise_arrays, otherwise we don't really understand what it refers to.

@lucyleeow
Copy link
Member Author

lucyleeow commented Apr 10, 2025

That's a good point because this is relevant to both pairwise_kernels and pairwise_distances when metric is a callable. Will move comment to where you suggest.

What do you think about including in the 'Notes' section of both functions, that no restrictions are placed on X and Y dimensions when metric is a callable. I think the original confusion for me was because both functions docstrings specify clearly what dimensions X and Y should be, but does not mention that this does not apply when metric is a callable.

@jeremiedbb
Copy link
Member

What do you think about including in the 'Notes' section of both functions, that no restrictions are placed on X and Y dimensions when metric is a callable. I think the original confusion for me was because both functions docstrings specify clearly what dimensions X and Y should be, but does not mention that this does not apply when metric is a callable.

looks good

@lucyleeow
Copy link
Member Author

Thanks @jeremiedbb ! Done.

Copy link
Member

@jeremiedbb jeremiedbb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks

@jeremiedbb jeremiedbb merged commit d99c7cf into scikit-learn:main Apr 14, 2025
36 checks passed
@lucyleeow lucyleeow deleted the doc_pwcallable branch April 14, 2025 09:46
lucyleeow added a commit to EmilyXinyi/scikit-learn that referenced this pull request Apr 23, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants