Skip to content

Conversation

thomasjpfan
Copy link
Member

Reference Issues/PRs

Fixes #13617

What does this implement/fix? Explain your changes.

This PR uses the correct threshold if the inner estimator uses predict_proba.

@thomasjpfan thomasjpfan added module:multiclass Quick Review For PRs that are quick to review labels Feb 24, 2022
@thomasjpfan
Copy link
Member Author

Marking this as quick review. We do a very similar thing here:

if hasattr(self.estimators_[0], "decision_function") and is_classifier(
self.estimators_[0]
):
thresh = 0
else:
thresh = 0.5

Copy link
Member

@glemaitre glemaitre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@thomasjpfan thomasjpfan changed the title FIX Fixes OneVsOneClassifier.predict for ests with only predict_proba FIX Fixes OneVsOneClassifier.predict for Estimators with only predict_proba Mar 2, 2022
Copy link
Member

@jjerphan jjerphan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apart a suggestion, this LGTM. Thank you, @thomasjpfan.

Comment on lines 773 to 780
if hasattr(self.estimators_[0], "decision_function") and is_classifier(
self.estimators_[0]
):
thresh = 0
else:
# predict_proba threshold
thresh = 0.5
return self.classes_[(Y > thresh).astype(int)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth wrapping this and the lines you mentioned, i.e.:

if hasattr(self.estimators_[0], "decision_function") and is_classifier(
self.estimators_[0]
):
thresh = 0
else:
thresh = 0.5

in a private _threshold property?

I think this can make the logic explicit and ease maintainability.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OneVsOneClassifier and OneVsRestClassifier do not share a common base class where a _threshold property makes sense. Also the threshold is only used in the binary case.

Inspired by your suggestion, I think a helper function makes the logic better 7372cee (#22604)

@lesteve
Copy link
Member

lesteve commented Mar 18, 2022

Merging main to get rid of the doc-min-dependencies error about Python 3.7.

@lesteve
Copy link
Member

lesteve commented Mar 18, 2022

Merging, thanks!

@lesteve lesteve merged commit fb83577 into scikit-learn:main Mar 18, 2022
glemaitre pushed a commit to glemaitre/scikit-learn that referenced this pull request Apr 6, 2022
…_proba (scikit-learn#22604)

Co-authored-by: Loïc Estève <loic.esteve@ymail.com>
joshua-cogliati-inl added a commit to joshua-cogliati-inl/raven that referenced this pull request Dec 9, 2024
wangcj05 pushed a commit to idaholab/raven that referenced this pull request Jan 14, 2025
* Updating to scipy 1.12

* Scikitlearn 1.0 is incompatible with scipy 1.12, so updating to 1.1

* Sometimes only a 1d array is returned.

* Removing testing kulsinski metric.

* simps is deprecated so switching to simpson

* Handle change from scikit-learn/scikit-learn#22199

* Improving test to classify better.

* Regold due to scikit change
scikit-learn/scikit-learn#22604

* Regold PolyExponential files (had rel err of 1e-03 or less)

* Make timestep uniform for scipy update.

* Regolding because of changes in scipy 1.12

* Increase limits to improve convergence.

* Unpinning xarray and updating numpy

* Updating various libraries.

* Fix working with newer tensorflow.

* Values need to be switched to tuples for hstack in numpy 1.26

* Updating to new ray version.

* The deque size can be bigger in python 3.11

* Report difference in row lengths, instead of crashing OrderedCSVDiffer.

Also report gold file name.

* Remove Fourier__signal_f__period10.0__phase

This was either +pi or -pi semirandomly, so nolonger testing it.

* Regolding changes to ROM/TimeSeries/DMD/BOPDMD because of library changes.

* Support xarray 2024.7 and newer.

Pre 2024.7 automatically squeeze()ed groupby results, so now need
to explicitly call squeeze().

* Fixing long line.

* Increasing zero threshold because of change in libraries.

* Remove version from setuptools since ray updated.

* Optimizing persistence in BayesianMatyas.

* Switch OVO to use estimator that is not constantly zero.

* Use keepdims instead of try catch block.

* Updating to default using python 3.11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
module:multiclass Quick Review For PRs that are quick to review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

OneVsOneClassifier with KNeighborsClassifier different predictions on two-class dataset
4 participants