FIX RuntimeWarning division by zero in check_classifiers_one_label #19690

Merged

Conversation

mbatoul
Contributor

@mbatoul mbatoul commented Mar 16, 2021

Reference Issues/PRs

See issue #19334.

What does this implement/fix? Explain your changes.

The warning stems from LDA's fit method. There is a division by zero in LinearDiscriminantAnalysis._solve_svd when computing the explained variance ratio: self.explained_variance_ratio_ = (S**2 / np.sum(S**2))[:self._max_components]. Here, S is all zeros because it is the vector of singular values of the zero matrix X, given by X = np.dot(((np.sqrt((n_samples * self.priors_) * fac)) * (self.means_ - self.xbar_).T).T, scalings). X is zero because self.means_ == self.xbar_ when the LDA classifier is tested with only one class (see y = np.ones(10)).
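To make the failure mode concrete, here is a minimal NumPy-only sketch that does not call scikit-learn. The data, the variable names, and the simplified computation of the class means and overall mean are assumptions for illustration; the scaling factors from the real `_solve_svd` are left out because they do not change the fact that the matrix is zero.

```python
import warnings

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 3))  # hypothetical feature matrix
y = np.ones(10)               # a single class, as in the failing check

classes = np.unique(y)
# Per-class means and class priors (simplified stand-ins for means_ / priors_).
means = np.stack([X[y == c].mean(axis=0) for c in classes])
priors = np.array([np.mean(y == c) for c in classes])
xbar = priors @ means  # overall mean; equals the only class mean here

# With one class the centered class means are all zero,
# so every singular value is zero as well.
S = np.linalg.svd(means - xbar, compute_uv=False)

max_components = 0  # value reported in this thread for the one-class case
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Same expression as quoted above: 0 / 0 triggers a RuntimeWarning.
    ratio = (S**2 / np.sum(S**2))[:max_components]

warning_categories = [w.category for w in caught]
```

The resulting `ratio` is empty anyway because of the `[:0]` slice, so the division that triggers the warning contributes nothing to the result.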

Any other comments?

I am still trying to figure out the right way to solve this problem.

@jeremiedbb
Member

With only 1 class, self._max_components = 0, so self.explained_variance_ratio_ is an empty array, and we should not even do the computation here:

self.explained_variance_ratio_ = (S**2 / np.sum(

Maybe we can do something like

if self._max_components == 0:
    self.explained_variance_ratio_ = np.empty((0,), dtype=S.dtype)
else:
    self.explained_variance_ratio_ = ...
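As a standalone illustration of the guard suggested above, the sketch below wraps it in a free function. The function name and its arguments are hypothetical and this is not the code that was merged; it only shows that skipping the division when there are no components avoids the warning.

```python
import warnings

import numpy as np


def guarded_explained_variance_ratio(S, max_components):
    """Return the explained variance ratio, skipping the division
    entirely when no components are kept (hypothetical helper)."""
    if max_components == 0:
        # Nothing to report: return an empty array of the same dtype.
        return np.empty((0,), dtype=S.dtype)
    return (S**2 / np.sum(S**2))[:max_components]


# One-class case: all singular values are zero and no components are kept.
with warnings.catch_warnings():
    warnings.simplefilter("error")  # any RuntimeWarning would raise here
    empty_ratio = guarded_explained_variance_ratio(np.zeros(1), 0)

# Regular case: the usual computation is unchanged.
regular_ratio = guarded_explained_variance_ratio(np.array([3.0, 4.0]), 1)
```

Returning an empty array with `S.dtype` keeps the attribute's type consistent with the regular branch.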

@mbatoul
Contributor Author

mbatoul commented Mar 16, 2021

Thank you for your help, @jeremiedbb!

It makes total sense, I missed that self._max_components was zero in this case.

I will work on it tonight.

@mbatoul mbatoul marked this pull request as ready for review March 16, 2021 17:55
@mbatoul mbatoul changed the title from "[WIP] Fix: RuntimeWarning division by zero in check_classifiers_one_label (sklearn/utils/estimator_checks.py)" to "[WIP] Fix: RuntimeWarning division by zero in check_classifiers_one_label" Mar 16, 2021
Member

@jeremiedbb jeremiedbb left a comment

LGTM. Thanks @mbatoul !

@lorentzenchr lorentzenchr changed the title from "[WIP] Fix: RuntimeWarning division by zero in check_classifiers_one_label" to "FIX RuntimeWarning division by zero in check_classifiers_one_label" Mar 16, 2021
Member

@lorentzenchr lorentzenchr left a comment

LGTM. @mbatoul Thank you.

@lorentzenchr lorentzenchr merged commit b7b510f into scikit-learn:main Mar 16, 2021
marrodion pushed a commit to marrodion/scikit-learn that referenced this pull request Mar 17, 2021
@glemaitre glemaitre mentioned this pull request Apr 22, 2021
12 tasks
3 participants