FIX: Reduce bias of covariance.MinCovDet with consistency correction #32117
Conversation
It seems that in some cases this test fails under the new implementation: https://github.com/scikit-learn/scikit-learn/blob/main/sklearn/covariance/tests/test_robust_covariance.py#L35. I suggest that this is not the fault of the implementation, and that the tolerance threshold of the test should be increased so it passes, as argued below.

The test compares the obtained result with the covariance obtained from the true inliers. The specific test that fails has a much smaller tolerance threshold than the other tests, and it would pass in this PR if its threshold were similar to theirs. The threshold seems too low given the variability of the method's results (see #23162). The example code in Issue #23162 shows that the new implementation is much less biased than the original one. The test failure should therefore be attributed to the variability of the results combined with the low threshold of that specific test, not to the new implementation. I suggest increasing the tolerance threshold in that test.
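For concreteness, the kind of comparison that test performs looks roughly like the sketch below; the data setup and the tolerance value are illustrative, not the actual ones used in `test_robust_covariance.py`:

```python
import numpy as np
from numpy.testing import assert_allclose
from sklearn.covariance import MinCovDet

rng = np.random.RandomState(0)
n_inliers, n_outliers, n_features = 80, 20, 5

# Inliers from a standard Gaussian, outliers shifted far away.
X = rng.randn(n_inliers + n_outliers, n_features)
X[n_inliers:] += 10.0

mcd = MinCovDet(random_state=0).fit(X)
pure_cov = np.cov(X[:n_inliers], rowvar=False)  # covariance of the true inliers

# Illustrative tolerance; the actual failing test uses a much tighter
# threshold, which is what this comment proposes to relax.
assert_allclose(mcd.covariance_, pure_cov, atol=0.5)
```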
I went ahead and increased the tolerance of the test, which now passes. I also changed the expected values in the docstring examples to match the new implementation. Interestingly, the results in this PR are closer to the true values than those of the original version.
Reference Issues/PRs
Partially fixes Issue #23162
What does this implement/fix? Explain your changes.
Background:
In brief, the output of the `covariance.MinCovDet` covariance estimator is strongly biased because it lacks a consistency correction. This PR adds the missing correction, reducing the bias. To fully eliminate the bias, an additional correction is needed that should be implemented in the future.

Issue #23162 shows an example of the bias in the output. In my comment on that issue, I provide a thorough explanation of the statistics behind the problem and this fix, and show that the new output obtained with this PR is less biased than the original.
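For reference, the standard consistency factor at the Gaussian model is alpha / P(chi²_{p+2} <= q_alpha), where alpha = h/n is the fraction of observations retained by the MCD and q_alpha is the alpha-quantile of a chi²_p distribution. A minimal sketch follows; the helper name and exact form may differ from the `_consistency_correction` function added in this PR:

```python
from scipy.stats import chi2

def consistency_correction(n_features, h_fraction):
    """Multiplicative factor making the MCD scatter consistent at the Gaussian.

    h_fraction is the fraction h / n of observations retained by the MCD.
    """
    quantile = chi2.ppf(h_fraction, df=n_features)
    return h_fraction / chi2.cdf(quantile, df=n_features + 2)

# e.g. with p = 5 features and h / n = 0.75 the factor is > 1,
# inflating the otherwise shrunken raw MCD covariance estimate.
factor = consistency_correction(5, 0.75)
```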
Changes:
In this PR, I added the function `_consistency_correction` to the `covariance._robust_covariance` submodule. This function calculates the multiplicative correction factor so that the output is consistent at the normal distribution, following the references mentioned in my comment. In these lines I multiply the robust covariance estimate by the new multiplicative factor, reducing the bias.

Also, this correction actually needs to be applied twice. The original implementation did apply the correction once, but using a somewhat ad hoc method from the original paper. For consistency in the code, and because the new correction is more theoretically grounded, I changed the code to use the new `_consistency_correction` for the first application too, in these lines.
Any other comments?

The estimates on Gaussian datasets with no outliers are still not unbiased, because they lack the finite-sample correction. A future PR should add this further correction. The implementation of MinCovDet in R can be used as a template for implementing it: https://rdrr.io/cran/robustbase/src/R/covMcd.R.
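In the spirit of the example code in #23162 (this is a sketch, not the exact snippet from the issue), the remaining finite-sample bias can be checked by fitting on clean Gaussian data and comparing the estimated variances with the known value of 1:

```python
import numpy as np
from sklearn.covariance import MinCovDet

rng = np.random.RandomState(42)
n_repeats, n_samples, n_features = 50, 200, 3

# Average the estimated variances over repeated clean standard-Gaussian samples;
# an unbiased estimator would give diagonal values close to 1 on average.
diag_means = []
for _ in range(n_repeats):
    X = rng.randn(n_samples, n_features)
    mcd = MinCovDet(random_state=0).fit(X)
    diag_means.append(np.diag(mcd.covariance_).mean())

print("mean estimated variance on clean N(0, I) data:", np.mean(diag_means))
```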