'svd' and 'eigen' shouldn't yield such different results for RidgeCV #4781

@lemonlaug

Description

import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.datasets import load_boston
from sklearn.preprocessing import scale 

boston = scale(load_boston().data)
target = load_boston().target

alphas = np.linspace(0,200)
fit0 = RidgeCV(alphas=alphas, store_cv_values=True, gcv_mode='eigen').fit(boston, target)
fit0.alpha_
#4.0816326530612246
fit0.cv_values_[0,0:5]
#array([ 37.65055379,  38.25669302,  38.99731156,  39.51049034,  39.85507581])

fit1 = RidgeCV(alphas=alphas, store_cv_values=True, gcv_mode='svd').fit(boston, target)
fit1.alpha_
#0.0
fit1.cv_values_[0,0:5]
#array([         nan,  38.25669302,  38.99731156,  39.51049034,  39.85507581])

The problem here appears to be that gcv_mode='svd' produces nan for alpha=0.
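
One plausible way a nan can arise at alpha=0 is sketched below. This is an assumption for illustration, not something confirmed from the scikit-learn source: if a computation is written with explicit 1/alpha terms, then at alpha=0 it evaluates inf - inf, which is nan in floating point, even though the algebraically equivalent form y / (v + alpha) is finite. The function `loo_term` and the values `v` and `y` are hypothetical.

```python
import numpy as np


def loo_term(alpha, v=2.0, y=1.0):
    """Hypothetical expression containing 1/alpha terms.

    NOT copied from scikit-learn. Algebraically it simplifies to
    y / (v + alpha), which is well defined at alpha = 0.
    """
    alpha = np.float64(alpha)
    # Silence the divide-by-zero / invalid-value warnings for the demo.
    with np.errstate(divide="ignore", invalid="ignore"):
        inv_alpha = 1.0 / alpha              # inf when alpha == 0
        w = 1.0 / (v + alpha) - inv_alpha    # -inf when alpha == 0
        return w * y + inv_alpha * y         # -inf + inf -> nan


print(loo_term(1.0))  # finite, equals y / (v + alpha)
print(loo_term(0.0))  # nan, although y / v is finite
```

If something like this is what happens under 'svd', rewriting the expression so that 1/alpha never appears on its own would be one way to get a value at alpha=0.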

The ridge regression docs suggest that 0 is a valid value of alpha, corresponding of course to the unregularized regression.

It seems the solution would be either:

  • Change computation of cv_values_ under 'svd' to produce a value.
  • Warn, or change user docs to discourage using alpha = 0 under this case.
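
The second option could look roughly like the sketch below. `check_alphas` is a hypothetical helper written for this issue, not an existing scikit-learn function, and the warning text is only a suggestion. It emits a warning when alpha=0 is combined with gcv_mode='svd' and otherwise returns the grid unchanged.

```python
import warnings


def check_alphas(alphas, gcv_mode):
    """Hypothetical pre-check, not part of scikit-learn.

    Warns when alpha = 0 is requested together with gcv_mode='svd',
    the combination that produced nan in the report above.
    """
    alphas = [float(a) for a in alphas]
    if gcv_mode == "svd" and any(a == 0.0 for a in alphas):
        warnings.warn(
            "alpha=0 may produce nan cv_values_ with gcv_mode='svd'; "
            "consider a small positive alpha or gcv_mode='eigen'.",
            UserWarning,
        )
    return alphas


# Usage: validate the grid before handing it to RidgeCV.
alphas = check_alphas([0.0, 1.0, 10.0], gcv_mode="svd")
```

The helper only warns and leaves the grid intact, so existing code would keep its behaviour while the user is told why alpha_ might come back as 0.0.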
