BUG: liblinear stuck on iris after centering the data #18264

Closed
mathurinm opened this issue Aug 26, 2020 · 11 comments · Fixed by #25214
Labels
Bug · help wanted · Moderate (anything that requires some knowledge of conventions and best practices) · module:linear_model

Comments

@mathurinm
Contributor

mathurinm commented Aug 26, 2020

Describe the bug

The example on sparse Logreg surprised me because even at extremely low regularization (large C), only one feature enters the model:

[figure: regularization path of the sparse Logreg example; a single coefficient becomes nonzero]

Investigation led me to check whether applying StandardScaler to X changes the graph. When X is preprocessed this way, the liblinear solver does not seem to converge.

Steps/Code to Reproduce

from sklearn.svm import l1_min_c
from sklearn import datasets
from sklearn import linear_model
from sklearn.preprocessing import StandardScaler
import numpy as np

# Author: Alexandre Gramfort <alexandre.gramfort@inria.fr>
# License: BSD 3 clause


iris = datasets.load_iris()
X = iris.data
y = iris.target

X = X[y != 2]
y = y[y != 2]


# #############################################################################
# Demo path functions

cs = l1_min_c(X, y, loss='log') * np.logspace(0, 10, 30)


X = StandardScaler().fit_transform(X)
print("Computing regularization path ...")
clf = linear_model.LogisticRegression(penalty='l1', solver='liblinear',
                                      tol=1e-6, max_iter=int(1e6),
                                      warm_start=True,
                                      intercept_scaling=10000.)
coefs_ = []
for c in cs:
    clf.set_params(C=c)
    clf.fit(X, y)
    coefs_.append(clf.coef_.ravel().copy())
    print(clf.coef_.ravel(), clf.intercept_)

Expected Results

The code should run fast (iris has only 4 features).

Actual Results

The code gets stuck after the first regularization parameter.

I guess something is going wrong with the scaled intercept column that gets appended, because liblinear does not fit an unregularized intercept.
Going to a lower intercept scaling (intercept_scaling=1) gives me a far more reasonable graph:

[figure: regularization path obtained with intercept_scaling=1]

where I still suspect numerical errors are responsible for the bumps when C becomes large.
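
For context, here is a minimal sketch of the intercept trick liblinear relies on (my illustration of the documented behavior, not scikit-learn's actual internals): the intercept is fit as one extra regularized coefficient on a constant synthetic feature.

import numpy as np

def augment(X, intercept_scaling):
    # Hypothetical helper: liblinear effectively sees [X, intercept_scaling * 1],
    # reports intercept_ = intercept_scaling * w[-1], and penalizes w[-1] like
    # any other coefficient, so the intercept is never truly unregularized.
    ones = np.full((X.shape[0], 1), intercept_scaling)
    return np.hstack([X, ones])

With intercept_scaling=10000, that synthetic column sits four orders of magnitude above the standardized features, which is consistent with the solver struggling.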

Versions

Happy to help if it's a known issue.
@agramfort

@mathurinm
Contributor Author

I just realised that iris is linearly separable:

plt.scatter(X[:, 2], X[:, 3], c=y)

[figure: scatter plot of X[:, 2] vs X[:, 3] colored by class; the two classes are clearly separated]

In that case, solvers for the unregularized Logreg problem do not converge: the optimal coefficients go off to infinity. My guess is that when the intercept scaling is high, liblinear gets stuck pushing its value towards infinity. It is still weird that this does not happen when the data is not centered.
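
A quick way to confirm the separability (a sketch, using the same two-class subset as above): a single threshold on petal length already separates the classes, so the unpenalized log-likelihood has no finite maximizer.

from sklearn import datasets

iris = datasets.load_iris()
X, y = iris.data, iris.target
X, y = X[y != 2], y[y != 2]

# Petal length (third column): class 0 ends below where class 1 begins,
# so the two remaining classes are linearly separable.
print(X[y == 0, 2].max())  # 1.9 (setosa)
print(X[y == 1, 2].min())  # 3.0 (versicolor)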

@agramfort
Member

agramfort commented Aug 26, 2020

With current master, running this code:

from sklearn.svm import l1_min_c
from sklearn import datasets
from sklearn import linear_model
from sklearn.preprocessing import StandardScaler
import matplotlib.pyplot as plt
import numpy as np

# Author: Alexandre Gramfort <alexandre.gramfort@inria.fr>
# License: BSD 3 clause


iris = datasets.load_iris()
X = iris.data
y = iris.target

X = X[y != 2]
y = y[y != 2]


# #############################################################################
# Demo path functions

cs = l1_min_c(X, y, loss='log') * np.logspace(0, 10, 30)


X = StandardScaler().fit_transform(X)
print("Computing regularization path ...")
clf = linear_model.LogisticRegression(penalty='l1', solver='liblinear',
                                      tol=1e-6, max_iter=int(1e6),
                                      warm_start=True,
                                      intercept_scaling=10000.)
coefs_ = []
for c in cs:
    clf.set_params(C=c)
    clf.fit(X, y)
    coefs_.append(clf.coef_.ravel().copy())
    print(clf.coef_.ravel(), clf.intercept_)

plt.figure()
plt.plot(np.log10(cs), np.array(coefs_))
plt.show()

gives me reasonable results.

@mathurinm
Contributor Author

On my machine with the latest pull from master it worked once, but I cannot reproduce it: it gets stuck after 2 values of C.

Can I provide more debugging/reproduction info?

@agramfort
Member

agramfort commented Aug 26, 2020 via email

@TomDLT
Member

TomDLT commented Aug 26, 2020

Well spotted. I suggest we stop using LIBLINEAR and start using SAGA:
[figure: regularization path computed with the SAGA solver]

We should not use such a large intercept scaling with LIBLINEAR. Intercept scaling is a dirty hack that scales the intercept feature so differently from the rest of the features that it slows down convergence a lot (see e.g. #17557 (comment)).
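
Concretely, the path above can be recomputed by swapping the solver in the reproduction script (a sketch; note that saga fits a genuinely unpenalized intercept, so no intercept_scaling hack is involved):

from sklearn.svm import l1_min_c
from sklearn import datasets, linear_model
from sklearn.preprocessing import StandardScaler
import numpy as np

iris = datasets.load_iris()
X, y = iris.data, iris.target
X, y = X[y != 2], y[y != 2]
cs = l1_min_c(X, y, loss='log') * np.logspace(0, 10, 30)
X_prep = StandardScaler().fit_transform(X)

# saga handles the intercept internally and leaves it unpenalized,
# unlike liblinear's synthetic-feature trick.
clf = linear_model.LogisticRegression(penalty='l1', solver='saga',
                                      tol=1e-6, max_iter=int(1e5),
                                      warm_start=True)
coefs_ = []
for c in cs:
    clf.set_params(C=c)
    clf.fit(X_prep, y)
    coefs_.append(clf.coef_.ravel().copy())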

@agramfort
Member

agramfort commented Aug 27, 2020 via email

@glemaitre
Member

A next step might be to find two random_state values, one failing and one working, to be able to debug this and find at which stage things go wrong.

@mathurinm
Contributor Author

mathurinm commented Aug 27, 2020

random_state=0 fails, random_state=6 converges:

from sklearn.svm import l1_min_c
from sklearn import datasets
from sklearn import linear_model
from sklearn.preprocessing import StandardScaler
import numpy as np

iris = datasets.load_iris()
X = iris.data
y = iris.target
X = X[y != 2]
y = y[y != 2]
cs = l1_min_c(X, y, loss='log') * np.logspace(0, 10, 30)
X_prep = StandardScaler().fit_transform(X)
clf = linear_model.LogisticRegression(penalty='l1', solver='liblinear',
                                      tol=1e-6, max_iter=int(1e6),
                                      warm_start=True,
                                      intercept_scaling=10000.,
                                      random_state=6)
coefs_ = []
intercepts_ = []
for c in cs:
    clf.set_params(C=c)
    clf.fit(X_prep, y)
    coefs_.append(clf.coef_.ravel().copy())
    intercepts_.append(clf.intercept_)
    print(clf.coef_.ravel(), clf.intercept_)

@glemaitre added the Bug, help wanted, and Moderate labels and removed the Bug: triage label on Aug 27, 2020
@glemaitre
Member

Great, now we need somebody motivated to get into debug mode :)

@Rick-Mackenbach
Contributor

I'll work on this, curious to see what is happening here.

@TomDLT
Member

TomDLT commented Dec 19, 2022

See proposed fix in #25214

Another possibility would be to fix LIBLINEAR to avoid regularizing the intercept, backporting the fix from cjlin1/liblinear@f68d25c.
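
For reference, my summary of the difference between the two formulations, with s = intercept_scaling and b = s * v the reported intercept: liblinear currently solves

    min over (w, v) of  ||w||_1 + |v| + C * sum_i log(1 + exp(-y_i * (w @ x_i + s * v)))

whereas the backported fix would leave the intercept out of the penalty:

    min over (w, b) of  ||w||_1 + C * sum_i log(1 + exp(-y_i * (w @ x_i + b)))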
