BUG: liblinear stuck on iris after centering the data #18264
I just realised that iris is linearly separable: in that case, solvers for the unregularized logreg problem do not converge but diverge to infinity. My guess is that when the intercept scaling is high, liblinear gets stuck pushing its value to infinity. It is still weird that this does not happen when the data is not centered.
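As a quick sanity check (a sketch added for illustration, not part of the original comment), the two remaining iris classes can be separated by a threshold on a single feature, which is what makes the unregularized log-loss unbounded:

import numpy as np
from sklearn import datasets

iris = datasets.load_iris()
X, y = iris.data, iris.target
X, y = X[y != 2], y[y != 2]
# Petal length (column 2): every class-0 sample lies below every class-1 sample,
# so any threshold between the two groups separates the data perfectly.
print(X[y == 0, 2].max(), "<", X[y == 1, 2].min())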
With current master, this code gives me reasonable results:
from sklearn.svm import l1_min_c
from sklearn import datasets
from sklearn import linear_model
from sklearn.preprocessing import StandardScaler
import matplotlib.pyplot as plt
import numpy as np
from time import time
print(__doc__)
# Author: Alexandre Gramfort <alexandre.gramfort@inria.fr>
# License: BSD 3 clause
iris = datasets.load_iris()
X = iris.data
y = iris.target
X = X[y != 2]
y = y[y != 2]
# #############################################################################
# Demo path functions
cs = l1_min_c(X, y, loss='log') * np.logspace(0, 10, 30)
X = StandardScaler().fit_transform(X)
print("Computing regularization path ...")
start = time()
clf = linear_model.LogisticRegression(penalty='l1', solver='liblinear',
                                      tol=1e-6, max_iter=int(1e6),
                                      warm_start=True,
                                      intercept_scaling=10000.)
coefs_ = []
for c in cs:
    clf.set_params(C=c)
    clf.fit(X, y)
    coefs_.append(clf.coef_.ravel().copy())
    print(clf.coef_.ravel(), clf.intercept_)
plt.figure()
plt.plot(np.log10(cs), np.array(coefs_))
plt.show()
On my machine with the latest pull from master it worked once, but I cannot reproduce it: it gets stuck after two values of C. Can I give more debugging/reproducing info?
Indeed I can replicate the problem. It fails now. It only worked once on my machine.
Well spotted. I suggest we stop using LIBLINEAR and start using SAGA: we should not use such a large intercept scaling in LIBLINEAR. Intercept scaling is a dirty hack which scales the intercept feature so differently from the rest of the features that it slows down convergence a lot (see e.g. #17557 (comment)).
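For reference, a minimal sketch of the SAGA alternative suggested above (the parameter values are illustrative, carried over from the script earlier in the thread rather than taken from this comment):

from sklearn import linear_model

# SAGA supports the L1 penalty and fits an unpenalized intercept directly,
# so the intercept_scaling workaround used with liblinear is not needed here.
clf = linear_model.LogisticRegression(penalty='l1', solver='saga',
                                      tol=1e-6, max_iter=int(1e6),
                                      warm_start=True)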
It's not that simple to me. Coordinate descent will outperform SAGA if n_samples << n_features, and SAGA suffers when you have non-standardized features. liblinear is a very efficient solver that I personally use quite systematically for logreg when I teach. The fact that it works once and then not suggests more a memory management problem in the C++ code.
This might be to find 2
With random_state=0 it fails, with random_state=6 it converges:
from sklearn.svm import l1_min_c
from sklearn import datasets
from sklearn import linear_model
from sklearn.preprocessing import StandardScaler
import matplotlib.pyplot as plt
import numpy as np
from time import time
iris = datasets.load_iris()
X = iris.data
y = iris.target
X = X[y != 2]
y = y[y != 2]
cs = l1_min_c(X, y, loss='log') * np.logspace(0, 10, 30)
X_prep = StandardScaler().fit_transform(X)
clf = linear_model.LogisticRegression(penalty='l1', solver='liblinear',
                                      tol=1e-6, max_iter=int(1e6),
                                      warm_start=True,
                                      intercept_scaling=10000.,
                                      random_state=6)
coefs_ = []
intercepts_ = []
for c in cs:
    clf.set_params(C=c)
    clf.fit(X_prep, y)
    coefs_.append(clf.coef_.ravel().copy())
    intercepts_.append(clf.intercept_)
    print(clf.coef_.ravel(), clf.intercept_)
Great, now we need somebody motivated to get into debug mode :)
I'll work on this, curious to see what is happening here.
See proposed fix in #25214. Another possibility would be to fix LIBLINEAR to avoid regularizing the intercept, backporting the fix from cjlin1/liblinear@f68d25c.
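For context, what "not regularizing the intercept" means here, written in the standard L1-regularized logistic regression form (the generic formulation, not a quote of the liblinear internals):

    min over w, b of  ||w||_1 + C * sum_i log(1 + exp(-y_i * (w^T x_i + b)))

The intercept_scaling workaround instead appends a constant feature of value intercept_scaling to X and puts the corresponding weight inside ||w||_1, so the intercept is also (weakly) penalized and lives on a very different scale than the standardized features.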
Describe the bug
The example on sparse logistic regression surprised me because even at extremely low regularization (large C), only one feature enters the model:

Investigation led me to check whether applying StandardScaler to X changed the graph. When X is preprocessed this way, the liblinear solver does not seem to converge.
Steps/Code to Reproduce
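The original reproduction script is not preserved in this extract; the following condensed sketch is assumed to match the script discussed in the comments above:

from sklearn.svm import l1_min_c
from sklearn import datasets, linear_model
from sklearn.preprocessing import StandardScaler
import numpy as np

iris = datasets.load_iris()
X, y = iris.data[iris.target != 2], iris.target[iris.target != 2]
cs = l1_min_c(X, y, loss='log') * np.logspace(0, 10, 30)
X = StandardScaler().fit_transform(X)  # removing this line avoids the hang

clf = linear_model.LogisticRegression(penalty='l1', solver='liblinear',
                                      tol=1e-6, max_iter=int(1e6),
                                      warm_start=True, intercept_scaling=10000.)
for c in cs:
    clf.set_params(C=c)
    clf.fit(X, y)  # gets stuck after the first value of C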
Expected Results
The code should run fast (iris has only 4 features).
Actual Results
The code gets stuck after the first regularization parameter.
I guess something is happening with the scaled intercept column that is added, because liblinear does not fit an unregularized intercept.
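Roughly, what intercept_scaling does (a sketch based on the documented behaviour, not the actual C++ code):

import numpy as np

# liblinear appends a constant column equal to intercept_scaling to the design
# matrix and regularizes its weight together with the others; the reported
# intercept is intercept_scaling * w_extra.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 4))  # stand-in for the standardized iris data
intercept_scaling = 10000.0
X_aug = np.hstack([X, np.full((X.shape[0], 1), intercept_scaling)])
# With a large intercept_scaling the penalty on the intercept becomes negligible,
# but the extra column's scale is wildly different from the standardized features,
# which is the suspected source of the convergence problem.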

Going to a lower intercept scaling (intercept_scaling=1) gives me a much more reasonable graph, where I still suspect numerical errors to be responsible for the bumps when C becomes large.
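For reference, the only change relative to the script above would presumably be the intercept_scaling parameter:

from sklearn import linear_model

# Same estimator, but with the default intercept scaling of 1.
clf = linear_model.LogisticRegression(penalty='l1', solver='liblinear',
                                      tol=1e-6, max_iter=int(1e6),
                                      warm_start=True, intercept_scaling=1.)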
Versions
Happy to help if it's a known issue.
@agramfort