-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
FIX: make LinearRegression perfectly consistent across sparse or dense #13279
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FIX: make LinearRegression perfectly consistent across sparse or dense #13279
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You probably want to add an entry in what's new
clf_dense.fit(X, y) | ||
clf_sparse.fit(Xcsr, y) | ||
assert_almost_equal(clf_dense.intercept_, clf_sparse.intercept_) | ||
assert_array_almost_equal(clf_dense.coef_, clf_sparse.coef_) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assert_array_almost_equal(clf_dense.coef_, clf_sparse.coef_) | |
assert_allclose(clf_dense.coef_, clf_sparse.coef_) |
clf_sparse = LinearRegression(**params) | ||
clf_dense.fit(X, y) | ||
clf_sparse.fit(Xcsr, y) | ||
assert_almost_equal(clf_dense.intercept_, clf_sparse.intercept_) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assert_almost_equal(clf_dense.intercept_, clf_sparse.intercept_) | |
assert clf_dense.intercept_ == pytest.approx(clf_sparse.intercept_) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Otherwise LGTM
doc/whats_new/v0.21.rst
Outdated
@@ -174,6 +174,10 @@ Support for Python 3.4 and below has been officially dropped. | |||
parameter value ``copy_X=True`` in ``fit``. | |||
:issue:`12972` by :user:`Lucio Fernandez-Arjona <luk-f-a>` | |||
|
|||
- |Fix| Fixed a bug in :class:`linear_model.LinearRegression` that | |||
was not returning the same coeffecient and intercepts with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is missing mention of sparse/dense
sklearn/linear_model/base.py
Outdated
def matvec(b): | ||
return X.dot(b) - b.dot(X_offset_scale) | ||
def rmatvec(b): | ||
return X.T.dot(b) - (X_offset_scale) * np.sum(b) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
redundant parentheses
|
|
||
X_centered = sparse.linalg.LinearOperator(shape=X.shape, | ||
matvec=matvec, | ||
rmatvec=rmatvec) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very elegant!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Beautiful solution. +1 for merge.
Merging.
…e the fit_intercept=False that should not be needed since scikit-learn#13279 is merged
When 0-indented you require two blank lines, but elsewhere one.
|
scikit-learn#13279) * FIX : make LinearRegression perfectly consistent across sparse or dense * comments * review
… or dense (scikit-learn#13279)" This reverts commit 41f106c.
… or dense (scikit-learn#13279)" This reverts commit 41f106c.
scikit-learn#13279) * FIX : make LinearRegression perfectly consistent across sparse or dense * comments * review
due to non centering of X when sparse, LinearRegression has never been 100% the same as the dense solver. This now fixes this.
cc @amueller