FIX: make LinearRegression perfectly consistent across sparse or dense #13279

Merged

Conversation

agramfort
Member

Because X is not centered when it is sparse, LinearRegression has never given exactly the same results on sparse input as on dense input. This PR fixes that.
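
To illustrate why centering matters, here is a minimal NumPy sketch (not code from this PR). It assumes the usual identity that fitting an intercept is equivalent to centering X and y before solving least squares, and it shows that skipping the centering of X gives different coefficients. The data, the seed, and all variable names are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
# Features with a non-zero mean, so that centering has a visible effect.
X = rng.normal(loc=2.0, size=(30, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 4.0 + 0.01 * rng.normal(size=30)

# Reference: fit the intercept explicitly through a column of ones.
X_aug = np.hstack([X, np.ones((30, 1))])
sol, *_ = np.linalg.lstsq(X_aug, y, rcond=None)
coef_ref, intercept_ref = sol[:-1], sol[-1]

# Equivalent: center X and y, solve, then recover the intercept.
X_offset, y_offset = X.mean(axis=0), y.mean()
coef_centered, *_ = np.linalg.lstsq(X - X_offset, y - y_offset, rcond=None)
intercept_centered = y_offset - X_offset @ coef_centered

# Not equivalent: y is centered but X is left as-is.
coef_uncentered, *_ = np.linalg.lstsq(X, y - y_offset, rcond=None)

print(np.allclose(coef_ref, coef_centered))
print(np.allclose(coef_ref, coef_uncentered))
```

Centering a sparse matrix explicitly would turn its zeros into non-zeros, which is presumably why the sparse code path avoided it.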

cc @amueller

Member

@glemaitre glemaitre left a comment

You probably want to add an entry in what's new

clf_dense.fit(X, y)
clf_sparse.fit(Xcsr, y)
assert_almost_equal(clf_dense.intercept_, clf_sparse.intercept_)
assert_array_almost_equal(clf_dense.coef_, clf_sparse.coef_)

Suggested change
- assert_array_almost_equal(clf_dense.coef_, clf_sparse.coef_)
+ assert_allclose(clf_dense.coef_, clf_sparse.coef_)
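
As a rough illustration of what this suggestion changes: `assert_array_almost_equal` compares with an absolute tolerance (about 1.5e-6 with its default `decimal=6`), while `assert_allclose` defaults to a relative tolerance (`rtol=1e-7`, `atol=0`). The sketch below uses made-up arrays of very small values, where the two checks disagree.

```python
import numpy as np
from numpy.testing import assert_allclose, assert_array_almost_equal

# Hypothetical values: tiny in absolute terms, but off by a factor of two.
a = np.array([1e-10, 2e-10])
b = np.array([2e-10, 4e-10])

# Passes: every absolute difference is far below 1.5e-6.
assert_array_almost_equal(a, b)

# Fails: the relative difference is 50%, far above rtol=1e-7.
try:
    assert_allclose(a, b)
    allclose_passed = True
except AssertionError:
    allclose_passed = False

print(allclose_passed)
```

For coefficients of very different magnitudes, the relative check is the stricter and usually the more meaningful of the two.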

clf_sparse = LinearRegression(**params)
clf_dense.fit(X, y)
clf_sparse.fit(Xcsr, y)
assert_almost_equal(clf_dense.intercept_, clf_sparse.intercept_)

Suggested change
- assert_almost_equal(clf_dense.intercept_, clf_sparse.intercept_)
+ assert clf_dense.intercept_ == pytest.approx(clf_sparse.intercept_)
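
For readers unfamiliar with `pytest.approx`, a small sketch of how it behaves on scalars. The two floats are stand-ins for the dense and sparse intercepts; they are not values from this PR.

```python
import pytest

# Hypothetical stand-ins for clf_dense.intercept_ and clf_sparse.intercept_.
dense_intercept = 0.1 + 0.2
sparse_intercept = 0.3

# Exact float comparison fails because of rounding in the last bit.
exact_equal = dense_intercept == sparse_intercept

# pytest.approx compares within a tolerance (relative 1e-6 by default).
approx_equal = dense_intercept == pytest.approx(sparse_intercept)

print(exact_equal, approx_equal)
```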

@glemaitre glemaitre changed the title from "FIX : make LinearRegression perfectly consistent across sparse or dense" to "FIX: make LinearRegression perfectly consistent across sparse or dense" on Feb 26, 2019
@glemaitre glemaitre self-requested a review February 26, 2019 16:32
Member

@jnothman jnothman left a comment

Otherwise LGTM

@@ -174,6 +174,10 @@ Support for Python 3.4 and below has been officially dropped.
parameter value ``copy_X=True`` in ``fit``.
:issue:`12972` by :user:`Lucio Fernandez-Arjona <luk-f-a>`

- |Fix| Fixed a bug in :class:`linear_model.LinearRegression` that
was not returning the same coefficients and intercepts with

I think this is missing mention of sparse/dense

def matvec(b):
    return X.dot(b) - b.dot(X_offset_scale)

def rmatvec(b):
    return X.T.dot(b) - (X_offset_scale) * np.sum(b)

redundant parentheses

Member

glemaitre commented Feb 26, 2019

We should almost have a common test. Wrong PR.


X_centered = sparse.linalg.LinearOperator(shape=X.shape,
                                          matvec=matvec,
                                          rmatvec=rmatvec)
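
The implicit-centering idea in the snippets above can be sketched as a standalone script. This is an illustration under stated assumptions, not the scikit-learn implementation: it assumes no feature scaling (so `X_offset` is the plain column mean, where the PR uses `X_offset_scale`), it assumes `scipy.sparse.linalg.lsqr` as the solver, and the data and tolerances are made up.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import LinearOperator, lsqr

rng = np.random.default_rng(0)
X_dense = rng.normal(loc=1.0, size=(50, 5))
X_dense[rng.random((50, 5)) < 0.7] = 0.0  # make the data mostly zeros
X = sparse.csr_matrix(X_dense)
w_true = np.array([1.0, -2.0, 0.5, 3.0, 0.0])
y = X_dense @ w_true + 4.0 + 0.01 * rng.normal(size=50)

# Assumption: no scaling, so the offset is just the column mean.
X_offset = np.asarray(X.mean(axis=0)).ravel()
y_offset = y.mean()

def matvec(b):
    # Computes (X - 1 X_offset^T) @ b without densifying X.
    return X.dot(b) - b.dot(X_offset)

def rmatvec(b):
    # Computes (X - 1 X_offset^T).T @ b without densifying X.
    return X.T.dot(b) - X_offset * np.sum(b)

X_centered = LinearOperator(shape=X.shape, matvec=matvec,
                            rmatvec=rmatvec, dtype=X.dtype)

# Solve least squares on the implicitly centered sparse matrix.
coef_sparse = lsqr(X_centered, y - y_offset,
                   atol=1e-12, btol=1e-12, iter_lim=100)[0]
intercept_sparse = y_offset - X_offset @ coef_sparse

# Dense reference: center explicitly and solve.
coef_dense, *_ = np.linalg.lstsq(X_dense - X_offset, y - y_offset, rcond=None)
intercept_dense = y_offset - X_offset @ coef_dense

print(np.allclose(coef_sparse, coef_dense, rtol=1e-6, atol=1e-8))
```

The operator only ever applies sparse products plus a rank-one correction, so the centered matrix is never materialized and the sparsity of X is preserved.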

Very elegant!

Member

@GaelVaroquaux GaelVaroquaux left a comment

Beautiful solution. +1 for merge.

Merging.

@GaelVaroquaux GaelVaroquaux merged commit 66899ed into scikit-learn:master Feb 27, 2019
wdevazelhes pushed a commit to wdevazelhes/scikit-learn that referenced this pull request Feb 27, 2019
…e the fit_intercept=False that should not be needed since scikit-learn#13279 is merged
Member

jnothman commented Feb 28, 2019 via email

xhluca pushed a commit to xhluca/scikit-learn that referenced this pull request Apr 28, 2019
scikit-learn#13279)

* FIX : make LinearRegression perfectly consistent across sparse or dense

* comments

* review
xhluca pushed a commit to xhluca/scikit-learn that referenced this pull request Apr 28, 2019
xhluca pushed a commit to xhluca/scikit-learn that referenced this pull request Apr 28, 2019
koenvandevelde pushed a commit to koenvandevelde/scikit-learn that referenced this pull request Jul 12, 2019
scikit-learn#13279)

* FIX : make LinearRegression perfectly consistent across sparse or dense

* comments

* review
5 participants