ENH add sample_weight to sparse coordinate descent #22808
Conversation
@agramfort @TomDLT @rth You might be interested.
maybe @mathurinm has time to look
LGTM ! thx @lorentzenchr
just to check: did you observe any slowdown in the no-sample-weight case due to the extra branching now present in the new code?
```python
from sklearn.linear_model import ElasticNet
from sklearn.linear_model.tests.test_sparse_coordinate_descent import make_sparse_data

X, y = make_sparse_data(n_samples=1000, n_features=10_000)
%timeit ElasticNet().fit(X, y)
```
Results with …
Note that I noticed quite some variation in those timings.
@lorentzenchr beware that with such data and parameters the solver has very little work to do, so computations like the Lipschitz constants (norm(X, axis=0) ** 2) have a heavy weight in the total time.
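For context, a minimal sketch of the setup cost in question, assuming the standard coordinate descent Lipschitz constants (squared column norms of `X`); this is illustrative, not the solver's actual Cython code:

```python
import numpy as np
from scipy import sparse

# Sparse design matrix comparable in shape to the benchmark above.
X = sparse.random(1000, 10_000, density=0.01, format="csc", random_state=0)

# Squared column norms, i.e. norm(X, axis=0) ** 2: one full pass over all
# stored entries of X, paid up front no matter how few descent iterations
# follow. On an easy problem this pass can dominate the total fit time.
lipschitz = np.asarray(X.multiply(X).sum(axis=0)).ravel()
```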
@mathurinm good point.

```python
from sklearn.linear_model import ElasticNet
from sklearn.linear_model.tests.test_sparse_coordinate_descent import make_sparse_data

X, y = make_sparse_data(n_samples=1000, n_features=10_000, n_informative=1000)
%timeit ElasticNet(alpha=0.01).fit(X, y)
```
Gives …
perfect ! thx @lorentzenchr just need another +1 to MRG this one. 🙏
Overall looks good. I can't really review the `sparse_enet_coordinate_descent` details (besides trying to check that the code does what you wrote in the comments), but I trust the tests and @agramfort :). And it does not modify the no-sample-weight case, so no big risk.
Issue #3702 was created in 2014. We are on schedule 😄
@lorentzenchr after some thought, I don't understand why in the sparse case the sample weights are passed down to the solver rather than rescaling by sqrt(sw). Is the current design adopted in order to have the same code to compute the gradient, with or without sample weights?
Reference Issues/PRs
Fixes #3702 by implementing the final bits.
What does this implement/fix? Explain your changes.
This PR implements `sample_weight` support with sparse `X` for `ElasticNet`, `ElasticNetCV`, `Lasso` and `LassoCV`.
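A short usage sketch of what this enables (hypothetical data; `alpha=0.1` is arbitrary), fitting on sparse input with per-sample weights:

```python
import numpy as np
from scipy import sparse
from sklearn.linear_model import ElasticNet

rng = np.random.default_rng(0)
X = sparse.random(200, 50, density=0.1, format="csr", random_state=0)
y = rng.standard_normal(200)
sw = rng.uniform(0.5, 2.0, size=200)

# With this PR, sample_weight is accepted together with sparse X.
reg = ElasticNet(alpha=0.1).fit(X, y, sample_weight=sw)
```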
Details

The objective with sample weight `sw` is given by

$$\min_{w_0, w}\; \frac{1}{2 \sum_i sw_i} \sum_i sw_i \left(y_i - w_0 - x_i^\top w\right)^2 + \alpha \rho \lVert w \rVert_1 + \frac{\alpha (1 - \rho)}{2} \lVert w \rVert_2^2,$$

with `rho = l1_ratio`. Solving for the intercept `w0` gives

$$w_0 = \operatorname{mean}(y) - \operatorname{mean}(X)^\top w,$$

where the mean is a weighted average, weighted by `sw`. Dense solvers go on and rescale `y1 = y - y_mean` and `X1 = X - X_mean` by `sqrt(sw)`, but sparse solvers cannot set `X1 = X - X_mean` as this destroys the sparsity of `X`. Therefore, `X_mean` is passed to the coordinate descent solver. This PR goes on and also passes the sample weights to the cd solver. The alternative would be to provide `sw * X_mean`, which is a dense matrix of the same dimensions as `X`, not a good idea.
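A hedged sketch of the two preprocessing paths described above (variable names follow the text; the code is illustrative, not the actual scikit-learn internals):

```python
import numpy as np
from scipy import sparse

rng = np.random.default_rng(0)
X = sparse.random(100, 20, density=0.1, format="csc", random_state=0)
y = rng.standard_normal(100)
sw = rng.uniform(0.5, 2.0, size=100)
sw = sw / sw.sum()  # normalize so weighted means are plain dot products

y_mean = sw @ y                         # weighted average of y
X_mean = np.asarray(X.T @ sw).ravel()   # weighted column means of X

# Dense path: center, then rescale rows by sqrt(sw).
X1 = (X.toarray() - X_mean) * np.sqrt(sw)[:, None]
y1 = (y - y_mean) * np.sqrt(sw)

# Sparse path: X - X_mean would be dense, so X is left untouched and
# (X_mean, sw) are handed to the coordinate descent solver instead.
```

Passing `sw` and `X_mean` separately keeps extra memory linear in `n_samples + n_features`, whereas materializing the outer product `sw * X_mean` would cost `n_samples * n_features`, as noted above.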