Benchmark bug in SGDRegressor.fit on sparse data #26095

Closed
ogrisel opened this issue Apr 5, 2023 · 16 comments · Fixed by #26146


ogrisel (Member) commented Apr 5, 2023

There is a 10x increase in fit duration as reported here:

https://scikit-learn.org/scikit-learn-benchmarks/#linear_model.SGDRegressorBenchmark.time_fit?commits=e6b46675-b4afbeee&p-representation='sparse'

This happened between e6b4667 (still fast) and b4afbee (slow).

Note that the same estimator with dense data was not impacted.

I have not investigated the cause myself. Just spotted it when reviewing the benchmarks.

Looking at the commit messages in the output of git log e6b46675..b4afbeee, it could be the case that 1a669d8 introduced the regression, but this needs confirmation.

EDIT: this is actually a bug in the design of the benchmark rather than a performance regression. See the discussion below.

ogrisel (Member, Author) commented Apr 5, 2023

/cc @OmarManzoor.

OmarManzoor (Contributor) commented:

Hi @ogrisel,
I was having a look at this and it seems that the benchmark uses a dtype of float32 by default. Is it possible that with float32 more iterations are taken to converge to a similar result as with float64?

OmarManzoor (Contributor) commented:

> Hi @ogrisel, I was having a look at this and it seems that the benchmark uses a dtype of float32 by default. Is it possible that with float32 more iterations are taken to converge to a similar result as with float64?

But then the dense case is not impacted, and that uses float32 as well. So this might not be the case.

ogrisel (Member, Author) commented Apr 6, 2023

Indeed it could be. Can you check the final n_iter_ value in both cases?

OmarManzoor (Contributor) commented Apr 6, 2023

I removed the caching part from _synth_regression_sparse_dataset and used about 10 iterations to account for randomness. This is the code I used: https://gist.github.com/OmarManzoor/ba85c684991ba132fbf544f0433c2bfa

Sparse, float64 - Total number of iterations: [85, 84, 57, 76, 84, 65, 96, 81, 81, 93]
Sparse, float32 - Total number of iterations: [1000, 960, 1000, 1000, 835, 1000, 839, 1000, 1000, 1000]
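The dtype effect can be reproduced with a rough sketch along these lines (this is not the gist above; the dataset shape, density, and seeds here are arbitrary assumptions, not the benchmark's _synth_regression_sparse_dataset):

```python
import numpy as np
import scipy.sparse as sp
from sklearn.linear_model import SGDRegressor

def n_iter_for(dtype, seed=0):
    # Hypothetical stand-in for the benchmark's sparse dataset:
    # 1000 samples, 500 features, ~1% non-zeros.
    rng = np.random.RandomState(seed)
    X = sp.random(1000, 500, density=0.01, format="csr",
                  random_state=rng, dtype=dtype)
    y = rng.randn(1000).astype(dtype)
    # Same estimator settings as the benchmark.
    est = SGDRegressor(max_iter=1000, tol=1e-16, random_state=0)
    est.fit(X, y)
    return est.n_iter_

print("float64:", n_iter_for(np.float64))
print("float32:", n_iter_for(np.float32))
```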

glemaitre (Member) commented:

Ouch. This does not look good. I didn't look at the script yet, but it shows that we never converge most of the time. It might be interesting to have a look at the loss over time.

glemaitre (Member) commented:

Oh, I see that you put a tolerance tol=1e-16. This cannot work with 32 bits because the smallest tolerance would be around 1e-7.

glemaitre (Member) commented:

OK I think the issue is our benchmark here:

estimator = SGDRegressor(max_iter=1000, tol=1e-16, random_state=0)

The tolerance is too small for the 32 bits implementation. I don't know if tol=1e-7 would be a good default there.
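The ~1e-7 figure corresponds to float32 machine epsilon. A quick check (NumPy used for illustration):

```python
import numpy as np

# float32 machine epsilon: relative spacing between consecutive
# representable values around 1.0, about 1.19e-7 (i.e. 2**-23).
eps32 = np.finfo(np.float32).eps
print(eps32)

# Around a score of 1.0, an increment well below eps32 is rounded
# away, while one on the order of eps32 survives:
print(np.float32(1.0) + np.float32(5e-8) == np.float32(1.0))  # True: absorbed
print(np.float32(1.0) + np.float32(1e-7) == np.float32(1.0))  # False: survives
```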

OmarManzoor (Contributor) commented:

> OK I think the issue is our benchmark here:
>
> estimator = SGDRegressor(max_iter=1000, tol=1e-16, random_state=0)
>
> The tolerance is too small for the 32 bits implementation. I don't know if tol=1e-7 would be a good default there.

I see. So should we change this value for the 32 bit case?

glemaitre (Member) commented:

I would ask for advice from @jeremiedbb and @ogrisel to know what they think. I assume that 1e-16 was to get a model that converged. So I don't know if we should have a tolerance that depends on the dtype or a common tolerance that works for both dtypes.

OmarManzoor (Contributor) commented:

> I would ask for advice from @jeremiedbb and @ogrisel to know what they think. I assume that 1e-16 was to get a model that converged. So I don't know if we should have a tolerance that depends on the dtype or a common tolerance that works for both dtypes.

Understood.

ogrisel (Member, Author) commented Apr 6, 2023

Thanks for the investigation @OmarManzoor and @glemaitre!

I think we could set tol to 1e-7 for both dtypes. WDYT @jeremiedbb?

@ogrisel ogrisel changed the title Performance regression on SGDRegressor.fit on sparse data Benchmark bug in SGDRegressor.fit on sparse data Apr 6, 2023
ogrisel (Member, Author) commented Apr 6, 2023

I wonder why it works with a low tol value for dense data though.

OmarManzoor (Contributor) commented:

> I wonder why it works with a low tol value for dense data though.

Could it have to do with the size of the data? The dense dataset only has 200 features.

jeremiedbb (Member) commented:

> Oh, I see that you put a tolerance tol=1e-16. This cannot work with 32 bits because the smallest tolerance would be around 1e-7.

There's a confusion here. 1e-16 is representable in float32, and you can check that a number is smaller than that. Machine precision represents the maximum relative difference between two consecutive representable numbers.

If the stopping criterion of SGD were if (new_score / prev_score) < tol, 1e-16 would be fine. However, the current stopping criterion is if score < best_score + tol. In that case, as long as tol < best_score * float32.eps, the tol is rounded out and the expression becomes if score < best_score.
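The distinction can be checked directly (NumPy float32 used for illustration; best_score is a hypothetical running best objective value):

```python
import numpy as np

tol = np.float32(1e-16)
print(tol > 0)  # True: 1e-16 is perfectly representable in float32

best_score = np.float32(2.5)  # hypothetical running best objective value
# tol < best_score * eps(float32) (~3e-7 here), so it is rounded away in
# the sum and the criterion `score < best_score + tol` degenerates to
# `score < best_score`:
print(best_score + tol == best_score)  # True
```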

In the benchmarks the goal is to always perform the max_iter iterations, so setting tol=None would probably be more appropriate.

ogrisel (Member, Author) commented Apr 6, 2023

And catch ConvergenceWarning and assert that model.n_iter_ == model.max_iter after calling fit, to make that explicit.
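That pattern might look roughly like the sketch below. The dataset and settings are arbitrary assumptions; a small finite tol and a tiny max_iter are used so the fit is guaranteed to hit the iteration cap and emit the warning:

```python
import warnings
import numpy as np
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import SGDRegressor

rng = np.random.RandomState(0)
X = rng.randn(200, 10)
y = X @ rng.randn(10)

# max_iter deliberately tiny so the run cannot stop early: SGD needs
# n_iter_no_change (default 5) epochs without improvement to stop.
model = SGDRegressor(max_iter=5, tol=1e-3, random_state=0)
with warnings.catch_warnings():
    # Silence the expected "maximum number of iterations reached" warning...
    warnings.simplefilter("ignore", category=ConvergenceWarning)
    model.fit(X, y)

# ...and assert non-convergence explicitly instead.
assert model.n_iter_ == model.max_iter
```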
