DOC add FAQ entry for the many linear model classes #19861

Merged (10 commits) on Apr 16, 2021

Conversation

lorentzenchr (Member)

Reference Issues/PRs

See discussion of the dev meeting https://github.com/scikit-learn/administrative/blob/master/meeting_notes/2021-03-29.md.

What does this implement/fix? Explain your changes.

This PR adds a FAQ entry that explains why there are so many linear model classes.

@cmarmo (Contributor) left a comment

Thanks @lorentzenchr !

Co-authored-by: Chiara Marmo <cmarmo@users.noreply.github.com>
@NicolasHug (Member) left a comment

Thanks @lorentzenchr , I think it's great that we start addressing this common question upfront.

However, the current version reads like a justification from a developer's point of view, not from a user's perspective. Just by reading it, it seems that it would still be perfectly possible to unify all of this into a single class.

Also I think we'd need to address another important oddity of the linear_model module: the fact that the SGD estimators even exist, instead of being just another solver option of the aforementioned classes.

@lorentzenchr (Member, Author)

@NicolasHug Thanks for having a look and your feedback. I divided it into maintainer and user perspective. Is it better now?

@NicolasHug (Member)

I'm wondering: knowing what we know now, if we were to re-design the linear_model module from scratch, would we make the same choices again? My impression is that we definitely would not. And I think that for the sake of transparency, it's important to address and acknowledge that in the FAQ entry. If mistakes were made (if any, I don't know), it's OK to say so. Design is hard, things happen for historical reasons, and it's worth explaining that if we're writing such an entry.

For now I don't think the current text will make a reader say "Oh ok, that makes sense, I understand why everything is the way it is now." At least it doesn't have that effect on me (and probably not on them either :p)

I wasn't around when these decisions were made, so I'm not able to provide any answer here. Perhaps @GaelVaroquaux @ogrisel @agramfort @amueller @jnothman can provide some insights?

Also, again, I think it's relevant to include the SGD estimators in the conversation, since they're pretty much outsiders within the linear_model module.

@GaelVaroquaux (Member)

@NicolasHug breaking something that is vastly used to satisfy a sense of aesthetics is probably not a good idea.

Design is a difficult thing, and it is impossible to satisfy everybody (in particular on reddit). People approach the library with a background knowledge, in terms of names that they know and think that they understand. Our design was driven to answer this approach. A fraction of people have good-enough training in statistics and ML to understand that l1-penalized squared loss is a lasso, and hinge or squared hinge are SVMs. We certainly do not speak well to these people, but other libraries do (for instance lightning, http://contrib.scikit-learn.org/lightning/ ).
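(Editor's note: the equivalence mentioned above, "l1-penalized squared loss is a lasso", can be made concrete. A minimal sketch, not part of the PR:)

```python
# Minimal sketch (not part of the PR): "l1-penalized squared loss" is
# exactly the Lasso; ElasticNet with l1_ratio=1.0 reduces to it.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet, Lasso

X, y = make_regression(n_samples=200, n_features=10, noise=5.0, random_state=0)

lasso = Lasso(alpha=0.5).fit(X, y)
enet = ElasticNet(alpha=0.5, l1_ratio=1.0).fit(X, y)

# Same objective solved by the same coordinate-descent code:
# the coefficients coincide up to numerical precision.
max_diff = np.max(np.abs(lasso.coef_ - enet.coef_))
```

The named class exists for discoverability, not because it fits a different model.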

Separating the various classes also enables algorithmic optimizations, such as good choices for the solvers (arguably, we could maybe separate out the sparse logistic regression from the non-sparse one).

I wouldn't really revisit the separation of the various classes. Overall, it helps make the library approachable. I do think that better documenting the links is a good idea (thank you @lorentzenchr). Ideally, we should go through all the estimators to check the "see also" section of the docstring. We should also proofread our docs to check that the links are well specified.

Where I see vast room for improvement is in our online solvers, SGDClassifier/SGDRegressor. Not only do they not map well to the user's vocabulary, they are also certainly not "fire and forget".

I, for one, would welcome work to add MiniBatch versions of some of the linear model classes, with carefully-tuned algorithmics so that they are easily useful (such as early stopping). If we do this, after a few releases, we could maybe consider retiring SGDClassifier/SGDRegressor.
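(Editor's note: to illustrate the "not fire and forget" point, here is a sketch, not from the PR, of the out-of-core workflow with `partial_fit` on synthetic data. Batch size, number of passes, and scaling are all illustrative choices left to the user.)

```python
# Sketch (not part of the PR) of the out-of-core workflow that makes
# SGDRegressor "not fire and forget": the user handles feature scaling,
# batching and the number of passes explicitly.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import SGDRegressor
from sklearn.preprocessing import StandardScaler

X, y = make_regression(n_samples=1000, n_features=5, noise=1.0, random_state=0)

# SGD is sensitive to feature scale; standardizing is on the user.
X = StandardScaler().fit_transform(X)

sgd = SGDRegressor(penalty="l2", alpha=1e-4, random_state=0)
batch_size = 100
for epoch in range(5):                          # the user picks the passes...
    for start in range(0, len(X), batch_size):  # ...and the batching
        batch = slice(start, start + batch_size)
        sgd.partial_fit(X[batch], y[batch])

r2 = sgd.score(X, y)
```

A hypothetical MiniBatch class would fold the scaling sensitivity, stopping rule and batching into the estimator itself.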

@GaelVaroquaux (Member) left a comment

Overall, I think the explanations capture the big picture and are useful. Thank you!

I made a few minor comments, but to me, this is good to go.

doc/faq.rst Outdated

Why are there so many different estimators for linear models?
-------------------------------------------------------------
Usually, there is one classifier and one regressor per algorithm, e.g.
Member

You probably mean "model", rather than "algorithm". An algorithm eg is the type of solver used.

Member Author

How about "model type"?

For linear models, there are many estimator classes which are very close to
each other. Let us have a look at

- :class:`~linear_model.LinearRegression`, no penalty
Member

Should we say here that it is there only for teaching purpose. IMHO, it should never be used, and I would love to retire this class (though I do know that not everybody agrees with this desire).

Member Author

I think it is the wrong place to mention this. And I belong to the mentioned small group...

For linear models, there are many estimator classes which are very close to
each other. Let us have a look at

- :class:`~linear_model.LinearRegression`, no penalty
Member

Suggested change
- :class:`~linear_model.LinearRegression`, no penalty
- :class:`~linear_model.LinearRegression`, no penalty (for teaching purposes, prefer Ridge with a small penalty)

Member

This can be read as "prefer Ridge with a small penalty for teaching purposes", so maybe:

Suggested change
- :class:`~linear_model.LinearRegression`, no penalty
- :class:`~linear_model.LinearRegression`, no penalty. This estimator exists mostly for teaching purpose. In general Ridge with a small penalty is preferable.

Member Author

Again, I'm not sure that this is the right place.

Member

I agree with @lorentzenchr. I think the message to convey boils down to a listing of the linear models for regression. We can make it explicit beforehand that we are not stating which models to use but rather making a catalogue of the models.

Member

Now, considering that this is only an entry in the FAQ, I think it is even more relevant not to add usage information.

@agramfort (Member) left a comment

not much to add except this minor suggestion.

doc/faq.rst Outdated

**User perspective:**
If all four of the above-mentioned linear models were unified into a single class,
there would be parameters with a lot of options, like the ``solver`` parameter.
Member

Another argument is that it would be harder to discover and certainly less straightforward to teach.
Without a class LogisticRegression, I feel it would be harder for complete beginners to discover and use.

Member Author

I added this point, but with ridge instead of logistic as ridge is already mentioned.

@NicolasHug (Member) commented Apr 12, 2021

@NicolasHug breaking something that is vastly used to satisfy a sense of aesthetics is probably not a good idea.

@GaelVaroquaux I never suggested re-designing the linear_model module. I'm just suggesting to explain what led to the current state of things in the FAQ entry.

@lorentzenchr (Member, Author)

Also, again, I think it's relevant to include the SGD estimators in the conversation, since they're pretty much outsiders within the linear_model module.

@NicolasHug I added SGDRegressor.

@glemaitre (Member) left a comment

I think this is indeed an improvement. I added a small comment, but I would already be happy to see this in the documentation.

For linear models, there are many estimator classes which are very close to
each other. Let us have a look at

- :class:`~linear_model.LinearRegression`, no penalty
Member

I agree with @lorentzenchr. I think the message to convey boils down to a listing of the linear models for regression. We can make it explicit beforehand that we are not stating which models to use but rather making a catalogue of the models.

each other. Let us have a look at

- :class:`~linear_model.LinearRegression`, no penalty
- :class:`~linear_model.Ridge`, L2 penalty
Member

Do you think it would make sense to list RidgeCV, LassoCV, and ElasticNetCV (maybe together after SGD), mentioning that these are penalized models where the optimum penalty is found via CV?

Member Author

I thought about that while I was writing this PR and decided against it. I'd say that they form their own group: LinearRegression, Ridge, Lasso and ElasticNet all do one and the same thing, while RidgeCV, LassoCV and ElasticNetCV all do another.
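(Editor's note: the distinction between the two groups can be sketched. This is illustrative only, not from the PR; the alpha grid is an arbitrary assumption.)

```python
# Sketch (not part of the PR): Ridge fits for one fixed alpha, while
# RidgeCV additionally selects alpha by (efficient leave-one-out)
# cross-validation. Its stored model matches a plain Ridge refit at
# the selected alpha.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge, RidgeCV

X, y = make_regression(n_samples=200, n_features=10, noise=10.0, random_state=0)

alphas = [0.01, 0.1, 1.0, 10.0]  # arbitrary illustrative grid
ridge_cv = RidgeCV(alphas=alphas).fit(X, y)

# The refit model at the selected alpha coincides with plain Ridge.
ridge = Ridge(alpha=ridge_cv.alpha_).fit(X, y)
max_diff = np.max(np.abs(ridge.coef_ - ridge_cv.coef_))
```

So the CV variants are "search plus fit" wrappers around the same underlying models, which supports treating them as a separate group.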

Member

Not a big deal on my side, we can keep it as it is.

doc/faq.rst Outdated
estimator classes for different penalties.

**User perspective:**
First, the current design makes it easy to find the models by their well known
Member

I might be slightly more assertive, mentioning that these models are given specific names in the scientific literature. A design choice was to follow this trend, for better or worse depending on users' expectations.

Member Author

Do you have a concrete proposal or suggestion for improvement?

Member

Maybe:

First, the current design is inspired by the scientific literature, where linear models for regression settings with different regularizations/penalties were given different names. Having different models might make it easier for users to discover the right regressor to choose.

Member Author

I rephrased slightly.

@GaelVaroquaux (Member)

Let's include @glemaitre 's last suggestions (or not) and merge. I feel that the current state is already good and we are better off merging than waiting.

@glemaitre glemaitre merged commit e1f879e into scikit-learn:main Apr 16, 2021
@glemaitre (Member)

Thanks @lorentzenchr

@lorentzenchr lorentzenchr deleted the faq_linear_models branch April 16, 2021 20:47