-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
__repr__ not that helpful #6323
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
I like the idea of removing parameters with default values. I do like that In the notebook context you could register formatters (can't find a better link :-/) for estimators for "rich" displaying. |
@amueller |
Someone want to add a PR for removing the default values? I feel like having in there whether the model is fitted or not would be nice, but I'm not entirely sure how to achieve that. We currently have the I'm not sure how to incorporate the fit status into the repr while still providing a useful python expression. |
I've been wondering about custom HTML display code for a little while... On 8 October 2016 at 10:24, Andreas Mueller notifications@github.com
|
The drawback is that they are then harder to discover.
+1. It also matches the description of what repr and str should be It's good to keep in mind that repr of an estimator is what's going |
Actually I might take a stab at this this weekend (if I ever finish my talk). |
Hm ok maybe Additionally, each model can add their own stuff in |
I am a bit worried that the str is going to become very long. But I guess
that it's fine because that's not what's used in things like error
messages.
|
That's why I was thinking about making it optional and maybe off by default. |
That's why I was thinking about making it optional and maybe off by default. sklearn.set_print(verbose=True)?
Something like this sound about right. It's also consistent with things
in pandas and numpy.
|
True that it's consistent with pandas and numpy. On 9 October 2016 at 07:29, Gael Varoquaux notifications@github.com wrote:
|
I'd be interested in experimenting with a jupyter-friendly HTML rendering of estimators, with the benefit of that being more flexible:
After mocking that up, or even merging it, we can then see what is worth porting to repr strings via options and changing defaults. I might try to mock something up, but would welcome someone else's attempt as my time is very limited atm, and I'd like to focus on other features. Initially I envision something closely replicating the repr structure, but with:
|
that sounds interesting but also somewhat fragile. And as I said, we currently don't even have a way to see if something was fitted or not. I guess you could check if there are any trailing underscore attributes? |
Maybe this should be an enhancement proposal...
So I think our current
__repr__
is not that helpful.Most construction parameters are default parameters that are never seen by a user, so reporting them is basically noise.
We don't report other important things, though, like whether the model was fitted at all, or, like, what the training score is, or the training time.
I know R has a very different approach, and I'm not sure their approach is good. But I think our current approach is pretty suboptimal.
I think the
__repr__
has become more important because of the popularity of jupyter notebook. If I run fit, I get the__repr__
back. And it is just noise.A slight improvement might be to just print the construction parameters that are not set to the default value. But we could also think about something more helpful, and maybe something more model specific. GridSearchCV for example could report the best score, and the best parameters found etc.
I have been thinking about this for a while, but this is somewhat inspired by looking at #5299. Why does adding a faster solver change the
__repr__
ofPCA
? That seems really weird to me.PCA
is one of the simplest and most commonly used methods in ML, in particular in courses. Now people will constantly see something about tol and number of iterations that is probably not relevant for them at all.The text was updated successfully, but these errors were encountered: