[MRG+2] Timing and training score in GridSearchCV #7325
Conversation
It doesn't really make sense to weight it by test set size. Perhaps by training set size. I'd leave it unweighted.
+1.
Yes, I was asking about weighting it by the training set size only... Okay! Let's leave it unweighted...
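For context, the two options being weighed can be sketched as follows. This is an illustrative snippet, not the PR's code; the score values and split sizes are made up.

```python
import numpy as np

# Hypothetical per-split training scores and training-set sizes for one
# candidate parameter setting (illustrative values only).
split_train_scores = np.array([0.81, 0.79, 0.84])
train_sizes = np.array([90, 90, 95])

# Unweighted mean -- the behaviour the reviewers settled on.
unweighted = split_train_scores.mean()

# The rejected alternative: weight each split by its training-set size.
weighted = np.average(split_train_scores, weights=train_sizes)

print(unweighted, weighted)
```

With roughly equal split sizes (as in typical k-fold CV) the two differ very little, which is part of why leaving it unweighted is the simpler choice.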
Are you working on this?
Force-pushed from a7d8c32 to 28290a9
I've made the keys as …
Force-pushed from 28290a9 to 5d4813c
About the example, do we want one that compares the train and test scores and also plots the training time for each candidate? (Also ping @vene for suggestions)
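A minimal sketch of what such an example would pull out of the results dict (assuming the current scikit-learn API, where the attribute is named `cv_results_` and `return_train_score=True` must be passed; both details post-date this discussion):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# return_train_score=True exposes the train-score keys added by this PR.
search = GridSearchCV(SVC(), {"C": [0.1, 1, 10]}, cv=3,
                      return_train_score=True)
search.fit(X, y)

# One entry per candidate: train/test scores plus the new timing columns.
print(search.cv_results_["mean_train_score"])
print(search.cv_results_["mean_test_score"])
print(search.cv_results_["mean_fit_time"])   # seconds
```

A full example would then bar-plot `mean_fit_time` against the train/test score gap per candidate.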
@@ -746,6 +784,10 @@ class GridSearchCV(BaseSearchCV):
        FitFailedWarning is raised. This parameter does not affect the refit
        step, which will always raise the error.

    return_train_score: boolean, default=True
        If ``'False'``, the results_ attribute will not include training
cv_
space before colon
I'm not altogether happy that in …
I.e. that was @amueller's attempt at this PR's feature, never merged.
Otherwise, re: the example, I think an example of train scores explicitly introducing over/under-fitting would be interesting, but adding it to the wishlist for a later PR is fine.
I think if we want to… then EDIT: Or maybe not. I feel ``test`` still does not have a place here... We are timing fit + score (on the test set).
Apologies for the delay. I got busy configuring my laptop ^_^. And now I speculate that to be the reason... I tried running it on my Windows machine and found it not to be the case... I still don't have success installing scikit-learn on Windows ;( And if we follow the … In the next couple of hours I'll somehow figure out a way to install scikit-learn on Windows and see if this test fails...
Force-pushed from e4246dc to 07e1e34
I did … Anyway, @jnothman @amueller, WDYT we should do? Is the current test setup fine?
You can use a less trivial fit, then? But I'm okay with this being written …
Finally I succeeded in installing it on Windows. (Had to resort to a 32-bit Windows in VirtualBox.) Now I can confirm that it is the time precision that makes the test fail. This section of …
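The precision issue being described: on Windows, `time.time()` has historically ticked in coarse (~16 ms) increments, so timing a near-instant fit can legitimately yield exactly `0.0`. A small illustrative sketch (not the PR's test code):

```python
import time

def timed_trivial_fit():
    """Time a near-instant stand-in for a cheap estimator fit.

    On Windows, time.time() may advance in ~16 ms steps, so two calls in
    quick succession can return identical values and the difference is 0.0.
    """
    start = time.time()
    _ = sum(range(10))  # trivial work standing in for the fit
    return time.time() - start

fit_time = timed_trivial_fit()

# Hence the PR's test asserts non-negativity rather than strict positivity:
assert fit_time >= 0
```

Modern code would reach for `time.perf_counter()` for higher resolution, but the `>= 0` assertion sidesteps the platform difference entirely.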
Otherwise LGTM
'split0_train_score' : [0.8, 0.9, 0.7],
'split1_train_score' : [0.82, 0.5, 0.7],
'mean_train_score' : [0.81, 0.7, 0.7],
'std_train_score' : [0.00073, 0.00063, 0.00043],
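The aggregate keys in the docstring sample above are derived from the per-split arrays; a sketch of that relationship (the dict and values are illustrative, and the sample's `std_train_score` values are readability placeholders, not the actual std of these splits):

```python
import numpy as np

# Per-split arrays shaped like the docstring sample: one entry per candidate.
results = {
    "split0_train_score": np.array([0.8, 0.9, 0.7]),
    "split1_train_score": np.array([0.82, 0.5, 0.7]),
}

# Stack splits row-wise, then aggregate over splits (axis 0) per candidate.
split_scores = np.vstack([results["split0_train_score"],
                          results["split1_train_score"]])
results["mean_train_score"] = split_scores.mean(axis=0)
results["std_train_score"] = split_scores.std(axis=0)

print(results["mean_train_score"])  # matches the sample's [0.81, 0.7, 0.7]
```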
Can we make these numbers more readable? Pretend it's slower.
@@ -249,19 +258,22 @@ def _fit_and_score(estimator, X, y, scorer, train, test, verbose,

    else:
        test_score = _score(estimator, X_test, y_test, scorer)
        score_time = time.time() - start_time - fit_time
Ah ok, that makes more sense. Before, this was just ``time.time() - start_time`` or something like that, and I was really confused.
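The timing split under discussion can be sketched end-to-end like this. The estimator, scorer, and function name here are hypothetical stand-ins for illustration, not the actual `_fit_and_score` implementation:

```python
import time

class MeanPredictor:
    """Tiny stand-in estimator (hypothetical, for illustration only)."""
    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

def neg_mse_scorer(est, X, y):
    """Stand-in scorer: negative mean squared error of the constant mean."""
    return -sum((yi - est.mean_) ** 2 for yi in y) / len(y)

def fit_and_score_sketch(estimator, X_train, y_train, X_test, y_test, scorer):
    start_time = time.time()
    estimator.fit(X_train, y_train)
    fit_time = time.time() - start_time          # time spent fitting only
    test_score = scorer(estimator, X_test, y_test)
    # Subtracting fit_time from total elapsed isolates the scoring portion.
    score_time = time.time() - start_time - fit_time
    return test_score, fit_time, score_time

score, fit_time, score_time = fit_and_score_sketch(
    MeanPredictor(), [[0], [1]], [0.0, 1.0], [[2]], [0.5], neg_mse_scorer)
```

This is why the diff computes `score_time` relative to `start_time` minus `fit_time` rather than as a bare `time.time() - start_time`, which would have lumped fitting and scoring together.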
LGTM.
Thanks everyone for the reviews. @jnothman Does it look fine now? Merge?
* ENH separate fit / score times
* Make score_time=0 if errored; Ignore warnings in test
* Cleanup docstrings
* ENH Use helper to store the results
* Move fit time computation to else of try...except...else
* DOC readable sample scores
* COSMIT Add a comment on why the time test is >= 0 instead of > 0 (Windows time.time precision is not accurate enough to be non-zero for trivial fits)
Force-pushed from 9a7a93a to c3478e3
@jnothman bump :) Anything more to be done here?
Thanks for asking. The last thing to do: mention somewhere that timings are in seconds in …
Done :)
And done. Backport please, @amueller...
* Resolved issues scikit-learn#6894 and scikit-learn#6895: *SearchCV.results_ now includes both timing and training scores. Wrote a new test (sklearn/model_selection/test_search.py) and a new doctest (sklearn/model_selection/_search.py); added a few more lines to the docstrings of GridSearchCV and RandomizedSearchCV; revised code according to suggestions. Added a few more checks to test_grid_search_results(): 1. test_rank_score is always >= 1; 2. all regular scores (test/train mean/std score) and timings are >= 0; 3. all regular scores are <= 1. Note that timing can be greater than 1 in general, and the std of regular scores is always <= 1 because the scores are bounded between 0 and 1.
* ENH/FIX timing and training score.
* ENH separate fit / score times
* Make score_time=0 if errored; Ignore warnings in test
* Cleanup docstrings
* ENH Use helper to store the results
* Move fit time computation to else of try...except...else
* DOC readable sample scores
* COSMIT Add a comment on why the time test is >= 0 instead of > 0 (Windows time.time precision is not accurate enough to be non-zero for trivial fits)
* Convey that times are in seconds

# Conflicts:
#	doc/whats_new.rst
* tag '0.18': (1286 commits)
  [MRG + 1] More versionadded everywhere! (scikit-learn#7403)
  minor doc fixes
  fix lbfgs rename (scikit-learn#7503)
  minor fixes to whatsnew
  fix scoring function table
  fix rebase messup
  DOC more what's new subdivision
  DOC Attempt to impose some order on What's New 0.18
  no fixed width within bold
  REL changes for release in 0.18.X branch (scikit-learn#7414)
  [MRG+2] Timing and training score in GridSearchCV (scikit-learn#7325)
  DOC: Added Nested Cross Validation Example (scikit-learn#7111)
  Sync docstring and definition default argument in kneighbors (scikit-learn#7476)
  added contributors for 0.18, minor formatting fixes.
  Fix typo in whats_new.rst
  [MRG+2] FIX adaboost estimators not randomising correctly (scikit-learn#7411)
  Addressing issue scikit-learn#7468. (scikit-learn#7472)
  Reorganize README
  clean up deprecation warning stuff in common tests
  [MRG+1] Fix regression in silhouette_score for clusters of size 1 (scikit-learn#7438)
  ...
Finishing up @eyc88's #7026
Resolves #6894 and #6895

TODO

- [x] `mean|std_fit_time` vs `mean|std_score_time`
- [ ] Have an example at `examples/`? Address in separate PR.

@jnothman @amueller @vene @ogrisel (To be reviewed after merge of #7324)

NOTE: Make sure @eyc88 is credited when squashing and merging.