[MRG+2] Timing and training score in GridSearchCV #7325

Merged
merged 3 commits into scikit-learn:master from raghavrv:timing_training_score on Sep 27, 2016

Conversation

raghavrv
Member

@raghavrv raghavrv commented Sep 1, 2016

Finishing up @eyc88's #7026

Resolves #6894 and #6895

TODO

Have an example under examples/? To be addressed in a separate PR.

@jnothman @amueller @vene @ogrisel (To be reviewed after merge of #7324)

NOTE: Make sure @eyc88 is credited when squashing and merging.

@raghavrv
Member Author

raghavrv commented Sep 1, 2016

@amueller @jnothman Do we have to take the weighted average for mean_train_score across splits if i.i.d.?

@jnothman
Member

jnothman commented Sep 1, 2016

It doesn't really make sense to weight it by test set size. Perhaps by training set size. I'd leave it unweighted.

@GaelVaroquaux
Member

It doesn't really make sense to weight it by test set size. Perhaps by
training set size. I'd leave it unweighted.

+1.

@raghavrv
Member Author

raghavrv commented Sep 1, 2016

Yes, I was asking about weighting it by the training set size only... Okay! Let's leave it unweighted...
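
For illustration, a toy sketch of the two options discussed here, unweighted versus weighted by training-set size (the scores and split sizes below are made up; this is not project code):

```python
# Toy comparison of an unweighted mean of per-split train scores
# versus a mean weighted by training-set size. Made-up numbers.
import numpy as np

train_scores = np.array([0.81, 0.79, 0.84])  # one train score per CV split
train_sizes = np.array([80, 80, 40])         # hypothetical n_train per split

unweighted_mean = train_scores.mean()
weighted_mean = np.average(train_scores, weights=train_sizes)
print(unweighted_mean, weighted_mean)
```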

@amueller amueller added this to the 0.18 milestone Sep 6, 2016
@amueller
Member

amueller commented Sep 6, 2016

are you working on this?

@raghavrv
Member Author

raghavrv commented Sep 7, 2016

@amueller yes. I was waiting for #7324 to get merged. I'll push in a while...

@raghavrv raghavrv force-pushed the timing_training_score branch from a7d8c32 to 28290a9 on September 8, 2016 13:08
@raghavrv
Member Author

I've named the keys "mean_time" and "std_time", since the word test/train is irrelevant here...

@raghavrv raghavrv force-pushed the timing_training_score branch from 28290a9 to 5d4813c on September 11, 2016 11:41
@raghavrv
Member Author

@amueller @jnothman added some basic tests for training scores... Reviews please...

@raghavrv
Member Author

About the example, do we want one that compares the train and test scores and also plots the training time for each candidate? (Also ping @vene for suggestions)
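
Something along these lines, perhaps (a rough sketch only: the SVC/C grid is a placeholder, and the key names assume the final cv_results_ layout with mean_train_score, mean_test_score and mean_fit_time):

```python
# Sketch of an example comparing train/test scores and fit times per candidate.
import matplotlib.pyplot as plt
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
C_values = [0.01, 0.1, 1, 10, 100]
search = GridSearchCV(SVC(kernel='linear'), {'C': C_values}, cv=3,
                      return_train_score=True)
search.fit(X, y)
results = search.cv_results_

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(9, 3.5))
ax1.semilogx(C_values, results['mean_train_score'], 'o-', label='train')
ax1.semilogx(C_values, results['mean_test_score'], 'o-', label='test')
ax1.set_xlabel('C')
ax1.set_ylabel('mean score')
ax1.legend()
ax2.bar(range(len(C_values)), results['mean_fit_time'])
ax2.set_xlabel('candidate index')
ax2.set_ylabel('mean fit time (s)')
plt.tight_layout()
plt.show()
```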

@@ -746,6 +784,10 @@ class GridSearchCV(BaseSearchCV):
FitFailedWarning is raised. This parameter does not affect the refit
step, which will always raise the error.

return_train_score: boolean, default=True
If ``'False'``, the results_ attribute will not include training

cv_

space before colon

@jnothman
Member

I'm not altogether happy that in _fit_and_score, scoring of training data is included within the timing. Should we consider pulling it out?

@jnothman
Member

Regarding the example: @amueller plotted train score and time in a variant of plot_rbf_parameters at #1742.

@jnothman
Member

I.e. that was @amueller's attempt at this PR's feature, never merged.

@jnothman
Member

Otherwise, re the example: I think an example of train scores explicitly introducing over/under-fitting would be interesting, but adding it to the wishlist for a later PR is fine.

@raghavrv
Member Author

raghavrv commented Sep 11, 2016

I'm not altogether happy that in _fit_and_score, scoring of training data is included within the timing. Should we consider pulling it out?

I think if we want to... then mean_test_time would make sense... WDYT?

EDIT: Or maybe not. I feel test still does not have a place here... We are timing fit + score (on testing set).

@raghavrv raghavrv changed the title from [WIP] Timing and training score in GridSearchCV to [MRG] Timing and training score in GridSearchCV on Sep 11, 2016
@raghavrv
Member Author

raghavrv commented Sep 23, 2016

Apologies for the delay. I got busy configuring my laptop ^_^.

And

oh, you're saying it registered 0. Of course >= 0 is what you want.

No, I speculate that to be the reason...

I tried running on my Windows machine and found that not to be the case... time.time is as accurate there as it is on Linux. I don't know how older versions of Windows handle it, though...

I still haven't managed to install scikit-learn on Windows ;(

And if we go with >= 0, that would mean we are allowing fit_time to be 0 on Linux too. Such a test would mask a bug where fit_time is always 0...

In the next couple of hours I'll somehow figure out a way to install scikit-learn on my Windows machine and see whether this test fails...
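
For context, a minimal sketch of the kind of assertion being debated (not the actual project test; key names assume the final cv_results_ layout). On Windows, time.time() can have roughly 16 ms resolution, so a trivial fit may legitimately report exactly 0.0 seconds:

```python
# '>= 0' rather than '> 0': coarse timer resolution on Windows can round
# tiny durations down to exactly zero for trivial fits.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=50, random_state=0)
search = GridSearchCV(SVC(), {'C': [0.1, 1]}, cv=3, return_train_score=True)
search.fit(X, y)

for key in ('mean_fit_time', 'std_fit_time', 'mean_score_time', 'std_score_time'):
    assert np.all(np.asarray(search.cv_results_[key]) >= 0)
```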

@raghavrv raghavrv force-pushed the timing_training_score branch from e4246dc to 07e1e34 on September 23, 2016 23:29
@raghavrv
Member Author

I did >= 0 and the AppVeyor build passes. I am still unable to install on Windows (I set too small a partition size and it bit me: there is no room for the whopping 8 GB of Visual Studio libraries. I don't understand why MinGW requires VS13).

Anyway, @jnothman @amueller what do you think we should do? Is the current test setup fine?

@jnothman
Member

You can use a less trivial fit, then? But I'm okay with this being written off as "hard to test".

@raghavrv
Member Author

Finally I succeeded in installing it on Windows (I had to resort to a 32-bit Windows in VirtualBox). Now I can confirm that it is the timer precision that makes the test fail.

This section of timeit explains it clearly. I would say let's leave it as is, like you suggested before. The alternative would be a utils.fixes function safe_time that always gives us the highest-precision clock (which could mean checking the platform and choosing between time.clock, time.time, or the newer time.perf_counter).

@amueller @jnothman Final reviews and merge?
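
A hypothetical sketch of the safe_time helper floated above; no such function exists in sklearn.utils.fixes, this is only what it might look like:

```python
# Pick the highest-resolution wall clock available on this platform/Python.
import sys
import time

if hasattr(time, 'perf_counter'):
    # Python >= 3.3: monotonic, highest available resolution
    safe_time = time.perf_counter
elif sys.platform == 'win32':
    # Old Python on Windows: time.clock is much finer-grained than time.time
    safe_time = time.clock
else:
    # Elsewhere, time.time is already precise enough
    safe_time = time.time

start = safe_time()
sum(range(100000))  # stand-in for a trivial fit
elapsed = safe_time() - start
print(elapsed)
```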

@jnothman jnothman left a comment

Otherwise LGTM

'split0_train_score' : [0.8, 0.9, 0.7],
'split1_train_score' : [0.82, 0.5, 0.7],
'mean_train_score' : [0.81, 0.7, 0.7],
'std_train_score' : [0.00073, 0.00063, 0.00043],

Can we make these numbers more readable? Pretend it's slower.

@jnothman jnothman changed the title from [MRG] Timing and training score in GridSearchCV to [MRG+1] Timing and training score in GridSearchCV on Sep 24, 2016
@@ -249,19 +258,22 @@ def _fit_and_score(estimator, X, y, scorer, train, test, verbose,

else:
test_score = _score(estimator, X_test, y_test, scorer)
score_time = time.time() - start_time - fit_time

Ah, OK, that makes more sense. Before, this was just time.time() - start_time or something like that, and I was really confused.
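
For readers following along, a simplified, self-contained sketch of the timing arithmetic in question (not the literal _fit_and_score code):

```python
# The fit is timed on its own; score_time is the remaining elapsed time,
# i.e. it covers only the scoring step.
import time

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, random_state=0)
X_train, X_test = X[:150], X[150:]
y_train, y_test = y[:150], y[150:]
estimator = LogisticRegression()

start_time = time.time()
estimator.fit(X_train, y_train)
fit_time = time.time() - start_time

test_score = estimator.score(X_test, y_test)
score_time = time.time() - start_time - fit_time  # only the scoring call

print(fit_time, score_time, test_score)
```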

@amueller
Member

lgtm.

@amueller amueller changed the title from [MRG+1] Timing and training score in GridSearchCV to [MRG+2] Timing and training score in GridSearchCV on Sep 25, 2016
@raghavrv
Member Author

Thanks everyone for the reviews. @jnothman Does it look fine now? Merge?

* ENH separate fit / score times
* Make score_time=0 if errored; Ignore warnings in test
* Cleanup docstrings
* ENH Use helper to store the results
* Move fit time computation to else of try...except...else
* DOC readable sample scores
* COSMIT Add a comment on why the time test is >= 0 instead of > 0
  (Windows time.time precision is not accurate enough to be non-zero
   for trivial fits)
@raghavrv
Member Author

@jnothman bump :) Anything more to be done here?

@jnothman
Member

Thanks for asking. The last thing to do: mention somewhere that timings are in seconds in cv_results_.

@raghavrv
Member Author

Done :)

@jnothman jnothman merged commit b444cc9 into scikit-learn:master Sep 27, 2016
@jnothman
Member

And done. Backport please, @amueller...

@raghavrv raghavrv deleted the timing_training_score branch September 27, 2016 10:50
amueller added a commit to amueller/scikit-learn that referenced this pull request Sep 27, 2016
* Resolved issue scikit-learn#6894 and scikit-learn#6895:
    Now *SearchCV.results_ includes both timing and training scores.

wrote new test (sklearn/model_selection/test_search.py)
and new doctest (sklearn/model_selection/_search.py)
added a few more lines in the docstring of GridSearchCV and RandomizedSearchCV.
Revised code according to suggestions.
Add a few more lines to test_grid_search_results():
    1. check test_rank_score always >= 1
    2. check all regular scores (test/train_mean/std_score) and timing >= 0
    3. check all regular scores <= 1
Note that timing can be greater than 1 in general, and std of regular scores
always <= 1 because the scores are bounded between 0 and 1.

* ENH/FIX timing and training score.

* ENH separate fit / score times
* Make score_time=0 if errored; Ignore warnings in test
* Cleanup docstrings
* ENH Use helper to store the results
* Move fit time computation to else of try...except...else
* DOC readable sample scores
* COSMIT Add a comment on why the time test is >= 0 instead of > 0
  (Windows time.time precision is not accurate enough to be non-zero
   for trivial fits)

* Convey that times are in seconds

# Conflicts:
#	doc/whats_new.rst
TomDLT pushed a commit to TomDLT/scikit-learn that referenced this pull request Oct 3, 2016
yarikoptic added a commit to yarikoptic/scikit-learn that referenced this pull request Nov 10, 2016
* tag '0.18': (1286 commits)
  [MRG + 1] More versionadded everywhere! (scikit-learn#7403)
  minor doc fixes
  fix lbfgs rename (scikit-learn#7503)
  minor fixes to whatsnew
  fix scoring function table
  fix rebase messup
  DOC more what's new subdivision
  DOC Attempt to impose some order on What's New 0.18
  no fixed width within bold
  REL changes for release in 0.18.X branch (scikit-learn#7414)
  [MRG+2] Timing and training score in GridSearchCV (scikit-learn#7325)
  DOC: Added Nested Cross Validation Example (scikit-learn#7111)
  Sync docstring and definition default argument in kneighbors (scikit-learn#7476)
  added contributors for 0.18, minor formatting fixes.
  Fix typo in whats_new.rst
  [MRG+2] FIX adaboost estimators not randomising correctly (scikit-learn#7411)
  Addressing issue scikit-learn#7468. (scikit-learn#7472)
  Reorganize README
  clean up deprecation warning stuff in common tests
  [MRG+1] Fix regression in silhouette_score for clusters of size 1 (scikit-learn#7438)
  ...
Sundrique pushed a commit to Sundrique/scikit-learn that referenced this pull request Jun 14, 2017
paulha pushed a commit to paulha/scikit-learn that referenced this pull request Aug 19, 2017