add common test: fixing random_state makes algorithms deterministic. #7139
Comments
How are we going to prove deterministic behaviour? Test every estimator 3-4 times after setting random_state?
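For illustration, a minimal sketch of that repeated-fit idea; the estimator and dataset here are arbitrary placeholders, not anything prescribed by the thread:

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(random_state=0)
est = RandomForestClassifier(n_estimators=10, random_state=42)

# Fit once to get a reference, then refit fresh clones and compare.
reference = clone(est).fit(X, y).predict(X)
for _ in range(3):  # "3-4 times", as suggested above
    repeat = clone(est).fit(X, y).predict(X)
    np.testing.assert_array_equal(reference, repeat)
```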
A more manual way that would be faster at test-time than the above would be to run the estimator with a set random_state, get the results, then hardcode these results into the test.
@nelson-liu, this sounds better; it would be a kind of regression testing. We could serialise all results, but we would have to regenerate them after major changes to an estimator.
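A minimal sketch of that hardcoded-results variant, again with a placeholder estimator and synthetic data; in a real test, `recorded` would be a literal pasted in from an earlier run (or serialised to disk, as suggested above):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import SGDRegressor

X, y = make_regression(n_features=3, random_state=0)

# In the actual test this would be a hardcoded array produced by a
# previous run with this exact random_state, not computed on the fly.
recorded = SGDRegressor(random_state=0).fit(X, y).coef_

# A later run (e.g. after a refactor) must reproduce the recording.
fresh = SGDRegressor(random_state=0).fit(X, y).coef_
np.testing.assert_allclose(fresh, recorded)
```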
> A more manual way that would be faster at test-time than the above
> would be to run the estimator with a set random_state, get the results,
> then hardcode these results into the test

I am always very worried about hardcoded results in tests. When things start failing, it's hard to know why.
I started work on this.
Great, thanks @betatim! I think this is a really needed enhancement.
Another way could be to fit the estimator on some data, serialize it, and hash the output. The hash should be identical after fitting the estimator twice (I think). This would allow checking the reproducibility of estimators that e.g. do not implement predict, though it would fail to detect non-deterministic behavior in the predict function itself.
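A small sketch of that hash-based idea (estimator and data are arbitrary placeholders). One caveat: pickle bytes are not guaranteed stable across Python or library versions, so the comparison only makes sense within a single environment:

```python
import hashlib
import pickle

from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(random_state=0)

def fitted_digest():
    # Fit from scratch and hash the serialized fitted estimator.
    est = DecisionTreeClassifier(random_state=0).fit(X, y)
    return hashlib.sha256(pickle.dumps(est)).hexdigest()

# Should hold if fitting is deterministic for a fixed random_state.
assert fitted_digest() == fitted_digest()
```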
move to milestone 0.23

removing from milestone
@adrinjalali is it still needed? If so, I could start working on this.
Thanks, but there's already an open PR for this, @Reksbril.
I was giving a tutorial these days when someone stopped me because they did not obtain the same score as the one I was showing. Here is the piece of code executed:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target,
    stratify=data.target, random_state=0)
lr = LogisticRegression().fit(X_train, y_train)
score = lr.score(X_test, y_test)
```

I have no idea if this behavior is to be expected since I did not set the `random_state` of `LogisticRegression`. Since this kind of reproducibility error cannot easily be tested with CI, I asked them for their system information. The reports with the scores obtained can be found below. All builds report MSC v.1916 64 bit (AMD64), except the Python 3.7.4 entry, which does not include compiler information.

| Python version | Packaged by | Build date | Score of sklearn code |
| --- | --- | --- | --- |
| 3.8.5 | conda-forge | Sep 24 2020 | 0.9370629370629371 |
| 3.8.5 | conda-forge | Sep 24 2020 | 0.9370629370629371 |
| 3.8.5 | conda-forge | Sep 16 2020 | 0.9440559440559441 |
| 3.8.5 | conda-forge | Sep 24 2020 | 0.9300699300699301 |
| 3.8.5 | conda-forge | Aug 29 2020 | 0.9230769230769231 |
| 3.8.3 | (default build) | Jul 2 2020 | 0.9300699300699301 |
| 3.8.6 | conda-forge | Oct 7 2020 | 0.9300699300699301 |
| 3.8.5 | conda-forge | Aug 29 2020 | 0.9370629370629371 |
| 3.7.4 | (default build) | Sep 18 2019 | 0.9300699300699301 |

Let me know if I should write an independent issue.
I think we should add a common test checking that every estimator either is deterministic and has no random_state, or is deterministic after setting random_state.
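A rough sketch of what such a check might look like; `check_random_state_determinism` is a hypothetical name, the two classifiers are arbitrary stand-ins, and a real common test would iterate over `all_estimators()` with per-estimator data handling:

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.svm import SVC

X, y = make_classification(random_state=0)

def check_random_state_determinism(Estimator):
    """Two fits with the same random_state (if the estimator has
    one) must produce identical predictions."""
    est = Estimator()
    if "random_state" in est.get_params():
        est.set_params(random_state=0)
    first = clone(est).fit(X, y)
    second = clone(est).fit(X, y)
    np.testing.assert_array_equal(first.predict(X), second.predict(X))

for Est in (SVC, GradientBoostingClassifier):
    check_random_state_determinism(Est)
```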