[MRG] TST Increases tol for check_pca_float_dtype_preservation assertion #15775

thomasjpfan
Member

Reference Issues/PRs

Fixes #15774

What does this implement/fix? Explain your changes.

I think we need to bump the tol to 2e-4. The components_ are obtained directly from linalg.svd here:

U, S, V = linalg.svd(X, full_matrices=False)
# flip eigenvectors' sign to enforce deterministic output
U, V = svd_flip(U, V)
components_ = V

Any other comments?

The proper fix would be to dig into LAPACK and see why the float32 and float64 results differ this much.
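To make the float32/float64 gap concrete, here is a minimal sketch of the same "center, SVD, flip signs" sequence run at both precisions. It is not scikit-learn's implementation: `flip_signs` and `principal_axes` are hypothetical helpers, `np.linalg.svd` stands in for the `linalg.svd` call quoted above, the sign convention is decided from the rows of `V` for simplicity, and the data is synthetic.

```python
import numpy as np


def flip_signs(u, vt):
    # Hypothetical stand-in for svd_flip: make the largest-magnitude entry
    # of each row of vt positive so the output is deterministic.
    max_abs_cols = np.argmax(np.abs(vt), axis=1)
    signs = np.sign(vt[np.arange(vt.shape[0]), max_abs_cols])
    return u * signs, vt * signs[:, np.newaxis]


def principal_axes(X):
    # Center the data, take the thin SVD, and return the right singular
    # vectors (the rows play the role of components_).
    X_centered = X - X.mean(axis=0)
    u, s, vt = np.linalg.svd(X_centered, full_matrices=False)
    u, vt = flip_signs(u, vt)
    return vt


# Synthetic data with clearly separated column scales (an assumption made
# so that the singular vectors are well conditioned).
rng = np.random.RandomState(0)
X_64 = rng.randn(200, 4) * np.array([4.0, 3.0, 2.0, 1.0])
X_32 = X_64.astype(np.float32)

c64 = principal_axes(X_64)
c32 = principal_axes(X_32)

# The dtype is preserved; the values agree only up to float32 rounding.
max_abs_diff = float(np.max(np.abs(c64 - c32)))
print(c64.dtype, c32.dtype, max_abs_diff)
```

The size of the printed difference depends on the data and on the LAPACK build, which is why a tolerance that passes on one platform may fail on another.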

@@ -532,7 +532,7 @@ def check_pca_float_dtype_preservation(svd_solver):
     assert pca_64.transform(X_64).dtype == np.float64
     assert pca_32.transform(X_32).dtype == np.float32
 
-    assert_allclose(pca_64.components_, pca_32.components_, rtol=1e-4)
+    assert_allclose(pca_64.components_, pca_32.components_, rtol=2e-4)
Member

Maybe add a comment explaining why this tolerance was chosen?

Supporting platforms that we don't have in our CI doesn't sound logical to me, though.

Member Author

Are we okay with conda-forge skipping this test for one of their platforms?

Member

I think it's better to increase the tolerance than to ask distributors to skip tests.

Member

@rth rth left a comment

LGTM. Some of the tolerances in our assertions are indeed too strict: they happen to pass on x86 (with one random seed) but can fail on other platforms (or if the random seed changes). In that case, increasing the tolerance slightly shouldn't hurt, I think.
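For reference, the criterion behind `assert_allclose(actual, desired, rtol=...)` with its default `atol=0` is `|actual - desired| <= atol + rtol * |desired|`. The sketch below re-implements that check in plain Python to show how a small change in `rtol` flips the outcome; `within_rtol` is a hypothetical helper and the numbers are made up for illustration, not taken from the failing platform.

```python
def within_rtol(actual, desired, rtol, atol=0.0):
    # Same inequality that numpy.testing.assert_allclose applies elementwise.
    return abs(actual - desired) <= atol + rtol * abs(desired)


# A made-up entry whose relative error is 1.5e-4.
desired = 0.5
actual = desired * (1 + 1.5e-4)

passes_old = within_rtol(actual, desired, rtol=1e-4)  # old tolerance
passes_new = within_rtol(actual, desired, rtol=2e-4)  # new tolerance
print(passes_old, passes_new)  # False True

# With atol=0, entries close to zero are held to the same relative bound,
# so a tiny absolute error can still be a large relative one.
tiny_desired = 1e-6
tiny_actual = tiny_desired + 1e-9  # relative error of about 1e-3
print(within_rtol(tiny_actual, tiny_desired, rtol=2e-4))  # False
```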

Member

@qinhanmin2014 qinhanmin2014 left a comment

We only subtract the mean and calculate the svd, so I guess it's reasonable to increase tol here.

@qinhanmin2014 qinhanmin2014 merged commit 0fa54e5 into scikit-learn:master Dec 7, 2019
@qinhanmin2014 qinhanmin2014 added this to the 0.22.1 milestone Dec 7, 2019
rth pushed a commit that referenced this pull request Dec 10, 2019
ogrisel pushed a commit to ogrisel/scikit-learn that referenced this pull request Dec 31, 2019
ogrisel pushed a commit to ogrisel/scikit-learn that referenced this pull request Jan 2, 2020
ogrisel pushed a commit to ogrisel/scikit-learn that referenced this pull request Jan 2, 2020
ogrisel added a commit that referenced this pull request Jan 2, 2020
* DOC fixed default values in dbscan (#15753)

* DOC fix incorrect branch reference in contributing doc (#15779)

* DOC relabel Feature -> Efficiency in change log (#15770)

* DOC fixed Birch default value (#15780)

* STY Minior change on code padding in website theme (#15768)

* DOC Fix yticklabels order in permutation importances example (#15799)

* Fix yticklabels order in permutation importances example

* STY Update wrapper width (#15793)

* DOC Long sentence was hard to parse and ambiguous in _classification.py (#15769)

* DOC Removed duplicate 'classes_' attribute in Naive Bayes classifiers (#15811)

* BUG Fixes pandas dataframe bug with boolean dtypes (#15797)

* BUG Returns only public estimators in all_estimators (#15380)

* DOC improve doc for multiclass and types_of_target (#15333)

* TST Increases tol for check_pca_float_dtype_preservation assertion (#15775)

* update _alpha_grid class in _coordinate_descent.py (#15835)

* FIX Explicit conversion of ndarray to object dtype. (#15832)

* BLD Parallelize sphinx builds on circle ci (#15745)

* DOC correct url for preprocessing (#15853)

* MNT avoid generating too many cross links in examples (#15844)

* DOC Correct wrong doc in precision_recall_fscore_support (#15833)

* DOC add comment in check_pca_float_dtype_preservation (#15819)

Documenting the changes in #15775

* DOC correct indents in docstring _split.py (#15843)

* DOC fix docstring of KMeans based on sklearn guideline (#15754)

* DOC fix docstring of AgglomerativeClustering based on sklearn guideline (#15764)

* DOC fix docstring of AffinityPropagation based on sklearn guideline (#15777)

* DOC fixed SpectralCoclustering and SpectralBiclustering docstrings following sklearn guideline (#15778)

* DOC fix FeatureAgglomeration and MiniBatchKMeans docstring following sklearn guideline (#15809)

* TST Specify random_state in test_cv_iterable_wrapper (#15829)

* DOC Include LinearSV{C, R} in models that support sample_weights (#15871)

* DOC correct some indents (#15875)

* DOC Fix documentation of default values in tree classes (#15870)

* DOC fix typo in docstring (#15887)

* DOC FIX default value for xticks_rotation in plot_confusion_matrix (#15890)

* Fix imports in pip3 ubuntu by suffixing affected files (#15891)

* MNT Raise erorr when normalize is invalid in confusion_matrix (#15888)

* [MRG] DOC Increases search results for API object results (#15574)

* MNT Ignores warning in pyamg for deprecated scipy.random (#15914)

* DOC Instructions to troubleshoot Windows path length limit (#15916)

* DOC add versionadded directive to some estimators (#15849)

* DOC clarify doc-string of roc_auc_score and add references (#15293)

* MNT Adds skip lint to azure pipeline CI (#15904)

* BLD Fixes bug when building with NO_MATHJAX=1 (#15892)

* [MRG] BUG Checks to number of axes in passed in ax more generically (#15760)

* EXA Minor fixes in plot_sparse_logistic_regression_20newsgroups.py (#15925)

* BUG Do not shadow public functions with deprecated modules (#15846)

* Import sklearn._distributor_init first (#15929)

* DOC Fix typos, via a Levenshtein-style corrector (#15923)

* DOC in canned comment, mention that PR title becomes commit me… (#15935)

* DOC/EXA Correct spelling of "Classification" (#15938)

* BUG fix pip3 ubuntu update by suffixing file (#15928)

* [MRG] Ways to compute center_shift_total were different in "full" and "elkan" algorithms. (#15930)

* TST Fixes integer test for train and test indices (#15941)

* BUG ensure that parallel/sequential give the same permutation importances (#15933)

* Formatting fixes in changelog (#15944)

* MRG FIX: order of values of self.quantiles_ in QuantileTransformer (#15751)

* [MRG] BUG Fixes constrast in plot_confusion_matrix (#15936)

* BUG use zero_division argument in classification_report (#15879)

* DOC change logreg solver in plot_logistic_path (#15927)

* DOC fix whats new ordering (#15961)

* COSMIT use np.iinfo to define the max int32 (#15960)

* DOC Apply numpydoc validation to VotingRegressor methods (#15969)

Co-authored-by: Tiffany R. Williams <Tiffany8@users.noreply.github.com>

* DOC improve naive_bayes.py documentation (#15943)

Co-authored-by: Jigna Panchal <40188288+jigna-panchal@users.noreply.github.com>

* DOC Fix default values in Perceptron documentation (#15965)

* DOC Improve default values in logistic documentation (#15966)

Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>

* DOC Improve documentation of default values for imputers (#15964)

* EXA/MAINT Simplify code in manifold learning example (#15949)

* DOC Improve default values in SGD documentation (#15967)

* DOC Improve defaults in neural network documentation (#15968)

* FIX use safe_sparse_dot for callable kernel in LabelSpreading (#15868)

* BUG Adds attributes back to check_is_fitted (#15947)

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>

* DOC update check_is_fitted what's new

* DOC change python-devel to python3-devel for yum. (#15986)

* DOC Correct the default value of values_format in plot_confusion_matrix (#15981)

* [MRG] MNT Updates pypy to use 7.2.0 (#15954)

* FIX Add missing 'values_format' param to disp.plot() in plot_confusion_matrix (#15937)

* FIX support scalar values in fit_params in SearchCV (#15863)

* support a scalar fit param

* pep8

* TST add test for desired behavior

* FIX introduce _check_fit_params to validate parameters

* DOC update whats new

* TST tests both grid-search and randomize-search

* PEP8

* DOC revert unecessary change

* TST add test for _check_fit_params

* olivier comments

* TST fixes

* DOC whats new

* DOC whats new

* TST revert type of error

* add olivier suggestions

* address olivier comments

* address thomas comments

* PEP8

* comments olivier

* TST fix test by passing X

* avoid to call twice tocsr

* add case column/row sparse in check_fit_param

* provide optional indices

* TST check content when indexing params

* PEP8

* TST update tests to check identity

* stupid fix

* use a distribution in RandomizedSearchCV

* MNT add lightgbm to one of the CI build

* move to another build

* do not install dependencies lightgbm

* MNT comments on the CI setup

* address some comments

* Test fit_params compat without dependency on lightgbm

Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>

* Remove abstractmethod that silently brake downstream packages (#15996)

* FIX restore BaseNB._check_X without abstractmethod decoration (#15997)

* Update v0.22 changelog for 0.22.1 (#16002)

- set the date
- move entry for quantile transformer to the 0.22.1 section
- fix alphabetical ordering of modules

* STY Removes hidden scroll bar (#15999)

* Flake8 fixes

* Fix: remove left-over lines that should have been deleted during conflict resolution when rebasing

* Fix missing imports

* Update version

* Fix test_check_is_fitted

* Make test_sag_regressor_computed_correctly deterministic (#16003)

Fix #15818.

Co-authored-by: cgsavard <claire.savard@colorado.edu>
Co-authored-by: Joel Nothman <joel.nothman@gmail.com>
Co-authored-by: Thomas J Fan <thomasjpfan@gmail.com>
Co-authored-by: Matt Hall <matt@agilegeoscience.com>
Co-authored-by: Kathryn Poole <kathryn.poole2@gmail.com>
Co-authored-by: lucyleeow <jliu176@gmail.com>
Co-authored-by: JJmistry <jayminm22@gmail.com>
Co-authored-by: Juan Carlos Alfaro Jiménez <JuanCarlos.Alfaro@uclm.es>
Co-authored-by: SylvainLan <sylvain.s.lannuzel@gmail.com>
Co-authored-by: Nicolas Hug <contact@nicolas-hug.com>
Co-authored-by: Hanmin Qin <qinhanmin2005@sina.com>
Co-authored-by: Adrin Jalali <adrin.jalali@gmail.com>
Co-authored-by: Vachan D A <vachanda@users.noreply.github.com>
Co-authored-by: Sambhav Kothari <sambhavs.email@gmail.com>
Co-authored-by: wenliwyan <12013376+wenliwyan@users.noreply.github.com>
Co-authored-by: shivamgargsya <shivam.gargshya@gmail.com>
Co-authored-by: Reshama Shaikh <rs2715@stern.nyu.edu>
Co-authored-by: Oliver Urs Lenz <oulenz@users.noreply.github.com>
Co-authored-by: Loïc Estève <loic.esteve@ymail.com>
Co-authored-by: Brian Wignall <BrianWignall@gmail.com>
Co-authored-by: Ritchie Ng <ritchieng@u.nus.edu>
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Co-authored-by: inderjeet <43402782+inder128@users.noreply.github.com>
Co-authored-by: scibol <scibol@users.noreply.github.com>
Co-authored-by: Tirth Patel <tirthasheshpatel@gmail.com>
Co-authored-by: Bibhash Chandra Mitra <bibhashm220896@gmail.com>
Co-authored-by: Alexandre Gramfort <alexandre.gramfort@m4x.org>
Co-authored-by: Tiffany R. Williams <Tiffany8@users.noreply.github.com>
Co-authored-by: Jigna Panchal <40188288+jigna-panchal@users.noreply.github.com>
Co-authored-by: @nkish <19225359+ankishb@users.noreply.github.com>
Co-authored-by: Pulkit Mehta <pulkit_mehta_work@yahoo.com>
Co-authored-by: David Breuer <DavidBreuer@users.noreply.github.com>
Co-authored-by: Niklas <niklas.sm+github@gmail.com>
Co-authored-by: Windber <guolipengyeah@126.com>
Co-authored-by: Stephen Blystone <29995339+blynotes@users.noreply.github.com>
Co-authored-by: Brigitta Sipőcz <b.sipocz@gmail.com>
panpiort8 pushed a commit to panpiort8/scikit-learn that referenced this pull request Mar 3, 2020
panpiort8 pushed a commit to panpiort8/scikit-learn that referenced this pull request Mar 3, 2020
Pseudomanifold pushed a commit to BorgwardtLab/scikit-learn that referenced this pull request Apr 24, 2020
Development

Successfully merging this pull request may close these issues.

ppc4le failure of test_pca_dtype_preservation
4 participants