Skip to content

TST Fixes test and mis-matched pandas version #20149

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Jun 1, 2021

Conversation

thomasjpfan
Copy link
Member

Follow up to #20143

This PR fixes the test that catches mismatched versions between the readme and _min_dependencies.

CC @glemaitre @ogrisel

@ogrisel
Copy link
Member

ogrisel commented May 28, 2021

circle ci is failing with:

Package pandas conflicts for:
seaborn -> pandas[version='>=0.14.0|>=0.22.0']
seaborn -> statsmodels[version='>=0.5.0'] -> pandas[version='>=0.14|>=0.21']
pandas==0.25.0

but I fail to understand why this version spec cannot be resolved... based on https://docs.conda.io/projects/conda-build/en/latest/resources/package-spec.html#package-match-specifications it seems that pandas[version='>=0.14.0|>=0.22.0'] should be equivalent to pandas[version='>=0.14.0'] and pandas[version='>=0.14|>=0.21'] to pandas[version='>=0.14'] and neither should conflict with pandas==0.25.0. What do I miss?

@ogrisel
Copy link
Member

ogrisel commented May 28, 2021

Also why did this test not fail in the #20143 PR itself?

@thomasjpfan
Copy link
Member Author

Also why did this test not fail in the #20143 PR itself?

For the circleci build, sklearn/_min_dependencies.py was not updated so circleci did not use the new minimum pandas version.

For test_min_dependencies_readme.py, there was a small bug in the test.

@thomasjpfan
Copy link
Member Author

I suspect the failure has to do with:

numpy==1.14.5
pandas==0.25.0 -> numpy[version='>=1.13.3,<2.0a0|>=1.14.6,<2.0a0']

where numpy>=1.14.6.

@cmarmo
Copy link
Contributor

cmarmo commented May 28, 2021

Also why did this test not fail in the #20143 PR itself?

For the circleci build, sklearn/_min_dependencies.py was not updated so circleci did not use the new minimum pandas version.

Just to clarify: the pandas version was set in sklearn/_min_dependencies.py to version 0.24.3 working fine.
For some reasons, unclear to me, the README.rst said that the minimal dependency was 0.25 : technically there is no reason to increase pandas min dependency to 0.25, then no reason to increase numpy minimal dependency, you can just fix the readme. This is why I asked in #20143.

@thomasjpfan
Copy link
Member Author

thomasjpfan commented May 28, 2021

pandas 0.25 added support for the sparse accessor, which we use to better support dataframes with sparse columns. I see two paths forward:

  1. Use pandas 0.24.3 and only update the README.
  2. Use pandas 0.25.0, but numpy would need to be bumped to 1.14.6.

@lorentzenchr
Copy link
Member

As the difference between numpy 1.14.5 and 1.14.6 is one single bugfix release, I'd go for option 2: pandas 0.25.0 and numpy 1.14.6.

@cmarmo
Copy link
Contributor

cmarmo commented May 29, 2021

@lorentzenchr @thomasjpfan , Thanks for your answers.

Copy link
Member

@ogrisel ogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@ogrisel ogrisel merged commit c09be6a into scikit-learn:main Jun 1, 2021
@ogrisel
Copy link
Member

ogrisel commented Jun 1, 2021

I merged because I was under the impression that everyone agreed this was a good small fix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants