Skip to content

Release 1.4.2 #28774

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Apr 9, 2024
Merged

Release 1.4.2 #28774

merged 7 commits into from
Apr 9, 2024

Conversation

jeremiedbb
Copy link
Member

@jeremiedbb jeremiedbb commented Apr 5, 2024

Extra check list

  •  need to wait for joblib 1.4.0 release that has some fixes for numpy 2 support that 1.3.2 doesn't have.
  • move change log entries from 1.4.2 into 1.5
  • make sure we build the wheels using numpy 2 !

Only cherry-picked 87c32d3 and 01cd8d4 from the main branch because they are necessary for numpy 2 support.

Copy link

github-actions bot commented Apr 5, 2024

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: b5b65f7. Link to the linter CI: here

@jeremiedbb
Copy link
Member Author

in d88930f I'm forcing cibuildwheels to build scikit-learn against numpy 2. I checked the jobs and it's the case. cibuildwheels still tests the wheels against numpy < 2, which is a good thing because it confirms that we still support older versions of numpy even when we build against numpy 2.

For numpy 2 at test time, we have the scipy-dev nightly runs doing that for a while, and in addition I'll test the wheels locally on a couple of platforms before uploading them.

@thomasjpfan
Copy link
Member

For numpy 2 at test time, we have the scipy-dev nightly runs doing that for a while, and in addition I'll test the wheels locally on a couple of platforms before uploading them.

In the wheel builder, can we have some testing with NumPy 2.0? From reading the GitHub actions docs, I think this works for installing NumPy 2.0 for the Python 3.12 tests:

CIBW_TEST_REQUIRES: pytest pandas ${{ matrix.python == 312 && 'numpy==2.0.0rc1' || '' }}

@ogrisel
Copy link
Member

ogrisel commented Apr 8, 2024

For numpy 2 at test time, we have the scipy-dev nightly runs doing that for a while, and in addition I'll test the wheels locally on a couple of platforms before uploading them.

I think this works for installing NumPy 2.0 for the Python 3.12 tests:
CIBW_TEST_REQUIRES: pytest pandas ${{ matrix.python == 312 && 'numpy==2.0.0rc1' || '' }}

I am fine with either option. If we go for the latter, I would like to add an inline comment to explain that this is temporary and can be deleted when building the wheels for future scikit-learn releases.

@jeremiedbb
Copy link
Member Author

Thanks Thomas, let me try that.

I would like to add an inline comment to explain that this is temporary and can be deleted when building the wheels for future scikit-learn releases.

I added a comment, but this is very likely the last 1.4.X release anyway :)

@jeremiedbb
Copy link
Member Author

The py312 builds are now failing. This is expected because we need joblib 1.4, which should be released soon.

Copy link
Member

@thomasjpfan thomasjpfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good.

I hope the ~ 3 hours Windows + Python 3.12 build is a one-off issue: https://github.com/scikit-learn/scikit-learn/actions/runs/8603767124/job/23576402935?pr=28774

@jeremiedbb
Copy link
Member Author

It wasn't. I forgot to backport #28692 although I had labelled it :)
It's needed because the latest scipy release, the one with numpy 2 support, is built against the problematic version of openblas.

@jeremiedbb
Copy link
Member Author

Hum, the build still hangs...

@jeremiedbb
Copy link
Member Author

jeremiedbb commented Apr 9, 2024

I can reproduce locally. Windows amd cpu. The test sklearn/metrics/tests/test_pairwise.py::test_parallel_pairwise_distances_diagonal[float64-euclidean] hangs

Reproducer:

import numpy as np
from sklearn.metrics import pairwise_distances

X = np.random.normal(size=(1000, 10), scale=1e10)

pairwise_distances(X)  # works
pairwise_distances(X, n_jobs=2)  # hangs

Surprisingly I can only reproduce with python 3.12. With 3.11 it works. Below are the outputs of show_versions for 3.11 and 3.12. Nothing is different but the python version.

# Python 3.11

Python dependencies:
      sklearn: 1.4.2
          pip: 23.3.1
   setuptools: 68.2.2
        numpy: 2.0.0rc1
        scipy: 1.13.0
       Cython: 3.0.10
       pandas: None
   matplotlib: None
       joblib: 1.4.0
threadpoolctl: 3.4.0

Built with OpenMP: True

threadpoolctl info:
       user_api: blas
   internal_api: openblas
    num_threads: 16
         prefix: libopenblas
       filepath: C:\Users\J\miniconda3\envs\tst\Lib\site-packages\scipy.libs\libopenblas_v0.3.26-382-gb1e8ba50--72a863714eca5a50b38260dedc0c2f3a.dll
        version: 0.3.26.dev
threading_layer: pthreads
   architecture: Zen

       user_api: openmp
   internal_api: openmp
    num_threads: 16
         prefix: vcomp
       filepath: C:\Users\J\miniconda3\envs\tst\vcomp140.dll
        version: None
# Python 3.12

Python dependencies:
      sklearn: 1.4.2
          pip: 23.3.1
   setuptools: 68.2.2
        numpy: 2.0.0rc1
        scipy: 1.13.0
       Cython: 3.0.10
       pandas: None
   matplotlib: None
       joblib: 1.4.0
threadpoolctl: 3.4.0

Built with OpenMP: True

threadpoolctl info:
       user_api: blas
   internal_api: openblas
    num_threads: 16
         prefix: libopenblas
       filepath: C:\Users\J\miniconda3\envs\tst\Lib\site-packages\scipy.libs\libopenblas_v0.3.26-382-gb1e8ba50--72a863714eca5a50b38260dedc0c2f3a.dll
        version: 0.3.26.dev
threading_layer: pthreads
   architecture: Zen

       user_api: openmp
   internal_api: openmp
    num_threads: 16
         prefix: vcomp
       filepath: C:\Users\J\miniconda3\envs\tst\vcomp140.dll
        version: None

@lesteve
Copy link
Member

lesteve commented Apr 9, 2024

I am a bit puzzled since the wheel on main passes and uses scipy 1.13.0 as well:
https://github.com/scikit-learn/scikit-learn/actions/runs/8610015698/job/23594862032

Also #28692 had no failures in our test but was triggered by a MNE-Python report. Scipy 1.13 actually has a patch for the original bug hence the 0.3.26.dev version in the sklearn.show_versions() output.

All in all I think this is a slightly different bug maybe?

One slightly weird thing I see in your show_versions() output is that there is no BLAS info for the numpy 2.0.0rc1 wheel, this could well be a red herring though ...

Looking a bit at the Python 3.11 build log vs Python 3.12 build log, the Python 3.11 tests against numpy 1.26.4 for some reason. This one may also be another red herring.

@jeremiedbb
Copy link
Member Author

Also #28692 had no failures in our test but was triggered by a MNE-Python report. Scipy 1.13 actually has a patch for the original bug hence the 0.3.26.dev version in the sklearn.show_versions() output.

It's not related to #28692. The reproducer has nothing to do with pairwise_distances_reduction. The issue seems to come from a nesting of python threads and blas calls (gemm)

@jeremiedbb
Copy link
Member Author

jeremiedbb commented Apr 9, 2024

I can reproduce without numpy 2 as well. And regardless of the Python version. Hangs with openblas 0.3.26 and doesn't with openblas 0.3.27. So it really seem to come from the same openblas issue as in #28692, just not at the same place.

Python dependencies:
      sklearn: 1.4.2
          pip: 24.0
   setuptools: 69.2.0
        numpy: 1.26.4
        scipy: 1.13.0
       Cython: 3.0.10
       pandas: None
   matplotlib: None
       joblib: 1.4.0
threadpoolctl: 3.4.0

Built with OpenMP: True

threadpoolctl info:
       user_api: openmp
   internal_api: openmp
    num_threads: 16
         prefix: libomp
       filepath: C:\Users\J\miniconda3\envs\tst\Library\bin\libomp.dll
        version: None

       user_api: blas
   internal_api: openblas
    num_threads: 16
         prefix: libblas
       filepath: C:\Users\J\miniconda3\envs\tst\Library\bin\libblas.dll
        version: 0.3.26
threading_layer: pthreads
   architecture: Zen

       user_api: openmp
   internal_api: openmp
    num_threads: 16
         prefix: vcomp
       filepath: C:\Users\J\miniconda3\envs\tst\vcomp140.dll
        version: None
Python dependencies:
      sklearn: 1.4.2
          pip: 24.0
   setuptools: 69.2.0
        numpy: 1.26.4
        scipy: 1.13.0
       Cython: 3.0.10
       pandas: None
   matplotlib: None
       joblib: 1.4.0
threadpoolctl: 3.4.0

Built with OpenMP: True

threadpoolctl info:
       user_api: openmp
   internal_api: openmp
    num_threads: 16
         prefix: libomp
       filepath: C:\Users\J\miniconda3\envs\tst\Library\bin\libomp.dll
        version: None

       user_api: blas
   internal_api: openblas
    num_threads: 16
         prefix: libblas
       filepath: C:\Users\J\miniconda3\envs\tst\Library\bin\libblas.dll
        version: 0.3.27
threading_layer: pthreads
   architecture: Zen

       user_api: openmp
   internal_api: openmp
    num_threads: 16
         prefix: vcomp
       filepath: C:\Users\J\miniconda3\envs\tst\vcomp140.dll
        version: None

@ogrisel
Copy link
Member

ogrisel commented Apr 9, 2024

Our Windows nightly build do not fail:

https://github.com/scikit-learn/scikit-learn/actions/runs/8610015698/job/23594862032#step:6:2048

but the version of openblas from numpy (v0.3.23-293-gc2f4bdbb, did not have the thread-safety bug yet) is different from the one shipped with scipy (v0.3.26-382-gb1e8ba50, should have the fix for the thread-safety bug).

EDIT: as soon as numpy 2 is released, we might experience the bug in main (possibly after a lockfile update) if numpy 2 final still ships a bad version of OpenBLAS.

@ogrisel
Copy link
Member

ogrisel commented Apr 9, 2024

I opened an issue upstream to ask numpy maintainers to ship the openblas fix in NumPy 2 numpy/numpy#26240

@jeremiedbb
Copy link
Member Author

We had a meeting to discuss how to proceed here. Here are the conclusions:

  • The windows deadlock comes from the same OpenBLAS issue present in version 0.3.26.

  • Some of the previous builds in this PR were testing the wheels against numpy 2 successfully (except from the windows one due to the deadlock). I've removed that and the last builds test the wheels against numpy latest (1.26.4). I did that to have all jobs green and be able to upload the wheels through the gh actions.

@jeremiedbb jeremiedbb enabled auto-merge (squash) April 9, 2024 13:58
@jeremiedbb jeremiedbb merged commit d675b8d into scikit-learn:1.4.X Apr 9, 2024
44 checks passed
@thomasjpfan
Copy link
Member

@jeremiedbb Thank you for pushing this through! 🎉

@jeremiedbb
Copy link
Member Author

oh crap, I squashed

@jeremiedbb
Copy link
Member Author

no big deal since we very likely won't have a 1.4.3 but still 😡

@rgommers
Copy link
Contributor

Thanks all!

Re the openblas vendored in numpy 2.0.0rc1: we'll do a 2.0.0rc2 with openblas 0.3.27 included to fix this, probably next week.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants