Skip to content

BLD Improve make clean for edge case situations #29413

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Jul 4, 2024

Conversation

lesteve
Copy link
Member

@lesteve lesteve commented Jul 4, 2024

So it turns out a better make clean is needed as was discussed around #29296 (comment)

When using meson editable mode and switching between numpy<2 and numpy>=2 in the same environment, you can get weird build errors and the way to get out of it is to delete the meson build folder

Removing the default meson build folder meson/cp312 if you are using Python 3.12 seems to be the simplest solution. I tried ninja clean -C build/cp312 which is a softer clean (my personal newbie understanding).

cc @rgommers for visibility or in case off the top of your head this seems like something that should just work in principle. This is completely fine to use this work-around to be honest. It could well be the case that there are some issues with our meson.build files, hard to tell ...

I am able to reproduce with a snippet like this

git clone https://github.com/scikit-learn/scikit-learn --depth 1
cd scikit-learn

conda create -n testenv meson-python cython 'numpy>=2' scipy -y
conda activate testenv

# just to make sure to start from scratch
rm -rf build

# Build with numpy>=2
pip install --verbose --no-build-isolation --editable . --check-build-dependencies --config-settings editable-verbose=true
# works fine
python -c 'import sklearn'

# install numpy<2
conda install 'numpy<2' -y
# compilation error
python -c 'import sklearn'

With the output:

meson-python: building scikit-learn: /home/lesteve/micromamba/envs/scikit-learn-dev/bin/ninja
[1/34] Compiling C++ object sklearn/utils/_vector...enerated_sklearn_utils__vector_sentinel.pyx.cpp.o
FAILED: sklearn/utils/_vector_sentinel.cpython-312-x86_64-linux-gnu.so.p/meson-generated_sklearn_utils__vector_sentinel.pyx.cpp.o 
ccache c++ -Isklearn/utils/_vector_sentinel.cpython-312-x86_64-linux-gnu.so.p -Isklearn/utils -I../../sklearn/utils -I../../../../micromamba/envs/scikit-learn-dev/lib/python3.12/site-packages/numpy/_core/include -Isklearn -I/home/lesteve/micromamba/envs/scikit-learn-dev/include/python3.12 -fvisibility=hidden -fvisibility-inlines-hidden -fdiagnostics-color=always -DNDEBUG -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -std=c++14 -O3 -fPIC -DNPY_NO_DEPRECATED_API=NPY_1_9_API_VERSION -MD -MQ sklearn/utils/_vector_sentinel.cpython-312-x86_64-linux-gnu.so.p/meson-generated_sklearn_utils__vector_sentinel.pyx.cpp.o -MF sklearn/utils/_vector_sentinel.cpython-312-x86_64-linux-gnu.so.p/meson-generated_sklearn_utils__vector_sentinel.pyx.cpp.o.d -o sklearn/utils/_vector_sentinel.cpython-312-x86_64-linux-gnu.so.p/meson-generated_sklearn_utils__vector_sentinel.pyx.cpp.o -c sklearn/utils/_vector_sentinel.cpython-312-x86_64-linux-gnu.so.p/sklearn/utils/_vector_sentinel.pyx.cpp
sklearn/utils/_vector_sentinel.cpython-312-x86_64-linux-gnu.so.p/sklearn/utils/_vector_sentinel.pyx.cpp:1239:10: fatal error: numpy/arrayobject.h: No such file or directory
 1239 | #include "numpy/arrayobject.h"
      |          ^~~~~~~~~~~~~~~~~~~~~
compilation terminated.

@betatim likely bumped into a similar issue but the compilation error was different:

sklearn/utils/_vector_sentinel.cpython-310-darwin.so.p/sklearn/utils/_vector_sentinel.pyx.cpp:4799:13: error: use of undeclared identifier 'PyDataType_ELSIZE'
    __pyx_r = PyDataType_ELSIZE(__pyx_v_self);
              ^
  sklearn/utils/_vector_sentinel.cpython-310-darwin.so.p/sklearn/utils/_vector_sentinel.pyx.cpp:4833:13: error: use of undeclared identifier 'PyDataType_ALIGNMENT'
    __pyx_r = PyDataType_ALIGNMENT(__pyx_v_self);

@lesteve lesteve changed the title BLD Improve make clean for edge cases situation BLD Improve make clean for edge case situations Jul 4, 2024
Copy link

github-actions bot commented Jul 4, 2024

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 19579ec. Link to the linter CI: here

Copy link
Member

@betatim betatim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it and adding it to the Makefile kind of institutionalises the knowledge. I'm not sure what the downsides are though, in particular for more experienced people than me.

Co-authored-by: Tim Head <betatim@gmail.com>
Copy link
Member

@jeremiedbb jeremiedbb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I knew I was not paranoid ! :D

@@ -1,6 +1,7 @@
# simple makefile to simplify repetitive build env management tasks under posix

PYTHON ?= python
DEFAULT_MESON_BUILD_DIR = build/cp$(shell python -c 'import sys; print(f"{sys.version_info.major}{sys.version_info.minor}")' )
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not just remove build/ ? It's not like the build takes hours.

Copy link
Member Author

@lesteve lesteve Jul 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In principle you can have different build folders under build and this does not make too much sense to delete them all.

Personally, these days I have many different scikit-learn environments that I create from lock-file to try to reproduce CI issues (especially this week 😓) so I would not want to delete them all and recompile from scratch when I go back to this environmen. Right now I have 3 folders build/cp39, build/cp311, build/cp312.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In your setup would you get weird error messages in the other envs if you deleted build/ or would it "just" rebuild?

For me it is fine to only delete one sub-directory, but I'd also be happy to just remove all of build/

Copy link
Member Author

@lesteve lesteve Jul 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You get error message like this if you delete your build folder:

❯ python -c 'import sklearn'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "<frozen importlib._bootstrap>", line 1360, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1322, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1262, in _find_spec
  File "/home/lesteve/micromamba/envs/testenv/lib/python3.12/site-packages/_scikit_learn_editable_loader.py", line 310, in find_spec
    tree = self._rebuild()
           ^^^^^^^^^^^^^^^
  File "/home/lesteve/micromamba/envs/testenv/lib/python3.12/site-packages/_scikit_learn_editable_loader.py", line 338, in _rebuild
    if self._work_to_do(env):
       ^^^^^^^^^^^^^^^^^^^^^
  File "/home/lesteve/micromamba/envs/testenv/lib/python3.12/site-packages/_scikit_learn_editable_loader.py", line 325, in _work_to_do
    p = subprocess.run(dry_run_build_cmd, cwd=self._build_path, env=env, capture_output=True)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lesteve/micromamba/envs/testenv/lib/python3.12/subprocess.py", line 548, in run
    with Popen(*popenargs, **kwargs) as process:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lesteve/micromamba/envs/testenv/lib/python3.12/subprocess.py", line 1026, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/home/lesteve/micromamba/envs/testenv/lib/python3.12/subprocess.py", line 1955, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: '/home/lesteve/scikit-learn/build/cp312'

So this would not be ideal:

  • env1 is broken let's rm -rf build and do make dev to rebuild from scratch
  • OK in the mean time, let's switch to env2 and now work on this other thing
  • ah OK weird error in env2 since I rm -rf build ah I did not think of this 😓
  • env2 needs to be rebuilt too oh well one of this day ...

@lesteve
Copy link
Member Author

lesteve commented Jul 4, 2024

I'm not sure what the downsides are though, in particular for more experienced people than me.

I don't think there is any downside, I have certainly been removing build folder from time to time just in case and never noticed any issue.

If you are a super experienced meson/ninja user and know how to fix this kind of things:

  • first I am interested to know how 😉
  • second don't use make clean and use the super duper pip/meson/ninja combo commands you know about

@betatim
Copy link
Member

betatim commented Jul 4, 2024

I knew I was not paranoid ! :D

Build systems can't be trusted

@betatim
Copy link
Member

betatim commented Jul 4, 2024

I'll merge this now. If Ralf has wisdom to add that leads to us changing our mine we can make a new PR.

@betatim betatim merged commit dd88e21 into scikit-learn:main Jul 4, 2024
30 checks passed
@lesteve lesteve deleted the better-make-clean branch July 4, 2024 11:59
@rgommers
Copy link
Contributor

rgommers commented Jul 4, 2024

cc @rgommers for visibility or in case off the top of your head this seems like something that should just work in principle.

It mostly works in principle, but it's not really possible to guarantee it when a dependency changes as much as between numpy 1.26 and 2.0.

Basically, if header content in numpy changes, ninja is smart enough to rebuild automatically (e.g., edit numpy/_core/include/numpy/numpyconfig.h in site-packages by adding a single blank line, and everything that depends on the numpy C API gets rebuilt).

The problem here though is that build.ninja contains:

-I../../../../micromamba/envs/scikit-learn-dev/lib/python3.12/site-packages/numpy/_core/include

and in numpy 1.26, the numpy/arrayobject.h header lives under numpy/core rather than numpy/_core. So this fails because the header isn't on the compiler's search path anymore (if it was, the rebuild would work). This is a line in build.ninja, so it can only be fixed by running meson setup --reconfigure again, not by running only ninja. And an editable build does the latter (the former is too slow).

So I think a make clean command makes sense here. Maybe reconfiguring would be slightly faster, but not worth trying I think. Better to document that if you make major updates to dependencies, things may break and if that happens, run the clean command.

snath-xoc pushed a commit to snath-xoc/scikit-learn that referenced this pull request Jul 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants