CI Move Pyodide CI from Azure to GitHub Actions #29791

Merged
merged 69 commits into from
Mar 26, 2025
69 commits
22aa88f
Test Pyodide 0.27.0a2
agriyakhetarpal Sep 4, 2024
37884ac
Try to fix `js.process` `ModuleNotFoundError`
agriyakhetarpal Sep 4, 2024
d9d43fa
Install Pyodide JS library for testing
agriyakhetarpal Sep 4, 2024
776e2c9
Skip tests that require a network (with `pooch`)
agriyakhetarpal Sep 4, 2024
85d644d
Fix Pyodide NPM installation
agriyakhetarpal Sep 4, 2024
527c151
Don't explicitly set xbuildenv version
agriyakhetarpal Sep 4, 2024
5ba18f0
Install `pyodide-cli`
agriyakhetarpal Sep 4, 2024
d04b38e
Import `sklearn`, try testing without `--pyargs`
agriyakhetarpal Sep 4, 2024
d6708a6
Reinstall xbuildenv manually, install npm-pyodide in doc/
agriyakhetarpal Sep 4, 2024
2760c38
Add TODO note about Azure
agriyakhetarpal Sep 4, 2024
16267d7
Set `PYTHONVERBOSE` and `PYTHONDEBUG` for debugging
agriyakhetarpal Sep 5, 2024
961a9d4
Switch to `maint_tools/` instead of `doc/`
agriyakhetarpal Sep 5, 2024
ee3228e
Clean up changes, use `--pyargs`
agriyakhetarpal Sep 6, 2024
383bf7e
Fix a typo: paralell ➡️ parallel
agriyakhetarpal Sep 6, 2024
7efd95d
Try to run Cython from its pure Python wheel
agriyakhetarpal Sep 6, 2024
cfae5ff
Trigger [pyodide] wheel build, add Cython comment
agriyakhetarpal Sep 6, 2024
3ae370f
Move `tempita.py` to root dir (for now)
agriyakhetarpal Sep 6, 2024
f16a501
Bump verbosity for [pyodide] test suite
agriyakhetarpal Sep 6, 2024
382c8ba
Remove spurious missing `cache` fixture
agriyakhetarpal Sep 6, 2024
049ccca
Trigger [pyodide] tests too
agriyakhetarpal Sep 6, 2024
b398144
Revert "Move `tempita.py` to root dir (for now)"
agriyakhetarpal Sep 9, 2024
3af4ea2
Use `find_program` function for Tempita
agriyakhetarpal Sep 9, 2024
4a45a97
Clean up test dependencies [pyodide]
agriyakhetarpal Sep 9, 2024
745f6f9
Move `tempita` to an inner `meson.build`
agriyakhetarpal Sep 9, 2024
0675412
Revert "Move `tempita` to an inner `meson.build`"
agriyakhetarpal Sep 9, 2024
a9d7e6c
Make tempita usable with `find_program()`
agriyakhetarpal Sep 9, 2024
86247ad
Add Python shebang to `tempita` [pyodide]
agriyakhetarpal Sep 9, 2024
5b4d043
Exclude `_build_utils/` from installation
agriyakhetarpal Sep 9, 2024
8acf67e
Upload [pyodide] wheel artifact for debugging
agriyakhetarpal Sep 9, 2024
ecd3104
Fix `_build_utils/` folder exclusion
agriyakhetarpal Sep 9, 2024
17f1c64
Merge branch 'main' of https://github.com/scikit-learn/scikit-learn i…
lesteve Mar 24, 2025
6759395
on push
lesteve Mar 24, 2025
9a617e1
unneed changes
lesteve Mar 24, 2025
8d96f2a
tweak
lesteve Mar 24, 2025
e02c0e2
Remove unneeded change
lesteve Mar 24, 2025
a62074c
tweak
lesteve Mar 24, 2025
bf8ed31
Drop comments for Emscripten GHA workflow
agriyakhetarpal Mar 24, 2025
24388f1
Merge branch 'main' into updates-for-emscripten-ci
agriyakhetarpal Mar 24, 2025
ffc0cb8
Drop Azure Pyodide job
agriyakhetarpal Mar 24, 2025
16b0761
Drop Pyodide helper scripts
agriyakhetarpal Mar 24, 2025
033c488
Drop Pyodide commit marker
agriyakhetarpal Mar 24, 2025
b4b6ed7
Simplify; use cibuildwheel for Pyodide testing
agriyakhetarpal Mar 24, 2025
345bbed
Add `CIBW_TEST_COMMAND` for Pyodide
agriyakhetarpal Mar 24, 2025
04d850c
Add missing `CIBW_TEST_REQUIRES` for Pyodide
agriyakhetarpal Mar 24, 2025
99104e7
Fix `pytest` invocation: `svra` ➡️ `-svra`
agriyakhetarpal Mar 24, 2025
9f78d4d
`importlib` is now supported
agriyakhetarpal Mar 24, 2025
d7b3254
`test_qda_regularization` now xpasses
agriyakhetarpal Mar 24, 2025
c434599
`test_create_memmap_backed_data` now xpasses
agriyakhetarpal Mar 24, 2025
93ef796
Revert "Drop Pyodide commit marker"
agriyakhetarpal Mar 25, 2025
6e8a6cb
Reinstate [pyodide] commit marker
agriyakhetarpal Mar 25, 2025
03eb460
Don't use a pinned `actions/checkout`
agriyakhetarpal Mar 25, 2025
e93a405
Don't use a pinned `pypa/cibuildwheel` GHA
agriyakhetarpal Mar 25, 2025
f122250
Add more triggers for Pyodide builds
agriyakhetarpal Mar 25, 2025
e4091ec
[pyodide] [azure parallel] np.memmap appear to kind of work with Pyod…
lesteve Mar 25, 2025
6058521
[pyodide] [azure parallel] Use major version pinning
lesteve Mar 25, 2025
c243c16
[pyodide] [azure parallel] Add upload to Anaconda step
lesteve Mar 25, 2025
f19cca9
[pyodide] Remove scikit-learn/scikit-learn guard
lesteve Mar 25, 2025
bf0c3e5
[pyodide] skip tests
lesteve Mar 25, 2025
1e41a61
[pyodide] tweak
lesteve Mar 25, 2025
b6b3e09
[pyodide] tweak
lesteve Mar 25, 2025
5c9b11c
[pyodide] Remove debug changes
lesteve Mar 25, 2025
2fd9b50
Rename [pyodide] wheel artifact
agriyakhetarpal Mar 25, 2025
6802778
[pyodide] Tweak name
lesteve Mar 25, 2025
09ded37
Pin pypa/cibuildwheel and SPNW upload action
agriyakhetarpal Mar 25, 2025
74b7344
Pin pypa/cibuildwheel for CUDA CI
agriyakhetarpal Mar 25, 2025
4831d6d
Pin hashes for all actions in [pyodide] CI job
agriyakhetarpal Mar 25, 2025
8f81fa7
Use major version pinning for official Github actions
lesteve Mar 26, 2025
1db9ebd
[azure parallel] [pyodide]
lesteve Mar 26, 2025
a7bb1e6
revert CUDA CI pin [azure parallel] [pyodide]
lesteve Mar 26, 2025
104 changes: 104 additions & 0 deletions .github/workflows/emscripten.yml
@@ -0,0 +1,104 @@
name: Test Emscripten/Pyodide build

on:
  schedule:
    # Nightly build at 3:42 A.M.
    - cron: "42 3 */1 * *"
  push:
    branches:
      - main
      # Release branches
      - "[0-9]+.[0-9]+.X"
  pull_request:
    branches:
      - main
      - "[0-9]+.[0-9]+.X"
  # Manual run
  workflow_dispatch:

env:
  FORCE_COLOR: 3

concurrency:
  group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
  cancel-in-progress: true

permissions:
  contents: read

jobs:
  check_build_trigger:
    name: Check build trigger
    runs-on: ubuntu-latest
    if: github.repository == 'scikit-learn/scikit-learn'
    outputs:
      build: ${{ steps.check_build_trigger.outputs.build }}
    steps:
      - name: Checkout scikit-learn
        uses: actions/checkout@v4
        with:
          ref: ${{ github.event.pull_request.head.sha }}
          persist-credentials: false

      - id: check_build_trigger
        name: Check build trigger
        shell: bash
        run: |
          set -e
          set -x

          COMMIT_MSG=$(git log --no-merges -1 --oneline)

          # The commit marker "[pyodide]" will trigger the build when required
          if [[ "$GITHUB_EVENT_NAME" == schedule ||
                "$GITHUB_EVENT_NAME" == workflow_dispatch ||
                "$COMMIT_MSG" =~ \[pyodide\] ]]; then
              echo "build=true" >> $GITHUB_OUTPUT
          fi

  build_wasm_wheel:
    name: Build WASM wheel
    runs-on: ubuntu-latest
    needs: check_build_trigger
    if: needs.check_build_trigger.outputs.build
    steps:
      - name: Checkout scikit-learn
        uses: actions/checkout@v4
        with:
          persist-credentials: false

      - uses: pypa/cibuildwheel@d04cacbc9866d432033b1d09142936e6a0e2121a # v2.23.2
        env:
          CIBW_PLATFORM: pyodide
          SKLEARN_SKIP_OPENMP_TEST: "true"
          SKLEARN_SKIP_NETWORK_TESTS: 1
          CIBW_TEST_REQUIRES: "pytest pandas"
          CIBW_TEST_COMMAND: "python -m pytest -svra --pyargs sklearn --durations 20 --showlocals"
Member

It would be nice to upload the Pyodide wheel to anaconda.org when the tests pass, like numpy does. I guess this can be done in a separate PR.

The medium-term goal would be to use the Pyodide wheel on anaconda.org inside our doc JupyterLite.

Contributor Author

Yes, I left it for a follow-up PR, indeed. On being able to use the Pyodide wheel, there are two approaches I have in mind:

Which one of these would you prefer?

Member

My preference would be the following for now because I feel this is the simplest option (we can always change our mind later):

  • upload the Pyodide wheel to anaconda.org after this workflow succeeds (similar to what numpy does, I think)
  • in the JupyterLite notebook, one of the first cells should do something like this (through sphinx-gallery notebook_modification_function)
    import micropip
    # this line is needed as workaround see https://github.com/pyodide/micropip/issues/223
    await micropip.install(["joblib", "threadpoolctl", "scipy"])
    await micropip.install("scikit-learn", index_urls="https://pypi.anaconda.org/scientific-python-nightly-wheels/simple", pre=True)
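
A minimal sketch of how such a cell could be prepended, assuming the hook receives the notebook as an nbformat-style dict together with its filename, which is how sphinx-gallery's `notebook_modification_function` option is called. The helper name `prepend_install_cell`, the cell layout, and the stand-in notebook are illustrative and not taken from this PR:

```python
# Hypothetical helper: prepend a micropip install cell to a notebook dict.
INSTALL_CODE = "\n".join(
    [
        "import micropip",
        "# workaround, see https://github.com/pyodide/micropip/issues/223",
        'await micropip.install(["joblib", "threadpoolctl", "scipy"])',
        "await micropip.install(",
        '    "scikit-learn",',
        '    index_urls="https://pypi.anaconda.org/scientific-python-nightly-wheels/simple",',
        "    pre=True,",
        ")",
    ]
)


def prepend_install_cell(notebook_content, notebook_filename):
    """Insert the install cell at the top of the notebook, in place."""
    install_cell = {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {},
        "outputs": [],
        "source": INSTALL_CODE,
    }
    notebook_content["cells"] = [install_cell] + notebook_content.get("cells", [])


# Usage with a tiny stand-in notebook (not a real gallery example).
notebook = {"cells": [{"cell_type": "markdown", "metadata": {}, "source": "# Example"}]}
prepend_install_cell(notebook, "plot_example.ipynb")
```

The function mutates the dict in place rather than returning a new one, so the caller's notebook object is the one that gets written out.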

One of my concerns with your approach in scikit-image is the size of the wheel (~12M for the scikit-learn Pyodide wheel). On each commit to the repo, we would push an updated wheel, and the size of the scikit-learn.github.io repo would grow quite quickly. To be honest, maybe the size of the Pyodide wheel is actually quite small compared to the size of the website (~300MB) ...

This is the approach I've taken in Add JupyterLite-powered interactive galleries to the scikit-image documentation scikit-image/scikit-image#7644 that you reviewed some time back ;) (I'd be grateful for another look there!)

I'll try to have a closer look although I can't make any promises 😅

Contributor Author

Ah, I see – the increased size of the docs repository would make it quite big (I thought new builds were being overwritten on the gh-pages branch). Also, I see that the size of the zipped docs has been increasing monotonically with each recent version: https://scikit-learn.org/dev/versions.html, so while it's okay for archives, it's indeed nicer if a GitHub repository does not include any more binary files than it needs to.

I'm happy to implement the notebook modifications you suggest once this PR is merged!


      - name: Upload wheel artifact
        uses: actions/upload-artifact@v4
        with:
          name: pyodide_wheel
          path: ./wheelhouse/*.whl
          if-no-files-found: error

  # Push to https://anaconda.org/scientific-python-nightly-wheels/scikit-learn
  # WARNING: this job will overwrite any existing WASM wheels.
  upload-wheels:
    name: Upload scikit-learn WASM wheels to Anaconda.org
    runs-on: ubuntu-latest
    permissions: {}
    needs: [build_wasm_wheel]
    if: github.repository == 'scikit-learn/scikit-learn' && github.event_name != 'pull_request'
    steps:
      - name: Download wheel artifact
        uses: actions/download-artifact@v4
        with:
          path: wheelhouse/
          merge-multiple: true

      - name: Push to Anaconda PyPI index
        uses: scientific-python/upload-nightly-action@82396a2ed4269ba06c6b2988bb4fd568ef3c3d6b # 0.6.1
        with:
          artifacts_path: wheelhouse/
          anaconda_nightly_upload_token: ${{ secrets.SCIKIT_LEARN_NIGHTLY_UPLOAD_TOKEN }}
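
For readers less familiar with the bash test in the `check_build_trigger` job, the decision it makes can be restated as a small Python function. This is only a sketch of the same condition; the function name `should_build` and the sample commit messages are made up for illustration:

```python
import re

# Events that always trigger a build, as in the workflow's bash condition.
ALWAYS_BUILD_EVENTS = {"schedule", "workflow_dispatch"}


def should_build(event_name: str, commit_msg: str) -> bool:
    """Scheduled and manual runs always build; otherwise look for [pyodide]."""
    if event_name in ALWAYS_BUILD_EVENTS:
        return True
    return re.search(r"\[pyodide\]", commit_msg) is not None


# Usage with made-up commit messages in `git log --oneline` form.
nightly = should_build("schedule", "abc1234 Fix typo")
marked_pr = should_build("pull_request", "abc1234 [pyodide] tweak")
plain_push = should_build("push", "abc1234 Fix typo")
```

When the condition is false the step writes nothing to `$GITHUB_OUTPUT`, so the `build` output stays empty and the `if:` on `build_wasm_wheel` skips the job.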
33 changes: 0 additions & 33 deletions azure-pipelines.yml
@@ -89,39 +89,6 @@ jobs:
LOCK_FILE: './build_tools/azure/pylatest_free_threaded_linux-64_conda.lock'
COVERAGE: 'false'

- job: Linux_Nightly_Pyodide
  pool:
    vmImage: ubuntu-22.04
  variables:
    # Need to match Python version and Emscripten version for the correct
    # Pyodide version. For example, for Pyodide version 0.27.2, see
    # https://github.com/pyodide/pyodide/blob/0.27.2/Makefile.envs
    PYODIDE_VERSION: '0.27.2'
    EMSCRIPTEN_VERSION: '3.1.58'
    PYTHON_VERSION: '3.12.7'

  dependsOn: [git_commit, linting]
  condition: |
    and(
      succeeded(),
      not(contains(dependencies['git_commit']['outputs']['commit.message'], '[ci skip]')),
      or(eq(variables['Build.Reason'], 'Schedule'),
         contains(dependencies['git_commit']['outputs']['commit.message'], '[pyodide]'
        )
      )
    )
  steps:
    - task: UsePythonVersion@0
      inputs:
        versionSpec: $(PYTHON_VERSION)
        addToPath: true

    - bash: bash build_tools/azure/install_pyodide.sh
      displayName: Build Pyodide wheel

    - bash: bash build_tools/azure/test_script_pyodide.sh
      displayName: Test Pyodide wheel

# Will run all the time regardless of linting outcome.
- template: build_tools/azure/posix.yml
parameters:
20 changes: 0 additions & 20 deletions build_tools/azure/install_pyodide.sh

This file was deleted.

53 changes: 0 additions & 53 deletions build_tools/azure/pytest-pyodide.js

This file was deleted.

9 changes: 0 additions & 9 deletions build_tools/azure/test_script_pyodide.sh

This file was deleted.

4 changes: 0 additions & 4 deletions sklearn/_loss/tests/test_loss.py
@@ -29,7 +29,6 @@
)
from sklearn.utils import assert_all_finite
from sklearn.utils._testing import create_memmap_backed_data, skip_if_32bit
from sklearn.utils.fixes import _IS_WASM

ALL_LOSSES = list(_LOSSES.values())

@@ -390,9 +389,6 @@ def test_loss_dtype(

    Also check that input arrays can be readonly, e.g. memory mapped.
    """
    if _IS_WASM and readonly_memmap:  # pragma: nocover
        pytest.xfail(reason="memmap not fully supported")

    loss = loss()
    # generate a y_true and raw_prediction in valid range
    n_samples = 5
2 changes: 1 addition & 1 deletion sklearn/datasets/tests/test_openml.py
@@ -1475,7 +1475,7 @@ def _mock_urlopen_raise(request, *args, **kwargs):
        (False, "pandas"),
    ],
)
def test_fetch_openml_verify_checksum(monkeypatch, as_frame, cache, tmpdir, parser):
def test_fetch_openml_verify_checksum(monkeypatch, as_frame, tmpdir, parser):
Contributor Author

Here, cache was a spurious argument that was unused throughout the file. It broke the Pyodide tests with an error at collection time, because pytest treated the argument as a fixture request.

"""Check that the checksum is working as expected."""
if as_frame or parser == "pandas":
pytest.importorskip("pandas")
2 changes: 0 additions & 2 deletions sklearn/tests/test_common.py
@@ -60,7 +60,6 @@
    check_transformer_get_feature_names_out_pandas,
    parametrize_with_checks,
)
from sklearn.utils.fixes import _IS_WASM


def test_all_estimator_no_base_class():
@@ -134,7 +133,6 @@ def test_check_estimator_generate_only_deprecation():
    assert isgenerator(all_instance_gen_checks)


@pytest.mark.xfail(_IS_WASM, reason="importlib not supported for Pyodide packages")
Member

This is weird: __import__ is not supported in Pyodide. I just tried __import__('joblib') in the Pyodide console and it fails.

My guess is that this test somehow works because pkgutil.walk_packages populates sys.modules, and __import__("bla") works inside Pyodide if "bla" is already in sys.modules. Oh well 🤷 ...

Contributor Author

🤔 I'm not sure why that's the case. I tried it just now – both in JupyterLite and in the Pyodide (stable) console – and I was able to __import__("joblib") to get <module 'joblib' from '/lib/python3.12/site-packages/joblib/__init__.py'>. I did need to install joblib with micropip, of course. Could you please check this again, just in case I might be missing something?

Either way, it's nice that the test now works – I hope!

Member

I think micropip.install('joblib') makes sys.modules['joblib'] exist, and so __import__ works.

In any case this is not really important since the test appears to pass now 🤷.
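
The general mechanism discussed here can be checked in plain CPython: `__import__` returns whatever is already registered in `sys.modules` under the requested name before it searches for a package. The sketch below shows only that behavior. Whether it explains the Pyodide observation is the reviewer's guess and is not verified here, and the module name is a made-up placeholder:

```python
import sys
import types

MODULE_NAME = "bla_demo_not_installed"  # hypothetical name, not a real package

# With no entry in sys.modules and no installed package, the import fails.
try:
    __import__(MODULE_NAME)
    import_failed = False
except ModuleNotFoundError:
    import_failed = True

# Once a module object is registered under that name, __import__ returns it
# without searching for the package on disk.
placeholder = types.ModuleType(MODULE_NAME)
sys.modules[MODULE_NAME] = placeholder
try:
    found = __import__(MODULE_NAME)
finally:
    del sys.modules[MODULE_NAME]  # leave the interpreter state as we found it
```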

@pytest.mark.filterwarnings(
    "ignore:Since version 1.0, it is not needed to import "
    "enable_hist_gradient_boosting anymore"
8 changes: 0 additions & 8 deletions sklearn/tests/test_discriminant_analysis.py
@@ -21,7 +21,6 @@
    assert_array_almost_equal,
    assert_array_equal,
)
from sklearn.utils.fixes import _IS_WASM

# Data is just 6 separable points in the plane
X = np.array([[-2, -1], [-1, -1], [-1, -2], [1, 1], [1, 2], [2, 1]], dtype="f")
@@ -594,13 +593,6 @@ def test_qda_store_covariance():
)


@pytest.mark.xfail(
    _IS_WASM,
    reason=(
        "no floating point exceptions, see"
        " https://github.com/numpy/numpy/pull/21895#issuecomment-1311525881"
    ),
)
def test_qda_regularization():
    # The default is reg_param=0. and will cause issues when there is a
    # constant variable.
1 change: 0 additions & 1 deletion sklearn/utils/tests/test_testing.py
@@ -854,7 +854,6 @@ def test_tempmemmap(monkeypatch):
    assert registration_counter.nb_calls == 2


@pytest.mark.xfail(_IS_WASM, reason="memmap not fully supported")
def test_create_memmap_backed_data(monkeypatch):
    registration_counter = RegistrationCounter()
    monkeypatch.setattr(atexit, "register", registration_counter)