Skip to content

Add wasm CI #29093

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 12 commits into
base: main
Choose a base branch
from
Draft

Add wasm CI #29093

wants to merge 12 commits into from

Conversation

QuLogic
Copy link
Member

@QuLogic QuLogic commented Nov 6, 2024

PR summary

This adds wasm builds on CI through cibuildwheel; these are geared towards testing only, and not publishing, as they use cibuildwheel's test command, which requires shipping the test result images (which we don't do for release wheels.)

The wasm platform has several constraints:

  • Though we always import threading (MNT: Remove dummy_threading because threading is always available #23073), it's not actually supported Threading support pyodide/pyodide#237 and we need to either ignore it, or skip tests that use it.
  • Similarly, subprocesses don't work, so those tests must also be skipped; this also applies to the converters for SVG and PDF. For some reason shutil.which returns a real path, but due to the below, fails to run, so we need to explicitly catch that.
  • File system access is sandboxed, so we must skip attempting to read "Linux" system fonts, as they won't work.
  • wasm is 32-bit, so several tests with large allocations must be skipped.
  • This also leads to the old mlab.stride_windows implementation, as the NumPy stride_tricks implementation OOMs. This is now in Use old stride_windows implementation on 32-bit builds #29115 for separate review.

This gets us to about 80% tests passing, 19% skipped (mostly the SVG/PDF, I think), and a small handful that fail:

  • 4 tests have some floating point tolerance issues -> probably will have to allow the tolerance; these also fail for new Fedora on other arch/FreeType and I've been patching it downstream
  • 3 PDF tests fail because pyodide doesn't seem to support file.seek/file.tell? That seems an odd limitation, especially for the latter. -> patched here to avoid os.devnull
  • 2 PDF tests seem to not fail, but see an unclosed file leak; this is possibly related to the above test failures. -> related to above
  • 2 mlab tests don't raise numpy.linalg.LinAlgError; haven't investigated at all. -> @agriyakhetarpal says to skip these, so I have.
  • tests/test_simplification.py::test_throw_rendering_complexity_exceeded throws MemoryError instead of OverflowError; probably this will have to be skipped like the OOM ones, but I haven't checked the implementation.
  • tests/test_skew.py::test_skew_rectangle[png] fails to allocate the (8, 8)-inch figure, but there are several tests with a larger figure, so I'm not sure why this one in particular fails. -> I've opened TST: Calculate RMS and diff image in C++ #29102 to optimize the image comparisons, which will fix these.

cc @agriyakhetarpal

Closes #27870

PR checklist

This checks `os.geteuid`, but this is only available on Unix, not
Emscripten or WASM.
In Enscripten/WASM, this module can be imported, but doesn't work, so we
can't fall back to `dummy_threading` at import-time as we used to do.
The file system is either sandboxed or there are simply no other fonts,
so limit ourselves to our pre-shipped fonts.
On WASM, which is wholly 32-bit, casting unsigned int to signed long is
a narrowing conversion, which it seems to treat as an error. The Agg
buffer cannot be over `(1<<23)` in width or height, so this cast is
safe even with the smaller sizes.
This was originally for i686 on Fedora, but is now applicable to WASM,
which is 32-bit. The older implementation doesn't OOM.
@QuLogic QuLogic added the CI: Run cibuildwheel Run wheel building tests on a PR label Nov 6, 2024
@github-actions github-actions bot added backend: agg topic: animation and removed CI: Run cibuildwheel Run wheel building tests on a PR labels Nov 6, 2024
@QuLogic QuLogic added the CI: Run cibuildwheel Run wheel building tests on a PR label Nov 6, 2024
@github-actions github-actions bot removed the CI: Run cibuildwheel Run wheel building tests on a PR label Nov 6, 2024
@QuLogic
Copy link
Member Author

QuLogic commented Nov 6, 2024

The last commit is a hack, and I don't like that it's necessary. To avoid the null function or function signature mismatch at runtime, I've had to unhide all the FreeType symbols. This seems to be a bug in the toolchain somewhere.

This also does not fix anything in browsers; I'm unable to run any code that requires FreeType with the same exception.

@QuLogic
Copy link
Member Author

QuLogic commented Nov 6, 2024

2 PDF tests seem to not fail, but see an unclosed file leak; this is possibly related to the above test failures.

I've figured out what these are:

  • PdfPages._ensure_file creates a PdfFile when saving a figure.
  • This PdfFile.__init__ writes some information to the output about the overall PDF structure.
  • Some of this involves XObjects, which reference file offsets, and use fh.tell().
  • Pyodide for some strange reason doesn't support this, and raises an exception.
  • Because this is in PdfFile.__init__, no variable is assigned in PdfPages._ensure_file, and the file object is left floating.
  • Thus, even though PdfFile implements __exit__ and the test uses a context manager, the file is never closed.
  • Since pytest turns resource warnings into errors, this leaks into some other test.

tests/test_skew.py::test_skew_rectangle[png] fails to allocate the (8, 8)-inch figure, but there are several tests with a larger figure, so I'm not sure why this one in particular fails.

And this one doesn't seem to have failed on CI, so it does seem to be something to do with memory allocation patterns, and not something inherent to this test.

@agriyakhetarpal
Copy link

Thanks for this, @QuLogic! Some quick comments for now on the "2 mlab tests don't raise numpy.linalg.LinAlgError; haven't investigated at all." statement and on other things you've noticed:

This is generally because of the lack of observability for floating-point exceptions in the WASM runtime, so these tests will have to be skipped for now. Similar issues have been noted here: pyodide/pyodide#4859 and in previous conversations, too.

FreeType in pygame-ce isn't supported well either: pygame-community/pygame-ce#1967. I don't know if that's related or helpful.

these are geared towards testing only, and not publishing

While support for PyPI is going to demand a PEP and some effort across packaging tooling, in the meantime, it would be great for Matplotlib to publish these WASM wheels when ready to the https://anaconda.org/scientific-python-nightly-wheels index. We're uploading them for NumPy, statsmodels, PyWavelets (pywt), scikit-image, and pandas right now already – and more packages are planned; they shall potentially be used for interactive documentation with JupyterLite/jupyterlite-sphinx.

@agriyakhetarpal
Copy link

xref pyodide/pyodide#4510 because this effort will be helpful there, too.

@QuLogic
Copy link
Member Author

QuLogic commented Nov 6, 2024

This is generally because of the lack of observability for floating-point exceptions in the WASM runtime, so these tests will have to be skipped for now. Similar issues have been noted here: pyodide/pyodide#4859 and in previous conversations, too.

OK, I will skip them, then.

FreeType in pygame-ce isn't supported well either: pygame-community/pygame-ce#1967. I don't know if that's related or helpful.

Unfortunately, I don't see anything obvious there; they appear to be changing their extension itself, but not anything on the FreeType side of the build.

these are geared towards testing only, and not publishing

While support for PyPI is going to demand a PEP and some effort across packaging tooling, in the meantime, it would be great for Matplotlib to publish these WASM wheels when ready to the https://anaconda.org/scientific-python-nightly-wheels index. We're uploading them for NumPy, statsmodels, PyWavelets (pywt), scikit-image, and pandas right now already – and more packages are planned; they shall potentially be used for interactive documentation with JupyterLite/jupyterlite-sphinx.

We can publish there; it's just that these include the test images and are 5 times bigger as a result.

This adds a `pyproject.toml` config for it, so you can replicate locally
with cibuildwheel.
@QuLogic QuLogic added the CI: Run cibuildwheel Run wheel building tests on a PR label Nov 6, 2024
@agriyakhetarpal
Copy link

agriyakhetarpal commented Nov 6, 2024

We can publish there; it's just that these include the test images and are 5 times bigger as a result.

I see. In the in-tree recipe, we provide the test images in a separate test_data folder, but I am not sure when they were separated out. They don't seem to match the images in /matplotlib/lib/matplotlib/mpl-data/, though, so maybe we can work with different combinations for testing and uploading?

That is, we could use the standard pyodide build invocation to build and test Matplotlib WASM wheels out-of-tree in CI, but use cibuildwheel and leave out the "tests" install tag when publishing to Anaconda.org in order to get smaller wheels (plus, not test when building with cibuildwheel). Another reason why this could make sense, besides making the wheel smaller by not including ancillary data, is that the Pyodide version available in cibuildwheel remains coupled to the cibuildwheel version – at least, this will be the case until I finish pypa/cibuildwheel#2002. We currently don't have a minimum supported version policy or LTS releases for Pyodide, so the latest Pyodide version is the one we can reasonably support and recommend using – and the pyodide xbuildenv install command via pyodide-build will more freely allow you to choose a particular (compatible) Pyodide version in case you wish to pin it.

P.S. this is under the assumption that the images listed under /matplotlib/mpl-data/ are required in the test suite.

@QuLogic
Copy link
Member Author

QuLogic commented Nov 6, 2024

This also does not fix anything in browsers; I'm unable to run any code that requires FreeType with the same exception.

So this was unrelated now. The problem was in the debugger I had "Pause on caught exceptions" enabled, which for some reason made it crash.

Now, with the following page (note I symlinked the wheel name to a generic one so I didn't need to edit the page repeatedly):

Example HTML page
<!DOCTYPE html>
<html>
  <head>
    <script src="https://cdn.jsdelivr.net/pyodide/v0.26.3/full/pyodide.js"></script>
  </head>

  <body>
    <h1>Pyodide test page</h1>
    Open your browser console to see Pyodide output
    <br />
    <button id="run" onclick="evaluatePython()">Run</button>

    <h2>Output:</h2>
    <textarea id="output" style="width: 50%;" rows="100" disabled></textarea>
    <canvas id="canvas" style="vertical-align: top;"></canvas>
    <script type="text/javascript">
      const run = document.getElementById("run");
      const output = document.getElementById("output");
      const canvas = document.getElementById("canvas");

      output.value = "Initializing... ";
      run.disabled = true;
      async function main() {
        let pyodide = await loadPyodide();
        pyodide.setDebug(true);
        pyodide.setStdout({
          batched: (s) => output.value += `${s}\n`,
        });
        pyodide.setStderr({
          batched: (s) => output.value += `[ERR] ${s}\n`,
        });
        await pyodide.loadPackage("micropip");
        let micropip = pyodide.pyimport("micropip");
        await micropip.install("http://127.0.0.1:8000/matplotlib-3.10.0.dev0-cp312-cp312-pyodide_2024_0_wasm32.whl");
        output.value += "Done!\n";
        run.disabled = false;
        return pyodide;
      }
      let pyodideReadyPromise = main();

      async function runStuff(code) {
        let pyodide = await pyodideReadyPromise;
        output.value += `>>>${code}\n`;
        try {
          return pyodide.runPython(code);
        } catch (err) {
          output.value += `[EXC]:${err}`;
        }
      }

      async function evaluatePython() {
        await runStuff(`
            import sys
            print(sys.version)
        `);
        await runStuff(`
            import matplotlib
            print(matplotlib.__version__)
        `);
        const fig = await runStuff(`
            import matplotlib.pyplot as plt
            import numpy as np

            x = np.linspace(0, 10, 1000)
            y = np.sin(x * np.pi)

            fig, ax = plt.subplots()
            fig.text(0.5, 0.5, 'Test',
                     fontsize=24, horizontalalignment='center', verticalalignment='center')
            ax.plot(x, y)
            print(fig)

            fig
        `);

        fig.canvas.draw();
        const width = fig.bbox.width;
        const height = fig.bbox.height;
        const buffer = fig.canvas.buffer_rgba();
        const data = buffer.getBuffer("u8clamped");

        const imageData = new ImageData(data.data, width, height);
        canvas.width = width;
        canvas.height = height;
        const ctx = canvas.getContext("2d");
        ctx.putImageData(imageData, 0, 0);
        data.release();
      }
    </script>
  </body>
</html>
I am able to run the produced wheel and output a figure to a canvas:

image

It's probably not the most efficient implementation, but it's enough to prove the wasm wheel is working.

@hoodmane
Copy link
Contributor

hoodmane commented Nov 7, 2024

Pyodide doesn't seem to support file.seek/file.tell? That seems an odd limitation, especially for the latter.

I'm surprised to hear this, what's the best way to reproduce the failure?

@hoodmane
Copy link
Contributor

hoodmane commented Nov 7, 2024

Thanks so much @QuLogic for working on this!

Comment on lines +332 to +334
CFLAGS = "-fexceptions"
CXXFLAGS = "-fexceptions"
LDFLAGS = "-fexceptions"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Someone was suggesting recently that we ought to make this the default.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, Henry was suggesting it in our Discord server, IIRC, through which pybind/pybind11#5298 came up (as linked in the code comment above)

@hoodmane
Copy link
Contributor

hoodmane commented Nov 7, 2024

Running this myself I'm getting a crash in ft2font:

.venv-pyodide/lib/python3.12/site-packages/matplotlib/tests/test_agg.py .Pyodide has suffered a fatal error. Please report this to the Pyodide maintainers.
The cause of the fatal error was:
RuntimeError: null function or function signature mismatch
    at ft2font.cpython-312-wasm32-emscripten.so.af_autofitter_load_glyph (wasm://wasm/ft2font.cpython-312-wasm32-emscripten.so-0044c45a:wasm-function[558]:0x7f032)
    at ft2font.cpython-312-wasm32-emscripten.so.FT_Load_Glyph (wasm://wasm/ft2font.cpython-312-wasm32-emscripten.so-0044c45a:wasm-function[192]:0xfad4)
    at ft2font.cpython-312-wasm32-emscripten.so.FT2Font::load_char_with_fallback(FT2Font*&, unsigned int&, std::__2::vector<FT_GlyphRec_*, std::__2::allocator<FT_GlyphRec_*>>&, std::__2::unordered_map<long, FT2Font*, std::__2::hash<long>, std::__2::equal_to<long>, std::__2::allocator<std::__2::pair<long const, FT2Font*>>>&, std::__2::unordered_map<unsigned int, FT2Font*, std::__2::hash<unsigned int>, std::__2::equal_to<unsigned int>, std::__2::allocator<std::__2::pair<unsigned int const, FT2Font*>>>&, long, int, int&, int&, std::__2::set<char*, std::__2::less<char*>, std::__2::allocator<char*>>&, bool) (wasm://wasm/ft2font.cpython-312-wasm32-emscripten.so-0044c45a:wasm-function[190]:0xf4ac)
    at invoke_iiiiiiiiiiiii (/home/rchatham/Documents/programming/pyodide/dist/pyodide.asm.js:103252:40)
    at ft2font.cpython-312-wasm32-emscripten.so.PyFT2Font_set_text(PyFT2Font*, std::__2::basic_string_view<char32_t, std::__2::char_traits<char32_t>>, double, std::__2::variant<LoadFlags, int>) (wasm://wasm/ft2font.cpython-312-wasm32-emscripten.so-0044c45a:wasm-function[366]:0x43795)
    at ft2font.cpython-312-wasm32-emscripten.so.void pybind11::cpp_function::initialize<pybind11::array_t<double, 16> (*&)(PyFT2Font*, std::__2::basic_string_view<char32_t, std::__2::char_traits<char32_t>>, double, std::__2::variant<LoadFlags, int>), pybind11::array_t<double, 16>, PyFT2Font*, std::__2::basic_string_view<char32_t, std::__2::char_traits<char32_t>>, double, std::__2::variant<LoadFlags, int>, pybind11::name, pybind11::is_method, pybind11::sibling, pybind11::arg, pybind11::arg_v, pybind11::arg_v, char const*>(pybind11::array_t<double, 16> (*&)(PyFT2Font*, std::__2::basic_string_view<char32_t, std::__2::char_traits<char32_t>>, double, std::__2::variant<LoadFlags, int>), pybind11::array_t<double, 16> (*)(PyFT2Font*, std::__2::basic_string_view<char32_t, std::__2::char_traits<char32_t>>, double, std::__2::variant<LoadFlags, int>), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&, pybind11::arg const&, pybind11::arg_v const&, pybind11::arg_v const&, char const* const&)::'lambda'(pybind11::detail::function_call&)::__invoke(pybind11::detail::function_call&) (wasm://wasm/ft2font.cpython-312-wasm32-emscripten.so-0044c45a:wasm-function[365]:0x42d4a)
    at invoke_ii (/home/rchatham/Documents/programming/pyodide/dist/pyodide.asm.js:102969:40)
    at ft2font.cpython-312-wasm32-emscripten.so.pybind11::cpp_function::dispatcher(_object*, _object*, _object*) (wasm://wasm/ft2font.cpython-312-wasm32-emscripten.so-0044c45a:wasm-function[295]:0x224a9)
    at _PyEM_TrampolineCall_JS (/home/rchatham/Documents/programming/pyodide/dist/pyodide.asm.js:7289:33)
    at pyodide.asm.wasm.cfunction_call (wasm://wasm/pyodide.asm.wasm-029ee13e:wasm-function[2190]:0x20b80b) {
  pyodide_fatal_error: true
}

@QuLogic
Copy link
Member Author

QuLogic commented Nov 7, 2024

Pyodide doesn't seem to support file.seek/file.tell? That seems an odd limitation, especially for the latter.

I'm surprised to hear this, what's the best way to reproduce the failure?

Looking closer, these two tests use Path(os.devnull) as output filename. It appears that this specifically is what doesn't allow seek/tell (though the exception text confusingly references the opposite operation):

>>> import os
>>> fh = open(os.devnull, 'w')
>>> fh.tell()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
io.UnsupportedOperation: underlying stream is not seekable

I pushed a change to avoid using os.devnull.

@QuLogic
Copy link
Member Author

QuLogic commented Nov 8, 2024

Running this myself I'm getting a crash in ft2font:

Are you building with cibuildwheel? You need all the flags in there, or else the compiler is buggy with things somehow. Also, if you already have a checkout, Meson won't recreate the subprojects, so you'd need to apply the change to the FreeType meson.build in the cached copy (or delete subprojects/freetype-2.6.1 to recreate it).

@github-actions github-actions bot removed the CI: Run cibuildwheel Run wheel building tests on a PR label Nov 8, 2024
On wasm, this file doesn't support seeking, which is sometimes necessary
depending on file type.
@QuLogic QuLogic added the CI: Run cibuildwheel Run wheel building tests on a PR label Nov 8, 2024
@hoodmane
Copy link
Contributor

hoodmane commented Nov 8, 2024

Are you building with cibuildwheel? You need all the flags in there

I'm not building with cibuildwheel because there are test failures and I don't know how to get a copy of the wheel to inspect/debug if tests fail. @henryiii is there an easy way to do this?

Anyways I did copy the exact build command out of the log and copied the environment changes. So I think I'm building it right.

if you already have a checkout, Meson won't recreate the subprojects

I bet this is the problem.

@QuLogic
Copy link
Member Author

QuLogic commented Nov 8, 2024

Are you building with cibuildwheel? You need all the flags in there

I'm not building with cibuildwheel because there are test failures and I don't know how to get a copy of the wheel to inspect/debug if tests fail.

You can skip the tests by building with:

CIBW_TEST_COMMAND= cibuildwheel --platform pyodide

@QuLogic
Copy link
Member Author

QuLogic commented Nov 8, 2024

  • tests/test_skew.py::test_skew_rectangle[png] fails to allocate the (8, 8)-inch figure, but there are several tests with a larger figure, so I'm not sure why this one in particular fails.

More of these are happening now that the leaky file handle was fixed. I've opened #29102 to minimize memory usage in the image comparisons, and that should fix these.

@hoodmane
Copy link
Contributor

hoodmane commented Nov 8, 2024

Path(os.devnull) as output filename. It appears that this specifically is what doesn't allow seek/tell (though the exception text confusingly references the opposite operation)

Thanks for pointing this out. Opened a PR to fix it:
emscripten-core/emscripten#22886

@QuLogic
Copy link
Member Author

QuLogic commented Nov 8, 2024

Thanks for pointing this out. Opened a PR to fix it:
emscripten-core/emscripten#22886

Thanks; I see that the result is always 0, and I see that that is what happens with /dev/null as well. I think that may possibly produce invalid files since tell will always be 0.

These tests are only to confirm that Path objects work and we don't care about the output there, but I think we should avoid producing a file that could be invalid, so it makes even more sense now to switch these to temporary paths instead.

@hoodmane
Copy link
Contributor

hoodmane commented Nov 8, 2024

@QuLogic are you getting:

_______________________ ERROR collecting .venv-pyodide/lib/python3.12/site-packages/matplotlib/testing _______________________
/lib/python312.zip/_pyodide/_importhook.py:25: in find_spec
    parent_module = sys.modules[parent]
E   KeyError: 'matplotlib'

I made pyodide/pyodide#5170 to fix this. But I'm curious if you are doing something to work around it, or if you just don't see it?

@QuLogic
Copy link
Member Author

QuLogic commented Nov 8, 2024

Sorry no, not seeing that (but I'm testing in cibuildwheel only).

@hoodmane
Copy link
Contributor

hoodmane commented Nov 8, 2024

Okay now I'm getting the following 14 failures. Let me know if you need help with anything or suspect there is a problem with Pyodide / other upstream.

FAILED tests/test_axes.py::test_pcolornearestunits[png] - matplotlib.testing.exceptions.ImageComparisonFailure: images not close (RMS 4.715):
FAILED tests/test_axes.py::test_normal_axes - AssertionError: 
FAILED tests/test_backend_pdf.py::test_pdfpages_fspath - OSError: [Errno 70] Invalid seek
FAILED tests/test_backend_ps.py::test_savefig_to_stringio[eps afm-portrait-letter] - pytest.PytestUnraisableExceptionWarning: Exception ignored in: <_io.FileIO [closed]>
FAILED tests/test_constrainedlayout.py::test_hidden_axes - AssertionError: 
FAILED tests/test_image.py::test_imsave_fspath[png] - io.UnsupportedOperation: File or stream is not seekable.
FAILED tests/test_image.py::test_imsave_fspath[pdf] - OSError: [Errno 70] Invalid seek
FAILED tests/test_image.py::test_imshow_alpha[png] - pytest.PytestUnraisableExceptionWarning: Exception ignored in: <_io.FileIO [closed]>
FAILED tests/test_legend.py::test_figure_legend_outside - AssertionError: 
FAILED tests/test_simplification.py::test_throw_rendering_complexity_exceeded - MemoryError: std::bad_alloc
FAILED tests/test_simplification.py::test_para_equal_perp[png] - numpy.core._exceptions._ArrayMemoryError: Unable to allocate 11.0 MiB for an array with shape (600, 800, 3) and data type float64
FAILED tests/test_simplification.py::test_clipping_with_nans[png] - numpy.core._exceptions._ArrayMemoryError: Unable to allocate 11.0 MiB for an array with shape (600, 800, 3) and data type float64
FAILED tests/test_skew.py::test_set_line_coll_dash_image[png] - numpy.core._exceptions._ArrayMemoryError: Unable to allocate 11.0 MiB for an array with shape (600, 800, 3) and data type float64
FAILED tests/test_tightlayout.py::test_outward_ticks - AssertionError: 
14 failed, 7894 passed, 1845 skipped, 32 xfailed in 341.61s (0:05:41)

@QuLogic
Copy link
Member Author

QuLogic commented Nov 9, 2024

Okay now I'm getting the following 14 failures.

Most of these are fixed by this PR or the linked ones.

Let me know if you need help with anything or suspect there is a problem with Pyodide / other upstream.

If you have any idea why we need to disable hidden visibility on FreeType, it'd be nice to not have to patch that. It's built as a static library that's put in the shared library Python extension. You've seen what happens without the patch above.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[ENH]: out-of-tree Pyodide builds in CI for Matplotlib
3 participants