Skip to content

MAINT: Convert umath_linalg to multi-phase init (PEP 489) #29030

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 28, 2025

Conversation

AA-Turner
Copy link
Contributor

@AA-Turner AA-Turner commented May 21, 2025

@AA-Turner AA-Turner force-pushed the multi-phase/umath_linalg branch 2 times, most recently from ebfb8a7 to 7121db6 Compare May 22, 2025 03:15
@AA-Turner AA-Turner force-pushed the multi-phase/umath_linalg branch 2 times, most recently from eae5c53 to 6876e24 Compare May 22, 2025 05:45
Copy link
Member

@mattip mattip left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please work on a single PR to reduce maintainer review burden. Once we are settled that we wish to make this change, we can revisit the other ones. Does this impact performance, either import time or runtime?

PyModuleDef_HEAD_INIT, /* m_base */
"_umath_linalg", /* m_name */
NULL, /* m_doc */
0, /* m_size */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now, in a future PR this can be changed.

Suggested change
0, /* m_size */
-1, /* m_size */

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See also my note in #29025 (comment). Briefly, I was mistaken in my understanding of the documentation, -1 is only valid for single-phase initialisation, and is forbidden for multi-phase.

The safest thing reinitialisation-wise would be to use the multiple interpreter opt-out suggested by CPython. This would enforce that each extension module is only imported & initialised once per process.

However, this causes test_reloading.py::test_full_reimport() to fail. Is that an acceptable trade-off? Multi-phase init (and at some point, isolated modules) opens the door to proper support for reloading/reimporting numpy.

cc @encukou re the best way to do a gradual migration to multi-phase

In the case of umath_linalg, we might be fine globals-wise, but multiarray will definitely need a solution here.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've just updated the HOWTO, see here before it gets to main docs.

Basically, to allow re-import, clear the "loaded" flag in PyModuleDef's m_clear. Note that this will do an actual reload of the module (sans C globals), that is, exec will run. (With single-phase, the module dict contents are stashed away and reused. You could emulate that, but )

The PyMutex stuff in the updated HOWTO is 3.13+ non-limited API, but isn't necessary in the medium term (on CPython you're protected by the GIL; even free-threading builds use an import lock which isn't likely to go away). For the long-term (or alternate CAPI implementations) you could add no-op shims.

The Py_MOD_MULTIPLE_INTERPRETERS_NOT_SUPPORTED flag will only allow loading in the main interpreter; that's redundant if you're limited to one module per process.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, this causes test_reloading.py::test_full_reimport() to fail. Is that an acceptable trade-off?

I am not opposed to breaking the test/changing the behavior. The ask would be to summarize how things change for:

  • Potential subinterpreter users (I think that just doesn't work at all anymore, but not sure).
  • Users for importlib.reload(), currently they get a warning that this is a bad idea. A full-blown error may not be so bad, though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potential subinterpreter users (I think that just doesn't work at all anymore, but not sure).

As far as I can tell, importing numpy simply fails in a subinterpreter, and continues to do so with the explicit opt-out.

$ uvx --with numpy --python 3.13 python
Python 3.13.3+ (main, May 20 2025, 08:56:23) [GCC 13.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import _interpreters
>>> # first import in an interpreter with a 'legacy' shared GIL
>>> c = _interpreters.new_config('legacy')  
>>> id1 = _interpreters.create(c, reqrefs=True)
>>> _interpreters.incref(id1)  # bookkeeping
>>> _interpreters.exec(id1, 'import sys', restrict=True)  # stdlib works
>>> _interpreters.exec(id1, 'import numpy', restrict=True)
namespace(
  type=namespace(
    __name__='RuntimeError',
    __qualname__='RuntimeError',
    __module__='builtins'
  ), 
  msg='CPU dispatcher tracer already initlized',
  formatted='RuntimeError: CPU dispatcher tracer already initlized',
  errdisplay='Traceback (most recent call last):\n  File "<string>", line 1, in <module>\n'
             '[...SNIP...]\n'
             '    from numpy._core._multiarray_umath import (\n'
             '        add_docstring,  _get_implementing_args, _ArrayFunctionDispatcher)\n'
             'RuntimeError: CPU dispatcher tracer already initlized'
)
>>> # now import in an isolated interpreter with its own GIL
>>> id2 = _interpreters.create(reqrefs=True)
>>> _interpreters.exec(id2, 'import sys', restrict=True)
>>> print(_interpreters.exec(id2, 'import numpy', restrict=True))
double free or corruption (out)
$  # the REPL has crashed!

Users for importlib.reload(), currently they get a warning that this is a bad idea. A full-blown error may not be so bad, though.

Reloading currently works with a UserWarning, unchanged with the explict opt-out.


The observable change I've found is clearing sys.modules and re-importing. Previously, it emitted UserWarning: The NumPy module was reloaded (imported a second time). This can in some cases result in small but subtle issues and is discouraged., but would now raise ImportError: cannot load module more than once per process.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't care, but it may be good to add a release note fragment about this somewhere.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, added a brief note about the changes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For completeness, could you please check that %autoreload in IPython/Jupyter isn't affected?

Copy link
Contributor Author

@AA-Turner AA-Turner May 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rgommers Seems to work (editing __init__.py to add a spam() function):

$ ipython
Python 3.13.3+ (main, May 20 2025, 08:56:23) [GCC 13.3.0]
IPython 9.2.0 -- An enhanced Interactive Python. Type '?' for help.
In [1]: %load_ext autoreload
In [2]: %autoreload 2
In [3]: import numpy
In [4]: numpy.spam()
Out[4]: 'spam spam spam spam spam egg and chips'
In [5]: numpy.spam()
.../lib/python3.13/site-packages/IPython/extensions/deduperreload/deduperreload.py:290: DeprecationWarning: ast.Ellipsis is deprecated and will be re
moved in Python 3.14; use ast.Constant instead
  elif not isinstance(ast_elt, (ast.Ellipsis, ast.Pass)):
Out[5]: 'lobster thermidor'

I haven't tested in Jupyter, but I believe it shares an implementation with IPython.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, thanks for verifying. And yes, the implementation is shared.

@AA-Turner AA-Turner force-pushed the multi-phase/umath_linalg branch from 59ec898 to 6876e24 Compare May 22, 2025 07:01
@AA-Turner
Copy link
Contributor Author

Please work on a single PR to reduce maintainer review burden. Once we are settled that we wish to make this change, we can revisit the other ones.

Will do. For context, there is a proposal to deprecate support for single-phase initialisation in favour of multi-phase. We're working to improve the documentation here too.

A

@AA-Turner
Copy link
Contributor Author

Does this impact performance, either import time or runtime?

Local timings using python -Ximporttime -c 'import numpy.linalg._umath_linalg' 2>&1| grep _umath_linalg from an editable installation of numpy.

Multi-phase Single-phase
Mean time (_umath_linalg) 5.54ms 5.06ms
Mean time (total) 4396ms 3863ms
Proportion 0.126% 0.131%

No significant impact of multi-phase on import time. I will investigate running the benchmarks (my laptop is ill-suited to it, though), but there should be no real impact -- this only changes the import process, the end result (initialised module) should be identical performance-wise.

A

@AA-Turner AA-Turner force-pushed the multi-phase/umath_linalg branch 3 times, most recently from 565f85f to 5bbc0bf Compare May 24, 2025 02:21
@AA-Turner
Copy link
Contributor Author

I don't understand the PyPy-only numpy/tests/test_ctypeslib.py::TestAsArray::test_pointer failure.

@AA-Turner AA-Turner requested review from seberg and mattip May 24, 2025 03:36
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is required as otherwise the "cannot load module more than once per process" error is raised. Happy to split to a different preparatory PR if preferred.

@rgommers
Copy link
Member

Local timings using python -Ximporttime -c 'import numpy.linalg._umath_linalg' 2>&1| grep _umath_linalg from an editable installation of numpy.

Note that you cannot use editable installs for measuring import time. You're mostly measuring importlib traversable overhead and invoking ninja then.

For a regular install, I'm seeing the following timings on a macOS arm64 M1 machine over 15 invocations of your -Ximporttime invocation (cumulative time for _umath_linalg:

  • Single-phase init: 680 - 870 us
  • Multi-phase init: 700 - 910 us

Total cumulative import time is around 65 ms.

No significant impact of multi-phase on import time.

Yep, that holds up.

@rgommers
Copy link
Member

Please work on a single PR to reduce maintainer review burden. Once we are settled that we wish to make this change, we can revisit the other ones.

Will do.

Thanks for working on this! It'd be useful to convert the PRs other than the couple that are under active review to Draft status, to make it clear at a glance to other reviewers browsing the PR list what is safe to ignore.

@AA-Turner
Copy link
Contributor Author

AA-Turner commented May 24, 2025

In reply to @rgommers' comment in #29025 (comment), centralising discussion in this PR.

This change looks quite annoying:

-    import_array();
+    import_array1(-1);

import_array is public API and used by a ton of extension modules, binding generators, and docs.

At a glance from your PRs, there seems to be a larger issue in the CPython design with the normal meaning of a NULL return used for an error condition no longer being true. Returning 0 on success and -1 on error is going to lead to a lot of churn. What is going on there?

This is because the signature changes from PyObject *PyInit_<NAME>(void) to int module_exec(PyObject *module), so NULL is not a valid return value & -1 is used to signal errors.

I was not around when the API was designed, though in the Python C API we do frequently use -1 as an error sentinel. @encukou / @scoder / etc may be better placed to comment as authors of PEP 489, but it may just be that it was the most pragmatic choice at the time.

It may be worth considering a shorter alias for import_array1(-1) assuming no better solution is found, but this is clearly a bigger discussion on NumPy's API and not one I intend to start!

A

@scoder
Copy link
Contributor

scoder commented May 24, 2025

I had never looked up the definition of import_array() in NumPy, assuming it was a simple function that signaled errors in its return value. It's not:

#define import_array() { \
if (_import_array() < 0) { \
PyErr_Print(); \
PyErr_SetString( \
PyExc_ImportError, \
"numpy._core.multiarray failed to import" \
); \
return NULL; \
} \
}
#define import_array1(ret) { \
if (_import_array() < 0) { \
PyErr_Print(); \
PyErr_SetString( \
PyExc_ImportError, \
"numpy._core.multiarray failed to import" \
); \
return ret; \
} \
}
#define import_array2(msg, ret) { \
if (_import_array() < 0) { \
PyErr_Print(); \
PyErr_SetString(PyExc_ImportError, msg); \
return ret; \
} \
}

Luckily, however, it is a function for Cython code:

numpy/numpy/__init__.pxd

Lines 917 to 923 in 1d271d5

# Versions of the import_* functions which are more suitable for
# Cython code.
cdef inline int import_array() except -1:
try:
__pyx_import_array()
except Exception:
raise ImportError("numpy._core.multiarray failed to import")

That mitigates the issue, I guess. A (very) large chunk of the code that interacts with NumPy's C-API probably does it through Cython, not bare C code.

@scoder
Copy link
Contributor

scoder commented May 24, 2025

the signature changes from PyObject *PyInit_<NAME>(void) to int module_exec(PyObject *module)

Right. In the old mechanism, the init function created the module and returned it. In multi-phase init, there is a dedicated creation function that returns it (or an error indicated by NULL), and an exec function (to populate the module namespace) that receives the already created module and returns only an error indicator (because it has no objects to return).

@seberg
Copy link
Member

seberg commented May 24, 2025

It may be worth considering a shorter alias for import_array1(-1) assuming no better solution is found,

We have a better solution already, which is the explicit thing that isn't a macro that includes a return:

    if (PyArray_ImportNumPyAPI() < 0) {
        return -1;
    }

please just use that (downstream can use it if compiling with NumPy 2, or just vendor it).

@AA-Turner AA-Turner force-pushed the multi-phase/umath_linalg branch from 5bbc0bf to 9d65240 Compare May 26, 2025 10:19
@AA-Turner
Copy link
Contributor Author

Thanks for the pointer, I've switched to PyArray_ImportNumPyAPI() and PyUFunc_ImportUFuncAPI().

A

@seberg seberg added the 56 - Needs Release Note. Needs an entry in doc/release/upcoming_changes label May 26, 2025
@AA-Turner AA-Turner force-pushed the multi-phase/umath_linalg branch from 9d65240 to 3c3ee41 Compare May 26, 2025 10:51
@rgommers
Copy link
Member

xref https://discuss.python.org/t/proposal-officially-deprecate-support-for-legacy-single-phase-init-extension-modules/89262/44, which points out that the current docs on multi-phase init say:

All modules created using multi-phase initialization are expected to support sub-interpreters. Making sure multiple modules are independent is typically enough to achieve this.

Given the opt-out option, I assume it's a matter of those docs needing to be improved, but as it stands this PR conflicts with that requirement so it'd be nice to see that Discourse thread concluding and the docs fixed before we merge this.

@AA-Turner
Copy link
Contributor Author

Given the opt-out option, I assume it's a matter of those docs needing to be improved, but as it stands this PR conflicts with that requirement so it'd be nice to see that Discourse thread concluding and the docs fixed before we merge this.

For reference, the relevant PR is python/cpython#134775. Feel free to comment there with any thoughts. The proposed text currently says "All modules created using multi-phase initialization are expected to support sub-interpreters, or otherwise explicitly signal a lack of support." (emph. added).

A

@AA-Turner AA-Turner force-pushed the multi-phase/umath_linalg branch from 3c3ee41 to 43119cd Compare May 28, 2025 07:46
Copy link
Member

@seberg seberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally, LGTM, thanks. I would like to lower the open PR number soon :).

I think I have two asks left only:

  • It would be nice to restore/use textwrap.dedent
  • From the docs, the code seems wrong for no-gil as a mutex must be used? (I am a bit surprised by import not already having a mutex, but, OK.)

All modules created using multi-phase initialization are expected to support sub-interpreters, or otherwise explicitly signal a lack of support.

My understanding is that this PR adds that now and that there is no major change in behavior (besides brutal del sys.module[...]).
(Historically, people got away with using NumPy in subinterpreters even if it was buggy, but I think Python changes for non-phase init modules removed that ability for us.)

encoding='utf-8',
check=False,
)
assert p.returncode == 0, p.stdout
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dschult if you can have a quick look: I think doing the lazy loading test in a subprocess is identical (but safer as it avoids messing with sys.modules). Does that seem right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to make the change here as otherwise we get the new "cannot load module more than once per process" error. I believe it should be functionally identical in terms of the test.

@AA-Turner
Copy link
Contributor Author

From Petr above:

The PyMutex stuff in the updated HOWTO is 3.13+ non-limited API, but isn't necessary in the medium term (on CPython you're protected by the GIL; even free-threading builds use an import lock which isn't likely to go away). For the long-term (or alternate CAPI implementations) you could add no-op shims.

I chose not to add it as it's only relevant for CPython & the GIL is held during import, so I saw it as needless complexity.

I'm happy to add it you'd prefer to follow the documentation more closely. What would you prefer @seberg?

A

@seberg
Copy link
Member

seberg commented May 28, 2025

I chose not to add it as it's only relevant for CPython & the GIL is held during import, so I saw it as needless complexity.

Sorry, missed that, happy to not do that then.

@AA-Turner AA-Turner force-pushed the multi-phase/umath_linalg branch from 43119cd to 0ee269f Compare May 28, 2025 10:12
@seberg seberg changed the title GH-29021: Convert umath_linalg to multi-phase init (PEP 489) MAINT: Convert umath_linalg to multi-phase init (PEP 489) May 28, 2025
Copy link
Member

@seberg seberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Will put it in once tests pass, supporting multi-phase init seems like it's in our future anyway and with this out of the way, hopefully we can do the others quickly.
(Then making it work for multiarray and actually supporting it in the caching are longer things.)

If anyone still finds something off, we can follow up.

@AA-Turner
Copy link
Contributor Author

Thanks all for the reviews! I'll rebase the others once this has merged, do you have a preferred order?

A

@seberg seberg merged commit 5a87756 into numpy:main May 28, 2025
74 checks passed
@AA-Turner AA-Turner deleted the multi-phase/umath_linalg branch May 28, 2025 11:02
@mattip
Copy link
Member

mattip commented May 28, 2025

Do we need to wait for clarification around the current requirement that "All modules created using multi-phase initialization are expected to support sub-interpreters" and python/cpython#134775, which has not been merged yet?

@AA-Turner
Copy link
Contributor Author

AA-Turner commented May 28, 2025

If it provides solace, PEP 630 (2020) and the Python documentation since 3.11 both contain the text on explicit opt-out that was used as a guide here:

A non-negative PyModuleDef.m_size signals that a module supports multiple interpreters correctly. If this is not yet the case for your module, you can explicitly make your module loadable only once per process.

A

@rgommers
Copy link
Member

The proposed text currently says "All modules created using multi-phase initialization are expected to support sub-interpreters, or otherwise explicitly signal a lack of support." (emph. added).

Seems perfectly clear now, thanks! Glad to see this PR merged so smoothly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
56 - Needs Release Note. Needs an entry in doc/release/upcoming_changes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants