Skip to content

BUG: Silent int32 overflow in lapack work size computation leads to wrong exception #25564

Open
@pearu

Description

@pearu

The issue of np.linalg.eigh returning wrong results or crashing is still real (using numpy 1.26.2):

>>> import numpy as np
>>> n=32767
>>> b=np.random.rand(n)
>>> m_32767=np.diag(b)
>>> m_32767.shape
(32767, 32767)
>>> V_32767=np.linalg.eigh(m_32767)
 ** On entry to DSTEDC parameter number  8 had an illegal value
 ** On entry to DORMTR parameter number 12 had an illegal value
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/pearu/miniconda3/envs/jax-cuda-dev/lib/python3.11/site-packages/numpy/linalg/linalg.py", line 1487, in eigh
    w, vt = gufunc(a, signature=signature, extobj=extobj)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/pearu/miniconda3/envs/jax-cuda-dev/lib/python3.11/site-packages/numpy/linalg/linalg.py", line 118, in _raise_linalgerror_eigenvalues_nonconvergence
    raise LinAlgError("Eigenvalues did not converge")
numpy.linalg.LinAlgError: Eigenvalues did not converge

where the exception "Eigenvalues did not converge" is very likely wrong and misleading. With n == 32766, the above example works fine.

The underlying problem is that when computing lwork for the lapack syevd function using expression 1 + 6 * n + 2 * n * n, it will overflow when n == 32767. This problem is not unique to syevd but it exists for all lapack functions that work array sizes are quadratic wrt input sizes.

While switching to lapack implementations that uses 64-bit integer inputs, the overflow issue is seemingly resolved but in fact it is just harder to reproduce because the critical n size will be 2147483647 where the issue re-merges when the lwork expression above will overflow for int64.

I have implemented a solution to the same problem in JAX (jax-ml/jax#19288) that will lead to an overflow exception rather than wrong results or crashes. I think something similar is appropriate for NumPy as well.

Originally posted by @pearu in #13956 (comment)

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions