BUG: wrong errors e.g. "divide by zero encountered in matmul" on MacOS M4 #28687

Open
malciin opened this issue Apr 10, 2025 · 20 comments
@malciin

malciin commented Apr 10, 2025

Describe the issue:

I guess it's self-explanatory. The following code:

np.identity(n=784) @ np.identity(n=784)

produces many spurious runtime warnings.

For small matrices it works as expected, e.g. np.identity(14) @ np.identity(14) doesn't produce a warning, but np.identity(15) @ np.identity(15) and bigger do.

Reproduce the code example:

import numpy as np
np.identity(n=15) @ np.identity(n=15) # produces warning
np.identity(n=14) @ np.identity(n=14) # no warning

# same result for
import numpy as np
np.ones(shape=(1, 2)) @ np.ones(shape=(2, 784))

Error message:

../test.py:3: RuntimeWarning: divide by zero encountered in matmul
  np.identity(n=784) @ np.identity(n=784)
../test.py:3: RuntimeWarning: overflow encountered in matmul
  np.identity(n=784) @ np.identity(n=784)
../test.py:3: RuntimeWarning: invalid value encountered in matmul
  np.identity(n=784) @ np.identity(n=784)
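
The size threshold can also be probed programmatically. The following is a minimal sketch, not part of the original report: matmul_warnings is a hypothetical helper name, and it relies only on the standard warnings module to record whatever RuntimeWarning messages the product emits. On an unaffected machine every list should come back empty.

```python
import warnings

import numpy as np


def matmul_warnings(n, dtype=np.float64):
    """Multiply two n x n identity matrices and return RuntimeWarning messages.

    Hypothetical helper for illustration; not a NumPy API.
    """
    a = np.identity(n, dtype=dtype)
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning, even ones Python would normally show only once.
        warnings.simplefilter("always")
        a @ a
    return [str(w.message) for w in caught
            if issubclass(w.category, RuntimeWarning)]


# Sizes taken from the report above: 14 was quiet, 15 and 784 warned.
for n in (14, 15, 784):
    print(n, matmul_warnings(n))
```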

Python and NumPy Versions:

2.2.4
3.13.2 | packaged by Anaconda, Inc. | (main, Feb 6 2025, 12:55:35) [Clang 14.0.6 ]

I've also checked 3.11 and 3.9, with the same result.

Runtime Environment:

No response

Context for the issue:

No response

@malciin malciin changed the title BUG: wrong errors "divide by zero encountered in matmul" on MacOS BUG: wrong errors e.g. "divide by zero encountered in matmul" on MacOS Apr 10, 2025
@tylerjereddy
Contributor

FWIW, I don't see the issue on latest main on macOS with the Accelerate backend, nor on latest main on x86_64 Linux with the OpenBLAS backend:

--- a/numpy/linalg/tests/test_linalg.py
+++ b/numpy/linalg/tests/test_linalg.py
@@ -2406,3 +2406,12 @@ def test_vector_norm_empty():
         assert_equal(np.linalg.vector_norm(x, ord=1), 0)
         assert_equal(np.linalg.vector_norm(x, ord=2), 0)
         assert_equal(np.linalg.vector_norm(x, ord=np.inf), 0)
+
+
+
+@pytest.mark.parametrize("arr1, arr2", [
+    (np.identity(n=784), np.identity(n=784)),
+    (np.identity(n=14), np.identity(n=14)),
+])
+def test_gh_28687(arr1, arr2):
+    arr1 @ arr2

Incantation: spin test -t numpy/linalg/tests/test_linalg.py::test_gh_28687 -- -s

Maybe someone with a conda setup on Mac will reproduce more easily if this is just for a standard conda install. Sharing the contents of np.show_config() may also help.
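
As a rough sketch of how to check the backend without reading the full dump, the snippet below pulls the BLAS name out of the configuration. It assumes a NumPy version whose np.show_config accepts mode="dicts" (older releases do not, in which case the helper returns None); blas_backend_name is a hypothetical helper, not a NumPy function.

```python
import numpy as np


def blas_backend_name():
    """Return the BLAS library name NumPy reports, or None if unavailable.

    Hypothetical helper for illustration; not a NumPy API.
    """
    try:
        # Assumption: this NumPy supports the mode="dicts" argument.
        config = np.show_config(mode="dicts")
    except TypeError:
        return None
    return config.get("Build Dependencies", {}).get("blas", {}).get("name")


print(blas_backend_name())
```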

@malciin
Author

malciin commented Apr 10, 2025

Result of np.show_config() run in miniconda env

Build Dependencies:
  blas:
    detection method: system
    found: true
    include directory: unknown
    lib directory: unknown
    name: accelerate
    openblas configuration: unknown
    pc file directory: unknown
    version: unknown
  lapack:
    detection method: system
    found: true
    include directory: unknown
    lib directory: unknown
    name: accelerate
    openblas configuration: unknown
    pc file directory: unknown
    version: unknown
Compilers:
  c:
    commands: cc
    linker: ld64
    name: clang
    version: 15.0.0
  c++:
    commands: c++
    linker: ld64
    name: clang
    version: 15.0.0
  cython:
    commands: cython
    linker: cython
    name: cython
    version: 3.0.12
Machine Information:
  build:
    cpu: aarch64
    endian: little
    family: aarch64
    system: darwin
  host:
    cpu: aarch64
    endian: little
    family: aarch64
    system: darwin
Python Information:
  path: /private/var/folders/0j/bwqcs4y508s2n4ck4dhf3rpc0000gn/T/build-env-310vy4bc/bin/python
  version: '3.13'
SIMD Extensions:
  baseline:
  - NEON
  - NEON_FP16
  - NEON_VFPV4
  - ASIMD
  found:
  - ASIMDHP
  not found:
  - ASIMDFHM

But I believe it's not related to conda. I can also reproduce it on the default python3 installation on my Mac:

Image

Result of np.show_config() for the default python3 installation:

{
  "Compilers": {
    "c": {
      "name": "clang",
      "linker": "ld64",
      "version": "15.0.0",
      "commands": "cc"
    },
    "cython": {
      "name": "cython",
      "linker": "cython",
      "version": "3.0.11",
      "commands": "cython"
    },
    "c++": {
      "name": "clang",
      "linker": "ld64",
      "version": "15.0.0",
      "commands": "c++"
    }
  },
  "Machine Information": {
    "host": {
      "cpu": "aarch64",
      "family": "aarch64",
      "endian": "little",
      "system": "darwin"
    },
    "build": {
      "cpu": "aarch64",
      "family": "aarch64",
      "endian": "little",
      "system": "darwin"
    }
  },
  "Build Dependencies": {
    "blas": {
      "name": "accelerate",
      "found": true,
      "version": "unknown",
      "detection method": "system",
      "include directory": "unknown",
      "lib directory": "unknown",
      "openblas configuration": "unknown",
      "pc file directory": "unknown"
    },
    "lapack": {
      "name": "accelerate",
      "found": true,
      "version": "unknown",
      "detection method": "system",
      "include directory": "unknown",
      "lib directory": "unknown",
      "openblas configuration": "unknown",
      "pc file directory": "unknown"
    }
  },
  "Python Information": {
    "path": "/private/var/folders/4d/0gnh84wj53j7wyk695q0tc_80000gn/T/build-env-kq4kuj35/bin/python",
    "version": "3.9"
  },
  "SIMD Extensions": {
    "baseline": [
      "NEON",
      "NEON_FP16",
      "NEON_VFPV4",
      "ASIMD"
    ],
    "found": [
      "ASIMDHP"
    ],
    "not found": [
      "ASIMDFHM"
    ]
  }
}
>>> import sys, numpy; print(numpy.__version__); print(sys.version)
2.0.2
3.9.6 (default, Mar 12 2025, 20:22:46) 
[Clang 17.0.0 (clang-1700.0.13.3)]

Environment: MacBook Air M4 with the currently newest macOS (Sequoia 15.4)

@ngoldbaum
Member

I don't see any warnings on my M3 Macbook Pro using a pyenv python compiled from source and numpy compiled from source, or on homebrew's python and numpy builds.

Maybe this is specific to the M4 CPU?

@malciin malciin changed the title BUG: wrong errors e.g. "divide by zero encountered in matmul" on MacOS BUG: wrong errors e.g. "divide by zero encountered in matmul" on MacOS M4 Apr 11, 2025
@till-m

till-m commented Apr 14, 2025

I can confirm that I also ran into this problem on an M4 Mac with numpy==2.2.4, using uv as package manager (which I think gets its packages from PyPI). No warnings are produced if I do any of the following:

  • downgrade to numpy 1.26.4
  • use np.dot instead of matmul
  • smaller matrices
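
A minimal sketch of the np.dot substitution, using the 2-D shapes from the original report. For 2-D arrays np.dot and @ compute the same matrix product; for arrays with more than two dimensions their broadcasting rules differ, so the swap is not a drop-in replacement in general.

```python
import numpy as np

a = np.ones(shape=(1, 2))
b = np.ones(shape=(2, 784))

via_matmul = a @ b        # the path reported to warn on affected machines
via_dot = np.dot(a, b)    # the workaround; same product for 2-D inputs

print(np.allclose(via_matmul, via_dot))
```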

@seberg
Member

seberg commented Apr 14, 2025

Ping @Developer-Ecosystem-Engineering, since this seems to probably originate in Accelerate and I don't think we can do anything about it.

> use np.dot instead of matmul

np.dot didn't report these, but we just changed it so that it will in 2.3. Although if this is still a common issue by then, I guess we may want to reconsider.

@kenshi84

My computer is a MacBook Pro (16-inch, Nov 2024) with an Apple M4 Max running macOS 15.4.1, and I was able to reproduce the same issue (using both system and pyenv-installed Python 3.9 and 3.12, NumPy 2.2.4). Furthermore, on the same machine I used UTM.app to create a virtual machine with macOS 15.3.2 and confirmed that the issue does not reproduce there. So I suspect it is caused by something newly introduced in the latest macOS Sequoia 15.4.x.

@ngoldbaum
Member

Maybe filing an official Radar bug with Apple would help?

If someone with an M4 chip could verify that the spurious floating-point exception is coming from Accelerate, that would also help.

@Developer-Ecosystem-Engineering
Contributor

Developer-Ecosystem-Engineering commented Apr 18, 2025

Hi @ngoldbaum, and @seberg,

We are tracking this issue and are working through understanding exactly what is going on, whether it's expected, and what we can do about it in the short term (for NumPy) and the long term. I apologize that the internal discussion and ideation isn't part of the public discourse, and that this can make it seem like we aren't engaged on the issue.

We will circle back next week with an update.

@glenn-jocher

Also seeing this on an M4 Macbook with latest 2.2.5 installed with pip in Python 3.12.

@zertez

zertez commented Apr 29, 2025

Well, this is annoying. I am currently enrolled in a machine learning course at uni and I just got a warning from the teacher on an assignment I delivered. It ended up being a 397-page PDF, where 389 of those pages were just this warning printout. It seems that the warning suppression I used in Python doesn't work in a Jupyter notebook. I can see that Apple is looking into this, but I can't really wait for a fix, so I have been doing a lot of testing to find a workaround, since exams start in two weeks.

I am using a MacBook Pro 16 with the M4 Max, on macOS 15.4.1. I am not using Anaconda; I am using uv with a venv set up through uv, with Python 3.10.8.

This is what I have found:

The bug only affects the @ operator when working with float values; there are no errors with integers. This makes sense, since NumPy delegates floating-point matrix operations to the BLAS library, which in these builds is Apple's Accelerate framework. Both np.dot() and the @ operator should in theory use BLAS for floating-point operations, but they appear to use different code paths or interfaces to the BLAS library. np.dot() works without warnings for both floats and integers, suggesting it might be accessing the underlying library differently or handling errors more robustly than the @ operator implementation. So the issue seems specific to how NumPy's @ operator interfaces with Apple's BLAS implementation on Apple Silicon, and apparently only on the M4 chip.

The workaround:

So, having narrowed it down to BLAS (which I can see others have also done here), I wanted to set up OpenBLAS to check whether that resolved the issue. After using brew to install OpenBLAS, I ran this command in a terminal window:

OPENBLAS=/opt/homebrew/opt/openblas uv pip install numpy --no-binary numpy --config-setting setup-args=-Dblas=openblas --config-setting setup-args=-Dlapack=openblas

After restarting the kernel, NumPy is now using OpenBLAS and I am no longer getting these warnings. I hope this helps others as a temporary fix until Apple has resolved the issue. I am certain there are other ways of switching to OpenBLAS, but this is the method that worked for me in the end.

@seberg
Member

seberg commented Apr 29, 2025

FWIW, if this is just a warning in @ for you (and not elsewhere), just setting np.errstate(all="ignore") should work.
(It will hide all similar warnings, but if you are not working interactively, you are unlikely to care.)
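
For reference, a minimal sketch of that suggestion: np.errstate is a context manager, so the suppression can be scoped to just the affected call. The array size is taken from the original report; the variable names are illustrative only.

```python
import numpy as np

a = np.identity(784)

# Floating-point warnings are suppressed only inside this block.
with np.errstate(all="ignore"):
    c = a @ a

# Outside the block the previous error-handling settings apply again,
# so genuine warnings elsewhere in the program are still reported.
```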

@nalzok

nalzok commented Apr 30, 2025

I can reproduce the bug on M4 Pro. Perhaps it's due to the latest macOS 15.4.1 system update?

> sw_vers
ProductName:            macOS
ProductVersion:         15.4.1
BuildVersion:           24E263

> sysctl -a | grep machdep.cpu
machdep.cpu.cores_per_package: 14
machdep.cpu.core_count: 14
machdep.cpu.logical_per_package: 14
machdep.cpu.thread_count: 14
machdep.cpu.brand_string: Apple M4 Pro

> python3 -c 'import numpy as np; np.identity(n=14) @ np.identity(n=14)'

> python3 -c 'import numpy as np; np.identity(n=15) @ np.identity(n=15)'
<string>:1: RuntimeWarning: divide by zero encountered in matmul
<string>:1: RuntimeWarning: overflow encountered in matmul
<string>:1: RuntimeWarning: invalid value encountered in matmul

Here is the output from np.show_config()

Build Dependencies:
  blas:
    detection method: system
    found: true
    include directory: unknown
    lib directory: unknown
    name: accelerate
    openblas configuration: unknown
    pc file directory: unknown
    version: unknown
  lapack:
    detection method: system
    found: true
    include directory: unknown
    lib directory: unknown
    name: accelerate
    openblas configuration: unknown
    pc file directory: unknown
    version: unknown
Compilers:
  c:
    commands: cc
    linker: ld64
    name: clang
    version: 15.0.0
  c++:
    commands: c++
    linker: ld64
    name: clang
    version: 15.0.0
  cython:
    commands: cython
    linker: cython
    name: cython
    version: 3.0.12
Machine Information:
  build:
    cpu: aarch64
    endian: little
    family: aarch64
    system: darwin
  host:
    cpu: aarch64
    endian: little
    family: aarch64
    system: darwin
Python Information:
  path: /private/var/folders/0j/bwqcs4y508s2n4ck4dhf3rpc0000gn/T/build-env-mgxqnezl/bin/python
  version: '3.12'
SIMD Extensions:
  baseline:
  - NEON
  - NEON_FP16
  - NEON_VFPV4
  - ASIMD
  found:
  - ASIMDHP
  not found:
  - ASIMDFHM

@Developer-Ecosystem-Engineering
Contributor

> We will circle back next week with an update.

We've not lost track of this problem.

To answer @nalzok and others, M4 has SME support and in 15.4 these paths may be utilized by the system.

@ryn-mrgn

ryn-mrgn commented May 5, 2025

I'm experiencing the same matmul runtime warnings (divide by zero, overflow, invalid value) on an M4 MacBook Pro with NumPy 2.x.x.

Here's a minimal reproducible example:

import numpy as np

a = np.random.uniform(low=1, high=1, size=(40, 2))
b = np.random.uniform(low=1, high=1, size=(2, 40))

c = a @ b  # Triggers RuntimeWarnings in NumPy 2.x.x on M4 Mac

Interestingly, this issue does not occur when either matrix has a smaller dimension:

a = np.random.uniform(low=1, high=1, size=(39, 2))
b = np.random.uniform(low=1, high=1, size=(2, 39))

c = a @ b  # No warnings

As others have observed, errors are only encountered with floats and not integers. np.errstate(all="ignore") works for me in the meantime.

@extrospective

extrospective commented May 7, 2025

Encountered same issue.

Worked around it with np.dot (the alternative of suppressing warnings worries me, since it could hide other warnings).
The test ryn-mrgn posted above is very good, and it distinguished between computers.

My environment: M4 Max, Sequoia 15.4.1

@Developer-Ecosystem-Engineering
Contributor

Thank you all for the reproducible cases & scenarios. We are able to reproduce the issue and are circling the best approach for next steps.

@alecjacobson

Is there a less aggressive version of np.errstate(all="ignore") that would ignore precisely these warnings?

@seberg
Member

seberg commented May 24, 2025

No, there isn't, and it isn't possible. But of course you can put with np.errstate(all="ignore"): around the specific call.
FWIW, I think the 2.3.0 release in a few weeks will likely have a fix, even if that fix means that warnings/errors are never given; which to me isn't ideal but acceptable -- almost no-one really relies on these hopefully.
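
Separately from np.errstate, and only as a hedged sketch: with NumPy's default "warn" handling these messages pass through Python's standard warnings machinery, so a message-based filter from the warnings module can hide only RuntimeWarnings whose text ends in "encountered in matmul". This still hides genuine matmul warnings, but leaves warnings from other operations alone. MATMUL_FP_WARNING and matmul_quiet are hypothetical names, not NumPy APIs.

```python
import warnings

import numpy as np

# Regex matched against the start of the warning message (hypothetical name).
MATMUL_FP_WARNING = r".* encountered in matmul"


def matmul_quiet(a, b):
    """Compute a @ b while ignoring only matmul floating-point RuntimeWarnings."""
    with warnings.catch_warnings():
        warnings.filterwarnings(
            "ignore", message=MATMUL_FP_WARNING, category=RuntimeWarning
        )
        return a @ b


c = matmul_quiet(np.identity(40), np.identity(40))
```

The filter is restored when the with block exits, so the rest of the program keeps its normal warning behaviour.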

@cournape
Member

I can confirm the issue on my MacBook Air M4.

FWIW, I noticed the issue after moving from an M1 MacBook to a new M4 one through Time Machine, so I can confirm the only change was M1 -> M4. Everything else was the same, including OS, Python version and so on.

@ruiiyang

ruiiyang commented May 31, 2025

I am also hitting this error on my MacBook with an M4 Pro chip. Other people running the same script on Windows say they do not see it. Besides this runtime warning in extmath:
Python/3.9/lib/python/site-packages/sklearn/utils/extmath.py:203: RuntimeWarning: divide by zero encountered in matmul
ret = a @ b

I am also getting this other one:
/Library/Python/3.9/lib/python/site-packages/sklearn/linear_model/_linear_loss.py:200: RuntimeWarning: divide by zero encountered in matmul
raw_prediction = X @ weights + intercept

My NumPy is 2.0.2 and Python is 3.9.

FIX IT!
I downgraded NumPy from 2.0.2 with pip install numpy==1.24.4, and the issue is solved!
