Power9 instruction gets executed on Power8 #28124

limburgher · 2025-01-07T22:20:09Z

Describe the issue:

From: https://bugzilla.redhat.com/show_bug.cgi?id=2336127

In 2024-12-21, we started to see an MLIR test failure on LLVM daily snapshots running on Rawhide on Power8.
We can only reproduce this issue on Rawhide.

After investigating, I found a numpy function using a Power9/Power ISA 3.0 instruction (mtvsrws):

Disassembly:
(gdb) disas
Dump of assembler code for function HALF_exp2(char**, npy_intp const*, npy_intp const*, void*):
...
0x00007fffe3bad210 <+160>: addis r9,r9,14336
0x00007fffe3bad214 <+164>: add r7,r7,r9
=> 0x00007fffe3bad218 <+168>: mtvsrws vs1,r7
0x00007fffe3bad21c <+172>: xscvspdpn vs1,vs1
0x00007fffe3bad220 <+176>: bl 0x7fffe38b3580 <0000001a.plt_call.exp2f@@GLIBC_2.27>

Backtrace:
(gdb) bt
#0 HALF_exp2 (args=, dimensions=, steps=, __NPY_UNUSED_TAGGEDdata=)
at ../numpy/_core/src/umath/loops_umath_fp.dispatch.c.src:182
#1 0x00007fffe3af615c in generic_wrapped_legacy_loop (__NPY_UNUSED_TAGGEDcontext=, data=, dimensions=,
strides=, auxdata=) at ../numpy/_core/src/umath/legacy_array_method.c:98
#2 0x00007fffe3b0d2f0 in try_trivial_single_output_loop (context=0x7fffffff8410, op=0x7fffffff8b30, order=,
errormask=) at ../numpy/_core/src/umath/ufunc_object.c:969
#3 PyUFunc_GenericFunctionInternal (ufunc=, ufuncimpl=, operation_descrs=0x7fffffff8730, op=0x7fffffff8b30,
casting=NPY_SAME_KIND_CASTING, order=, wheremask=0x0) at ../numpy/_core/src/umath/ufunc_object.c:2237
#4 ufunc_generic_fastcall (ufunc=, args=, len_args=, kwnames=, outer=)
at ../numpy/_core/src/umath/ufunc_object.c:4530
#5 0x00007ffff79e9e30 in PyObject_Vectorcall () from /lib64/libpython3.13.so.1.0

Reproducible: Always

Reproduce the code example:

N/A

Error message:

No response

Python and NumPy Versions:

Python 3.13, NumPy 2.2.1

Runtime Environment:

No response

Context for the issue:

No response

seberg · 2025-01-08T10:27:52Z

MLIR test failure on LLVM daily snapshots

Haven't looked into it closely, thought it was SIMD related first, but not sure now.

Unless I am missing build related changes (totally possible), there were no 2.2.1 changes in this code and the code isn't hand-vectorized (very few changes in 2.2.1 and this didn't even change since 2.1 at least).

So if you only build with "LLVM daily" when there is a new release of NumPy, then the change here may also be the LLVM daily part?

tuliom · 2025-01-08T12:13:46Z

@seberg The code with this instruction is in Numpy.
Keep in mind that LLVM daily is not being used to build Numpy.
It's still possible the compiler used to build Numpy (gcc 14.2.1) generated that instruction by mistake. We need more information before taking conclusions.

Another important thing: numpy on Fedora got updated from version 1.26.4 to version to 2.2.0 around December 12th. Has there been any changes to this code between those versions?

@limburgher would you be able to run a numpy scratch build on Fedora with more verbosity so that it prints the entire compiler commands, please? I'd like to see which parameters are being passed to GCC when compiling numpy/_core/src/umath/loops_umath_fp.dispatch.c.src.

seberg · 2025-01-08T12:31:43Z

from version 1.26.4 to version to 2.2.0 around December 12th

Ah, that is a big step and changes seem very plausible. So SIMD seems not unlikely (i.e. VSX3 is being used now), @seiko2plus would probably know without looking around.

If you can easily, it would help to to print:

np.show_runtime()

and you could also try to run with NPY_DISABLE_CPU_FEATURES=VSX3 (note that I don't know if this is a VSX3 instruction, but it would make sense), to see if we are dynamically dispatching to an incorrect target.

tuliom · 2025-01-08T13:14:30Z

@seberg It crashed in both cases:

$ NPY_DISABLE_CPU_FEATURES=VSX3 PYTHONPATH=/build/llvm-build/tools/mlir/python_packages/mlir_core  /usr/bin/python3.13 /root/src/llvm-project/mlir/test/python/pass_manager.py
/root/src/llvm-project/mlir/test/python/pass_manager.py:397: SyntaxWarning: invalid escape sequence '\-'
  connector = "\-- " if i == len(entries) - 1 else "|-- "
Illegal instruction (core dumped)

$ cat test.py 
#!/usr/bin/python3

import numpy

numpy.show_runtime()

numpy.show_config()
$ NPY_DISABLE_CPU_FEATURES=VSX3 ./test.py 
Illegal instruction (core dumped)

I wonder if there are Power9 instructions being executed very early.

tuliom · 2025-01-08T13:17:33Z

@limburgher I learned how to build numpy with higher verbosity. I apologize for the noise.

The log is available at https://kojipkgs.fedoraproject.org//work/tasks/1824/127671824/build.log

Most, if not all files, are built with -mcpu=power9 despite the this distribution being expected to work on Power8.

seberg · 2025-01-08T13:30:07Z

Nice, I think that narrows it down. It would seem that these tests for CPU capabilities misfire for some reason:

VSX3 = mod_features.new(
  'VSX3', 3, implies: VSX2, args: {'val': '-mcpu=power9', 'match': '.*[mcpu=|vsx].*'},
  detect: {'val': 'VSX3', 'match': 'VSX.*'},
  test_code: files(source_root + '/numpy/distutils/checks/cpu_vsx3.c')[0],
  extra_tests: {
    'VSX3_HALF_DOUBLE': files(source_root + '/numpy/distutils/checks/extra_vsx3_half_double.c')[0]
  }
)

unless this is cross compiling on a power9 machine and the baseline architecture needs to be explicitly set?

limburgher · 2025-01-08T14:08:04Z

@tuliom no worries. :)
We do explicitly set power9 in the spec for ppc64le, but only for rhel10+.

tuliom · 2025-01-08T14:20:34Z

@limburgher Thanks!

I believe I found the issue. That rhel10 code is also being executed on Fedora.
I proposed a fix downstream here: https://src.fedoraproject.org/rpms/numpy/pull-request/51

I believe we can fix this upstream issue.

limburgher · 2025-01-08T14:49:10Z

Excellent, thank you! I'll keep and eye on that PR.

ngoldbaum · 2025-01-09T16:22:07Z

Closing since this came down to a bug in fedora. Please reopen if this is a bug in NumPy and I'm misunderstanding.

limburgher added the 00 - Bug label Jan 7, 2025

seberg added the component: SIMD Issues in SIMD (fast instruction sets) code or machinery label Jan 8, 2025

seberg added the 57 - Close? Issues which may be closable unless discussion continued label Jan 8, 2025

ngoldbaum closed this as completed Jan 9, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Power9 instruction gets executed on Power8 #28124

Power9 instruction gets executed on Power8 #28124

limburgher commented Jan 7, 2025

seberg commented Jan 8, 2025

tuliom commented Jan 8, 2025

seberg commented Jan 8, 2025

tuliom commented Jan 8, 2025

tuliom commented Jan 8, 2025

seberg commented Jan 8, 2025

limburgher commented Jan 8, 2025

tuliom commented Jan 8, 2025

limburgher commented Jan 8, 2025

ngoldbaum commented Jan 9, 2025

Power9 instruction gets executed on Power8 #28124

Power9 instruction gets executed on Power8 #28124

Comments

limburgher commented Jan 7, 2025

Describe the issue:

Reproduce the code example:

Error message:

Python and NumPy Versions:

Runtime Environment:

Context for the issue:

seberg commented Jan 8, 2025

tuliom commented Jan 8, 2025

seberg commented Jan 8, 2025

tuliom commented Jan 8, 2025

tuliom commented Jan 8, 2025

seberg commented Jan 8, 2025

limburgher commented Jan 8, 2025

tuliom commented Jan 8, 2025

limburgher commented Jan 8, 2025

ngoldbaum commented Jan 9, 2025