BUG, SIMD: Fix invalid value encountered in several ufuncs #22771

Merged
merged 7 commits into from
Dec 18, 2022

Conversation

seiko2plus
Member

@seiko2plus seiko2plus commented Dec 11, 2022

closes #22461, #22772, #22797

  • Fix invalid value encountered in rint/trunc/ceil/floor on armhf/neon
  • Fix invalid value encountered in rint/trunc/ceil/floor on x86/SSE2
  • Fix invalid value encountered in expm1 when SVML/AVX512 enabled
  • Fix invalid value encountered in cos/sin on aarch64 & ppc64le

For more details, see the linked issues above.

@seiko2plus seiko2plus added 00 - Bug component: SIMD Issues in SIMD (fast instruction sets) code or machinery labels Dec 11, 2022
@seiko2plus seiko2plus force-pushed the issue_22461 branch 4 times, most recently from 351397f to 65fa37e Compare December 11, 2022 04:26
@seiko2plus seiko2plus changed the title BUG, SIMD: Fix invalid value encountered in cos/sin on aarch64 BUG, SIMD: Fix invalid value encountered in expm1/cos/sin/ Dec 11, 2022
@seiko2plus seiko2plus added the 09 - Backport-Candidate PRs tagged should be backported label Dec 11, 2022
@seiko2plus seiko2plus linked an issue Dec 11, 2022 that may be closed by this pull request
@seiko2plus seiko2plus force-pushed the issue_22461 branch 4 times, most recently from 0b7a4fd to 45b56f8 Compare December 14, 2022 07:08
@seiko2plus seiko2plus changed the title BUG, SIMD: Fix invalid value encountered in expm1/cos/sin/ BUG, SIMD: Fix invalid value encountered in several ufuncs Dec 14, 2022
@charris
Member

charris commented Dec 14, 2022

The failure on ppc64le looks legit.

@mattip
Member

mattip commented Dec 15, 2022

To save some clicks: the error is here, and it occurs when checking that ufunc(np.array([np.nan], dtype=np.float32)) does not emit a warning for ufunc in [np.sin, np.cos]. This is in the ppc64le run with EXPECT_CPU_FEATURES="VSX VSX2 VSX3", but the run with EXPECT_CPU_FEATURES="VX VXE VXE2" succeeds (does not emit the warning).
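The check described above can be sketched with the standard `warnings` module. This is a minimal sketch, not the actual NumPy test: `collect_warnings`, `noisy_sin`, and `quiet_sin` are hypothetical names introduced here, and `noisy_sin` only stands in for a ufunc loop that spuriously reports an invalid value on NaN input.

```python
import math
import warnings


def collect_warnings(func, *args):
    """Call func(*args) and return the list of warnings it emitted."""
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning, even ones that were already shown once.
        warnings.simplefilter("always")
        func(*args)
    return caught


def noisy_sin(x):
    """Hypothetical stand-in for a kernel that spuriously warns on NaN."""
    if math.isnan(x):
        warnings.warn("invalid value encountered in sin", RuntimeWarning)
        return x
    return math.sin(x)


def quiet_sin(x):
    """Stand-in for the desired behaviour: NaN in, NaN out, no warning."""
    return x if math.isnan(x) else math.sin(x)


spurious = collect_warnings(noisy_sin, float("nan"))  # one RuntimeWarning
clean = collect_warnings(quiet_sin, float("nan"))     # no warnings
```

With NumPy installed, the same helper could be applied as `collect_warnings(np.sin, np.array([np.nan], dtype=np.float32))`; the failing test expects that list to be empty.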

  Providing non-signaling comparison intrinsics that guarantee
  no FP invalid exception in case of qNaN sounds great, but it
  costs an unacceptable number of extra intrinsics on ppc64le (VSX) and x86 (SSE).

  Therefore, an integer definition #NPY_SIMD_CMPSIGNAL has been
  provided instead to distinguish SIMD extensions
  that support only signaling comparisons.
@seiko2plus
Member Author

seiko2plus commented Dec 15, 2022

The Power/ISA guide wasn't clear to me; I thought the legacy AltiVec FP comparison instructions would not raise invalid FP exceptions for quiet NaNs. However, I discarded the whole patch related to implementing non-signaling FP comparisons; it was a bad approach on my side. See the last commit message for more clarification.

@seiko2plus
Member Author

close/open retrigger Travis

@seiko2plus seiko2plus closed this Dec 15, 2022
@seiko2plus seiko2plus reopened this Dec 15, 2022
Member

@seberg seberg left a comment

Logic looks fine, the new define also seems like a good choice to me. The rounding changes are hard to follow but the logic looks sound.
There are pretty extensive tests for floor, ceil, etc. in the SIMD tests, so I think this is good.

PS: I am not immediately sure how well weird corner cases are covered (especially for float32, like np.nextafter(2, 0)), but it seems unlikely to be an issue. We also seem to not have any integration tests, but that would be feature creep here.

npyv_b@len@ nnan_mask = npyv_notnan_@sfx@(x);
npyv_@sfx@ x_exnan = npyv_select_@sfx@(nnan_mask, x, npyv_setall_@sfx@(@default_val@));
npyv_@sfx@ out = __svml_@func@@func_suffix@(x_exnan);
out = npyv_select_@sfx@(nnan_mask, out, x);
Member

Oh an SVML issue, interesting...
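The select / call / select sequence in the snippet above can be illustrated in plain Python. This is only a sketch under stated assumptions: a Python list stands in for the SIMD lanes, `apply_excluding_nan` and `recording_expm1` are hypothetical names, and the default value of 0.0 is an assumption, since the template's `@default_val@` is not shown in the excerpt.

```python
import math


def apply_excluding_nan(func, lanes, default_val=0.0):
    """Apply func lane by lane while keeping NaN away from func.

    NaN lanes are replaced by default_val before the call and restored
    afterwards, mirroring the select / call / select sequence.
    """
    # Lane mask that is True where the input is not NaN.
    not_nan = [not math.isnan(x) for x in lanes]
    # Substitute a harmless value in the NaN lanes.
    safe = [x if ok else default_val for x, ok in zip(lanes, not_nan)]
    # func never sees a NaN, so it cannot raise "invalid" because of one.
    out = [func(x) for x in safe]
    # Put the original NaN back into the lanes that held one.
    return [o if ok else x for o, x, ok in zip(out, lanes, not_nan)]


seen = []


def recording_expm1(x):
    """expm1 wrapper that records its inputs, for demonstration only."""
    seen.append(x)
    return math.expm1(x)


result = apply_excluding_nan(recording_expm1, [0.0, float("nan"), 1.0])
```

The recorded inputs show that the wrapped function is only ever called with non-NaN values, while the NaN lane still comes back as NaN in the result.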

// a if |a| >= 2^23 or a == NaN
npyv_u32 mask = vcleq_f32(abs_x, two_power_23);
mask = vandq_u32(mask, nnan_mask);
return vbslq_f32(mask, floor, a);
Member

Logic looks right (I didn't try to make sure the corner cases are correct; the tests surely do). I guess this approach to floor is just a bit faster than the casting approach (with the dance necessary to mask large values)?

Member Author

I guess this approach to floor is just a bit faster than the casting

To avoid testing for finite values, then yes.

necessary to mask large values

To avoid divergences caused by sub/add
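The pass-through rule in the snippet under discussion (return `a` when |a| >= 2^23 or `a` is NaN) can be sketched for a single lane in Python. This is an illustration, not the NEON implementation: `floor_lane` is a hypothetical name, Python floats stand in for float32 values, and the truncate-then-correct rounding step is an assumption, since the excerpt does not show how `floor` is computed.

```python
import math

TWO_POWER_23 = 8388608.0  # 2**23: float32 values at or above this are integral


def floor_lane(a):
    """Floor one float32-representable lane (Python float used for illustration)."""
    # Large magnitudes, infinities and NaN pass through unchanged.
    if math.isnan(a) or abs(a) >= TWO_POWER_23:
        return a
    # Assumed rounding step: truncate toward zero, then correct negatives.
    t = float(int(a))
    if t > a:
        t -= 1.0
    # Preserve the sign of zero, e.g. floor(-0.0) is -0.0.
    if t == 0.0:
        t = math.copysign(0.0, a)
    return t
```

Because float32 carries a 24-bit significand, every finite value with magnitude at or above 2^23 is already an integer, so returning the input unchanged in that range is exact.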


data_cmp = [func(a, b) for a, b in zip(data_a, data_b)]
cmp = to_bool(intrin(vdata_a, vdata_b))
assert cmp == data_cmp
Member

nice parametrizing this :).

    ))
    def test_unary_spurious_fpexception(self, ufunc, dtype, data, escape):
        if escape and ufunc in escape:
            return
Member

Should this be pytest.xfail() or pytest.skip()? That way there is a very small chance to notice it one day again.

Member Author

Should this be pytest.xfail() or pytest.skip()?

It's normal to raise this kind of domain/invalid FP exception, but if you mean the second case that is related to float16, then yes.

@seberg
Member

seberg commented Dec 16, 2022

Hmmmpf, tried to cancel and start the Travis CI ppc64le job, but right now it doesn't seem to help to get it started...

@charris charris added this to the 1.24.1 release milestone Dec 17, 2022
@charris
Member

charris commented Dec 17, 2022

close/reopen

@charris charris closed this Dec 17, 2022
@charris charris reopened this Dec 17, 2022
@seberg
Member

seberg commented Dec 18, 2022

OK, CI ran through now, let's just put it in; we can follow up, but I doubt there is anything (except removing SSE2 without SSE41 to simplify maintenance ;)). Thanks Sayed!

@seberg seberg merged commit f589407 into numpy:main Dec 18, 2022
@charris charris removed the 09 - Backport-Candidate PRs tagged should be backported label Dec 19, 2022
@charris charris removed this from the 1.24.1 release milestone Dec 19, 2022
@seberg
Member

seberg commented Dec 20, 2022

@seiko2plus we should follow up on this for 1.24.1 since this is backported. I am getting these on M1:

FAILED numpy/core/tests/test_umath.py::TestSpecialFloats::test_unary_spurious_fpexception[data10-escape10-f-cos] - AssertionError: Got warnings: [<warnings.WarningMessage object at 0x16a...
FAILED numpy/core/tests/test_umath.py::TestSpecialFloats::test_unary_spurious_fpexception[data10-escape10-f-sin] - AssertionError: Got warnings: [<warnings.WarningMessage object at 0x16e...
FAILED numpy/core/tests/test_umath.py::TestSpecialFloats::test_unary_spurious_fpexception[data11-escape11-f-cos] - AssertionError: Got warnings: [<warnings.WarningMessage object at 0x16e...
FAILED numpy/core/tests/test_umath.py::TestSpecialFloats::test_unary_spurious_fpexception[data11-escape11-f-sin] - AssertionError: Got warnings: [<warnings.WarningMessage object at 0x176...
FAILED numpy/core/tests/test_umath.py::TestSpecialFloats::test_unary_spurious_fpexception[data12-escape12-f-cos] - AssertionError: Got warnings: [<warnings.WarningMessage object at 0x176...
FAILED numpy/core/tests/test_umath.py::TestSpecialFloats::test_unary_spurious_fpexception[data12-escape12-f-sin] - AssertionError: Got warnings: [<warnings.WarningMessage object at 0x176...
FAILED numpy/core/tests/test_umath.py::TestSpecialFloats::test_unary_spurious_fpexception[data13-escape13-f-cos] - AssertionError: Got warnings: [<warnings.WarningMessage object at 0x176...
FAILED numpy/core/tests/test_umath.py::TestSpecialFloats::test_unary_spurious_fpexception[data13-escape13-f-sin] - AssertionError: Got warnings: [<warnings.WarningMessage object at 0x176...
FAILED numpy/core/tests/test_umath.py::TestSpecialFloats::test_unary_spurious_fpexception[data14-escape14-f-cos] - AssertionError: Got warnings: [<warnings.WarningMessage object at 0x176...
FAILED numpy/core/tests/test_umath.py::TestSpecialFloats::test_unary_spurious_fpexception[data14-escape14-f-sin] - AssertionError: Got warnings: [<warnings.WarningMessage object at 0x177...
FAILED numpy/core/tests/test_umath.py::TestSpecialFloats::test_unary_spurious_fpexception[data15-escape15-f-cos] - AssertionError: Got warnings: [<warnings.WarningMessage object at 0x176...
FAILED numpy/core/tests/test_umath.py::TestSpecialFloats::test_unary_spurious_fpexception[data15-escape15-f-sin] - AssertionError: Got warnings: [<warnings.WarningMessage object at 0x176...

Unless the fix is very simple, I suspect xfailing the test on apple is just as well, though.

@seiko2plus
Member Author

except removing SSE2 without SSE41 to simplify maintenance ;)). Thanks Sayed!

I need some time to think about the consequences of this move: SSE2 is still part of our default baseline, and dropping it will require dispatching each SIMD kernel.

we should follow up on this for 1.24.1 since this is backported. I am getting these on M1

I will look into it; it would be great if you could provide a build log or at least a verbose test result.

@seberg
Member

seberg commented Dec 21, 2022

These are all the NaN test cases failing for dtype="f" and cos/sin. There is nothing new about this; it also fails in, e.g., NumPy 1.23.5.

If you really want, here is the build log:

Running from numpy source directory.
/Users/sebastianb/forks/numpy/setup.py:67: DeprecationWarning: 

  `numpy.distutils` is deprecated since NumPy 1.23.0, as a result
  of the deprecation of `distutils` itself. It will be removed for
  Python >= 3.12. For older Python versions it will remain present.
  It is recommended to use `setuptools < 60.0` for those Python versions.
  For more details, see:
    https://numpy.org/devdocs/reference/distutils_status_migration.html 


  import numpy.distutils.command.sdist
numpy/random/_bounded_integers.pxd.in has not changed
numpy/random/_philox.pyx has not changed
numpy/random/_bounded_integers.pyx.in has not changed
numpy/random/_sfc64.pyx has not changed
numpy/random/_mt19937.pyx has not changed
numpy/random/bit_generator.pyx has not changed
numpy/random/_bounded_integers.pyx has not changed
numpy/random/mtrand.pyx has not changed
numpy/random/_generator.pyx has not changed
numpy/random/_pcg64.pyx has not changed
numpy/random/_common.pyx has not changed
Cythonizing sources
INFO: blas_opt_info:
INFO: blas_armpl_info:
INFO: customize UnixCCompiler
INFO:   libraries armpl_lp64_mp not found in ['/opt/homebrew/Caskroom/mambaforge/base/lib', '/usr/local/lib', '/usr/lib']
INFO:   NOT AVAILABLE
INFO: 
INFO: blas_mkl_info:
INFO:   libraries mkl_rt not found in ['/opt/homebrew/Caskroom/mambaforge/base/lib', '/usr/local/lib', '/usr/lib']
INFO:   NOT AVAILABLE
INFO: 
INFO: blis_info:
INFO:   libraries blis not found in ['/opt/homebrew/Caskroom/mambaforge/base/lib', '/usr/local/lib', '/usr/lib']
INFO:   NOT AVAILABLE
INFO: 
INFO: openblas_info:
INFO: C compiler: arm64-apple-darwin20.0.0-clang -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -fwrapv -O2 -Wall -fPIC -O2 -isystem /opt/homebrew/Caskroom/mambaforge/base/include -arch arm64 -fPIC -O2 -isystem /opt/homebrew/Caskroom/mambaforge/base/include -arch arm64 -ftree-vectorize -fPIC -fPIE -fstack-protector-strong -O2 -pipe -isystem /opt/homebrew/Caskroom/mambaforge/base/include -D_FORTIFY_SOURCE=2 -isystem /opt/homebrew/Caskroom/mambaforge/base/include

creating /var/folders/8t/_j0pcqpx34x2w69_dkkrk6540000gp/T/tmpc_to2r37/var
creating /var/folders/8t/_j0pcqpx34x2w69_dkkrk6540000gp/T/tmpc_to2r37/var/folders
creating /var/folders/8t/_j0pcqpx34x2w69_dkkrk6540000gp/T/tmpc_to2r37/var/folders/8t
creating /var/folders/8t/_j0pcqpx34x2w69_dkkrk6540000gp/T/tmpc_to2r37/var/folders/8t/_j0pcqpx34x2w69_dkkrk6540000gp
creating /var/folders/8t/_j0pcqpx34x2w69_dkkrk6540000gp/T/tmpc_to2r37/var/folders/8t/_j0pcqpx34x2w69_dkkrk6540000gp/T
creating /var/folders/8t/_j0pcqpx34x2w69_dkkrk6540000gp/T/tmpc_to2r37/var/folders/8t/_j0pcqpx34x2w69_dkkrk6540000gp/T/tmpc_to2r37
INFO: compile options: '-c'
INFO: arm64-apple-darwin20.0.0-clang: /var/folders/8t/_j0pcqpx34x2w69_dkkrk6540000gp/T/tmpc_to2r37/source.c
INFO: arm64-apple-darwin20.0.0-clang /var/folders/8t/_j0pcqpx34x2w69_dkkrk6540000gp/T/tmpc_to2r37/var/folders/8t/_j0pcqpx34x2w69_dkkrk6540000gp/T/tmpc_to2r37/source.o -L/opt/homebrew/Caskroom/mambaforge/base/lib -lopenblas -o /var/folders/8t/_j0pcqpx34x2w69_dkkrk6540000gp/T/tmpc_to2r37/a.out
INFO:   FOUND:
INFO:     libraries = ['openblas', 'openblas']
INFO:     library_dirs = ['/opt/homebrew/Caskroom/mambaforge/base/lib']
INFO:     language = c
INFO:     define_macros = [('HAVE_CBLAS', None)]
INFO: 
INFO:   FOUND:
INFO:     libraries = ['openblas', 'openblas']
INFO:     library_dirs = ['/opt/homebrew/Caskroom/mambaforge/base/lib']
INFO:     language = c
INFO:     define_macros = [('HAVE_CBLAS', None)]
INFO: 
non-existing path in 'numpy/distutils': 'site.cfg'
INFO: lapack_opt_info:
INFO: lapack_armpl_info:
INFO:   libraries armpl_lp64_mp not found in ['/opt/homebrew/Caskroom/mambaforge/base/lib', '/usr/local/lib', '/usr/lib']
INFO:   NOT AVAILABLE
INFO: 
INFO: lapack_mkl_info:
INFO:   libraries mkl_rt not found in ['/opt/homebrew/Caskroom/mambaforge/base/lib', '/usr/local/lib', '/usr/lib']
INFO:   NOT AVAILABLE
INFO: 
INFO: openblas_lapack_info:
INFO: C compiler: arm64-apple-darwin20.0.0-clang -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -fwrapv -O2 -Wall -fPIC -O2 -isystem /opt/homebrew/Caskroom/mambaforge/base/include -arch arm64 -fPIC -O2 -isystem /opt/homebrew/Caskroom/mambaforge/base/include -arch arm64 -ftree-vectorize -fPIC -fPIE -fstack-protector-strong -O2 -pipe -isystem /opt/homebrew/Caskroom/mambaforge/base/include -D_FORTIFY_SOURCE=2 -isystem /opt/homebrew/Caskroom/mambaforge/base/include

creating /var/folders/8t/_j0pcqpx34x2w69_dkkrk6540000gp/T/tmp7f36csyk/var
creating /var/folders/8t/_j0pcqpx34x2w69_dkkrk6540000gp/T/tmp7f36csyk/var/folders
creating /var/folders/8t/_j0pcqpx34x2w69_dkkrk6540000gp/T/tmp7f36csyk/var/folders/8t
creating /var/folders/8t/_j0pcqpx34x2w69_dkkrk6540000gp/T/tmp7f36csyk/var/folders/8t/_j0pcqpx34x2w69_dkkrk6540000gp
creating /var/folders/8t/_j0pcqpx34x2w69_dkkrk6540000gp/T/tmp7f36csyk/var/folders/8t/_j0pcqpx34x2w69_dkkrk6540000gp/T
creating /var/folders/8t/_j0pcqpx34x2w69_dkkrk6540000gp/T/tmp7f36csyk/var/folders/8t/_j0pcqpx34x2w69_dkkrk6540000gp/T/tmp7f36csyk
INFO: compile options: '-c'
INFO: arm64-apple-darwin20.0.0-clang: /var/folders/8t/_j0pcqpx34x2w69_dkkrk6540000gp/T/tmp7f36csyk/source.c
INFO: arm64-apple-darwin20.0.0-clang /var/folders/8t/_j0pcqpx34x2w69_dkkrk6540000gp/T/tmp7f36csyk/var/folders/8t/_j0pcqpx34x2w69_dkkrk6540000gp/T/tmp7f36csyk/source.o -L/opt/homebrew/Caskroom/mambaforge/base/lib -lopenblas -o /var/folders/8t/_j0pcqpx34x2w69_dkkrk6540000gp/T/tmp7f36csyk/a.out
INFO:   FOUND:
INFO:     libraries = ['openblas', 'openblas']
INFO:     library_dirs = ['/opt/homebrew/Caskroom/mambaforge/base/lib']
INFO:     language = c
INFO:     define_macros = [('HAVE_CBLAS', None)]
INFO: 
INFO:   FOUND:
INFO:     libraries = ['openblas', 'openblas']
INFO:     library_dirs = ['/opt/homebrew/Caskroom/mambaforge/base/lib']
INFO:     language = c
INFO:     define_macros = [('HAVE_CBLAS', None)]
INFO: 
Warning: attempted relative import with no known parent package
/opt/homebrew/Caskroom/mambaforge/base/lib/python3.10/distutils/dist.py:274: UserWarning: Unknown distribution option: 'define_macros'
  warnings.warn(msg)
running build
running config_cc
INFO: unifing config_cc, config, build_clib, build_ext, build commands --compiler options
running config_fc
INFO: unifing config_fc, config, build_clib, build_ext, build commands --fcompiler options
running build_src
INFO: build_src
INFO: building py_modules sources
INFO: building library "npymath" sources
WARN: Could not locate executable gfortran
WARN: Could not locate executable f95
WARN: Could not locate executable nagfor
WARN: Could not locate executable f90
WARN: Could not locate executable f77
WARN: Could not locate executable xlf90
WARN: Could not locate executable xlf
WARN: Could not locate executable ifort
WARN: Could not locate executable ifc
WARN: Could not locate executable g77
WARN: Could not locate executable g95
WARN: Could not locate executable pgfortran
WARN: don't know how to compile Fortran code on platform 'posix'
INFO:   adding 'build/src.macosx-11.0-arm64-3.10/numpy/core/src/npymath' to include_dirs.
INFO: None - nothing done with h_files = ['build/src.macosx-11.0-arm64-3.10/numpy/core/src/npymath/npy_math_internal.h']
INFO: building library "npyrandom" sources
INFO: building extension "numpy.core._multiarray_tests" sources
INFO: building extension "numpy.core._multiarray_umath" sources
INFO:   adding 'build/src.macosx-11.0-arm64-3.10/numpy/core/src/multiarray' to include_dirs.
INFO:   adding 'build/src.macosx-11.0-arm64-3.10/numpy/core/src/common' to include_dirs.
INFO:   adding 'build/src.macosx-11.0-arm64-3.10/numpy/core/src/umath' to include_dirs.
INFO: numpy.core - nothing done with h_files = ['build/src.macosx-11.0-arm64-3.10/numpy/core/src/multiarray/arraytypes.h', 'build/src.macosx-11.0-arm64-3.10/numpy/core/src/common/npy_sort.h', 'numpy/core/src/common/npy_partition.h', 'numpy/core/src/common/npy_binsearch.h', 'build/src.macosx-11.0-arm64-3.10/numpy/core/src/umath/funcs.inc', 'build/src.macosx-11.0-arm64-3.10/numpy/core/src/umath/simd.inc', 'build/src.macosx-11.0-arm64-3.10/numpy/core/src/umath/loops.h', 'build/src.macosx-11.0-arm64-3.10/numpy/core/src/umath/loops_utils.h', 'build/src.macosx-11.0-arm64-3.10/numpy/core/src/umath/matmul.h', 'numpy/core/src/umath/clip.h', 'build/src.macosx-11.0-arm64-3.10/numpy/core/src/common/templ_common.h', 'build/src.macosx-11.0-arm64-3.10/numpy/core/include/numpy/config.h', 'build/src.macosx-11.0-arm64-3.10/numpy/core/include/numpy/_numpyconfig.h', 'build/src.macosx-11.0-arm64-3.10/numpy/core/include/numpy/__multiarray_api.h', 'build/src.macosx-11.0-arm64-3.10/numpy/core/include/numpy/__ufunc_api.h']
INFO: building extension "numpy.core._umath_tests" sources
INFO: building extension "numpy.core._rational_tests" sources
INFO: building extension "numpy.core._struct_ufunc_tests" sources
INFO: building extension "numpy.core._operand_flag_tests" sources
INFO: building extension "numpy.core._simd" sources
INFO:   adding 'build/src.macosx-11.0-arm64-3.10/numpy/core/src/_simd' to include_dirs.
INFO: numpy.core - nothing done with h_files = ['build/src.macosx-11.0-arm64-3.10/numpy/core/src/_simd/_simd_inc.h', 'build/src.macosx-11.0-arm64-3.10/numpy/core/src/_simd/_simd_data.inc']
INFO: building extension "numpy.fft._pocketfft_internal" sources
INFO: building extension "numpy.linalg.lapack_lite" sources
INFO: building extension "numpy.linalg._umath_linalg" sources
INFO: building extension "numpy.random._mt19937" sources
INFO: building extension "numpy.random._philox" sources
INFO: building extension "numpy.random._pcg64" sources
INFO: building extension "numpy.random._sfc64" sources
INFO: building extension "numpy.random._common" sources
INFO: building extension "numpy.random.bit_generator" sources
INFO: building extension "numpy.random._generator" sources
INFO: building extension "numpy.random._bounded_integers" sources
INFO: building extension "numpy.random.mtrand" sources
INFO: building data_files sources
INFO: build_src: building npy-pkg config files
/opt/homebrew/Caskroom/mambaforge/base/lib/python3.10/site-packages/setuptools/command/install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
  warnings.warn(
running build_py
copying build/src.macosx-11.0-arm64-3.10/numpy/__config__.py -> build/lib.macosx-11.0-arm64-3.10/numpy
copying build/src.macosx-11.0-arm64-3.10/numpy/distutils/__config__.py -> build/lib.macosx-11.0-arm64-3.10/numpy/distutils
UPDATING build/lib.macosx-11.0-arm64-3.10/numpy/_version.py
set build/lib.macosx-11.0-arm64-3.10/numpy/_version.py to '1.25.0.dev0+243.g954aee7ab'
running build_clib
INFO: customize UnixCCompiler
INFO: customize UnixCCompiler using new_build_clib
INFO: CCompilerOpt.__init__[813] : load cache from file -> /Users/sebastianb/forks/numpy/build/temp.macosx-11.0-arm64-3.10/ccompiler_opt_cache_clib.py
INFO: CCompilerOpt.__init__[824] : hit the file cache
running build_ext
INFO: customize UnixCCompiler
INFO: customize UnixCCompiler using new_build_ext
INFO: CCompilerOpt.__init__[813] : load cache from file -> /Users/sebastianb/forks/numpy/build/temp.macosx-11.0-arm64-3.10/ccompiler_opt_cache_ext.py
INFO: CCompilerOpt.__init__[824] : hit the file cache
INFO: customize UnixCCompiler
WARN: #### ['arm64-apple-darwin20.0.0-clang', '-Wno-unused-result', '-Wsign-compare', '-Wunreachable-code', '-DNDEBUG', '-fwrapv', '-O2', '-Wall', '-fPIC', '-O2', '-isystem', '/opt/homebrew/Caskroom/mambaforge/base/include', '-arch', 'arm64', '-fPIC', '-O2', '-isystem', '/opt/homebrew/Caskroom/mambaforge/base/include', '-arch', 'arm64', '-ftree-vectorize', '-fPIC', '-fPIE', '-fstack-protector-strong', '-O2', '-pipe', '-isystem', '/opt/homebrew/Caskroom/mambaforge/base/include', '-D_FORTIFY_SOURCE=2', '-isystem', '/opt/homebrew/Caskroom/mambaforge/base/include'] #######
INFO: customize UnixCCompiler using new_build_ext
running install
running install_lib
copying build/lib.macosx-11.0-arm64-3.10/numpy/distutils/__config__.py -> /Users/sebastianb/forks/numpy/build/testenv/lib/python3.10/site-packages/numpy/distutils
copying build/lib.macosx-11.0-arm64-3.10/numpy/_version.py -> /Users/sebastianb/forks/numpy/build/testenv/lib/python3.10/site-packages/numpy
copying build/lib.macosx-11.0-arm64-3.10/numpy/__config__.py -> /Users/sebastianb/forks/numpy/build/testenv/lib/python3.10/site-packages/numpy
byte-compiling /Users/sebastianb/forks/numpy/build/testenv/lib/python3.10/site-packages/numpy/distutils/__config__.py to __config__.cpython-310.pyc
byte-compiling /Users/sebastianb/forks/numpy/build/testenv/lib/python3.10/site-packages/numpy/_version.py to _version.cpython-310.pyc
byte-compiling /Users/sebastianb/forks/numpy/build/testenv/lib/python3.10/site-packages/numpy/__config__.py to __config__.cpython-310.pyc
running install_data
copying build/src.macosx-11.0-arm64-3.10/numpy/core/lib/npy-pkg-config/npymath.ini -> /Users/sebastianb/forks/numpy/build/testenv/lib/python3.10/site-packages/numpy/core/lib/npy-pkg-config
copying build/src.macosx-11.0-arm64-3.10/numpy/core/lib/npy-pkg-config/mlib.ini -> /Users/sebastianb/forks/numpy/build/testenv/lib/python3.10/site-packages/numpy/core/lib/npy-pkg-config
running install_clib
running install_egg_info
running egg_info
writing numpy.egg-info/PKG-INFO
writing dependency_links to numpy.egg-info/dependency_links.txt
writing entry points to numpy.egg-info/entry_points.txt
writing top-level names to numpy.egg-info/top_level.txt
/opt/homebrew/Caskroom/mambaforge/base/lib/python3.10/site-packages/setuptools/command/egg_info.py:624: SetuptoolsDeprecationWarning: Custom 'build_py' does not implement 'get_data_files_without_manifest'.
Please extend command classes from setuptools instead of distutils.
  warnings.warn(
reading manifest template 'MANIFEST.in'
no previously-included directories found matching 'doc/build'
no previously-included directories found matching 'doc/source/generated'
no previously-included directories found matching 'benchmarks/results'
no previously-included directories found matching 'benchmarks/html'
no previously-included directories found matching 'benchmarks/numpy'
warning: no previously-included files matching '*.pyo' found anywhere in distribution
warning: no previously-included files matching '*.pyd' found anywhere in distribution
warning: no previously-included files matching '*.swp' found anywhere in distribution
warning: no previously-included files matching '*.bak' found anywhere in distribution
warning: no previously-included files matching '*~' found anywhere in distribution
adding license file 'LICENSE.txt'
adding license file 'LICENSES_bundled.txt'
writing manifest file 'numpy.egg-info/SOURCES.txt'
removing '/Users/sebastianb/forks/numpy/build/testenv/lib/python3.10/site-packages/numpy-1.25.0.dev0+243.g954aee7ab-py3.10.egg-info' (and everything under it)
Copying numpy.egg-info to /Users/sebastianb/forks/numpy/build/testenv/lib/python3.10/site-packages/numpy-1.25.0.dev0+243.g954aee7ab-py3.10.egg-info
running install_scripts
Installing f2py script to /Users/sebastianb/forks/numpy/build/testenv/bin
Installing f2py3 script to /Users/sebastianb/forks/numpy/build/testenv/bin
Installing f2py3.10 script to /Users/sebastianb/forks/numpy/build/testenv/bin
writing list of installed files to '/Users/sebastianb/forks/numpy/build/testenvtmp_install_log.txt'
INFO: 
########### EXT COMPILER OPTIMIZATION ###########
INFO: Platform      : 
  Architecture: aarch64
  Compiler    : clang

CPU baseline  : 
  Requested   : 'min'
  Enabled     : NEON NEON_FP16 NEON_VFPV4 ASIMD
  Flags       : none
  Extra checks: none

CPU dispatch  : 
  Requested   : 'max -xop -fma4'
  Enabled     : ASIMDHP ASIMDDP ASIMDFHM
  Generated   : 
              : 
  ASIMDHP     : NEON NEON_FP16 NEON_VFPV4 ASIMD
  Flags       : -march=armv8.2-a+fp16
  Extra checks: none
  Detect      : ASIMD ASIMDHP
              : numpy/core/src/umath/_umath_tests.dispatch.c
INFO: CCompilerOpt.cache_flush[857] : write cache to path -> /Users/sebastianb/forks/numpy/build/temp.macosx-11.0-arm64-3.10/ccompiler_opt_cache_ext.py
INFO: 
########### CLIB COMPILER OPTIMIZATION ###########
INFO: Platform      : 
  Architecture: aarch64
  Compiler    : clang

CPU baseline  : 
  Requested   : 'min'
  Enabled     : NEON NEON_FP16 NEON_VFPV4 ASIMD
  Flags       : none
  Extra checks: none

CPU dispatch  : 
  Requested   : 'max -xop -fma4'
  Enabled     : ASIMDHP ASIMDDP ASIMDFHM
  Generated   : none
INFO: CCompilerOpt.cache_flush[857] : write cache to path -> /Users/sebastianb/forks/numpy/build/temp.macosx-11.0-arm64-3.10/ccompiler_opt_cache_clib.py

@seiko2plus
Member Author

seiko2plus commented Dec 22, 2022

@seberg, I wasn't able to reproduce it. Maybe clang performs an aggressive optimization that somehow optimizes out the new changes:

#if NPY_SIMD_CMPSIGNAL
// Eliminate NaN to avoid FP invalid exception
x_in = npyv_and_f32(x_in, npyv_reinterpret_f32_u32(npyv_cvt_u32_b32(nnan_mask)));
#endif

or maybe your build was cached as I can see from your build log:

INFO: customize UnixCCompiler using new_build_clib
INFO: CCompilerOpt.__init__[813] : load cache from file -> /Users/sebastianb/forks/numpy/build/temp.macosx-11.0-arm64-3.10/ccompiler_opt_cache_clib.py
INFO: CCompilerOpt.__init__[824] : hit the file cache
running build_ext
INFO: customize UnixCCompiler
INFO: customize UnixCCompiler using new_build_ext
INFO: CCompilerOpt.__init__[813] : load cache from file -> /Users/sebastianb/forks/numpy/build/temp.macosx-11.0-arm64-3.10/ccompiler_opt_cache_ext.py
INFO: CCompilerOpt.__init__[824] : hit the file cache
INFO: customize UnixCCompiler
WARN: #### ['arm64-apple-darwin20.0.0-clang', '-Wno-unused-result', '-Wsign-compare', '-Wunreachable-code', '-DNDEBUG', '-fwrapv', '-O2', '-Wall', '-fPIC', '-O2', '-isystem', '/opt/homebrew/Caskroom/mambaforge/base/include', '-arch', 'arm64', '-fPIC', '-O2', '-isystem', '/opt/homebrew/Caskroom/mambaforge/base/include', '-arch', 'arm64', '-ftree-vectorize', '-fPIC', '-fPIE', '-fstack-protector-strong', '-O2', '-pipe', '-isystem', '/opt/homebrew/Caskroom/mambaforge/base/include', '-D_FORTIFY_SOURCE=2', '-isystem', '/opt/homebrew/Caskroom/mambaforge/base/include'] #######
INFO: customize UnixCCompiler using new_build_ext

Could you please provide a non-cached build log and version of clang?
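The NaN-elimination step quoted in the comment above (AND-ing the input with the not-NaN mask) can be emulated one lane at a time in Python. This is a sketch under assumptions, not NumPy code: `f32_to_bits`, `bits_to_f32`, and `eliminate_nan` are hypothetical helpers, and the `struct` module stands in for the vector reinterpret intrinsics.

```python
import math
import struct


def f32_to_bits(x):
    """Reinterpret a float32 value as its 32-bit pattern."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_f32(bits):
    """Reinterpret a 32-bit pattern as a float32 value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def eliminate_nan(lanes):
    """Zero out NaN lanes by AND-ing each lane with an all-ones/all-zeros mask."""
    out = []
    for x in lanes:
        # Not-NaN mask: all bits set for ordinary lanes, all clear for NaN.
        mask = 0x00000000 if math.isnan(x) else 0xFFFFFFFF
        out.append(bits_to_f32(f32_to_bits(x) & mask))
    return out


cleaned = eliminate_nan([1.5, float("nan"), -2.0])
```

After this step the NaN lane holds +0.0, so a later signaling comparison has no NaN operand to trip over. A real kernel would presumably need to restore NaN in those lanes afterwards; that part is not shown in the quoted snippet.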

@seberg
Member

seberg commented Dec 22, 2022

clang version 14.0.6
Target: arm64-apple-darwin22.1.0
Thread model: posix

Hmmm, maybe the compiler got updated recently with Ventura? (I am seeing this on main branch, but I didn't think it would matter). build.log

Labels
00 - Bug component: SIMD Issues in SIMD (fast instruction sets) code or machinery
4 participants