1.11.1 pip install fails on RHEL 7.2 (IBM Power8) with Python 2.7.12 #7836

Closed
bbelgodere opened this issue Jul 14, 2016 · 15 comments

@bbelgodere

bbelgodere commented Jul 14, 2016

While attempting to install numpy on a Power8 (ppc) machine running RHEL 7.2, I'm encountering the error below. 1.11.0 installs without issue.

pip install numpy==1.11.1

ERROR

error: Command "gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -Inumpy/core/include -Ibuild/src.linux-ppc64le-2.7/numpy/core/include/numpy -Inumpy/core/src/private -Inumpy/core/src -Inumpy/core -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/src/npysort -I/opt/share/Python-2.7.12/ppc64le/include/python2.7 -Ibuild/src.linux-ppc64le-2.7/numpy/core/src/private -Ibuild/src.linux-ppc64le-2.7/numpy/core/src/private -Ibuild/src.linux-ppc64le-2.7/numpy/core/src/private -c build/src.linux-ppc64le-2.7/numpy/core/src/npymath/npy_math_complex.c -o build/temp.linux-ppc64le-2.7/build/src.linux-ppc64le-2.7/numpy/core/src/npymath/npy_math_complex.o" failed with exit status 1

    ----------------------------------------
  Rolling back uninstall of numpy

Command "/opt/share/Python-2.7.12/ppc64le/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-LvkRpf/numpy/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-Bg_8Ko-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-build-LvkRpf/numpy/

ENVIRONMENT DETAILS

OS: Red Hat Enterprise Linux 7.2 Maipo
Kernel: ppc64le Linux 3.10.0-327.el7.ppc64le
CPU: IBM PowerPC G3 POWER8 (raw) @ 160x 3.491GHz
$ pip list
Cython (0.24)
numpy (1.11.0)
pip (8.1.2)
scipy (0.17.1)
setuptools (24.0.3)
virtualenv (15.0.2)
$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/ppc64le-redhat-linux/4.8.5/lto-wrapper
Target: ppc64le-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-bootstrap --enable-shared --enable-threads=posix --enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-linker-hash-style=gnu --enable-languages=c,c++,objc,obj-c++,java,fortran,ada,go,lto --enable-plugin --enable-initfini-array --disable-libgcj --with-isl=/builddir/build/BUILD/gcc-4.8.5-20150702/obj-ppc64le-redhat-linux/isl-install --with-cloog=/builddir/build/BUILD/gcc-4.8.5-20150702/obj-ppc64le-redhat-linux/cloog-install --enable-gnu-indirect-function --enable-secureplt --with-long-double-128 --enable-targets=powerpcle-linux --disable-multilib --with-cpu-64=power7 --with-tune-64=power8 --build=ppc64le-redhat-linux
Thread model: posix
gcc version 4.8.5 20150623 (Red Hat 4.8.5-4) (GCC)
@rgommers
Member

Do you get the same error with python setup.py install from a checkout or tarball of v1.11.1?

If so and it's specific to PPC (looks like it), it would help if you could bisect the issue to find the problematic commit. I can't reproduce this, which makes investigating a bit hard.

@bbelgodere
Author

@rgommers It looks like the first bad commit is 5006616.

A checkout of refs/tags/v1.11.0 works fine (specifically merge d51f874).

A checkout of refs/tags/v1.11.1 fails as before; see below for the bisect log.

We have RHEL 7.2 set up on both PPC and x86 within our cluster.

ldd --version
ldd (GNU libc) 2.17
Copyright (C) 2012 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Written by Roland McGrath and Ulrich Drepper.

The x86 machine installed 1.11.1 without issue.

git checkout refs/tags/v1.11.1
git bisect start
git bisect bad
pip uninstall numpy
git bisect good d51f874f7c30f335189e1175199f3275adfc79f8

Bisecting: 24 revisions left to test after this (roughly 5 steps)
[a62cf4ebb0d0c64bd327fe695f220b8521f760a4] Merge pull request #7656 from charris/backport-7655

pip install git+file:///u/bmbelgod/numpy
git bisect bad

Bisecting: 12 revisions left to test after this (roughly 4 steps)
[1a7d6d6f0fad5f1fc2fba7275b78007eb8578a5c] Merge pull request #7590 from charris/backport-7584

pip install git+file:///u/bmbelgod/numpy
git bisect bad

Bisecting: 6 revisions left to test after this (roughly 3 steps)
[1a3e301557840095bbb93f53698f6f52d21984fb] Merge pull request #7551 from charris/backport-7549

pip install git+file:///u/bmbelgod/numpy
git bisect bad

Bisecting: 2 revisions left to test after this (roughly 2 steps)
[5006616659ebcd9960f730d28255648a562d7308] BUG: Extend glibc complex trig functions blacklist to glibc < 2.18.

pip install git+file:///u/bmbelgod/numpy
git bisect bad

Bisecting: 0 revisions left to test after this (roughly 1 step)
[bea5883d837599b8f10acc8c51bccae247cd9feb] Merge pull request #7530 from charris/backport-7529

pip install git+file:///u/bmbelgod/numpy
pip uninstall numpy
git bisect good
5006616659ebcd9960f730d28255648a562d7308 is the first bad commit
commit 5006616659ebcd9960f730d28255648a562d7308
Author: Nikola Forró <nforro@redhat.com>
Date:   Wed Apr 6 11:03:30 2016 +0200

    BUG: Extend glibc complex trig functions blacklist to glibc < 2.18.

    The library complex trig functions are inaccurate also in glibc
    versions 2.16 and 2.17, so extend the blacklist.

    Closes #7517.

:040000 040000 5a48106cb844eaf3434cbcf60e0cee88d1959c7d b93454e3a5d853fdff72e66f1fc9b5c29161e847 M  numpy
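
The bisect log above halves the candidate range at each step. A minimal sketch of that search, assuming a simplified linear history in which every commit after the first bad one also fails (real `git bisect` works on a commit graph, and the `history` list and `is_bad` predicate here are hypothetical stand-ins for "check out, `pip install`, see whether the build fails"):

```python
def first_bad(commits, is_bad):
    """Return (first bad commit, number of tests run).

    Assumes commits are ordered oldest to newest, that the last one is
    bad, and that every commit after the first bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1  # hi always points at a known-bad commit
    steps = 0
    while lo < hi:
        mid = (lo + hi) // 2
        steps += 1
        if is_bad(commits[mid]):
            hi = mid          # the first bad commit is at mid or earlier
        else:
            lo = mid + 1      # the first bad commit is after mid
    return commits[hi], steps


# Hypothetical history: 50 commits after the known-good one, with commit
# number 37 introducing the failure.
history = list(range(1, 51))
culprit, steps = first_bad(history, lambda c: c >= 37)
print(culprit, steps)
```

The number of tests grows with the logarithm of the range, which is why roughly 50 candidate revisions need only a handful of install attempts.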

@charris
Member

charris commented Jul 23, 2016

It seems that some of the trig functions are not compiling. What we need is to get the compiler error message. You can try compiling the tarball with python setup.py build.

@charris
Member

charris commented Jul 23, 2016

Note that forcing the blacklist here doesn't lead to any problems for me. What version of glibc do you have?
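
For anyone following along, one way to answer the glibc question from Python is `platform.libc_ver()`. The sketch below also compares the result against the 2.18 limit named in the bisected commit message. The helper names are made up for illustration, and NumPy's own check lives in its C build configuration and is not reproduced here.

```python
import platform


def parse_version(text):
    """Turn a version string such as '2.17' into a tuple of ints."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def in_blacklisted_range(glibc_version, limit=(2, 18)):
    """True if the version is below the limit named in commit 5006616."""
    return parse_version(glibc_version) < limit


lib, version = platform.libc_ver()  # e.g. ('glibc', '2.17'); may be ('', '')
if lib == "glibc" and version:
    print(version, in_blacklisted_range(version))
else:
    print("not a glibc system, or version not detected")
```

Comparing tuples of integers rather than strings matters here: as strings, "2.9" would sort after "2.18".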

@njsmith
Member

njsmith commented Jul 23, 2016

Digging through the log on the mailing list, it looks like the actual compiler error is:

    numpy/core/src/npymath/npy_math_complex.c.src: In function ‘_real_part_reciprocall’:
    numpy/core/src/npymath/npy_math_complex.c.src:1616:25: error: storage size of ‘ux’ isn’t known
         union IEEEl2bitsrep ux, uy, us;
                             ^
    numpy/core/src/npymath/npy_math_complex.c.src:1616:29: error: storage size of ‘uy’ isn’t known
         union IEEEl2bitsrep ux, uy, us;
                                 ^
    numpy/core/src/npymath/npy_math_complex.c.src:1616:33: error: storage size of ‘us’ isn’t known
         union IEEEl2bitsrep ux, uy, us;
                                     ^
    numpy/core/src/npymath/npy_math_complex.c.src:1620:5: warning: implicit declaration of function ‘GET_LDOUBLE_EXP’ [-Wimplicit-function-declaration]
         ix = GET_LDOUBLE_EXP(ux);
         ^
    numpy/core/src/npymath/npy_math_complex.c.src:1633:5: warning: implicit declaration of function ‘SET_LDOUBLE_EXP’ [-Wimplicit-function-declaration]
         SET_LDOUBLE_EXP(us, 0x7fff - ix);
         ^
    numpy/core/src/npymath/npy_math_complex.c.src:1616:33: warning: unused variable ‘us’ [-Wunused-variable]
         union IEEEl2bitsrep ux, uy, us;
                                     ^
    numpy/core/src/npymath/npy_math_complex.c.src:1616:29: warning: unused variable ‘uy’ [-Wunused-variable]
         union IEEEl2bitsrep ux, uy, us;
                                 ^
    numpy/core/src/npymath/npy_math_complex.c.src:1616:25: warning: unused variable ‘ux’ [-Wunused-variable]
         union IEEEl2bitsrep ux, uy, us;
                             ^

and then later the same (?) thing again:

    numpy/core/src/npymath/npy_math_complex.c.src: In function ‘_real_part_reciprocall’:
    numpy/core/src/npymath/npy_math_complex.c.src:1616:25: error: storage size of ‘ux’ isn’t known
         union IEEEl2bitsrep ux, uy, us;
                             ^
    numpy/core/src/npymath/npy_math_complex.c.src:1616:29: error: storage size of ‘uy’ isn’t known
         union IEEEl2bitsrep ux, uy, us;
                                 ^
    numpy/core/src/npymath/npy_math_complex.c.src:1616:33: error: storage size of ‘us’ isn’t known
         union IEEEl2bitsrep ux, uy, us;
                                     ^
    numpy/core/src/npymath/npy_math_complex.c.src:1620:5: warning: implicit declaration of function ‘GET_LDOUBLE_EXP’ [-Wimplicit-function-declaration]
         ix = GET_LDOUBLE_EXP(ux);
         ^
    numpy/core/src/npymath/npy_math_complex.c.src:1633:5: warning: implicit declaration of function ‘SET_LDOUBLE_EXP’ [-Wimplicit-function-declaration]
         SET_LDOUBLE_EXP(us, 0x7fff - ix);
         ^
    numpy/core/src/npymath/npy_math_complex.c.src:1616:33: warning: unused variable ‘us’ [-Wunused-variable]
         union IEEEl2bitsrep ux, uy, us;
                                     ^
    numpy/core/src/npymath/npy_math_complex.c.src:1616:29: warning: unused variable ‘uy’ [-Wunused-variable]
         union IEEEl2bitsrep ux, uy, us;
                                 ^
    numpy/core/src/npymath/npy_math_complex.c.src:1616:25: warning: unused variable ‘ux’ [-Wunused-variable]
         union IEEEl2bitsrep ux, uy, us;
                             ^

...

    error: Command "gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -Inumpy/core/include -Ibuild/src.linux-ppc64le-2.7/numpy/core/include/numpy -Inumpy/core/src/private -Inumpy/core/src -Inumpy/core -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/src/npysort -I/opt/share/Python-2.7.12/ppc64le/include/python2.7 -Ibuild/src.linux-ppc64le-2.7/numpy/core/src/private -Ibuild/src.linux-ppc64le-2.7/numpy/core/src/private -Ibuild/src.linux-ppc64le-2.7/numpy/core/src/private -c build/src.linux-ppc64le-2.7/numpy/core/src/npymath/npy_math_complex.c -o build/temp.linux-ppc64le-2.7/build/src.linux-ppc64le-2.7/numpy/core/src/npymath/npy_math_complex.o" failed with exit status 1

@njsmith
Member

njsmith commented Jul 23, 2016

I think this indicates something going wrong with power8 long doubles. (I guess these are double-doubles, like powerpc?)

@charris
Member

charris commented Jul 23, 2016

Long doubles, just as I suspected ;) The macros are defined in npy_math_private.h. Power PC was HAVE_LDOUBLE_DOUBLE_DOUBLE_BE or HAVE_LDOUBLE_DOUBLE_DOUBLE_LE. We probably don't support that in our fallback routines or with the macros as it isn't IEEE. The easiest solution might be to special case the blacklist of Power PC.

@bbelgodere could you grep the build directory for DOUBLE_DOUBLE_DOUBLE? Do you know if those machines still use the double_double format for long doubles?
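
Where `grep` is not convenient, the same check can be sketched in Python. This assumes the generated header is named `config.h` somewhere under the build directory; `find_ldouble_macros` is a hypothetical helper, not part of NumPy, and the block targets Python 3.

```python
import os
import re

# Matches lines such as "#define HAVE_LDOUBLE_DOUBLE_DOUBLE_LE 1".
PATTERN = re.compile(r"^\s*#\s*define\s+(HAVE_LDOUBLE_\w+)\b")


def find_ldouble_macros(build_dir):
    """Yield (path, line_number, macro) for HAVE_LDOUBLE_* defines in config.h files."""
    for root, _dirs, files in os.walk(build_dir):
        for name in files:
            if name != "config.h":
                continue
            path = os.path.join(root, name)
            with open(path, errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    match = PATTERN.match(line)
                    if match:
                        yield path, number, match.group(1)


if __name__ == "__main__":
    # "build" is the directory created by python setup.py build.
    for path, number, macro in find_ldouble_macros("build"):
        print("%s:%d: %s" % (path, number, macro))
```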

@njsmith
Member

njsmith commented Jul 23, 2016

> The easiest solution might be to special case the blacklist of Power PC.

Or just disable long double entirely on that platform...

@bbelgodere
Author

bbelgodere commented Jul 26, 2016

@charris and @rgommers

Output of the requested grep of the build directory for DOUBLE_DOUBLE_DOUBLE:

(numpy_test) [bmbelgod@dccpc214 build]$ grep -nRHI "DOUBLE_DOUBLE_DOUBLE"
src.linux-ppc64le-2.7/numpy/core/src/npymath/ieee754.c:146:#if defined(HAVE_LDOUBLE_DOUBLE_DOUBLE_BE) || \
src.linux-ppc64le-2.7/numpy/core/src/npymath/ieee754.c:147:    defined(HAVE_LDOUBLE_DOUBLE_DOUBLE_LE)
src.linux-ppc64le-2.7/numpy/core/src/npymath/npy_math_complex.c:1608:#ifndef HAVE_LDOUBLE_DOUBLE_DOUBLE_BE
src.linux-ppc64le-2.7/numpy/core/src/npymath/npy_math_complex.c:3294:#ifndef HAVE_LDOUBLE_DOUBLE_DOUBLE_BE
src.linux-ppc64le-2.7/numpy/core/src/npymath/npy_math_complex.c:4980:#ifndef HAVE_LDOUBLE_DOUBLE_DOUBLE_BE
src.linux-ppc64le-2.7/numpy/core/include/numpy/config.h:189:#define HAVE_LDOUBLE_DOUBLE_DOUBLE_LE 1

Output of python setup.py build:

https://gist.github.com/bbelgodere/23db858f8d9df293b8c06da2812afcb3

@bbelgodere
Author

@charris and @rgommers Is there anything I can help test out?

@charris
Member

charris commented Aug 11, 2016

@bbelgodere Hmm, I notice in npy_math_complex.c.src line 1608 that we have

#ifndef HAVE_LDOUBLE_DOUBLE_DOUBLE_BE

Try changing that to

#if !defined(HAVE_LDOUBLE_DOUBLE_DOUBLE_BE) && \
    !defined(HAVE_LDOUBLE_DOUBLE_DOUBLE_LE)

so that LE PPC gets checked as well.

If that doesn't work, the easiest thing might be to make the blacklist entries platform dependent.
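
The effect of the two guards can be modelled as a small truth table. This is only an illustrative sketch in Python, not NumPy code: the function names are invented, and each set stands for the long double macros that a platform's `config.h` defines.

```python
def uses_ieee_path_before(defined):
    """Original guard: #ifndef HAVE_LDOUBLE_DOUBLE_DOUBLE_BE."""
    return "HAVE_LDOUBLE_DOUBLE_DOUBLE_BE" not in defined


def uses_ieee_path_after(defined):
    """Proposed guard: neither the BE nor the LE double-double macro is defined."""
    return ("HAVE_LDOUBLE_DOUBLE_DOUBLE_BE" not in defined
            and "HAVE_LDOUBLE_DOUBLE_DOUBLE_LE" not in defined)


configs = {
    "PPC, big endian": {"HAVE_LDOUBLE_DOUBLE_DOUBLE_BE"},
    "PPC, little endian (this report)": {"HAVE_LDOUBLE_DOUBLE_DOUBLE_LE"},
    "platform defining neither macro": set(),
}
for name, defined in configs.items():
    print(name, uses_ieee_path_before(defined), uses_ieee_path_after(defined))
```

Only the little-endian row changes between the two guards, which matches the grep output earlier in the thread showing `HAVE_LDOUBLE_DOUBLE_DOUBLE_LE` defined on the failing machine.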

charris added a commit to charris/numpy that referenced this issue Aug 11, 2016
The `_real_part_reciprocal` function is coded in two ways, one depending
on functions specific to IEEE floating point and the other using generic
code that should always work.  Because PPC long double is not IEEE the
generic version should always be chosen for that architecture, but that
is currently only done when the PPC is configured as big endian.  This
PR makes sure that the generic version is also chosen when the PPC is
configured as little endian.

Closes numpy#7836.
@bbelgodere
Author

bbelgodere commented Aug 15, 2016

@charris Yes, it compiled successfully with that change (eefc1b5). After installing, I ran numpy.test('full') and got two failures.

Python 2.7.12 (default, Jul 14 2016, 11:07:38)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
>>> numpy.test('full')
Running unit tests for numpy
NumPy version 1.11.1
NumPy relaxed strides checking option: False
NumPy is installed in /opt/share/Python-2.7.12/ppc64le/lib/python2.7/site-packages/numpy-1.11.1-py2.7-linux-ppc64le.egg/numpy
Python version 2.7.12 (default, Jul 14 2016, 11:07:38) [GCC 4.8.5 20150623 (Red Hat 4.8.5-4)]
nose version 1.3.7

======================================================================
FAIL: test_longdouble.test_repr_roundtrip
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/opt/share/Python-2.7.12/ppc64le/lib/python2.7/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/opt/share/Python-2.7.12/ppc64le/lib/python2.7/site-packages/numpy-1.11.1-py2.7-linux-ppc64le.egg/numpy/core/tests/test_longdouble.py", line 35, in test_repr_roundtrip
    "repr was %s" % repr(o))
  File "/opt/share/Python-2.7.12/ppc64le/lib/python2.7/site-packages/numpy-1.11.1-py2.7-linux-ppc64le.egg/numpy/testing/utils.py", line 379, in assert_equal
    raise AssertionError(msg)
AssertionError:
Items are not equal: repr was 1.0
 ACTUAL: 1.0
 DESIRED: 1.0

======================================================================
FAIL: test_kind.TestKind.test_all
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/opt/share/Python-2.7.12/ppc64le/lib/python2.7/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/opt/share/Python-2.7.12/ppc64le/lib/python2.7/site-packages/numpy-1.11.1-py2.7-linux-ppc64le.egg/numpy/f2py/tests/test_kind.py", line 33, in test_all
    (i, selected_real_kind(i), selectedrealkind(i)))
  File "/opt/share/Python-2.7.12/ppc64le/lib/python2.7/site-packages/numpy-1.11.1-py2.7-linux-ppc64le.egg/numpy/testing/utils.py", line 75, in assert_
    raise AssertionError(smsg)
AssertionError: selectedrealkind(16): expected 10 but got 16

----------------------------------------------------------------------
Ran 6334 tests in 163.192s

FAILED (KNOWNFAIL=15, SKIP=1, failures=2)
<nose.result.TextTestResult run=6334 errors=0 failures=2>

@charris
Member

charris commented Aug 15, 2016

Thanks for checking. The first fail comes from

    o = 1 + np.finfo(np.longdouble).eps
    assert_equal(np.longdouble(repr(o)), o,
                 "repr was %s" % repr(o))

I suspect np.finfo(np.longdouble).eps is busted for PPC, and the failure is probably safe to ignore here. It might cause problems where eps is needed, and we should probably fix it some day. I think the way to go here would be to hardwire support for IEEE floating point types and PPC long double. Strictly speaking, we only support IEEE, but it is probably worth trying to do better for the PPC. The current finfo routine is too general while not being quite general enough, sort of the worst of all worlds :)
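
As background on why eps is awkward for this format: a double-double long double stores a value as the sum of two ordinary doubles. The sketch below models such a pair with exact rational arithmetic. It is a conceptual illustration only, `dd_value` is a made-up helper, and it does not reproduce how NumPy's finfo probes the type.

```python
import sys
from fractions import Fraction


def dd_value(hi, lo):
    """Exact value of a double-double pair: the unrounded sum hi + lo."""
    return Fraction(hi) + Fraction(lo)


double_eps = sys.float_info.epsilon      # 2**-52 for IEEE double
tiny = 2.0 ** -200                       # far below double_eps

# In plain double arithmetic the small term is rounded away.
print(1.0 + tiny == 1.0)

# Stored as a (hi, lo) pair, the same quantity is distinct from 1.
print(dd_value(1.0, tiny) > 1)

# So the gap between 1 and the next representable pair is not tied to a
# fixed mantissa width the way it is for an IEEE format.
print(dd_value(1.0, tiny) - 1 < Fraction(double_eps))
```

A routine that searches for the smallest eps with 1 + eps != 1 can therefore land on a value that says little about the precision available for general arithmetic.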

The second failure is an annoying artifact of f2py that turns up now and then. I'm really not sure what to do about it. I've been told that selectedrealkind matters, but apart from the test I have heard no complaints.

@charris charris closed this as completed Aug 15, 2016
@charris charris reopened this Aug 15, 2016
charris added a commit to charris/numpy that referenced this issue Aug 15, 2016
@charris
Member

charris commented Aug 15, 2016

The failing test also suggests that repr(long double) does not have sufficient precision for PPC. That can probably be fixed by special casing PPC, but I'm not sure how many digits we should be using in that case.
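
The property that test checks can be shown with an ordinary Python float (an IEEE double), for which the round trip does hold. The sketch uses `%.12g` as an arbitrary example of a format with too few digits; it says nothing about how many digits a PPC long double would need.

```python
import sys

o = 1 + sys.float_info.epsilon

# Python's float repr emits enough digits to round-trip a double.
print(float(repr(o)) == o)

# With too few significant digits the text no longer identifies o.
short = "%.12g" % o
print(short, float(short) == o)
```

This mirrors the failure message above, where ACTUAL and DESIRED both print as 1.0 even though the values differ.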

@charris
Member

charris commented Aug 15, 2016

Looks like future GNU compilers will use IEEE quad precision, see https://gcc.gnu.org/wiki/Ieee128PowerPC, but that will be long after RHEL 7.2.
