ENH: skip NPY_ALLOW_C_API for UFUNC_ERR_IGNORE #9985

ziyan · 2017-11-08T01:50:26Z

Acquiring the GIL is unnecessary when numpy floating point error handling is set to ignore.

Fixes an issue where numpy might deadlock when computing a**2 where a is tiny, e.g. -2.3693744349064819e-197:

#0  0x00007f6ff8c5b536 in do_futex_wait.constprop () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007f6ff8c5b5e4 in __new_sem_wait_slow.constprop.0 () from /lib/x86_64-linux-gnu/libpthread.so.0
#2  0x00007f6ff20b6768 in PyThread_acquire_lock (lock=0x55e085e26020, waitflag=<optimized out>) at ../Python/thread_pthread.h:324
#3  0x00007f6ff2028556 in PyEval_RestoreThread (tstate=0x7f6fe11adf40) at ../Python/ceval.c:359
#4  0x00007f6ff20e0b96 in PyGILState_Ensure () at ../Python/pystate.c:611
#5  0x00007f6fcb880dd6 in _error_handler (method=method@entry=0, errobj=errobj@entry=('double_scalars', None), errtype=errtype@entry=0x7f6fcb8d28c1 "underflow", retstatus=retstatus@entry=4, first=first@entry=0x7f6fef64c7b0) at numpy/core/src/umath/ufunc_object.c:119
#6  0x00007f6fcb8872ff in PyUFunc_handlefperr (errmask=521, errobj=('double_scalars', None), retstatus=retstatus@entry=4, first=first@entry=0x7f6fef64c7b0) at numpy/core/src/umath/ufunc_object.c:209
#7  0x00007f6fcb894f08 in double_power (a=<optimized out>, b=2, __NPY_UNUSED_TAGGEDc=<optimized out>) at numpy/core/src/umath/scalarmath.c.src:1168
#8  0x00007f6ff2083a17 in ternary_op.isra.5 (v=<optimized out>, w=<optimized out>, z=None, op_slot=48) at ../Objects/abstract.c:1065
#9  0x00007f6ff2029e0a in PyEval_EvalFrameEx

P.S. The stack trace comes from numpy 1.11, but the same problem seems to exist on master.
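The errmask=521 and method=0 values in frames #5 and #6 of the trace above can be decoded by hand. What follows is a minimal sketch, assuming a layout of three bits per error category and the constant values as recalled from numpy's ufuncobject.h (ignore=0, warn=1, shifts 0/3/6/9). The helpers err_mode, mode_name and print_errmask are hypothetical names used for illustration, not numpy functions.

```c
#include <stdio.h>

/* Assumed values, recalled from numpy's ufuncobject.h; verify before relying on them. */
enum { ERR_IGNORE = 0, ERR_WARN = 1, ERR_RAISE = 2,
       ERR_CALL = 3, ERR_PRINT = 4, ERR_LOG = 5 };
enum { SHIFT_DIVIDEBYZERO = 0, SHIFT_OVERFLOW = 3,
       SHIFT_UNDERFLOW = 6, SHIFT_INVALID = 9 };

/* Hypothetical helper: extract the 3-bit handling mode of one error category. */
int err_mode(int errmask, int shift)
{
    return (errmask >> shift) & 0x7;
}

/* Hypothetical helper: human-readable name of a handling mode. */
const char *mode_name(int mode)
{
    static const char *names[] = {"ignore", "warn", "raise",
                                  "call", "print", "log"};
    return (mode >= 0 && mode <= 5) ? names[mode] : "unknown";
}

/* Hypothetical helper: print the mode of each of the four categories. */
void print_errmask(int errmask)
{
    printf("divide:    %s\n", mode_name(err_mode(errmask, SHIFT_DIVIDEBYZERO)));
    printf("overflow:  %s\n", mode_name(err_mode(errmask, SHIFT_OVERFLOW)));
    printf("underflow: %s\n", mode_name(err_mode(errmask, SHIFT_UNDERFLOW)));
    printf("invalid:   %s\n", mode_name(err_mode(errmask, SHIFT_INVALID)));
}
```

Under those assumptions, err_mode(521, SHIFT_UNDERFLOW) is 0 (ignore) while the other three categories decode to warn. That is consistent with method=0 in frame #5: the handler went on to take the GIL even though the underflow was going to be ignored.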

njsmith · 2017-11-08T01:52:18Z

Is there a bug open for the deadlock? Can you give a reproducer?


ziyan · 2017-11-08T02:08:00Z

There is no bug open, because I have not found a way to reproduce the deadlock with a simple script. I will get back to you on that.

ziyan · 2017-11-08T03:15:00Z

I added a simple reproduction in this repository:
https://github.com/ziyan/numpy-wsgi-deadlock

You will need Docker to reproduce it. The setup is just an apache2 WSGI server running a simple Python script, hello.wsgi.

eric-wieser · 2017-11-08T04:33:31Z

Diff looks fine, but if there's a deadlock here, can't we still hit it in the other modes anyway? This doesn't seem to solve the underlying problem to me.
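The shape of the change, and the reason it cannot help in the other modes, can be sketched without the Python C API. This is a minimal illustration only: handle_fp_error, fake_gil_acquire, fake_gil_release and the gil_acquisitions counter are hypothetical stand-ins, and the real handler in numpy does considerably more than print a message.

```c
#include <stdio.h>

/* Assumed mode values (ignore=0, warn=1, raise=2), for illustration only. */
enum { ERR_IGNORE = 0, ERR_WARN = 1, ERR_RAISE = 2 };

/* Stand-in for the GIL: counts how often the handler tried to take it. */
int gil_acquisitions = 0;

void fake_gil_acquire(void) { gil_acquisitions++; }
void fake_gil_release(void) { }

/* Hypothetical handler: returns 0 normally, -1 if an error was "raised". */
int handle_fp_error(int mode, const char *errtype)
{
    int status = 0;

    if (mode == ERR_IGNORE) {
        /* Nothing to report, so no Python call is made and no GIL is needed. */
        return 0;
    }

    /* Every other mode has to call into Python and therefore takes the GIL. */
    fake_gil_acquire();
    if (mode == ERR_RAISE) {
        fprintf(stderr, "FloatingPointError: %s\n", errtype);
        status = -1;
    }
    else {
        fprintf(stderr, "RuntimeWarning: %s\n", errtype);
    }
    fake_gil_release();
    return status;
}
```

In this sketch only the ignore path skips the acquisition. A warn or raise mode still reaches fake_gil_acquire, so any environment in which taking the GIL deadlocks would still deadlock there.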

ziyan · 2017-11-08T04:48:36Z

You are right, it does not. I think the deadlock is an instance of gh-8559. The conclusion there is that WSGI under its default settings is not supported by numpy, because it uses multiple sub-interpreters. I tried WSGIApplicationGroup %{GLOBAL} and the problem seems to go away.

eric-wieser · 2017-11-08T04:51:14Z

Sounds pretty convincing to me. The patch looks uncontroversial, so I'll put it in. Thanks!

eric-wieser · 2017-11-08T04:53:10Z

Yeah, this sure looks like gh-5856 to me

ziyan · 2017-11-08T04:53:36Z

Thank you for the quick response!

twmr · 2017-12-01T22:57:37Z

Does this PR fix the deadlock in the following simple program?

//
// gcc pyinterptest.c -I/usr/include/python2.7 -Wall -lpython2.7 -o pyinterptest && ./pyinterptest
//
#include <Python.h>


void run_numpy_code(void){
    PyRun_SimpleString("import numpy as np");
    PyRun_SimpleString("print(3)");
    PyRun_SimpleString("print(np.array([1.23434e-312])**2)");
    /* PyRun_SimpleString("print(np.array([1.23434e-312])**1.)"); */
}


void foo1(void) {
    PyThreadState* mainThread = PyEval_SaveThread(); // release the GIL
    PyThreadState_New(mainThread->interp); // new thread state is created but never made current
    PyEval_RestoreThread(mainThread); // re-acquire the GIL as mainThread

    run_numpy_code(); // does not hang
}

void foo2(void) {
    PyThreadState* mainThread = PyEval_SaveThread(); // release the GIL
    PyThreadState* t1 = PyThreadState_New(mainThread->interp);
    PyEval_RestoreThread(t1); // acquire the GIL as t1 instead of mainThread

    run_numpy_code(); // hangs
}

void foo3(void) {
    PyThreadState* mainThread = PyEval_SaveThread(); // release the GIL
    PyThreadState* t1  = PyThreadState_New(mainThread->interp);
    PyEval_RestoreThread(t1); // acquire the GIL as t1
    PyEval_SaveThread(); // release the GIL again
    PyEval_RestoreThread(mainThread); // re-acquire the GIL as mainThread

    run_numpy_code(); // does not hang
}

int main(int argc, char *argv[]) {
    Py_Initialize();
    PyEval_InitThreads(); // acquire the GIL

    foo2(); // swap in foo1() or foo3() to see the non-hanging variants
    Py_Finalize();
    return 0;
}
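One way to see why only foo2 hangs is to model the two pieces of state involved: the thread state registered for the OS thread, and the thread state that currently holds the GIL. The sketch below is a toy model based on a reading of how PyGILState_Ensure appears to behave in the stack trace at the top of this thread (it falls through to PyEval_RestoreThread and blocks on the lock). That reading is an assumption, and every name here (FakeRuntime, fake_new_state, fake_ensure_would_deadlock, scenario_foo1 and so on) is hypothetical rather than part of the CPython API.

```c
#include <stddef.h>

typedef struct { int id; } FakeThreadState;

typedef struct {
    FakeThreadState *tls_state; /* state registered for this OS thread */
    FakeThreadState *current;   /* state holding the GIL, or NULL if released */
} FakeRuntime;

/* Assumed behaviour: a new state is registered only if the slot is empty. */
void fake_new_state(FakeRuntime *rt, FakeThreadState *ts)
{
    if (rt->tls_state == NULL) rt->tls_state = ts;
}

/* Release the GIL and hand back the state that held it. */
FakeThreadState *fake_save_thread(FakeRuntime *rt)
{
    FakeThreadState *old = rt->current;
    rt->current = NULL;
    return old;
}

/* Acquire the GIL on behalf of the given state. */
void fake_restore_thread(FakeRuntime *rt, FakeThreadState *ts)
{
    rt->current = ts;
}

/* Returns 1 if an "ensure" call would wait on a GIL this same thread holds. */
int fake_ensure_would_deadlock(const FakeRuntime *rt)
{
    if (rt->tls_state == rt->current) return 0; /* already current */
    return rt->current != NULL; /* held under a different state */
}

int scenario_foo1(void)
{
    FakeThreadState main_ts = {0}, t1 = {1};
    FakeRuntime rt = {&main_ts, &main_ts};
    FakeThreadState *saved = fake_save_thread(&rt);
    fake_new_state(&rt, &t1);
    fake_restore_thread(&rt, saved);
    return fake_ensure_would_deadlock(&rt);
}

int scenario_foo2(void)
{
    FakeThreadState main_ts = {0}, t1 = {1};
    FakeRuntime rt = {&main_ts, &main_ts};
    fake_save_thread(&rt);
    fake_new_state(&rt, &t1);
    fake_restore_thread(&rt, &t1);
    return fake_ensure_would_deadlock(&rt);
}

int scenario_foo3(void)
{
    FakeThreadState main_ts = {0}, t1 = {1};
    FakeRuntime rt = {&main_ts, &main_ts};
    FakeThreadState *saved = fake_save_thread(&rt);
    fake_new_state(&rt, &t1);
    fake_restore_thread(&rt, &t1);
    fake_save_thread(&rt);
    fake_restore_thread(&rt, saved);
    return fake_ensure_would_deadlock(&rt);
}
```

In this model only scenario_foo2 ends with the GIL held under a state other than the registered one, which matches the "hangs" and "does not hang" comments in the program above. It also suggests why skipping the acquisition in ignore mode makes that program run: the underflow no longer reaches the ensure call at all.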

twmr · 2017-12-02T07:55:54Z

I've just compiled numpy@master locally and can verify that the above program does not hang anymore. 👍

Do we want to add a stripped down version of the above program to the numpy unit tests? @eric-wieser

ENH skip NPY_ALLOW_C_API for UFUNC_ERR_IGNORE

bbf8011

GIL unnecessary when numpy floating point error handling is set to ignore.

eric-wieser merged commit 3d0c041 into numpy:master Nov 8, 2017

eric-wieser added 01 - Enhancement component: numpy._core Embedded Issues regarding embedded python interpreters labels Nov 8, 2017
