-
-
Notifications
You must be signed in to change notification settings - Fork 10.8k
math functions fail confusingly on long integers (and object arrays generally) #368
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Perhaps it would make sense to raise a warning when an object array is created in this way, I suspect it is usually unintended and leads to these unexpected results. |
I'm not sure about raising a warning. There are a lot of cases where asarray will return OBJECT arrays as the default. Perhaps we could detect a failing "float" or "int" conversion Having object arrays look for attributes to implement the ufuncs has been a "feature" of Numeric from 1995. I think it would make sense to have the ufuncs which are also in the standard library to call the standard library function. One challenge is that you have cmath and math rather than a unified interface. I suppose if we always used cmath then we would get expected behavior for both real and complex "objects". |
If the fallback for sqrt is already implemented, why does I've been encountering this bug via scipy's |
Because the existing fallback is to look for a (The math versus cmath thing is a bit of a red herring, I think, since On Fri, Feb 15, 2013 at 12:57 PM, Dan Blanchard notifications@github.comwrote:
|
Does anyone have a clue where this fallback is implemented? I will gladly take a swing at fixing this, but I cannot for the life of me find where this happens. |
Don't really know it too well, but as a hint, that is what the Btw. does np.conjugate work for you on object arrays? Doesn't seem to me, but maybe I am just trying wrong right now. |
I believe I've enumerated all the functions that are broken for objects arrays. np.logical_not(x)
np.degrees(x)
np.rad2deg(x)
np.radians(x)
np.deg2rad(x)
np.arccos(x)
np.arccosh(x)
np.arcsin(x)
np.arcsinh(x)
np.arctan(x)
np.arctanh(x)
np.cos(x)
np.sin(x)
np.tan(x)
np.cosh(x)
np.sinh(x)
np.tanh(x)
np.exp(x)
np.exp2(x)
np.expm1(x)
np.log(x)
np.log2(x)
np.log10(x)
np.log1p(x)
np.sqrt(x)
np.ceil(x)
np.trunc(x)
np.fabs(x)
np.floor(x)
np.rint(x)
np.fmod(x, x)
np.logical_and(x, x)
np.logical_or(x, x)
np.logical_xor(x, x)
np.arctan2(x, x)
np.hypot(x, x) all fail with an A related issue---but I'm not sure if this one is as much a bug, since there's a somewhat sensible error message---is that all of these: np.isnan(x)
np.isinf(x)
np.isfinite(x)
np.signbit(x)
np.spacing(x)
np.modf(x)
np.logaddexp(x, x)
np.logaddexp2(x, x)
np.copysign(x, x)
np.nextafter(x, x) fail with a I believe the AttributeErrors are all being caused by lines 409 and 445 in loops.c.src, which both feature something like this: PyObject *ret = PyObject_CallMethod(in1 ? in1 : Py_None, meth, NULL); I think it would make the most sense to change them to something like this: PyObject *ret = NULL;
if (PyObject_HasAttrString(in1, meth)) {
ret = PyObject_CallMethod(in1 ? in1 : Py_None, meth, NULL);
} else {
// Call standard library math function if it exists, otherwise die
} I'm pretty new to the whole Python C API (although I'm very familiar with Python and C separately) and the Numpy source code, so I'm not sure how best to implement the else part of that conditional statement. Anyone have any ideas/suggestions? Is there a simple way to see if we have a function named |
@dan-blanchard I think the bug you are hitting in |
@bfroehle I actually have already upgraded scipy, but the fix for kendalltau is just a workaround to deal with this more general bug. I'd like to try to rectify the bug in general, because it is bizarre that you can't use any of the above math functions on an array of long integers (or objects). |
@dan-blanchard, You need to do exactly what you would do in python, load the math module, and call the appropriate function. An example of doing something similar from the ndarray.dot method is: static PyObject *
array_dot(PyArrayObject *self, PyObject *args, PyObject *kwds)
{
PyObject *fname, *ret, *b, *out = NULL;
static PyObject *numpycore = NULL;
char * kwords[] = {"b", "out", NULL };
if (!PyArg_ParseTupleAndKeywords(args, kwds, "O|O", kwords, &b, &out)) {
return NULL;
}
/* Since blas-dot is exposed only on the Python side, we need to grab it
* from there */
if (numpycore == NULL) {
numpycore = PyImport_ImportModule("numpy.core");
if (numpycore == NULL) {
return NULL;
}
}
fname = PyUString_FromString("dot");
if (out == NULL) {
ret = PyObject_CallMethodObjArgs(numpycore, fname, self, b, NULL);
}
ret = PyObject_CallMethodObjArgs(numpycore, fname, self, b, out, NULL);
Py_DECREF(fname);
return ret;
} As you can see the module is imported using PyImport_ImportModule and the reference is stored in a static variable so that it will be available for future calls. Then a PyObject containing the function name is constructed (using PyUString_FromString, a numpy macro that calls PyString_FromString on Python2 and PyUnicode_FromString on Python3.) and used to call the appropriate function, by calling PyObject_CallMethodObjArgs. In your case, you'll probably want to do the function look up yourself, and then call PyObject_CallFunctionObjArgs to save the look up overhead since you're looping. There are two wrinkles I see for doing this here: 1) not everything you might want is located in the math module, some are in operator, some have different names and some don't exist AFAIK, so you'll need a mapping. 2) Object arrays can be heterogeneous, what happens when an exception is raised for one object? I.E. a = np.array(['foo', 37L, 0.37], np.object)
np.sqrt(a) # what happens? |
Also, unless math vs. cmath is handled you'll leave a different confusing corner case than what we have now: import math, cmath
math.sqrt(47) # gives real
Out[2]: 6.855654600401044
cmath.sqrt(47) # gives complex
Out[3]: (6.855654600401044+0j)
math.sqrt(47 + 10j) # raises
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-4-ecff6b007476> in <module>()
----> 1 math.sqrt(47 + 10j) # raises
TypeError: can't convert complex to float
cmath.sqrt(47 + 10j) # gives complex
Out[5]: (6.893912354640718+0.7252775699467939j) Perhaps just calling the math version is the best solution, since otherwise we need to do type checking on each object. Although maybe such typechecking is necessary. |
@ewmoore Thank you so much for the tips. That should be more than enough for me to get a handle on this.
I think that in that case we would want to raise a TypeError (or maybe a ValueError since the expected type is just object here) and say something like "Element 0 in array does not support sqrt". The exact wording would be a little tricky, because we first check if the object has a method named sqrt, but then we call math.sqrt as a fallback.
I'm inclined to think that typechecking is the better solution here, but it's obviously slower. |
The speed issues seems irrelevant to me -- checking the type is not really One possible objection to type-checking is that regular python never In [3]: math.sqrt(-1) In [4]: cmath.sqrt(-1) OTOH, there's: In [9]: (-1) ** 0.5 In [10]: complex(-1) ** 0.5 And our goal here isn't to replicate math.sqrt or cmath.sqrt, but to In [11]: np.sqrt(-1) In [12]: np.sqrt(complex(-1)) So I guess I'm +1 on type-checking. On Thu, Mar 7, 2013 at 4:51 PM, Dan Blanchard notifications@github.comwrote:
|
Oh, and for errors, I think the logic we want is: import numbers (We may well want to literally implement the whole fallback lookup logic in On Fri, Mar 8, 2013 at 10:32 PM, Nathaniel Smith njs@pobox.com wrote:
|
The original issue reported was fixed by 6966bdc but the underlying issue is still present (at least as of version 1.11.0.dev0+2bc5cb9, which can be seen in the following snippet:
|
In my opinion it's still very confusing. If the number is bigger or smaller than 2^64 python always calls it an "int" but in the former case it fails and in the latter case it works. See below for an example
|
Hello everyone, I'm new to open source and I would like to contribute to resolving this issue. Not sure where to start but I'll read the thread thus far to catch myself up to speed. Please let me know how I can be of use. |
@jreyesco16 welcome. Take a look at the contributors page for some hints on how to get going. |
Hi all, @harrow I agree that (at least for a regular Numpy user like myself) For anyone encountering this problem now, a simple conversion avoids the error, e.g.
Would it be possible to have Numpy do the conversion internally or output a more comprehensive error message? @njsmith I am new here, would it be appropriate to open a new issue and close this one, since things have developed in the meantime? |
feat: Add vrecpe[q]_f64 and vrsqrte[q]_f64
Another confusing case, dealing with very large integers, which require using object arrays:
Specifying the
It's unclear whether the function doesn't support the input types (why?) or there is a bug related to the |
If you want a |
If you pass a 'long' integer greater than sys.maxint to np.sqrt, then it blows up with a confusing error message:
This was noticed because it breaks SciPy's kendalltau function: http://mail.scipy.org/pipermail/scipy-user/2012-July/032652.html
It was hard to diagnose because the error message is so misleading.
It looks like what's going on is that ndarray conversion on very large 'long' objects gives up and simply returns an object array. So... that seems reasonable, not sure what else we could do:
And then when called on an object array, np.sqrt does its weird ad-hoc fallback thing, and tries calling a .sqrt() method on each object. Which of course doesn't exist, since it's just something we made up. (But this is where the confusing error message comes from.) In fact, our handling of object arrays is pretty broken all around -- we can't even take the square root of float objects:
The math module versions of sqrt and friends accept long integers:
Mostly this just works by calling PyFloat_AsDouble, which goes via the float method if defined (as it is for longs). However, the math module does have special code for longs in some cases (log in 2.7, maybe more in future versions, who knows):
So in conclusion: np.sqrt and friends, when operating on object arrays, should fall back to the stdlib math functions. (This would be in addition to the current .sqrt method fallback. I guess the current fallback should probably be tried, since anyone defining our ad-hoc methods is presumably doing so specifically because they want numpy to respect that.)
The text was updated successfully, but these errors were encountered: