math functions fail confusingly on long integers (and object arrays generally) #368

Open
@njsmith

If you pass a 'long' integer greater than sys.maxint to np.sqrt, then it blows up with a confusing error message:

>>> np.__version__
'1.8.0.dev-1234d1c'
>>> np.sqrt(sys.maxint)
3037000499.9760499
>>> np.sqrt(sys.maxint + 1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: sqrt

This was noticed because it breaks SciPy's kendalltau function: http://mail.scipy.org/pipermail/scipy-user/2012-July/032652.html

It was hard to diagnose because the error message is so misleading.

It looks like what's going on is that ndarray conversion on very large 'long' objects gives up and simply returns an object array. So... that seems reasonable, not sure what else we could do:

>>> np.asarray(sys.maxint + 1)
array(9223372036854775808L, dtype=object)

And then when called on an object array, np.sqrt does its weird ad-hoc fallback thing, and tries calling a .sqrt() method on each object. Which of course doesn't exist, since it's just something we made up. (But this is where the confusing error message comes from.) In fact, our handling of object arrays is pretty broken all around -- we can't even take the square root of float objects:

>>> np.sqrt(np.array(1.0, dtype=object))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: sqrt
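
For illustration, the method-lookup fallback can be emulated in pure Python. This is only a sketch of the behaviour described above, not NumPy's actual C implementation; `method_fallback` is a hypothetical name, and the snippet is written for Python 3 rather than the Python 2 used in the sessions here.

```python
from decimal import Decimal


def method_fallback(obj, name):
    """Emulate the ad-hoc fallback: look up a method called `name` and call it."""
    return getattr(obj, name)()


# Decimal happens to define .sqrt(), so the lookup succeeds.
print(method_fallback(Decimal(4), "sqrt"))

# float defines no such method, so the lookup raises AttributeError.
try:
    method_fallback(1.0, "sqrt")
except AttributeError as exc:
    print("AttributeError:", exc)
```

The only types that survive this kind of lookup are ones that happen to define a method with the matching name, which is why plain floats and ints fail.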

The math module versions of sqrt and friends accept long integers:

>>> math.sqrt(sys.maxint + 1)
3037000499.97605
>>> math.cos(sys.maxint + 1)
0.011800076512800236

Mostly this just works by calling PyFloat_AsDouble, which goes via the __float__ method if defined (as it is for longs). However, the math module does have special code for longs in some cases (log in 2.7, maybe more in future versions, who knows):

>>> math.sqrt(sys.maxint ** 100)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: long int too large to convert to float
>>> math.log(sys.maxint ** 100)
4366.827237527655
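
Both behaviours can be reproduced with the stdlib alone. The sketch below assumes Python 3, where `int` plays the role of `long`; `Half` is a hypothetical class that exists only to show the `__float__` route, and `10 ** 400` is just an arbitrary integer too large for a double.

```python
import math


class Half:
    """Hypothetical type that is convertible to float via __float__."""

    def __float__(self):
        return 0.5


# math.sqrt converts its argument to a C double, which calls __float__.
print(math.sqrt(Half()))

big = 10 ** 400  # an integer far beyond the range of a double

# Conversion to float overflows, so sqrt fails ...
try:
    math.sqrt(big)
except OverflowError as exc:
    print("OverflowError:", exc)

# ... while log has a dedicated integer path and succeeds.
print(math.log(big))
```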

So in conclusion: np.sqrt and friends, when operating on object arrays, should fall back to the stdlib math functions. (This would be in addition to the current .sqrt method fallback. I guess the current fallback should probably be tried first, since anyone defining our ad-hoc methods is presumably doing so specifically because they want numpy to respect that.)
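
A minimal per-element sketch of that proposal, in pure Python 3, might look like the following. `object_sqrt` is a hypothetical helper, not an existing NumPy function, and trying the object's own method before the stdlib is one reading of the suggestion above.

```python
import math
from decimal import Decimal


def object_sqrt(obj):
    """Hypothetical per-element fallback for object arrays."""
    method = getattr(obj, "sqrt", None)
    if callable(method):
        # Respect types that opt in by defining the ad-hoc method.
        return method()
    # Otherwise defer to the stdlib, which handles ints and floats.
    return math.sqrt(obj)


print(object_sqrt(2 ** 63))     # a large Python int
print(object_sqrt(1.0))         # a plain float object
print(object_sqrt(Decimal(4)))  # uses Decimal.sqrt()
```

With this ordering, types that define their own `sqrt` keep their own semantics (Decimal stays Decimal), while everything else gets whatever the math module supports, including its error for values too large to convert to float.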
