Skip to content

ENH: Added support for arrays with dtype=object to np.isinf, np.isnan, np.isfinite #10820

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 8 commits into from

Conversation

madphysicist
Copy link
Contributor

@madphysicist madphysicist commented Mar 29, 2018

The loops check if the object has a __float__ or __complex__ method, then handle the object as usual. Preference is given to __float__ in all cases. This is a preliminary step to make sure I understand the basics of how to add to a ufunc, so please grade harshly! The next step will be adding datetime and timedelta support to np.isfinite, in preparation for adding support to np.histogram.

Two questions arise:

  1. Why are the integer loops for isinite, isinf, isnan not optimized to always return True regardless of the input data?
    a. Am I correct to assume that the values are actually cast to floating point before running through the normal loop?
    b. Is there any desire to change this? Or conversely, any motivation not to?
  2. Why are isneginf and isposinf not ufuncs like isinf? There does not appear to be any major issue except a minor break in backwards compatibility where the second parameter would be renamed from y to out.

),
'isfinite':
Ufunc(1, 1, None,
docstrings.get('numpy.core.umath.isfinite'),
None,
TD(inexact, out='?'),
TD(noint, out='?'),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if I have multiple overlapping TD calls? e.g., if I hadn't deleted the line TD(inexact, out='?') here.

@@ -1287,6 +1287,71 @@ def test_nan():

assert_raises(TypeError, test_nan)


class _CustomFloat(object):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there is already an implementation of mocks like these somewhere, I would be happy to get rid of them.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use unittest.mock?

int v;

UNARY_LOOP {
PyObject *in1 = *(PyObject **)ip1;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am almost 100% sure that I don't need to Py_INCREF(in1) here or anywhere else, but I feel like someone should verify that since I do pass it to PyFloat_AsDouble and PyComplex_AsCComplex down the line.

if (cplx.real == -1.0 && cplx.imag == 0.0 && PyErr_Occurred()) {
PyErr_Format(PyExc_TypeError, "must be real or complex number, not %s",
Py_TYPE(in1)->tp_name);
break;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does a corresponding Py_DECREF(in1) need to go here? (see above)

else {
v = @func@(dbl) != 0;
}
*((npy_bool *)op1) = v;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does a corresponding Py_DECREF(in1) need to go here? (see above)

@eric-wieser
Copy link
Member

eric-wieser commented Mar 29, 2018

I think this would be a little cleaner via funcs.inc.src, in a similar way to how npy_ObjectLogicalNot is implemented. You should be able to just move most of your implementation to that file

@madphysicist
Copy link
Contributor Author

@eric-wieser. I have moved the implementation to funcs.inc.src. This includes changing the output dtype to np.object for cases when the input is np.object. I've also added comments to the docs for that.

I added support for integers (objects implementing __int__/__long__) as well. If you have a better idea about where to put the sub-function that checks for integer types, please let me know.

I've added a couple more comments to my code about concerns that I have with the correctness.

@@ -347,6 +347,12 @@ in the c-api_ documentation and the example in how-to-extend_.

.. _c-api: https://github.com/numpy/numpy/blob/master/doc/source/reference/c-api.array.rst
.. _how-to-extend: https://github.com/numpy/numpy/blob/master/doc/source/user/c-info.how-to-extend.rst
Support for ``dtype=object`` in ``isinf``, ``isnan``, ``isfinite``
Support for ``dtype=object`` in ``isnan``, ``isinf``, ``isfinite``
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rebase mistake

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@eric-wieser
Copy link
Member

Note this duplicates #6320 - you might want to compare to the approach used there

@@ -808,6 +808,7 @@ def english_upper(s):
docstrings.get('numpy.core.umath.isnan'),
None,
TD(inexact, out='?'),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if I have multiple overlapping TD calls? e.g., if this line read TD(noint, out='?'), making conflicting protoypes for dtype=object?

if(i1 == NULL) {
return NULL;
}
else {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am almost 100% sure that I don't need to Py_INCREF(i1) here or anywhere else, but I feel like someone should verify that since I do pass it to PyFloat_AsDouble, PyComplex_AsCComplex, etc down the line.

Just because all my tests pass does not mean that the refcounting is done correctly here.

@@ -226,6 +226,84 @@ npy_ObjectLCM(PyObject *i1, PyObject *i2)
return PyNumber_Absolute(tmp);
}

/* Utility used to check if a number is an integer or has an __int__/__long__ method */
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An alternative implementation is suggested by https://stackoverflow.com/a/49562820/2988730. However, using PyNumber_Long/PyNumber_Int will allow converstion of strings and objects with just __trunc__ but no __float__, which I don't think we want.

@madphysicist
Copy link
Contributor Author

@eric-wieser I agree that this should defer to #6320. I especially like the part where the result is always np.bool. This PR has two valuable things though: it calls __float__ and __complex__ as necessary, and it checks for integers. Basically, it treats python types more like you would expect numpy to treat them based on how it treats its own types.

I will leave this around for a bit longer to see if anyone can give me more pointers. In the meantime, I will go work on adding datetimes to np.isfinite and eventually to np.histogram.

@eric-wieser
Copy link
Member

In the meantime, I will go work on adding datetimes to np.isfinite and eventually to np.histogram.

Sounds good to me. Maybe add loops for the integer types too. I think there's a dead PR for that somewhere too.

@madphysicist
Copy link
Contributor Author

madphysicist commented Mar 29, 2018

@eric-wieser Let me know if you find it. That was one of the questions I had at the top. At least for the three functions that I have here, you don't even need a loop, just a boolean array of all ones. Is that something that ufuncs can easily support?

@eric-wieser
Copy link
Member

You just write a loop that ignores its inputs and writes ones to the output.

@eric-wieser
Copy link
Member

eric-wieser commented Feb 19, 2019

Let me know if you find it.

Still looking, but found a related one for bools: #12988

@mattip
Copy link
Member

mattip commented Nov 3, 2019

@madphysicist would you like to continue with this? The release note should be rewritten as a fragment in doc/release/upcoming_changes and the conflicts resolved to start to move forward.

@madphysicist
Copy link
Contributor Author

@mattip. Eventually. I'm going to have to re-familiarize myself with what I did here first.

Base automatically changed from master to main March 4, 2021 02:04
@seberg seberg added 54 - Needs decision 57 - Close? Issues which may be closable unless discussion continued labels Jul 8, 2021
@mattip
Copy link
Member

mattip commented Feb 7, 2024

Closing. This has been around for quite a while and has not moved very far. Please reopen if you want to pursue the idea.

@mattip mattip closed this Feb 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
01 - Enhancement 54 - Needs decision 57 - Close? Issues which may be closable unless discussion continued component: numpy._core
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants