Add `isfinite` support for `datetime64` and `timedelta64` #5610

gerritholl · 2015-02-26T18:49:43Z

numpy.isfinite should support datetime64 and timedelta64 objects. It should return True for any ordinary value, and False for datetime64('NaT') and timedelta64('NaT').

As a side-effect, this will make isclose and allclose usable for datetime64 and timedelta64 objects.

In [97]: numpy.isfinite(numpy.datetime64(10**9, 's'))
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-97-8555b18406e5> in <module>()
----> 1 numpy.isfinite(numpy.datetime64(10**9, 's'))

TypeError: ufunc 'isfinite' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

The text was updated successfully, but these errors were encountered:

shoyer · 2015-10-14T03:33:47Z

I made some progress on this today: cc99cdd /
https://github.com/shoyer/numpy/tree/datetime64-isfinite

It works for unit-less datetime64/timedelta64, but fails for dates or times with units attached:

In [1]: import numpy as np

In [2]: np.isfinite(np.timedelta64(10))
Out[2]: True

In [3]: np.isfinite(np.timedelta64('NaT'))
Out[3]: False

In [4]: np.isfinite(np.timedelta64('NaT', 'us'))
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-4-3998f88199cc> in <module>()
----> 1 np.isfinite(np.timedelta64('NaT', 'us'))

TypeError: ufunc 'isfinite' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

It appears that this needs something with the ufunc dtype resolution to handle datetimes with units, but I'm not quite sure what that is. Somehow PyUFunc_SimpleBinaryComparisonTypeResolver and PyUFunc_SimpleUnaryOperationTypeResolver can handle datetime64 with units but the default PyUFunc_DefaultTypeResolver cannot.

seberg · 2016-01-21T22:04:15Z

@shoyer I was wondering whether you got somewhere with this? I fear that if we change the NaT comparison handling, we probably cannot avoid implementing isnan and isfinite for it.
Otherwise even our own tests should not like them (though of course we could work around that).

EDIT: See also gh-5222

shoyer · 2016-01-22T00:03:54Z

@seberg nope, I got stuck on ufunc type resolution (in the linked branch). I do agree this would be useful, though I usually find checking for x != x is a good way to identify NaN/NaT even without a dedicated function.

rabernat · 2016-06-11T15:07:51Z

I just hit this bug. I hope it can be fixed.

ssanderson · 2016-07-20T04:33:11Z

I took a look at @shoyer's branch at cc99cdd. The issue seems to be that the default type resolver ends up invoking linear_search_type_resolver, which essentially executes something like:

for type in ufunc.registered_types:
    if ufunc_loop_matches(input_data, type, ...):
       <use type>

(the actual code here is https://github.com/numpy/numpy/blob/master/numpy/core/src/umath/ufunc_type_resolution.c#L1867).

For at least the scalar case, ufunc_loop_matches essentially boils down to a call to PyArray_CanCastTypeTo: https://github.com/numpy/numpy/blob/master/numpy/core/src/umath/ufunc_type_resolution.c#L1529-L1535 with a casting rule of 'safe'.

The issue ends up being that the input dtype is datetime64[<unit>] or timedelta64[<unit>], but the type registered for the ufunc is just vanilla datetime64 or timedelta64. Since casting from a datetime with a unit to one without a unit isn't deemed, safe, we decide that we don't have a ufunc registered for the input data. (This is why the dispatch works correctly on unit-less datetimes/timedeltas.).

I think the easiest fix here is probably to just add a new ufunc type resolver for isfinite. It doesn't seem worthwhile to try to shoehorn datetime/timedelta-handling logic into the default resolver, since almost all ufuncs that work with datetimes/timedeltas seem to need their own custom behavior to deal with unit conversion (there are already custom resolvers used for add, subtract and multiply for example.

seberg · 2016-07-20T10:58:29Z

@ssanderson would it also work to register the datetime types as types to np.isfinite and np.isnan (assuming we are OK using those names)? Honestly, have not tried to understand the issues yet really.

If you ahve some time, we all always appreciate pull requests!

ssanderson · 2016-07-20T17:28:44Z

@seberg I don't think it's possible to register ufuncs for the specific unit. Looking at the signature for PyUFunc_FromFuncAndData, it looks like you can only provide type numbers, but the units are metadata attached to the type number.

ssanderson · 2016-07-20T17:29:14Z

I made some progress on a custom resolver last night; I'll see if I can tie off that work tonight.

shoyer · 2016-07-21T00:55:07Z

It's possibly worth figuring out what path in ufunc_type_resolution isnan/isfinite normally take (for float data) and why that doesn't work for datetime64.

ssanderson · 2016-07-21T01:08:37Z

@shoyer I think my long post above outlines the path they take.

The root issue is that part of (at least the generic case) of ufunc resolution is checking whether the input dtype can be safely cast to a dtype for which there's a registered ufunc for the function being dispatched. ufuncs that work on datetime/timedelta are always just registered with a generic dtype, and it's not considered safe to cast from datetime64[<unit>] to generic datetime64 (which seems correct in general?) which means that, as far as I can tell, every ufunc that works on datetime/timedelta needs a custom resolver to work correctly.

The only exception seems to be ufuncs dispatched with PyUFunc_SimpleBinaryComparisonTypeResolver, which directly invokes PyArray_ResultType to figure out how to coerce the input, and then always yields NPY_BOOL for the output. Something similar, e.g. a PyUFunc_SimpleUnaryPredicateTypeResolver for functions that always yield NPY_BOOL would probably work for isfinite, and isnan if we wanted to make that work for datetime (I'd personally be 👎 on that: I'd rather have a generic isnull than have isnan work on non-floating values).

ssanderson · 2016-07-21T02:08:53Z

@shoyer specifically, the reason we end up trying to check coercions between a generic datetime and a unit-having datetime is that we get our target type in ufunc_loop_matches by calling PyArray_DescrFromType on the typenum being checked. The dtype created there has no unit set, so it behaves like a generic datetime when we try to check coercions against unit-aware inputs later on in the function.

So now I'm wondering if maybe the right fix here is to check if we're resolving a datetime/timedelta and if so copy the metadata from our input to the temporary dtype used for the casting check...

ssanderson · 2016-07-21T03:49:56Z

Here's an attempt that gets the isfinite tests working for datetime and friends: #7856.

I'm not totally sure whether the short-circuit on equal typenums I added is safe. The alternative, I think, is to figure out how to correctly populate the temp dtype's metadata by copying it from the input array. As far as I can tell, that requires a new publicly-visible function in numpy_api.py. At the very least, I couldn't figure out how to call a private function defined in multiarray from a function defined in umath.

pwolfram · 2017-11-17T21:24:40Z

@ssanderson, do we have any movement on this issue? It would be great if we could get this in sometime soon because it is affecting our workflow (MPAS-Dev/MPAS-Analysis#269). Can we get some idea of what is needed to resolve this for the short-term? For instance, #7856 has gone stale-- what help (if any) is needed to revitalize the PR? Or, is an alternative approach needed?

seberg · 2017-11-17T21:42:17Z

By now we have np.isnat, a PR is probably good. It might be that you can steal almost everything from there. Dunno if there is any discussion necessary about whether or not we should add this, but allowing it in isfinite seems OK.

starfleetjames · 2017-12-22T17:24:44Z

Bump. Ran into np.isnan failing on pd.datetimeindex, tried converting it to a datetime, pandas.Timestamp, and np.datetime64 but same issue.

willgdjones · 2018-03-14T12:08:35Z

Yep, just ran into this issue today too.

mhvk · 2018-03-14T13:37:19Z

I think what this requires is including the datetime and timedelta dtypes in the loops understood by isfinite, with the loops themselves essentially copied from isnat (but obviously with the opposite result). In principle, not difficult, though I labeled it as "intermediate" since anything to do with ufuncs is non-trivial.

madphysicist · 2018-03-27T19:56:10Z

@mhvk I've decided to try to solve this issue. It's high time I learned about ufuncs and the C API and I would like to use this as a stepping stone to make it possible to histogram datetimes.

mhvk · 2018-03-27T20:00:18Z

Great!

mattip · 2018-09-04T06:37:39Z

There is some discusssion around the idea that ufunc inner loops should have access to the operand dtypes. It is not clear how to do this yet in a backward-compatible fashion, perhaps extending the signature of PyUFuncGenericFunction to void loopfunc(char** args, npy_intp* dimensions, npy_intp* steps, void* data, PyArray_Descr * dtypes)? Then datetime64 loops could acces the units. Theoretically, we could repurpose the rarely-used data argument, but that would break code like PyUFunc_e_e and friends, which are used in downstream projects.

Edit: add link to PyUFuncGenericFunction

seberg · 2018-09-04T07:07:19Z

They need access to it, but I wonder if they don't need a hook already before (during/after the dtype resolution step) anyway. Also, it might be nice to try to think ahead and see that such a new hook is at a point where it could in principle modify a flexible output type to be correct. What I mean is, we have dtype resolution step (and specializations for ufuncs). After a specific loop is decided (eg. 's3s5->s'), that loop could get a hook to decide on things (or throw errors), so make it 's3s5->s8' (allowing output allocation to work). Then that function could also set a data void* (maybe a new one) to whatever it signal whatever its specific inner loop needs.

mattip · 2018-09-04T09:25:46Z

@seberg

Then that function could also set a data void*

Do you mean the dtype resolution function? Yes, I think I see where you are going with that, the resolution function could set the return dtype and also set the data pointer to a specific function as determined from the operand dtypes, this specific function would be called in the inner loop like PyUFunc_e_e and friends, as in the end of this example but not hard-coding the data pointer.

Error handling: the specific function could set a PyErr exception, and abort the inner loop function and then we could check after the loop execution PyErr_Occurred to raise the exception in the ufunc machinery. Note there is no return value from the inner loop functions, so there is no direct way to know that a loop erred out. This does in some way break the ufunc design that inner loops are not python-capi-aware, but I don't know if that was an explicit design goal.

seberg · 2018-09-04T09:57:33Z

Yes, for the most part. Except I do not know why we would need to pass in anything to the inner loop (except assuming the inner loop needs it, such as the string length. Say we have a units dtype, then they can already throw the error during/after/as part of the type resolution step. I do not think it needs to pass a function to the inner loop. At least not normally. (Of course some stranger things might pass in functions for functionality) Now the question is also, if we need an "after" hook in some sense, but probably not really. As an example: `unit + unit` goes into the type resolvers and it says finds it is the: "unit,unit->unit" registered loop, not it will call: unit_finalize_dtype_for_add(*actual_out_dtype_if_it_is_flexible, *actual_loop) That tests if the units make sense (and throws an error otherwise), and sets the output dtype to have the correct unit, etc. It then just assigns the normal `d,d->d` loop (or actually, it does not and that is already fixed during registration). But the thing is: the unit dtype should be able to reuse the double inner loops.

mhvk · 2018-09-04T14:39:27Z

My sense is that there is no real need to adjust the inner loops themselves for dealing with types - there is already the general purpose *data (I've used it for a prototype chained ufunc); so far, the only real weakness I have found with the whole ufunc mechanism is the inability to assign output arrays, but I think all that requires is moving the hooks for type resolution up earlier in the code, and allowing code to do more in that hook (using the type resolver also seems to have been the conclusion in this thread).

One part where a standard for passing to inner loops might be more useful is for where - working via a write-back buffer prevents meaningful use of where in reductions.

(It does seem we're a bit far off track of the actual issue here! It would seem useful to collect cases where the current ufunc structure doesn't work so well in a wiki; for what it is worth, for units I think chained ufuncs are the way to go, i.e., one calls an upper inner loop which calls more standard inner loops below).

concretevitamin · 2019-03-05T08:48:06Z

Just got hit by this bug. My use case is simple: np.histogram(ndarray_of_datetime64). Without being able to do this, I basically have no reason to use the datetime64 support. Please fix!

eric-wieser · 2019-03-05T09:23:40Z

I think #12988 might address this.

eric-wieser · 2019-03-05T09:38:34Z

@concretevitamin: np.histogram will work fine as long as you compute the bins for it yourself. There are more things blocking that than just nan-checking. linspace is another.

concretevitamin · 2019-03-05T23:37:15Z

@eric-wieser Sure - make ndarray of datetime64 function more like first-class citizen will be welcome!

This was referenced Feb 26, 2015

argmin/argmax, idxmin/idxmax and datetime64 pandas-dev/pandas#2982

Closed

[BUG] np.allclose does not work with np.timedelta64 #5630

Closed

ssanderson mentioned this issue Jul 21, 2016

Latest numpy and pandas quantopian/zipline#1339

Merged

kilianbreathnach mentioned this issue Nov 1, 2017

Pandas version-specific error: TypeError: ufunc 'isfinite' not supported for the input types facebook/prophet#343

Closed

xylar mentioned this issue Nov 17, 2017

error in indexNino34 on LANL IC MPAS-Dev/MPAS-Analysis#269

Closed

mhvk added 01 - Enhancement 23 - Wish List component: numpy._core defunct — difficulty: Intermediate labels Mar 14, 2018

titusjan mentioned this issue Mar 23, 2018

Possible to show date/time stamps on the X-Axis silx-kit/silx#1757

Closed

rgommers removed the 23 - Wish List label Aug 1, 2018

debsankha mentioned this issue Mar 30, 2019

ENH: isfinite support for datetime64 and timedelta64 #13218

Merged

charris closed this as completed in #13218 Apr 19, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add `isfinite` support for `datetime64` and `timedelta64` #5610

Add `isfinite` support for `datetime64` and `timedelta64` #5610

gerritholl commented Feb 26, 2015

shoyer commented Oct 14, 2015

seberg commented Jan 21, 2016

shoyer commented Jan 22, 2016

rabernat commented Jun 11, 2016

ssanderson commented Jul 20, 2016 •

edited

Loading

seberg commented Jul 20, 2016

ssanderson commented Jul 20, 2016

ssanderson commented Jul 20, 2016

shoyer commented Jul 21, 2016

ssanderson commented Jul 21, 2016 •

edited

Loading

ssanderson commented Jul 21, 2016

ssanderson commented Jul 21, 2016

pwolfram commented Nov 17, 2017

seberg commented Nov 17, 2017

starfleetjames commented Dec 22, 2017 •

edited

Loading

willgdjones commented Mar 14, 2018

mhvk commented Mar 14, 2018

madphysicist commented Mar 27, 2018

mhvk commented Mar 27, 2018

mattip commented Sep 4, 2018 •

edited

Loading

seberg commented Sep 4, 2018 via email

mattip commented Sep 4, 2018

seberg commented Sep 4, 2018 via email

mhvk commented Sep 4, 2018

concretevitamin commented Mar 5, 2019

eric-wieser commented Mar 5, 2019

eric-wieser commented Mar 5, 2019 •

edited

Loading

concretevitamin commented Mar 5, 2019

Add isfinite support for datetime64 and timedelta64 #5610

Add isfinite support for datetime64 and timedelta64 #5610

Comments

gerritholl commented Feb 26, 2015

shoyer commented Oct 14, 2015

seberg commented Jan 21, 2016

shoyer commented Jan 22, 2016

rabernat commented Jun 11, 2016

ssanderson commented Jul 20, 2016 • edited Loading

seberg commented Jul 20, 2016

ssanderson commented Jul 20, 2016

ssanderson commented Jul 20, 2016

shoyer commented Jul 21, 2016

ssanderson commented Jul 21, 2016 • edited Loading

ssanderson commented Jul 21, 2016

ssanderson commented Jul 21, 2016

pwolfram commented Nov 17, 2017

seberg commented Nov 17, 2017

starfleetjames commented Dec 22, 2017 • edited Loading

willgdjones commented Mar 14, 2018

mhvk commented Mar 14, 2018

madphysicist commented Mar 27, 2018

mhvk commented Mar 27, 2018

mattip commented Sep 4, 2018 • edited Loading

seberg commented Sep 4, 2018 via email

mattip commented Sep 4, 2018

seberg commented Sep 4, 2018 via email

mhvk commented Sep 4, 2018

concretevitamin commented Mar 5, 2019

eric-wieser commented Mar 5, 2019

eric-wieser commented Mar 5, 2019 • edited Loading

concretevitamin commented Mar 5, 2019

Add `isfinite` support for `datetime64` and `timedelta64` #5610

Add `isfinite` support for `datetime64` and `timedelta64` #5610

ssanderson commented Jul 20, 2016 •

edited

Loading

ssanderson commented Jul 21, 2016 •

edited

Loading

starfleetjames commented Dec 22, 2017 •

edited

Loading

mattip commented Sep 4, 2018 •

edited

Loading

eric-wieser commented Mar 5, 2019 •

edited

Loading