-
-
Notifications
You must be signed in to change notification settings - Fork 10.8k
Add isfinite
support for datetime64
and timedelta64
#5610
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
I made some progress on this today: cc99cdd / It works for unit-less datetime64/timedelta64, but fails for dates or times with units attached:
It appears that this needs something with the ufunc dtype resolution to handle datetimes with units, but I'm not quite sure what that is. Somehow |
@seberg nope, I got stuck on ufunc type resolution (in the linked branch). I do agree this would be useful, though I usually find checking for |
I just hit this bug. I hope it can be fixed. |
I took a look at @shoyer's branch at cc99cdd. The issue seems to be that the default type resolver ends up invoking
(the actual code here is https://github.com/numpy/numpy/blob/master/numpy/core/src/umath/ufunc_type_resolution.c#L1867). For at least the scalar case, The issue ends up being that the input dtype is I think the easiest fix here is probably to just add a new ufunc type resolver for |
@ssanderson would it also work to register the datetime types as types to If you ahve some time, we all always appreciate pull requests! |
@seberg I don't think it's possible to register ufuncs for the specific unit. Looking at the signature for |
I made some progress on a custom resolver last night; I'll see if I can tie off that work tonight. |
It's possibly worth figuring out what path in ufunc_type_resolution |
@shoyer I think my long post above outlines the path they take. The root issue is that part of (at least the generic case) of ufunc resolution is checking whether the input dtype can be safely cast to a dtype for which there's a registered ufunc for the function being dispatched. ufuncs that work on datetime/timedelta are always just registered with a generic dtype, and it's not considered safe to cast from The only exception seems to be ufuncs dispatched with |
@shoyer specifically, the reason we end up trying to check coercions between a generic datetime and a unit-having datetime is that we get our target type in So now I'm wondering if maybe the right fix here is to check if we're resolving a datetime/timedelta and if so copy the metadata from our input to the temporary dtype used for the casting check... |
Here's an attempt that gets the isfinite tests working for datetime and friends: #7856. I'm not totally sure whether the short-circuit on equal typenums I added is safe. The alternative, I think, is to figure out how to correctly populate the temp dtype's metadata by copying it from the input array. As far as I can tell, that requires a new publicly-visible function in numpy_api.py. At the very least, I couldn't figure out how to call a private function defined in |
@ssanderson, do we have any movement on this issue? It would be great if we could get this in sometime soon because it is affecting our workflow (MPAS-Dev/MPAS-Analysis#269). Can we get some idea of what is needed to resolve this for the short-term? For instance, #7856 has gone stale-- what help (if any) is needed to revitalize the PR? Or, is an alternative approach needed? |
By now we have |
Bump. Ran into np.isnan failing on pd.datetimeindex, tried converting it to a datetime, pandas.Timestamp, and np.datetime64 but same issue. |
Yep, just ran into this issue today too. |
I think what this requires is including the datetime and timedelta dtypes in the loops understood by |
@mhvk I've decided to try to solve this issue. It's high time I learned about ufuncs and the C API and I would like to use this as a stepping stone to make it possible to histogram datetimes. |
Great! |
There is some discusssion around the idea that ufunc inner loops should have access to the operand dtypes. It is not clear how to do this yet in a backward-compatible fashion, perhaps extending the signature of Edit: add link to |
They need access to it, but I wonder if they don't need a hook already
before (during/after the dtype resolution step) anyway.
Also, it might be nice to try to think ahead and see that such a new
hook is at a point where it could in principle modify a flexible output
type to be correct.
What I mean is, we have dtype resolution step (and specializations for
ufuncs). After a specific loop is decided (eg. 's3s5->s'), that loop
could get a hook to decide on things (or throw errors), so make it
's3s5->s8' (allowing output allocation to work).
Then that function could also set a data void* (maybe a new one) to
whatever it signal whatever its specific inner loop needs.
|
Do you mean the dtype resolution function? Yes, I think I see where you are going with that, the resolution function could set the return dtype and also set the Error handling: the specific function could set a |
Yes, for the most part. Except I do not know why we would need to pass
in anything to the inner loop (except assuming the inner loop needs it,
such as the string length.
Say we have a units dtype, then they can already throw the error
during/after/as part of the type resolution step. I do not think it
needs to pass a function to the inner loop. At least not normally.
(Of course some stranger things might pass in functions for
functionality)
Now the question is also, if we need an "after" hook in some sense, but
probably not really.
As an example:
`unit + unit` goes into the type resolvers and it says finds it is the:
"unit,unit->unit" registered loop, not it will call:
unit_finalize_dtype_for_add(*actual_out_dtype_if_it_is_flexible,
*actual_loop)
That tests if the units make sense (and throws an error otherwise), and
sets the output dtype to have the correct unit, etc.
It then just assigns the normal `d,d->d` loop (or actually, it does not
and that is already fixed during registration).
But the thing is: the unit dtype should be able to reuse the double
inner loops.
|
My sense is that there is no real need to adjust the inner loops themselves for dealing with types - there is already the general purpose One part where a standard for passing to inner loops might be more useful is for (It does seem we're a bit far off track of the actual issue here! It would seem useful to collect cases where the current ufunc structure doesn't work so well in a wiki; for what it is worth, for units I think chained ufuncs are the way to go, i.e., one calls an upper inner loop which calls more standard inner loops below). |
Just got hit by this bug. My use case is simple: |
I think #12988 might address this. |
@concretevitamin: |
@eric-wieser Sure - make ndarray of datetime64 function more like first-class citizen will be welcome! |
numpy.isfinite
should supportdatetime64
andtimedelta64
objects. It should returnTrue
for any ordinary value, andFalse
fordatetime64('NaT')
andtimedelta64('NaT')
.As a side-effect, this will make
isclose
andallclose
usable fordatetime64
andtimedelta64
objects.The text was updated successfully, but these errors were encountered: