-
-
Notifications
You must be signed in to change notification settings - Fork 11.1k
BUG: add exception to searchsorted incompatible inputs #26650
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Initial PR failed because of regression tests from more than 10 years ago. There are 3 tests: The first regression test passes:numpy/numpy/_core/tests/test_regression.py Lines 859 to 862 in 5aaede6
The second test passes too:numpy/numpy/_core/tests/test_regression.py Lines 2017 to 2020 in 5aaede6
The new verification does not catches the error and the expected exception is thrown in a getattr inside _wrapit :
File ~/code/os/numpy/numpy/_core/fromnumeric.py:46, in _wrapit(obj, method, *args, **kwds)
43 # As this already tried the method, subok is maybe quite reasonable here
44 # but this follows what was done before. TODO: revisit this.
45 arr, = conv.as_arrays(subok=False)
---> 46 result = getattr(arr, method)(*args, **kwds)
48 return conv.wrap(result, to_scalar=False)
TypeError: '<' not supported between instances of 'tuple' and 'float' The third test fails (logs) and is related to #642 and #2658.numpy/numpy/_core/tests/test_regression.py Lines 2069 to 2078 in 5aaede6
Which now throws: ValueError: Incompatible types for searching: a ((numpy.record, [('f0', '<i4'), ('f1', '<i4')])) and v (int64)
SummaryThere's already a rudimentary type verification by means of not being able to compare types, but is not enough to catch With this PR there will be two types of errors:
My suggestion from 647bb85 is to just use |
This PR has a negative impact on the performance for small arrays. A quick test:
Results in
(that does not mean we should not merge this PR, as it does fix a bug)
|
I agree! Is there any opportunity for "private" functions and methods that already trust the argument types? If so, the work has to be done only once to bring the Python built-in types to numpy types, all the rest of the work can be done under the hood. |
Any updates on this? A very similar PR has been approved: #26667 |
Personally, I prefer the "first make it right, then make it fast" approach and am inclined to accept this PR as is. But if someone wants to dive into moving the numpy/numpy/_core/src/multiarray/item_selection.c Line 2084 in 95073d1
The implementation would still look up |
@mattip The problem addressed in the PR is a bit on the boundary between fixing an bug, and providing a better error message, so I would prefer to solve it right in the first go. I would be willing to look in to the C implementation, but that will not be in the comings weeks, so if we do not want to wait there are no objections from my side to merge this. |
I can give a go at the C code as well but I'm very new to numpy's codebase, so a little guidance would be of much help. |
Co-authored-by: Pieter Eendebak <pieter.eendebak@gmail.com>
Performance considerations for this PR might have changed a bit since #27119. I am having trouble compiling on my own system, so I cannot provide any good benchmarks though. @luxedo Do you still want to have a go at the C code? I think (but it would be nice if a numpy developer could confirm) that you could move the check to |
I'm out for a couple more weeks or so :( but I'd love to give it a go |
Hi! I'm starting taking a look and trying to understand numpy's C code better. So far I got this: NPY_NO_EXPORT PyObject *
PyArray_SearchSorted(PyArrayObject *op1, PyObject *op2,
NPY_SEARCHSIDE side, PyObject *perm)
{
/* Check if types are comparable */
// Load 'less' ufunc to pass to py_resolve_dtypes_generic. DON'T KNOW HOW TO FIND THE ARGUMENTS.
PyUFuncObject *ufunc_less = PyUFunc_FromFuncAndData("Don't know how to get 'less' ufunc");
// Pack input types an a tuple to pass to py_resolve_dtypes_generic
PyObject *dtypes = PyTuple_Pack(2, PyArray_DESCR(op1), op2);
// Empty tuple as kwnames
PyObject *kwnames = PyTuple_New(0);
// Check if types are compatible
PyObject *resolve = py_resolve_dtypes_generic(ufunc_less, NPY_FALSE, &dtypes, 2, kwnames);
// Cleanup and early return if types are not comparable
bool incompatible_types = resolve == NULL;
Py_DECREF(dtypes);
Py_DECREF(kwnames);
Py_XDECREF(resolve);
if (incompatible_types) {
PyErr_SetString(PyExc_TypeError, "incomparable types");
return NULL;
}
... This won't compile yet. Because I don't know how to get the Is this in the correct path? Is there any other way to validate that the types can be compared? |
Closes #24032 by throwing an exception when
searchsorted
receives incompatible types