-
Notifications
You must be signed in to change notification settings - Fork 15
Add register_anonymous
to BinaryOp and use this for isclose
.
#10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Oops, I still need to do this for `UnaryOp`. We cache the compilation of `isclose` based on `(abs_tol, rel_tol)`. Overally, I don't know the performance difference of this `isclose`, but just as importantly this should be more memory efficient.
Still need to add a few tests and docstrings. I'm not sure I like `register_anonymous`. create_anonymous, build_anonymous, create, build, compile, jit?
So far, do unary, binary, and monoid. Semirings may come later. This required modifying how UDFs are registered. In particular, we are now more relaxed about bool inputs that have non-bool outputs. Numba has difficulty compiling functions for bools, so we upcast bool values to int8 during the computation. This may introduce subtle bugs if coercions are not appropriate. For example, bit-twidding and shifting of bools may be incorrect!
…rings. Specifically, we need to match the return type of binary functions to the type for the monoid.
It's risky to have flexible equality *and* hashability. I think it's best to choose one; in this case, flexible equality is nicer.
v = Vector.new_from_values([0, 1, 3], [1, 2, -4], dtype=dtypes.INT32) | ||
v << v.apply(unary.plus_one) | ||
result = Vector.new_from_values([0, 1, 3], [2, 3, -3], dtype=dtypes.INT32) | ||
assert v.isequal(result) | ||
|
||
|
||
def test_unaryop_udf_bool_result(): | ||
pytest.xfail('not sure why numba has trouble compiling this') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good news! The hack to change BOOL to INT8 during computation appeared to work. Well, at least this test passes now.
It sure would be nice to figure out ahead-of-time compiling with numba so we don't need to JIT compile the numpy functions every time we start up. |
Yeah, that would be great (assuming the binary isn't too large). One step at a time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only thing I might change is the location of _generate_isclose. It feels strange having that in vector.py, but I can't think of where else makes more sense. So I'm okay with it here for now.
This revealed that our hack to support boolean unary and binary functions is not compatible with monoids and semirings. Oh well. All bool operations in `unary.numpy` and `binary.numpy` should work "as expected", which, admittedly, is a little weird at times. Bool to float operations go to FP32.
Whew! Tests added. UDFs over bools works for unary and binary operations, but not monoid or semiring. Oh well. Hopefully our hack to support bool unary and binary continues to work okay with few surprises. Shall we merge? I may eventually have |
Oops, I still need to do this for
UnaryOp
.We cache the compilation of
isclose
based on(abs_tol, rel_tol)
. Overall, I don't know the performance difference of thisisclose
, but just as importantly this should be more memory efficient.This is still a little rough, but I put this up for feedback.