-
Notifications
You must be signed in to change notification settings - Fork 45
Refactor assertions in test_creation.py
#32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
3992a6e
to
1b30535
Compare
I unexpectedly spent a lot of time extending/modifying namely the In terms of refactoring assertions, I've actually found the process useful in making sure to test all the appropriate things in each test method. It also saves time in creating user-friendly error messages, especially if you want to fix/modify the message. I was thinking of moving these into say Tomorrow I'll have to fix a bug I just caught (see comment), but otherwise I'm happy for a review. |
Is the point of specified_kwargs that you can't use kwargs because the tests use
I expect this to be pretty thorny. The spec doesn't require any level of precision for output values, and different libraries may return different results if they use different implementations. There have been some ideas (e.g., #7), but none of it is simple. I'd personally rather see at least basic coverage of all the functions before we start to worry about testing floating-point correctness. Right now there are still too many functions that aren't tested at all, or are only tested to a bare minimum in test_signatures. |
Also, you probably figured this out already, but we don't need to generalize the argument handling of arange too much. arange's signature is quite unique in the specification. It's the only function with a signature as complicated as it is. |
I will be refactoring assertion messages anyway
abafee1
to
a4a0a35
Compare
Yep, although |
Also |
Looking at the code for test_meshgrid, my guess is that you are not limiting the number of input arrays, so the result is huge. This causes NumPy to try to allocate an array that doesn't fit in memory and it crashes Python. Note that the result shape from meshgrid is the product of the input shapes. You should filter any input where the resulting output would be larger than MAX_ARRAY_SIZE. It's also reasonable to not send more than some smallish number of total inputs. This is another example where the input strategy is complicated, so we want to just test the promotion in the normal test. If anything, I'm surprised this doesn't crash every time. Also meshgrid should only really be supporting 1-d inputs. That looks like a bug in the numpy implementation. |
("dtype", dtype, None), | ||
), | ||
label="kw", | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I'm missing a reason why this wouldn't work, but can all the drawing stuff in this test be factored into a @composite
test strategy. The logic here is complicated it is probably a good idea to have some meta tests for it to make sure it doesn't accidentally omit things it shouldn't.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've now moved specified_kwargs()
to hypothesis_helpers.py
and wrote a test method for it. I also introduced a named tuple which I think helps to clarify what the arguments mean.
assert list(a) == list(r), "arange() produced incorrect values" | ||
assert out.dtype == dtype | ||
assert out.ndim == 1, f"{out.ndim=}, but should be 1 [linspace()]" | ||
f_func = f"[linspace({start=}, {stop=}, {step=})]" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
f_func = f"[linspace({start=}, {stop=}, {step=})]" | |
f_func = f"[arange({start=}, {stop=}, {step=})]" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will be confusing (or wrong really) if only a single argument is passed because arange(n)
is the "start" argument (the first argument) but n is actually the stop. In fact, that's why start is positional-only, so that arange(start=n) is impossible. In general I would writing x=val
in error messages if x
is a positional-only argument because that makes it not copy-pastable and potentially confusing if a library doesn't use the same positional-only name as the spec.
size <= hh.MAX_ARRAY_SIZE | ||
), f"{size=} should be no more than {hh.MAX_ARRAY_SIZE}" # sanity check | ||
|
||
kw = data.draw( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should not test stop and step as keyword-arguments. arange is a little special. We really want it to be a function with three signatures
arange(stop, /, dtype=None)
arange(start, stop, /, dtype=None)
arange(start, stop, step, /, dtype=None)
(c.f. help(range)
) but there's no way to represent that as a single signature in Python. The closest we can do is to make two of the arguments optional by making them keyword arguments. The complication here is first that if only one argument is passed, it is actually the stop
(even though it is still called start
, and second, things like arange(n, step=k)
and arange(n, stop=None)
are meaningless. IMO we should make this clearer in the spec, but really arange only ought to support start, stop, and step as positional, i.e., as if it were the above 3 signatures. This was discussed some in data-apis/array-api#107 and data-apis/array-api#85.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see what you mean. So now I test the following argument combinations:
(start,)
(only ifstop is None
)(start, stop)
(start, stop, step)
Note step
must be used as a keyword argument when the argument combination isn't (start, stop, step)
, which test_arange
indeed passes it as (unless sometimes when step == 1
).
Co-authored-by: Aaron Meurer <asmeurer@gmail.com>
794ef30
to
aaf0a7d
Compare
4574441
to
ca2ef81
Compare
Fixed every issue you identified, thanks! I'm not familiar enough with |
I've merged this. We might still need to do some work on the arange test, but for now I think it's OK. The meshgrid change is fine. We can expand it when we convert it into the actual meshgrid test. |
In #30 I'm experimenting with the
ph.assert_dtype()
assertion util. As well as those promoted dtype checks, I realise a lot of dtype assertions relate to the default int and float, so I wanted to experiment with refactoring those too... I didn't get round to that today, seeing as how tricky just understandingtest_arange
was (I think it covers more now at least).