More promotion test cases #30


Merged: 24 commits from promotion-test-cases into data-apis:master on Oct 26, 2021

Conversation

@honno commented Oct 19, 2021

This should help resolve #24. It starts from the unmerged #26, so the commit history is currently messy. I'm going to try to cover some more methods before seeing about a refactor.

@honno changed the title from "Promotion test cases" to "More promotion test cases" on Oct 19, 2021
@honno force-pushed the promotion-test-cases branch 2 times, most recently from f2ceeb1 to a25dfa8, on October 20, 2021
@honno marked this pull request as ready for review on October 20, 2021
@honno commented Oct 20, 2021

I think I've covered every method now (save extensions). Parametrizing is a bit difficult given subtle differences between these methods, so that's something I'd recommend we think about later. I made a Hypothesis strategy for 2-or-more promotable dtypes... it minimises okay, with some room for improvement if/once I test for it.

Like with the elementwise/operator tests, I'm covering multi-dimensional arrays. I also thought to cover kwargs, although I didn't get round to fully implementing them today (e.g. the axis kwargs are limited to values that are safe for all valid inputs, rather than generated with respect to the drawn shapes). I'm now thinking of scrapping the kwargs stuff (i.e. assuming they're not going to affect dtype promotion behaviour) and leaving that for you/me to cover later in their own dedicated test methods. Update: scrapped kwargs.

The func name and respective dtypes are now included in all the error messages. I chose to keep that information at the end, so that when output is truncated the user can still see the relevant bug information next to the test case name.

=============================================================================== short test summary info ===============================================================================
FAILED test_type_promotion.py::test_where[(int8, int16) -> int16] - AssertionError: out.dtype=int8, but should be int16 [where(int8, int16)]
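
Regarding the strategy for 2-or-more promotable dtypes mentioned above, here is a minimal sketch of the idea; it is illustrative only, not the actual hh.mutually_promotable_dtypes implementation, and it assumes the dh.promotion_table mapping used elsewhere in the suite:

```python
from hypothesis import strategies as st

# Illustrative sketch only: every key of dh.promotion_table is a pair of
# dtypes that promote together, so sampling those keys yields 2-tuples of
# mutually promotable dtypes. Sorting by str keeps the draw order stable.
def mutually_promotable_dtypes() -> st.SearchStrategy:
    return st.sampled_from(sorted(dh.promotion_table.keys(), key=str))
```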

@given(shapes=hh.mutually_broadcastable_shapes(2, min_dims=1), data=st.data())
def test_matmul(in_dtypes, out_dtype, shapes, data):
    x1 = data.draw(xps.arrays(dtype=in_dtypes[0], shape=shapes[0]), label='x1')
    x2 = data.draw(xps.arrays(dtype=in_dtypes[1], shape=shapes[1]), label='x2')
A Member commented on this snippet:

matmul needs the input shapes to align (see the conditions in test_matmul).

@asmeurer commented Oct 20, 2021

So it looks like none of the functions outside of elementwise are actually testable with the same test as elementwise. I wonder if we should just test the type promotion for these functions in their normal function tests, like I've done here

assert res.dtype == dh.promotion_table[x1.dtype, x2.dtype], "matmul() did not return the correct dtype"
For matmul especially, I don't know if it's worth trying to come up with an input strategy for it just so that we can split the type promotion out into a separate test. The primary benefit of testing type promotion separately, as I understand it, is that the types are parameterized, so that you always have every combination tested. I don't know Hypothesis internals well enough to know how high you have to set the max examples to ensure that for the more general test, but I think it will be good in general for library authors to do a run with a high max-examples once they feel their library is conforming, just to confirm the tests didn't miss anything.
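
To illustrate checking promotion inside a normal function test, here is a hedged sketch of what that could look like for matmul. The test name and the square-shape strategy are purely illustrative (square operands trivially satisfy matmul's alignment rule); hh, xps, dh, and xp refer to the helpers and the array module already used in the suite:

```python
from hypothesis import given, strategies as st

@given(in_dtypes=hh.mutually_promotable_dtypes(), n=st.integers(1, 4), data=st.data())
def test_matmul_dtype_promotion(in_dtypes, n, data):
    # n-by-n operands always align for matmul, sidestepping the need for a
    # dedicated shape strategy just to exercise dtype promotion.
    x1 = data.draw(xps.arrays(dtype=in_dtypes[0], shape=(n, n)), label="x1")
    x2 = data.draw(xps.arrays(dtype=in_dtypes[1], shape=(n, n)), label="x2")
    out = xp.matmul(x1, x2)
    assert out.dtype == dh.promotion_table[x1.dtype, x2.dtype], (
        "matmul() did not return the correct dtype"
    )
```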

@honno force-pushed the promotion-test-cases branch from 1461f9a to 67a2d6b on October 21, 2021
@honno commented Oct 21, 2021

I see what you mean about the limitations of the non-elementwise dtype promotion tests. To prepare for a future where we're testing the 2-or-more array-accepting methods, I've made hh.mutually_promotable_dtypes() able to return more-than-2 dtypes. I've kept methods like test_meshgrid around, so you/I can move and extend them when we get time to properly implement the tests.

Relatedly, I've factored out my util assert_dtypes, which I believe gives a succinct error message that still includes the relevant information. I've tried using it for test_linalg.py::test_matmul too as a proof of concept; if you're happy with it, I would like to use ph.assert_dtypes outside of test_type_promotion.py going forward when testing dtype promotion.

I've also created typing.py to factor out some type hints I've been using.
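
As a rough idea of what that module might hold (only the DataType name appears in this thread; the Any definition and the Shape alias are guesses):

```python
# Hypothetical contents of typing.py; the real aliases may differ.
from typing import Any, Tuple

DataType = Any          # dtype objects of the array module under test
Shape = Tuple[int, ...]
```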

> The primary benefit of testing type promotion separately, as I understand it, is that the types are parameterized, so that you always have every combination tested. I don't know Hypothesis internals well enough to know how high you have to set the max examples to ensure that for the more general test, but I think it will be good in general for library authors to do a run with a high max-examples once they feel their library is conforming, just to confirm the tests didn't miss anything.

Mhm this is annoying. I'll keep this problem in the back of my mind, but yeah just asking authors to keep a high --max-examples in a separate CI job seems quite reasonable.

in_dtypes: Tuple[DataType, ...],
out_name: str,
out_dtype: DataType,
expected: DataType
A Member commented on this snippet:

This function feels like it has too many arguments. expected could just default to taking the result type of the in_dtypes. Is the out_name really needed either?

@honno replied:

I see, I've defaulted expected to the promoted result type. Tests that have different needs (e.g. functions returning xp.bool) can pass that instead.

I've kept out_name but defaulted it to out.dtype. I like the error message following the spec notation, so for most cases that means testing the dtype attribute of the out array, i.e. out.dtype. Again, we can pass a different name for the odd diverging tests, such as meshgrid() where we're testing out[i].dtype (for each i in range(len(out))), or the in-place tests where it's x1.dtype.

Basically, for the test suite's purposes there are now usually just three arguments to worry about (func_name, in_dtypes, and out_dtype).
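
Putting the above together, a hedged sketch of what the helper could look like after these changes (the real code in the suite's pytest helpers may differ; DataType is the alias from the new typing.py, and dh.promotion_table is the table used earlier):

```python
from functools import reduce
from typing import Optional, Tuple

def assert_dtype(
    func_name: str,
    in_dtypes: Tuple[DataType, ...],
    out_dtype: DataType,
    expected: Optional[DataType] = None,
    *,
    out_name: str = "out.dtype",
):
    # Default the expected dtype to the promotion of the input dtypes; tests
    # with different needs (e.g. functions returning xp.bool) pass it explicitly.
    if expected is None:
        expected = reduce(lambda d1, d2: dh.promotion_table[d1, d2], in_dtypes)
    f_in = ", ".join(str(d) for d in in_dtypes)
    # Error message mirrors the format shown earlier: func name and input
    # dtypes go at the end so truncated output stays informative.
    assert out_dtype == expected, (
        f"{out_name}={out_dtype}, but should be {expected} [{func_name}({f_in})]"
    )
```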

@asmeurer commented:

Also we should note that single argument functions need to test dtypes too. They usually just return the same dtype as the input, but it's still something that needs to be tested.

@honno force-pushed the promotion-test-cases branch from 6fa2f9c to 2876e61 on October 22, 2021
@honno commented Oct 22, 2021

> Also we should note that single argument functions need to test dtypes too. They usually just return the same dtype as the input, but it's still something that needs to be tested.

ph.assert_dtype() now works for this too, by passing in_dtypes as a length-1 tuple.
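
For example (an illustrative call, with xp.abs standing in for any unary function and x an already-drawn array):

```python
out = xp.abs(x)
# With a single input dtype, expected defaults to that same dtype.
ph.assert_dtype("abs", (x.dtype,), out.dtype)
```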

@asmeurer commented:

Looks like your other PR created merge conflicts with this one.

@asmeurer commented:

assert_dtypes looks better. We should definitely just use separate tests for the non-elementwise functions. Maybe you can add a comment that those tests will be moved into the main tests once they are written.

@honno force-pushed the promotion-test-cases branch from 3c8c7a8 to 0ee54b2 on October 25, 2021
@honno force-pushed the promotion-test-cases branch from f41077e to 2d91340 on October 25, 2021
Commit: "And update NumPy workflow to xfail the failing new test cases"
@honno commented Oct 25, 2021

> Looks like your other PR created merge conflicts with this one.
> Maybe you can add a comment that those tests will be moved into the main tests once they are written.

Fixed and done!

I should also note that for numpy.array_api, tensordot fails as it doesn't support broadcastable shapes, and meshgrid fails because it doesn't promote all of its returned arrays to the one promoted dtype.
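
For meshgrid, a small illustration of the behaviour the new test expects (the behaviour numpy.array_api currently fails, per the comment above), assuming int8 and int16 inputs:

```python
x1 = xp.asarray([0], dtype=xp.int8)
x2 = xp.asarray([0], dtype=xp.int16)
out = xp.meshgrid(x1, x2)
# The test expects every returned array to have the single promoted dtype.
assert all(arr.dtype == xp.int16 for arr in out)
```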

@asmeurer commented:

I didn't realize np.tensordot doesn't broadcast. I wonder if that is intentional. It should be easy enough to fix in the implementation. For meshgrid we should figure out if that is intentional too. But either way, we should make the tests match the spec until the spec is updated.

@honno commented Oct 25, 2021

> For meshgrid we should figure out if that is intentional too.

To remind ourselves, Atahn pointed out that PyTorch requires input arrays to have the same dtype (and so ignores type promotion issues altogether... but that would have to change if the spec remains as-is). https://github.com/tensorflow/tensorflow/blob/v2.6.0/tensorflow/python/ops/array_ops.py#L3650

@asmeurer merged commit a88c3b8 into data-apis:master on Oct 26, 2021
@honno deleted the promotion-test-cases branch on February 8, 2022

Successfully merging this pull request may close: Add type promotion tests for non-elementwise functions