More promotion test cases #30


Merged: 24 commits from promotion-test-cases into data-apis:master on Oct 26, 2021

Conversation

@honno commented Oct 19, 2021

This should help resolve #24. It starts from the unmerged #26, so the commit history is currently messy. I'm going to try to cover some more methods before seeing about a refactor.

@honno changed the title from "Promotion test cases" to "More promotion test cases" on Oct 19, 2021
@honno force-pushed the promotion-test-cases branch 2 times, most recently from f2ceeb1 to a25dfa8, on October 20, 2021
@honno marked this pull request as ready for review on October 20, 2021
@honno commented Oct 20, 2021

I think I've covered every method now (save extensions). Parametrizing is a bit difficult given subtle differences between these methods, so that's something I'd recommend we think about later. I made a Hypothesis strategy for 2-or-more promotable dtypes... it minimises okay, with some room for improvement if/once I test for it.

Like with the elementwise/operator tests, I'm covering multi-dimensional arrays. I also thought to cover kwargs, although I didn't get round to fully implementing them today (e.g. the axis kwargs are limited to values that are safe for all valid inputs, rather than generated with respect to the drawn shapes). I'm now thinking of scrapping the kwargs stuff (i.e. assuming they're not going to affect dtype promotion behaviour) and leaving that for you/me to cover later in their own dedicated test methods. Update: scrapped kwargs.

The func name and respective dtypes are now included in all the error messages. I chose to keep that information at the end, so that when output is truncated the user can still see the relevant bug information next to the test case name.

=============================================================================== short test summary info ===============================================================================
FAILED test_type_promotion.py::test_where[(int8, int16) -> int16] - AssertionError: out.dtype=int8, but should be int16 [where(int8, int16)]
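
Regarding the strategy for 2-or-more promotable dtypes mentioned above, here is a minimal sketch of the idea; it is illustrative only, not the actual hh.mutually_promotable_dtypes implementation, and it assumes the dh.promotion_table mapping used elsewhere in the suite:

```python
from hypothesis import strategies as st

# Illustrative sketch only: every key of dh.promotion_table is a pair of
# dtypes that promote together, so sampling those keys yields 2-tuples of
# mutually promotable dtypes. Sorting by str keeps the draw order stable.
def mutually_promotable_dtypes() -> st.SearchStrategy:
    return st.sampled_from(sorted(dh.promotion_table.keys(), key=str))
```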

@given(shapes=hh.mutually_broadcastable_shapes(2, min_dims=1), data=st.data())
def test_matmul(in_dtypes, out_dtype, shapes, data):
    x1 = data.draw(xps.arrays(dtype=in_dtypes[0], shape=shapes[0]), label='x1')
    x2 = data.draw(xps.arrays(dtype=in_dtypes[1], shape=shapes[1]), label='x2')
A Member commented on this snippet:

matmul needs the input shapes to align (see the conditions in test_matmul).

@asmeurer commented Oct 20, 2021

So it looks like none of the functions outside of elementwise are actually testable with the same test as elementwise. I wonder if we should just test the type promotion for these functions in their normal function tests, like I've done here

assert res.dtype == dh.promotion_table[x1.dtype, x2.dtype], "matmul() did not return the correct dtype"
For matmul especially, I don't know if it's worth trying to come up with an input strategy for it just so that we can split the type promotion out into a separate test. The primary benefit of testing type promotion separately, as I understand it, is that the types are parameterized, so that you always have every combination tested. I don't know Hypothesis internals well enough to know how high you have to set the max examples to ensure that for the more general test, but I think it will be good in general for library authors to do a run with a high max-examples once they feel their library is conforming, just to confirm the tests didn't miss anything.
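
To illustrate checking promotion inside a normal function test, here is a hedged sketch of what that could look like for matmul. The test name and the square-shape strategy are purely illustrative (square operands trivially satisfy matmul's alignment rule); hh, xps, dh, and xp refer to the helpers and the array module already used in the suite:

```python
from hypothesis import given, strategies as st

@given(in_dtypes=hh.mutually_promotable_dtypes(), n=st.integers(1, 4), data=st.data())
def test_matmul_dtype_promotion(in_dtypes, n, data):
    # n-by-n operands always align for matmul, sidestepping the need for a
    # dedicated shape strategy just to exercise dtype promotion.
    x1 = data.draw(xps.arrays(dtype=in_dtypes[0], shape=(n, n)), label="x1")
    x2 = data.draw(xps.arrays(dtype=in_dtypes[1], shape=(n, n)), label="x2")
    out = xp.matmul(x1, x2)
    assert out.dtype == dh.promotion_table[x1.dtype, x2.dtype], (
        "matmul() did not return the correct dtype"
    )
```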

@honno force-pushed the promotion-test-cases branch from 1461f9a to 67a2d6b on October 21, 2021
@honno commented Oct 21, 2021

I see what you mean about the limitations of the non-elementwise dtype promotion tests. To prepare for a future where we're testing the 2-or-more array-accepting methods, I've made hh.mutually_promotable_dtypes() able to return more-than-2 dtypes. I've kept methods like test_meshgrid around, so you/I can move and extend them when we get time to properly implement the tests.

Relatedly, I've factored out my util assert_dtypes, which I believe gives a succinct error message that still includes the relevant information. I've tried using it for test_linalg.py::test_matmul too as a proof of concept; if you're happy with it, I would like to use ph.assert_dtypes outside of test_type_promotion.py going forward when testing dtype promotion.

I've also created typing.py to factor out some type hints I've been using.
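
As a rough idea of what that module might hold (only the DataType name appears in this thread; the Any definition and the Shape alias are guesses):

```python
# Hypothetical contents of typing.py; the real aliases may differ.
from typing import Any, Tuple

DataType = Any          # dtype objects of the array module under test
Shape = Tuple[int, ...]
```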

> The primary benefit of testing type promotion separately, as I understand it, is that the types are parameterized, so that you always have every combination tested. I don't know Hypothesis internals well enough to know how high you have to set the max examples to ensure that for the more general test, but I think it will be good in general for library authors to do a run with a high max-examples once they feel their library is conforming, just to confirm the tests didn't miss anything.

Mhm this is annoying. I'll keep this problem in the back of my mind, but yeah just asking authors to keep a high --max-examples in a separate CI job seems quite reasonable.

in_dtypes: Tuple[DataType, ...],
out_name: str,
out_dtype: DataType,
expected: DataType
A Member commented on this snippet:

This function feels like it has too many arguments. expected could just default to taking the result type of the in_dtypes. Is the out_name really needed either?

@honno replied:

I see, I've defaulted expected to the promoted result type. Tests that have different needs (e.g. functions returning xp.bool) can pass that instead.

I've kept out_name but defaulted it to out.dtype. I like the error message following the spec notation, so for most cases that means testing the dtype attribute of the out array, i.e. out.dtype. Again, we can pass a different name for the odd diverging tests, such as meshgrid() where we're testing out[i].dtype (for each i in range(len(out))), or the in-place tests where it's x1.dtype.

Basically, for the test suite's purposes there are now usually just three arguments to worry about (func_name, in_dtypes, and out_dtype).
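
Putting the above together, a hedged sketch of what the helper could look like after these changes (the real code in the suite's pytest helpers may differ; DataType is the alias from the new typing.py, and dh.promotion_table is the table used earlier):

```python
from functools import reduce
from typing import Optional, Tuple

def assert_dtype(
    func_name: str,
    in_dtypes: Tuple[DataType, ...],
    out_dtype: DataType,
    expected: Optional[DataType] = None,
    *,
    out_name: str = "out.dtype",
):
    # Default the expected dtype to the promotion of the input dtypes; tests
    # with different needs (e.g. functions returning xp.bool) pass it explicitly.
    if expected is None:
        expected = reduce(lambda d1, d2: dh.promotion_table[d1, d2], in_dtypes)
    f_in = ", ".join(str(d) for d in in_dtypes)
    # Error message mirrors the format shown earlier: func name and input
    # dtypes go at the end so truncated output stays informative.
    assert out_dtype == expected, (
        f"{out_name}={out_dtype}, but should be {expected} [{func_name}({f_in})]"
    )
```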

@asmeurer commented:

Also we should note that single argument functions need to test dtypes too. They usually just return the same dtype as the input, but it's still something that needs to be tested.

@honno force-pushed the promotion-test-cases branch from 6fa2f9c to 2876e61 on October 22, 2021
@honno commented Oct 22, 2021

> Also we should note that single argument functions need to test dtypes too. They usually just return the same dtype as the input, but it's still something that needs to be tested.

ph.assert_dtype() now works for this too, by passing in_dtypes as a length-1 tuple.
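
For example (an illustrative call, with xp.abs standing in for any unary function and x an already-drawn array):

```python
out = xp.abs(x)
# With a single input dtype, expected defaults to that same dtype.
ph.assert_dtype("abs", (x.dtype,), out.dtype)
```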

@asmeurer commented:

Looks like your other PR created merge conflicts with this one.

@asmeurer commented:

assert_dtypes looks better. We should definitely just use separate tests for the non-elementwise functions. Maybe you can add a comment that those tests will be moved into the main tests once they are written.

@honno force-pushed the promotion-test-cases branch from 3c8c7a8 to 0ee54b2 on October 25, 2021
@honno force-pushed the promotion-test-cases branch from f41077e to 2d91340 on October 25, 2021
Commit: "And update NumPy workflow to xfail the failing new test cases"
@honno commented Oct 25, 2021

> Looks like your other PR created merge conflicts with this one.
> Maybe you can add a comment that those tests will be moved into the main tests once they are written.

Fixed and done!

I should also note that for numpy.array_api, tensordot fails as it doesn't support broadcastable shapes, and meshgrid fails because it doesn't promote all of its returned arrays to the one promoted dtype.
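
For meshgrid, a small illustration of the behaviour the new test expects (the behaviour numpy.array_api currently fails, per the comment above), assuming int8 and int16 inputs:

```python
x1 = xp.asarray([0], dtype=xp.int8)
x2 = xp.asarray([0], dtype=xp.int16)
out = xp.meshgrid(x1, x2)
# The test expects every returned array to have the single promoted dtype.
assert all(arr.dtype == xp.int16 for arr in out)
```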

@asmeurer commented:

I didn't realize np.tensordot doesn't broadcast. I wonder if that is intentional. It should be easy enough to fix in the implementation. For meshgrid we should figure out if that is intentional too. But either way, we should make the tests match the spec until the spec is updated.

@honno commented Oct 25, 2021

> For meshgrid we should figure out if that is intentional too.

To remind ourselves, Atahn pointed out that PyTorch requires input arrays to have the same dtype (and so ignores type promotion issues altogether... but that would have to change if the spec remains as-is). https://github.com/tensorflow/tensorflow/blob/v2.6.0/tensorflow/python/ops/array_ops.py#L3650

@asmeurer merged commit a88c3b8 into data-apis:master on Oct 26, 2021
@honno deleted the promotion-test-cases branch on February 8, 2022

Successfully merging this pull request may close: Add type promotion tests for non-elementwise functions