Clean up the data in test_type_promotion.py #26


Merged
merged 47 commits into data-apis:master on Oct 19, 2021

Conversation

honno
Member

@honno honno commented Oct 5, 2021

This will resolve #21.

  • Move the reusable data and helper functions to a central location
  • Change the data to be keyed on the dtype objects themselves, or dtype string names where appropriate
    • Make promote_dtypes() a dict
  • Review more refactors and other niceties
    • dtype objects
    • is_*_dtype()
    • dtype ranges
  • Clean up the parameterization logic in test_type_promotion.py
    • Python scalars

@honno honno force-pushed the type-promotion-refactor branch from 9ec75a5 to 48fd544 Compare October 7, 2021 16:16
@honno honno force-pushed the type-promotion-refactor branch from d6a3379 to ce6a7e8 Compare October 11, 2021 16:37
@honno
Member Author

honno commented Oct 11, 2021

If you have a look today, note that I'm definitely not happy with the current parametrize refactor yet (but now that I've finally figured everything out, it should be easier to clean up tomorrow). I will also compare the parameters with master so that I don't miss anything.

@asmeurer
Member

Your approach to cleaning up the parametrizations seems pretty reasonable. Doing it all in a function with explicit loops instead of list comprehensions, and generating param objects instead of separate parameters and ids, does look better.

@honno honno marked this pull request as ready for review October 12, 2021 16:48
@honno
Member Author

honno commented Oct 12, 2021

I think I'm happy with how I've refactored the parametrization logic and the tests now, although it would be good if I slept on it.

I will get to cleaning up the scalar promotion test case at the bottom tomorrow too. Generally would you mind if I used the word "builtin" to specify Python-type scalars? e.g. a test method test_operator_promotes_scalar would become test_operator_promotes_builtin. I think this would help distinguish from 0d arrays.

Also I just realised I should redirect all references to ah.dtype_objects (and similar) to what's inside dtype_helpers.

Also also I should keep reference to the notes you made regarding internal promotions and (0,) examples somewhere.

@asmeurer
Member

I will get to cleaning up the scalar promotion test case at the bottom tomorrow too. Generally would you mind if I used the word "builtin" to specify Python-type scalars? e.g. a test method test_operator_promotes_scalar would become test_operator_promotes_builtin. I think this would help distinguish from 0d arrays.

I usually try to call these "Python scalars". I think that's also what the spec calls them, at least right now. "Builtin" is unfortunately sometimes an overloaded term in Python, which sometimes means what you'd expect, but also sometimes means "compiled".

Also also I should keep reference to the notes you made regarding internal promotions and (0,) examples somewhere.

This probably isn't super important. Back when the tests were much simpler, an @example was possible. Unless you expect that might be the case again with the new hypothesis strategies, it's probably not that important. I expect hypothesis will always draw dimension 0 arrays anyway (also there was a typo anyway; it should be shape (), not (0,)). See https://data-apis.org/array-api/latest/API_specification/type_promotion.html#notes
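To illustrate the typo being corrected there, a quick sketch (using NumPy purely for demonstration): shape () is a zero-dimensional array, while shape (0,) is a one-dimensional array that happens to be empty.

```python
import numpy as np

x = np.asarray(5)       # shape (): a zero-dimensional (scalar-like) array
empty = np.zeros((0,))  # shape (0,): one-dimensional, but contains no elements

print(x.shape, x.ndim)          # () 0
print(empty.shape, empty.size)  # (0,) 0
```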

@asmeurer
Member

I also don't test NaNs and inf scalars, as I expect you want to keep that all in the generated special cases tests.

No we should keep those. The special case tests only test the functions, not the operators. And they also don't test the dtype, only the value.

{
'__add__': lambda m, M: lambda s: {
'min_value': max(m - s, m),
'max_value': min(M - s, M),
Member

I don't think you need to be worrying about this. Any input value from the dtype range should work, even if the output overflows. I think if some library does something different for integer overflows, we will need to update a large part of the test suite to handle it.

@honno
Member Author

honno commented Oct 14, 2021

I think I'm happy with this now. I'm now testing NaNs and infs.


Note that I wrap potential overflow scenarios with a try-except block. This might be all we need: if other libraries don't take the NumPy route of overflowing-without-error, they're likely going to throw an OverflowError. In any case, this clearly flags to us and others in the future that this code can overflow.

try:
    out = func(*arrays)
except OverflowError:
    reject()

FYI hypothesis.reject() is an alias for hypothesis.assume(False). I use it instead of assume(False) because it plays better with mypy (e.g. flycheck won't complain that out is possibly unbound).
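As a self-contained sketch of the pattern (the add_uint8 function here is a hypothetical stand-in for a library that raises on integer overflow, not anything in the test suite):

```python
from hypothesis import given, reject, strategies as st

def add_uint8(a: int, b: int) -> int:
    # Hypothetical stand-in for a library that raises instead of wrapping.
    result = a + b
    if result > 255:
        raise OverflowError
    return result

@given(st.integers(0, 255), st.integers(0, 255))
def test_add_uint8_dtype_range(a, b):
    try:
        out = add_uint8(a, b)
    except OverflowError:
        reject()  # alias for assume(False): discard this drawn example
    assert 0 <= out <= 255
```

Because reject() raises, execution never reaches the assertion with out unbound, which is what keeps the type checkers quiet.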


Good shout on the dtype sorting flaw. I've now used a method which only relies on dtype object equality. The current minimisation behaviour should be:

Promotable dtype pairs sorted by minimisation order
bool     bool   
uint8    uint8  
uint16   uint16 
uint32   uint32 
uint64   uint64 
int8     int8   
int16    int16  
int32    int32  
int64    int64  
float32  float32
float64  float64
uint8    uint16 
uint16   uint8  
uint8    uint32 
uint32   uint8  
uint8    uint64 
uint16   uint32 
uint64   uint8  
uint32   uint16 
uint16   uint64 
uint64   uint16 
uint32   uint64 
uint64   uint32 
int8     int16  
int16    int8   
int8     int32  
int32    int8   
int8     int64  
int16    int32  
int64    int8   
int32    int16  
int16    int64  
int64    int16  
int32    int64  
int64    int32  
float32  float64
float64  float32
uint8    int8   
int8     uint8  
uint16   int8   
uint8    int16  
int8     uint16 
int16    uint8  
uint32   int8   
uint16   int16  
uint8    int32  
int8     uint32 
int16    uint16 
int32    uint8  
uint32   int16  
uint16   int32  
uint8    int64  
int16    uint32 
int32    uint16 
int64    uint8  
uint32   int32  
uint16   int64  
int32    uint32 
int64    uint16 
uint32   int64  
int64    uint32

Broadly the minimisation goes:

  1. Same dtypes
  2. Dtypes in the same category
  3. Dtypes in mixed categories (i.e. uints and ints)

dtypes are ordered like so:

  1. xp.bool
  2. uints
  3. ints
  4. floats

And for pairs of mixed dtypes, I also prefer the smaller dtype as the first item in the pair.
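That three-tier grouping can be sketched with a plain sort key (illustrative only; the names and the finer within-tier ordering in the actual refactor differ):

```python
# Tier 0: identical dtypes; tier 1: distinct dtypes in the same category;
# tier 2: mixed categories. Within a tier, earlier/smaller dtypes come first.
CATEGORY = {
    "bool": 0,
    "uint8": 1, "uint16": 1, "uint32": 1, "uint64": 1,
    "int8": 2, "int16": 2, "int32": 2, "int64": 2,
    "float32": 3, "float64": 3,
}
ORDER = list(CATEGORY)  # insertion order: bool, uints, ints, floats

def pair_key(pair):
    d1, d2 = pair
    if d1 == d2:
        tier = 0
    elif CATEGORY[d1] == CATEGORY[d2]:
        tier = 1
    else:
        tier = 2
    return (tier, ORDER.index(d1), ORDER.index(d2))

pairs = [("uint8", "int8"), ("uint8", "uint16"), ("int32", "int32"), ("bool", "bool")]
print(sorted(pairs, key=pair_key))
# [('bool', 'bool'), ('int32', 'int32'), ('uint8', 'uint16'), ('uint8', 'int8')]
```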

I believe you initially desired float first, then int-uint-bool. Personally I prefer the opposite scenario, which is also how hypothesis.extra.numpy/hypothesis.extra.array_api does it. I think (bool, bool) better indicates errors which are not related to dtypes at all, just an array of any dtype. (uintX, uintX) likewise better indicates problems with numerical arrays generally, as floats obviously have a larger "elements space" of real numbers, NaNs and infs.

Also, I can't actually test minimisation behaviour easily without using the MPL-licensed code Hypothesis does for this. I'll definitely mull it over though, so maybe expect a PR in the future for this.

(the code I used to print the above table)
from rich import print
from rich.table import Table
grid = Table.grid("1st", "2nd", padding=(0, 2))
for key in promotable_dtypes:
    grid.add_row(f"{key[0]}", f"{key[1]}")
print("")
print(grid)

@honno honno requested a review from asmeurer October 14, 2021 10:04
@asmeurer
Member

I believe you initially desired float first, then int-uint-bool. Personally I prefer the opposite scenario, which is also how hypothesis.extra.numpy/hypothesis.extra.array_api does it. I think (bool, bool) better indicates errors which are not related to dtypes at all, just an array of any dtype. (uintX, uintX) likewise better indicates problems with numerical arrays generally, as floats obviously have a larger "elements space" of real numbers, NaNs and infs.

Sounds reasonable. It isn't important, but I also wonder if we should prefer the "default floating-point/integer dtype" first. I find it a little confusing that hypothesis always shrinks to float32. In my experience, if only one of float32 or float64 fails, then float32 is the one that fails, because it is more likely to suffer bugs from overflows or upcasting. If anything, seeing "float32" makes me suspect that the shorter type is somehow relevant (although I know that this is a general sort of problem with hypothesis shrinking that you just have to be on guard about, see https://hypothesis.works/articles/threshold-problem/). If the floating-point dtype is irrelevant to the failure, I think I would generally prefer to see the default floating-point dtype (which is float64 for NumPy, but can be either in the spec).

Also, I can't actually test minimisation behaviour easily without using the MPL-licensed code Hypothesis does for this. I'll definitely mull it over though, so maybe expect a PR in the future for this.

Can we just import the internal hypothesis functions?

@asmeurer
Member

Great, this works against plain numpy now. When I test it, I get parameter names like array_api_tests/test_type_promotion.py::test_func_returns_array_with_correct_dtype[abs(<class 'numpy.int8'>) -> <class 'numpy.int8'>]. The <class 'numpy.int8'> bit also appears in the error messages, but I think that's fine. But I think we should make the parameterization always give abs(int8) -> int8 like it does with numpy.array_api.

I was worried what the error messages would look like with so many @composite strategies, but it looks like they are usable. Here is an example error from numpy (which does incorrect type promotion on 0-d arrays)

func = <ufunc 'add'>, in_dtypes = (<class 'numpy.int64'>, <class 'numpy.int32'>), out_dtype = <class 'numpy.int64'>
x_filter = <function <lambda>.<locals>.<lambda> at 0x112c7c0d0>, data = data(...)

    @pytest.mark.parametrize('func, in_dtypes, out_dtype, x_filter', gen_func_params())
    @given(data=st.data())
    def test_func_returns_array_with_correct_dtype(
        func, in_dtypes, out_dtype, x_filter, data
    ):
        if len(in_dtypes) == 1:
            x = data.draw(
                xps.arrays(dtype=in_dtypes[0], shape=hh.shapes).filter(x_filter), label='x'
            )
            out = func(x)
        else:
            arrays = []
            shapes = data.draw(
                hh.mutually_broadcastable_shapes(len(in_dtypes)), label='shapes'
            )
            for i, (dtype, shape) in enumerate(zip(in_dtypes, shapes), 1):
                x = data.draw(
                    xps.arrays(dtype=dtype, shape=shape).filter(x_filter), label=f'x{i}'
                )
                arrays.append(x)
            try:
                out = func(*arrays)
            except OverflowError:
                reject()
>       assert out.dtype == out_dtype, f'{out.dtype=!s}, but should be {out_dtype}'
E       AssertionError: out.dtype=int32, but should be <class 'numpy.int64'>
E       assert dtype('int32') == <class 'numpy.int64'>
E         +dtype('int32')
E         -<class 'numpy.int64'>

array_api_tests/test_type_promotion.py:99: AssertionError
------------------------------------------------------------------------------ Hypothesis -------------------------------------------------------------------------------
Falsifying example: test_func_returns_array_with_correct_dtype(
    data=data(...),
    func=<ufunc 'add'>,
    in_dtypes=(numpy.int64, numpy.int32),
    out_dtype=numpy.int64,
    x_filter=<function array_api_tests.test_type_promotion.<lambda>.<locals>.<lambda>>,
)
Draw 1 (shapes): ((), (1,))
Draw 2 (x1): array(0)
Draw 3 (x2): array([0], dtype=int32)

The input arrays, including shapes and dtypes are both easy to see. My only gripe is I would include the name of the function in the assertion error message.

@honno
Member Author

honno commented Oct 18, 2021

I also wonder if we should prefer the "default floating-point/integer dtype" first

Can we just import the internal hypothesis functions?

The main function I'm referring to is minimal(), which unfortunately is part of their internal tests/ utilities as opposed to being part of the hypothesis package. So regarding default ints/floats, that makes a lot of sense now, but it's probably not worth the hassle until I can figure out how to actually test this. Also I believe there's no consistent way of knowing what a library's default int/float is?

When I test it, I get parameter names like array_api_tests/test_type_promotion.py::test_func_returns_array_with_correct_dtype[abs(<class 'numpy.int8'>) -> <class 'numpy.int8'>]. The <class 'numpy.int8'> bit also appears in the error messages, but I think that's fine. But I think we should make the parameterization always give abs(int8) -> int8 like it does with numpy.array_api.

Right I see, that isn't pretty heh. I've fixed this now by mapping dtypes to names so the parameter ids are now consistent.
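A minimal sketch of that mapping idea (the names here are stand-ins, not the test suite's actual helpers):

```python
class int8:  # stand-in for a library's dtype object
    pass

dtype_to_name = {int8: "int8"}

def param_id(func_name, in_dtypes, out_dtype):
    # Build a readable pytest parameter id like "abs(int8) -> int8",
    # instead of letting pytest stringify the raw dtype objects.
    in_names = ", ".join(dtype_to_name[d] for d in in_dtypes)
    return f"{func_name}({in_names}) -> {dtype_to_name[out_dtype]}"

print(param_id("abs", (int8,), int8))  # abs(int8) -> int8
```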

My only gripe is I would include the name of the function in the assertion error message.

I see that adding the func names to assertions could be useful. Additionally it could be useful to have the dtypes involved, à la the parameter ids. How the func names should fit in an error message requires some thought, as they can end up making the messages harder to parse for a user (especially with the op funcs). So I would prefer if we left this out of this PR, as I want to mull it over myself.

Generally I've tried to make error situations better. I've shortened the test method names and pass func/op names to test cases (so I don't have to pass the func obj and x_filter, which clutters error messages). With the readable parameter ids it's not the worst situation currently, and could indeed be enough, e.g.:

______________ test_func_promotion[equal(bool, bool) -> bool] ______________

func_name = 'equal', in_dtypes = (dtype('bool'), dtype('bool'))
out_dtype = dtype('bool')
...
E   AssertionError: out.dtype=float32, but should be bool
...
========================== short test summary info ===========================
FAILED test_type_promotion.py::test_func_promotion[equal(bool, bool) -> bool] AssertionError: out.dtype=float32, but should be bool
...

@honno honno mentioned this pull request Oct 19, 2021
@asmeurer
Member

Also I believe there's no consistent way of knowing what a library's default int/float is?

You could use asarray(0.0).dtype. See https://data-apis.org/array-api/latest/API_specification/creation_functions.html#asarray-obj-dtype-none-device-none-copy-none
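For example (a sketch, using NumPy only to demonstrate the idea):

```python
import numpy as np

# Infer a library's default dtypes by seeing what asarray() assigns to
# Python scalars, per the Array API creation-functions spec.
default_float = np.asarray(0.0).dtype
default_int = np.asarray(0).dtype
default_bool = np.asarray(False).dtype

print(default_float)  # float64 on NumPy
```

(The default integer dtype is platform-dependent on NumPy, so only the float default is asserted here.)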

I see that adding the func names to assertions could be useful. Additionally it could be useful to have the dtypes involved, à la the parameter ids. How the func names should fit in an error message requires some thought, as they can end up making the messages harder to parse for a user (especially with the op funcs). So I would prefer if we left this out of this PR, as I want to mull it over myself.

Well I wish pytest would surface the assertion error messages a little better. Right now, they are sort of presented in a summary (but they are truncated). I think it was designed around most people not bothering putting messages in their assertions. But I think it's really useful for a test suite designed to be used by other people. It also has the advantage that hypothesis will treat assertions with separate messages as distinct failures.

@asmeurer
Member

I want to merge this. It's making the other PR hard to review. We can fix any issues in a later PR.

@asmeurer asmeurer merged commit bb527e6 into data-apis:master Oct 19, 2021
@honno honno deleted the type-promotion-refactor branch October 21, 2021 11:07