Issues with "Mixing arrays and Python scalars" section #98


Closed
asmeurer opened this issue Dec 5, 2020 · 14 comments
Labels: topic: Type Promotion


asmeurer commented Dec 5, 2020

A few issues with the "Mixing arrays and Python scalars" section that came up from testing:

  • The matmul operator (@) does not (and should not) support scalar operands, so it should be excluded (see the snippet after this list).
  • Should boolean arrays be required to work with arithmetic operators? NumPy gives errors in some cases, for example:
>>> import numpy as np
>>> np.array(False) - False
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: numpy boolean subtract, the `-` operator, is not supported, use the bitwise_xor, the `^` operator, or the logical_xor function instead. 
  • How should casting work? For example, int8 array + large Python int. Should it cast the int, or give an error? Or should this be unspecified?
  • The above question also applies to Python float -> float32 casting, which is implicitly supported by the text that is currently there, but is unsafe in general (is there a "safe" variant of float64 -> float32, or is it always unsafe? Either way, this would be value-based casting, which I thought we were trying to avoid).
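
For reference, here is what NumPy does today when @ receives a Python scalar (the exact error message may vary by NumPy version):

>>> import numpy as np
>>> np.ones((2, 2)) @ 2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: matmul: Input operand 1 does not have enough dimensions ...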

asmeurer commented Dec 5, 2020

Actually, I think - is the only operator that NumPy doesn't allow for booleans. But there's also the reverse: bitwise operators are not supported for floats. So I think there should be some text like "for operations that are supported by the given dtypes", and the operators text should specify that (I guess the bitwise ones already do, but there's nothing in __sub__ about boolean support or non-support).
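
For example (with NumPy; the error message is abbreviated):

>>> np.array(True) + True   # boolean + is allowed (acts as logical or)
True
>>> np.array(1.0) & np.array(2.0)   # but bitwise operators reject floats
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: ufunc 'bitwise_and' not supported for the input types ...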


asmeurer commented Dec 5, 2020

IMHO, the spec should split the operators into four classes:

  • Arithmetic operators (+, -, *, /, etc. (including unary + and -))
  • Boolean operators (~, ^, |, etc.)
  • Comparison operators (==, <=, ...)
  • Array operators (@ is the only one)

Arithmetic operators should be defined for integer and floating point dtypes, but not required for the boolean dtype. Boolean operators should be defined for integer (bitwise) and boolean dtypes, but are not required for floating point dtypes. Comparison operators are defined for all arrays regardless of dtype. The array operator (matmul) is defined only for integer and floating point dtypes, and also has additional shape constraints (the input arrays must have at least 1 (2?) dimensions, which align in a certain way).

Also, arithmetic, boolean, and array operators participate in type promotion. Comparison operators may error on invalid type combinations (which aren't specified anyway), but always return an array with a bool dtype.
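
For concreteness, here is a rough sketch of how the four classes might map onto Python's dunder methods (the grouping is illustrative only, not spec text):

# Illustrative grouping; membership is a sketch, not spec text.
ARITHMETIC = {"__add__", "__sub__", "__mul__", "__truediv__", "__floordiv__",
              "__mod__", "__pow__", "__pos__", "__neg__"}
BOOLEAN = {"__invert__", "__and__", "__or__", "__xor__",
           "__lshift__", "__rshift__"}
COMPARISON = {"__eq__", "__ne__", "__lt__", "__le__", "__gt__", "__ge__"}
ARRAY = {"__matmul__"}

# Required dtype support under the proposal:
#   ARITHMETIC -> integer and floating-point (boolean optional)
#   BOOLEAN    -> integer and boolean (floating-point optional)
#   COMPARISON -> all dtypes
#   ARRAY      -> integer and floating-point, plus shape constraints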

But I may also be missing or forgetting prior discussions that happened with this.


asmeurer commented Dec 5, 2020

A similar split could be made for the 2-argument array functions.
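
For example (function names as in NumPy; the grouping is a sketch of the idea, not spec text):

FUNCTION_CLASSES = {
    "arithmetic": ["add", "subtract", "multiply", "divide", "power"],
    "boolean":    ["bitwise_and", "bitwise_or", "bitwise_xor"],
    "comparison": ["equal", "not_equal", "less", "greater"],
    "array":      ["matmul"],
}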


asmeurer commented Dec 5, 2020

Also, it looks like NumPy doesn't do what the spec says for float scalars. The spec says:

The expected behavior is then equivalent to:

  1. Convert the scalar to a 0-D array with the same dtype as that of the array used in the expression.
  2. Execute the operation for array <op> 0-D array (or 0-D array <op> array if scalar was the left-hand argument).
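
In code, the quoted rule is roughly equivalent to the following minimal sketch (scalar_binop is a hypothetical helper, not spec API):

import operator
import numpy as np

def scalar_binop(x, scalar, op=operator.mul):
    # Step 1: convert the scalar to a 0-D array with the array's dtype.
    s = np.asarray(scalar, dtype=x.dtype)
    # Step 2: execute the operation as array <op> 0-D array.
    return op(x, s)

# Under this rule, float32 is preserved:
print(scalar_binop(np.array(0.0, dtype=np.float32), 0.0).dtype)  # float32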

But NumPy apparently converts float scalars into float64:

>>> (np.array(0.0, dtype=np.float32) * 0.0).dtype
dtype('float64')
>>> (np.array(0.0, dtype=np.float32) * np.array(0.0, dtype=np.float32)).dtype
dtype('float32')

Is it intentional to give a different behavior from NumPy here (I didn't check the other array libraries yet)?

asmeurer added a commit to data-apis/array-api-tests that referenced this issue Dec 5, 2020
This is based on what the spec currently says, but there are some issues with
it (see data-apis/array-api#98).

shoyer commented Dec 17, 2020

Is it intentional to give a different behavior from NumPy here (I didn't check the other array libraries yet)?

Yes, this is an intentional deviation, copied from libraries such as PyTorch and JAX. On GPU/TPU it is important to avoid inadvertent type promotion to float64.
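
For comparison, PyTorch keeps the array's dtype in this case (JAX behaves similarly):

>>> import torch
>>> (torch.ones(2, dtype=torch.float32) * 0.0).dtype
torch.float32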

@asmeurer

The NumPy behavior is also apparently only that way for shape () (i.e., 0-D) arrays:

>>> (np.array(0.0, dtype=np.float32) * 0.0).dtype
dtype('float64')
>>> (np.array([0.0], dtype=np.float32) * 0.0).dtype
dtype('float32')

So we should consider the NumPy promotion to float64 as incorrect (I intended to raise this issue on the NumPy tracker but didn't get around to it yet).


shoyer commented Dec 17, 2020

So we should consider the NumPy promotion to float64 as incorrect (I intended to raise this issue on the NumPy tracker but didn't get around to it yet).

NumPy's type promotion rules treat 0-D arrays the same as scalars:

>>> (np.float32(0.0) * 0.0).dtype
dtype('float64')

@asmeurer

Yeah, the behavior is the same whether it's an "array scalar" or "scalar constant" (or whatever NumPy calls them). But we don't make any such distinction in the spec, and don't special-case 0-D arrays as far as type promotion is concerned:

Type promotion rules must apply when determining the common result type for two array operands during an arithmetic operation, regardless of array dimension. Accordingly, zero-dimensional arrays must be subject to the same type promotion rules as dimensional arrays.


shoyer commented Dec 17, 2020 via email

@rgommers

IMHO, the spec should split the operators into four classes:

  • Arithmetic operators (+, -, *, /, etc. (including unary + and -))
  • Boolean operators (~, ^, |, etc.)
  • Comparison operators (==, <=, ...)
  • Array operators (@ is the only one)

Arithmetic operators should be defined for integer and floating point dtypes, but not required for the boolean dtype. Boolean operators should be defined for integer (bitwise) and boolean dtypes, but are not required for floating point dtypes. Comparison operators are defined for all arrays regardless of dtype. The array operator (matmul) is defined only for integer and floating point dtypes, and also has additional shape constraints (the input arrays must have at least 1 (2?) dimensions, which align in a certain way).

I agree with all of this; it seems like that would help make the operator description and behaviour clearer.

Also, arithmetic, boolean, and array operators participate in type promotion. Comparison operators may error on invalid type combinations (which aren't specified anyway), but always return an array with a bool dtype.

I would not mix this into the type promotion section; that would make it harder to understand for little gain. The individual functions already describe this, and I don't think anyone will be surprised by, for example, equal(int_array, int_array) -> bool_array.

@asmeurer

Yes, as far as organization in the spec document, I think the above distinctions should go on the page about operators, not the page about type promotion (or somewhere more general if we also apply it to functions). The type promotion page describes those operator classes that participate in type promotion, but doesn't talk about the other ones.


rgommers commented Dec 19, 2020

How should casting work? For example, int8 array + large Python int. Should it cast the int, or give an error. Or should this be unspecified.

Here's how libraries currently behave, testing an int32 array added to Python ints (three results per library: small, too large for 32-bit, too large for 64-bit).

Code:

import numpy as np
import dask.array as da
import torch
import tensorflow as tf
import jax.numpy as jnp
import mxnet
try:
    import cupy as cp
except ImportError:
    # CuPy is GPU-only, so may not be available
    cp = None


def ones(mod):
    # Create a (3, 2)-shaped array of int32 1's
    if mod in (da, mxnet.nd):
        x = mod.ones((3, 2), dtype=np.int32)  # dask and MXNet take NumPy dtypes
    else:
        x = mod.ones((3, 2), dtype=mod.int32)

    return x


def get_firstelem(x, mod):
    # Get element (0, 0) from 2-D arrays
    if isinstance(x, da.Array):
        x = x.compute()

    if mod == mxnet.nd:
        x0 = int(x[0, 0].asscalar())  # MXNet doesn't return a scalar
    else:
        x0 = int(x[0, 0])

    return x0



libraries = {
    'numpy': np,
    'pytorch': torch,
    'MXNet': mxnet.nd,
    'dask': da,
    'tensorflow': tf,
    'jax': jnp,
}

if cp is not None:
    libraries['cupy'] = cp


results = {}


int_small = 1
int_above32bit = int(np.iinfo(np.int32).max) + 10
int_above64bit = int(np.iinfo(np.int64).max) + 10
int_vals = [int_small, int_above32bit, int_above64bit]
for name, mod in libraries.items():
    res = []
    for val in int_vals:
        x = ones(mod)
        try:
            tmp = val + x
            out = (tmp.dtype, get_firstelem(tmp, mod))
        except Exception as e:
            out = e

        res.append(out)

    results[name] = res


for key, val in results.items():
    print('')
    print(key + ':\n' + (len(key) + 1) * '-')
    for v in val:
        print(v)

Results:

numpy:
------
(dtype('int32'), 2)
(dtype('int64'), 2147483658)
(dtype('float64'), 9223372036854775808)

pytorch:
--------
(torch.int32, 2)
(torch.int32, -2147483638)
Overflow when unpacking long

MXNet:
------
(<class 'numpy.int32'>, 2)
(<class 'numpy.int32'>, -2147483647)
(<class 'numpy.int32'>, -2147483647)

dask:
-----
(dtype('int32'), 2)
(dtype('int64'), 2147483658)
(dtype('float64'), 9223372036854775808)

tensorflow:
-----------
(tf.int32, 2)
(tf.int32, -2147483638)
Can't convert Python sequence with out-of-range integer to Tensor.

jax:
----
(dtype('int32'), 2)
(dtype('int32'), -2147483638)
Python int too large to convert to C long

cupy:
-----
(dtype('int32'), 2)
(dtype('int64'), 2147483658)
(dtype('int64'), -9223372036854775798)

  • NumPy does value-based casting.
  • MXNet consistently returns int32 (with overflows).
  • PyTorch, TensorFlow, and JAX mix overflowing int32 with raising an error, depending on whether the Python int fits into int64.
  • CuPy upcasts to int64 if the Python int fits into int64, and overflows beyond that.
  • Dask behaves like NumPy here, but that's inherited behaviour - for a Dask array of CuPy arrays it would behave like CuPy.

So this is a mess. I personally think NumPy and MXNet are behaving the most consistently here; what CuPy does is also not unreasonable, and what PyTorch/TensorFlow/JAX do looks bad.

The above question also applies to Python float -> float32 casting, which is implicitly supported by the text that is currently there, but is unsafe in general (is there a "safe" variant of float64 -> float32, or is it always unsafe? Either way, this would be value-based casting, which I thought we were trying to avoid).

Yes indeed. Here there are only two flavors: upcasting to float64, or keeping float32, in which case the result will be positive or negative inf:

>>> float_above32bit
5.104235199577933e+38

>>> r = torch.ones(2, dtype=torch.float32) + float_above32bit
>>> r.dtype
torch.float32
>>> r
tensor([inf, inf])

>>> r = np.ones(2, dtype=np.float32) + float_above32bit
>>> r.dtype
dtype('float64')
>>> r
array([5.1042352e+38, 5.1042352e+38])

Only NumPy and CuPy upcast; MXNet, TensorFlow, JAX and PyTorch all keep float32 and turn the Python float into inf.

In both the int and float cases, I think we have little choice but to simply state in the spec that the casting behaviour will be implementation-dependent.
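
Given that, code that needs deterministic behaviour across libraries can cast the Python scalar explicitly before the operation; a minimal sketch with NumPy:

>>> x = np.ones(2, dtype=np.float32)
>>> x + np.float32(float_above32bit)  # explicit cast overflows to inf but stays float32
array([inf, inf], dtype=float32)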


shoyer commented Dec 19, 2020

Signed integer overflow is technically undefined behavior in C/C++, and that seems to work out OK. It's not great, but I think these inconsistencies are mostly an indication that nobody expects Python's big ints to do something sensible when combined with fixed-size arrays.


kgryte commented Nov 4, 2021

Wrt the OP:

  1. Disallowing scalar operands when using the matmul operator was addressed in #307.
  2. Whether boolean arrays should be allowed in arithmetic operations has been addressed. The spec currently includes guidance that operands should have numeric data types.
  3. Casting and overflow behavior is left unspecified and implementation-dependent.
  4. Same as (3).

Grouping operators into separate classes was addressed in #308.

As the concerns raised in this issue have been resolved, I will close this out. Any further concerns and discussion can be raised in a new issue.

kgryte closed this as completed Nov 4, 2021