Avoid copies in array.astype #11970

Closed
@rth

By default the following code, for an array X,

X = X.astype('float64')

will trigger a memory copy even when X is already of dtype float64, which can have a significant impact on runtime performance (cf. e.g. #11966 (comment)).
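To make the copy visible, here is a minimal sketch using plain NumPy. The array contents and the variable names (`X_default`, `X_nocopy`, and so on) are only illustrative and do not come from the scikit-learn code base.

```python
import numpy as np

X = np.ones((3, 2), dtype=np.float64)

# Default behaviour: astype allocates a new array even though the dtype matches.
X_default = X.astype(np.float64)

# With copy=False, NumPy hands back the input array when no cast is needed.
X_nocopy = X.astype(np.float64, copy=False)

# copy=False does not prevent a copy when the dtype really has to change.
X_float32 = X.astype(np.float32, copy=False)

default_shares = np.shares_memory(X, X_default)
nocopy_is_same = X_nocopy is X
cast_shares = np.shares_memory(X, X_float32)
```

Here `default_shares` and `cast_shares` are False while `nocopy_is_same` is True, so `copy=False` only removes the redundant copy and never changes the result of a real cast.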

In #11966 @massich made a list of places where this may be relevant.

Overall, when we potentially cast an array to the dtype it already has,

  • for dense arrays we should probably always use X.astype(dtype, copy=False), unless a copy is explicitly needed.
  • for sparse arrays, the copy argument was unfortunately added to astype only a year ago, so we may have to do something along the lines of,
    if sp_version >= (1, 1) or not sparse.issparse(X):
        copy_args = {'copy': False}
    else:
        copy_args = {}
    X.astype(dtype, **copy_args)
    or possibly factorize that in utils/fixes.py
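A factorized helper could look roughly like the sketch below. The name `astype_no_copy` and the `SP_VERSION` constant are hypothetical (the latter stands in for `sp_version` above) and this is not an existing function in utils/fixes.py. SciPy is treated as optional here so that the dense path works on its own.

```python
import numpy as np

try:
    import scipy
    from scipy import sparse
    # Hypothetical stand-in for sp_version: (major, minor) of the installed SciPy.
    SP_VERSION = tuple(int(part) for part in scipy.__version__.split(".")[:2])
except ImportError:
    sparse = None
    SP_VERSION = (0, 0)


def astype_no_copy(X, dtype):
    """Cast X to dtype, avoiding a copy whenever the input type allows it.

    Hypothetical helper, for illustration only.
    """
    is_sparse = sparse is not None and sparse.issparse(X)
    if not is_sparse or SP_VERSION >= (1, 1):
        # Dense arrays, and sparse matrices on recent SciPy, accept copy=False.
        return X.astype(dtype, copy=False)
    # Older sparse matrices have no copy argument, so fall back to a plain cast.
    return X.astype(dtype)


X_dense = np.arange(4, dtype=np.float64)
same = astype_no_copy(X_dense, np.float64)
converted = astype_no_copy(X_dense, np.float32)
```

Call sites would then use `X = astype_no_copy(X, dtype)` instead of repeating the version check.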

#11966 addressed a part of this.
