Avoid extra copy when using astype in sparsefuncs_fast #11966

massich · 2018-09-01T13:27:46Z

Reference Issues/PRs

What does this implement/fix? Explain your changes.

It avoids getting an extra copy of the data in the following case:

X = np.empty([10,10], dtype=np.float64)
X = X.astype(np.float64)
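
As a minimal sketch of why the snippet above copies, and of the two usual ways to avoid it: NumPy's `ndarray.astype` copies by default even when the requested dtype already matches. The helper name `as_float64` below is hypothetical and is not the code changed in this PR; it only illustrates the "check the type and do nothing" idea.

```python
import numpy as np

X = np.empty([10, 10], dtype=np.float64)

# Default astype: allocates a new array even though the dtype already matches.
Y = X.astype(np.float64)
copied_by_default = not np.shares_memory(X, Y)

# Option 1: ask astype not to copy when no conversion is required.
no_copy = X.astype(np.float64, copy=False)
returned_input = no_copy is X


# Option 2 (hypothetical helper): check the dtype and convert only if needed.
def as_float64(arr):
    if arr.dtype != np.float64:
        arr = arr.astype(np.float64)
    return arr


Z = as_float64(X)                              # dtype matches: X is returned as-is
W = as_float64(np.ones(3, dtype=np.float32))   # dtype differs: converted copy
```

Either option leaves arrays that are already float64 untouched and still converts everything else.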

Files to review:

Any other comments?

rth

Thanks @massich !

Waiting for CI..

massich · 2018-09-01T15:13:12Z

Actually, the list of files can be ignored for now; we can handle it in a subsequent PR.

rth · 2018-09-01T22:17:17Z

For a random sparse CSR array of shape (5000, 40000) with a density of 0.01, this reduces the runtime of csr_row_norms by ~50%, which is pretty nice (I have not checked the other methods). Thanks!

lesteve · 2018-09-02T08:15:29Z

I timed the modified functions following @rth's comment; here are the results:

import numpy as np
from scipy.sparse import random
from sklearn.utils.sparsefuncs_fast import (
    csr_row_norms, csr_mean_variance_axis0,
    csc_mean_variance_axis0, incr_mean_variance_axis0)

csr = random(5000, 40000, format='csr')
csc = csr.asformat('csc')

print('csr_row_norms')
%timeit csr_row_norms(csr)

print('csr_mean_variance_axis0')
%timeit csr_mean_variance_axis0(csr)

print('csc_mean_variance_axis0')
%timeit csc_mean_variance_axis0(csc)

print('incr_mean_variance_axis0')
%timeit incr_mean_variance_axis0(csr, np.zeros(40000), np.ones(40000), np.array([1]))

Master:

csr_row_norms
6.7 ms ± 140 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
csr_mean_variance_axis0
21.2 ms ± 193 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
csc_mean_variance_axis0
15.3 ms ± 150 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
incr_mean_variance_axis0
22 ms ± 151 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

This PR:

csr_row_norms
2.05 ms ± 14.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
csr_mean_variance_axis0
17.1 ms ± 411 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
csc_mean_variance_axis0
10.8 ms ± 244 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
incr_mean_variance_axis0
17.9 ms ± 129 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
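
To make the comparison easier to read, the relative runtime reduction can be worked out from the mean timings above. This is only arithmetic on the reported numbers (means only, ignoring the standard deviations), not a new measurement.

```python
# Mean runtimes in milliseconds, copied from the timings above.
master = {
    "csr_row_norms": 6.7,
    "csr_mean_variance_axis0": 21.2,
    "csc_mean_variance_axis0": 15.3,
    "incr_mean_variance_axis0": 22.0,
}
this_pr = {
    "csr_row_norms": 2.05,
    "csr_mean_variance_axis0": 17.1,
    "csc_mean_variance_axis0": 10.8,
    "incr_mean_variance_axis0": 17.9,
}

# Percentage of the master runtime that the PR removes, per function.
reduction = {
    name: 100 * (master[name] - this_pr[name]) / master[name]
    for name in master
}

for name, pct in reduction.items():
    print(f"{name}: {pct:.0f}% less runtime")
```

On these numbers csr_row_norms drops by roughly 69%, csc_mean_variance_axis0 by roughly 29%, and the other two functions by roughly 19% each.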

lesteve · 2018-09-02T08:18:01Z

Merging, thanks a lot @massich and nice tracking down of something that should have failed but did not in #11964 @rth!

Avoid extra copy if not needed

1837b82

rth approved these changes Sep 1, 2018

rth mentioned this pull request Sep 1, 2018

WIP: Migrate to Cython memoryviews in sklearn.utils #11964

Closed

don't use copy parameter, just check type and do nothing

2afb1c6

rth approved these changes Sep 1, 2018

rth changed the title ~~Avoid extra copy when using astype~~ Avoid extra copy when using astype in sparsefuncs_fast Sep 1, 2018

lesteve merged commit 51b1b7c into scikit-learn:master Sep 2, 2018

rth mentioned this pull request Sep 2, 2018

Avoid copies in array.astype #11970

Closed

jnothman pushed a commit to jnothman/scikit-learn that referenced this pull request Sep 2, 2018

Remove unnecessary copy for float64 in sparsefuncs_fast (scikit-learn…

48fd16d

…#11966)

jnothman pushed a commit to jnothman/scikit-learn that referenced this pull request Sep 17, 2018

Remove unnecessary copy for float64 in sparsefuncs_fast (scikit-learn…

a7a8814

…#11966)

OGordon100 mentioned this pull request Mar 5, 2019

ENH: Allow inplace copying in place in "detrend" function scipy/scipy#9792

Merged
