[MRG-1] ENH: Allow float32 to pass through with copy #5932

jseabold · 2015-11-28T17:14:09Z

This is the easy path to fixing this.

I was a bit surprised to see the float64 default and the copies elsewhere in the sparse functions. Are there rules for consistency as to float32 vs. float64 in the Cython code?

GaelVaroquaux · 2015-11-29T10:43:39Z

In general the code looks good, but I would like to know the exact problem that you are trying to solve. Ideally, this problem should be illustrated by a test that fails without your patch and passes with it. The idea is that the test will guide later refactoring.

In the long run, we want to tackle the issue of multiple types in Cython using fused types.

jseabold · 2015-11-29T15:35:32Z

I found that computing cosine distance fails on 32-bit floating point sparse arrays. Added regression tests.

I added some clarifying documentation on assign_rows_csr, but perhaps this should fail, given that it's supposed to be no-copy?

Fused types would be nice.

MechCoder · 2015-12-01T22:05:10Z

This fix should work, but I'm afraid it is equivalent to converting the dtype of the sparse matrix beforehand and then passing it to the function, since it produces a copy.

It would be great if you could add fused type support so that we can do this without copying! Just for this function would be a great start.

springcoil · 2016-02-11T10:34:48Z

I had a quick look at this today. The code looks fine to me, and the fix works. I would recommend merging it anyway, but fused types still need to be done too.

ogrisel · 2016-02-16T14:54:22Z

sklearn/utils/sparsefuncs_fast.pyx

-        np.ndarray[DOUBLE, ndim=1] data = X.data
+        # might copy
+        np.ndarray[DOUBLE, ndim=1] data = np.asarray(X.data,
+                                                     dtype=np.float64)


This change to assign_rows_csr is not tested by the new regression test.
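To illustrate why the `# might copy` comment matters, here is a minimal NumPy-only sketch (not code from the PR; the array names are made up for illustration). It shows that `np.asarray(..., dtype=np.float64)` returns the original array when the dtype already matches, but allocates a new array for float32 input, so later in-place writes never reach the caller's data.

```python
import numpy as np

# Hypothetical stand-ins for X.data of a float32 and a float64 CSR matrix.
data32 = np.array([3.0, 4.0], dtype=np.float32)
data64 = np.array([3.0, 4.0], dtype=np.float64)

as64_from32 = np.asarray(data32, dtype=np.float64)  # dtype differs: new array
as64_from64 = np.asarray(data64, dtype=np.float64)  # dtype matches: same object

copied = not np.shares_memory(as64_from32, data32)
aliased = as64_from64 is data64

# In-place scaling only reaches the caller's buffer in the float64 case.
as64_from32 *= 0.5
as64_from64 *= 0.5
```

After this runs, `data64` has been halved while `data32` is unchanged, which is the behavior difference a test for assign_rows_csr would need to account for.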

ogrisel · 2016-02-16T15:01:38Z

inplace_csr_row_normalize_l1 and inplace_csr_row_normalize_l2 are supposed to be inplace operations, therefore they cannot by design make a copy of the input data.

The only correct fix for those 2 functions is to use cython fused types for np.float32 and np.float64 (or directly cython.floating that should work for this case AFAIK).

Therefore I am -1 on this part of the PR.

+0 for the change to assign_rows_csr if a test is added, but I think it would be worth using fused types directly for this case as well.
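As a rough illustration of the in-place contract described above, the following pure-NumPy sketch normalizes CSR rows directly in the caller's buffer while preserving its dtype. It is not the scikit-learn implementation: the function name `inplace_csr_row_normalize_l2_sketch` and the sample arrays are hypothetical, and a Cython version with fused types would presumably get the same dtype-generic behavior at compile time rather than through NumPy views.

```python
import numpy as np


def inplace_csr_row_normalize_l2_sketch(data, indptr):
    """Scale each CSR row stored in `data` to unit L2 norm, in place.

    Hypothetical helper for illustration only. `data` and `indptr` play
    the role of the `.data` and `.indptr` attributes of a CSR matrix.
    """
    for i in range(len(indptr) - 1):
        start, end = indptr[i], indptr[i + 1]
        row = data[start:end]              # a view into `data`, not a copy
        norm = np.sqrt(np.dot(row, row))
        if norm > 0:                       # skip empty or all-zero rows
            row /= norm                    # writes through to `data`


# Two rows, [3, 4] and [5], stored as float32 CSR values.
data = np.array([3.0, 4.0, 5.0], dtype=np.float32)
indptr = np.array([0, 2, 3])
inplace_csr_row_normalize_l2_sketch(data, indptr)
```

Because no dtype conversion happens, the float32 buffer itself is modified and stays float32, which is what an "inplace" function promises its caller.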

GaelVaroquaux · 2016-02-16T15:11:04Z

> inplace_csr_row_normalize_l1 and inplace_csr_row_normalize_l2 are
> supposed to be inplace operations, therefore they cannot by design make
> a copy of the input data.
>
> Therefore I am -1 on this part of the PR.

Agreed with the analysis. Sorry for my previous 👍, it would have been
an error to have "in-place" functions copy data.

> The only correct fix for those 2 functions is to use cython fused types for
> np.float32 and np.float64 (or directly cython.floating that should work for
> this case AFAIK).

That would be good.

jnothman · 2016-04-26T01:42:32Z

Can we close this as the incorrect patch, while waiting for a fix as part of @yenchenlin1994's GSoC?

jnothman · 2016-04-26T01:42:38Z

That's what I'm doing anyway...

GaelVaroquaux changed the title ~~ENH: Allow float32 to pass through with copy~~ [MRG+1] ENH: Allow float32 to pass through with copy Nov 29, 2015

jseabold added 3 commits November 29, 2015 09:28

TST: Regression test for 32-bit input

f1b7a11

ENH: Allow float32 to pass through with copy

6a4fa62

DOC: Clarify doc for copy behavior

fe6b225

jseabold force-pushed the fastsparse-float32 branch from bbd2675 to fe6b225 Compare November 29, 2015 15:35

amueller mentioned this pull request Dec 9, 2015

Add fused type to Cython files #5973

Closed

amueller added the Waiting for Reviewer label Dec 10, 2015

ogrisel reviewed Feb 16, 2016

ogrisel changed the title ~~[MRG+1] ENH: Allow float32 to pass through with copy~~ [MRG+1-1] ENH: Allow float32 to pass through with copy Feb 16, 2016

GaelVaroquaux changed the title ~~[MRG+1-1] ENH: Allow float32 to pass through with copy~~ [MRG-1] ENH: Allow float32 to pass through with copy Feb 16, 2016

yenchenlin mentioned this pull request Mar 14, 2016

[MRG+1] Use fused type in inplace normalize #6539

Merged

jnothman closed this Apr 26, 2016
