MAINT Remove `-Wcpp` warnings when compiling `sklearn.manifold._utils` #24925

rprkh · 2022-11-15T01:59:23Z

Reference Issues/PRs

What does this implement/fix? Explain your changes.

Using memory views in place of the deprecated cnp.ndarray.

Any other comments?

glemaitre · 2022-11-15T20:30:57Z

sklearn/manifold/_utils.pyx

+def _binary_search_perplexity(
+        cnp.float32_t[:, :] sqdistances,


Suggested change

-def _binary_search_perplexity(
-        cnp.float32_t[:, :] sqdistances,
+cpdef float[:, :] _binary_search_perplexity(
+        const float[:, :] sqdistances,

Since this function is never called from Cython, only from Python, it's better to make it a pure `def`.

glemaitre · 2022-11-15T20:32:00Z

sklearn/manifold/_utils.pyx

@@ -63,7 +63,7 @@ cpdef cnp.ndarray[cnp.float32_t, ndim=2] _binary_search_perplexity(

    # This array is later used as a 32bit array. It has multiple intermediate
    # floating point additions that benefit from the extra precision
-    cdef cnp.ndarray[cnp.float64_t, ndim=2] P = np.zeros(
+    cdef cnp.float64_t[:, :] P = np.zeros(
        (n_samples, n_neighbors), dtype=np.float64)

    for i in range(n_samples):


You need to change the return line to:

return P.astype(np.float32)

This is not the current behavior. Currently it returns a float64 array (this is because it's using the def part of the function which doesn't care about the return type)

I'd rather keep the current behavior

rprkh · 2022-11-16T07:42:31Z

Included the suggested changes.

Edit: Some tests are still failing as the memoryviewslice has no ravel attribute. I tried looking for solutions on SO and going through some of the code, but I'm not entirely sure what the fix should be.
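For context, here is a rough pure-Python analogue of that error. It is only a sketch: Python's built-in `memoryview` stands in for Cython's typed memoryview (an assumption made for illustration), and the variable names are hypothetical. A view over a buffer exposes no ndarray methods such as `ravel`, while wrapping it with `np.asarray` gives an ndarray back.

```python
import numpy as np

# Stand-in for a Cython typed memoryview (assumption): the built-in
# memoryview also exposes the buffer without any ndarray methods.
P_view = memoryview(np.zeros((2, 3), dtype=np.float64))
has_ravel = hasattr(P_view, "ravel")

# Wrapping the view yields an ndarray, so ndarray methods work again.
P_array = np.asarray(P_view)
flat = P_array.ravel()
```

Callers that expect an ndarray then keep working, provided the function returns the wrapped array rather than the raw view.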

jjerphan

Thank you, @rprkh.

Apart from a few comments regarding dtype preservation, everything LGTM.

jjerphan · 2022-11-21T09:42:35Z

sklearn/manifold/_utils.pyx

@@ -118,4 +116,4 @@ cpdef cnp.ndarray[cnp.float32_t, ndim=2] _binary_search_perplexity(
    if verbose:
        print("[t-SNE] Mean sigma: %f"
              % np.mean(math.sqrt(n_samples / beta_sum)))
-    return P
+    return np.array(P, dtype=np.float32)


I think preserving dtype might better be treated in another PR.

For now, I propose just returning the numpy array using the base attribute of the memoryview.

Suggested change

-    return np.array(P, dtype=np.float32)
+    return P.base

It was discussed somewhere that np.asarray should be preferred because .base can point to a larger array than expected (we can have a view on only a subset of the array)
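A small sketch of that concern, using plain NumPy arrays as a stand-in for a Cython memoryview (an assumption; the names `full` and `subset` are made up for illustration). When a view covers only part of a buffer, `.base` hands back the whole underlying array, whereas `np.asarray` keeps the shape of the view.

```python
import numpy as np

full = np.zeros((4, 3), dtype=np.float64)  # owns its data
subset = full[:2]                          # view on the first two rows only

via_base = subset.base            # the array the view was taken from
via_asarray = np.asarray(subset)  # same shape as the view itself
```

Here `via_base` has shape `(4, 3)` and `via_asarray` has shape `(2, 3)`, so only the latter matches what the caller asked for.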

rprkh · 2022-11-22T03:44:31Z

Included the suggestions.

rprkh · 2022-11-22T03:57:23Z

sklearn/manifold/_utils.pyx

+    cdef double[:, :] P = np.zeros(
        (n_samples, n_neighbors), dtype=np.float64)


I think the failing tests are related to the dtype of P over here.

jjerphan

I think @rprkh is right in https://github.com/scikit-learn/scikit-learn/pull/24925/files#r1028804071: P is allocated with float64 but returned (and implicitly cast to float32).

This is the first time I have stumbled upon this behavior in Cython.

For now, I would simply cast it explicitly using what @glemaitre proposed earlier, and add a TODO comment indicating that this function should support both float32 and float64 and preserve the inputs' dtype.
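To illustrate why accumulating in float64 and casting once at the end can matter, here is a minimal NumPy sketch. It is not the code from `_utils.pyx`; the values and variable names are invented for the example. float32 has a 24-bit significand, so a small increment added to a large float32 value can be lost entirely.

```python
import numpy as np

# In float32, adding 1 to 2**24 is rounded away; float64 keeps it.
big = 2.0 ** 24
sum32 = np.float32(big) + np.float32(1.0)
sum64 = np.float64(big) + np.float64(1.0)

# Accumulate in float64, then cast explicitly once on the way out.
P = np.zeros((2, 3), dtype=np.float64)
P += 0.1
P_float32 = P.astype(np.float32)
```

The explicit `astype` makes the narrowing visible at the return site instead of leaving it to an implicit conversion.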

jjerphan

LGTM given that the CI passes.

jeremiedbb

LGTM. Thanks @rprkh

rprkh · 2022-11-30T19:06:38Z

This was my first time working with Cython. Thanks for the reviews and suggestions @glemaitre @jjerphan @jeremiedbb. I look forward to contributing more in the future.

jjerphan · 2022-11-30T19:12:46Z

Glad to read that!

Feel free to pick other similar issues or to ask if you want to work on something more involved.

remove wcpp warnings

8b33492

github-actions bot added module:manifold cython labels Nov 15, 2022

glemaitre reviewed Nov 15, 2022

include changes

064fbff

remove astype from return P

2a43a8f

jjerphan reviewed Nov 21, 2022

return base attribute of memoryview

6505d3d

rprkh force-pushed the remove_wcpp_warnings branch from 70e5412 to 6505d3d Compare November 21, 2022 17:49

rprkh commented Nov 22, 2022

jjerphan reviewed Nov 22, 2022

include TODO comment

08d3661

jjerphan approved these changes Nov 22, 2022

jjerphan mentioned this pull request Nov 30, 2022

MAINT Remove all Cython, C and C++ compilations warnings #24875

Closed

22 tasks

jeremiedbb added 2 commits November 30, 2022 16:50

keep previous behavior

3314cc9

Merge remote-tracking branch 'upstream/main' into pr/rprkh/24925

32206a6

jeremiedbb approved these changes Nov 30, 2022

jjerphan merged commit f2f3b3c into scikit-learn:main Nov 30, 2022

rprkh deleted the remove_wcpp_warnings branch November 30, 2022 19:07

jjerphan pushed a commit to jjerphan/scikit-learn that referenced this pull request Jan 20, 2023

MAINT Remove -Wcpp warnings when compiling sklearn.manifold._utils (scikit-learn#24925) Co-authored-by: jeremie du boisberranger <jeremiedbb@yahoo.fr>

d14118f

jjerphan pushed a commit to jjerphan/scikit-learn that referenced this pull request Jan 20, 2023

MAINT Remove -Wcpp warnings when compiling sklearn.manifold._utils (scikit-learn#24925) Co-authored-by: jeremie du boisberranger <jeremiedbb@yahoo.fr>

ad380dd
