Skip to content

[MRG] Make subfunctions of sparsefuncs private #6659

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

yenchenlin
Copy link
Contributor

In my previous merged PR #6539 ,
I forgot to use cdef for subfunctions, which may expose these subfunctions as part of the Python-level API.

This PR would fix the issue described above.

ping @jnothman @MechCoder @ogrisel

Thanks, and sorry for my carelessness.

def _inplace_csr_row_normalize_l1(np.ndarray[floating, ndim=1] X_data, shape,
@cython.boundscheck(False)
@cython.wraparound(False)
@cython.cdivision(True)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need these?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Sorry if that is obvious)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are a large part of what makes Cython faster than Python.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without those checks disabled, the expression X[i] (for int[:] X, int i) generates:

  __pyx_t_1 = __pyx_v_i;
  __pyx_t_2 = -1;
  if (__pyx_t_1 < 0) {
    __pyx_t_1 += __pyx_v_X.shape[0];
    if (unlikely(__pyx_t_1 < 0)) __pyx_t_2 = 0;
  } else if (unlikely(__pyx_t_1 >= __pyx_v_X.shape[0])) __pyx_t_2 = 0;
  if (unlikely(__pyx_t_2 != -1)) {
    __Pyx_RaiseBufferIndexError(__pyx_t_2);
    {__pyx_filename = __pyx_f[0]; __pyx_lineno = 2; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
  }
  __pyx_r = (*((int *) ( /* dim=0 */ (__pyx_v_X.data + __pyx_t_1 * __pyx_v_X.strides[0]) )));

With the checks disabled, it's just the last line.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for @jnothman 's explanation, Cool!

@rvraghav93 I add it partially just because original code has these.
But for @cython.cdivision(True), you can see this

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks heaps for the explanation Joel!! As I always say, I learn a lot from your comments!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks heaps for the explanation Joel!! As I always say, I learn a lot from your comments!

You're welcome. The intention with that comment was actually "Cython's not so scary. Write a three line snippet and try it yourself!" I hope it worked.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yenchenlin yenchenlin force-pushed the make-sparse-utils-funtions-private branch 2 times, most recently from 714f34f to 82b66e4 Compare April 14, 2016 12:14
@jnothman
Copy link
Member

LGTM

@jnothman jnothman changed the title Make subfunctions of sparsefuncs private [MRG+1] Make subfunctions of sparsefuncs private Apr 14, 2016
@yenchenlin yenchenlin changed the title [MRG+1] Make subfunctions of sparsefuncs private [WIP] Make subfunctions of sparsefuncs private Apr 14, 2016
@yenchenlin
Copy link
Contributor Author

@jnothman wait it broke ...
It seems that a function can not call a cdef subfunction?

@raghavrv
Copy link
Member

It can break with MRG state. You can rename it back to [MRG+1]. Core devs won't merge unless the tests pass finally!

@cython.boundscheck(False)
@cython.wraparound(False)
@cython.cdivision(True)
cdef _inplace_csr_row_normalize_l1(np.ndarray[floating, ndim=1] X_data, shape,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually, should be cdef void

@yenchenlin yenchenlin changed the title [WIP] Make subfunctions of sparsefuncs private [MRG+1] Make subfunctions of sparsefuncs private Apr 14, 2016
@jnothman
Copy link
Member

Sorry, I was a bit trigger-happy.

@jnothman
Copy link
Member

Btw, "private" isn't really the correct description. The difference between def and cdef lies in whether the function needs to be exposed to Python meaning it needs to accept Python objects as input (and hence handle any argument list parsing and conversion) and return a Python object as output.

@@ -275,13 +275,16 @@ def incr_mean_variance_axis0(X, last_mean, last_var, unsigned long last_n):
@cython.wraparound(False)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These flags are no longer relevant to this function

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though perhaps the flags should be applied at the module level...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jnothman
Did you mean add

#!python
#cython: wraparound=False

in front of any code?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that would prevent rewriting it again and again.

@jnothman jnothman changed the title [MRG+1] Make subfunctions of sparsefuncs private [MRG] Make subfunctions of sparsefuncs private Apr 14, 2016
@yenchenlin yenchenlin force-pushed the make-sparse-utils-funtions-private branch from 82b66e4 to 975b600 Compare April 14, 2016 15:59
@yenchenlin
Copy link
Contributor Author

hmm ... still broke.
I've tried to solve it but I still can't figure out what's wrong 😢

@jnothman any idea?

@raghavrv
Copy link
Member

raghavrv commented Apr 14, 2016

The cdef functions are those which can be called by other cdef and cpdef functions (those which will be compiled). You are attempting to call that from a python function def, which is why the cython compiler is complaining that there is "no suitable method found" (which can be accessed from inside a python function).

You need to declare them using cpdef to make them accessible from both C and python

http://docs.cython.org/src/reference/language_basics.html#functions-and-methods

Apologies for misunderstanding this :@ This is incorrect. Refer Joel's comment instead.

@MechCoder
Copy link
Member

MechCoder commented Apr 14, 2016

The difference between def and cdef lies in whether the function needs to be exposed to Python meaning it needs to accept Python objects as input.

Is that true? From here (http://docs.cython.org/src/reference/language_basics.html#functions-and-methods) it says both def and cdef can accept and return Python objects. I suppose it is just where they can be called from that makes a difference

@jnothman
Copy link
Member

They can accept and return Python objects, but if something is typed as a
non-Python object, it will only perform automatic conversions for def or
cpdef.

cpdef is not the solution here. But I also think we can't get cdef to work.

On 15 April 2016 at 06:51, Manoj Kumar notifications@github.com wrote:

The difference between def and cdef lies in whether the function needs to
be exposed to Python meaning it needs to accept Python objects as input.

Is that true? From here (
http://docs.cython.org/src/reference/language_basics.html#functions-and-methods)
it says both def and cdef can accept and return Python objects. I suppose
it is just where they can be called from that makes a difference


You are receiving this because you were mentioned.
Reply to this email directly or view it on GitHub
#6659 (comment)

@yenchenlin
Copy link
Contributor Author

Hello @rvraghav93 , thanks!
However, motivated by @jnothman , I tried a little script to test it.

Any idea why following code works?
It also call a cdef function inside a python function

def function(int i):
    _function(i)

cdef void _function(int i):
    print i

function(10)

@jnothman
Copy link
Member

jnothman commented Apr 15, 2016

The reason why the current code does not work is not because it is a cdef called from a def. It is because the Cython translator needs to be able to resolve which type among a fused type to select for a particular input argument. But the translator doesn't know anything about X.data and doesn't know anything about numpy ndarrays. So it can't do that static typing without full conversion from arbitrary Python object, hence def/cpdef.

@yenchenlin
Copy link
Contributor Author

yenchenlin commented Apr 15, 2016

Yeah @jnothman , you are right.
cpdef is not the solution here, I've tried it and it still break.

Btw, "private" isn't really the correct description. The difference between def and cdef lies in whether the function needs to be exposed to Python meaning it needs to accept Python objects as input (and hence handle any argument list parsing and conversion) and return a Python object as output.

If so, why can't we leave it as def here?

@MechCoder
Copy link
Member

MechCoder commented Apr 15, 2016

Because of the same reason as @jnothman had mentioned. cdef is faster because of static typing, see (http://notes-on-cython.readthedocs.org/en/latest/classes.html) for a quick example.

Also "private" because you cannot import _inplace_csr_row_normalize_l1 outside sparsefuncs_fast.pyx sklearn. (Inside sklearn you can do a cimport)

But it seems like fused memoryviews can be an option though. Shouldn't be too tough to try it out. (http://docs.cython.org/src/userguide/fusedtypes.html#fused-types-and-arrays)

@yenchenlin
Copy link
Contributor Author

yenchenlin commented Apr 15, 2016

@MechCoder Can you give an brief example for how to use fused memoryviews to solve current problem?

(sorry if this is obvious)

@jnothman
Copy link
Member

Yes, we can, and should, leave it as def here, IMO.

On 15 April 2016 at 14:07, Yen notifications@github.com wrote:

@MechCoder https://github.com/MechCoder Can you give an brief example
for how to use fused memoryviews to solve current problem?


You are receiving this because you were mentioned.
Reply to this email directly or view it on GitHub
#6659 (comment)

@MechCoder
Copy link
Member

@yenchenlin1994 Similar to what is being done over here (https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/linear_model/cd_fast.pyx#L287)

But I agree with @jnothman , this is unlikely to be a bottleneck and not worth the effort.

@jnothman
Copy link
Member

@MechCoder, that example involves neither cdef, nor fused types. Here the issue is the combination of cdef, fused types and numpy arrays.

@MechCoder
Copy link
Member

I wrongly extrapolated the logic in spite of your comments. (That sums up a really bad day at work)

@yenchenlin yenchenlin force-pushed the make-sparse-utils-funtions-private branch from 975b600 to 5111288 Compare April 15, 2016 06:15
@yenchenlin yenchenlin force-pushed the make-sparse-utils-funtions-private branch from 5111288 to 2f854c3 Compare April 15, 2016 06:16
@yenchenlin yenchenlin changed the title [MRG] Make subfunctions of sparsefuncs private [MRG] Make cython compiler directives global Apr 15, 2016
@yenchenlin yenchenlin changed the title [MRG] Make cython compiler directives global [MRG] Make subfunctions of sparsefuncs private Apr 15, 2016
@yenchenlin
Copy link
Contributor Author

yenchenlin commented Apr 15, 2016

Hey guys, I think that we agree we should still use def here since cdef function can't figure out which type among a fused type to select for a particular input argument?
If so, I will close this PR.

However, there should be a PR that make cython compiler directives global and also move the comments here from subfunction to function.
Should I create a new one?

ping @jnothman @MechCoder

@MechCoder
Copy link
Member

Please go ahead

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants