-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
[MRG] Make subfunctions of sparsefuncs private #6659
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MRG] Make subfunctions of sparsefuncs private #6659
Conversation
def _inplace_csr_row_normalize_l1(np.ndarray[floating, ndim=1] X_data, shape, | ||
@cython.boundscheck(False) | ||
@cython.wraparound(False) | ||
@cython.cdivision(True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need these?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Sorry if that is obvious)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are a large part of what makes Cython faster than Python.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without those checks disabled, the expression X[i]
(for int[:] X, int i
) generates:
__pyx_t_1 = __pyx_v_i;
__pyx_t_2 = -1;
if (__pyx_t_1 < 0) {
__pyx_t_1 += __pyx_v_X.shape[0];
if (unlikely(__pyx_t_1 < 0)) __pyx_t_2 = 0;
} else if (unlikely(__pyx_t_1 >= __pyx_v_X.shape[0])) __pyx_t_2 = 0;
if (unlikely(__pyx_t_2 != -1)) {
__Pyx_RaiseBufferIndexError(__pyx_t_2);
{__pyx_filename = __pyx_f[0]; __pyx_lineno = 2; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
}
__pyx_r = (*((int *) ( /* dim=0 */ (__pyx_v_X.data + __pyx_t_1 * __pyx_v_X.strides[0]) )));
With the checks disabled, it's just the last line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks heaps for the explanation Joel!! As I always say, I learn a lot from your comments!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks heaps for the explanation Joel!! As I always say, I learn a lot from your comments!
You're welcome. The intention with that comment was actually "Cython's not so scary. Write a three line snippet and try it yourself!" I hope it worked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rvraghav93 This is also documented here (http://docs.cython.org/src/reference/compilation.html#compiler-directives)
714f34f
to
82b66e4
Compare
LGTM |
@jnothman wait it broke ... |
It can break with |
@cython.boundscheck(False) | ||
@cython.wraparound(False) | ||
@cython.cdivision(True) | ||
cdef _inplace_csr_row_normalize_l1(np.ndarray[floating, ndim=1] X_data, shape, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually, should be cdef void
Sorry, I was a bit trigger-happy. |
Btw, "private" isn't really the correct description. The difference between |
@@ -275,13 +275,16 @@ def incr_mean_variance_axis0(X, last_mean, last_var, unsigned long last_n): | |||
@cython.wraparound(False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These flags are no longer relevant to this function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Though perhaps the flags should be applied at the module level...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that would prevent rewriting it again and again.
82b66e4
to
975b600
Compare
hmm ... still broke. @jnothman any idea? |
The
http://docs.cython.org/src/reference/language_basics.html#functions-and-methods Apologies for misunderstanding this :@ This is incorrect. Refer Joel's comment instead. |
Is that true? From here (http://docs.cython.org/src/reference/language_basics.html#functions-and-methods) it says both def and cdef can accept and return Python objects. I suppose it is just where they can be called from that makes a difference |
They can accept and return Python objects, but if something is typed as a cpdef is not the solution here. But I also think we can't get cdef to work. On 15 April 2016 at 06:51, Manoj Kumar notifications@github.com wrote:
|
Hello @rvraghav93 , thanks! Any idea why following code works? def function(int i):
_function(i)
cdef void _function(int i):
print i
function(10) |
The reason why the current code does not work is not because it is a cdef called from a def. It is because the Cython translator needs to be able to resolve which type among a fused type to select for a particular input argument. But the translator doesn't know anything about |
Yeah @jnothman , you are right.
If so, why can't we leave it as |
Because of the same reason as @jnothman had mentioned. cdef is faster because of static typing, see (http://notes-on-cython.readthedocs.org/en/latest/classes.html) for a quick example. Also "private" because you cannot import But it seems like fused memoryviews can be an option though. Shouldn't be too tough to try it out. (http://docs.cython.org/src/userguide/fusedtypes.html#fused-types-and-arrays) |
@MechCoder Can you give an brief example for how to use fused memoryviews to solve current problem? (sorry if this is obvious) |
Yes, we can, and should, leave it as def here, IMO. On 15 April 2016 at 14:07, Yen notifications@github.com wrote:
|
@yenchenlin1994 Similar to what is being done over here (https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/linear_model/cd_fast.pyx#L287) But I agree with @jnothman , this is unlikely to be a bottleneck and not worth the effort. |
@MechCoder, that example involves neither cdef, nor fused types. Here the issue is the combination of cdef, fused types and numpy arrays. |
I wrongly extrapolated the logic in spite of your comments. (That sums up a really bad day at work) |
975b600
to
5111288
Compare
5111288
to
2f854c3
Compare
Hey guys, I think that we agree we should still use However, there should be a PR that make cython compiler directives global and also move the comments here from subfunction to function. ping @jnothman @MechCoder |
Please go ahead |
In my previous merged PR #6539 ,
I forgot to use
cdef
for subfunctions, which may expose these subfunctions as part of the Python-level API.This PR would fix the issue described above.
ping @jnothman @MechCoder @ogrisel
Thanks, and sorry for my carelessness.