-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
MNT simplify some tree code with memoryviews #12886
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
we do test readonly arrays, the question is whether they survive all conversions to that point (probably they get copied somewhere). |
well that's interesting. Should also be interesting to @NicolasHug |
how stable are these curves though, can we get bars? ;) |
I know I'm doing something wrong here, but really don't know what. I get this very odd looking cython compile error:
Been trying to figure it out, but any help would be appreciated. |
don't ask me why but I think that
in both splitter and criterion will fix it (also remove Very weird that the other dtypes aren't causing the same thing. Looks similar to this where the fix is similar |
Ooh, that makes a lot of sense. I wish cython would print the complete type, like, |
But does it make sense to you that only It looks like a Cython bug to me. |
I think the difference may be that DOUBLE_t is the only one used in the context of a memoryview while passing between modules. I'd agree that it's a bug or at least an inconsistency that it has stricter type checking for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I personally find MemoryViews beneficial to maintainability, and helpful for catching bugs. If bench_tree is happy, and @NicolasHug is happy, I think this is a positive change.
Sorry, got confused. Not sure that @NicolasHug is the one most in need of being happy. Rather, if you are happy that this does not make life more difficult with your other work on trees, I'm happy for it. |
It sure doesn't make my life harder at least ^^ |
What are the units of the axis? You don't just use 10 samples / rows right? Have you tried removing some of the unnecessary casts in
is probably not needed since I assume looking at the annotated generated code ( In any case I don't think the overhead of passing the memview struct is causing the slowdown, because this struct is only passed once when building the criterion object right? It's not passed into tons of function calls. Also, I would assume the same overhead occurs for the Like I said above I'm not experienced with advanced Cython so... don't take all this for granted ;) |
Another lead might be that the previous code for here is the doc for C or F contiguous memviews |
Nice tips @NicolasHug . Those together bring all tests as shown in these plots, i.e. no overhead (most of it was after I applied your first comment): Now I need to figure out a way to to handle the read-only input issue (i.e. failing tests). |
I guess this looks good for a review now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not too familiar with the trees code but this definitely look more readable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is helpful to improve the maintainability of the tree code... but this also changes the interface of Criterion which users are known to develop custom extensions to. Perhaps the same is true of Splitter. I'm a little concerned that this will break code (even on not-really-supported API) sufficiently to not be worthwhile.
If we have reason to break these APIs in any case, then this maintainability improvement could be thrown in, but it's not very fair to users otherwise. What do others think?
My 2 cents is that we should not start caring about non public API changes. Quoting #10251 (comment)
|
It's also mostly (if not all) the C API, and not even the python API. I really don't think we should care about backward compatibility of those pieces of the code. That said, I'm happy to have a blog post somewhere (we don't have a blog, do we?) and announce well in advance that there are some C API changes to those parts and explain them. |
I don't see how C vs Python API should make a difference, except insofar as
it might be harder to write Cython code compatible with multiple versions
of Scikit-learn. If I am not mistaken it would require translation-time
switching based on sklearn.__version__.
Perhaps it's right that this is not public, and we should stick to our guns
and not support it. But we know people are extending Criterion, we recently
offered them a pxd to do so (albeit with no promises of future support),
and as long as this change is largely cosmetic, I don't see why it is good
to force downstream libraries to be pinned to a particular Scikit-learn
version.
|
Personally, I was really confused by the strides and it took me a while to figure they're just there to index the matrices. I guess we [mostly] agree that this part of the package is not the easiest to get used to, which contributes to inhibiting contributions to it by a wider developer community. FWIW, I personally find memoryviews significantly nicer than what they're replacing here, and much less intimidating. Generally, I think this PR is not the only thing which can potentially improve readability of the tree code base, and I'd find any contribution towards this goal valuable (the least it does is to ease my life working on trees, and that's why I started this PR). So I would have liked us to follow the "it's not public, therefore we can change the API" principle, at least here. I understand it's a nice thing to not change APIs, but:
I don't know, that's all I have I guess. |
@jnothman Is it possible to direct the people using the current version of Criterion to this PR and see what they think about it? |
We can certainly ask people at #12381
|
As far as I can tell the only change to the @camilstaps, you opened #10251 and needed custom Criterions, WDYT? Also @glemaitre , @jmschrei, @glouppe , you've been working a lot on the tree code, maybe you want to weight in? EDIT: for context: this PR greatly simplifies tree code by using memory views instead of pointer + stride, but breaks the Criterion interface, which is private but apparently used anyway by some users. |
Thanks for the persuasive market research, @NicolasHug :D Maybe we should just make the change. |
Regarding custom Criterion, I think that it is a minor issue. If I recall the PR which was exposing the criterion, it was mentioned that we did not want to officially support it but exposed the criterion for convenience instead. |
\o/ |
What did we decide about read only memoryview handling? Are we copying at master? In this PR? |
I don't think we're copying. Some tests did actually fail and then I changed them in splitter and criterions to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other than that LGTM!
sklearn/tree/_criterion.pyx
Outdated
@@ -454,7 +444,7 @@ cdef class ClassificationCriterion(Criterion): | |||
|
|||
for k in range(self.n_outputs): | |||
label_index = (k * self.sum_stride + | |||
<SIZE_t> y[i * self.y_stride + k]) | |||
<SIZE_t> self.y[i, k]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the annotated html file we can see that using self
generates some small Python interactions. I still don't understand why: this is a cdef class and self.y
is properly typed, but I have observed this in #12807 as well.
Having only one Python interation is OK, but having interactions inside loops like here might be costly, so it's best to remove them. The workaround I found is to declare a local view and use it instead:
def const DOUBLE_t [:, ::1] y = self.y
Sorry @adrinjalali I know I pretty much told you to use self.
in #12886 (comment), but I wasn't aware of this at the time :s
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's quite strange that the observation contradicts what we'd expect. I did change them and put the local variables in place, but the overhead of the local variable seems to be kinda much more than the little python interaction when using self
. So I guess we should keep them as is, although along the way I did a bit more cleanups and the benchmarks are still intact.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay,
I'll try to come up with a reproducing example and ask Cython folks why we're having Python interactions. Meanwhile I agree, let's keep it as is.
Thanks for double-checking!
@jnothman any final verdict on this one? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm in favour. Is that final?
It is for this PR. I've found a bit more in the context of the NOCATS PR, but that can stay there. For future reference, any such changes should go through time performance tests, sometimes what we see is not intuitive and they may actually slow down the run. |
@glemaitre review ping :) |
* simplify apply_dense with memoryviews * change more Xs to memviews in splitter * remove commented out lines * trying to remove y_stride * float->dtype_t * criterion has no more y_stride * remove redundant constructs and handle const y input * fix criterion docstring for y's dtype * revert docstring for y * formatting: no space between type name and [:, :] * remove redundant X * more cleanup on criterion * address comment
)" This reverts commit ace4eb1.
)" This reverts commit ace4eb1.
* simplify apply_dense with memoryviews * change more Xs to memviews in splitter * remove commented out lines * trying to remove y_stride * float->dtype_t * criterion has no more y_stride * remove redundant constructs and handle const y input * fix criterion docstring for y's dtype * revert docstring for y * formatting: no space between type name and [:, :] * remove redundant X * more cleanup on criterion * address comment
Related to #10624
This for now is more of a question, before potentially going further.
There are parts of the tree code which can be significantly simplified using memoryviews, like the one presented in this PR. The question is, is this change valid? I think it passes all the tests and dosn't add anything to the time required for the tests, but I'm not sure if our tests cover things such as passing a readonly array to the function, which I guess is an edge case reading the discussion there.