-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
FIX fix performance regression in trees with low-cardinality features #23410
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for merging this a regression fix and explore in a later PR to main how to consolidate the 2 introsort/quicksort implementation with more extensive benchmarking for various data distributions.
Actually I spoke too quickly, we now have:
that was previously fixed in |
sklearn/tree/_splitter.pyx
Outdated
simultaneous_sort(&Xf[start_positive], &samples[start_positive], end - start_positive) | ||
sort(&Xf[start], &samples[start], end_negative - start) | ||
sort(&Xf[start_positive], &samples[start_positive], | ||
end - start_positive) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might need to re-introduce the if start_positive < end:
condition here.
We also need to document this fix in the 1.1.1 changelog similar to https://github.com/scikit-learn/scikit-learn/pull/23404/files#diff-5a8a66bfe0792b4a0e50eb7c8cc929f2a8e0d4d88ac2d0084e07b70d069ced89R39 |
I actually looked at the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 once the bug is fixed and we acknowledge the bug in what's new.
Nice trick! +1 for a follow-up PR for 1.2. |
To use the comparator, we would need to represent the values and indices as an "array of structures". Currently, it is an "structure of arrays". As for this PR, I am +1 on it as well. Backporting this fix should be easier, since the version on |
Here we only have an array of values (feature values) and an array of indices (sample indices) to sort, isn't it? |
This representation is like a "structure of arrays". Currently, the code is similar to this: struct MyArrays:
double* values_array
int* indices_array and To use a comparator, the values and indices need to be it's own structure: struct ValueIndex:
double value
int indices
ValueIndex* array_to_sort This way the index will follow the value when |
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 once this has a change log entry under 1.1.1. (As suggested in #23410 (review))
Backporting will be slightly more involved because the 1.1.X
branch still users pointers for Xf
, but it should not be too difficult.
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
@thomasjpfan the code snippet in https://stackoverflow.com/a/63800579/6513708 shows how to do it directly with a structure of arrays without materializing it as an array of index-value pairs, no? Although I am not sure how it works. EDIT: actually this solution would not inplace-sort the values, only the indices. We would need an extra step to re-shuffle the values based on the sorted indices which would require a temporary copy. Not necessarily a big deal but we will need to keep that temp buffer around to avoid reallocating it each time. Not sure it's worth it and I am not sure about the CPU cache impact of this extra shuffle w.r.t. the existing inplace implementation. |
…scikit-learn#23410) Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com> Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
…#23410) Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com> Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
…scikit-learn#23410) Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com> Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
…scikit-learn#23410) Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com> Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
…#23410) Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com> Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
A more conservative alternative to #23404. This reverts #22868 and fixes the conflicts.
With the following benchmark script, I get a similar performance in this PR and in 1.0.2: