-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
Replacing np.zeroes by np.empty for Gradient Boosting for better scaling by cores #14380
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Thanks for the PR @SmirnovEgorRu, there seems to be a few unrelated changes though? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More detailed comments
shape=(self.n_features, self.max_bins), | ||
dtype=HISTOGRAM_DTYPE | ||
) | ||
|
||
for feature_idx in prange(n_features, schedule='static', nogil=True): | ||
# for feature_idx in prange(n_features, schedule='static', nogil=True): | ||
for feature_idx in range(n_features): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why replace prange
with range
?
@@ -103,7 +103,9 @@ cdef class HistogramBuilder: | |||
|
|||
def compute_histograms_brute( | |||
HistogramBuilder self, | |||
const unsigned int [::1] sample_indices): # IN | |||
const unsigned int [::1] sample_indices, | |||
hist_struct [:, ::1] parent): # IN |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't fully understand why you need to change the API here.
We have 2 routines to compute histograms:
-
compute_histograms_subtraction
, used when we know the histogram of the parent and the sibling of a node (using the subtraction trick) -
compute_histograms_brute
used in all other cases. With that function, we do not need the histogram of the parent.
For reference @NicolasHug did a similar PR here: #14392 but unfortunately this does not seem to solve the scalabilty profile. |
Closing as a duplicate of #18341 (credited you there). |
I have prepared a PR with replacing np.zeroes to np.empty for performance improvement on multi-core systems.
Original issue: #14306