-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
[DOC] Better document n_jobs param in forest models #17480
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
trees. This works by creating n_jobs sets of trees and computing them | ||
in parallel. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's a little bit ambiguous IMHO, also I'm not sure that's what happens in practice. Did you want to refer to the type of scheduling that is applied, e.g. dynamic vs static scheduling?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, @NicolasHug ! Can definitely clarify.
I hadn't thought to refer to static v dynamic scheduling -- we looked at the some of the documentation that was marked "completed" in #14228, and it didn't seem to include that level of detail. Do you have a recommendation for where to look at an example ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unfortunately the list in #14228 (comment) doesn't seem up to date
and in fact the docstring for the random forest was already updated in #14628
If you think the current version isn't clear enough we can definitly improve it though. What did you find confusing / unclear?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, @NicolasHug ! I also thought the documentation was mostly quite clear -- our addition was just meant to add a bit more detail about how tree models themselves are parallelized, since the current description focused on the class methods.
But that might be the more sensible description ! Happy to close this if you think it's confusing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it, yeah in this case I think this could be closed. Sorry for the extra work!
@emdupre |
Reference Issues/PRs
What does this implement/fix? Explain your changes.
This PR better documents the
n_jobs
parameter for forest models.Additional comments
Submitted with @annejeevan for the #DataUmbrella sprint.