Skip to content

[DOC] Better document n_jobs param in forest models #17480

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

emdupre
Copy link
Contributor

@emdupre emdupre commented Jun 6, 2020

Reference Issues/PRs

What does this implement/fix? Explain your changes.

This PR better documents the n_jobs parameter for forest models.

Additional comments

Submitted with @annejeevan for the #DataUmbrella sprint.

Comment on lines +1008 to +1009
trees. This works by creating n_jobs sets of trees and computing them
in parallel.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's a little bit ambiguous IMHO, also I'm not sure that's what happens in practice. Did you want to refer to the type of scheduling that is applied, e.g. dynamic vs static scheduling?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @NicolasHug ! Can definitely clarify.

I hadn't thought to refer to static v dynamic scheduling -- we looked at the some of the documentation that was marked "completed" in #14228, and it didn't seem to include that level of detail. Do you have a recommendation for where to look at an example ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unfortunately the list in #14228 (comment) doesn't seem up to date

and in fact the docstring for the random forest was already updated in #14628

If you think the current version isn't clear enough we can definitly improve it though. What did you find confusing / unclear?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @NicolasHug ! I also thought the documentation was mostly quite clear -- our addition was just meant to add a bit more detail about how tree models themselves are parallelized, since the current description focused on the class methods.

But that might be the more sensible description ! Happy to close this if you think it's confusing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, yeah in this case I think this could be closed. Sorry for the extra work!

@reshamas
Copy link
Member

reshamas commented Jun 6, 2020

@emdupre
Congrats on submitting this PR.  Can you, in the description of this PR:
1)  cc your pair partner?
2)  Include #DataUmbrella so we can track the PR?
Thanks.

@emdupre
Copy link
Contributor Author

emdupre commented Jun 6, 2020

@emdupre
Congrats on submitting this PR. Can you, in the description of this PR:

  1. cc your pair partner?
  2. Include #DataUmbrella so we can track the PR?
    Thanks.

Thanks, @reshamas ! Done.

@NicolasHug NicolasHug closed this Jun 6, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants