Skip to content

DOC Clarified parallelization of plot_partial_dependence with n_jobs #19749

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

InterferencePattern
Copy link
Contributor

Reference Issues/PRs

Contributes to resolution of #14228

What does this implement/fix? Explain your changes.

Clarifies in documentation the level at which plot_partial_dependence is parallelized by n_jobs.

Any other comments?

Copy link
Member

@thomasjpfan thomasjpfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR @jimbudarz !

Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
Copy link
Member

@thomasjpfan thomasjpfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -173,7 +173,9 @@ def plot_partial_dependence(
differences between the `'brute'` and `'recursion'` method.

n_jobs : int, default=None
The number of CPUs to use to compute the partial dependences.
The number of CPUs to use to compute the partial dependences. Computation
is parallelized over specified by the `features` parameter.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry my previous suggestion had poor grammar. This revised wording is also consistent with the rest of the docstring:

Suggested change
is parallelized over specified by the `features` parameter.
is parallelized over the target features.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel that we should avoid 'target' since that might be confused with 'label'

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. I think replacing the "target features" in the docstring with "selected features" would be clearer:

Suggested change
is parallelized over specified by the `features` parameter.
is parallelized over the selected features.

(I am trying to prevent the reader from thinking that we are computing the PD over all the features)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think referencing the features parameter is still appropriate, though, right? Why would you like to drop the specific reference to features as a parameter?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The alternative was:

Computation is parallelized over features specified by the features parameter.

which works as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed.

Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
@thomasjpfan thomasjpfan changed the title Clarified parallelization of plot_partial_dependence with n_jobs DOC Clarified parallelization of plot_partial_dependence with n_jobs Mar 22, 2021
@thomasjpfan thomasjpfan changed the base branch from 0.24.X to main March 22, 2021 21:19
@thomasjpfan thomasjpfan changed the base branch from main to 0.24.X March 22, 2021 21:20
@thomasjpfan
Copy link
Member

I noticed that that target branch is scikit-learn:0.24.X, can you open a new PR that targets main? This way your improvement will show up in future versions of scikit-learn and not just 0.24.X.

When we cut another release for 0.24.X, we will cherry pick doc changes to that branch, so those changes will appear on the website.

@InterferencePattern
Copy link
Contributor Author

Sure thing. The new PR is #19750

@thomasjpfan
Copy link
Member

Closing in flavor of #19750

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants