
**params documentation for GridSearchCV.fit is ambiguous #29917

Closed
@CameronBieganek


GridSearchCV.fit

Describe the issue linked to the documentation

The documentation for the **params parameter to the fit method of GridSearchCV leads to confusion. Here is the current text:

Parameters passed to the fit method of the estimator, the scorer, and the CV splitter.

If a fit parameter is an array-like whose length is equal to num_samples then it will be split across CV groups along with X and y. For example, the sample_weight parameter is split because len(sample_weights) = len(X).

I was worried that this meant that grid_search.fit(X, y, groups=g) would split g up across the CV partitions, which is definitely not the right behavior. The correct behavior is to pass the groups parameter unchanged to the CV splitter, e.g. cv.split(X, y, groups=groups). I read through the source code and it does appear that the groups parameter will get passed through unchanged to split, so it looks like the behavior is correct. But we could use something in the docstring that clarifies this behavior.
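One way to check this empirically, rather than by reading the source, is to hand GridSearchCV a splitter that records what it receives. The sketch below is only an illustration: `RecordingGroupKFold` is a hypothetical helper written for this example (not part of scikit-learn), the data is synthetic, and it assumes a scikit-learn version where `fit` accepts `groups` as a keyword argument with metadata routing left at its default (disabled).

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, GroupKFold
from sklearn.tree import DecisionTreeClassifier


class RecordingGroupKFold(GroupKFold):
    """Hypothetical helper: a GroupKFold that records len(groups) on each split call."""

    def __init__(self, n_splits=3):
        super().__init__(n_splits=n_splits)
        self.seen_group_lengths = []

    def split(self, X, y=None, groups=None):
        # Record what GridSearchCV handed to the splitter, then defer to GroupKFold.
        self.seen_group_lengths.append(len(groups))
        return super().split(X, y, groups)


# Synthetic data: 12 samples in 4 groups of 3, with both classes in every group.
X = np.arange(24, dtype=float).reshape(12, 2)
y = np.tile([0, 1], 6)
g = np.repeat(np.arange(4), 3)

cv = RecordingGroupKFold(n_splits=3)
grid_search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid={"max_depth": [1, 2]},
    cv=cv,
)
grid_search.fit(X, y, groups=g)

# Every recorded length should equal len(X) if groups reaches split() unsplit.
print(cv.seen_group_lengths, len(X))
```

If `groups` were being sliced per fold before reaching the splitter, the recorded lengths would be shorter than `len(X)`. Seeing the full length each time is consistent with the reading of the source code described above.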

Suggest a potential alternative/fix

No response
