model_selection.cross_validate doesn't pass groups argument to estimator #20349

@thomsentner

Describe the workflow you want to enable

Nested cross-validation is currently impossible with a grouped k-fold iterator in the inner loop. The workflow sklearn currently recommends puts model_selection.cross_val_score or model_selection.cross_validate in the outer loop and model_selection.GridSearchCV in the inner loop. However, model_selection.cross_validate uses the groups parameter only for its own cv instance and does not forward it to the estimator; this behavior also seems to be documented.
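A minimal sketch of how this can show up in practice, assuming synthetic data and a Ridge estimator chosen only for illustration (neither comes from the original report). The inner GridSearchCV never receives groups, so its GroupKFold splitter is expected to raise; error_score="raise" is used so the error surfaces instead of being converted to NaN scores.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, GroupKFold, cross_validate

# Synthetic data (illustrative assumption): 8 groups of 5 samples each.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
y = rng.normal(size=40)
groups = np.repeat(np.arange(8), 5)

# Inner loop: grid search that needs groups for its own GroupKFold splitter.
inner = GridSearchCV(Ridge(), {"alpha": [0.1, 1.0]}, cv=GroupKFold(n_splits=2))

try:
    # groups is consumed by the outer cv only; it is not forwarded to inner.fit.
    cross_validate(
        inner, X, y,
        groups=groups,
        cv=GroupKFold(n_splits=4),
        error_score="raise",
    )
    inner_fit_failed = False
except ValueError as exc:
    inner_fit_failed = True
    print(f"Inner fit failed: {exc}")
```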

Describe your proposed solution

Pass the groups parameter from model_selection.cross_validate to the estimator through model_selection._validation._fit_and_score. The required code changes seem minimal: passing the groups parameter along in three lines of code would be sufficient.
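Until something like this is in place, one possible workaround is to write the outer loop by hand and pass the matching slice of groups to GridSearchCV.fit directly. This is only a sketch under assumed synthetic data and an illustrative Ridge estimator, not the change proposed above.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, GroupKFold

# Synthetic data (illustrative assumption): 8 groups of 5 samples each.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
y = rng.normal(size=40)
groups = np.repeat(np.arange(8), 5)

outer_cv = GroupKFold(n_splits=4)
outer_scores = []

for train_idx, test_idx in outer_cv.split(X, y, groups):
    # A fresh inner search per outer fold, with its own grouped splitter.
    search = GridSearchCV(
        Ridge(), {"alpha": [0.1, 1.0]}, cv=GroupKFold(n_splits=2)
    )
    # Pass only the training slice of groups so the inner folds stay grouped.
    search.fit(X[train_idx], y[train_idx], groups=groups[train_idx])
    # Score the refit best estimator on the held-out outer fold.
    outer_scores.append(search.score(X[test_idx], y[test_idx]))

print(outer_scores)
```

The manual loop loses the convenience of cross_validate (parallelism, timing, multiple scorers), which is why forwarding groups inside the library would be preferable.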

Additional context

sklearn's nested cross-validation documentation actually assumes this functionality is already in place, as it suggests GroupKFold as a compatible cv instance.
