Unclear message regarding param validation #26897

@adrinjalali

In the context of #26896, I wrote a test and got an error message that really puzzles me. The error message says: ValueError: No valid specification of the columns. Only a scalar, list or slice of all integers or all strings, or boolean mask is allowed

This is the code:

import numpy as np

from sklearn import set_config
from sklearn.model_selection import cross_validate

from sklearn.tests.test_metaestimators_metadata_routing import (
    ConsumingClassifier,
    ConsumingScorer,
    ConsumingSplitter,
)

set_config(enable_metadata_routing=True)

X = np.ones((10, 2))
y = np.array([0, 0, 1, 1, 2, 2, 3, 3, 4, 4])

scorer = ConsumingScorer().set_score_request(
    sample_weight="score_weights", metadata="score_metadata"
)
splitter = ConsumingSplitter().set_split_request(
    groups="split_groups", metadata="split_metadata"
)
estimator = ConsumingClassifier().set_fit_request(
    sample_weight="fit_sample_weight", metadata="fit_metadata"
)
n_samples = len(X)
rng = np.random.RandomState(0)
score_weights = rng.rand(n_samples)
score_metadata = rng.rand(n_samples)
split_groups = rng.randint(0, 3, n_samples)
split_metadata = rng.rand(n_samples)
fit_sample_weight = rng.rand(n_samples)
fit_metadata = rng.rand(n_samples)

cross_validate(
    estimator,
    X=X,
    y=y,
    scoring=scorer,
    cv=splitter,
    params=dict(
        score_weights=score_weights,
        score_metadata=score_metadata,
        split_groups=split_groups,
        split_metadata=split_metadata,
        fit_sample_weight=fit_sample_weight,
        fit_metadata=fit_metadata,
    ),
)

And the error message:

$ python /tmp/1.py
Traceback (most recent call last):
  File "/tmp/1.py", line 35, in <module>
    cross_validate(
  File "/home/adrin/Projects/sklearn/scikit-learn/sklearn/utils/_param_validation.py", line 211, in wrapper
    return func(*args, **kwargs)
  File "/home/adrin/Projects/sklearn/scikit-learn/sklearn/model_selection/_validation.py", line 406, in cross_validate
    results = parallel(
  File "/home/adrin/Projects/sklearn/scikit-learn/sklearn/utils/parallel.py", line 65, in __call__
    return super().__call__(iterable_with_config)
  File "/home/adrin/miniforge3/envs/sklearn/lib/python3.10/site-packages/joblib/parallel.py", line 1085, in __call__
    if self.dispatch_one_batch(iterator):
  File "/home/adrin/miniforge3/envs/sklearn/lib/python3.10/site-packages/joblib/parallel.py", line 901, in dispatch_one_batch
    self._dispatch(tasks)
  File "/home/adrin/miniforge3/envs/sklearn/lib/python3.10/site-packages/joblib/parallel.py", line 819, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/home/adrin/miniforge3/envs/sklearn/lib/python3.10/site-packages/joblib/_parallel_backends.py", line 208, in apply_async
    result = ImmediateResult(func)
  File "/home/adrin/miniforge3/envs/sklearn/lib/python3.10/site-packages/joblib/_parallel_backends.py", line 597, in __init__
    self.results = batch()
  File "/home/adrin/miniforge3/envs/sklearn/lib/python3.10/site-packages/joblib/parallel.py", line 288, in __call__
    return [func(*args, **kwargs)
  File "/home/adrin/miniforge3/envs/sklearn/lib/python3.10/site-packages/joblib/parallel.py", line 288, in <listcomp>
    return [func(*args, **kwargs)
  File "/home/adrin/Projects/sklearn/scikit-learn/sklearn/utils/parallel.py", line 127, in __call__
    return self.function(*args, **kwargs)
  File "/home/adrin/Projects/sklearn/scikit-learn/sklearn/model_selection/_validation.py", line 838, in _fit_and_score
    fit_params = _check_method_params(X, params=fit_params, indices=train)
  File "/home/adrin/Projects/sklearn/scikit-learn/sklearn/utils/validation.py", line 1983, in _check_method_params
    method_params_validated[param_key] = _safe_indexing(
  File "/home/adrin/Projects/sklearn/scikit-learn/sklearn/utils/__init__.py", line 341, in _safe_indexing
    indices_dtype = _determine_key_type(indices)
  File "/home/adrin/Projects/sklearn/scikit-learn/sklearn/utils/__init__.py", line 288, in _determine_key_type
    raise ValueError(err_msg)
ValueError: No valid specification of the columns. Only a scalar, list or slice of all integers or all strings, or boolean mask is allowed
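
For readers unfamiliar with this code path: the traceback shows the message being raised by a key-type check (`_determine_key_type`) that `_safe_indexing` runs on `indices=train`. The sketch below is a simplified, hypothetical stand-in for such a check, not scikit-learn's actual implementation. The names `determine_key_type` and `describe` are made up for illustration, and slices and boolean masks from arrays are left out. It only shows how a key of an unsupported type could end up with this generic "columns" message, even when the key is meant to select samples.

```python
# Simplified, hypothetical stand-in for a key-type check.
# This is NOT scikit-learn's code; only the message text is taken from the traceback.
ERR_MSG = (
    "No valid specification of the columns. Only a scalar, list or slice of "
    "all integers or all strings, or boolean mask is allowed"
)


def determine_key_type(key):
    """Return "int", "str" or "bool" for a supported key, else raise ValueError."""
    # bool must be tested before int because bool is a subclass of int.
    if isinstance(key, bool):
        return "bool"
    if isinstance(key, int):
        return "int"
    if isinstance(key, str):
        return "str"
    if isinstance(key, (list, tuple)):
        # A sequence is accepted only if all of its items share one supported type.
        kinds = {determine_key_type(item) for item in key}
        if len(kinds) == 1:
            return kinds.pop()
    # Floats, mixed-type sequences and anything else not handled above end up here.
    raise ValueError(ERR_MSG)


def describe(key):
    """Return the detected key type, or the error text if the key is rejected."""
    try:
        return determine_key_type(key)
    except ValueError as exc:
        return f"ValueError: {exc}"


if __name__ == "__main__":
    print(describe([0, 1, 2]))       # integer indices are accepted
    print(describe([0.5, 1.5]))      # float indices are rejected
    print(describe([0, "a"]))        # mixed types are rejected
```

In this sketch the same message is produced regardless of whether the caller is selecting rows or columns, which is one way such wording can become confusing in a cross-validation context.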

cc @jeremiedbb
