ENH ascii visualisation for metadata routing #31535


Draft · wants to merge 42 commits into main
Conversation

adrinjalali
Member

Adding a visualisation for metadata routing.

Right now it produces output like the following:

# Imports for the demo script. Everything below is public scikit-learn API,
# except WARN and visualise_routing, which come from this PR's branch (their
# exact import paths may differ there).
from sklearn import set_config
from sklearn.compose import ColumnTransformer
from sklearn.feature_selection import SelectPercentile, chi2
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import get_scorer
from sklearn.model_selection import GroupKFold, RandomizedSearchCV
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.utils.metadata_routing import get_routing_for_object


def run_test_1():
    numeric_features = ["age", "fare"]
    numeric_transformer = Pipeline(
        steps=[
            ("imputer", SimpleImputer(strategy="median")),
            (
                "scaler",
                StandardScaler()
                .set_fit_request(sample_weight="inner_weights")
                .set_transform_request(copy=True),
            ),
        ]
    )

    categorical_features = ["embarked", "sex", "pclass"]
    categorical_transformer = Pipeline(
        steps=[
            ("encoder", OneHotEncoder(handle_unknown="ignore")),
            ("selector", SelectPercentile(chi2, percentile=50)),
        ]
    )
    preprocessor = ColumnTransformer(
        transformers=[
            ("num", numeric_transformer, numeric_features),
            ("cat", categorical_transformer, categorical_features),
        ]
    )

    # %%
    # Append classifier to preprocessing pipeline.
    # Now we have a full prediction pipeline.
    clf = Pipeline(
        steps=[
            ("preprocessor", preprocessor),
            ("classifier", LogisticRegression().set_fit_request(sample_weight=False)),
        ]
    )

    param_grid = {
        "preprocessor__num__imputer__strategy": ["mean", "median"],
        "preprocessor__cat__selector__percentile": [10, 30, 50, 70],
        "classifier__C": [0.1, 1.0, 10, 100],
    }

    scorer = get_scorer("accuracy").set_score_request(sample_weight=True)

    search_cv = RandomizedSearchCV(
        clf, param_grid, cv=GroupKFold(), scoring=scorer, random_state=0
    )

    # Get the routing information
    test = get_routing_for_object(search_cv)

    visualise_routing(test)


def run_test_2():
    est = make_pipeline(
        make_pipeline(StandardScaler().set_fit_request(sample_weight=True)),
        make_pipeline(StandardScaler().set_fit_request(sample_weight=False)),
        make_pipeline(StandardScaler()),
        make_pipeline(StandardScaler().set_fit_request(sample_weight=WARN)),
    )

    visualise_routing(get_routing_for_object(est))


def run_test_3():
    est = RandomizedSearchCV(estimator=LogisticRegression(), param_distributions={})
    visualise_routing(get_routing_for_object(est))

    est = RandomizedSearchCV(
        estimator=LogisticRegression(),
        param_distributions={},
        scoring=get_scorer("accuracy").set_score_request(sample_weight=True),
    )
    visualise_routing(get_routing_for_object(est))


if __name__ == "__main__":
    # Enable metadata routing
    set_config(enable_metadata_routing=True)
    run_test_1()
    run_test_2()
    run_test_3()
$ python sklearn/utils/tests/test_metadata_routing_visualise.py

=== METADATA ROUTING TREE ===
RandomizedSearchCV
├── estimator (Pipeline)
│   ├── preprocessor (ColumnTransformer)
│   │   ├── num (Pipeline)
│   │   │   ├── imputer (SimpleImputer(strategy='median'))
│   │   │   └── scaler (StandardScaler())
│   │   │           ➤ copy[transform✓]
│   │   │           ➤ inner_weights→sample_weight[fit↗]
│   │   ├── cat (Pipeline)
│   │   │   ├── encoder (OneHotEncoder(handle_unknown='ignore'))
│   │   │   └── selector (SelectPercentile(percentile=50, score_func=<function chi2 at 0x7f7b495dfb00>))
│   │   └── remainder (None)
│   └── classifier (LogisticRegression())
│           ➤ sample_weight[fit✗]
├── scorer (_Scorer)
│       ➤ sample_weight[score✓]
└── splitter (GroupKFold(n_splits=5, random_state=None, shuffle=False))
        ➤ groups[split✓]

Parameter summary:
fit
 ├─ ✓ copy
 │   • ✓ requested:
 │       - RandomizedSearchCV/estimator/preprocessor/num/scaler.transform
 ├─ ✓ groups
 │   • ✓ requested:
 │       - RandomizedSearchCV/splitter.split
 ├─ ✓ inner_weights
 │   • ✓ requested:
 │       - RandomizedSearchCV/estimator/preprocessor/num/scaler.fit
 ├─ ✓ sample_weight
 │   • ✓ requested:
 │       - RandomizedSearchCV/scorer.score
 │   • ✗ ignored:
 │       - RandomizedSearchCV/estimator/classifier.fit

score
 ├─ ✓ sample_weight
 │   • ✓ requested:
 │       - RandomizedSearchCV/scorer.score


=== METADATA ROUTING TREE ===
Pipeline
├── pipeline-1 (Pipeline)
│   └── standardscaler (StandardScaler())
│           ➤ copy[transform⛔,inverse_transform⛔]
│           ➤ sample_weight[fit✓]
├── pipeline-2 (Pipeline)
│   └── standardscaler (StandardScaler())
│           ➤ copy[transform⛔,inverse_transform⛔]
│           ➤ sample_weight[fit✗]
├── pipeline-3 (Pipeline)
│   └── standardscaler (StandardScaler())
│           ➤ copy[transform⛔,inverse_transform⛔]
│           ➤ sample_weight[fit⛔]
└── pipeline-4 (Pipeline)
    └── standardscaler (StandardScaler())
            ➤ copy[transform⛔,inverse_transform⛔]
            ➤ sample_weight[fit⚠]

Parameter summary:
fit
 ├─ ⛔ copy
 │   • ⛔ errors:
 │       - Pipeline/pipeline-1/standardscaler.transform
 │       - Pipeline/pipeline-2/standardscaler.transform
 │       - Pipeline/pipeline-3/standardscaler.transform
 ├─ ⛔ sample_weight
 │   • ✓ requested:
 │       - Pipeline/pipeline-1/standardscaler.fit
 │   • ✗ ignored:
 │       - Pipeline/pipeline-2/standardscaler.fit
 │   • ⚠ warns:
 │       - Pipeline/pipeline-4/standardscaler.fit
 │   • ⛔ errors:
 │       - Pipeline/pipeline-3/standardscaler.fit

predict
 ├─ ⛔ copy
 │   • ⛔ errors:
 │       - Pipeline/pipeline-1/standardscaler.transform
 │       - Pipeline/pipeline-2/standardscaler.transform
 │       - Pipeline/pipeline-3/standardscaler.transform

predict_proba
 ├─ ⛔ copy
 │   • ⛔ errors:
 │       - Pipeline/pipeline-1/standardscaler.transform
 │       - Pipeline/pipeline-2/standardscaler.transform
 │       - Pipeline/pipeline-3/standardscaler.transform

predict_log_proba
 ├─ ⛔ copy
 │   • ⛔ errors:
 │       - Pipeline/pipeline-1/standardscaler.transform
 │       - Pipeline/pipeline-2/standardscaler.transform
 │       - Pipeline/pipeline-3/standardscaler.transform

decision_function
 ├─ ⛔ copy
 │   • ⛔ errors:
 │       - Pipeline/pipeline-1/standardscaler.transform
 │       - Pipeline/pipeline-2/standardscaler.transform
 │       - Pipeline/pipeline-3/standardscaler.transform

score
 ├─ ⛔ copy
 │   • ⛔ errors:
 │       - Pipeline/pipeline-1/standardscaler.transform
 │       - Pipeline/pipeline-2/standardscaler.transform
 │       - Pipeline/pipeline-3/standardscaler.transform

transform
 ├─ ⛔ copy
 │   • ⛔ errors:
 │       - Pipeline/pipeline-1/standardscaler.transform
 │       - Pipeline/pipeline-2/standardscaler.transform
 │       - Pipeline/pipeline-3/standardscaler.transform
 │       - Pipeline/pipeline-4/standardscaler.transform

inverse_transform
 ├─ ⛔ copy
 │   • ⛔ errors:
 │       - Pipeline/pipeline-1/standardscaler.inverse_transform
 │       - Pipeline/pipeline-2/standardscaler.inverse_transform
 │       - Pipeline/pipeline-3/standardscaler.inverse_transform
 │       - Pipeline/pipeline-4/standardscaler.inverse_transform


=== METADATA ROUTING TREE ===
RandomizedSearchCV
├── estimator (LogisticRegression())
│       ➤ sample_weight[fit⛔]
├── scorer (_PassthroughScorer)
│       ➤ sample_weight[score⛔]
└── splitter (None)

Parameter summary:
fit
 ├─ ⛔ sample_weight
 │   • ⛔ errors:
 │       - RandomizedSearchCV/estimator.fit
 │       - RandomizedSearchCV/scorer.score

score
 ├─ ⛔ sample_weight
 │   • ⛔ errors:
 │       - RandomizedSearchCV/scorer.score


=== METADATA ROUTING TREE ===
RandomizedSearchCV
├── estimator (LogisticRegression())
│       ➤ sample_weight[fit⛔]
├── scorer (_Scorer)
│       ➤ sample_weight[score✓]
└── splitter (None)

Parameter summary:
fit
 ├─ ⛔ sample_weight
 │   • ✓ requested:
 │       - RandomizedSearchCV/scorer.score
 │   • ⛔ errors:
 │       - RandomizedSearchCV/estimator.fit

score
 ├─ ✓ sample_weight
 │   • ✓ requested:
 │       - RandomizedSearchCV/scorer.score


✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 5296b93.

@glemaitre
Member

I like how it looks. I see that the symbols are defined with their meaning in the parameter summary, but I would be happy to get a small legend right at the start, before going into the tree and the summary. It should take a single line.

A second thought is about being able to filter the output: maybe I am only interested in the tree but not the summary, or vice-versa, and maybe only in a particular parameter. I think the current view should be the default, but having the flexibility to reduce the amount of info could be nice.
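To make the filtering idea concrete, here is one purely hypothetical shape it could take (none of these names or keyword arguments exist in the PR): the defaults reproduce the full view, while the flags let callers drop the tree, drop the summary, or keep only selected parameters in the summary.

```python
def filter_routing_output(tree_text, summary_text, *,
                          show_tree=True, show_summary=True, params=None):
    """Hypothetical filtering wrapper, not part of this PR.

    Assembles the final report from pre-rendered tree and summary text.
    """
    sections = []
    if show_tree:
        sections.append(tree_text)
    if show_summary:
        lines = summary_text.splitlines()
        if params is not None:
            # naive per-parameter filter: keep only summary entries that
            # mention one of the requested parameter names
            lines = [ln for ln in lines
                     if "├─" not in ln or any(p in ln for p in params)]
        sections.append("\n".join(lines))
    return "\n\n".join(sections)
```

With this shape, `filter_routing_output(tree, summary, show_summary=False)` would give just the tree, matching the "tree but not the summary" use case.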

@StefanieSenger
Contributor

StefanieSenger commented Jun 13, 2025

I've already gotten a demonstration and had expressed my appreciation. I think this will be very useful. Now I have some suggestions:

  1. Is it possible to make this colourful? Maybe colours could even be used instead of ✓, ✗, ⛔, ⚠, if it is somehow possible to force colourful terminal output, even for users who don't use shells like zsh?

  2. I am not sure about ⛔, since it is the most pointy and colourful symbol but it just means: implicitly not requested (set_fit_request(metadata=None)). From the symbol it looks a bit like explicitly not requested (set_fit_request(metadata=False)).

  3. I also wonder if ✓, ✗, ⛔, ⚠ are safe to use for all users, or if they could look cryptic to some? Are these ASCII and thus available to any user?
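On item 3, a quick check shows these markers are in fact Unicode, not ASCII: each one is a codepoint above U+007F, so whether they render correctly depends on the user's terminal font and encoding rather than on which shell is used.

```python
# None of the tree markers are 7-bit ASCII; print each one's codepoint.
for symbol in ["✓", "✗", "⛔", "⚠", "➤"]:
    print(symbol, f"U+{ord(symbol):04X}", "ascii:", symbol.isascii())
```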

@adrinjalali
Member Author

  • Legend: yes, it should have a small one, and a link to a page in the docs where things are better explained.
  • Filtering: yep, easy to implement, and makes sense to have.
  • Symbols: I'm happy to have alternative suggestions for symbols, and I'll try to improve what we have. The ⛔ sign is pretty nice since it's the only case where it results in an actual error. But I'm also not happy with these signs and what users understand intuitively from them.
  • Colors: This one is very tricky and I'd rather not have it to start with. It's relatively easy to have colors on a white background with a black font color, but terminals have all sorts of color themes. Mine is green on black, others have white/bright gray on black/dark gray, some use black on white, etc. Having a color setting that works for all of them and is easily visible to a half-colorblind person like me is pretty tricky. So I usually tend to avoid colors in my own work. But I'd be happy to review such suggestions in another PR.
