WIP Elkan optimal variable threshold decision making #29150

ogrisel · 2024-05-31T10:35:56Z

Draft PR that builds on top of #29149. The diff will be much smaller once #29149 is merged. The new stuff in this PR is the proof of concept code at the end of the example (that is, 0578617).

I experimented a bit with a follow-up on the work on TunedThresholdClassifierCV to see how beneficial it would be to allow hard decisions to optimize a cost model at prediction time given calibrated probabilistic predictions.
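For readers unfamiliar with the idea, here is a minimal sketch of the cost-based decision threshold from Elkan's cost-sensitive learning framework, assuming calibrated probabilities. The function name `elkan_threshold` and its parameter names are hypothetical and are not taken from the PR code.

```python
def elkan_threshold(cost_fp, cost_fn, cost_tp=0.0, cost_tn=0.0):
    """Probability threshold above which predicting "positive" has the
    lower expected cost, given a calibrated P(y=1 | x).

    Predict positive when:
        (1 - p) * cost_fp + p * cost_tp <= (1 - p) * cost_tn + p * cost_fn
    which rearranges to p >= threshold as computed below.
    """
    num = cost_fp - cost_tn
    den = (cost_fp - cost_tn) + (cost_fn - cost_tp)
    if den <= 0:
        raise ValueError("costs must make errors more expensive than correct decisions")
    return num / den


# Hypothetical costs: a false negative is 9x as expensive as a false positive.
threshold = elkan_threshold(cost_fp=1.0, cost_fn=9.0)
```

With these illustrative costs, a sample is flagged positive as soon as its predicted probability reaches 0.1 rather than the default 0.5.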

The results look promising on the transaction fraud task. This strategy seems to further improve on using tuned fixed decision thresholds.

Note: this is just a proof of concept at this stage. This PR is not meant to be reviewed for merge as is, but I think this idea is worth exploring.

/cc @glemaitre @lorentzenchr @aperezlebel @GaelVaroquaux

github-actions · 2024-05-31T10:37:20Z

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: a1b997e. Link to the linter CI: here

ogrisel · 2024-06-05T09:32:29Z

I fixed 2 bugs in a1b997e. Now the dependency of the optimal threshold on the transaction amount makes sense: the higher the amount, the lower the optimal threshold for deciding that a transaction is fraudulent. The average threshold value is also very close to what TunedThresholdClassifierCV finds empirically.
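The relationship between amount and threshold can be sketched as follows. This is only an illustration under a simplified, hypothetical cost model (a missed fraud costs the transaction amount, a wrongly refused legitimate transaction costs a fixed fee, correct decisions cost nothing); the actual business metric in the example is more detailed, and the names below are not from the PR.

```python
def variable_thresholds(amounts, cost_fp=5.0):
    """Per-transaction threshold under the simplified cost model:
    cost_fn = amount, cost_fp = fixed fee, so
    threshold = cost_fp / (cost_fp + amount)."""
    return [cost_fp / (cost_fp + amount) for amount in amounts]


def decide_fraud(proba_fraud, amounts, cost_fp=5.0):
    """Flag a transaction when its calibrated fraud probability reaches
    its own amount-dependent threshold."""
    thresholds = variable_thresholds(amounts, cost_fp)
    return [p >= t for p, t in zip(proba_fraud, thresholds)]


# Two transactions with the same predicted fraud probability (hypothetical values).
amounts = [5.0, 95.0]
thresholds = variable_thresholds(amounts)
decisions = decide_fraud([0.1, 0.1], amounts)
```

Under these assumptions, the larger transaction gets the lower threshold, so the same predicted probability can lead to different decisions.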

The general conclusions of the notebook still hold:

- using a fixed threshold based on the theoretical formula is competitive with empirical tuning with cross-validation, especially when the underlying model is recalibrated with isotonic regression,
- using variable threshold decision making works even a bit better, but really requires a well-calibrated probabilistic classifier.
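As a rough sketch of the first point, the snippet below recalibrates a classifier with isotonic regression and then applies a fixed threshold derived from costs. The synthetic dataset, the choice of `LogisticRegression`, and the cost values are assumptions made for illustration; they are not the model, data, or costs used in the PR's example.

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic imbalanced data standing in for the fraud dataset (assumption).
X, y = make_classification(n_samples=2000, weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

# Recalibrate the probabilistic predictions with isotonic regression,
# fitted with internal 5-fold cross-validation.
calibrated = CalibratedClassifierCV(
    LogisticRegression(max_iter=1000), method="isotonic", cv=5
)
calibrated.fit(X_train, y_train)

# Hypothetical costs; correct decisions are assumed to cost nothing.
cost_fp, cost_fn = 1.0, 9.0
threshold = cost_fp / (cost_fp + cost_fn)

# Hard decisions from calibrated probabilities and the fixed threshold.
proba_pos = calibrated.predict_proba(X_test)[:, 1]
decisions = proba_pos >= threshold
```

The threshold here is computed from the costs alone, without any search over candidate thresholds, which is why the quality of the calibration matters.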

However, variable threshold decision making is not much of a game changer compared to keeping a fixed threshold. This is likely very problem dependent, though.

ogrisel added 2 commits May 31, 2024 12:24

DOC improve the cost-sensitive learning example

2c4cf73

DRAFT experiment with Elkan-optimal variable threshold decision making

0578617

Add section headers for the experiments

b7b78f0

glemaitre self-requested a review May 31, 2024 12:25

ogrisel added 2 commits June 4, 2024 17:51

Merge branch 'main' into elkan-optimal-variable-threshold-decision-ma…

d5f6dfa

…king

Fix Elkan optimal threshold formula and manual calls to the business …

a1b997e

…metrics for the variable threshold decision makers
