Add PAV algorithm for calibration_curve/reliability diagrams #23132
@aijordan For your information.
It's possible that using Centered Isotonic Regression (#21454) would make the reliability diagram look even better but might break the theoretical results of the paper you linked above.
Posted by @ogrisel in #23767 (comment)
The reliability diagram is a statistical diagnostics/verification tool; it does not need to be pleasing to the eye, but it should be easy to interpret. BTW, I never understood why, within scikit-learn, the PAV (= CORP) approach is good enough for calibrating classifiers (which modifies actual model predictions), but not good enough for a reliability diagram (pure diagnostics) 🤨
It's not a matter of being pleasing to the eye but of being misleading about the shape of the asymptotic curve. The asymptotic curve will be smooth most of the time, and the CORP finite-sample estimate can lead the reader to think otherwise, which I find misleading and a potential source of confusion for our users.
I actually have the exact same concern with isotonic regression as a post hoc calibrator. I would much rather use centered isotonic regression as the post hoc calibrator: it is mostly strictly monotonic (and as a result would not introduce an unexpected change in pure ranking metrics such as ROC AUC / Gini index), and it converges to the same solution as the step-wise constant calibrator in the large-sample limit.
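To make the ranking-metric point concrete, here is a tiny self-contained illustration (mine, not from the thread): the step-wise isotonic fit pools adjacent scores into ties, which changes ROC AUC, whereas any strictly monotonic map would leave it untouched.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression
from sklearn.metrics import roc_auc_score

y = np.array([0, 1, 0, 1, 1, 0, 1, 1])
scores = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8])

# PAV pools adjacent violators into constant segments, creating ties.
calibrated = IsotonicRegression().fit_transform(scores, y)

print(roc_auc_score(y, scores))      # ~0.733 for the raw scores
print(roc_auc_score(y, calibrated))  # ~0.833: the PAV ties changed the AUC
```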
That's not correct. For instance, tree-based models or GLMs with categorical features do not produce smooth predictions.
Indeed, that might be the case. Although, depending on the size of the training set, I suspect that they are still much smoother than what the CORP reliability diagram suggests. To settle this debate we would need experiments on a few large datasets where we can subsample both the training set and the test set used to estimate the reliability curve, and then compare the small-test-sample CORP/smoothed reliability curves to the CORP curve on the full test set. We could also give the reliability diagram a user-settable option to choose the strategy (fixed binning as we do now, CORP-induced bin edges, or some smooth estimate). Still, comparing the methods on a few canonical datasets would help us make informed recommendations in the docstring of that parameter. A rough sketch of such an experiment is below.
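Here is a rough, self-contained sketch of that experiment. Everything in it is assumed for illustration: the synthetic dataset, the model choice, and the 1% subsample size. A fuller calibration_curve-style CORP helper is sketched under "Describe your proposed solution" below.

```python
import numpy as np
from sklearn.calibration import calibration_curve
from sklearn.datasets import make_classification
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.isotonic import IsotonicRegression
from sklearn.model_selection import train_test_split


def corp_curve(y_true, y_prob):
    # PAV step of CORP: isotonic fit of the outcomes on sorted predictions.
    order = np.argsort(y_prob)
    fitted = IsotonicRegression(y_min=0, y_max=1).fit_transform(
        y_prob[order], y_true[order]
    )
    return y_prob[order], fitted


X, y = make_classification(n_samples=200_000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = HistGradientBoostingClassifier(random_state=0).fit(X_train, y_train)
p_test = clf.predict_proba(X_test)[:, 1]

# Reference curve: CORP on the full test set.
x_full, corp_full = corp_curve(y_test, p_test)

# Small-sample estimates on a 1% subsample of the test set.
rng = np.random.default_rng(0)
idx = rng.choice(len(y_test), size=len(y_test) // 100, replace=False)
x_sub, corp_sub = corp_curve(y_test[idx], p_test[idx])
bin_true, bin_pred = calibration_curve(y_test[idx], p_test[idx], n_bins=10)

# Plotting (x_full, corp_full) against (x_sub, corp_sub) and
# (bin_pred, bin_true) shows how faithfully each small-sample estimate
# tracks the full-test-set reference.
```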
Describe the workflow you want to enable
Describe your proposed solution
Add the strategy PAV of [1] and [2] (there called CORP) to calibration_curve. This is basically applying isotonic regression as the binning strategy, which we already have in scikit-learn; a rough sketch follows after the references.

[1] Dimitriadis, T., Gneiting, T., & Jordan, A. I. (2021). Stable reliability diagrams for probabilistic classifiers. Proceedings of the National Academy of Sciences of the United States of America, 118. https://doi.org/10.1073/pnas.2016191118
[2] https://cran.r-project.org/package=reliabilitydiag
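As a minimal sketch of the proposal, assuming a hypothetical helper name corp_reliability_curve (not an existing scikit-learn API): the PAV step is just an IsotonicRegression fit, each maximal constant segment of the fit becomes one data-driven bin, and the output mirrors the (prob_true, prob_pred) return convention of calibration_curve.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression


def corp_reliability_curve(y_true, y_prob):
    """CORP-style analogue of calibration_curve: one point per PAV segment."""
    y_true, y_prob = np.asarray(y_true), np.asarray(y_prob)
    order = np.argsort(y_prob)
    y_true, y_prob = y_true[order], y_prob[order]
    # The isotonic fit is the PAV step of CORP; its fitted values are
    # non-decreasing and piecewise constant over the sorted predictions.
    y_fit = IsotonicRegression(y_min=0.0, y_max=1.0).fit_transform(y_prob, y_true)
    # Each maximal constant run of y_fit is one data-driven "bin"; since
    # y_fit is sorted, np.unique's first-occurrence indices mark run starts.
    _, first_idx = np.unique(y_fit, return_index=True)
    segments = np.split(np.arange(len(y_fit)), first_idx[1:])
    prob_true = np.array([y_fit[seg[0]] for seg in segments])
    prob_pred = np.array([y_prob[seg].mean() for seg in segments])
    return prob_true, prob_pred
```

Usage would then be, e.g., prob_true, prob_pred = corp_reliability_curve(y_test, clf.predict_proba(X_test)[:, 1]) on a held-out test set.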
Describe alternatives you've considered, if relevant
No response
Additional context
Given how recent the paper is, it clearly does not have many citations (yet). But I have the impression that this is a good strategy for reliability diagrams, with good theoretical and practical properties.
To my knowledge, this strategy is not available anywhere in the Python ecosystem as of now.