Regression error characteristic curve #31441
Comments
This is a fantastic idea, thank you for bringing REC curves into the conversation! Having more nuanced diagnostics for regression models has been a long-standing gap in the standard ML workflow. The analogy to ROC/PR curves is spot-on: one summary metric (e.g., MAE/RMSE) can't always capture performance tradeoffs, especially across varying error tolerances. The cumulative nature of the REC curve gives a much clearer picture of model robustness in practical contexts. I also appreciate the inclusion of a reference; Bi & Bennett (2003) is a solid foundation. And your submitted PR (#31380) looks very promising from a quick glance. I'd love to see this feature included in future versions of scikit-learn; it could really benefit both academic research and applied ML workflows. Looking forward to seeing how this progresses! Great work.
@alexshtf Thanks for opening this issue. I am -1 on this feature for 2 reasons:
1. It is easy enough to compute and plot oneself.
2. It is unclear what insight the curve provides beyond existing diagnostics.
Therefore, I am closing this issue. But still, feel free to continue the discussion.
@lorentzenchr I believe ease of implementation is not that important; otherwise, why would scikit-learn include the many metrics that are just as easy to compute by hand? So I believe the only remaining issue is the insights. Regressors are used for many downstream tasks, not just "let's predict a value and show it to the user", e.g., computing bids for ad auctions. In general, more accurate bids generate better revenue, but simple summary metrics don't always help you understand why a certain predictor behaves as it does. So I thought it might be useful and valuable for other people as well. You seem to disagree on the diagnostic value, but I should first dig deeper into CAP curves and see if they could provide similar value for the same use cases.
Counter question: What do you learn from a fitted model by looking at the CDF of some error (loss function or score) of its predictions?
Well, the x axis of the curve serves as an error threshold, so the first thing you observe directly is what portion of the data has errors below any given threshold.

Secondly, it lets you compare models. In some sense, model A is better than model B if, for any threshold, A's curve lies above B's.

Finally, it shows the shape of the error distribution. If your curve grows quickly, stops at, say, 0.7, and then climbs very slowly towards 1, you know you have a long tail of samples with large errors (a kind of upward knee shape). Alternatively, if it grows a bit more slowly but saturates at 1, you see that you do not have a long tail. Plotting the curves of several models lets you compare this behavior, because "quickly" and "slowly" are better appreciated by humans relative to some baseline than as absolute qualitative concepts. And there is always a baseline on the plot: a constant predictor.

So for me these aspects were very useful. But as I said, I need to understand CAP curves more deeply to tell whether similar observations can be made using them.
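To make the comparison concrete, here is a minimal sketch in plain numpy/matplotlib, not the API from PR #31380; the helper `rec_curve` and the toy models are my own choices for illustration. It plots the empirical REC curve (the CDF of absolute errors) for two models against the constant-predictor baseline:

```python
# Minimal REC sketch: empirical CDF of absolute errors for several models,
# including the constant predictor that serves as a baseline on the plot.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_regression
from sklearn.dummy import DummyRegressor
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

def rec_curve(y_true, y_pred):
    """Sorted absolute errors and the fraction of samples at or below each."""
    errors = np.sort(np.abs(y_true - y_pred))
    coverage = np.arange(1, len(errors) + 1) / len(errors)
    return errors, coverage

X, y = make_regression(n_samples=500, n_features=10, noise=20.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    "constant (baseline)": DummyRegressor(strategy="mean"),
    "linear": LinearRegression(),
    "gradient boosting": GradientBoostingRegressor(random_state=0),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    thresholds, coverage = rec_curve(y_test, model.predict(X_test))
    plt.step(thresholds, coverage, where="post", label=name)

plt.xlabel("absolute error tolerance")
plt.ylabel("fraction of samples within tolerance")
plt.title("REC curves: higher is better at every tolerance")
plt.legend()
plt.show()
```

A curve that dominates another at every tolerance, and how quickly each one saturates relative to the constant baseline, are exactly the comparisons described above.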
Describe the workflow you want to enable
Add more fine-grained diagnostics, similar to ROC or Precision-Recall curves, for regression problems. This library has a lot of excellent tools for classification, and I believe it would benefit from some additional tools for regression.
Describe your proposed solution
Compute the Regression Error Characteristic (REC) curve [1]: for each error threshold, the percentage of samples whose error is below that threshold. This is essentially the CDF of the regression errors. Its role is similar to that of ROC curves: it allows comparing the performance profiles of regressors beyond a single summary statistic such as RMSE or MAE.
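For concreteness, a hedged sketch of the quantity being proposed; the helper name `rec_point` is mine, not from the PR, which may also support losses other than absolute error:

```python
import numpy as np

def rec_point(y_true, y_pred, tolerance):
    """One point on the REC curve: the fraction of samples whose absolute
    error is at most `tolerance`, i.e. the empirical error CDF at `tolerance`."""
    errors = np.abs(np.asarray(y_true) - np.asarray(y_pred))
    return float(np.mean(errors <= tolerance))

# Absolute errors for these toy values are [0.1, 0.5, 1.0], so two of the
# three samples fall within a tolerance of 0.5:
assert rec_point([1.0, 2.0, 3.0], [1.1, 2.5, 4.0], tolerance=0.5) == 2 / 3
```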
I have already implemented it in a pull request:
#31380
Screenshot from the pull request:
If you believe this feature is useful, please help me with reviewing and merging it.
Describe alternatives you've considered, if relevant
Regression Receiver Operating Characteristic (RROC) curves, proposed in [2], which plot over-prediction against under-prediction, are a different form of diagnostic curve for regression. They may also be useful, but we have to start somewhere, and I believe it is better to start with REC: the paper has more citations, and the curve turned out to be very useful for me at work, so I believe it can be similarly useful to other practitioners.
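For contrast, here is a rough sketch of the RROC construction as I read [2], with a hypothetical helper of my own, not part of any PR: sweep a constant shift over the predictions and trace total over-estimation against total under-estimation.

```python
import numpy as np

def rroc_points(y_true, y_pred, shifts):
    """Sketch of RROC space per my reading of [2]: for each constant shift s,
    record total over-estimation (x axis) vs. total under-estimation (y axis)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    over, under = [], []
    for s in shifts:
        err = (y_pred + s) - y_true
        over.append(err[err > 0].sum())   # total over-estimation, >= 0
        under.append(err[err < 0].sum())  # total under-estimation, <= 0
    return np.array(over), np.array(under)
```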
Additional context
References
[1]: Bi, J. and Bennett, K.P., 2003. Regression error characteristic curves. In Proceedings of the 20th international conference on machine learning (ICML-03) (pp. 43-50).
[2]: Hernández-Orallo, J., 2013. ROC curves for regression. Pattern Recognition, 46(12), pp.3395-3411.