Support for regression by classification #15850
Why do you want to evaluate on the binned targets? I'd assume that you ultimately want to evaluate on the continuous targets, but might want a classification evaluation as a diagnostic. This doesn't actually fit the resampling case nicely, because resamplers do not modify the predicted data, and indeed I find it quite strange that one could use a resampler to change the prediction space. I don't really see why regression by classification could not be supported by TransformedTargetRegressor.
This is what is done in the referenced article, and you are right, this can be handled by TransformedTargetRegressor. I am not fully comfortable with evaluating in the continuous space, because the inverse of the binning transform is ambiguous due to rounding errors. I would rather switch to classification metrics altogether, especially for small n_bins. My approach to regression by classification would be more accurately described as formulating regression problems as classification problems. It would be nice to be able to "skip" the inverse transformation with TransformedTargetRegressor.
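For context, here is a minimal sketch of the TransformedTargetRegressor approach being discussed (not code from the thread; the classifier choice and synthetic data are illustrative assumptions). check_inverse=False silences the round-trip warning that the lossy binning would otherwise trigger, which is exactly the ambiguity mentioned above:

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import KBinsDiscretizer

# Regression by classification: y is binned into ordinal codes, a
# classifier predicts the bin, and inverse_transform maps the predicted
# bin back to its bin center, so predictions stay in the continuous space.
reg_by_clf = TransformedTargetRegressor(
    regressor=RandomForestClassifier(random_state=0),
    transformer=KBinsDiscretizer(n_bins=5, encode="ordinal",
                                 strategy="quantile"),
    check_inverse=False,  # binning is lossy, so the inverse check would warn
)

rng = np.random.RandomState(0)
X = rng.uniform(size=(200, 3))
y = X.sum(axis=1)
reg_by_clf.fit(X, y)
y_pred = reg_by_clf.predict(X)  # continuous values (bin centers)
```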
@dabana Since your task is regression (independently of how you do it), you should use regression evaluation metrics. A corner case would be ordinal regression, where I suppose you need evaluation metrics for both domains (classification and regression), but this is not the case here.
Yes, evaluating faithfully to the task (and not necessarily faithfully to how you model it) is a very essential part of predictive learning. If you believe that the task is still meaningfully evaluated in the transformed space, then you should be able to discretise y using task-related rules before applying scikit-learn.

If TransformedTargetRegressor supports regression by classification, evaluating in the original space, then I think that is what we should support, and this issue can be closed (although maybe we could do with an example of this technique in our gallery).
But if it is already supported, why not support evaluation in the transformed space too? It could just be a simple boolean input to TransformedTargetRegressor. The task at hand has identifiable "task-related rules" for creating classes, it is true. But the class definitions are ambiguous (region/regulation dependent, for instance). This is why we opted for regression in the first place. But we are looking back at classification for several reasons.
@jnothman I think you should leave the issue open.
(the above might need some tweaking)
Sorry @jnothman, I am not sure why evaluating in the transformed space is "bad practice in general". Can you explain a bit more?
Or something like:

```python
from sklearn.metrics import get_scorer


class TransformedTargetScorer:
    def __init__(self, scoring):
        self.scorer = get_scorer(scoring)

    def __call__(self, estimator, X, y_true):
        # Score against the transformed (binned) targets when the
        # estimator knows how to produce them.
        if hasattr(estimator, 'get_transformed_targets'):
            y_true = estimator.get_transformed_targets(X, y_true)
        return self.scorer(estimator, X, y_true)
```

Thanks for the discussion. This is the way I implemented it.
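Such a callable can be passed directly as the scoring argument of scikit-learn's model-selection utilities. A hypothetical usage sketch, assuming model is the home-brewed meta-estimator exposing get_transformed_targets:

```python
from sklearn.model_selection import cross_val_score

# `model`, `X`, `y` are assumed from the surrounding discussion; the scorer
# falls back to the raw targets for estimators without the custom hook.
scores = cross_val_score(model, X, y,
                         scoring=TransformedTargetScorer("accuracy"), cv=5)
```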
We try to encourage good practice, particularly around evaluation. Evaluating in the classification space does not tell you how well you solved the regression problem. I think the API can make it possible, but it should not make it too easy or the default.
Okay, fine with me if you close. I found an article on arXiv about discretizing the targets (action space) in reinforcement learning, but that's about it. Most of the time, discretization is performed on the features.
But that has an extrinsic evaluation.
See #15872.
@dabana What I can see from this issue is that you might have a real-world and maybe good use case for applying a "regression by classification" approach using several components from scikit-learn. This is actually useful, and we could think about making an example documenting how to solve such problems: (i) identify when it will be profitable for someone, (ii) how to build a machine learning model in this case, (iii) how to evaluate such a model, and (iv) compare with some other baseline to show the benefit.
I would say that this does not meet the scikit-learn inclusion criteria. The discussion then shifted to whether or not to evaluate a model on the transformed scale (discretized, in this example). While some interpretability tools can make (more) sense on the transformed scale, I second @jnothman's comment about bad practice. Summary: I'm closing this issue.
That is correct. Although I think that we could include an example that leverages the aforementioned technique, as we did with the inductive clustering example.
@chkoar From a statistical point of view, I have major concerns about converting a regression task on a continuous target into a classification task. Therefore, I would rather not put it in an example.
scikit-learn has all the components needed to implement the regression-by-discretization approach. If I am not mistaken, even WEKA includes this meta-estimator by default. Since we always evaluate using cross-validation, I would appreciate it if you could list your major concerns regarding the approach. Thanks.
Fair enough. But let's not make a long discussion out of it. (Disclaimer: maybe some points are misplaced, as I have not studied the approach in detail.)
@lorentzenchr I found the closing a bit abrupt :) I completely agree with your point regarding the non-inclusion of a potential meta-estimator, and with the danger and bad/wrong practice of evaluating a regression problem via the underlying classification proxy problem. Regarding not introducing an example, I would not be as categorical: if there is a meaningful and well-defined problem where it makes sense, I would not be against it. Your arguments seem fair to me and really meaningful for linear models. However, I am not sure that tree-based/rule-based models would not benefit from the classification proxy problem. But it might be possible that there are better alternatives (skope-rules for regression?) out there.
Indeed. That's why I took the time to lay out my reasons. And as a contributor, the most frustrating experience is not getting a response at all. This one was stalled for 1.5 years.
As a reference for more scientific reasoning, V. Fedorov, F. Mannino, and R. Zhang, "Consequences of dichotomization", doi: 10.1002/pst.331, concludes:
I just read the abstract. Is it only looking at binarizing the output?
It means transforming/discretizing the observed y.
Thanks for the reference. It looks compelling.
As info: there is now a longer discussion here: https://stats.stackexchange.com/questions/565537/is-there-ever-a-reason-to-solve-a-regression-problem-as-a-classification-problem
My team and I are working on an application of regression by classification, a technique described in this article.
In a nutshell, regression by classification means approaching a regression problem with multi-class classification algorithms. The key part of this technique is to perform discretization, or binning, of the (continuous) target prior to classification. The article mentions three different approaches for target discretization, all of which are supported by sklearn's KBinsDiscretizer.
In regression by classification, the choice of the number of classes, the n_bins parameter, is critical. One straightforward way to tune this parameter, and to choose the binning strategy, is cross-validation, as sketched below. But because transformations on y (see #4143) are currently forbidden in scikit-learn, this is not "natively" supported.
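Once the y-binning is wrapped in a meta-estimator (for instance the TransformedTargetRegressor sketch earlier in this thread), such tuning becomes ordinary grid search over nested parameters. A hedged sketch, evaluating in the original continuous space; the grids and classifier are illustrative assumptions:

```python
from sklearn.compose import TransformedTargetRegressor
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.preprocessing import KBinsDiscretizer

# Tune the number of bins and the binning strategy by cross-validation,
# scoring with a regression metric on the continuous predictions.
reg_by_clf = TransformedTargetRegressor(
    regressor=RandomForestClassifier(random_state=0),
    transformer=KBinsDiscretizer(encode="ordinal"),
    check_inverse=False,
)
param_grid = {
    "transformer__n_bins": [3, 5, 10, 20],
    "transformer__strategy": ["uniform", "quantile", "kmeans"],
}
search = GridSearchCV(reg_by_clf, param_grid,
                      scoring="neg_mean_absolute_error", cv=5)
```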
We found a way around this by creating our own meta-estimator, as suggested by @jnothman elsewhere. But one problem remained: how can we tell scikit-learn to compute evaluation metrics on the BINNED targets, and not the original CONTINUOUS targets?
We achieved this by hacking the _PredictScorer class on our scikit-learn fork. The hack looks for a special custom method called get_transformed_targets on our home-brewed meta-estimator. If this method is present, the score is computed using the transformed (binned) targets. Here is the hack:
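The original snippet does not appear to have survived in this archive. A hedged reconstruction from the description above (not the author's exact code; the module path and _score signature follow 0.22-era scikit-learn and may differ in other versions):

```python
from sklearn.metrics._scorer import _BaseScorer


class _PredictScorer(_BaseScorer):
    def _score(self, method_caller, estimator, X, y_true, sample_weight=None):
        y_pred = method_caller(estimator, "predict", X)
        # The hack: if the meta-estimator exposes get_transformed_targets,
        # score against the binned targets instead of the continuous ones.
        if hasattr(estimator, "get_transformed_targets"):
            y_true = estimator.get_transformed_targets(X, y_true)
        if sample_weight is not None:
            return self._sign * self._score_func(
                y_true, y_pred, sample_weight=sample_weight, **self._kwargs)
        return self._sign * self._score_func(y_true, y_pred, **self._kwargs)
```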
Another problem we encountered was using the KBinsDiscretizer class on targets. We plan on doing this with a custom meta-transformer.
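In the meantime, binning a 1-d target with KBinsDiscretizer only requires a reshape; a minimal sketch on synthetic data:

```python
import numpy as np
from sklearn.preprocessing import KBinsDiscretizer

# KBinsDiscretizer expects a 2-d array, so reshape the target, bin it,
# and flatten the ordinal codes back into 1-d class labels.
rng = np.random.RandomState(0)
y = rng.exponential(size=100)
disc = KBinsDiscretizer(n_bins=5, encode="ordinal", strategy="quantile")
y_binned = disc.fit_transform(y.reshape(-1, 1)).ravel().astype(int)
```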
It would be nice if regression by classification were supported by scikit-learn out of the box. Perhaps the resampling options coming soon will make this possible, but that will have to be tested.