[Question] Is auto-sklearn ensemble for classification using predict_proba + threshold adaptions #1549

@ggjj11

Short Question Description

Does the auto-sklearn ensemble for classification use predict_proba plus threshold adaptation?

I read the original paper and learned about the way auto-sklearn uses a "metric" to select models for an ensemble. The candidate models were trained during hyperparameter optimization and are therefore rather diverse.
When e.g. recall/precision/fscore/... is selected as the metric for the final ensemble, several of these explored models are added to the ensemble based on their performance on a validation set.

Now recall/precision/fscore... depend on the decision threshold (which is typically assumed to be 0.5).
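
To illustrate what I mean, here is a minimal sketch in plain Python. The labels and probabilities are made-up toy values and `precision_recall_f1` is a hypothetical helper, not auto-sklearn code. It binarizes the same predicted probabilities at different thresholds and shows that all three metrics move:

```python
def precision_recall_f1(y_true, y_prob, threshold=0.5):
    """Binarize probabilities at `threshold`, then compute precision, recall and F1."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy validation data (made up for illustration): true labels and the
# positive-class probabilities a classifier might return from predict_proba.
y_true = [0, 0, 0, 1, 0, 1, 1, 1]
y_prob = [0.10, 0.20, 0.35, 0.40, 0.55, 0.60, 0.80, 0.90]

for threshold in (0.3, 0.5, 0.7):
    p, r, f = precision_recall_f1(y_true, y_prob, threshold)
    print(f"threshold={threshold:.1f}  precision={p:.2f}  recall={r:.2f}  f1={f:.2f}")
```

On this toy data, lowering the threshold from 0.5 to 0.3 raises recall from 0.75 to 1.0 at the cost of precision, while raising it to 0.7 does the opposite.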

Does auto-sklearn use e.g. the precision-recall curves of the classifiers already trained during hyperparameter optimization to identify the best-performing models?

The threshold is not a typical hyperparameter like e.g. the depth of trees (in decision tree classifiers), but rather a hyperparameter of a more subtle kind.
Let me still call it the "threshold hyperparameter" because it affects prediction results.
Is this threshold hyperparameter considered when building the final ensemble?
I could not find documentation on how exactly the ensemble building takes place, or on whether the threshold is treated as a kind of pseudo-hyperparameter for classification. In any case, it would be rather cheap to check whether e.g. recall can be improved for a base classifier by changing the decision threshold.
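
As a rough sketch of the kind of check I have in mind (again plain Python on made-up toy data; `f1_at_threshold` and `best_threshold` are hypothetical names, and I am not claiming auto-sklearn does anything like this): given the validation-set probabilities of one already trained model, try each distinct probability as a threshold and keep the one with the best score.

```python
def f1_at_threshold(y_true, y_prob, threshold):
    """F1 score after binarizing probabilities at `threshold`."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_threshold(y_true, y_prob, score=f1_at_threshold):
    """Return the candidate threshold with the highest score.

    Candidates are the distinct predicted probabilities, since the
    predictions (and hence the score) only change at those values.
    """
    candidates = sorted(set(y_prob))
    return max(candidates, key=lambda t: score(y_true, y_prob, t))


# Toy validation data (made up for illustration).
y_true = [0, 0, 0, 1, 0, 1, 1, 1]
y_prob = [0.10, 0.20, 0.35, 0.40, 0.55, 0.60, 0.80, 0.90]

t_best = best_threshold(y_true, y_prob)
print("F1 at default 0.5:", f1_at_threshold(y_true, y_prob, 0.5))
print("best threshold:", t_best, "F1:", f1_at_threshold(y_true, y_prob, t_best))
```

No retraining is needed: the sweep only re-thresholds probabilities that already exist. In a real setting the threshold would have to be chosen on data separate from the data used to report the final score, otherwise the improvement is optimistic.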

Thank you in advance for the great software!
