While working on the following example (#26486), I found a couple of issues regarding the SAMME.R algorithm, which is the default algorithm in `AdaBoostClassifier`:
- The algorithm was implemented based on the following paper. However, this paper is a preprint. In the final version of the paper, the SAMME.R algorithm is not presented. So we implemented an unpublished algorithm.
- SAMME.R can show some diverging behaviour, as shown in "AdaBoost's training error can increase with a larger number of trees" (#20443, comment).
For these two reasons, I think that we should deprecate this algorithm and remove the `algorithm` parameter from `AdaBoostClassifier`.
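For context, here is a minimal sketch of the discrete SAMME update, the variant that would remain if SAMME.R were removed. It follows the update as it is commonly stated, not scikit-learn's actual code. The function names and the `learning_rate` argument are hypothetical and chosen for illustration only.

```python
import math


def samme_estimator_weight(error, n_classes, learning_rate=1.0):
    """Weight given to one weak learner in discrete SAMME.

    `error` is the weighted misclassification rate of the learner.
    The extra log(K - 1) term means a learner only needs to beat
    random guessing (error < 1 - 1/K) to receive a positive weight.
    """
    if not 0.0 < error < 1.0:
        raise ValueError("error must lie strictly between 0 and 1")
    return learning_rate * (
        math.log((1.0 - error) / error) + math.log(n_classes - 1.0)
    )


def samme_reweight(sample_weights, misclassified, alpha):
    """Boost the weights of misclassified samples, then renormalize."""
    boosted = [
        w * math.exp(alpha) if miss else w
        for w, miss in zip(sample_weights, misclassified)
    ]
    total = sum(boosted)
    return [w / total for w in boosted]


# Illustrative values: 3 classes, a learner with 50% weighted error.
alpha = samme_estimator_weight(0.5, n_classes=3)
weights = samme_reweight([0.25] * 4, [True, False, False, False], alpha)
```

With three classes, a learner at 50% error still gets a positive weight, because random guessing sits at an error of 2/3. The weight reaches zero exactly at that threshold.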
In addition, I think that we should monitor the latest work on multiclass AdaBoost, which provides additional theoretical foundations, cf. https://proceedings.neurips.cc/paper/2021/hash/17f5e6db87929fb55cebeb7fd58c1d41-Abstract.html