AdaBoostClassifier predict does not always correspond to the class with the highest probability #14084

@prabhat00155

Description

With AdaBoostClassifier, model.predict(X) is not always equal to np.argmax(model.predict_proba(X), axis=1).

Steps/Code to Reproduce

import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.ensemble import AdaBoostClassifier

data = load_digits()
X = data.data
y = data.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=42)
model = AdaBoostClassifier(base_estimator=None, n_estimators=20, learning_rate=1,
                           algorithm='SAMME', random_state=42).fit(X_train, y_train)

# Fraction of test samples where the argmax of predict_proba matches predict
print(np.mean(np.argmax(model.predict_proba(X_test), axis=1) == model.predict(X_test)))
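
The check above compares column indices from np.argmax directly with predicted labels, which works here because the digit labels are 0 through 9 and so coincide with the column positions. A slightly more general sketch maps each argmax column back to a label through the class array first and also reports which samples disagree. The helper name agreement_rate and the toy arrays are hypothetical, chosen only for illustration; they are not part of scikit-learn.

```python
import numpy as np


def agreement_rate(classes, proba, predicted):
    """Return (fraction of rows that agree, indices of rows that disagree).

    classes   : label for each probability column (e.g. model.classes_)
    proba     : array of shape (n_samples, n_classes)
    predicted : array of shape (n_samples,) holding predicted labels
    """
    classes = np.asarray(classes)
    proba = np.asarray(proba)
    predicted = np.asarray(predicted)
    # Map each row's highest-probability column back to its class label.
    top_labels = classes[np.argmax(proba, axis=1)]
    mismatches = np.flatnonzero(top_labels != predicted)
    return float(np.mean(top_labels == predicted)), mismatches


# Toy data (hypothetical), with labels that differ from column indices.
classes = np.array([3, 5, 7])
proba = np.array([[0.2, 0.5, 0.3],
                  [0.6, 0.3, 0.1],
                  [0.1, 0.2, 0.7]])
predicted = np.array([5, 3, 5])

rate, mismatches = agreement_rate(classes, proba, predicted)
print(rate, mismatches)
```

With the fitted model from the reproduction script, the equivalent call would be agreement_rate(model.classes_, model.predict_proba(X_test), model.predict(X_test)).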

Expected Results

1.0

print(model.predict_proba(X_test)[0]) gives the class probability scores for the first test example:
[0.09949952 0.1000983  0.10002473 0.09970723 0.10019514 0.10022755
 0.10087143 0.09974335 0.10005309 0.09957965]

Class 6 clearly has the highest probability (0.10087143), but predict() returns class 5:

print(model.predict(X_test)[0])
Result: 5
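
As a quick sanity check on the numbers quoted above, the following sketch ranks the ten probabilities for the first test example, with the values copied from the printed output. It only inspects that one vector and makes no claim about why predict() and predict_proba() disagree.

```python
import numpy as np

# Probabilities for the first test example, copied from the output above.
proba_first = np.array([
    0.09949952, 0.1000983, 0.10002473, 0.09970723, 0.10019514,
    0.10022755, 0.10087143, 0.09974335, 0.10005309, 0.09957965,
])

# Column indices sorted from highest to lowest probability.
order = np.argsort(proba_first)[::-1]
top, runner_up = int(order[0]), int(order[1])
margin = float(proba_first[top] - proba_first[runner_up])

# top is 6 and runner_up is 5 for this vector.
print(top, runner_up, margin)
```

The probabilities are close to uniform (all near 0.1), and the runner-up column is class 5, the label that predict() returned for this example.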

Actual Results

0.44048943270300334

Versions

System:
python: 3.6.8 (v3.6.8:3c6b436a57, Dec 24 2018, 02:04:31) [GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)]
executable: /Users/prroy/Documents/MachineLearning/onnx_projects/onnx_env/bin/python3.6
machine: Darwin-18.6.0-x86_64-i386-64bit

BLAS:
macros: NO_ATLAS_INFO=3, HAVE_CBLAS=None
lib_dirs:
cblas_libs: cblas

Python deps:
pip: 19.1.1
setuptools: 41.0.1
sklearn: 0.21.2
numpy: 1.16.4
scipy: 1.1.0
Cython: None
pandas: 0.23.4
