Description
Describe the bug
When training and evaluating a support vector machine in scikit-learn, I see some unexpected behaviour, and I am wondering whether I am doing something wrong or whether this is a bug.
In a very specific set of circumstances, namely:
- LeaveOneOut() is used as the cross-validation procedure
- The SVM is used with probability=True and a small C such as 0.01
- The y labels are balanced (i.e. the mean of y is 0.5)
The results of the trained SVM are very good on randomly generated data, while they should be near chance. If the y labels are not exactly balanced, or the SVM is swapped out for a LogisticRegression, the results are as expected (Brier of 0.25, AUC near 0.5).
But under the circumstances listed above, with balanced y labels, the Brier score is roughly 0.10 - 0.15 and the AUC is > 0.9.
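One property of this setup that may be worth keeping in mind while triaging: with perfectly balanced labels, every leave-one-out training set is slightly unbalanced in the direction opposite to the held-out label. The sketch below only illustrates that arithmetic. The helper name `loo_training_means` is made up for this example, and this is not a claim about what SVC does internally.

```python
def loo_training_means(y):
    """For each held-out label, return (held_out_label, mean of the remaining labels)."""
    total = sum(y)
    n = len(y)
    return [(label, (total - label) / (n - 1)) for label in y]


y = [0, 1] * 10  # 20 balanced labels, as in the report (N = 20)
pairs = loo_training_means(y)

# Only two distinct cases exist: holding out a 0 or holding out a 1.
for label, train_mean in sorted(set(pairs)):
    print(f"held-out label {label}: training mean {train_mean:.4f}")
# held-out label 0: training mean 0.5263
# held-out label 1: training mean 0.4737
```

So the training-set class prior alone carries information about the held-out label in this configuration, which any model that leans on the prior could pick up, in either direction.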
Steps/Code to Reproduce
from sklearn import svm
from sklearn.linear_model import LogisticRegression
import numpy as np
from sklearn.model_selection import GridSearchCV, StratifiedKFold, LeaveOneOut, KFold
from sklearn.metrics import roc_auc_score, brier_score_loss
from tqdm import tqdm
import pandas as pd

N = 20
N_FEATURES = 50

scores = []
for z in tqdm(range(500)):
    # Pure noise: features and labels are independent
    X = np.random.normal(0, 1, size=(N, N_FEATURES))
    y = np.random.binomial(1, 0.5, size=N)

    # Force exactly balanced labels for the first 10 repetitions
    if z < 10:
        y = np.array([0, 1] * int(N/2))
        y = np.random.permutation(y)

    y_real, y_pred = [], []
    skf_outer = LeaveOneOut()
    for train_index, test_index in skf_outer.split(X, y):
        X_train, X_test = X[train_index], X[test_index, :]
        y_train, y_test = y[train_index], y[test_index]

        clf = svm.SVC(probability=True, C=0.01)
        clf.fit(X_train, y_train)
        predictions = clf.predict_proba(X_test)[:, 1]

        # Pool the held-out predictions over all leave-one-out folds
        y_pred.extend(predictions)
        y_real.extend(y_test)

    scores.append([np.mean(y),
                   brier_score_loss(np.array(y_real), np.array(y_pred)),
                   roc_auc_score(np.array(y_real), np.array(y_pred))])

# Average the scores separately for balanced and non-balanced label sets
df_scores = pd.DataFrame(scores)
df_scores.columns = ['y_label', 'brier', 'auc']
df_scores['y_0.5'] = df_scores['y_label'] == 0.5
df_scores = df_scores.groupby(['y_0.5']).mean()
print(df_scores)
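The comparison with LogisticRegression mentioned in the description can be sketched roughly as follows, for a single balanced dataset. The helper `loo_scores` and the fixed seed are assumptions added for this example (the script above is unseeded), and no particular output is claimed here.

```python
import numpy as np
from sklearn import svm
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import brier_score_loss, roc_auc_score
from sklearn.model_selection import LeaveOneOut


def loo_scores(make_clf, X, y):
    """Pool leave-one-out probabilities and return (brier, auc)."""
    y_real, y_pred = [], []
    for train_index, test_index in LeaveOneOut().split(X, y):
        clf = make_clf()  # fresh estimator for every fold
        clf.fit(X[train_index], y[train_index])
        y_pred.extend(clf.predict_proba(X[test_index])[:, 1])
        y_real.extend(y[test_index])
    y_real, y_pred = np.array(y_real), np.array(y_pred)
    return brier_score_loss(y_real, y_pred), roc_auc_score(y_real, y_pred)


rng = np.random.default_rng(0)  # fixed seed: an assumption, not in the original script
X = rng.normal(0, 1, size=(20, 50))
y = rng.permutation(np.array([0, 1] * 10))  # exactly balanced labels

results = {
    "SVC": loo_scores(lambda: svm.SVC(probability=True, C=0.01), X, y),
    "LogisticRegression": loo_scores(lambda: LogisticRegression(), X, y),
}
for name, (brier, auc) in results.items():
    print(f"{name}: brier={brier:.3f}, auc={auc:.3f}")
```

Passing a factory (`make_clf`) rather than an instance keeps the folds independent and makes it easy to try other estimators or other values of C on the same data.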
Expected Results
I would expect that all results would be somewhat similar, with a Brier ~0.25 and AUC ~0.5.
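For reference, the 0.25 figure is the Brier score of a predictor that always outputs 0.5, since (0.5 - y)^2 = 0.25 for both y = 0 and y = 1. A minimal sketch of that arithmetic, using a hand-written `brier` helper (hypothetical, for illustration) instead of brier_score_loss:

```python
def brier(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)


y_true = [0, 1] * 10  # 20 balanced labels
chance = brier(y_true, [0.5] * len(y_true))
print(chance)  # 0.25
```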
Actual Results
        y_label     brier       auc
y_0.5
False  0.514649  0.298204  0.216884
True   0.500000  0.159728  0.999080
Here you can see that when the np.mean of the y labels is exactly 0.5, the results are very good, even though the data is randomly generated in each of the 500 repetitions.
Versions
System:
python: 3.8.15 (default, Nov 24 2022, 14:38:14) [MSC v.1916 64 bit (AMD64)]
executable: C:\ProgramData\Anaconda3\envs\test\python.exe
machine: Windows-10-10.0.19044-SP0
Python dependencies:
sklearn: 1.2.0
pip: 22.2.2
setuptools: 61.2.0
numpy: 1.19.5
scipy: 1.10.0
Cython: 0.29.14
pandas: 1.4.4
matplotlib: 3.6.3
joblib: 1.2.0
threadpoolctl: 2.2.0
Built with OpenMP: True
threadpoolctl info:
filepath: C:\ProgramData\Anaconda3\envs\test\Library\bin\mkl_rt.1.dll
prefix: mkl_rt
user_api: blas
internal_api: mkl
version: 2021.4-Product
num_threads: 8
threading_layer: intel
filepath: C:\Users\manuser\AppData\Roaming\Python\Python38\site-packages\scipy.libs\libopenblas-802f9ed1179cb9c9b03d67ff79f48187.dll
prefix: libopenblas
user_api: blas
internal_api: openblas
version: 0.3.18
num_threads: 16
threading_layer: pthreads
architecture: Prescott
filepath: C:\ProgramData\Anaconda3\envs\test\Lib\site-packages\sklearn\.libs\vcomp140.dll
prefix: vcomp
user_api: openmp
internal_api: openmp
version: None
num_threads: 8
filepath: C:\ProgramData\Anaconda3\envs\test\Library\bin\libiomp5md.dll
prefix: libiomp
user_api: openmp
internal_api: openmp
version: None
num_threads: 8
filepath: C:\Users\manuser\AppData\Roaming\Python\Python38\site-packages\mxnet\libopenblas.dll
prefix: libopenblas
user_api: blas
internal_api: openblas
version: None
num_threads: 16
threading_layer: pthreads
architecture: Prescott
filepath: C:\ProgramData\Anaconda3\envs\test\Lib\site-packages\torch\lib\libiomp5md.dll
prefix: libiomp
user_api: openmp
internal_api: openmp
version: None
num_threads: 16
filepath: C:\ProgramData\Anaconda3\envs\test\Lib\site-packages\torch\lib\libiompstubs5md.dll
prefix: libiomp
user_api: openmp
internal_api: openmp
version: None
num_threads: 1