MAINT replace obj and grad in sigmoid calibration with private loss module #27185
Conversation
I tried replacing with the loss_gradient method of HalfBinomialLoss, but some of the tests fail, including the very first one:

FAILED [ 75%]
RUNNING THE L-BFGS-B CODE
* * *
Machine precision = 2.220D-16
N = 2 M = 10
At X0 0 variables are exactly at the bounds
At iterate 0 f= 3.26474D+01 |proj g|= 1.74728D+00
At iterate 1 f= 3.08460D+01 |proj g|= 2.65248D+00
At iterate 2 f= 2.57930D+01 |proj g|= 1.17055D+00
At iterate 3 f= 2.54497D+01 |proj g|= 1.27662D+00
At iterate 4 f= 2.50560D+01 |proj g|= 3.80842D-01
At iterate 5 f= 2.49982D+01 |proj g|= 5.93913D-02
At iterate 6 f= 2.49966D+01 |proj g|= 4.17555D-03
At iterate 7 f= 2.49966D+01 |proj g|= 6.16637D-05
At iterate 8 f= 2.49966D+01 |proj g|= 1.31444D-06
At iterate 9 f= 2.49966D+01 |proj g|= 1.47570D-07
* * *
Tit = total number of iterations
Tnf = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip = number of BFGS updates skipped
Nact = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F = final function value
* * *
N Tit Tnf Tnint Skip Nact Projg F
2 9 11 1 0 0 1.476D-07 2.500D+01
F = 24.996567512257208
CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL
This problem is unconstrained.
sklearn/tests/test_calibration.py:60 (test_calibration[False-sigmoid])
0.1691915026787319 != 0.4701915941927794
Expected :0.4701915941927794
Actual :0.1691915026787319
data = (array([[1.55552559, 2.69252392, 3.45217209, 3.20990507, 3.54706644,
4.39080293],
[1.46335164, 2.521954..., 1, 1, 0, 0, 1, 1, 0, 1, 1, 0,
0, 1, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0,
1, 0]))
method = 'sigmoid', ensemble = False
@pytest.mark.parametrize("method", ["sigmoid", "isotonic"])
@pytest.mark.parametrize("ensemble", [True, False])
def test_calibration(data, method, ensemble):
# Test calibration objects with isotonic and sigmoid
n_samples = N_SAMPLES // 2
X, y = data
sample_weight = np.random.RandomState(seed=42).uniform(size=y.size)
X -= X.min() # MultinomialNB only allows positive X
# split train and test
X_train, y_train, sw_train = X[:n_samples], y[:n_samples], sample_weight[:n_samples]
X_test, y_test = X[n_samples:], y[n_samples:]
# Naive-Bayes
clf = MultinomialNB(force_alpha=True).fit(X_train, y_train, sample_weight=sw_train)
prob_pos_clf = clf.predict_proba(X_test)[:, 1]
cal_clf = CalibratedClassifierCV(clf, cv=y.size + 1, ensemble=ensemble)
with pytest.raises(ValueError):
cal_clf.fit(X, y)
# Naive Bayes with calibration
for this_X_train, this_X_test in [
(X_train, X_test),
(sparse.csr_matrix(X_train), sparse.csr_matrix(X_test)),
]:
cal_clf = CalibratedClassifierCV(clf, method=method, cv=5, ensemble=ensemble)
# Note that this fit overwrites the fit on the entire training
# set
cal_clf.fit(this_X_train, y_train, sample_weight=sw_train)
prob_pos_cal_clf = cal_clf.predict_proba(this_X_test)[:, 1]
# Check that brier score has improved after calibration
> assert brier_score_loss(y_test, prob_pos_clf) > brier_score_loss(
y_test, prob_pos_cal_clf
)
E assert 0.1691915026787319 > 0.4701915941927794
E + where 0.1691915026787319 = brier_score_loss(array([1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 1, 1, 1,\n 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1,\n 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0,\n 1, 1, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0,\n 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 1, 0]), array([0.59754135, 0.39173286, 0.57057784, 0.63618551, 0.5053372 ,\n 0.43009297, 0.3746049 , 0.42128169, 0.69084662, 0.47480948,\n 0.4898935 , 0.64080595, 0.37208411, 0.73741164, 0.1665211 ,\n 0.70891458, 0.38883098, 0.55419618, 0.84667265, 0.42419911,\n 0.49903646, 0.57665258, 0.47541526, 0.5009973 , 0.40126043,\n 0.54118915, 0.42945365, 0.47340528, 0.25394641, 0.69248401,\n 0.37824462, 0.62401919, 0.26457841, 0.79458909, 0.27360969,\n 0.48189339, 0.17265969, 0.24574674, 0.41905575, 0.52456954,\n 0.59142695, 0.31075045, 0.24010821, 0.64236825, 0.52948367,\n 0.67649459, 0.78925604, 0.60921218, 0.68232101, 0.8009755 ,\n 0.5812186 , 0.38947681, 0.2810683 , 0.71968707, 0.41020851,\n 0.5230645 , 0.65561618, 0.69341376, 0.55990113, 0.25885757,\n 0.64038658, 0.53987922, 0.35631448, 0.3109083 , 0.71152262,\n 0.26966987, 0.80812273, 0.46855832, 0.24488214, 0.45165938,\n 0.66380173, 0.68573539, 0.3636395 , 0.57207788, 0.70043097,\n 0.25805855, 0.35951421, 0.64611014, 0.49312859, 0.6162724 ,\n 0.5742364 , 0.67348769, 0.1540878 , 0.30242571, 0.55782435,\n 0.41457239, 0.60817683, 0.36436228, 0.61544405, 0.2510021 ,\n 0.76149752, 0.65839788, 0.48159576, 0.56249346, 0.74645415,\n 0.64149181, 0.2531176 , 0.42340659, 0.73080826, 0.57605868]))
E + and 0.4701915941927794 = brier_score_loss(array([1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 1, 1, 1,\n 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1,\n 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0,\n 1, 1, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0,\n 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 1, 0]), array([0.3653254 , 0.71508461, 0.41110395, 0.30389509, 0.52682038,\n 0.65605144, 0.73938161, 0.67013389, 0.22795215, 0.58073866,\n 0.55425609, 0.29694746, 0.74284203, 0.17464224, 0.92632551,\n 0.2060037 , 0.7192958 , 0.43974551, 0.0882795 , 0.66550337,\n 0.53804273, 0.40062357, 0.57968298, 0.53455383, 0.70099493,\n 0.46278611, 0.65708288, 0.58318289, 0.87057502, 0.22589694,\n 0.73433227, 0.32261735, 0.86175958, 0.12323191, 0.85388024,\n 0.5683499 , 0.92327137, 0.87704286, 0.67364487, 0.4924455 ,\n 0.37552735, 0.81751885, 0.88132764, 0.29461912, 0.48366076,\n 0.24652658, 0.12741426, 0.34619042, 0.23886518, 0.11837926,\n 0.39280497, 0.71836187, 0.84709538, 0.19368202, 0.68740555,\n 0.49513718, 0.2753109 , 0.2247358 , 0.42971523, 0.86656397,\n 0.29757425, 0.465117 , 0.76379918, 0.81735031, 0.20296827,\n 0.85736236, 0.11314554, 0.59158768, 0.87770841, 0.62044615,\n 0.26378103, 0.23445207, 0.75421363, 0.40850811, 0.21610924,\n 0.86722364, 0.75964402, 0.28908601, 0.54853062, 0.33484727,\n 0.40478175, 0.25054458, 0.93217108, 0.82623702, 0.43336033,\n 0.68065745, 0.34786902, 0.75325372, 0.33616859, 0.87293016,\n 0.1511758 , 0.27135783, 0.56887225, 0.42517594, 0.1655119 ,\n 0.295924 , 0.87124171, 0.66676447, 0.18155745, 0.40164436]))
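One plausible reading of the failure above is that the calibrated probabilities come out roughly inverted (close to 1 − p), which is exactly what a sign error in the fitted A/B parameters would produce. A minimal, self-contained illustration of that effect on the Brier score (the arrays here are made up, not taken from the test):

```python
import numpy as np
from sklearn.metrics import brier_score_loss

y_true = np.array([1, 0, 0, 1, 1, 0])
p_good = np.array([0.9, 0.2, 0.1, 0.8, 0.7, 0.3])

good = brier_score_loss(y_true, p_good)    # well-aligned probabilities
bad = brier_score_loss(y_true, 1 - p_good) # same probabilities, flipped

# Flipping the probabilities makes the Brier score much worse, mirroring
# the assertion failure where "calibration" degraded the score.
assert bad > good
```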
…to match the original setup
The CI failure is related to missing docstrings for the parameters in HalfBinomialLoss.
I don’t know why the functional tests pass, nor why the docstring test fails.
sklearn/calibration.py (Outdated)

    return np.array([dA, dB])

    def loss_grad(AB):
        P = bin_loss.link.inverse(-(AB[0] * F + AB[1]))
        raw_prediction = bin_loss.link.link(P)
I think you had it right before: just pass raw_predictions=-(AB[0] * F + AB[1])
The minus sign is due to the convention in Platt’s paper.
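For reference, here is a hedged sketch of what such a closure computes, written with plain NumPy/SciPy rather than the private sklearn._loss module; make_loss_grad, F (decision values) and T (targets) are illustrative names, and the minus sign follows Platt's parametrisation p = sigmoid(-(A*F + B)):

```python
import numpy as np
from scipy.special import expit, log_expit

def make_loss_grad(F, T, sample_weight=None):
    """Build a fun-and-grad closure suitable for scipy.optimize.minimize."""
    sw = np.ones_like(F) if sample_weight is None else sample_weight

    def loss_grad(AB):
        # Platt's convention: p = sigmoid(-(A*F + B)), hence the minus sign.
        raw = -(AB[0] * F + AB[1])
        p = expit(raw)
        # Binary cross-entropy; log_expit keeps the logs numerically stable.
        loss = -np.sum(sw * (T * log_expit(raw) + (1 - T) * log_expit(-raw)))
        # dL/d(raw) = p - T; chain rule with d(raw)/dA = -F, d(raw)/dB = -1.
        g = sw * (p - T)
        grad = np.array([-np.sum(g * F), -np.sum(g)])
        return loss, grad

    return loss_grad
```

A closure like this can be passed to scipy.optimize.minimize with jac=True and method="L-BFGS-B", matching the loss-and-gradient pair the solver output above consumes.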
Thanks for the review @lorentzenchr.
I'm a bit worried why the tests still passed with wrong formulas.
I think when we take the expit and then the logit we get the same value back. For example:

from scipy.special import expit, logit

A = -1.5
P = expit(A)
Q = logit(P)
>>> Q
-1.5

A = 0.51
P = expit(A)
Q = logit(P)
>>> Q
0.51
For the docstring error I debugged it a bit. It seems that the way in which we pick up imported modules causes HalfBinomialLoss to be picked up since it is imported now in calibration.py and the file calibration.py (module sklearn.calibration) is public. Previously it seems that HalfBinomialLoss was not being picked up from anywhere else.
What do you suggest @lorentzenchr?
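To illustrate the assumed mechanism (a simplified sketch, not sklearn's actual docstring test): any class imported into a public module becomes a member of that module, so an introspection-based checker that walks public modules suddenly sees it.

```python
import inspect
import types

# Build a stand-in "public module" and import a class into it, the way
# calibration.py now imports HalfBinomialLoss.  Fraction merely stands in
# for the private loss class here.
mod = types.ModuleType("public_mod")
exec("from fractions import Fraction", mod.__dict__)

# A checker that enumerates classes with inspect.getmembers now picks the
# imported class up as if it belonged to the module itself.
members = [name for name, obj in inspect.getmembers(mod, inspect.isclass)]
assert "Fraction" in members
```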
For the time being, I'm fine with adding it to _DOCSTRING_IGNORES.
Almost there…
LGTM
@ogrisel Could you also kindly review this PR since you have the context regarding it?
I think we should add a changelog entry. The user-facing behavior should mostly be the same, but it can be helpful to document that the internals have been refactored in case it causes unexpected changes for some users. Could you also run a quick benchmark on some data of your choice, to check that there is no performance regression in case the changes in the optimizer settings have a more significant impact than anticipated?
What should be the tag for this entry?
On main

Iteration 0 fit time: 0.04736s
Iteration 0 predict time: 0.00310s
-----------------------
Iteration 1 fit time: 0.04645s
Iteration 1 predict time: 0.00295s
-----------------------
Iteration 2 fit time: 0.04595s
Iteration 2 predict time: 0.00295s
-----------------------
Iteration 3 fit time: 0.04613s
Iteration 3 predict time: 0.00294s
-----------------------
Iteration 4 fit time: 0.04587s
Iteration 4 predict time: 0.00294s
-----------------------
Iteration 5 fit time: 0.04691s
Iteration 5 predict time: 0.00316s
-----------------------
Iteration 6 fit time: 0.04640s
Iteration 6 predict time: 0.00289s
-----------------------
Iteration 7 fit time: 0.04608s
Iteration 7 predict time: 0.00290s
-----------------------
Iteration 8 fit time: 0.04639s
Iteration 8 predict time: 0.00296s
-----------------------
Iteration 9 fit time: 0.04655s
Iteration 9 predict time: 0.00298s
-----------------------

On this branch

Iteration 0 fit time: 0.03443s
Iteration 0 predict time: 0.00310s
-----------------------
Iteration 1 fit time: 0.03200s
Iteration 1 predict time: 0.00298s
-----------------------
Iteration 2 fit time: 0.03119s
Iteration 2 predict time: 0.00294s
-----------------------
Iteration 3 fit time: 0.03169s
Iteration 3 predict time: 0.00293s
-----------------------
Iteration 4 fit time: 0.03160s
Iteration 4 predict time: 0.00290s
-----------------------
Iteration 5 fit time: 0.03135s
Iteration 5 predict time: 0.00288s
-----------------------
Iteration 6 fit time: 0.03186s
Iteration 6 predict time: 0.00309s
-----------------------
Iteration 7 fit time: 0.03255s
Iteration 7 predict time: 0.00320s
-----------------------
Iteration 8 fit time: 0.03190s
Iteration 8 predict time: 0.00304s
-----------------------
Iteration 9 fit time: 0.03170s
Iteration 9 predict time: 0.00298s
-----------------------

Script details - I basically used code from plot_calibration.py

import numpy as np
from time import time
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split
from sklearn.calibration import CalibratedClassifierCV
from sklearn.naive_bayes import GaussianNB
n_samples = 80000
n_bins = 3 # use 3 bins for calibration_curve as we have 3 clusters here
# Generate 3 blobs with 2 classes where the second blob contains
# half positive samples and half negative samples. Probability in this
# blob is therefore 0.5.
centers = [(-5, -5), (0, 0), (5, 5)]
X, y = make_blobs(n_samples=n_samples, centers=centers, shuffle=False, random_state=42)
y[: n_samples // 2] = 0
y[n_samples // 2 :] = 1
sample_weight = np.random.RandomState(42).rand(y.shape[0])
# split train, test for calibration
X_train, X_test, y_train, y_test, sw_train, sw_test = train_test_split(
X, y, sample_weight, test_size=0.25, random_state=42
)
for i in range(10):
clf = GaussianNB()
clf.fit(X_train, y_train) # GaussianNB itself does not support sample-weights
prob_pos_clf = clf.predict_proba(X_test)[:, 1]
# With sigmoid calibration
clf_sigmoid = CalibratedClassifierCV(clf, cv=2, method="sigmoid")
tstart = time()
clf_sigmoid.fit(X_train, y_train, sample_weight=sw_train)
fit_time = time() - tstart
tstart = time()
prob_pos_sigmoid = clf_sigmoid.predict_proba(X_test)[:, 1]
predict_time = time() - tstart
print(f"Iteration {i} fit time: {fit_time:0.5f}s")
print(f"Iteration {i} predict time: {predict_time:0.5f}s")
print("-----------------------")
@ogrisel Could you kindly have a look again?
Thanks Omar!
Reference Issues/PRs
Follow up of #26913