
[MRG] FIX+TST regression in multinomial logstc reg when class_weight is auto #5420

Closed
raghavrv wants to merge 2 commits from the regression_in_logistic branch

Conversation

raghavrv
Member

Fixes #5415

@lesteve @MechCoder Please review!

@raghavrv raghavrv changed the title FIX+TST regression in multinomial logstc reg when class_weight is auto [MRG] FIX+TST regression in multinomial logstc reg when class_weight is auto Oct 16, 2015
@lesteve
Member

lesteve commented Oct 16, 2015

There are still Y_bin (note the capital Y) and y_bin used elsewhere in the file. Do you reckon their usage is intended, and if so, do you know the difference in meaning?

Using two variables that only differ by capitalization does not seem like a very good idea...

@raghavrv
Member Author

It appears that the usage is intended (though I am not sure why the variable needs to be named Y_bin for multinomial alone; maybe just to avoid confusing it with the y_bin computed in the ovr case, I think...).

I think we can safely rename Y_bin --> y_bin without side effects... (but that would be for another PR, I feel?)

@@ -875,3 +875,11 @@ def test_warm_start():
assert_greater(2.0, cum_diff, msg)
else:
assert_greater(cum_diff, 2.0, msg)


@ignore_warnings
Member

Why ignore the warnings? Which warnings are raised in that case?

Member

auto is deprecated in favor of balanced? In that case, please use assert_warns to make it explicit that we expect a DeprecationWarning.
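
For reference, a minimal sketch of the suggested pattern (the estimator and data below are placeholders, not the actual test):

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.utils.testing import assert_warns

# Hypothetical example: assert_warns fails unless fitting with the
# deprecated class_weight='auto' raises a DeprecationWarning.
X, y = make_classification(random_state=0)
clf = LogisticRegression(class_weight='auto')
assert_warns(DeprecationWarning, clf.fit, X, y)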

Member Author

Warning that auto will no longer be supported!

Member Author

okay will do so :)

Member

Is it me, or are 'balanced' and 'auto' treated the same? See this line, where the check is against 'balanced' and not 'auto'.

Member Author

The check seems to be against 'auto' only?

    if class_weight == "auto":
        class_weight_ = compute_class_weight(class_weight, mask_classes,
                                             y_bin)
        sample_weight *= class_weight_[le.fit_transform(y_bin)]

Do you mean some other line? Sorry, I don't get what you are saying!

Member

The check seems to be against 'auto' only?

Sorry, I meant against 'auto' and not 'balanced', just as you said.

@raghavrv raghavrv force-pushed the regression_in_logistic branch from be44403 to a17b70b on October 16, 2015 14:48
@raghavrv
Member Author

Or should I do that in this PR itself? (renaming Y_bin --> y_bin)

@ogrisel
Member

ogrisel commented Oct 16, 2015

The rule is Y_bin for variables with a 2D shape and y_bin for 1D arrays.

@ogrisel
Member

ogrisel commented Oct 16, 2015

Actually it's not a rule but a convention in the scikit-learn code base. Although it's not always respected, it's better to follow it for consistency.
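
For illustration, a minimal sketch of that convention (the variables below are hypothetical, not the ones in logistic.py):

import numpy as np
from sklearn.preprocessing import LabelBinarizer

y = np.array([0, 1, 2, 1])                 # 1D array of labels: lowercase y
Y_bin = LabelBinarizer().fit_transform(y)  # 2D indicator matrix: capital Y_bin
y_bin = (y == 1).astype(np.float64)        # 1D binarized target: lowercase y_bin
print(Y_bin.shape, y_bin.shape)            # (4, 3) (4,)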

@raghavrv
Member Author

Ah thanks for clarifying!! :)

@amueller
Member

Why is there a deprecation here?

@amueller
Member

Or rather: does this work correctly with "balanced"?

@amueller
Member

I guess merge this and open a new issue?

@lesteve
Member

lesteve commented Oct 16, 2015

Or rather: does this work correctly with "balanced"?

I don't think 'auto' and 'balanced' give the same result:

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
X, y = make_classification(n_classes=5, n_informative=5, random_state=42)
model_auto = LogisticRegression(multi_class='multinomial', class_weight='auto',
                                solver='lbfgs', random_state=0)
model_auto.fit(X, y)
model_balanced = LogisticRegression(multi_class='multinomial', class_weight='balanced',
                                    solver='lbfgs', random_state=0)
model_balanced.fit(X, y)

print(model_auto.coef_ - model_balanced.coef_)

Output:

[[  2.54746659e-03   7.43233265e-03   2.77261631e-03   1.41348054e-03
    1.21013318e-02  -4.56288617e-03   5.69287167e-04   3.33629140e-03
   -3.88104802e-04  -3.55079940e-03  -8.57260871e-04  -2.45097273e-04
    1.13524688e-02  -1.24548720e-03   6.89260176e-03  -9.50383491e-03
   -2.10713335e-03  -1.24984758e-03   3.73549840e-04  -5.89461179e-03]
 [  2.58017862e-03  -5.26711165e-03  -1.64755693e-03   1.42570316e-03
   -7.51183245e-04  -4.10772362e-03   4.18610654e-03   1.77434558e-03
    6.46642076e-04  -7.12319759e-03   1.72079922e-03  -7.03747457e-03
    1.13293794e-02   2.52818914e-03   2.43273106e-03  -4.66653993e-03
    6.56878824e-03   7.39691852e-03   3.33240485e-03  -5.31231989e-03]
 [ -5.66662259e-03  -1.14696637e-02  -3.77677123e-03   3.57538831e-03
   -2.71981075e-03   5.81756973e-03   2.93235232e-03   2.75617311e-03
    3.39429171e-03   7.01936893e-03  -2.84235294e-03   3.78963983e-04
   -9.90384206e-04  -1.11753851e-03  -5.80121488e-03   1.60822984e-02
    1.75337541e-03   5.06311469e-03  -1.73672061e-04   2.62750563e-03]
 [ -2.39735439e-03   4.92023929e-03   1.52716529e-03  -1.42230878e-03
   -9.35013669e-05   2.29086976e-03  -1.18793225e-03  -6.11619047e-03
   -8.21475297e-04   2.31691010e-03  -1.67619010e-03   6.32054327e-03
   -1.16387129e-02   1.80561897e-03   2.53636983e-03   4.33820296e-04
   -1.14790487e-02  -6.76347623e-03   9.24463117e-04  -1.40200438e-03]
 [  2.93633178e-03   4.38420342e-03   1.12454655e-03  -4.99226324e-03
   -8.53683648e-03   5.62170314e-04  -6.49981377e-03  -1.75061963e-03
   -2.83135369e-03   1.33771796e-03   3.65500470e-03   5.83064593e-04
   -1.00527511e-02  -1.97078240e-03  -6.06048778e-03  -2.34574388e-03
    5.26401842e-03  -4.44670941e-03  -4.45674574e-03   9.98143044e-03]]

@amueller
Member

see #5422

@ogrisel
Member

ogrisel commented Oct 16, 2015

Or rather: does this work correctly with "balanced"?

That's a good question. I would rather check that before merging. And add a test if there is no test for that case.

@amueller
Member

@lesteve they are not supposed to give the same result. But having special treatment for one and not the other seems fishy.

@ogrisel
Member

ogrisel commented Oct 16, 2015

I don't think 'auto' and 'balanced' give the same result.

This is expected. This is why we deprecated 'auto'. The results are not what people would naively expect. It was a misleading sklearn-specific heuristic. 'balanced' implements the industry standard.
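
For reference, 'balanced' weights each class by n_samples / (n_classes * np.bincount(y)); a minimal sketch (using the positional compute_class_weight signature of the scikit-learn version of the time):

import numpy as np
from sklearn.utils.class_weight import compute_class_weight

y = np.array([0, 0, 0, 1, 1])
print(compute_class_weight('balanced', np.array([0, 1]), y))
# [ 0.83333333  1.25      ]  i.e. 5 / (2 * 3) and 5 / (2 * 2)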

@lesteve
Member

lesteve commented Oct 16, 2015

OK sorry for the noise and thanks for the details.

@amueller
Member

The fix is to do if class_weight in ["auto", "balanced"]: and pass the class_weight parameter to the compute_class_weight function. But why wasn't this caught in the tests? There is a test for class_weight="balanced" for linear models, right? Maybe only in the binary case? Hmm.
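
A minimal, self-contained sketch of that suggestion (the helper name and defaults below are hypothetical; mask_classes, le and y_bin in the real code correspond to classes, the encoder and the target here):

import numpy as np
from sklearn.preprocessing import LabelEncoder
from sklearn.utils.class_weight import compute_class_weight

def reweight(sample_weight, y_bin, class_weight, classes=(-1.0, 1.0)):
    # Hypothetical helper mirroring the suggested branch: handle both
    # 'auto' and 'balanced' by forwarding the mode to compute_class_weight
    # (positional signature of the scikit-learn version of the time).
    if class_weight in ("auto", "balanced"):
        le = LabelEncoder()
        class_weight_ = compute_class_weight(class_weight, np.asarray(classes),
                                             y_bin)
        sample_weight = sample_weight * class_weight_[le.fit_transform(y_bin)]
    return sample_weight

print(reweight(np.ones(3), np.array([-1.0, -1.0, 1.0]), "balanced"))
# [ 0.75  0.75  1.5 ]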

@amueller
Member

    for name, Classifier in linear_classifiers:
        if name == "LogisticRegressionCV":
            # Contrary to RidgeClassifierCV, LogisticRegressionCV uses actual
            # CV folds and fit a model for each CV iteration before averaging
            # the coef. Therefore it is expected to not behave exactly as the
            # other linear model.
            continue
        yield check_class_weight_balanced_linear_classifier, name, Classifier

@amueller
Member

Can you remove the if and see if it passes after the fix I suggested above?

@raghavrv
Member Author

I tried and it doesn't pass...!

@amueller
Member

Hm, OK. We should have some form of regression test, though. I'm not sure the comment is correct. @ogrisel added it in 07560e4. We are learning models with exactly the same parameters. Cross-validation or not, that should be deterministic, right?

@amueller
Member

Also I'm surprised by the behavior of LogisticRegressionCV. None of the other CV models does model averaging, right? This should be better documented, I think.

@MechCoder
Member

@amueller

The idea is that in LogisticRegressionCV, when class_weight is set to auto, the class weights are calculated for every fold. With StratifiedKFold, the class frequency is not guaranteed to be the same for every train fold, so computing the class weights a priori and letting LogisticRegressionCV compute them per fold will not produce the same results (see the sketch below).

In RidgeClassifierCV, since it is LOO, we can set the class weights beforehand.

We do not do model averaging; the comment in that sense is slightly wrong (unless refit is set to False). It should be "... model for each CV iteration before the final fit".
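
For illustration, a minimal sketch (toy labels; pre-0.18 cross-validation API) showing that train-fold class frequencies, and hence per-fold computed class weights, can differ between stratified folds:

import numpy as np
from sklearn.cross_validation import StratifiedKFold  # pre-0.18 CV module

y = np.array([0, 0, 0, 0, 0, 1, 1, 1])  # hypothetical imbalanced labels
for train, _ in StratifiedKFold(y, n_folds=3):
    counts = np.bincount(y[train])
    print(counts, counts / float(counts.sum()))  # frequencies vary per fold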

@amueller
Member

Ah, I overlooked the refit parameter. But that negates the argument, right? The final coef_ will have class weights computed with all samples, because refit=True by default.

@MechCoder
Member

Just a second. Trying out something.

@MechCoder
Member

Actually I do get the same coefs :|

@MechCoder
Member

If LogisticRegressionCV is given the class_weight parameter as a dict, the optimal C found is different from the one found with "auto" (especially in the case of very small data, as in the tests). Took me some time to figure that out.

@MechCoder
Member

The fix looks good to me though. (It was working before; I think it somehow broke while adding the class_weight="balanced" option.)

@amueller
Member

Why is the value of C found different if the parameters are the same?

@MechCoder
Member

Because the parameters for every fold are not the same. Minimal code to reproduce:

import numpy as np

from sklearn.cross_validation import check_cv  # pre-0.18 CV module
from sklearn.linear_model import LogisticRegression
from sklearn.utils.class_weight import compute_class_weight

X = np.array([[-1.0, -1.0], [-1.0, 0], [-.8, -1.0],
              [1.0, 1.0], [1.0, 0.0]])
y = np.array([0, 0, 0, 1, 1])
C = [0.0001, 0.35938137]
cv = check_cv(None, X, y, classifier=True)

# Pre-assigned class weights that do not change for every fold:
# train a logreg model on each fold and average the scores.
class_weight = {1: 5.0 / (np.sum(y == 1) * 2),
                0: 5.0 / (np.sum(y == 0) * 2)}
for Cs in C:
    score = 0.0
    for train, test in cv:
        logreg = LogisticRegression(C=Cs, class_weight=class_weight, solver='lbfgs')
        logreg.fit(X[train], y[train])
        score += logreg.score(X[test], y[test])
    print("%f score for %f" % (score / 3.0, Cs))

# Compute the class weights on each fold instead.
for Cs in C:
    score = 0.0
    for train, test in cv:
        labels = [0, 1]
        t = compute_class_weight("balanced", np.array(labels), y[train])
        class_weight = dict(zip(labels, t))
        logreg = LogisticRegression(solver='lbfgs', class_weight=class_weight, C=Cs)
        logreg.fit(X[train], y[train])
        score += logreg.score(X[test], y[test])
    print("%f score for %f" % (score / 3.0, Cs))

Note that the first one chooses a C of 0.359, while the second chooses a C of 0.0001 (np.argmax picks the first C). This leads to differences in the final fit.
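
(A one-line reminder of that tie-breaking behaviour, with hypothetical tied scores:)

import numpy as np

scores = np.array([0.8, 0.8])  # tied mean CV scores for the two candidate Cs
print(np.argmax(scores))       # 0: on ties, the first (smallest) C is picked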

@TomDLT
Member

TomDLT commented Oct 17, 2015

Sorry, I introduced this bug in commit 94eb619#diff-1452d97adf1a0b794930e405aa58d64bR613

It should be solved by #5008.

@ogrisel
Member

ogrisel commented Oct 17, 2015

This should be better documented, I think.

I agree.

@amueller
Member

I have to look more closely at #5008; not sure I have time for that today. For the failing test, we could just search over a single C (or maybe better, two Cs very far apart).

@TomDLT
Member

TomDLT commented Oct 19, 2015

The idea is that in LogisticRegressionCV, when class_weight is set to auto, the class weights are calculated for every fold. With StratifiedKFold, the class frequency is not guaranteed to be the same for every train fold, so computing the class weights a priori and letting LogisticRegressionCV compute them per fold will not produce the same results.

I changed this behavior in #5008.

@raghavrv
Member Author

I am closing this in favour of #5008, which fixes a few more issues raised here by @amueller, @MechCoder, and others...

@raghavrv raghavrv closed this Oct 19, 2015
@raghavrv raghavrv deleted the regression_in_logistic branch October 19, 2015 12:29