ENH add sample weight support to MLP #30155
Conversation
Hi @lorentzenchr, I think I addressed all the comments for this PR: #25646, and I was just waiting for approval or further review. If you plan to review and approve it, I can get it ready again.
It looks good. Just a couple of comments regarding fully using the common test.
Generally looks good, just a few comments
@@ -542,6 +542,17 @@
        ]
    },
    MDS: {"check_dict_unchanged": dict(max_iter=5, n_components=1, n_init=2)},
    MLPClassifier: {
It is good enough for me. Thanks @lorentzenchr
@ogrisel mentioned to me that stochastic solvers and the equivalence test might not be best friends, so SGD/Adam + early stopping should be omitted.
Exactly what I did in 432bcc0.
Note I would like to develop a similar test that can handle estimator/hyper-param combinations that lead to a stochastic fit (e.g. with internal resampling operations like bagging / CV, non-convex objective functions sensitive to random inits, ...).
However, this involves running the fit operation many times (typically 30 times or more) with different random seeds and then doing a KS-test on predictions for estimators fitted with weighted data vs repeated data. As a result, this is much more expensive to run and setting a non-noisy yet informative pvalue threshold might be operationally tricky in the context of a CI that runs many such tests. So for now, this is done manually in a side notebook as referenced in #16298 (comment).
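For illustration, a rough sketch of that procedure (the dataset, seed count, and single query point below are made-up assumptions; the real analysis lives in the side notebook referenced above):

import numpy as np
from scipy.stats import ks_2samp
from sklearn.neural_network import MLPRegressor

rng = np.random.RandomState(0)
X = rng.normal(size=(50, 2))
y = X @ np.array([1.0, -2.0]) + rng.normal(scale=0.1, size=50)
sw = rng.randint(1, 4, size=50)
X_rep, y_rep = np.repeat(X, sw, axis=0), np.repeat(y, sw)

preds_weighted, preds_repeated = [], []
for seed in range(30):  # 30+ refits, hence the cost mentioned above
    m_w = MLPRegressor(random_state=seed, max_iter=2_000).fit(X, y, sample_weight=sw)
    m_r = MLPRegressor(random_state=seed, max_iter=2_000).fit(X_rep, y_rep)
    preds_weighted.append(m_w.predict(X[:1])[0])
    preds_repeated.append(m_r.predict(X[:1])[0])

# A small p-value would indicate that the weighted and repeated-data
# prediction distributions differ, i.e. sample_weight is not handled
# equivalently to row repetition.
print(ks_2samp(preds_weighted, preds_repeated).pvalue)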
LGTM. Thanks @lorentzenchr and @zshu115x
Here is a test to check that we pass the sample_weight to the underlying scoring function during early stopping:
@pytest.mark.parametrize(
    "MLPEstimator, score_func",
    [(MLPClassifier, accuracy_score), (MLPRegressor, r2_score)],
)
@pytest.mark.parametrize("solver", ["sgd", "adam"])
def test_mlp_early_stopping_score_sample_weight(
    MLPEstimator, score_func, monkeypatch, solver
):
    """Check that sample_weight is passed to scoring functions when using
    early-stopping."""
    rng = np.random.RandomState(42)
    sample_weight = rng.randint(1, 6, size=y_iris.size)

    def mock_score(y_true, y_pred, sample_weight=None):
        mock_score.last_sample_weight = sample_weight
        return score_func(y_true, y_pred, sample_weight=sample_weight)

    mock_score.last_sample_weight = None

    if isinstance(MLPEstimator(), MLPClassifier):
        monkeypatch.setattr(
            "sklearn.neural_network._multilayer_perceptron.accuracy_score", mock_score
        )
    else:
        monkeypatch.setattr(
            "sklearn.neural_network._multilayer_perceptron.r2_score", mock_score
        )

    mlp = MLPEstimator(
        hidden_layer_sizes=(10,),
        solver=solver,
        max_iter=10,
        early_stopping=True,
        validation_fraction=0.5,
        random_state=42,
    )
    mlp.fit(X_iris, y_iris, sample_weight=sample_weight)

    assert mock_score.last_sample_weight is not None
    assert mock_score.last_sample_weight.shape == (sample_weight.size // 2,)
MLPClassifier: {
    "check_sample_weight_equivalence": [
        dict(solver="lbfgs"),
        dict(solver="lbfgs", early_stopping=True, n_iter_no_change=1, tol=1e-2),
    ]
},
MLPRegressor: {
    "check_sample_weight_equivalence": [
        dict(solver="sgd", tol=1e-2, random_state=42),
    ]
},
Stupid me, there is no early stopping for lbfgs, so we are actually not testing anything. My proposal was useless here.
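As a quick illustration of that point, a hedged sketch (it assumes lbfgs simply ignores the early_stopping flag; the dataset is arbitrary):

import numpy as np
from sklearn.datasets import load_iris
from sklearn.neural_network import MLPClassifier

X, y = load_iris(return_X_y=True)

# Same seed, same solver; only early_stopping differs.
clf_plain = MLPClassifier(solver="lbfgs", random_state=0, max_iter=200).fit(X, y)
clf_es = MLPClassifier(
    solver="lbfgs", random_state=0, max_iter=200, early_stopping=True
).fit(X, y)

# Identical coefficients would confirm the flag has no effect with lbfgs.
for c1, c2 in zip(clf_plain.coefs_, clf_es.coefs_):
    np.testing.assert_array_equal(c1, c2)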
Suggested change:

Before:
MLPClassifier: {
    "check_sample_weight_equivalence": [
        dict(solver="lbfgs"),
        dict(solver="lbfgs", early_stopping=True, n_iter_no_change=1, tol=1e-2),
    ]
},
MLPRegressor: {
    "check_sample_weight_equivalence": [
        dict(solver="sgd", tol=1e-2, random_state=42),
    ]
},

After:
MLPClassifier: {
    "check_sample_weight_equivalence": dict(solver="lbfgs")
},
MLPRegressor: {
    "check_sample_weight_equivalence": dict(solver="sgd", tol=1e-2, random_state=42)
},
I'll provide a test that at least checks that we pass sample_weight to the underlying scorer when computing early-stopping.
We could indeed have an estimator-specific test with a fixed random seed to check that it's doing something reasonable on a synthetic dataset crafted specifically to lead to very different predictions if the weights were not handled properly:
import numpy as np
from sklearn.neural_network import MLPClassifier

X = np.asarray([0, 0, 0, 1, 1, 1, 1]).reshape(-1, 1)
y = np.asarray([0, 1, 1, 2, 2, 2, 1])
sw = np.asarray([2, 0, 0, 0.1, 0.1, 0.1, 3])
global_random_seed = None  # TODO: set me when adapting this to pytest.
solver = "sgd"  # TODO: adapt to use pytest.parametrize instead.
clf = MLPClassifier(
    random_state=global_random_seed, solver=solver, max_iter=10_000
).fit(X, y, sample_weight=sw)
np.testing.assert_array_equal(clf.predict([[0], [1]]), [0, 1])
I would expect this to pass for all solvers and for any value of global_random_seed.
For MLPRegressor it might be slightly trickier to write a similar test, but I suppose this is possible as well.
@glemaitre Do we really need that? You can read the code and convince yourself that sample weights are passed. I have the impression that we get neurotic if we go for that kind of rigor.
Such a test would also be unprecedented; we don't have it for SGD, GBT, HGBT, ...
For me the benefit is twofold:
- In the past, multiple estimators had the bug where sample weights were not passed to the internal evaluation metric (for early stopping or internal CV). There is no reason this issue cannot happen again in the future if we don't have a non-regression test.
- IMO, the test also serves as documentation: e.g. I could find out about it when going through the test suite of the histogram gradient boosting.
So even though we might never need to touch this code and it might seem like overkill, I still find it useful.
I hesitate to commit this.
I would rather test values of best_validation_score_ than the monkeypatch solution.
I have to stop for today.
I would rather test values of best_validation_score_ than the monkeypatch solution.
This is a viable alternative as well. I might get some time to make a small test tonight in this direction.
Reposting an earlier suggestion that disappeared because it was posted on a thread for an outdated piece of code:
So as a complement, we could test that:

import numpy as np
from sklearn.neural_network import MLPClassifier

X = np.asarray([0, 0, 0, 1, 1, 1, 1]).reshape(-1, 1)
y = np.asarray([0, 1, 1, 2, 2, 2, 1])
sw = np.asarray([2, 0, 0, 0.1, 0.1, 0.1, 3])
global_random_seed = None  # TODO: set me when adapting this to pytest.
solver = "sgd"  # TODO: adapt to use pytest.parametrize instead.
clf = MLPClassifier(
    random_state=global_random_seed, solver=solver, max_iter=10_000
).fit(X, y, sample_weight=sw)
np.testing.assert_array_equal(clf.predict([[0], [1]]), [0, 1])

I would expect this to pass for all solvers and for any value of global_random_seed. For MLPRegressor it might be slightly trickier to write a similar test, but I suppose this is possible as well.

EDIT: rephrased and fixed meaningless long sentence in the second paragraph.
Firstly, I don't understand this long sentence. Secondly, yes we have stochastic solvers with …
Sorry, I must have been distracted when writing this comment. I fixed it by splitting it into several sentences.
I had not realized that the common estimator check would pass for …
I agree that if the common check is stable enough to not be marked XFAIL, then it's good enough. Let's keep this PR as is for now and maybe reevaluate alternative testing only when actually needed.
Anything I can do to help get this merged? Not sure where it's stuck in the review process.
@adamjstewart You could review if you judge yourself knowledgeable about this PR.
I wouldn't exactly judge myself as knowledgeable of sklearn internals, but I have been using your branch locally for my experiments and haven't discovered any errors yet.
@adamjstewart A review would really help. |
Ran this and it didn't crash. Would love to see type hints, but the existing code is also missing type hints.
Thanks, @lorentzenchr. I finally reviewed this PR. Just a few minor renaming suggestions below. I can take care of them and merge if you wish.
Yes, please - and thank you for the offer.
BTW, I tried merging this branch into my local branch for #30143, and I confirmed that the refined equivalence check will fail for the SGD solver. Lowering the … But let's do that in a follow-up PR (e.g. in #30143 itself or another one).
Reference Issues/PRs
Supersedes and closes #25646 (which seems stalled).
What does this implement/fix? Explain your changes.
This adds sample_weight support to MLPClassifier and MLPRegressor.
Any other comments?
The objective function being minimized reads $obj = \frac{1}{n} \sum_i loss_i + \frac{\alpha}{2n} \lVert coef \rVert^2$. This is then extended to $obj = \frac{1}{\sum_i s_i} \sum_i s_i \cdot loss_i + \frac{\alpha}{2 \sum_i s_i} \lVert coef \rVert^2$ with sample weights $s_i$.
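As an illustrative sketch of that weighted objective (the helper name and arguments below are hypothetical, not the actual scikit-learn internals):

import numpy as np

def weighted_mlp_objective(losses, coefs, sample_weight, alpha):
    # obj = (1 / sum_i s_i) * sum_i s_i * loss_i
    #       + (alpha / (2 * sum_i s_i)) * ||coef||^2
    sw_sum = sample_weight.sum()
    data_term = np.dot(sample_weight, losses) / sw_sum
    penalty = alpha / (2 * sw_sum) * sum(
        np.dot(c.ravel(), c.ravel()) for c in coefs
    )
    return data_term + penalty

# With s_i = 1 for all i, this reduces to the unweighted objective above.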
Tests rely on the common tests (so far).
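For reference, a conceptual sketch of what the check_sample_weight_equivalence common test verifies (dataset, solver settings, and tolerance here are illustrative assumptions, not the actual common-test code): integer weights should behave like row repetition.

import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.RandomState(0)
X = rng.normal(size=(30, 3))
y = rng.randint(0, 2, size=30)
sw = rng.randint(1, 4, size=30)

params = dict(solver="lbfgs", random_state=0, max_iter=500)
est_weighted = MLPClassifier(**params).fit(X, y, sample_weight=sw)
est_repeated = MLPClassifier(**params).fit(np.repeat(X, sw, axis=0), np.repeat(y, sw))

# Fitting on weighted data vs. repeated rows should give (nearly) the same
# model; the tolerance may need tuning to absorb floating-point differences.
np.testing.assert_allclose(
    est_weighted.predict_proba(X), est_repeated.predict_proba(X), atol=1e-3
)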