-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
[WIP] Consistent and informative error message for partial_fit when n_features changes #12465
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There's one place I didn't manage to add the helper function: in MLP. |
This pull request introduces 1 alert when merging 1b9e6a0 into 9ec5a15 - view on LGTM.com new alerts:
Comment posted by LGTM.com |
@jeremiedbb |
The error message comes from Thanks for the hint. It turns out it's |
def test_lda_partial_fit_dim_mismatch(): | ||
# test `n_features` mismatch in `partial_fit` | ||
rng = np.random.RandomState(0) | ||
n_components = rng.randint(3, 6) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed this test because it does exactly what the common test check_estimators_partial_fit_n_features
does.
@@ -202,7 +187,7 @@ def test_lda_transform_mismatch(): | |||
random_state=rng) | |||
lda.partial_fit(X) | |||
assert_raises_regexp(ValueError, r"^The provided data has", | |||
lda.partial_fit, X_2) | |||
lda.transform, X_2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed that because as is, it was a duplicate of test_lda_partial_fit_dim_mismatch
and I think it did not do what it was expected to do, i.e. "# test n_features
mismatch in partial_fit and transform"
Why does |
We want We could consider adding a warning if pytest is not importable in |
7862836
to
627d2ee
Compare
Ok, i removed the pytest dependency. Before I change the status to [MRG], we can discuss if this is really the way we want to do it. Because enforcing the error message match in the common tests will break a lot of sklearn compatible estimators, right ? |
Yes, we could deprecate the lenient behaviour instead. But: Elsewhere we
have allowed any message that matches one of a few expressions. Or: we can
invert the condition: prohibit certain messages, such as "cannot be
broadcast" rather than requiring a certain message.
|
@amueller do you have an opinion on that ? |
my opinion is #13969. |
closing. superseeded by n_features_in |
This is WIP towards it. Extension of #12431
Currently, when n_features changes between calls to partial_fit, for some estimators an error is raised with an informative error message, and for other estimators an error is raised but the message is not very informative (broadcasting error from numpy or bad shape from pairwise, ...).
In addition, for estimators which give an informative error message, the message differs between estimators.
Finally, there is a common test which checks that an estimator raises an error in that case. However, it's only done for classifiers, regressors and clusterers.
This PR fixes these 3 aspects: