[MRG+1] Stronger tests for variable importances #5261
Conversation
Force-pushed from 9ad805b to 6290fcf
Force-pushed from 6290fcf to 78974de
@@ -663,7 +772,7 @@ def check_sparse_input(name, X, X_sparse, y):

 def test_sparse_input():
     X, y = datasets.make_multilabel_classification(random_state=0,
-                                                   n_samples=40)
+                                                   n_samples=50)
Not that I disagree, but what is the reason for this change?
I was trying to make the tree construction more stable by automatically rescaling the sample weights. Nothing worked with much success (it just moved the issue to someplace else...), but this test failed a few times because of this. Using more samples made that test more stable.
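The rescaling idea mentioned above can be sketched as follows; this is a hypothetical illustration of the `1. / sample_weight.max()` normalization, not code from this PR:

```python
import numpy as np

def rescale_sample_weight(sample_weight):
    # Hypothetical helper: normalize weights by their maximum so they lie
    # in (0, 1]. Mathematically this leaves impurity-based splits unchanged
    # (weights only enter as ratios), but in floating point it merely moves
    # the rounding error around, as noted above.
    return sample_weight * (1. / sample_weight.max())

w = np.array([2.0, 8.0, 4.0])
scaled = rescale_sample_weight(w)
assert scaled.max() == 1.0
```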
I've made some comments, but I support this.
Force-pushed from 58fafac to d1371e5
Thanks for your review @jmschrei! I believe I addressed all your suggestions.
Force-pushed from d1371e5 to bcc6f1b
importances = est.feature_importances_
assert_true(np.all(importances >= 0.0))

for scale in [10, 100, 1000]:
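As a rough sketch of what such a loop checks: scaling all sample weights by a constant should leave the fitted trees, and hence the importances, unchanged up to numerical noise. The setup below (estimator, data, tolerance) is illustrative, not the PR's actual test:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=100, n_features=5, random_state=0)

# Reference fit without explicit sample weights (i.e. all ones).
base = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)

for scale in [10, 100, 1000]:
    est = RandomForestClassifier(n_estimators=10, random_state=0)
    # Uniformly scaling sample_weight multiplies every weighted impurity by
    # the same constant, so the chosen splits -- and the normalized
    # importances -- should not change.
    est.fit(X, y, sample_weight=np.full(len(y), float(scale)))
    np.testing.assert_allclose(base.feature_importances_,
                               est.feature_importances_, atol=1e-7)
```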
Thanks for including my suggestion. I apologize for being pedantic, but I meant a scale of more like [1e-8, 1e-4, 1e-1, 1e1, 1e4, 1e8]. Do you think this test is sufficient?
Unfortunately, with such larger/smaller scale factors, the tests don't pass anymore. Numerical discrepancies creep in, leading to slightly different impurity values... I am not sure we can do anything about that. (My idea was to internally scale down `sample_weight` by `1. / sample_weight.max()`, but this just moves the numerical issues somewhere else...)
So for me this test is more a safeguard that nothing obviously wrong is happening, rather than a bulletproof test.
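The effect being described can be illustrated with a toy weighted Gini computation (a stand-in for the tree code, not sklearn's internals): rescaling weights is an exact no-op in real arithmetic, but in floating point the two results are only approximately equal.

```python
import numpy as np

def weighted_gini(sample_weight, labels):
    # Gini impurity of a node, computed from per-sample weights and labels.
    total = sample_weight.sum()
    props = np.array([sample_weight[labels == k].sum() / total
                      for k in np.unique(labels)])
    return 1.0 - np.sum(props ** 2)

rng = np.random.RandomState(0)
w = rng.rand(1000)
labels = rng.randint(0, 3, size=1000)

g1 = weighted_gini(w, labels)
g2 = weighted_gini(w * 1e8, labels)  # mathematically the same impurity
# Equal in exact arithmetic, but only approximately so in floating point:
assert abs(g1 - g2) < 1e-9
```

Inside a tree, such tiny impurity differences can tip the choice between two nearly-tied splits, which is why extreme scale factors make the test flaky.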
Ah, okay. Thanks for checking!
but 1e-4 would still work, right?
Other than my one comment, LGTM.
With the last comment resolved, this LGTM, +1.
assert_equal(n_important, 3)

X_new = est.transform(X, threshold="mean")
assert_less(0 < X_new.shape[1], X.shape[1])
This line confuses me. `X_new.shape[1] == 3`, so the first expression evaluates to True, which is cast to 1, which is less than `X.shape[1] == 10`. So the assert would hold even if `X_new.shape[1] == 11`. Actually, it succeeds for all values of `X_new.shape[1]` as long as `X.shape[1] > 1`. I guess remove the `0 < `?
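The pitfall can be reproduced in plain Python (the column counts here are illustrative):

```python
x_new_cols = 11   # pretend the transform kept MORE columns than the input
x_cols = 10

# Buggy form: (0 < x_new_cols) evaluates to True, i.e. 1, and 1 < 10 holds,
# so the "assertion" passes even though 11 >= 10.
assert (0 < x_new_cols) < x_cols

# The intended check, which would rightly fail for these values:
# assert x_new_cols < x_cols
```

This is why dropping the `0 < ` fixes the test: `assert_less(a, b)` receives its first argument already evaluated, so the comparison is never the chained `0 < a < b` one might expect.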
Whoops :) This one has been there for a long time without anybody noticing.
Thanks for more tests :) Looks good, though I didn't check the math in the theoretical importances.
@amueller Thanks for the comments! I fixed those.
# Check correctness
assert_almost_equal(entropy(y), sum(importances))
assert_less(np.abs(true_importances - importances).mean(), 0.01)
Why do you use assert_less over assert_array_almost_equal?
I find it easier to control the quality of the approximation this way, rather than requiring every single importance value to match up to some number of digits (with this check, I only require their mean absolute deviation to be small).
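The difference between the two styles can be shown with made-up numbers (not values from the PR): a per-element check fails on the largest single deviation, while the mean-based check tolerates it as long as the average deviation stays small.

```python
import numpy as np

true_importances = np.array([0.30, 0.30, 0.30, 0.05, 0.05])
importances      = np.array([0.31, 0.295, 0.295, 0.05, 0.05])

# Mean-absolute-deviation check: one global tolerance on the average error.
# Here the deviations are [0.01, 0.005, 0.005, 0, 0], whose mean is 0.004.
assert np.abs(true_importances - importances).mean() < 0.01

# assert_array_almost_equal(..., decimal=3) would instead require every
# entry to agree to ~1.5e-3, and would fail on the first entry (off by 0.01)
# even though the overall approximation is good.
```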
I would add a test to check the parallel computation of importances.
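A sketch of such a test (assumed setup; the PR's actual test may differ): with a fixed `random_state`, each tree in a forest is seeded independently of `n_jobs`, so serial and parallel fits should produce the same importances.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=100, random_state=0)

# Same data, same seed; only the number of worker processes differs.
serial = RandomForestClassifier(n_estimators=10, n_jobs=1,
                                random_state=0).fit(X, y)
parallel = RandomForestClassifier(n_estimators=10, n_jobs=2,
                                  random_state=0).fit(X, y)

np.testing.assert_array_almost_equal(serial.feature_importances_,
                                     parallel.feature_importances_)
```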
Otherwise looks good.
Thanks for the review. I fixed the last comments. Merging.
[MRG+1] Stronger tests for variable importances
This PR adds stronger tests for variable importances in forests, including: