[MRG] EXA Improve example plot_svm_anova.py #11731

qinhanmin2014 · 2018-08-01T15:18:37Z

I think the example plot_svm_anova.py is not good. It claims that "This example shows how to perform univariate feature selection to improve the classification scores." However, with the added non-informative features, we actually get the worst result when the number of selected features equals the number of features in the original dataset. The reason is that the features in the digits dataset are either 0 or 1, so it does not seem reasonable to add non-informative features generated with np.random.random.
Related comment #11588 (review)
In the new example, I use the iris dataset (4 features) and add 36 non-informative features. The model achieves its best performance when we select around 10% of the features.
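A minimal sketch of the setup described above, for readers who want to try it. The step names 'anova' and 'svc' and SVC(gamma="auto") follow the snippet quoted later in this thread; the random seed, the range of the noise features, the percentile grid, and cv=5 are assumptions for illustration and may differ from the merged example.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectPercentile, chi2
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Append 36 non-informative features, giving 4 + 36 = 40 columns.
# Seed and value range are assumptions, not taken from the PR.
rng = np.random.RandomState(0)
X = np.hstack((X, 2 * rng.random_sample((X.shape[0], 36))))

# Univariate selection followed by an SVM, as in the example under review.
clf = Pipeline([
    ("anova", SelectPercentile(chi2)),
    ("svc", SVC(gamma="auto")),
])

# Hypothetical grid of percentiles to sweep over.
percentiles = (1, 3, 6, 10, 15, 20, 30, 40, 60, 80, 100)
mean_scores = []
for percentile in percentiles:
    clf.set_params(anova__percentile=percentile)
    scores = cross_val_score(clf, X, y, cv=5)
    mean_scores.append(scores.mean())

for percentile, score in zip(percentiles, mean_scores):
    print(f"{percentile:3d}% of features: mean accuracy {score:.3f}")
```

Plotting mean_scores against percentiles gives the kind of curve the before/after figures compare.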
Before the PR:

After the PR:

agramfort · 2018-08-02T09:52:59Z

what if you keep the old data and just scale the features?

qinhanmin2014 · 2018-08-02T12:10:00Z

@agramfort
I tried scaling the digits dataset with scale, but the problem is how to do feature selection afterwards:
(1) We can't use chi2 because we now have negative feature values.
(2) We get warnings with f_classif because some features are constant.
(3) It takes more than 100 seconds with mutual_info_classif, which seems unacceptable to me.
What's more, the results are not very good (see below; the models are supposed to achieve their best performance when we select around 25% (64/264) of the features).
f_classif:

mutual_info_classif:
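Obstacles (1) and (2) from the list above can be reproduced with a short sketch like the following. It uses the plain digits data without the extra noise features, which is a simplification of the experiment described in the comment, and it leaves out point (3) because of its running time.

```python
import warnings

from sklearn.datasets import load_digits
from sklearn.feature_selection import chi2, f_classif
from sklearn.preprocessing import scale

X, y = load_digits(return_X_y=True)
X_scaled = scale(X)  # centering introduces negative values

# (1) chi2 only accepts non-negative inputs and raises otherwise.
try:
    chi2(X_scaled, y)
    chi2_error = None
except ValueError as exc:
    chi2_error = str(exc)

# (2) f_classif warns because some pixels never change across samples.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    f_classif(X_scaled, y)
n_warnings = len(caught)

print("chi2 error:", chi2_error)
print("f_classif warnings:", n_warnings)
```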

agramfort · 2018-08-02T12:47:07Z

OK, thanks for giving it a try. I now agree that iris is better suited.

qinhanmin2014 · 2018-08-04T01:40:33Z

Not sure whether it's appropriate to use cv=5 for the iris dataset, since we do not have much data.
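To put that concern in numbers, here is a small sketch of the test-fold sizes that cv=5 produces on iris. It assumes the stratified splitting that cross_val_score applies to classifiers when given an integer cv.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import StratifiedKFold

X, y = load_iris(return_X_y=True)
cv = StratifiedKFold(n_splits=5)

# Size of each test fold and its per-class sample counts.
test_sizes = []
per_class = []
for _, test_idx in cv.split(X, y):
    test_sizes.append(len(test_idx))
    per_class.append(np.bincount(y[test_idx]).tolist())

print(test_sizes)
print(per_class)
```

Each score is therefore computed on 30 held-out samples, 10 per class, so a single misclassified sample moves a fold's accuracy by more than 3 percentage points.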
I plot different results here for comparison:

qinhanmin2014 · 2019-01-26T07:43:54Z

Wondering if someone can review it. The original example is wrong IMO.
ping @jnothman @adrinjalali maybe (apologies if the ping makes you unhappy :))

examples/svm/plot_svm_anova.py

qinhanmin2014 · 2019-01-27T00:25:45Z

thanks @adrinjalali I agree that your version is better.

adrinjalali · 2019-01-27T13:51:06Z

examples/svm/plot_svm_anova.py

 transform = SelectPercentile(chi2)
-
-clf = Pipeline([('anova', transform), ('svc', SVC(gamma="auto"))])
+clf = Pipeline([('anova', transform),


The transform variable is only used here, so I guess we can remove it and put SelectPercentile(chi2) directly in the pipeline. The comment above the pipeline also needs to change accordingly.
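A sketch of what that suggestion could look like, keeping the step names and SVC(gamma="auto") from the snippet above. The percentile value set at the end is only an illustration of how the inlined selector remains reachable through the pipeline.

```python
from sklearn.feature_selection import SelectPercentile, chi2
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# The selector is created inline, so no separate `transform` variable is needed.
clf = Pipeline([
    ("anova", SelectPercentile(chi2)),
    ("svc", SVC(gamma="auto")),
])

# The step is still configurable by name through the pipeline.
clf.set_params(anova__percentile=10)
selected_percentile = clf.named_steps["anova"].percentile
print(selected_percentile)
```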

adrinjalali

otherwise LGTM, thanks @qinhanmin2014 !

qinhanmin2014 · 2019-01-27T14:05:25Z

ping @jnothman @agramfort
Maybe we can get this into 0.20.3, since it's not good to ship a wrong example.
output from Circle:

EXA Improve example plot_svm_anova.py

467bddf

qinhanmin2014 changed the title ~~EXA Improve example plot_svm_anova.py~~ [MRG] EXA Improve example plot_svm_anova.py Aug 3, 2018

qinhanmin2014 added 2 commits August 25, 2018 11:59

Merge branch 'master' into svm-anova-example

25e7d8f

Merge branch 'master' into svm-anova-example

c4a8649

adrinjalali reviewed Jan 26, 2019

address review

632f57e

adrinjalali reviewed Jan 27, 2019

adrinjalali approved these changes Jan 27, 2019

address comment

9325d8c

more precise y_label

aab7945

jnothman approved these changes Jan 28, 2019

jnothman merged commit 1deb95a into scikit-learn:master Jan 28, 2019

qinhanmin2014 deleted the svm-anova-example branch January 28, 2019 02:16

glemaitre pushed a commit to glemaitre/scikit-learn that referenced this pull request Jan 30, 2019

EXA Improve example plot_svm_anova.py (scikit-learn#11731)

dd73bab

thomasjpfan pushed a commit to thomasjpfan/scikit-learn that referenced this pull request Feb 6, 2019

EXA Improve example plot_svm_anova.py (scikit-learn#11731)

22eb834

thomasjpfan pushed a commit to thomasjpfan/scikit-learn that referenced this pull request Feb 7, 2019

EXA Improve example plot_svm_anova.py (scikit-learn#11731)

824e75e

qinhanmin2014 mentioned this pull request Feb 19, 2019

Release 0.20.3 #13186

Merged

17 tasks

jnothman pushed a commit to jnothman/scikit-learn that referenced this pull request Feb 19, 2019

EXA Improve example plot_svm_anova.py (scikit-learn#11731)

c62a0e9

xhluca pushed a commit to xhluca/scikit-learn that referenced this pull request Apr 28, 2019

EXA Improve example plot_svm_anova.py (scikit-learn#11731)

d8214fe

xhluca pushed a commit to xhluca/scikit-learn that referenced this pull request Apr 28, 2019

Revert " EXA Improve example plot_svm_anova.py (scikit-learn#11731)"

f720379

This reverts commit d8214fe.

xhluca pushed a commit to xhluca/scikit-learn that referenced this pull request Apr 28, 2019

Revert " EXA Improve example plot_svm_anova.py (scikit-learn#11731)"

61aaaf8

This reverts commit d8214fe.

koenvandevelde pushed a commit to koenvandevelde/scikit-learn that referenced this pull request Jul 12, 2019

EXA Improve example plot_svm_anova.py (scikit-learn#11731)

760da79
