DOC Fix incorrect 0-1 scaling in the RBM example #19363

Merged
merged 2 commits into scikit-learn:main on Apr 8, 2021

Conversation

Contributor

@Necior Necior commented Feb 5, 2021

Fix incorrect 0-1 scaling method in the "Restricted Boltzmann Machine features for digit classification" example.

Fixes #19362.

Contributor Author

Necior commented Feb 10, 2021

Thank you for taking a look! I've updated the MR to use the minmax_scale function, as suggested.
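To illustrate why the scaling method matters, here is a minimal sketch on a toy array. The "flawed" expression is a hypothetical stand-in for a 0-1 scaling that divides by the column maximum instead of the column range; the thread does not quote the original code, so treat that line as an assumption. `minmax_scale` is the scikit-learn function mentioned above.

```python
import numpy as np
from sklearn.preprocessing import minmax_scale

# Toy data whose column minima are not zero.
X = np.array([[2.0, 10.0],
              [4.0, 20.0],
              [6.0, 40.0]])

# Hypothetical flawed variant (an assumption, not quoted from the PR):
# subtracts the column minimum but divides by the column maximum,
# so the largest value in each column ends up below 1.
flawed = (X - X.min(axis=0)) / (X.max(axis=0) + 1e-4)

# minmax_scale divides by the column range (max - min),
# so every non-constant column spans exactly [0, 1].
scaled = minmax_scale(X, feature_range=(0, 1))

print("flawed column maxima:", flawed.max(axis=0))
print("scaled column maxima:", scaled.max(axis=0))
```

When every column minimum is zero the two approaches nearly coincide, which may explain why the reported metrics below differ only slightly.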

Are the related docs rebuilt automatically when deployed, or should I update anything else?

BTW, proper scaling changes the results slightly (generally for the better if I understand correctly). Before/after diff:

 Logistic regression using RBM features:
               precision    recall  f1-score   support
 
            0       0.99      0.98      0.99       174
-           1       0.91      0.93      0.92       184
+           1       0.92      0.93      0.93       184
            2       0.92      0.96      0.94       166
-           3       0.93      0.88      0.90       194
+           3       0.94      0.88      0.91       194
            4       0.97      0.94      0.95       186
-           5       0.94      0.91      0.92       181
-           6       0.97      0.96      0.97       207
+           5       0.93      0.92      0.92       181
+           6       0.97      0.97      0.97       207
            7       0.94      0.99      0.97       154
-           8       0.89      0.90      0.89       182
-           9       0.87      0.90      0.89       169
+           8       0.89      0.89      0.89       182
+           9       0.88      0.91      0.89       169
 
-    accuracy                           0.93      1797
-   macro avg       0.93      0.94      0.93      1797
-weighted avg       0.93      0.93      0.93      1797
+    accuracy                           0.94      1797
+   macro avg       0.94      0.94      0.94      1797
+weighted avg       0.94      0.94      0.94      1797
 
 
 Logistic regression using raw pixel features:
               precision    recall  f1-score   support
 
            0       0.90      0.92      0.91       174
-           1       0.60      0.59      0.60       184
-           2       0.75      0.85      0.80       166
-           3       0.77      0.78      0.78       194
-           4       0.82      0.84      0.83       186
+           1       0.60      0.58      0.59       184
+           2       0.76      0.85      0.80       166
+           3       0.78      0.78      0.78       194
+           4       0.81      0.84      0.83       186
            5       0.77      0.76      0.76       181
            6       0.90      0.87      0.89       207
            7       0.85      0.88      0.87       154
            8       0.67      0.58      0.62       182
            9       0.74      0.76      0.75       169

     accuracy                           0.78      1797
    macro avg       0.78      0.78      0.78      1797
 weighted avg       0.78      0.78      0.78      1797
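For readers who want to produce a comparable report, the following is a rough sketch of an RBM plus logistic regression pipeline on the digits dataset. It is not the example script from the repository: the hyperparameters, the train/test split, and the lack of data augmentation are all illustrative assumptions, so the numbers it prints will not match the report above.

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.neural_network import BernoulliRBM
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import minmax_scale

# Load the 8x8 digit images and scale every pixel column to [0, 1],
# since BernoulliRBM treats its inputs as values in that range.
X, y = load_digits(return_X_y=True)
X = minmax_scale(X, feature_range=(0, 1))

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Illustrative hyperparameters (assumptions, not taken from the PR).
rbm_clf = Pipeline([
    ("rbm", BernoulliRBM(n_components=64, learning_rate=0.06,
                         n_iter=10, random_state=0)),
    ("logistic", LogisticRegression(max_iter=1000)),
])
rbm_clf.fit(X_train, y_train)

report = classification_report(y_test, rbm_clf.predict(X_test))
print("Logistic regression using RBM features:")
print(report)
```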

Member

jjerphan commented Feb 10, 2021

> Are the related docs rebuilt automatically when deployed, or should I update anything else?

You don't have to update anything: the auto_examples outputs (like the one you provided) are automatically built by Sphinx for the docs. 🙂

Edit: see this part of the configuration for the generation of auto_examples:

scikit-learn/doc/conf.py

Lines 326 to 346 in 468c3f4

sphinx_gallery_conf = {
    'doc_module': 'sklearn',
    'backreferences_dir': os.path.join('modules', 'generated'),
    'show_memory': False,
    'reference_url': {
        'sklearn': None},
    'examples_dirs': ['../examples'],
    'gallery_dirs': ['auto_examples'],
    'subsection_order': SubSectionTitleOrder('../examples'),
    'binder': {
        'org': 'scikit-learn',
        'repo': 'scikit-learn',
        'binderhub_url': 'https://mybinder.org',
        'branch': binder_branch,
        'dependencies': './binder/requirements.txt',
        'use_jupyter_lab': True
    },
    # avoid generating too many cross links
    'inspect_global_variables': False,
    'remove_config_comments': True,
}

Member

@jjerphan jjerphan left a comment


LGTM. Thanks @Necior!

Contributor Author

Necior commented Feb 10, 2021

Thanks for the review and for providing the relevant configuration snippet! Now it totally makes sense.

@thomasjpfan thomasjpfan changed the title Fix incorrect 0-1 scaling in the RBM example DOC Fix incorrect 0-1 scaling in the RBM example Apr 8, 2021
Member

@thomasjpfan thomasjpfan left a comment


Thank you for the PR @Necior !

LGTM

@thomasjpfan thomasjpfan merged commit dff37c4 into scikit-learn:main Apr 8, 2021
thomasjpfan pushed a commit to thomasjpfan/scikit-learn that referenced this pull request Apr 19, 2021
@glemaitre glemaitre mentioned this pull request Apr 22, 2021
12 tasks
glemaitre pushed a commit to glemaitre/scikit-learn that referenced this pull request Apr 22, 2021
Development

Successfully merging this pull request may close these issues.

Incorrect(?) scaling to [0, 1] in the "Restricted Boltzmann Machine features for digit classification" example
4 participants