
[MRG] Adding variable alphaCorrection to classes in naive_bayes.py. #16747


Conversation

arka204
Contributor

@arka204 arka204 commented Mar 22, 2020

Reference Issues/PRs

Fixes #10772
References PR #9131
References PR #10775

What does this implement/fix? Explain your changes.

This PR adds a new parameter, alphaCorrection, to the classes in naive_bayes.py. It defaults to True; when it is set to False, an alpha of 0 (or any value greater than 0 but still smaller than _ALPHA_MIN) is no longer rounded up to _ALPHA_MIN.

Any other comments?

I decided to propose a solution to the problem mentioned in the issues above.
This makes it possible to use the estimators with a smaller alpha, even one equal to 0, which was previously impossible because anything smaller than _ALPHA_MIN was changed to _ALPHA_MIN in _check_alpha(self).
Requiring an explicit new parameter should ensure that a small (or zero) alpha is only used intentionally.
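For illustration, the proposed behavior could look roughly like the sketch below. This is not the actual PR diff: check_alpha and alpha_correction are hypothetical stand-ins for the estimators' _check_alpha method and the new parameter, and _ALPHA_MIN is the small lower bound used in naive_bayes.py.

```python
import warnings

import numpy as np

_ALPHA_MIN = 1e-10  # lower bound used in scikit-learn's naive_bayes.py


def check_alpha(alpha, alpha_correction=True):
    """Sketch of the proposed behavior (hypothetical helper, not the PR diff).

    With alpha_correction=True (the default), alpha below _ALPHA_MIN is
    clipped up to _ALPHA_MIN, preserving the old behavior. With
    alpha_correction=False, the user's value is kept and only a warning
    is emitted.
    """
    if np.min(alpha) < _ALPHA_MIN:
        warnings.warn("alpha too small will result in numeric errors")
        if alpha_correction:
            # old behavior: round alpha up to _ALPHA_MIN
            return np.maximum(alpha, _ALPHA_MIN)
    return alpha
```

Under this sketch, existing code is unaffected unless the user explicitly opts out of the correction.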

Member

@jnothman jnothman left a comment


I think it strange that we should need two parameters to control this. As @amueller wrote in #10775 (comment), going back to just allowing small alpha but making sure there is a warning seems a reasonable way forward. On the other hand, we can't do it without breaking backwards compatibility. So I suppose we could use something akin to this PR as a temporary measure, but change the default of alphaCorrection to False over a deprecation period.

Additive (Laplace/Lidstone) smoothing parameter
(set alpha=0 and alphaCorrection=False, for no smoothing).

alphaCorrection : bool, default=True
Member


We only use camel case in class names.

Contributor Author


Should I change the name to alpha_correction or something like that?

@arka204
Contributor Author

arka204 commented Mar 27, 2020

I'm not sure whether I've misunderstood something or was misunderstood myself. Perhaps instead of alphaCorrection I should have used something like allow_small_alpha (and set it to False by default)?

Setting alpha=0 used to result in it being changed to _ALPHA_MIN in _check_alpha(self).

alphaCorrection=True means we want alpha corrected/adjusted, i.e. set to _ALPHA_MIN when it is lower, so the behavior stays as it was.

On the other hand, alphaCorrection=False would mean that alpha is not adjusted, allowing the user to set it to 0 and get a warning instead of having its value changed to _ALPHA_MIN.

That means that if the user sets alphaCorrection=True, they get exactly the same result as before; only setting it to False changes anything.

This is why I don't understand the suggestion to change the default value of this variable to False.

@arka204
Contributor Author

arka204 commented Apr 11, 2020

@jnothman, what do you think?

Merging changes from the main repository
@arka204
Contributor Author

arka204 commented Apr 18, 2020

@amueller, what is your opinion on this?

Merging changes from the main repository
@KumarGanesha1996

Hello, what is the status on this? I think this contributor is being treated rather poorly; please respond to his questions. @jnothman @amueller

@cmarmo
Contributor

cmarmo commented May 12, 2020

Hi @arka204, your code has a linting issue.
scikit-learn uses flake8 for linting: to check your code locally you can run

$ flake8 <your_modified_file>.py

Hope this helps.

@cmarmo
Contributor

cmarmo commented May 22, 2020

Hi @arka204, as long as there are conflicts in your code there is little chance it will be reviewed. Could you please synchronize with the master version of scikit-learn (you can check how to do that on this page) and resolve the conflicts? Thanks!

@arka204
Contributor Author

arka204 commented May 22, 2020

Thanks for your help @cmarmo !
I changed the name from alphaCorrection to force_alpha and swapped True with False for this variable.
I hope this makes the code more intuitive and readable.


@KumarGanesha1996 KumarGanesha1996 left a comment


Well done, kind sir.

@arka204
Contributor Author

arka204 commented May 30, 2020

Ping @jnothman, @amueller.
Do you think it can be merged now?

@arka204 arka204 changed the title [WIP] Adding variable alphaCorrection to classes in naive_bayes.py. [MRG] Adding variable alphaCorrection to classes in naive_bayes.py. May 30, 2020
Member

@jnothman jnothman left a comment


Otherwise looking good

@@ -752,6 +752,17 @@ def test_alpha():
X, y, classes=[0, 1])


def test_check_alpha():
Member


Please also test that appropriate warnings are raised in each case (use pytest.warns).
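Such a test might look like the sketch below. Everything here is hypothetical and inlined so the example is self-contained: check_alpha stands in for the PR's _check_alpha logic (after the rename to force_alpha), and _ALPHA_MIN matches the bound used in naive_bayes.py.

```python
import warnings

import pytest

_ALPHA_MIN = 1e-10  # as in scikit-learn's naive_bayes.py


def check_alpha(alpha, force_alpha=False):
    # Hypothetical stand-in for the estimator's _check_alpha:
    # clip small alpha unless the user explicitly forces it.
    if alpha < _ALPHA_MIN and not force_alpha:
        warnings.warn("alpha too small will result in numeric errors, "
                      "setting alpha = %.1e" % _ALPHA_MIN)
        return _ALPHA_MIN
    return alpha


def test_check_alpha_warns():
    # The clipping branch must emit a warning with the expected message...
    with pytest.warns(UserWarning, match="alpha too small"):
        assert check_alpha(0.0) == _ALPHA_MIN
    # ...and forcing a small alpha must not warn
    # (escalate warnings to errors so any warning fails the test).
    with warnings.catch_warnings():
        warnings.simplefilter("error")
        assert check_alpha(0.0, force_alpha=True) == 0.0
```

pytest.warns doubles as an assertion that the warning was actually raised, so each branch of the check gets covered.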

@jnothman
Member

Please add an entry to the change log at doc/whats_new/v0.24.rst. Like the other entries there, please reference this pull request with :pr: and credit yourself (and other contributors if applicable) with :user:

@cmarmo
Contributor

cmarmo commented Aug 24, 2020

@arka204, will you be able to address the comments and fix the conflicts? Thanks for your work!

@hongshaoyang
Contributor

@arka204 , I've opened #18805 to take over this stalled PR. Thank you.

@cmarmo cmarmo added the Superseded (PR has been replaced by a newer PR) label Nov 10, 2020
Base automatically changed from master to main January 22, 2021 10:52
@cmarmo
Contributor

cmarmo commented May 10, 2022

Closing as superseded by #18805 and now #22269.

@cmarmo cmarmo closed this May 10, 2022
Labels
module:naive_bayes, Superseded (PR has been replaced by a newer PR)
Successfully merging this pull request may close these issues.

BernoulliNB and MultinomialNB documentation for alpha=0