Naive Bayes Classifier with Mixed Bernoulli/Gaussian Models #12957
Comments
Yes, this was discussed in #10856 (comment) and subsequent comments. I would personally be in favour if we can construct a reasonably simple API.
|
Hi @jarednielsen , |
I personally would appreciate it if the general/mixed naive Bayes classifier also worked with the other NB classifiers, not only with Gaussian and Bernoulli. |
I agree, we should make the general NBC as, well, general as possible :) Let me know when you've merged PR #12569! |
I don't think #12569 is a prerequisite for this. I would consider an
interface a little like ColumnTransformer that specifies the naive Bayes
estimator for each set of columns. Ideally GeneralizedNaiveBayes would
share code with ColumnTransformer for handling heterogeneous features.
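
A minimal sketch of what such a per-column-block interface could look like. `MixedNaiveBayesSketch` and its `blocks` argument are hypothetical names for illustration, not an existing scikit-learn API, and the sketch does not share any code with ColumnTransformer. It assumes every sub-estimator uses the empirical class prior (the default for both `GaussianNB` and `BernoulliNB`), so the per-block log posteriors can be summed and the prior, which is then counted once per block, subtracted back out.

```python
import numpy as np
from scipy.special import logsumexp
from sklearn.base import clone
from sklearn.naive_bayes import BernoulliNB, GaussianNB


class MixedNaiveBayesSketch:
    """Hypothetical sketch: one naive Bayes estimator per block of columns."""

    def __init__(self, blocks):
        # blocks: list of (name, estimator, column_indices) tuples,
        # loosely modelled on ColumnTransformer's `transformers` argument.
        self.blocks = blocks

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y)
        self.classes_, counts = np.unique(y, return_counts=True)
        # Empirical class log-prior; assumed to match each sub-estimator's prior.
        self.class_log_prior_ = np.log(counts / counts.sum())
        self.fitted_ = [
            (name, clone(est).fit(X[:, cols], y), cols)
            for name, est, cols in self.blocks
        ]
        return self

    def predict_log_proba(self, X):
        X = np.asarray(X, dtype=float)
        # Sum the per-block log posteriors log P(y | x_block).
        total = sum(
            est.predict_log_proba(X[:, cols]) for _, est, cols in self.fitted_
        )
        # Each block contributed log P(y) once; keep exactly one copy.
        total = total - (len(self.fitted_) - 1) * self.class_log_prior_
        # Renormalise so every row sums to one in probability space.
        return total - logsumexp(total, axis=1, keepdims=True)

    def predict(self, X):
        return self.classes_[np.argmax(self.predict_log_proba(X), axis=1)]


# Synthetic data for illustration only: one binary and one real-valued column.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=200)
binary_col = (rng.random(200) < np.where(y == 1, 0.7, 0.3)).astype(float)
real_col = rng.normal(loc=np.where(y == 1, 28.0, 35.0), scale=8.0)
X = np.column_stack([binary_col, real_col])

clf = MixedNaiveBayesSketch(
    [("binary", BernoulliNB(), [0]), ("continuous", GaussianNB(), [1])]
).fit(X, y)
proba = np.exp(clf.predict_log_proba(X))
```

With a single block the correction term vanishes and the sketch reduces to the wrapped estimator. If the sub-estimators were configured with different priors (for example `fit_prior=False`), the subtraction step would need to use each estimator's own prior instead.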
|
@jarednielsen Are you working on this? As @jnothman already said, you could also start to work on a |
I think a Mixed / General Naive Bayes classifier that allows one to mix and match the already available sklearn implementations of Naive Bayes with categorical and continuous columns is a use case that has been left unattended. It has been asked several times on Stack Overflow and elsewhere, with mostly pretty unsatisfactory answers. I have already derived and written some code that does this but have never made a PR before. Would be happy to team up on this @jarednielsen |
take |
Description
I suggest allowing mixed datasets (half binary variables, half real-valued variables) into the Naive Bayes classifier. Currently the `GaussianNB` and `BernoulliNB` classes handle one case or the other, but not both combined. I'd be happy to write the code for this, so I'm curious if this has been explored before and if it would be helpful!
For example, on the Titanic dataset, gender is a Bernoulli variable while age is real-valued. Passing both into a single Naive Bayes classifier could improve its predictions.
This is related to this currently pending PR: #12569
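
For reference, the factorisation behind this proposal can be written out. This is a sketch of the standard naive Bayes independence assumption applied to mixed features; the index sets $B$ (binary features) and $G$ (real-valued features) and the parameter symbols are notation introduced here, not taken from the issue.

```latex
P(y \mid x) \;\propto\; P(y)
  \prod_{j \in B} \theta_{jy}^{\,x_j} \, (1 - \theta_{jy})^{\,1 - x_j}
  \prod_{k \in G} \frac{1}{\sqrt{2\pi\sigma_{ky}^{2}}}
  \exp\!\left( -\frac{(x_k - \mu_{ky})^{2}}{2\sigma_{ky}^{2}} \right)
```

Here $\theta_{jy}$ is the per-class probability that binary feature $j$ equals 1, and $\mu_{ky}$, $\sigma_{ky}^{2}$ are the per-class mean and variance of real-valued feature $k$. Because the likelihood is a product over features, the Bernoulli and Gaussian terms can be estimated separately and their log-likelihoods added.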