How to determine the distributions in Naive Bayes

naive_bayes

#1

Hello,

While trying to understand the different distributions that Naive Bayes can be used with, I ran into something I cannot work out. I will outline it below with a dataset example:

If our independent variable has only two levels (like in the red box), we say it is a Bernoulli model.
If it is like in the red circle, it is Gaussian, and if the y has several levels (more than 2), we say it is a multinomial model?
I understand that for text classification, if we use a binary representation it is a Bernoulli model and if we use a count representation it is multinomial, but I am getting confused in the case of datasets such as the one shown above.
Can someone please clarify whether my understanding is correct, and kindly correct me if I am wrong?


#2

MultinomialNB implements the naive Bayes algorithm for multinomially distributed data and is one of the two classic naive Bayes variants used in text classification (where the data are typically represented as word vector counts, although tf-idf vectors are also known to work well in practice).

The circle data is multinomially distributed, so you can use MultinomialNB for classification on it.
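To make the multinomial case concrete, here is a minimal plain-Python sketch of what a multinomial naive Bayes model estimates from count features. It is not scikit-learn's implementation. The function names, the toy word counts, and the smoothing value alpha=1.0 are all assumptions made up for illustration.

```python
import math

def fit_multinomial_nb(X, y, alpha=1.0):
    """Estimate log priors and smoothed per-class log feature probabilities from counts."""
    classes = sorted(set(y))
    n_features = len(X[0])
    log_prior, log_prob = {}, {}
    for c in classes:
        rows = [x for x, label in zip(X, y) if label == c]
        log_prior[c] = math.log(len(rows) / len(X))
        # Total count of each feature over all rows of class c
        totals = [sum(row[j] for row in rows) for j in range(n_features)]
        denom = sum(totals) + alpha * n_features
        log_prob[c] = [math.log((t + alpha) / denom) for t in totals]
    return log_prior, log_prob

def predict_multinomial_nb(model, x):
    """Return the class with the highest log posterior score for count vector x."""
    log_prior, log_prob = model
    scores = {
        c: log_prior[c] + sum(n * lp for n, lp in zip(x, log_prob[c]))
        for c in log_prior
    }
    return max(scores, key=scores.get)

# Hypothetical toy data: columns are counts of the words "free", "meeting", "offer"
X = [[3, 0, 2], [2, 0, 1], [0, 2, 0], [0, 3, 1]]
y = ["spam", "spam", "ham", "ham"]

multinomial_model = fit_multinomial_nb(X, y)
multinomial_prediction = predict_multinomial_nb(multinomial_model, [2, 0, 1])
```

Note that each feature contributes in proportion to its count, and a count of zero contributes nothing to the score. That is why this variant suits count-like features rather than continuous measurements.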

BernoulliNB implements the naive Bayes training and classification algorithms for data that is distributed according to multivariate Bernoulli distributions; i.e., there may be multiple features but each one is assumed to be a binary-valued (Bernoulli, boolean) variable.

The box data has two levels, so you can use BernoulliNB for classification.
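For comparison, here is a similar plain-Python sketch of the Bernoulli case, where every feature is 0 or 1. Again, this is only an illustration under assumed names and made-up toy data, not the library's code.

```python
import math

def fit_bernoulli_nb(X, y, alpha=1.0):
    """Estimate log priors and smoothed P(feature = 1 | class) from binary features."""
    classes = sorted(set(y))
    n_features = len(X[0])
    log_prior, p_one = {}, {}
    for c in classes:
        rows = [x for x, label in zip(X, y) if label == c]
        log_prior[c] = math.log(len(rows) / len(X))
        # Smoothed fraction of class-c rows in which feature j equals 1
        p_one[c] = [
            (sum(row[j] for row in rows) + alpha) / (len(rows) + 2 * alpha)
            for j in range(n_features)
        ]
    return log_prior, p_one

def predict_bernoulli_nb(model, x):
    """Return the class with the highest log posterior score for binary vector x."""
    log_prior, p_one = model
    scores = {}
    for c in log_prior:
        score = log_prior[c]
        for xj, pj in zip(x, p_one[c]):
            # Unlike the multinomial model, an absent feature (0) also contributes
            score += math.log(pj) if xj == 1 else math.log(1 - pj)
        scores[c] = score
    return max(scores, key=scores.get)

# Hypothetical toy data: columns "owns_car", "is_student" encoded as 0/1
X = [[1, 0], [1, 0], [0, 1], [0, 1], [1, 1]]
y = ["yes", "yes", "no", "no", "yes"]

bernoulli_model = fit_bernoulli_nb(X, y)
bernoulli_prediction = predict_bernoulli_nb(bernoulli_model, [1, 0])
```

The practical difference from the multinomial sketch is that a value of 0 is treated as evidence in its own right, which is the reason binary-valued columns are a natural fit for this variant.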

Hope this helps!

Regards,
Hinduja