How to decide when to use Naive Bayes for classification




While learning about Naive Bayes for classification, I wanted to see how it compares with Random Forest, so I ran the following on a telecom dataset to predict churn.

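library(randomForest)  # provides randomForest()
library(e1071)         # provides naiveBayes()

# churnTrain and churnTest are assumed to be already loaded as data frames
rf <- randomForest(churn ~ ., data = churnTrain)
rf_predict <- predict(rf, churnTest)

nb <- naiveBayes(churn ~ ., data = churnTrain)
nb_predict <- predict(nb, churnTest)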

The output shows that Random Forest has better accuracy.

Under what circumstances should we use Naive Bayes rather than other classification algorithms?



Naive Bayes performs well when there are multiple classes and in text classification. The advantages of Naive Bayes are:

  1. It is simple, and if the conditional independence assumption actually holds, a Naive Bayes classifier will converge more quickly than discriminative models like logistic regression, so you need less training data. And even if the NB assumption doesn’t hold, a Naive Bayes classifier often still performs well in practice.
  2. It requires less model training time.

The main difference between Naive Bayes (NB) and Random Forest (RF) is their model size. A Naive Bayes model is small and stays roughly constant in size as the data grows. NB models cannot represent complex behavior, so they are much less prone to overfitting. A Random Forest model, on the other hand, is very large and, if not carefully built, can overfit. So when your data is dynamic and keeps changing, NB can adapt quickly to the changes and new data, whereas with RF you would have to rebuild the forest every time something changes.
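To make the "adapts quickly" point concrete, here is a minimal base R sketch, assuming purely categorical features. It is not how `naiveBayes()` from e1071 is implemented; the helper names (`nb_new`, `nb_update`, `nb_predict_one`) and the toy features `plan` and `intl` are made up for illustration. The model is just a set of count tables, so absorbing a new observation means incrementing a few counters rather than retraining.

```r
# Illustrative sketch: categorical Naive Bayes stored as running counts.
# nb_new, nb_update and nb_predict_one are hypothetical helper names.

nb_new <- function(classes, feature_levels) {
  list(
    class_counts = setNames(numeric(length(classes)), classes),
    # one count table per feature: rows = feature values, cols = classes
    feature_counts = lapply(feature_levels, function(lv) {
      matrix(0, nrow = length(lv), ncol = length(classes),
             dimnames = list(lv, classes))
    })
  )
}

nb_update <- function(model, x, y) {
  # x: named character vector of feature values, y: class label
  model$class_counts[y] <- model$class_counts[y] + 1
  for (f in names(x)) {
    model$feature_counts[[f]][x[[f]], y] <-
      model$feature_counts[[f]][x[[f]], y] + 1
  }
  model
}

nb_predict_one <- function(model, x) {
  n_classes <- length(model$class_counts)
  # log prior with add-one smoothing
  scores <- log(model$class_counts + 1) -
    log(sum(model$class_counts) + n_classes)
  for (f in names(x)) {
    tab <- model$feature_counts[[f]]
    # smoothed P(feature value | class) for every class
    lik <- (tab[x[[f]], ] + 1) / (colSums(tab) + nrow(tab))
    scores <- scores + log(lik)
  }
  names(which.max(scores))
}

# Toy data, invented for this example
model <- nb_new(c("no", "yes"),
                list(plan = c("basic", "premium"), intl = c("no", "yes")))
model <- nb_update(model, c(plan = "basic",   intl = "no"),  "no")
model <- nb_update(model, c(plan = "basic",   intl = "no"),  "no")
model <- nb_update(model, c(plan = "premium", intl = "yes"), "yes")

nb_predict_one(model, c(plan = "premium", intl = "yes"))  # "yes"
```

Each call to `nb_update` touches one counter per feature plus the class counter, which is why the update cost does not depend on how much data the model has already seen.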



So are you saying that NB will work better with a real-time moving window classification since it works faster than the Random Forest method?



Naive Bayes is a good algorithm for text classification. When dealing with text, it’s very common to treat each unique word as a feature, and since the typical person’s vocabulary is many thousands of words, this makes for a large number of features. The relative simplicity of the algorithm and the independent-features assumption of Naive Bayes make it a strong performer for classifying texts.
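As a rough illustration of "each unique word is a feature", here is a small base R sketch of a multinomial Naive Bayes text classifier with add-one smoothing. The four documents, their labels, and the helper name `classify_text` are invented for the example; a real application would use a proper tokenizer and far more data.

```r
# Illustrative sketch: toy multinomial Naive Bayes for text, base R only.
docs   <- c("free prize win now", "win cash free",
            "meeting schedule tomorrow", "project meeting notes")
labels <- c("spam", "spam", "ham", "ham")

tokens  <- strsplit(docs, " ", fixed = TRUE)
vocab   <- sort(unique(unlist(tokens)))   # every unique word is one feature
classes <- unique(labels)

# word_counts[class, word]: how often each word occurs in each class
word_counts <- matrix(0, nrow = length(classes), ncol = length(vocab),
                      dimnames = list(classes, vocab))
for (i in seq_along(tokens)) {
  for (w in tokens[[i]]) {
    word_counts[labels[i], w] <- word_counts[labels[i], w] + 1
  }
}

log_prior <- as.numeric(log(table(labels)[classes] / length(labels)))
# add-one smoothing; the row totals are recycled down the rows
log_lik <- log((word_counts + 1) / (rowSums(word_counts) + length(vocab)))

classify_text <- function(text) {
  words <- strsplit(text, " ", fixed = TRUE)[[1]]
  words <- words[words %in% vocab]        # ignore words never seen in training
  scores <- log_prior + rowSums(log_lik[, words, drop = FALSE])
  classes[which.max(scores)]
}

classify_text("free cash")        # "spam"
classify_text("project meeting")  # "ham"
```

Training here is a single pass that counts words per class, so the cost grows linearly with the size of the corpus even when the vocabulary is large.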




Naive Bayes works best when you have a small training data set and a relatively small number of features (dimensions). If you have a huge feature list, the model may lose accuracy, because the likelihoods may not follow the Gaussian or other distribution the model assumes. Another condition for Naive Bayes to work is that the features should be independent of each other. If you understand the domain, try to analyze how the features are related to each other and whether they affect each other's likelihood. If they do not, Naive Bayes can give you good results.


Isn’t the precondition of NB that the attributes are independent of each other? Please check.
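For reference, the independence assumption discussed here is usually written as follows, where $y$ is the class and $x_1, \dots, x_n$ are the features (this notation is mine, not from the posts above). The features are assumed independent of each other given the class, which is what lets the joint likelihood factor into a product of per-feature terms:

```latex
P(y \mid x_1, \dots, x_n) \;\propto\; P(y) \prod_{i=1}^{n} P(x_i \mid y)
```

The predicted class is the $y$ that maximizes the right-hand side.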