I am working on an email classification problem with 4 classes, using the Naive Bayes algorithm. The accuracy turns out to be decent (~65%), given that there is some overlap between the classes (i.e. a couple of terms are common to 2-3 classes). The constraint is that I cannot merge any of the classes. I tried playing around with a custom stop-word list, but accuracy did not improve much.
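For reference, my current single-model setup looks roughly like the sketch below. It assumes the `e1071` package for `naiveBayes()`, and uses a toy document-term matrix with placeholder column names in place of my real data:

```r
library(e1071)  # assumed: e1071 provides naiveBayes()

# Toy stand-in for my data: a small document-term matrix (term counts)
# plus a 4-level class label. Terms and labels are placeholders.
train <- data.frame(
  offer   = c(2, 3, 0, 1, 1, 0, 0, 1),
  meeting = c(0, 1, 3, 2, 0, 1, 1, 0),
  invoice = c(0, 1, 1, 0, 2, 3, 0, 1),
  party   = c(1, 0, 0, 1, 1, 0, 2, 3),
  label   = factor(rep(c("promo", "work", "billing", "personal"), each = 2))
)

# Fit one Naive Bayes model; note that in e1071 the `laplace` argument
# only smooths categorical predictors -- numeric term counts are
# modeled as per-class Gaussians.
nb   <- naiveBayes(label ~ ., data = train, laplace = 1)
pred <- predict(nb, train[, setdiff(names(train), "label")])
mean(pred == train$label)  # training accuracy, just as a sanity check
```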
Now, I want to check whether I can improve the accuracy by ensembling multiple Naive Bayes models. I haven't done ensembling before, so I have a few queries:
How do I ensemble multiple Naive Bayes models? (Is it done by using different Laplace estimators?) I searched the web for R code but was not successful. Any help with R code would be highly appreciated.
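For what it's worth, the closest I have come is the sketch below: bagging several `naiveBayes` models (from the `e1071` package, assumed) over stratified bootstrap resamples and majority-voting their predictions. The data and column names are toy placeholders, and I am not sure this is the "right" way to ensemble NB:

```r
library(e1071)  # assumed: e1071 provides naiveBayes()

# Toy placeholder data: term counts plus a 4-level class label.
train <- data.frame(
  offer   = c(2, 3, 0, 1, 1, 0, 0, 1),
  meeting = c(0, 1, 3, 2, 0, 1, 1, 0),
  invoice = c(0, 1, 1, 0, 2, 3, 0, 1),
  party   = c(1, 0, 0, 1, 1, 0, 2, 3),
  label   = factor(rep(c("promo", "work", "billing", "personal"), each = 2))
)

fit_bagged_nb <- function(data, n_models = 15, laplace = 1) {
  lapply(seq_len(n_models), function(i) {
    # Stratified bootstrap: resample rows within each class, so every
    # model still sees all 4 classes.
    idx <- unlist(lapply(split(seq_len(nrow(data)), data$label),
                         function(rows) sample(rows, replace = TRUE)))
    naiveBayes(label ~ ., data = data[idx, ], laplace = laplace)
  })
}

predict_bagged <- function(models, newdata) {
  # One column of predicted labels per model ...
  votes <- sapply(models, function(m) as.character(predict(m, newdata)))
  # ... then take the row-wise majority vote.
  apply(votes, 1, function(v) names(which.max(table(v))))
}

set.seed(42)
models <- fit_bagged_nb(train)
preds  <- predict_bagged(models, train[, setdiff(names(train), "label")])
```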
I have read that ensembling models of the same ML algorithm (NB, in this case) may not yield a significant improvement in accuracy, since their errors are likely to be correlated with each other, so only a marginal improvement can be achieved. Any thoughts on this?
Is ensembling a trial-and-error approach, or is there a systematic way of doing it? For example, how would I know which other ML algorithms (Random Forest, SVM, etc.) would be best to ensemble with my initial Naive Bayes model?