What is the best approach to classifying emails into different categories? I have a training set in which every email has a category label, and I want to build a model on top of it so that I can label my test data set. I'm using R for this task. Please suggest an algorithm as well as text processing techniques, with R code if possible.
My dataset contains two columns: one holds the label (category) and the other holds the email text as a string.
SVM does a fairly good job, provided you keep reviewing the training set and correcting misclassified examples. Start with SVM, as it's easy to train and fairly simple to code.
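Here is a minimal sketch of how that could look in R. It rests on some assumptions: it uses the `tm` package for text cleaning and `e1071` for the SVM (neither is named in the question), the column names `label` and `text` are made up to match the two-column layout described, and the toy emails are purely illustrative. `clean_corpus` and `align_columns` are hypothetical helper names, not part of either package.

```r
# Assumed packages: tm (text processing) and e1071 (SVM).
library(tm)
library(e1071)

# Hypothetical toy data in the two-column layout from the question.
train <- data.frame(
  label = c("billing", "billing", "support", "support"),
  text  = c("Invoice attached for your payment",
            "Your payment is overdue, see invoice",
            "My password reset link does not work",
            "Cannot log in, need a password reset"),
  stringsAsFactors = FALSE
)
test <- data.frame(
  text = c("Send the invoice for my payment",
           "I need to reset my password"),
  stringsAsFactors = FALSE
)

# Basic cleaning: lower-case, drop stop words, strip punctuation and numbers.
clean_corpus <- function(text) {
  corpus <- VCorpus(VectorSource(text))
  corpus <- tm_map(corpus, content_transformer(tolower))
  corpus <- tm_map(corpus, removeWords, stopwords("english"))
  corpus <- tm_map(corpus, removePunctuation)
  corpus <- tm_map(corpus, removeNumbers)
  tm_map(corpus, stripWhitespace)
}

# Force a matrix onto a fixed vocabulary; terms not seen in it become zeros.
align_columns <- function(m, terms) {
  out <- matrix(0, nrow = nrow(m), ncol = length(terms),
                dimnames = list(NULL, terms))
  shared <- intersect(colnames(m), terms)
  out[, shared] <- m[, shared, drop = FALSE]
  out
}

# Bag-of-words term counts; the vocabulary comes from the training set only.
train_x <- as.matrix(DocumentTermMatrix(clean_corpus(train$text)))
vocab   <- colnames(train_x)
test_x  <- align_columns(
  as.matrix(DocumentTermMatrix(clean_corpus(test$text))), vocab
)

# Linear-kernel SVM; scale = FALSE because many term columns are near-constant.
model <- svm(x = train_x, y = factor(train$label),
             kernel = "linear", scale = FALSE)

# Predicted labels for the unlabeled test emails.
pred <- predict(model, test_x)

# Confusion table on the training set, to spot examples worth re-checking.
confusion <- table(actual    = train$label,
                   predicted = predict(model, train_x))
print(pred)
print(confusion)
```

Off-diagonal cells in the confusion table point to emails the model gets wrong, which is where reviewing and correcting the training labels tends to pay off. On real data you would want to compute that table on a held-out split rather than on the training set itself.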