I have around 13,000 rows of health care text data. Each row has a raw text column containing 1-10 sentences and a category column holding one of 400 categories (classes). Which classification algorithms should I try on this data?
Some categories are independent, while others are somewhat related. The data is also not uniformly distributed across categories: around 40 of them have relatively few examples.
I am attaching the log probabilities of each class here.
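
For context, here is a minimal sketch of how per-class log probabilities like these can be computed from the category column. It assumes the log probability of a class is simply the natural log of its relative frequency in the data; the function name `class_log_probabilities` and the toy labels are hypothetical and only for illustration, not taken from my actual data.

```python
import math
from collections import Counter


def class_log_probabilities(labels):
    """Return {category: log(count / total)} for an iterable of category labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    # Natural log of each category's relative frequency (its empirical prior).
    return {category: math.log(count / total) for category, count in counts.items()}


# Toy labels standing in for the real category column (hypothetical values).
toy_labels = ["cardiology", "cardiology", "cardiology", "oncology"]
log_probs = class_log_probabilities(toy_labels)

# Print categories from rarest to most common to make the imbalance visible.
for category, log_p in sorted(log_probs.items(), key=lambda item: item[1]):
    print(f"{category}: {log_p:.4f}")
```

Sorting by log probability puts the sparsely populated categories first, which is where the roughly 40 small classes would show up.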