Classifying IT Incidents



I am trying to solve a use case with text analytics. I have data exported from a ticketing system, with a category, ticket type, title and problem description for each ticket. I need to visualize the problem types (for example, new employee registration or password reset). My assumption is that I would extract a set of keywords from the ticket descriptions and use them to reach conclusions such as "30% of the tickets were password requests".

Which machine learning algorithms are best suited to this use case? I am a newbie here, so please let me know if I am missing any info. Thanks in advance.


Hello Daisy,

Based on the info you provided, if all the fields in your database are well coded (no misspellings, etc.), you can get a very clear view of the number of incidents associated with “password change” with just a frequency table. You can build this frequency table at several levels if you need to: by employee level, category, etc.
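
A rough sketch of such a frequency table in Python, using only the standard library. The ticket rows and the field names ("category", "problem_type") are made up for illustration; your export will have its own columns. The normalize step is an assumption on my side to absorb small differences in case and spacing, and it will not fix real misspellings.

```python
from collections import Counter

# Hypothetical tickets; the field names are assumptions, not the real export.
tickets = [
    {"category": "Access", "problem_type": "password reset"},
    {"category": "Access", "problem_type": "password reset"},
    {"category": "Onboarding", "problem_type": "new employee registration"},
    {"category": "Access", "problem_type": "Password  Reset "},
    {"category": "Hardware", "problem_type": "laptop replacement"},
]


def normalize(label):
    """Lowercase and collapse whitespace so near-identical labels count together."""
    return " ".join(label.lower().split())


def frequency_table(rows, *fields):
    """Count rows per combination of fields; return (key, count, share) tuples."""
    counts = Counter(tuple(normalize(row[f]) for f in fields) for row in rows)
    total = sum(counts.values())
    return [(key, n, n / total) for key, n in counts.most_common()]


# One level: by problem type only.
by_type = frequency_table(tickets, "problem_type")

# Two levels: by category, then problem type.
by_category_and_type = frequency_table(tickets, "category", "problem_type")

for key, n, share in by_type:
    print(f"{' / '.join(key):30s} {n:3d} {share:6.1%}")
```

The share column is what gives you statements like "x% of the tickets were password resets", and passing more field names adds more levels to the table.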

No heavy machine learning needed at all…

Kind Regards,



Look up topic modeling, and specifically LDA (Latent Dirichlet Allocation). You can find a package in R that runs this algorithm.

What you need to do is give the algorithm the text and specify the number of topics you want it to produce. LDA will output the topics, the prominent keywords that describe each topic, and the proportion of each topic in each text. You can then look for common themes in each topic’s keywords, give the topic a name that reflects what you think it is picking up on, and use those names as your classification.
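
To make the inputs and outputs concrete, here is a minimal pure-Python sketch of LDA fitted with collapsed Gibbs sampling, so it runs without any extra packages. It is written in Python rather than R, the tickets are invented, and the function name lda_gibbs and the values for alpha, beta and the iteration count are my own assumptions. For real data a library implementation would be the practical choice.

```python
import random


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA on a list of token lists.

    Returns (top_words, doc_mix): the 3 most frequent words per topic and
    the estimated topic proportions for every document.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]                 # tokens of doc d in topic k
    topic_word = [[0] * vocab_size for _ in range(num_topics)]   # word w counts in topic k
    topic_total = [0] * num_topics                               # all tokens in topic k
    assignments = []

    # Start from a random topic for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(zs)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                k = assignments[d][i]
                # Take the token out of the counts before resampling it.
                doc_topic[d][k] -= 1
                topic_word[k][wid] -= 1
                topic_total[k] -= 1
                # Conditional weight of each topic given all other assignments.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wid] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][wid] += 1
                topic_total[k] += 1

    top_words = [
        [vocab[i] for i in sorted(range(vocab_size), key=lambda i: -topic_word[k][i])[:3]]
        for k in range(num_topics)
    ]
    doc_mix = [
        [(doc_topic[d][k] + alpha) / (len(doc) + num_topics * alpha)
         for k in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    return top_words, doc_mix


# Hypothetical ticket descriptions, tokenized by a plain whitespace split.
tickets = [
    "password reset request for locked account",
    "forgot password need reset",
    "new employee registration and account setup",
    "register new employee in hr system",
    "password expired reset please",
    "new joiner registration laptop setup",
]
docs = [t.lower().split() for t in tickets]

top_words, doc_mix = lda_gibbs(docs, num_topics=2)
for k, words in enumerate(top_words):
    print(f"topic {k}: {', '.join(words)}")
for ticket, mix in zip(tickets, doc_mix):
    print([round(p, 2) for p in mix], ticket)
```

The topic numbers themselves carry no meaning; naming them after reading the keywords is the manual step. Averaging the per-ticket proportions, or counting each ticket under its largest topic, then gives the share of each problem type.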

Hope this helps.


Thank you aayush.




Can you please share the dataset for these IT incidents?
I am also working on the same problem using the naive Bayes algorithm,
and I hope I can add value to your work.

Utkarsh Khare