What is the research methodology of a text classification project?


I’m currently writing my thesis. My research project builds a model to classify text, and I also investigate the impact of stemming, BoW and TF-IDF on the performance of the models. My question is: what is the research methodology for such a project? I have been searching for the last two days but have not found explicit information on this. Some of the research I have read describes this type of research as experimental, while other sources call it Knowledge Discovery in Databases (KDD). I would appreciate any help with this, and also with whether this research is quantitative or qualitative. I have been reading on the internet but, unfortunately, have not found an answer to that either.


Hi @mayona, both explanations are correct: the major classification of the research type is experimental, and the minor one is KDD. For the latter, search for LEXICAL ANALYSIS, which is the phase that comes before KDD. Knowledge is the outcome of lexical analysis, and some software packages make this work simpler: “Sphinx Lexica” (from France), QSR NVivo and MAXQDA.

In the correct methodological order, the phases are:

  1. identifying lexical components that express or represent some idea or concept;
  2. aggregating the common lexical components under a concept;
  3. expressing the concept using the lexical components identified.
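
The three phases above can be sketched in a few lines of Python. This is only a rough illustration, not how Sphinx Lexica, NVivo or MAXQDA work internally: the suffix list, the function names and the sample sentence are all hypothetical, and the "stemmer" is deliberately crude (a real project would use a proper one).

```python
import re
from collections import defaultdict

# Hypothetical suffix list for a deliberately crude stemmer (illustration only).
SUFFIXES = ("ing", "ed", "es", "s")


def crude_stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def lexical_components(text):
    """Phase 1: identify the lexical components (lowercased word tokens)."""
    return re.findall(r"[a-z]+", text.lower())


def aggregate_by_concept(tokens):
    """Phase 2: group tokens that share a stem under one concept."""
    concepts = defaultdict(list)
    for token in tokens:
        concepts[crude_stem(token)].append(token)
    return dict(concepts)


def express_concepts(concepts):
    """Phase 3: express each concept by how often its components occur."""
    return {concept: len(members) for concept, members in concepts.items()}


# Made-up sample sentence, used only to exercise the three phases.
text = "The model classifies texts. Classified text helps models."
tokens = lexical_components(text)
concepts = aggregate_by_concept(tokens)
profile = express_concepts(concepts)
```

Here `concepts` maps each stem to the surface forms found in the text, and `profile` is a simple count per concept, which is essentially a bag-of-words representation built on stems.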

I hope this helps!