I have news articles with timestamps. What would be a general workflow, and which frameworks could I use, to identify new and emerging “topics/words” in the articles? Any ideas or suggestions would be helpful!
As a basic analysis, you could group the articles by week, remove stop words, compute a word count per week, and then plot a basic trend line for each word across the weeks.
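A minimal sketch of that weekly word-count idea, using only the Python standard library. The sample articles, the tiny stop-word list, and the helper names (`tokenize`, `weekly_word_counts`, `trend`) are all made up for illustration; a real pipeline would likely use a fuller stop-word list and a plotting library for the trend lines.

```python
import re
from collections import Counter, defaultdict
from datetime import date

# Hypothetical sample data: (publication date, article text).
ARTICLES = [
    (date(2024, 1, 1), "The market rallied as investors cheered the rate cut"),
    (date(2024, 1, 3), "Investors watch the market ahead of earnings"),
    (date(2024, 1, 9), "A new chatbot launch shook the market"),
    (date(2024, 1, 10), "The chatbot drew millions of users and the chatbot hype grew"),
]

# Tiny illustrative stop-word list (an assumption, not a standard list).
STOP_WORDS = {"the", "a", "as", "of", "and", "ahead"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, drop stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]


def weekly_word_counts(articles):
    """Map (ISO year, ISO week) -> Counter of words seen that week."""
    counts = defaultdict(Counter)
    for published, text in articles:
        iso = published.isocalendar()
        week = (iso[0], iso[1])  # ISO year and ISO week number
        counts[week].update(tokenize(text))
    return dict(counts)


def trend(counts, word):
    """Return the word's count for every week, in chronological order."""
    return [(week, counts[week][word]) for week in sorted(counts)]


counts = weekly_word_counts(ARTICLES)
print(trend(counts, "chatbot"))  # [((2024, 1), 0), ((2024, 2), 3)]
print(trend(counts, "market"))   # [((2024, 1), 2), ((2024, 2), 1)]
```

A word whose series starts at zero and then rises (like "chatbot" here) is a candidate emerging term, while a word with a flat series is just background vocabulary.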
@shivas, feel free to comment or ask more about it!
I think I agree with @palbha here. You could use TF-IDF to down-weight words that appear in a majority of articles (irrespective of date), and then pick only the “new” or “rare” words that show up in the recent articles.
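Here is a rough sketch of the TF-IDF idea with a hand-rolled score (term frequency in the recent articles times `log(N / document frequency)` over the whole corpus), so it runs without extra dependencies. The sample documents and the function names `idf_table` and `top_recent_terms` are hypothetical; in practice a library such as scikit-learn's `TfidfVectorizer` would probably be the more convenient choice, and its exact weighting formula differs from this simplified one.

```python
import math
import re
from collections import Counter

# Hypothetical corpus, oldest first; the last two documents count as "recent".
DOCS = [
    "the market rallied as investors cheered",
    "investors watch the market ahead of earnings",
    "the market dipped while investors waited",
    "new chatbot shook the market",
    "the chatbot drew investors while chatbot hype grew",
]


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def idf_table(docs):
    """Inverse document frequency over the whole corpus: log(N / df)."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))  # count each word once per document
    return {word: math.log(n / d) for word, d in df.items()}


def top_recent_terms(docs, n_recent, k):
    """Rank words in the last n_recent docs by (recent term count) * idf."""
    idf = idf_table(docs)
    tf = Counter()
    for doc in docs[-n_recent:]:
        tf.update(tokenize(doc))
    scores = {word: tf[word] * idf[word] for word in tf}
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]


print(top_recent_terms(DOCS, n_recent=2, k=1))  # ['chatbot']
```

Words that occur in every article, such as "the", get an IDF of zero and drop out on their own, which is the down-weighting effect described above. Note that this score also rewards words that are simply rare, so combining it with the per-week counts helps separate genuinely new terms from one-off noise.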
Thanks @palbha. My thoughts were along the same lines.