DATASET OPTIMIZER (Sentiment Analysis)



I am focusing on machine learning. Currently I am improving a Naive Bayes text classifier, which I am coding from scratch rather than using a library.
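The poster's own code is not shown, so here is only a minimal sketch of what a from-scratch multinomial Naive Bayes text classifier typically looks like (all function and variable names are illustrative assumptions, not the poster's actual implementation):

```python
# Minimal multinomial Naive Bayes for text, written without ML libraries.
# Illustrative sketch only -- names and structure are assumptions.
import math
from collections import Counter, defaultdict

def tokenize(text):
    return text.lower().split()

def train(docs):
    """docs: list of (text, label) pairs -> (class_counts, word_counts, vocab)."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in docs:
        class_counts[label] += 1
        for w in tokenize(text):
            word_counts[label][w] += 1
            vocab.add(w)
    return class_counts, word_counts, vocab

def predict(text, class_counts, word_counts, vocab):
    total_docs = sum(class_counts.values())
    best, best_lp = None, float("-inf")
    for label in class_counts:
        # log prior + log likelihoods with Laplace (add-one) smoothing
        lp = math.log(class_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for w in tokenize(text):
            lp += math.log((word_counts[label][w] + 1) / denom)
        if lp > best_lp:
            best, best_lp = label, lp
    return best

docs = [("great movie loved it", "pos"),
        ("terrible boring film", "neg"),
        ("loved the acting great fun", "pos"),
        ("boring waste of time", "neg")]
model = train(docs)
print(predict("great fun movie", *model))  # prints: pos
```

Working in log space avoids floating-point underflow on long documents, and add-one smoothing keeps unseen words from zeroing out a class.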

I am enthusiastic about researching better and original algorithms. One of them can make a dataset more compact, so that a dataset with a large number of rows is reduced to a very small number of rows compared to the number of rows matched by the query.

Suppose you have a dataset with 10 thousand rows and a query that matches 5 thousand of them; using the dataset optimizer, we can reduce the dataset to 5 thousand rows or even (far) fewer. I call this a sufficient dataset.
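The optimizer itself is not described in this thread, so the following is only a hedged illustration of one simple instance-selection idea that shrinks a labeled text dataset: per class, greedily keep the fewest documents needed to cover that class's vocabulary. This is a generic baseline, not the poster's method, and every name in it is an assumption:

```python
# Hedged sketch of a generic dataset-reduction baseline (NOT the
# poster's algorithm): per class, greedily keep documents until the
# kept set covers the class's full vocabulary.
from collections import defaultdict

def reduce_dataset(docs):
    """docs: list of (text, label) pairs -> smaller list of pairs."""
    by_class = defaultdict(list)
    for text, label in docs:
        by_class[label].append(text)
    kept = []
    for label, texts in by_class.items():
        target = set(w for t in texts for w in t.lower().split())
        covered = set()
        while covered != target:
            # pick the document that adds the most uncovered words
            best = max(texts, key=lambda t: len(set(t.lower().split()) - covered))
            gain = set(best.lower().split()) - covered
            if not gain:
                break
            covered |= gain
            kept.append((best, label))
    return kept

docs = [("good good film", "pos"),
        ("good film", "pos"),
        ("great good film fun", "pos"),
        ("bad movie", "neg"),
        ("bad bad movie", "neg")]
reduced = reduce_dataset(docs)
print(len(reduced), "of", len(docs), "rows kept")  # prints: 2 of 5 rows kept
```

Because multinomial Naive Bayes only sees word counts per class, dropping near-duplicate rows like this often changes the learned estimates far less than the row count suggests, though accuracy on a held-out set should always be checked.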

If you want to try it, you can send me your dataset (at least 10 thousand rows) and your query (it must be relevant to the dataset).

Please note that I can't tackle every request. I'll pick one of them and send the result back to you for your review.

The result should be a valuable dataset that maintains or even improves accuracy.

Thanks in advance


@Seremonia, I would recommend that instead of asking for a dataset, you use an openly available dataset to test your model. Here is a resource for you:


Hi, thanks

Previously I used datasets from Kaggle, but I was a little late and most of the competitions had already expired. I couldn't submit for a test, so instead I evaluated by splitting the dataset myself, or mostly used the data in my own work.
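The "splitting the dataset myself" evaluation mentioned above can be sketched as a simple shuffled holdout split, again without libraries; the 80/20 ratio, seed, and toy data here are illustrative assumptions:

```python
# Minimal sketch of a do-it-yourself holdout split for evaluation.
# The ratio, seed, and sample rows are illustrative assumptions.
import random

def train_test_split(rows, test_ratio=0.2, seed=42):
    rows = rows[:]                      # copy so the caller's list is untouched
    random.Random(seed).shuffle(rows)   # deterministic shuffle for repeatability
    cut = int(len(rows) * (1 - test_ratio))
    return rows[:cut], rows[cut:]

rows = [(f"text {i}", "pos" if i % 2 else "neg") for i in range(10)]
train_rows, test_rows = train_test_split(rows)
print(len(train_rows), len(test_rows))  # prints: 8 2
```

Training on `train_rows` and scoring on `test_rows` gives an accuracy estimate even when a competition's official leaderboard is no longer accepting submissions.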

As for the archive-ics-edu source you mentioned, although I knew of it, I had never used it before until your suggestion reminded me. So thank you, I will try that link.

It would be nice if someone could provide their own project to test with me. Either way, this is a good start, thank you.


If someone could give us a link to a sentiment analysis competition (the latest one that is still active), please don't hesitate to post it in this thread.

Thank you so much