Welcome to Practice Problem: Twitter Sentiment Analysis



Thanks that was helpful!


Hi, am I missing something? How is id - 18 considered racist or sexist in the train file?


How do I find the actual class values for the test data?
Labels for the test data are not given in the question. How do I find TP, TN, FP, FN?


You have to predict the labels in the test set using the information from the train set.


Yes. But when calculating classification performance, we need the actual values to compare with our predicted values, right? How do I get them?


You can split your training data into train and validation sets (say, an 80-20 split). Use the 80% to train your model and the remaining 20% to test its performance.

The test set provided does not have the target variable. You have to make predictions on the test set and submit them. Once you submit, you will get the score.
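A minimal sketch of the 80-20 split described above, using only the Python standard library. The helper name split_train_validation, the seed, and the toy rows are assumptions made for illustration; they are not part of the competition data. If you use scikit-learn, train_test_split from sklearn.model_selection does the same job.

```python
import random

def split_train_validation(rows, validation_fraction=0.2, seed=42):
    """Shuffle the rows and split them into (train, validation) lists."""
    shuffled = list(rows)                      # copy so the input is untouched
    random.Random(seed).shuffle(shuffled)      # fixed seed for a repeatable split
    n_validation = int(len(shuffled) * validation_fraction)
    return shuffled[n_validation:], shuffled[:n_validation]

# Made-up stand-in for the rows of the train file: (id, label, tweet).
rows = [(i, 1 if i % 10 == 0 else 0, f"tweet {i}") for i in range(1, 101)]

train_rows, validation_rows = split_train_validation(rows)
print(len(train_rows), len(validation_rows))
```

Because the validation rows keep their labels, you can compare them with your predictions to get TP, TN, FP and FN before submitting anything.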


Hi @brewercj39,

You can consider this incorrect label an error in the data and deal with it accordingly.


Thanks a lot.


Hi, when I train the model I get zero splits in the tree, and I cannot work out why. Any suggestions would be highly appreciated.


@vini.akki are you training on the one-hot-encoded vectors?


For anyone who is new to text classification … this is a good place to start: YouTube


Hi, I uploaded my solution (source code and test.csv, updated with label values). I am getting an error screen with the message “keys: [tweet] were not expected but found in your submission”.
I am unable to understand this message. Can you please help me understand what it means?


@RamPichai the submission file should contain only 2 columns: id and label.
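Here is a rough sketch of writing a file with only those two columns, using Python's csv module. The rows and ids below are made up, and write_submission is a hypothetical helper name; the point is that extrasaction="ignore" drops any extra key such as tweet.

```python
import csv
import io

def write_submission(rows, out_file):
    """Write only the id and label columns; extra keys such as 'tweet' are dropped."""
    writer = csv.DictWriter(out_file, fieldnames=["id", "label"], extrasaction="ignore")
    writer.writeheader()
    for row in rows:
        writer.writerow(row)

# Made-up predictions that still carry the tweet text.
predicted_rows = [
    {"id": 1, "label": 0, "tweet": "example tweet one"},
    {"id": 2, "label": 1, "tweet": "example tweet two"},
]

# An in-memory buffer keeps the example self-contained.
buffer = io.StringIO()
write_submission(predicted_rows, buffer)
submission_text = buffer.getvalue()
print(submission_text)
```

To write a real file, pass a handle opened with open(path, "w", newline="") instead of the buffer.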


Thanks. I was able to submit it now, and the result shows a decimal score. What does it mean? What is the score out of, and how do I see what improvements I can make? Kindly help.


You can check the leaderboard to find your rank and the scores of other participants. To see whether your score improves, make changes to your model and submit another set of predictions.


@RamPichai the evaluation metric is F1-Score; its value ranges from 0 to 1.
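For reference, a small sketch of how F1 comes out of TP, FP and FN, assuming it is computed for the positive class (label 1); the exact variant the leaderboard uses is not stated in this thread. The labels below are made up. Note that a model can have decent accuracy and still get a low F1 when the positive class is rare.

```python
def f1_score_binary(y_true, y_pred):
    """F1 for the positive class (label 1), computed from TP, FP and FN."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)

# Made-up validation labels: 2 positives out of 10.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
always_negative = [0] * 10                        # never predicts the positive class
one_hit = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]          # 1 TP, 1 FP, 1 FN

print(accuracy(y_true, always_negative), f1_score_binary(y_true, always_negative))
print(accuracy(y_true, one_hit), f1_score_binary(y_true, one_hit))
```

Both sets of predictions have the same accuracy here, but only the second one scores above zero on F1.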


OK, thanks. I am a beginner, so I have all sorts of questions like this. I just checked the leaderboard and can see many users listed on it. But for this Twitter Sentiment Analysis problem, I am unable to check others' scores or where I stand in this hackathon. Is there anything I am missing here?


I am able to see everybody’s rank and score. The link for the Twitter sentiment leaderboard is shared below.

Twitter Sentiment Analysis LB

Also, I recommend going through the following blog


That’s wonderful. The speed with which I get responses is very motivating. Thanks a ton.


I am splitting the training data into train and validation sets.
While the confusion matrix on the validation set gives me 70% accuracy, my score on the leaderboard is 0.05, which is very low.
Am I making some mistake while submitting?