# Understanding the results of this ML project

#1

I ran my code and got the following results in Python:

1. For Naive Bayes

```
               precision    recall  f1-score   support

Less than 50k       0.98      0.85      0.91     93576
More than 50k       0.24      0.72      0.36      6186

  avg / total       0.93      0.84      0.88     99762
```

2. For XGBoost

```
               precision    recall  f1-score   support

Less than 50k       0.96      0.99      0.98     93576
More than 50k       0.77      0.37      0.50      6186

  avg / total       0.95      0.95      0.95     99762
```

As can be seen, the precision for the minority class has increased to 0.77, which is what we wanted in this project, I guess. What do the recall and F1-score indicate in this case, in terms of predicting less than or more than 50k? I read the theoretical definitions on Wikipedia but could not relate them specifically to this project. Can you please explain what each entry in this table means?
I got this result by using `metrics.classification_report` in Python.
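For anyone wanting to reproduce a table like this, a minimal sketch assuming scikit-learn, with tiny made-up labels standing in for the project's actual data:

```python
# Minimal sketch of producing such a report with scikit-learn.
# y_true / y_pred are made-up examples, NOT this project's data.
from sklearn.metrics import classification_report

y_true = [0, 0, 0, 0, 0, 1, 1, 1]   # 0 = less than 50k, 1 = more than 50k
y_pred = [0, 0, 0, 1, 0, 1, 1, 0]   # hypothetical model predictions

report = classification_report(
    y_true, y_pred,
    target_names=["Less than 50k", "More than 50k"],
)
print(report)
```

Each row of the printed table gives precision, recall, F1, and support (the number of true instances) for one class.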

Regards
Raju

#2

@pudkeaayush

Basic interpretation -
Precision - what percentage of the instances the model predicts as positive are actually positive.
Recall - what percentage of all the actual positive instances the model catches (predicts as positive).

So if the precision of XGBoost for "more than 50k" is 0.77, that means that out of every 100 instances your model predicts as more than 50k, about 77 actually are more than 50k (correct).

Similarly, the recall of XGBoost for "more than 50k" is 0.37, which means that out of every 100 actual more-than-50k instances in your data, the model correctly predicts only about 37 of them as more than 50k.

The F measure is the harmonic mean of precision and recall, so it doesn't have as direct an interpretation on its own. Formula -> https://en.wikipedia.org/wiki/F1_score
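Plugging in the XGBoost numbers for the more-than-50k class shows where the 0.50 in that report comes from:

```python
# F1 as the harmonic mean of precision and recall, using the
# "More than 50k" row of the XGBoost report above.
precision = 0.77
recall = 0.37

f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 2))  # ~0.50, matching the report
```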

Hope this helps.

Regards,
Aayush