How can I create a confusion matrix in Python?

data_science
python

#1

Hi,

I am using the naive Bayes algorithm to predict class probabilities for a test data set. Now I want to check the predictive power of the model. Should I use a confusion matrix or log loss? Can you help me with methods to compute and plot a confusion matrix or log loss?

Thanks,
Mukesh


#2

Mukesh,

To evaluate the predicted probabilities of a multi-class classifier, we should go with log loss (also called logistic regression loss or cross-entropy loss), which is defined on probability estimates. A confusion matrix, by contrast, only counts how many observations were assigned to each class compared with their actual class; it does not use the probabilities.
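To make the distinction concrete, here is a minimal sketch using scikit-learn's `log_loss` and `confusion_matrix`. The labels and probabilities are made-up illustrative values, not output from a real model.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, log_loss

# Hypothetical true labels for six observations, three classes (0, 1, 2)
y_true = np.array([0, 1, 2, 2, 1, 0])

# Hypothetical predicted probabilities, one row per observation
y_proba = np.array([
    [0.8, 0.1, 0.1],
    [0.2, 0.6, 0.2],
    [0.1, 0.2, 0.7],
    [0.3, 0.4, 0.3],  # class 1 is most likely here, but the truth is 2
    [0.1, 0.8, 0.1],
    [0.5, 0.3, 0.2],
])

# Log loss scores the probabilities themselves (lower is better)
loss = log_loss(y_true, y_proba, labels=[0, 1, 2])

# The confusion matrix only sees the hard predictions (most likely class)
y_pred = y_proba.argmax(axis=1)
cm = confusion_matrix(y_true, y_pred, labels=[0, 1, 2])

print(loss)
print(cm)
```

Two models can produce the same confusion matrix and still differ in log loss, because log loss also penalizes correct predictions made with low confidence.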

You can also refer to this link for more evaluation metrics and the Python code to compute them:
http://scikit-learn.org/stable/modules/model_evaluation.html

Regards,
Imran


#3

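# Here is how you create a confusion matrix
from sklearn.metrics import confusion_matrix
# train_test_split now lives in sklearn.model_selection
# (sklearn.cross_validation was removed in scikit-learn 0.20)
from sklearn.model_selection import train_test_split
import numpy as np

# x (features) and y (labels) are your own data
a_train, a_test, b_train, b_test = train_test_split(x, y, test_size=0.20, random_state=8)
# Sorted class labels present in the test set, with their counts
classes, count = np.unique(b_test, return_counts=True)

# Placeholder: replace with your model's predictions for a_test
b_test_pred = predict_whatever_your_model

# Rows are true classes, columns are predicted classes
confusion_matrix(b_test, b_test_pred, labels=classes)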
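For anyone who wants something that runs end to end, below is a minimal sketch of the same steps. The iris data set and `GaussianNB` are my own assumptions for illustration (the original question only says "naive bayes"); swap in your own `x`, `y`, and model.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Assumption: iris stands in for your own features (x) and labels (y)
x, y = load_iris(return_X_y=True)

a_train, a_test, b_train, b_test = train_test_split(
    x, y, test_size=0.20, random_state=8
)

# Assumption: Gaussian naive Bayes as the classifier
model = GaussianNB().fit(a_train, b_train)
b_test_pred = model.predict(a_test)

# Use all known classes so the matrix shape is fixed
# even if a class is missing from the test split
classes = np.unique(y)
cm = confusion_matrix(b_test, b_test_pred, labels=classes)
print(cm)
```

The diagonal holds the correctly classified observations. To plot the matrix, recent scikit-learn versions also provide `sklearn.metrics.ConfusionMatrixDisplay`, which needs matplotlib.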


#4

Hello,

You might be interested in my project https://github.com/scls19fr/pandas_confusion and its pip package https://pypi.python.org/pypi/pandas_confusion

With this package, a confusion matrix can be pretty-printed and plotted. You can also binarize a confusion matrix and get class statistics such as TP, TN, FP, FN, ACC, TPR, FPR, FNR, TNR (SPC), LR+, LR-, DOR, PPV, FDR, FOR, NPV, as well as some overall statistics.
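As a rough illustration of what a few of those class statistics mean, the sketch below derives them with plain NumPy rather than with pandas_confusion itself. The matrix values are made up, and rows are assumed to be true classes with columns as predicted classes.

```python
import numpy as np

# Hypothetical 3-class confusion matrix (rows = true, columns = predicted)
cm = np.array([
    [5, 1, 0],
    [2, 6, 1],
    [0, 2, 8],
])

# Per-class counts, treating each class in turn as the "positive" class
tp = np.diag(cm)                 # correctly predicted as this class
fp = cm.sum(axis=0) - tp         # predicted as this class, actually another
fn = cm.sum(axis=1) - tp         # actually this class, predicted as another
tn = cm.sum() - (tp + fp + fn)   # everything else

tpr = tp / (tp + fn)             # sensitivity / recall per class
ppv = tp / (tp + fp)             # precision per class
acc = tp.sum() / cm.sum()        # overall accuracy

print(tp, fp, fn, tn)
print(tpr, ppv, acc)
```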

Kind regards