Examples of how to create a confusion matrix and infer the true positive, true negative, false positive, and false negative values using scikit-learn in Python
Create a confusion matrix with scikit-learn
One way to create a confusion matrix is to use the confusion_matrix function from scikit-learn:
from sklearn.metrics import confusion_matrix

y_true = [1,1,0,0,1]
y_pred = [1,1,1,0,1]

cm = confusion_matrix(y_true, y_pred, labels=[0, 1])
print(cm)
returns
[[1 1]
 [0 3]]
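In the matrix returned by scikit-learn, rows correspond to the true classes and columns to the predicted classes. A minimal sketch that makes this layout explicit by wrapping the matrix in a pandas DataFrame (the "true ..." and "predicted ..." names are only illustrative labels, not part of scikit-learn):

```python
import pandas as pd
from sklearn.metrics import confusion_matrix

y_true = [1, 1, 0, 0, 1]
y_pred = [1, 1, 1, 0, 1]
labels = [0, 1]

cm = confusion_matrix(y_true, y_pred, labels=labels)

# Rows are the true classes, columns are the predicted classes.
df_cm = pd.DataFrame(
    cm,
    index=[f"true {label}" for label in labels],
    columns=[f"predicted {label}" for label in labels],
)
print(df_cm)
```

For example, the cell at row "true 0" and column "predicted 1" counts the samples of class 0 that were predicted as class 1.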

Get tn, fp, fn, tp for a binary classification
tn, fp, fn, tp = confusion_matrix(list(y_true), list(y_pred), labels=[0, 1]).ravel()

print('True Positive', tp)
print('True Negative', tn)
print('False Positive', fp)
print('False Negative', fn)
gives
True Positive 3
True Negative 1
False Positive 1
False Negative 0
Calculate the accuracy score
from sklearn.metrics import accuracy_score

accuracy_score(y_true, y_pred)
gives
0.8
same as doing
acc = (tp+tn) / (tp+tn+fn+fp)
print(acc)
gives
0.8
Calculate tp, tn, fp and fn as fractions of all samples
Here each count is divided by the total number of samples, so the four values sum to 1:
tot = cm.sum()
same as
tot = tn+tp+fp+fn
returns here
5
and then
print('True Positive fraction', tp/tot)
print('True Negative fraction', tn/tot)
print('False Positive fraction', fp/tot)
print('False Negative fraction', fn/tot)
returns
True Positive fraction 0.6
True Negative fraction 0.2
False Positive fraction 0.2
False Negative fraction 0.0
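Dividing by the total number of samples is not the same as the usual definitions of these rates, which condition each count on the actual class: for instance the true positive rate (recall or sensitivity) is tp / (tp + fn). A minimal sketch of those conventional definitions, reusing the same binary example (the variable names tpr, tnr, fpr and fnr are only illustrative):

```python
from sklearn.metrics import confusion_matrix

y_true = [1, 1, 0, 0, 1]
y_pred = [1, 1, 1, 0, 1]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()

# Each rate is conditioned on the actual class, not on the total sample count.
tpr = tp / (tp + fn)  # true positive rate (sensitivity, recall)
tnr = tn / (tn + fp)  # true negative rate (specificity)
fpr = fp / (fp + tn)  # false positive rate (1 - specificity)
fnr = fn / (fn + tp)  # false negative rate (1 - sensitivity)

print('True Positive Rate', tpr)
print('True Negative Rate', tnr)
print('False Positive Rate', fpr)
print('False Negative Rate', fnr)
```

With this data the rates are 1.0, 0.5, 0.5 and 0.0 respectively. Note that a denominator becomes zero when one of the two classes never occurs in y_true, so that case would need separate handling.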
Get tn, fp, fn, tp with more than two categories
from sklearn.metrics import confusion_matrix

y_true = [1,1,0,0,1,1,2,2]
y_pred = [1,1,1,0,2,1,2,2]

cm = confusion_matrix(y_true, y_pred)
print(cm)
gives
[[1 1 0]
 [0 3 1]
 [0 0 2]]

and
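import numpy as np

# Column sums minus the diagonal: predicted as the class, but actually another class
fp = cm.sum(axis=0) - np.diag(cm)
# Row sums minus the diagonal: actually the class, but predicted as another class
fn = cm.sum(axis=1) - np.diag(cm)
# The diagonal holds the correctly classified samples of each class
tp = np.diag(cm)
# Everything else is a true negative for that class
tn = cm.sum() - (fp + fn + tp)

print(fp,fn,tp,tn)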
returns
[0 1 1] [1 1 0] [1 3 2] [6 3 5]
Get tp, tn, fp, fn for a given category
Category 0
idx = 0
print(fp[idx], fn[idx], tp[idx], tn[idx])
returns
0 1 1 6
Category 1
idx = 1
print(fp[idx], fn[idx], tp[idx], tn[idx])
returns
1 1 3 3
Category 2
idx = 2
print(fp[idx], fn[idx], tp[idx], tn[idx])
returns
1 0 2 5
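As a cross-check of the per-category values above, scikit-learn also provides multilabel_confusion_matrix, which treats multiclass data in a one-vs-rest fashion and returns one 2x2 matrix per class. A minimal sketch, assuming the same y_true and y_pred as in the multiclass example:

```python
from sklearn.metrics import multilabel_confusion_matrix

y_true = [1, 1, 0, 0, 1, 1, 2, 2]
y_pred = [1, 1, 1, 0, 2, 1, 2, 2]

# One 2x2 matrix per class, each laid out as [[tn, fp], [fn, tp]].
mcm = multilabel_confusion_matrix(y_true, y_pred, labels=[0, 1, 2])

tn = mcm[:, 0, 0]
fp = mcm[:, 0, 1]
fn = mcm[:, 1, 0]
tp = mcm[:, 1, 1]

print(fp, fn, tp, tn)
```

This should give the same four arrays as the NumPy computation on the full confusion matrix.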
Calculate the accuracy score
from sklearn.metrics import accuracy_score

accuracy_score(y_true, y_pred)
gives
0.75
same as
acc = np.diag(cm).sum() / cm.sum()
Plot a confusion matrix with matplotlib and seaborn
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sn

# Apply seaborn's default theme; the 'seaborn' style name used with
# mpl.style.use is not accepted by recent Matplotlib releases
sn.set_theme()

# Wrap the matrix in a DataFrame so the heatmap axes show the class indices
df_cm = pd.DataFrame(cm, index=range(cm.shape[0]), columns=range(cm.shape[1]))

fig = plt.figure()
ax = fig.add_subplot(111)
ax.set_aspect(1)

cmap = sn.cubehelix_palette(light=1, as_cmap=True)

res = sn.heatmap(df_cm, annot=True, fmt='.2f', cmap=cmap)
res.invert_yaxis()

#plt.yticks([0.5,1.5,2.5], [ '0', '1', '2'],va='center')
plt.title('Confusion Matrix')
plt.savefig('confusion_matrix_1.png', dpi=100, bbox_inches='tight')
plt.show()
Normalize the confusion matrix
Add the following before df_cm is created, so that each cell is expressed as a percentage of all samples:
total = cm.sum()
cm = cm * 100.0 / total
and replace
res = sn.heatmap(df_cm, annot=True, fmt='.2f', cmap=cmap)
by
res = sn.heatmap(df_cm, annot=True, vmin=0.0, vmax=100.0, fmt='.2f', cmap=cmap)
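As an alternative to scaling the matrix by hand, confusion_matrix accepts a normalize argument (in scikit-learn 0.22 and later, as far as I know). A minimal sketch, assuming the same multiclass data as above; the names cm_percent and cm_by_row are only illustrative:

```python
from sklearn.metrics import confusion_matrix

y_true = [1, 1, 0, 0, 1, 1, 2, 2]
y_pred = [1, 1, 1, 0, 2, 1, 2, 2]

# normalize='all' divides every cell by the total number of samples;
# multiplying by 100 expresses the result as percentages.
cm_percent = confusion_matrix(y_true, y_pred, normalize='all') * 100.0

# normalize='true' divides each row by the number of samples of that true class,
# so the diagonal holds the per-class recall.
cm_by_row = confusion_matrix(y_true, y_pred, normalize='true')

print(cm_percent)
print(cm_by_row)
```

Either matrix can be passed to pd.DataFrame and then to the heatmap in the same way as the raw counts. For cm_by_row the colour range would be vmin=0.0, vmax=1.0 rather than 0 to 100.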

