Evaluation Metrics for Machine Learning or Data Models

Table of contents

  • Confusion Matrix
  • Classification Accuracy
  • Precision/Specificity
  • Recall
  • F-1 Score
  • AUC-ROC
  • Root Mean Square Error(RMSE)
  • Cross-entropy Loss
  • Gini Coefficient
  • Jaccard Score

Confusion Matrix

It is a square matrix of size (a x a), where ‘a’ is the number of classes in the classification data. The x-axis of this matrix can hold the actual values and the y-axis the predicted values, or vice versa. If the dataset has only two classes, i.e. a binary classification problem, the matrix is 2 x 2.

  • True Positive(TP): Correct Positive Predictions
  • True Negative(TN): Correct Negative Predictions
  • False Positive(FP): Incorrect Positive Predictions
  • False Negative(FN): Incorrect Negative Predictions

Using a worked example with TP = 45, FN = 25, TN = 30 and FP = 5:

  • False Negative Rate(FNR) = FN/Actual Positive = FN/(TP + FN) = 25/(45 + 25) ≈ 0.36
  • True Negative Rate(TNR) = TN/Actual Negative = TN/(TN + FP) = 30/(30 + 5) ≈ 0.86
  • False Positive Rate(FPR) = FP/Actual Negative = FP/(TN + FP) = 5/(30 + 5) ≈ 0.14
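As a minimal sketch, the counts and rates above can be tallied by hand from paired label lists (the 45/25/30/5 counts below are assumed to match the worked example):

```python
# Build a 2x2 confusion matrix by counting paired (actual, predicted) labels.
# Labels: 1 = positive, 0 = negative. Counts chosen to match the worked example.
y_true = [1] * 70 + [0] * 35                        # 70 actual positives, 35 actual negatives
y_pred = [1] * 45 + [0] * 25 + [0] * 30 + [1] * 5   # 45 TP, 25 FN, 30 TN, 5 FP

tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))

fnr = fn / (tp + fn)   # 25/70 ≈ 0.36
tnr = tn / (tn + fp)   # 30/35 ≈ 0.86
fpr = fp / (tn + fp)   # 5/35  ≈ 0.14
```

In practice, scikit-learn's `confusion_matrix` returns the same counts as a 2 x 2 array.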

Classification Accuracy

Using the above interpretation, classification accuracy is the fraction of all predictions that are correct:

Accuracy = (TP + TN) / (TP + TN + FP + FN)
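A one-line sketch, assuming the counts from the confusion-matrix example above:

```python
tp, tn, fp, fn = 45, 30, 5, 25               # counts from the worked example (assumed)
accuracy = (tp + tn) / (tp + tn + fp + fn)   # 75/105 ≈ 0.71
```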

Precision/Specificity

With imbalanced data, classification accuracy is a poor indicator of model performance, because a model that always predicts the majority class can still score highly. In such conditions we need a class-specific metric. Precision measures how many of the predicted positives are actually positive: it is the true positives divided by the sum of true positives and false positives, i.e. Precision = TP/(TP + FP). Note that precision is distinct from specificity, which is the true negative rate TN/(TN + FP) computed above.
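Both quantities can be sketched from the assumed worked-example counts:

```python
tp, fp, tn = 45, 5, 30           # assumed worked-example counts
precision = tp / (tp + fp)       # 45/50 = 0.9
specificity = tn / (tn + fp)     # 30/35 ≈ 0.86 (same as the true negative rate)
```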

Recall

Recall quantifies how many of the actual positive cases the model correctly identified, so it indicates how many positives were missed. Unlike precision, which is computed over the predicted positives, recall is computed over the actual positives: Recall = TP/(TP + FN).
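A minimal sketch with the same assumed counts:

```python
tp, fn = 45, 25           # assumed worked-example counts
recall = tp / (tp + fn)   # 45/70 ≈ 0.64
```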

F-1 Score

The F1 score is the harmonic mean of precision and recall, F1 = 2 x (Precision x Recall)/(Precision + Recall). It is an excellent metric to use when the data is imbalanced, because it only rewards models that keep both precision and recall high.
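Combining the two values computed in the sections above:

```python
precision = 45 / 50   # from the precision example (assumed counts)
recall = 45 / 70      # from the recall example (assumed counts)
f1 = 2 * precision * recall / (precision + recall)   # harmonic mean, ≈ 0.75
```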

AUC-ROC

The ROC (Receiver Operating Characteristic) curve plots the true positive rate (TPR) against the false positive rate (FPR) at different classification thresholds, showing how well the model separates the signal from the noise. The AUC (Area Under the Curve) summarises this curve as a single number representing the model's ability to distinguish between classes: the higher the AUC, the better the separation.
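AUC also has a useful probabilistic reading: it equals the probability that a randomly chosen positive example is scored higher than a randomly chosen negative one. A minimal sketch using that definition (the toy labels and scores are made up for illustration):

```python
def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    # A tie between a positive and a negative score counts as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))   # 0.75
```

scikit-learn's `roc_auc_score` computes the same quantity from labels and scores.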

Root Mean Square Error(RMSE)

This metric is used to measure the performance of a regression model, and it assumes the errors are normally distributed and unbiased. It is the standard deviation of the prediction errors, which measure how far the data points fall from the regression line. It is calculated as:

RMSE = √( Σ(yᵢ − ŷᵢ)² / n )
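A minimal sketch of this formula (the target and prediction values are illustrative only):

```python
import math

# Toy regression targets and predictions (illustrative values only)
y_true = [3.0, 5.0, 2.5, 7.0]
y_pred = [2.5, 5.0, 4.0, 8.0]

# Mean of the squared prediction errors, then square root
mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
rmse = math.sqrt(mse)   # sqrt(0.875) ≈ 0.94
```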

Cross-entropy Loss

It is also known as Log Loss and is widely used for evaluating neural networks because, combined with sigmoid or softmax outputs, it helps mitigate the vanishing gradient problem that squared-error loss suffers from. It is calculated as the negative average of the log-probability the model assigns to the true class, so confident incorrect predictions are penalised heavily:

Log Loss = −(1/n) Σ [ yᵢ log(pᵢ) + (1 − yᵢ) log(1 − pᵢ) ]
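A minimal sketch of the binary case (the labels and probabilities are illustrative only):

```python
import math

def log_loss(y_true, probs, eps=1e-15):
    """Binary cross-entropy: negative mean log-probability of the true class."""
    total = 0.0
    for t, p in zip(y_true, probs):
        p = min(max(p, eps), 1 - eps)   # clip to avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1 - p)
    return total / len(y_true)

print(log_loss([1, 0], [0.9, 0.1]))   # -log(0.9) ≈ 0.105
```

scikit-learn's `log_loss` implements the same computation, including the clipping.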

Gini Coefficient

This can be calculated from the AUC-ROC number: it measures the area between the ROC curve and the diagonal (random-guess) line, rescaled so that Gini = 2 x AUC − 1. If the value of this coefficient is more than 60%, the model performance is considered good. One important thing to note is that it is used only with classification models.
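The rescaling is a one-liner (the AUC value below is a hypothetical example, not from the source):

```python
auc = 0.85           # hypothetical AUC from some classifier (assumed)
gini = 2 * auc - 1   # 0.70, i.e. above the 60% threshold mentioned above
```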

Jaccard Score

This score represents the similarity between two sets. It takes a value between 0 and 1, where 1 indicates identical sets. To calculate it, we divide the number of observations in both sets (the intersection) by the number of observations in either set (the union): J(A, B) = |A ∩ B| / |A ∪ B|.
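A minimal sketch with two toy sets (the elements are illustrative only):

```python
a = {1, 2, 3, 4}   # toy sets of observations (illustrative)
b = {3, 4, 5}
jaccard = len(a & b) / len(a | b)   # |{3, 4}| / |{1, 2, 3, 4, 5}| = 2/5 = 0.4
```

For classification labels rather than raw sets, scikit-learn's `jaccard_score` computes the analogous quantity.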

Final words

Above, we discussed some of the most important metrics used to evaluate data models in real life. Since models and datasets differ in their conditions and characteristics, no single metric suits every problem. Model performance evaluation therefore needs to be done carefully, by matching the characteristics of each evaluation metric to the task at hand.