In this post, I will walk through an example calculation of F-score, accuracy, precision, and recall. It continues from my previous post. I have used the fast.ai cats vs dogs example. The image below is a confusion matrix for the famous cats vs dogs Kaggle competition. It shows how well the trained model performs. Let’s understand this.
The model was trained to classify two classes: cats and dogs. In the confusion matrix shown above, the left side represents the true labels and the bottom side represents the labels predicted by the model. To learn what each square box means, please check the post about this in Part 1. For example, the box with 954 tells us that 954 examples were predicted as cats and were actually cats. There were 18 examples that were predicted as cats but were actually dogs. Now, let’s evaluate this model.
Cats vs Dogs Example
Accuracy, as explained earlier, is the ratio of correct predictions to the total number of examples. This model has an accuracy of 0.968.
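To make the accuracy calculation concrete, here is a minimal sketch in Python. Only 954 and 18 are quoted in the text of this post; the other two cells (46 and 982) are assumptions I picked so that the result agrees with the stated accuracy of 0.968, so treat them as illustrative and not as values read from the matrix. The variable names are my own.

```python
# Cells of the binary confusion matrix.
# 954 and 18 are quoted in the post; 46 and 982 are ASSUMED values,
# chosen so that the result matches the stated accuracy of 0.968.
cats_as_cats = 954  # actual cats predicted as cats (from the post)
dogs_as_cats = 18   # actual dogs predicted as cats (from the post)
cats_as_dogs = 46   # actual cats predicted as dogs (assumed)
dogs_as_dogs = 982  # actual dogs predicted as dogs (assumed)

# Accuracy = correct predictions / all examples
correct = cats_as_cats + dogs_as_dogs
total = cats_as_cats + dogs_as_cats + cats_as_dogs + dogs_as_dogs
accuracy = correct / total

print(f"accuracy = {accuracy:.3f}")
```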
Precision, recall, and F-score can be calculated for each individual class. Here, I have calculated these measures for the cats class.
Precision for the cats class is the ratio of correctly predicted cats to all examples that were predicted as cats.
Recall for the cats class is the ratio of correctly predicted cats to all examples that are actually cats.
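The two definitions above can be sketched as follows for the cats class. The true positives (954) and false positives (18) come from the text of this post. The false-negative count of 46 is an assumption on my part, consistent with the stated accuracy of 0.968 if the validation set has 1,000 images per class, so the recall value is illustrative only.

```python
# Counts for the "cats" class.
true_positives = 954   # predicted cat, actually cat (from the post)
false_positives = 18   # predicted cat, actually dog (from the post)
false_negatives = 46   # predicted dog, actually cat (ASSUMED, not quoted in the post)

# Precision: of everything predicted as cat, how much really was a cat?
precision = true_positives / (true_positives + false_positives)

# Recall: of all actual cats, how many did the model find?
recall = true_positives / (true_positives + false_negatives)

print(f"precision = {precision:.3f}")
print(f"recall    = {recall:.3f}")
```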
F-score, as shown in the image, is the harmonic mean of precision and recall.
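As a rough sketch, the F-score (specifically F1, which weights precision and recall equally) can be computed with a small helper. The function name `f1_score_from_counts` is hypothetical, and the false-negative count of 46 is an assumed value that is not quoted in the text of this post.

```python
def f1_score_from_counts(tp: int, fp: int, fn: int) -> float:
    """Return F1, the harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# 954 and 18 are from the post; 46 is an ASSUMED false-negative count.
f1_cats = f1_score_from_counts(tp=954, fp=18, fn=46)
print(f"F1 (cats) = {f1_cats:.4f}")
```

The harmonic mean stays close to the smaller of the two values, so a model cannot get a high F-score by doing well on precision alone or on recall alone.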
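This example was a binary classification. For multi-class classification, the calculations are similar, except that there are more cells to add up for each class. For example, to get the precision of a class, we take its true positives and divide by the sum of the entire column for that class, that is, the true positives plus all the other values in that column. This works because, in the layout used here, each column holds everything the model predicted as that class.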
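Here is a minimal sketch of the multi-class case, using the same layout as above (rows are true labels, columns are predicted labels). The three class names and all the numbers in the matrix are made up purely for illustration; they do not come from any real model.

```python
# Hypothetical 3-class confusion matrix (all values are made up).
# Rows = true labels, columns = predicted labels.
classes = ["cat", "dog", "bird"]
matrix = [
    [50, 3, 2],   # actual cats
    [4, 45, 6],   # actual dogs
    [1, 5, 44],   # actual birds
]

precision = []
recall = []
for k in range(len(classes)):
    true_positives = matrix[k][k]
    column_total = sum(row[k] for row in matrix)  # everything predicted as class k
    row_total = sum(matrix[k])                    # everything that actually is class k
    precision.append(true_positives / column_total)
    recall.append(true_positives / row_total)

for name, p, r in zip(classes, precision, recall):
    print(f"{name}: precision = {p:.3f}, recall = {r:.3f}")
```

Precision divides by the column total and recall by the row total, so swapping the axes of the matrix would swap the two measures.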