F-Score, Accuracy, Precision, Recall Explained – Part 1

September 4, 2018

This post is a follow-up to the blog post about EAST. While studying research papers, we often encounter terms like F-Score, Accuracy, Precision and Recall. Let’s get to know them in a bit more detail.

In machine learning, once you build your model, it is important to check how well it performs. Evaluating your model is therefore one of the most significant tasks in data science.

The easiest way to understand these metrics is to first get familiar with something called a confusion matrix.

Confusion Matrix

If you ask any deep learning engineer or data scientist, they will tell you that they love the confusion matrix, and I am a big fan of it too. I mostly use the scikit-learn Python library to draw confusion matrices in Jupyter notebooks. Let’s look at an image of a confusion matrix to understand it.

confusion matrix example

The image above demonstrates the basic structure of a confusion matrix. A confusion matrix is often used to measure the performance of a classification model on test data with known ground-truth labels. Let’s define each term from the table:

confusion matrix – colourful

True Positives (TP): These are the correctly predicted positive values, i.e. the model outputs yes for a sample whose actual class is also yes.

True Negatives (TN): These are the correctly predicted negative values, i.e. the model outputs no for a sample whose actual class is also no.

False Positives (FP): The actual class is no, but the model says yes.

False Negatives (FN): The actual class is yes, but the model says no.
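Putting these four definitions together, here is a minimal sketch of computing the counts with scikit-learn. The labels below are made up purely for illustration; for a binary problem, ravel() flattens the 2x2 matrix in the order TN, FP, FN, TP.

```python
from sklearn.metrics import confusion_matrix

# Hypothetical ground-truth labels and model predictions (1 = yes, 0 = no)
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# Rows are actual classes, columns are predicted classes
cm = confusion_matrix(y_true, y_pred)
print(cm)   # [[3 1]
            #  [1 3]]

# Binary case: flatten the 2x2 matrix into the four counts (TN, FP, FN, TP)
tn, fp, fn, tp = cm.ravel()
print(tp, tn, fp, fn)  # 3 3 1 1
```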

Accuracy: It is one of the most widely used measures. It is simply the ratio of correct predictions to total predictions. It is a good performance measure, but it won’t necessarily show how well your model really performs. For example, I was once working on a multi-class classification problem where accuracy on the test set was 99%.

When I examined the confusion matrix, accuracy for some of the individual classes was below 90%. That means that if I tested only on samples from those particular classes, accuracy would be much lower.

Accuracy = (TP + TN) / (TP + FP + TN + FN)
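A quick sketch of this formula in Python, reusing the hypothetical counts from the confusion matrix example above; scikit-learn’s accuracy_score gives the same result directly from the labels.

```python
from sklearn.metrics import accuracy_score

# Hypothetical counts from the confusion matrix example above
tp, tn, fp, fn = 3, 3, 1, 1

accuracy = (tp + tn) / (tp + fp + tn + fn)
print(accuracy)  # 0.75

# Equivalent, computed directly from the example labels
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
print(accuracy_score(y_true, y_pred))  # 0.75
```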

Precision: Precision is the ratio of correctly predicted positives to all predicted positives.

Precision = TP / (TP + FP)

Recall: Recall is the ratio of correctly predicted positives to all actual positives of that class.

Recall = TP / (TP + FN)
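A short sketch of both formulas, again using the hypothetical counts from the example above; precision_score and recall_score from scikit-learn compute the same values from the raw labels.

```python
from sklearn.metrics import precision_score, recall_score

# Hypothetical counts from the confusion matrix example above
tp, fp, fn = 3, 1, 1

precision = tp / (tp + fp)
recall = tp / (tp + fn)
print(precision, recall)  # 0.75 0.75

# Equivalent, computed directly from the example labels
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
print(precision_score(y_true, y_pred), recall_score(y_true, y_pred))  # 0.75 0.75
```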

F-Score: F-score is the harmonic mean of precision and recall. It is more useful than accuracy when the class distribution is uneven. It is also called F1-Score or F-measure.

F-Score = 2 * (Recall * Precision) / (Recall + Precision)
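And the same kind of sketch for the F-score, assuming the precision and recall values computed above; f1_score in scikit-learn returns the same number from the labels.

```python
from sklearn.metrics import f1_score

# Hypothetical precision and recall from the example above
precision, recall = 0.75, 0.75

f_score = 2 * (recall * precision) / (recall + precision)
print(f_score)  # 0.75

# Equivalent, computed directly from the example labels
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
print(f1_score(y_true, y_pred))  # 0.75
```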

Please check out a more elaborate example in Part 2 of this post.