Confusion Matrix - Is "accuracy (metric)" flawed?

February 25, 2022

Hello friends, as a machine learning student I grappled with this concept and was finally enlightened 😁 to the fact that accuracy (the metric) may not give the right insight into how effective your model is.

Reference confusion matrix

		Actual
		Positive	Negative
Predicted	Positive	True +ve	False +ve
Predicted	Negative	False -ve	True -ve
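If you want to fill in these four cells yourself, here is a minimal sketch in plain Python. The function name `confusion_counts` and the toy labels are my own assumptions for illustration, not something from a library.

```python
def confusion_counts(actual, predicted):
    """Count (tp, fp, fn, tn) from two equal-length lists of booleans."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p and a:
            tp += 1  # predicted positive, actually positive
        elif p and not a:
            fp += 1  # predicted positive, actually negative
        elif not p and a:
            fn += 1  # predicted negative, actually positive
        else:
            tn += 1  # predicted negative, actually negative
    return tp, fp, fn, tn


# Made-up toy labels, only to exercise the function.
actual = [True, True, False, False, True]
predicted = [True, False, False, True, True]
counts = confusion_counts(actual, predicted)
print(counts)  # (tp, fp, fn, tn)
```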

Let's start with some examples:

Example 1: Consider a class of 500 school students, where you are trying to predict which of them brush their teeth before coming to school.

		Actual
		Positive	Negative	Total
Predicted	Positive	250	50	300
Predicted	Negative	0	200	200
	Total	250	250	500

Total sample size: 500

Ground Truth:

Actual positive: 250, Actual negative: 250

Prediction:

Predicted positive: 300, Predicted negative: 200

Accuracy: (True +ve + True -ve) / Total samples = (250 + 200) / 500 = 90% accuracy
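The same arithmetic as a small Python sketch, in case you want to plug in your own counts. The `accuracy` helper is a hypothetical name of mine; the four numbers come straight from the Example 1 matrix.

```python
def accuracy(tp, fp, fn, tn):
    """Fraction of all samples that were classified correctly."""
    total = tp + fp + fn + tn
    return (tp + tn) / total


# Counts from the Example 1 confusion matrix.
example1_accuracy = accuracy(tp=250, fp=50, fn=0, tn=200)
print(f"{example1_accuracy:.0%}")
```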

Example 2: Consider the same class of 500 school students, but now you are trying to predict which of them have covid so that you can recommend them for isolation/quarantine.

		Actual
		Positive	Negative	Total
Predicted	Positive	5	0	5
Predicted	Negative	5	490	495
	Total	10	490	500

Total sample size: 500

Ground Truth:

Actual positive: 10, Actual negative: 490

Prediction:

Predicted positive: 5, Predicted negative: 495

Accuracy: (True +ve + True -ve) / Total samples = (5 + 490) / 500 = 99% accuracy

Hurray! You got an accuracy of 99%, not bad at all. However, when you review the data again, you realize that even though your model is 99% accurate, it failed to identify 5 of the 10 students who were actually positive and classified them as negative. You know the consequence of this miss 😒

Whom should you blame? -> the metrics

Why did it happen? -> imbalanced data: the number of negative class samples (490) far outweighed the positive class samples (10)

What should you do next? -> find more robust metrics
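To see how much the imbalance alone inflates accuracy, here is a quick sketch of a "lazy" baseline that predicts negative for every student. This baseline is my own illustration (it is not the model from Example 2); it uses the same 10 positive / 490 negative split.

```python
# Ground truth with the same class split as Example 2: 10 positive, 490 negative.
actual = [True] * 10 + [False] * 490

# Hypothetical baseline: predict "negative" for every student.
predicted = [False] * len(actual)

correct = sum(a == p for a, p in zip(actual, predicted))
baseline_accuracy = correct / len(actual)

# How many of the truly positive students did the baseline flag?
positives_found = sum(a and p for a, p in zip(actual, predicted))

print(f"accuracy: {baseline_accuracy:.0%}, positives found: {positives_found}")
```

A model that never flags anyone still looks almost as accurate as the one in Example 2, which is a good hint that accuracy is telling us very little about the positive class here.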

Search for more robust metrics:

1) Precision (of the positive class): Percentage of the samples predicted as positive that are actually positive.

	True +ve	Total +ve (predicted)	Precision = True +ve / Total +ve (predicted)
Example1	250	300	83.3%
Example2	5	5	100%
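Here is a minimal sketch of the precision calculation, working from the counts in the two confusion matrices. The `precision` helper and its choice to return `None` when nothing is predicted positive are my own assumptions.

```python
def precision(tp, fp):
    """True positives divided by everything predicted as positive."""
    predicted_positive = tp + fp
    if predicted_positive == 0:
        return None  # undefined: the model predicted no positives at all
    return tp / predicted_positive


example1_precision = precision(tp=250, fp=50)  # 250 of 300 predicted positives
example2_precision = precision(tp=5, fp=0)     # 5 of 5 predicted positives

print(f"Example 1: {example1_precision:.1%}")
print(f"Example 2: {example2_precision:.1%}")
```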

😩 Sadly, the precision metric is not able to catch the problem: the covid model in Example 2 still scores 100%. Let's try recall...

2) Recall (of the positive class): Percentage of the actual positive samples (ground truth) that the prediction correctly identified as positive.

	True +ve	Total +ve (actual)	Recall = True +ve / Total +ve (actual)
Example1	250	250	100%
Example2	5	10	50%

We got it this time: recall is able to capture the poor quality of the positive class prediction in Example 2.
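And the recall calculation as a short sketch, again using the counts from the two confusion matrices. The `recall` helper name and the `None` return for the "no actual positives" case are assumptions of mine.

```python
def recall(tp, fn):
    """True positives divided by everything that is actually positive."""
    actual_positive = tp + fn
    if actual_positive == 0:
        return None  # undefined: there are no actual positives to find
    return tp / actual_positive


example1_recall = recall(tp=250, fn=0)  # all 250 actual positives were found
example2_recall = recall(tp=5, fn=5)    # only 5 of 10 actual positives were found

print(f"Example 1: {example1_recall:.0%}")
print(f"Example 2: {example2_recall:.0%}")
```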

Does this mean we don't need precision? Not really, let me explain this in my next post 😁

Have a great day!
