Precision of a Model

In machine learning, precision is a performance metric used to evaluate a classification model’s ability to correctly identify positive cases. Precision measures the proportion of true positive predictions among all positive predictions made by the model. It answers the question: “Of all the cases predicted as positive, how many were actually positive?”


Definition of Precision

Precision is defined as the ratio of true positive predictions (TP) to the sum of true positive predictions and false positive predictions (FP). Mathematically, it is expressed as:

Precision Formula:

\( \text{Precision} = \dfrac{\text{TP}}{\text{TP} + \text{FP}} \)

  • TP: True Positives (correctly predicted positive cases)
  • FP: False Positives (incorrectly predicted positive cases)
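
As a quick illustration, precision can be computed either directly from the TP and FP counts or from true and predicted labels using scikit-learn's precision_score. The labels below are made up purely for illustration:

```python
from sklearn.metrics import precision_score

# Hypothetical ground-truth labels and model predictions (1 = positive, 0 = negative)
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 1]

# Precision from counts: TP / (TP + FP)
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # 3 true positives
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # 2 false positives
print(tp / (tp + fp))                   # 0.6

# The same value computed by scikit-learn
print(precision_score(y_true, y_pred))  # 0.6
```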

Importance of Precision

Precision is particularly important in scenarios where the cost of false positives is high. For example, in spam email detection, a false positive means a legitimate email is marked as spam, which can lead to important emails being missed.


Components in a Confusion Matrix

The confusion matrix helps visualize the distribution of predictions:

                      Predicted: Positive     Predicted: Negative
  Actual: Positive    True Positive (TP)      False Negative (FN)
  Actual: Negative    False Positive (FP)     True Negative (TN)

Precision uses the values from the first column of the matrix (Predicted: Positive):

Precision = TP / (TP + FP)
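
A small sketch, again with made-up labels, shows how TP and FP can be read off scikit-learn's confusion_matrix and plugged into the formula:

```python
from sklearn.metrics import confusion_matrix

# Made-up labels for illustration (1 = positive, 0 = negative)
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 1]

# For binary labels ordered [0, 1], ravel() returns tn, fp, fn, tp
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

precision = tp / (tp + fp)
print(tn, fp, fn, tp)  # 2 2 1 3
print(precision)       # 0.6
```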


Detailed Examples with Steps to Calculate Precision

Below are ten real-world examples that explain precision calculation step-by-step:


Example 1 – Spam Email Detection

Scenario: A model predicts whether an email is spam.

  • True positives (TP): 70
  • False positives (FP): 30

Steps:

  1. Calculate total predicted positives: TP + FP = 70 + 30 = 100.
  2. Calculate precision: Precision = TP / (TP + FP) = 70 / 100 = 0.7 (70%).
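
The same two steps can be reproduced in a few lines of Python, using only the TP and FP counts given above:

```python
# Example 1 – Spam Email Detection
tp = 70  # true positives from the scenario above
fp = 30  # false positives from the scenario above

predicted_positives = tp + fp          # Step 1: 70 + 30 = 100
precision = tp / predicted_positives   # Step 2: 70 / 100 = 0.7
print(f"Precision = {precision:.2f} ({precision:.0%})")  # Precision = 0.70 (70%)
```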

Example 2 – Fraud Detection

Scenario: A model predicts whether a transaction is fraudulent.

  • TP: 50
  • FP: 10

Steps:

  1. Total predicted positives: TP + FP = 50 + 10 = 60.
  2. Precision: Precision = 50 / 60 = 0.8333 (83.33%).

Example 3 – Cancer Detection

Scenario: A model predicts whether a patient has cancer.

  • TP: 40
  • FP: 20

Steps:

  1. Total predicted positives: TP + FP = 40 + 20 = 60.
  2. Precision: Precision = 40 / 60 = 0.6667 (66.67%).

Example 4 – Defect Detection

Scenario: A model predicts whether a product is defective.

  • TP: 90
  • FP: 30

Steps:

  1. Total predicted positives: TP + FP = 90 + 30 = 120.
  2. Precision: Precision = 90 / 120 = 0.75 (75%).

Example 5 – Loan Default Prediction

Scenario: A model predicts whether a customer will default on a loan.

  • TP: 100
  • FP: 50

Steps:

  1. Total predicted positives: TP + FP = 100 + 50 = 150.
  2. Precision: Precision = 100 / 150 = 0.6667 (66.67%).

Example 6 – Social Media Post Classification

Scenario: A model predicts whether a post is spam.

  • TP: 150
  • FP: 50

Steps:

  1. Total predicted positives: TP + FP = 150 + 50 = 200.
  2. Precision: Precision = 150 / 200 = 0.75 (75%).

Example 7 – Sentiment Analysis

Scenario: A model predicts positive sentiments in text reviews.

  • TP: 80
  • FP: 20

Steps:

  1. Total predicted positives: TP + FP = 80 + 20 = 100.
  2. Precision: Precision = 80 / 100 = 0.8 (80%).

Example 8 – Object Detection

Scenario: A model predicts whether an object is a car.

  • TP: 200
  • FP: 100

Steps:

  1. Total predicted positives: TP + FP = 200 + 100 = 300.
  2. Precision: Precision = 200 / 300 = 0.6667 (66.67%).

Example 9 – Recommendation Systems

Scenario: A model recommends movies to users.

  • TP: 70
  • FP: 30

Steps:

  1. Total predicted positives: TP + FP = 70 + 30 = 100.
  2. Precision: Precision = 70 / 100 = 0.7 (70%).

Example 10 – Image Classification

Scenario: A model predicts whether an image contains a cat.

  • TP: 120
  • FP: 80

Steps:

  1. Total predicted positives: TP + FP = 120 + 80 = 200.
  2. Precision: Precision = 120 / 200 = 0.6 (60%).
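
All ten examples follow the same two steps, so the arithmetic can be verified with one short script that loops over the TP and FP counts listed above:

```python
# (scenario, TP, FP) taken from Examples 1–10 above
examples = [
    ("Spam Email Detection",              70,  30),
    ("Fraud Detection",                   50,  10),
    ("Cancer Detection",                  40,  20),
    ("Defect Detection",                  90,  30),
    ("Loan Default Prediction",          100,  50),
    ("Social Media Post Classification", 150,  50),
    ("Sentiment Analysis",                80,  20),
    ("Object Detection",                 200, 100),
    ("Recommendation Systems",            70,  30),
    ("Image Classification",             120,  80),
]

for name, tp, fp in examples:
    precision = tp / (tp + fp)  # Precision = TP / (TP + FP)
    print(f"{name}: {precision:.4f} ({precision:.2%})")
```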

Conclusion

Precision is a crucial metric for evaluating classification models, especially when false positives carry significant consequences. High precision indicates that the model's positive predictions can be trusted. However, precision alone does not fully represent a model's performance and should usually be paired with metrics such as recall and the F1 score for a comprehensive evaluation.
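
As a sketch of that combined evaluation, scikit-learn can report precision, recall, and F1 together; the labels here are made up purely for illustration:

```python
from sklearn.metrics import precision_score, recall_score, f1_score

# Made-up labels for illustration (1 = positive, 0 = negative)
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
y_pred = [1, 1, 1, 0, 0, 0, 1, 1, 1, 0]

print("Precision:", precision_score(y_true, y_pred))  # TP / (TP + FP)
print("Recall:   ", recall_score(y_true, y_pred))     # TP / (TP + FN)
print("F1 score: ", f1_score(y_true, y_pred))         # harmonic mean of precision and recall
```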