Confusion Matrix Calculator

Table of contents

  • What is a confusion matrix in machine learning?
  • How to read a confusion matrix?
  • Confusion matrix calculator with an example
  • FAQs

With this confusion matrix calculator, we aim to help you calculate the various metrics used to assess your machine learning model's performance. The confusion matrix is the most prevalent way of analyzing the results of a classification machine learning model, so it is a critical topic to understand in this field.

We have prepared this article to help you understand what a confusion matrix is and how to calculate a confusion matrix. We will also explain how to interpret the confusion matrix examples to make sure you understand the concept thoroughly.

What is a confusion matrix in machine learning?

You can see a confusion matrix as a way of measuring the performance of a classification machine learning model. It summarizes the results of a classification problem using four counts: true positives, false negatives, false positives, and true negatives.

However, the use of a confusion matrix goes way beyond just these four counts. Using them, the confusion matrix allows us to assess the performance of the classification model with more versatile metrics, such as accuracy, precision, and recall.

We will talk about the definitions of these metrics in detail in the next section. You will be able, for example, to calculate accuracy from the confusion matrix all by yourself!

How to read a confusion matrix?

After understanding the definition of a confusion matrix in machine learning, it's time to talk about how to read a confusion matrix.

A confusion matrix has four components (the short code sketch after this list shows how they are counted):

  • True positive (TP) - These are the cases the model correctly predicted as positive, i.e., actual positives labeled as positive. You can input this and the values below in the first section of the confusion matrix calculator.
  • False negative (FN) - These are the cases the model wrongly predicted as negative, i.e., actual positives labeled as negative.
  • False positive (FP) - These are the cases the model wrongly predicted as positive, i.e., actual negatives labeled as positive.
  • True negative (TN) - These are the cases the model correctly predicted as negative, i.e., actual negatives labeled as negative.
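
For concreteness, here is a minimal Python sketch, using made-up labels purely for illustration, showing how the four counts arise from comparing a model's predictions against the actual classes:

    # Made-up labels for illustration (1 = positive, 0 = negative).
    y_true = [1, 1, 0, 0, 1, 0, 1, 0]  # actual classes
    y_pred = [1, 0, 0, 1, 1, 0, 1, 0]  # model predictions

    # Count each of the four confusion matrix components.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

    print(tp, fn, fp, tn)  # 3 1 1 3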

Using these four components, we can calculate various metrics that help us analyze the performance of the machine learning model (a short code sketch follows the list):

  • accuracy - Accuracy is the proportion of the correct predictions in the confusion matrix out of all predictions made. You can calculate accuracy from the confusion matrix, as well as the other metrics, using our tool.
  • precision - Precision is the proportion of the correct predictions in the confusion matrix out of all positive predictions.
  • recall - Recall is the proportion of correct predictions in the confusion matrix out of all actual positive cases.
  • F1 score - F1 score allows you to compare low-precision models to high-recall models, or vice versa, by using the harmonic mean of precision and recall to punish extreme values.
  • TPR - True positive rate, also called sensitivity, is the probability that an actual positive case is correctly classified as positive.
  • FNR - False negative rate is the probability of a type II error, which is wrongly labeling an actual positive case as negative.
  • FPR - False positive rate is the probability of a type I error, which is wrongly labeling an actual negative case as positive.
  • TNR - True negative rate, also called specificity, is the probability that an actual negative case is correctly classified as negative.
  • FDR - False discovery rate is the ratio of the number of false positives to the total number of positive predictions.
  • MCC - Matthews correlation coefficient, also known as the phi coefficient, is a metric that measures the association between two binary variables.
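
These definitions translate directly into code. Below is a minimal Python sketch (the confusion_metrics function is our own naming, not a standard library API) that computes all ten metrics from the four counts, assuming none of the denominators is zero:

    from math import sqrt

    def confusion_metrics(tp, fn, fp, tn):
        # Assumes all denominators are nonzero.
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)  # recall and TPR share the same formula
        return {
            "accuracy": (tp + tn) / (tp + fn + fp + tn),
            "precision": precision,
            "recall": recall,
            "f1": 2 * precision * recall / (precision + recall),
            "tpr": recall,
            "fnr": fn / (tp + fn),
            "fpr": fp / (fp + tn),
            "tnr": tn / (tn + fp),
            "fdr": fp / (tp + fp),
            "mcc": (tp * tn - fp * fn)
                   / sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
        }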

Next, let's look at the calculations of these metrics using the confusion matrix example.

Confusion matrix calculator with an example

Finally, it is time to talk about the calculations. Let's take the classification results below as an example:

  • TP: 80;
  • FN: 70;
  • FP: 20; and
  • TN: 30.

The calculations of the metrics are shown below; a short code snippet after the list reproduces the numbers:

  1. Accuracy

    To calculate accuracy from confusion matrix, use the formula below:

    accuracy = (TP + TN) / (TP + FN + FP + TN)

    The accuracy for this example is (80 + 30) / (80 + 70 + 20 + 30) = 0.55.

    You can also calculate the percentage versions of these metrics.

  2. Precision

    The precision can be calculated using the formula below:

    precision = TP / (TP + FP)

    The precision for this example is 80 / (80 + 20) = 0.8.

  3. Recall

    Find the recall using the formula below:

    recall = TP / (TP + FN)

    The recall for this example is 80 / (80 + 70) = 0.53.

  4. F1 score

    To estimate F1 score, use the following formula:

    F1 score = (2 * precision * recall) / (precision + recall)

    The F1 score for this example is (2 * 0.8 * 0.53) / (0.8 + 0.53) = 0.64.

  5. True positive rate

    The true positive rate TPR (also called sensitivity) can be calculated using the formula below:

    TPR = TP / (TP + FN)

    The TPR for this example is 80 / (80 + 70) = 0.53.

  6. False negative rate

    We express the false negative rate FNR in a similar way:

    FNR = FN / (TP + FN)

    The FNR for this example is 70 / (80 + 70) = 0.47.

  7. False positive rate

    The false positive rate FPR is as follows:

    FPR = FP / (FP + TN)

    The FPR for this example is 20 / (20 + 30) = 0.4.

  8. True negative rate

    The true negative rate TNR (also called specificity) is:

    TNR = TN / (TN + FP)

    The TNR for this example is 30 / (30 + 20) = 0.6.

  9. False discovery rate

    We can calculate the false discovery rate as follows:

    FDR = FP / (TP + FP)

    The FDR for our example is 20 / (80 + 20) = 0.2.

  10. Matthews correlation coefficient

    Finally, we can calculate the Matthews correlation coefficient using the formula below:

    MCC = (TP * TN - FP * FN) / √((TP + FP) * (TN + FN) * (FP + TN) * (TP + FN))

    Hence, the MCC is (80 * 30 - 20 * 70) / √((80 + 20) * (30 + 70) * (20 + 30) * (80 + 70)) = 0.11547.
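
As a quick sanity check, this short Python snippet reproduces the numbers above:

    from math import sqrt

    tp, fn, fp, tn = 80, 70, 20, 30

    accuracy = (tp + tn) / (tp + fn + fp + tn)          # 0.55
    precision = tp / (tp + fp)                          # 0.8
    recall = tp / (tp + fn)                             # 0.533...
    f1 = 2 * precision * recall / (precision + recall)  # 0.64
    fnr = fn / (tp + fn)                                # 0.467...
    fpr = fp / (fp + tn)                                # 0.4
    tnr = tn / (tn + fp)                                # 0.6
    fdr = fp / (tp + fp)                                # 0.2
    mcc = (tp * tn - fp * fn) / sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))  # 0.11547...

    print(round(accuracy, 2), round(f1, 2), round(mcc, 5))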

If all of these confusion matrix calculations look complicated, just use the confusion matrix calculator we have built for you!

FAQs

What is machine learning?

Machine learning is a branch of artificial intelligence that involves using algorithms or statistical models to automate the data analysis process. It can be used to perform predictions using techniques such as regression and classification.

What is accuracy?

In the field of machine learning, accuracy is a metric used to assess the performance of a machine learning model. In general, the higher the accuracy, the more reliable the model.

What is classification?

Classification is a machine learning operation that involves taking in data and separating the data points into different groups based on their characteristics. For example, a classification algorithm can tell you whether an email is spam or not.

What is regression?

Regression is a machine learning operation that involves using data points to predict a continuous outcome. For instance, a regression model can be used to predict future stock prices based on economic variables.

How do I find precision for 80 true and 20 false positive samples?

The precision for 80 true positive and 20 false positive samples is 0.8. You can find the answer by following the steps below:

  1. Determine the true positive TP;

  2. Determine the false positive FP; and

  3. Apply the precision formula:

    precision = TP / (TP + FP)
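
As a quick check in Python with these numbers:

    tp, fp = 80, 20
    precision = tp / (tp + fp)
    print(precision)  # 0.8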

What is the difference between accuracy and precision?

Precision is the percentage of correct predictions made out of all positive predictions, whereas accuracy is the percentage of correct predictions made out of all predictions.
