Confusion Matrix Calculator

Creators

Wei Bin Loo

Wei Bin is a Product Manager based in London, leading a technology company's Product and Data functions. With a keen focus on delivering top-notch technology solutions, Wei Bin empowers businesses to unlock their full potential through innovative products, data-driven insights, and an unwavering commitment to customer value. His passion lies in guiding companies toward growth and success, leveraging the power of technology, data, and customer-centric product solutions. At Omni, Wei Bin leverages his financial expertise as a Strategy Consultant and CFA Level 2 holder to create various financial tools aimed at helping people improve their financial literacy. Outside of his professional pursuits, Wei Bin is an avid wine enthusiast with extensive knowledge and certification in the field. He also enjoys the strategic challenges of chess and poker, as well as swimming in his leisure time.

Reviewers

Dominik Czernia, PhD

Dominik Czernia, PhD, is a physicist at the Institute of Nuclear Physics in Kraków, specializing in condensed matter physics with a focus on molecular magnetism. He has led several national research projects, pioneering innovative approaches to novel materials for high technology. Passionate about making science accessible, Dominik has created various calculators, mostly in physics and math categories. In his free time, he enjoys family walks, city explorations, mountain hiking, and traveling everywhere by bike.

and Jack Bowater

With this confusion matrix calculator, we aim to help you to calculate various metrics that can be used to assess your machine learning model's performance. The confusion matrix is the most prevalent way of analyzing the results of a classification machine learning model. It is thus a critical topic to understand in this field.

We have prepared this article to help you understand what a confusion matrix is and how to calculate a confusion matrix. We will also explain how to interpret the confusion matrix examples to make sure you understand the concept thoroughly.

What is a confusion matrix in machine learning?

You can see a confusion matrix as a way of measuring the performance of a classification machine learning model. It summarizes the results of a classification problem using four metrics: true positive, false negative, false positive, and true negative.

However, the use of a confusion matrix goes way beyond just these four metrics. Using these four metrics, the confusion matrix allows us to assess the performance of the classification machine learning model using more versatile metrics, such as accuracy, precision, recall, and more.

We will talk about the definitions of these metrics in detail in the next section. You will be able, for example, to calculate accuracy from the confusion matrix all by yourself!

How to read a confusion matrix?

After understanding the definition of a confusion matrix in machine learning, it's time to talk about how to read a confusion matrix.

A confusion matrix has four components:

True positive (TP) - These are the correct predictions made that are labeled as positive. You can input this and the below values in the confusion matrix calculator's first section.
False negative (FN) - These are the wrong predictions made that are labeled as negative.
False positive (FP) - These are the wrong predictions made that are labeled as positive.
True negative (TN) - These are the correct predictions made that are labeled as negative.
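As a minimal sketch of how these four components are counted, the Python snippet below tallies them from a list of actual labels and a list of predicted labels. The function name `confusion_counts` and the sample labels are illustrative assumptions, not part of the calculator; 1 stands for the positive class and 0 for the negative class.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count TP, FN, FP and TN for a binary classification problem.

    `actual` and `predicted` are equal-length sequences of labels.
    `positive` is the label treated as the positive class (assumed to be 1).
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    tp = fn = fp = tn = 0
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            tp += 1  # positive sample predicted as positive
        elif a == positive:
            fn += 1  # positive sample predicted as negative
        elif p == positive:
            fp += 1  # negative sample predicted as positive
        else:
            tn += 1  # negative sample predicted as negative
    return tp, fn, fp, tn


# Hypothetical labels, made up for illustration only.
actual = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 0, 1, 0, 1, 0, 1, 0]
print(confusion_counts(actual, predicted))  # (3, 1, 1, 3)
```

The four counts are returned in the order TP, FN, FP, TN, matching the order of the list above.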

Using these four components, we can calculate various metrics to help us in analyzing the performance of the machine learning model:

accuracy - Accuracy is the proportion of correct predictions out of all predictions made. You can calculate accuracy from a confusion matrix, as well as other metrics, using our tool.
precision - Precision is the proportion of positive predictions that are correct.
recall - Recall is the proportion of actual positive samples that are correctly predicted as positive.
F1 score - F1 score combines precision and recall into a single number by taking their harmonic mean, which penalizes extreme values of either.
TPR - True positive rate is the probability that an actual positive sample is predicted as positive. It is the same quantity as recall.
FNR - False negative rate is the probability of getting a type II error, which is wrongly labeling a positive sample as negative.
FPR - False positive rate is the probability of getting a type I error, which is wrongly labeling a negative sample as positive.
TNR - True negative rate is the probability that an actual negative sample is predicted as negative.
FDR - False discovery rate is the ratio of the number of false positives to the total number of positive predictions.
MCC - Matthews correlation coefficient, also known as the phi coefficient, is a metric that measures the association between two binary variables.

Next, let's look at the calculations of these metrics using the confusion matrix example.

Confusion matrix calculator with an example

Finally, it is time to talk about the calculations. We will use the confusion matrix example below to demonstrate our calculation. Let's take the classification results below as an example:

TP: 80;
FN: 70;
FP: 20; and
TN: 30.

The calculations of the metrics are shown below:

Accuracy

To calculate accuracy from a confusion matrix, use the formula below:

accuracy = (TP + TN) / (TP + FN + FP + TN)

The accuracy for this example is (80 + 30) / (80 + 70 + 20 + 30) = 0.55.
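The same arithmetic can be sketched in Python. The function name `accuracy` is an assumption chosen for illustration; the inputs are the example values TP = 80, FN = 70, FP = 20, and TN = 30.

```python
def accuracy(tp, fn, fp, tn):
    """Share of all predictions that were correct."""
    total = tp + fn + fp + tn
    if total == 0:
        raise ValueError("the confusion matrix is empty")
    return (tp + tn) / total


acc = accuracy(tp=80, fn=70, fp=20, tn=30)
print(acc)           # 0.55
print(f"{acc:.0%}")  # 55%
```

Note that the numerator adds the two correct cells, TP and TN, while the denominator adds all four cells.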

You can also calculate the percentage versions of these metrics.

Precision

The precision can be calculated using the formula below:

precision = TP / (TP + FP)

The precision for this example is 80 / (80 + 20) = 0.8.

Recall

Find the recall using the formula below:

recall = TP / (TP + FN)

The recall for this example is 80 / (80 + 70) = 0.53.

F1 score

To estimate F1 score, use the following formula:

F1 score = (2 * precision * recall) / (precision + recall)

The F1 score for this example is (2 * 0.8 * 0.53) / (0.8 + 0.53) = 0.64.
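To check the precision, recall, and F1 score together, here is a short Python sketch. The function names are assumptions made for illustration, and the code does not guard against a zero denominator (for example, a model that never predicts the positive class).

```python
def precision(tp, fp):
    """TP / (TP + FP): share of positive predictions that are correct."""
    return tp / (tp + fp)


def recall(tp, fn):
    """TP / (TP + FN): share of actual positives that are found."""
    return tp / (tp + fn)


def f1_score(p, r):
    """Harmonic mean of precision p and recall r."""
    return 2 * p * r / (p + r)


p = precision(tp=80, fp=20)
r = recall(tp=80, fn=70)
print(round(p, 2), round(r, 2), round(f1_score(p, r), 2))  # 0.8 0.53 0.64
```

Using the unrounded recall gives the same F1 score of 0.64, since the formula is equivalent to 2 * TP / (2 * TP + FP + FN) = 160 / 250.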

True positive rate

The true positive rate TPR (also called sensitivity) can be calculated using the formula below:

TPR = TP / (TP + FN)

The TPR for this example is 80 / (80 + 70) = 0.53.

False negative rate

We express the false negative rate FNR in a similar way:

FNR = FN / (TP + FN)

The FNR for this example is 70 / (80 + 70) = 0.47.

False positive rate

The false positive rate FPR is as follows:

FPR = FP / (FP + TN)

The FPR for this example is 20 / (20 + 30) = 0.4.

True negative rate

The true negative rate TNR (also called specificity) is:

TNR = TN / (TN + FP)

The TNR for this example is 30 / (30 + 20) = 0.6.
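The four rates share two denominators: the number of actual positives (TP + FN) and the number of actual negatives (FP + TN). The Python sketch below makes that explicit; the function name `rates` and the dictionary layout are assumptions made for illustration.

```python
def rates(tp, fn, fp, tn):
    """Return TPR, FNR, FPR and TNR as a dictionary."""
    actual_positive = tp + fn  # all samples that are really positive
    actual_negative = fp + tn  # all samples that are really negative
    return {
        "TPR": tp / actual_positive,
        "FNR": fn / actual_positive,
        "FPR": fp / actual_negative,
        "TNR": tn / actual_negative,
    }


result = rates(tp=80, fn=70, fp=20, tn=30)
for name, value in result.items():
    print(name, round(value, 2))
```

Because each pair shares a denominator, TPR + FNR = 1 and FPR + TNR = 1, which is a quick way to sanity-check the results.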

False discovery rate

We can calculate the false discovery rate as follows:

FDR = FP / (TP + FP)

The FDR for our example is 20 / (80 + 20) = 0.2.

Matthews correlation coefficient

Finally, we can calculate the Matthews correlation coefficient using the formula below:

MCC = (TP * TN - FP * FN) / √((TP + FP) * (TN + FN) * (FP + TN) * (TP + FN))

Hence, the MCC is (80 * 30 - 20 * 70) / √((80 + 20) * (30 + 70) * (20 + 30) * (80 + 70)) = 0.11547.
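Since the MCC formula is the easiest one to mistype, here is a minimal Python sketch of it. The function name `mcc` is an assumption, and returning 0 when the denominator is zero is a common convention adopted here for illustration, not something stated by the calculator.

```python
import math


def mcc(tp, fn, fp, tn):
    """Matthews correlation coefficient from the four confusion-matrix cells."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tn + fn) * (fp + tn) * (tp + fn))
    if denominator == 0:
        return 0.0  # assumed convention when a row or column of the matrix is empty
    return numerator / denominator


print(round(mcc(tp=80, fn=70, fp=20, tn=30), 5))  # 0.11547
```

The result ranges from -1 to 1, so a value of about 0.12 indicates only a weak association between the predicted and actual classes.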

If all of these confusion matrix calculations look complicated, just use the confusion matrix calculator we have built for you!

FAQs

What is machine learning?

Machine learning is a branch of artificial intelligence that involves using algorithms or statistical models to automate the data analysis process. It can be used to perform predictions using techniques such as regression and classification.

What is accuracy?

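In the field of machine learning, accuracy is a metric used to assess the performance of a classification model: it is the proportion of correct predictions out of all predictions made. In general, the higher the accuracy, the more reliable the model, although accuracy alone can be misleading when one class is much more common than the other.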

What is classification?

Classification is a machine learning operation that involves taking in data and separating the data points into different groups based on their characteristics. For example, an algorithm can tell you if an email is considered a spam email or not.

What is regression?

Regression is a machine learning operation that involves using data points to predict a continuous outcome. For instance, a regression model can be used to predict future stock prices based on economic variables.

How do I find precision for 80 true and 20 false positive samples?

The precision for 80 true positive and 20 false positive samples is 80 / (80 + 20) = 0.8. You can find the answer by following the steps below:

Determine the true positive TP;
Determine the false positive FP; and
Apply the precision formula:

precision = TP / (TP + FP)

What is the difference between accuracy and precision?

Precision is the percentage of correct predictions made out of all positive predictions, whereas accuracy is the percentage of correct predictions made out of all predictions.