Correlation Coefficient Calculator (Matthews)

Created by Julia Żuławińska
Reviewed by Dominik Czernia, PhD and Jack Bowater
Last updated: Sep 28, 2022

This correlation coefficient calculator can help you explore the world of statistics by explaining what a correlation coefficient is and how to calculate one. Unlike other correlation coefficients, the Matthews correlation coefficient is based on a binary classification, not on continuous values. The text below covers the Matthews correlation formula and other useful correlation statistics or metrics.

Statistics is a branch of mathematics that deals with collecting, analyzing, and interpreting data. It's used in medicine and physics, as well as by governments and many other types of organizations looking to find the best way to spend their time and money. The Matthews correlation is a common measure for interpreting data. In factories, it's used for quality control; in medicine, it helps with testing for disease.

Before we dive in, you should know that we have many other statistics calculators! Check out our Pearson correlation calculator and Spearman's correlation calculator to discover other correlation coefficients. Our p-value calculator may come in handy during your statistical journey as well.

What is a correlation coefficient? - correlation coefficient definition

A correlation coefficient is a measure of the strength of a correlation, i.e., the statistical connection between two variables. In other words, it describes how closely a change in one variable is accompanied by a change in the other. There are many types of correlation coefficients, for example, Pearson, intraclass, and rank correlation coefficients. They're all normalized, i.e., they operate on the same scale from -1 to +1, where:

  • 0 means no relationship between a set of variables
  • +1 means a perfect positive relation, i.e., variables change in the same direction
  • -1 means a perfect negative relation, i.e., variables change in opposite directions

How to find the correlation coefficient?

Our correlation coefficient calculator uses the Matthews correlation formula, which, alongside the relative risk, is often used in medicine, for example, to evaluate the applicability of drugs. It also finds use in the biological sciences as well as in machine learning - the scientific field that combines statistical models and algorithms to build computer systems that learn.

So, what is the correlation coefficient proposed by Matthews? It measures the correlation between the predicted and observed binary classification of a sample. The Matthews correlation coefficient formula is based on the so-called confusion matrix:

                   Said is           Said is not
Actually is        True positive     False negative
Actually is not    False positive    True negative

Confusing, right? Try to think of the columns as the prediction and the rows as the true result.
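If the matrix still feels abstract, here is a minimal Python sketch of how its four cells could be counted from paired labels. The function name `confusion_counts` and the sample labels are made up for illustration; they are not part of the calculator.

```python
def confusion_counts(actual, predicted):
    """Count TP, FN, FP, TN from paired boolean labels (True = positive)."""
    tp = fn = fp = tn = 0
    for is_positive, said_positive in zip(actual, predicted):
        if is_positive and said_positive:
            tp += 1      # actually is, said is
        elif is_positive:
            fn += 1      # actually is, said is not
        elif said_positive:
            fp += 1      # actually is not, said is
        else:
            tn += 1      # actually is not, said is not
    return tp, fn, fp, tn


# Hypothetical labels for six items, invented for this sketch.
actual = [True, True, False, False, True, False]
predicted = [True, False, False, True, True, False]

tp, fn, fp, tn = confusion_counts(actual, predicted)
print(tp, fn, fp, tn)  # 2 1 1 2
```

The tuple is returned in the same order as the matrix reads, row by row: TP, FN, FP, TN.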

The relation between these classifications is expressed with Matthews correlation formula:

MCC = [(TP * TN) - (FP * FN)] / √[(TP + FP)(TP + FN)(TN + FP)(TN + FN)]

where:

  • TP - true positive
  • FP - false positive
  • TN - true negative
  • FN - false negative
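The formula translates almost directly into code. Below is a minimal Python sketch; the function name `mcc` is our own choice, and returning 0 when the denominator is zero is a common convention that we assume here, not something the formula itself defines.

```python
import math


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from the four confusion-matrix counts."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Assumption: when a row or column of the matrix is empty, the
        # coefficient is undefined; we follow the common convention of 0.
        return 0.0
    return numerator / denominator


# Hypothetical counts, invented for this sketch.
print(mcc(tp=40, fp=10, tn=45, fn=5))
```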

The scale of this coefficient is defined a little differently to the correlation coefficient definition we mentioned before:

  • +1 describes a perfect prediction
  • 0 means the prediction is no better than random guessing
  • -1 represents a complete inconsistency between prediction and outcome
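To see the three landmark values in action, you could feed the formula some extreme confusion matrices. This is only a sketch: the helper `mcc_value` and all the counts are hypothetical, chosen so that the results come out as round numbers.

```python
import math


def mcc_value(tp, fp, tn, fn):
    """Matthews formula; returns 0.0 if the denominator is zero (assumed convention)."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denominator if denominator else 0.0


# Every prediction matches the true result.
perfect = mcc_value(tp=50, fp=0, tn=50, fn=0)

# Predictions are split evenly, as with coin flipping.
uninformative = mcc_value(tp=25, fp=25, tn=25, fn=25)

# Every prediction is the opposite of the true result.
inverted = mcc_value(tp=0, fp=50, tn=0, fn=50)

print(perfect, uninformative, inverted)  # 1.0 0.0 -1.0
```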

If you still have doubts about how to find the correlation coefficient, keep reading - we'll give you an example calculation a bit further down.

Other correlation statistics

The Matthews correlation coefficient is often regarded as one of the most informative single measures of the quality of a binary classification. If you aren't new to solving statistical problems, you might also find other scores relevant. In our calculator, you can find them by clicking the advanced mode button. They all range from 0% to 100%:

  • Sensitivity (true positive rate, recall) - a measure of how many actual positives have been correctly labeled as such:

    Sensitivity = TP / (TP + FN)

  • Specificity (true negative rate, selectivity) - a measure of how many actual negatives have been correctly labeled as such:

    Specificity = TN / (TN + FP)

  • Precision - the proportion of true positives among all the items predicted to be positive:

    Precision = TP / (TP + FP)

  • Accuracy - the ratio of correct results, both true positives and true negatives, to all elements:

    Accuracy = (TP + TN) / (TP + TN + FP + FN)

  • F1 score - a measure of the test's accuracy, calculated as the harmonic mean of precision and recall:

    F1 score = (2 * TP) / (2 * TP + FN + FP)
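The five scores above can be computed together from the same four counts. Here is a minimal Python sketch; the names `ratio` and `classification_metrics` are ours, the counts are hypothetical, and returning NaN for a zero denominator is an assumption, since the formulas don't say what to do in that case.

```python
def ratio(numerator, denominator):
    """Divide, returning NaN when the denominator is zero (assumed behavior)."""
    return numerator / denominator if denominator else float("nan")


def classification_metrics(tp, fp, tn, fn):
    """Return the five scores as fractions between 0 and 1."""
    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "precision": ratio(tp, tp + fp),
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "f1_score": ratio(2 * tp, 2 * tp + fn + fp),
    }


# Hypothetical counts, invented for this sketch.
scores = classification_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in scores.items():
    print(f"{name}: {value:.1%}")
```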

Which of these gives you the most information? Well, it depends on the data in your research. The F1 score, as a function of precision and recall, is a better measure than accuracy when the classes are imbalanced, e.g., when most of the points are actually negative. Precision is valuable when you can't afford many false positives. Sensitivity plays the same role when you can't afford many false negatives. In the end, it is down to you to decide which metric is the most significant.
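A quick sketch shows why accuracy can flatter a test when negatives dominate. The counts below are hypothetical and invented purely for illustration: out of 100 items, only 5 are actually positive, and the test finds just one of them.

```python
# Hypothetical counts: 5 actual positives, 95 actual negatives.
tp, fn, fp, tn = 1, 4, 1, 94

accuracy = (tp + tn) / (tp + tn + fp + fn)
f1_score = (2 * tp) / (2 * tp + fn + fp)

print(f"Accuracy: {accuracy:.0%}")  # Accuracy: 95%
print(f"F1 score: {f1_score:.0%}")  # F1 score: 29%
```

The accuracy looks excellent because the many true negatives swamp the result, while the F1 score, which ignores true negatives, exposes the missed positives.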

How to use our calculator - a correlation coefficient example

You already have answers to "what does correlation mean?", "what is the correlation coefficient formula?", and "what are some other correlation statistics?", but you may still not know how to calculate the correlation coefficient on your own. Let's have a look at this correlation coefficient example:

Let's say that you work in a ceramic factory, and you need to check if some plates are correctly manufactured. You checked 100 plates and you said 15 of them have defects, but, in fact, 25 of them are defective. Only 10 of the plates you flagged are actually defective, so you raised 5 false alarms, missed 15 defective plates, and correctly passed the remaining 70. The confusion matrix looks like this:

                   Said is defective    Said is not defective
Actually is        10 - TP              15 - FN
Actually is not    5 - FP               70 - TN

MCC = [(10 * 70) - (5 * 15)] / √[(10 + 5)(10 + 15)(70 + 5)(70 + 15)] = 0.4042

It's not a bad outcome, but it's one that would probably cost you your job! When you look at sensitivity, you see that you only correctly identified 40% of the broken plates.
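If you'd like to double-check the plate example by hand, the whole calculation fits in a few lines of Python. This is just a sketch: the variable names are ours, and the four input numbers come straight from the example above.

```python
import math

# Numbers from the plate example.
checked = 100
said_defective = 15
actually_defective = 25
correctly_flagged = 10

# Derive the four cells of the confusion matrix.
tp = correctly_flagged
fp = said_defective - tp
fn = actually_defective - tp
tn = checked - tp - fp - fn

mcc = (tp * tn - fp * fn) / math.sqrt(
    (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
)
sensitivity = tp / (tp + fn)

print(tp, fp, fn, tn)        # 10 5 15 70
print(round(mcc, 4))         # 0.4042
print(f"{sensitivity:.0%}")  # 40%
```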

Have you enjoyed this calculator? Check out the birthday paradox calculator next!

Copyright by Omni Calculator sp. z o.o.