Pearson Correlation Calculator
 What is the Pearson correlation coefficient?
 Interpretation of the Pearson correlation
 How to use this Pearson correlation calculator
 Pearson correlation formula and properties
 How to calculate Pearson correlation by hand
 Pearson's r and R-squared in simple linear regression
 "Correlation does not equal causation"
This Pearson correlation calculator helps you determine Pearson's r for any given two-variable dataset. Below, we explain what Pearson correlation is, give the mathematical formula, and show how to use it to compute the Pearson correlation by hand. You can also discover the link between Pearson's r and linear regression, and finally understand what the common saying "correlation does not equal causation" means.
What is the Pearson correlation coefficient?
The Pearson correlation measures the strength and direction of the linear relation between two random variables, or bivariate data. Linearity means that one variable changes by the same amount whenever the other variable changes by 1 unit, no matter whether it changes, e.g., from 1 to 2 or from 11 to 12.
A simple real-life example is the relationship between parents' height and their offspring's height: the taller the parents are, the taller their children tend to be.
The Pearson correlation coefficient is most often denoted by r (and so this coefficient is also referred to as Pearson's r).
Interpretation of the Pearson correlation

The sign of the Pearson correlation gives the direction of the relationship:
 If r is positive, it means that as one variable increases, the other tends to increase as well; and
 If r is negative, then one variable tends to decrease as the other increases.
 The absolute value gives the strength of the relationship:
 Pearson's r ranges from −1 to +1;
 The closer it is to ±1, the stronger the relationship between the variables;
 If r equals −1 or +1, then the linear fit is perfect: all data points lie on one line; and
 If r equals 0, it means that no linear relationship is present in the data.
Remember that Pearson correlation detects only a linear relationship!
This means that a low (or even zero) correlation doesn't mean that there is no relationship at all! Take a look at the eight data sets below: they all have a Pearson correlation coefficient equal to zero.
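To make this concrete, here is a minimal sketch using a constructed example (it is not one of the data sets mentioned above): points lying exactly on the parabola y = x², with x-values placed symmetrically around zero. The helper name `pearson_r` is our own choice for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r from paired lists, via sums of centered products."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Illustrative data: y is fully determined by x, but not linearly.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r_parabola = pearson_r(xs, ys)
print(r_parabola)  # zero: the positive and negative products cancel out
```

Knowing x tells you y exactly here, yet the correlation vanishes because the relationship is not linear.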
How to use this Pearson correlation calculator
Just input your data into the rows. When at least three points (both an x and y coordinate) are in place, our Pearson correlation calculator will give you your result, along with an interpretation.
The verbal description of the strength of correlation returned by this calculator employs Evans's scale (1996) for the absolute value of r:
0.8 ≤ |r| ≤ 1.0: very strong
0.6 ≤ |r| < 0.8: strong
0.4 ≤ |r| < 0.6: moderate
0.2 ≤ |r| < 0.4: weak
0.0 ≤ |r| < 0.2: very weak
You may encounter many other guidelines for the interpretation of the Pearson correlation coefficient. Bear in mind that all such descriptions and interpretations are arbitrary and depend on context.
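As a rough sketch, the scale above can be written as a small lookup function. The function name `describe_strength` is hypothetical, and this is not the calculator's actual code, only one way to encode the listed thresholds.

```python
def describe_strength(r):
    """Map Pearson's r to a verbal label using the thresholds listed above."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("Pearson's r must lie between -1 and +1")
    size = abs(r)  # the scale applies to the absolute value of r
    if size >= 0.8:
        return "very strong"
    if size >= 0.6:
        return "strong"
    if size >= 0.4:
        return "moderate"
    if size >= 0.2:
        return "weak"
    return "very weak"

print(describe_strength(-0.65))  # strong
```

The sign of r is ignored on purpose: it describes the direction of the relationship, not its strength.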
Pearson correlation formula and properties
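It is high time we gave the mathematical formula for the Pearson correlation. Formally, Pearson's r is defined as the covariance of two variables divided by the product of their respective standard deviations. For n data points (x_{i}, y_{i}) with means x̄ and ȳ, this translates into the following formula:
\[ r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2} \, \sqrt{\sum (y_i - \bar{y})^2}} \]
which can be further rewritten as:
\[ r = \frac{\sum x_i y_i - n\bar{x}\bar{y}}{\sqrt{\sum x_i^2 - n\bar{x}^2} \, \sqrt{\sum y_i^2 - n\bar{y}^2}} \]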

It can be proven (via the Cauchy–Schwarz inequality) that the absolute value of the correlation coefficient never exceeds 1.

Note that the correlation is symmetric, i.e., the correlation between X and Y is the same as between Y and X.

Correlation vs. independence. If the variables are independent, their correlation is 0, but, in general, the converse is not true! There is, however, a special case: when X and Y are jointly normal (i.e., the random vector (X, Y) follows a bivariate normal distribution) and uncorrelated, then independence follows.
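The definition and the first two properties can be checked numerically. The sketch below computes r in two ways: as covariance divided by the product of standard deviations, and from raw sums, i.e., (∑x_{i}y_{i} − nx̄ȳ) divided by the square roots of ∑x_{i}^{2} − nx̄² and ∑y_{i}^{2} − nȳ². The function names and the data are made up for illustration only.

```python
import math

def pearson_definition(xs, ys):
    """r as sample covariance divided by the product of standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    return cov / (sd_x * sd_y)

def pearson_shortcut(xs, ys):
    """r from raw sums of squares and products (the rewritten form)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    numerator = sum(x * y for x, y in zip(xs, ys)) - n * mean_x * mean_y
    spread_x = sum(x * x for x in xs) - n * mean_x ** 2
    spread_y = sum(y * y for y in ys) - n * mean_y ** 2
    return numerator / (math.sqrt(spread_x) * math.sqrt(spread_y))

# Arbitrary illustrative data, not taken from the article.
xs = [2.0, 4.0, 5.0, 7.0, 9.0]
ys = [1.5, 3.0, 2.5, 6.0, 5.5]

r_def = pearson_definition(xs, ys)
r_short = pearson_shortcut(xs, ys)

print(math.isclose(r_def, r_short))                       # both forms agree
print(math.isclose(r_def, pearson_definition(ys, xs)))    # symmetry
print(abs(r_def) <= 1.0)                                  # bounded by 1
```

The n − 1 factors cancel between covariance and standard deviations, which is why the raw-sum form needs no such divisor. For data with very large values and small spread, the raw-sum form can lose precision, so the centered form is usually the safer choice in code.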
How to calculate Pearson correlation by hand
If you want to better understand how the Pearson correlation formula works, here is how to compute Pearson's r by hand. Suppose we have the data set (1, 1), (3, 2), (3, 3), (5, 4), so the x-values are 1, 3, 3, 5, and the respective y-values are 1, 2, 3, 4.
 Count how many points there are: n = 4
 Calculate the mean (arithmetic average) of the x and y values:
x̄ = (1 + 3 + 3 + 5) / 4 = 12 / 4 = 3
ȳ = (1 + 2 + 3 + 4) / 4 = 10 / 4 = 2.5
 Calculate the sums of the squares of x and y, and the sum of their products:
∑x_{i}^{2} = 1² + 3² + 3² + 5² = 44
∑y_{i}^{2} = 1² + 2² + 3² + 4² = 30
∑x_{i}y_{i} = 1 * 1 + 3 * 2 + 3 * 3 + 5 * 4 = 36
 We have all the values needed to apply the formula:
numerator = ∑x_{i}y_{i} − nx̄ȳ = 36 − 4 * 3 * 2.5 = 6
denominator = √8 * √5 = √40 ≈ 6.32
because
∑x_{i}^{2} − nx̄² = 44 − 4 * 3² = 8
∑y_{i}^{2} − nȳ² = 30 − 4 * 2.5² = 5
 Finally, we can compute the value of the Pearson correlation coefficient:
r = 6 / 6.32 ≈ 0.95
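The same steps can be followed in a few lines of code. This is a sketch that mirrors the hand calculation above on the same four points; the variable names are our own.

```python
import math

points = [(1, 1), (3, 2), (3, 3), (5, 4)]

# Step 1: count the points.
n = len(points)
xs = [x for x, _ in points]
ys = [y for _, y in points]

# Step 2: means of x and y.
mean_x = sum(xs) / n  # 3.0
mean_y = sum(ys) / n  # 2.5

# Step 3: sums of squares and sum of products.
sum_x2 = sum(x * x for x in xs)               # 44
sum_y2 = sum(y * y for y in ys)               # 30
sum_xy = sum(x * y for x, y in zip(xs, ys))   # 36

# Step 4: numerator and denominator of the formula.
numerator = sum_xy - n * mean_x * mean_y      # 6.0
denominator = math.sqrt(sum_x2 - n * mean_x ** 2) * math.sqrt(sum_y2 - n * mean_y ** 2)

# Step 5: the Pearson correlation coefficient.
r = numerator / denominator
print(round(r, 2))  # 0.95
```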
Pearson's r and R-squared in simple linear regression
In simple linear regression (Y ~ aX + b) the Pearson correlation is directly linked to the coefficient of determination (R-squared), which expresses the fraction of the variance in Y that is explained by X:

The R-squared can be calculated by simply squaring the Pearson correlation coefficient.

The slope a of the fitted regression line equals the Pearson correlation between Y and X multiplied by the ratio of their respective standard deviations: a = r * (s_{y} / s_{x})
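Both relationships can be verified on the small data set from the hand calculation, (1, 1), (3, 2), (3, 3), (5, 4). The sketch below fits the least-squares line directly and compares the results with the values obtained from r; it assumes Y is regressed on X, and the variable names are our own.

```python
import math

xs = [1, 3, 3, 5]
ys = [1, 2, 3, 4]
n = len(xs)
mean_x, mean_y = sum(xs) / n, sum(ys) / n

sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

r = sxy / math.sqrt(sxx * syy)

# Least-squares fit of Y ~ aX + b.
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# R-squared as the explained fraction of variance: 1 - SS_res / SS_tot.
ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
r_squared = 1 - ss_res / syy

# Slope recovered from r and the sample standard deviations.
s_x = math.sqrt(sxx / (n - 1))
s_y = math.sqrt(syy / (n - 1))
slope_from_r = r * (s_y / s_x)

print(math.isclose(r_squared, r ** 2))      # R-squared equals r squared
print(math.isclose(slope, slope_from_r))    # a = r * (s_y / s_x)
```

Note that the identity R-squared = r² holds for simple linear regression with an intercept; with several predictors, R-squared is no longer the square of a single pairwise correlation.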
"Correlation does not equal causation"
Always remember that even a very strong correlation between two variables does not mean there's a causal link between them. It could be random chance, or there may be some third (confounding) variable that affects both of your variables.
For example, the demand for sunglasses is strongly positively correlated with the rate of people drowning. This does not mean that sunglasses force anybody underwater! Instead, we rather suspect that hot weather causes both of these variables to increase.
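A toy simulation can illustrate how a shared cause produces correlation. Everything in the sketch below is synthetic: the temperature range, the coefficients, and the noise levels are invented for illustration and are not real sunglasses or drowning figures. Neither simulated variable depends on the other; both depend only on temperature.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson's r from paired lists, via sums of centered products."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical daily temperatures (made-up range, in degrees Celsius).
temperature = [rng.uniform(10, 35) for _ in range(500)]

# Both quantities are driven by temperature plus independent noise.
# The coefficients are arbitrary assumptions, not estimates from data.
sunglasses = [2.0 * t + rng.gauss(0, 5) for t in temperature]
drownings = [0.5 * t + rng.gauss(0, 2) for t in temperature]

r_confounded = pearson_r(sunglasses, drownings)
print(r_confounded > 0.6)  # strongly correlated, with no causal link between them
```

The correlation is real, but it says nothing about one variable causing the other: removing the common driver from the simulation would make it disappear.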