Linear Regression Calculator
With the help of our linear regression calculator, you can quickly determine the simple linear regression equation for any set of data points.
Wondering what linear regression is? Scroll down to learn what the linear regression model is, what the linear regression definition looks like, and how to calculate the linear regression formula by hand. You will also find an example of linear regression and a detailed explanation of how to interpret the slope of the regression line!
What is linear regression?
Linear regression is a statistical technique that aims to model the relationship between two variables (one variable is called explanatory/independent and the other is dependent) by determining a linear equation that best predicts the values of the dependent variable based on the values of the independent variable.
In other words, when we have a set of two-dimensional data points, linear regression describes the (non-vertical) straight line that best fits these points. A simple example is when we want to predict the weights of students based on their heights.
Be careful, as in some situations simple linear regression may not be the right model! If your data seem to follow a parabola rather than a straight line, try quadratic regression. If they resemble a cubic (degree three) curve, think of cubic regression. And if your data come from a process characterized by exponential growth, try exponential regression instead.
Linear regression equation
It's time for a more formal definition of linear regression. Assume we are given a set of points in the Cartesian plane: (x_{1},y_{1}), ..., (x_{n},y_{n}).
We assume that x is an independent variable, and that y is a dependent variable. We are going to find a straight non-vertical line with a slope a and an intercept b, i.e., the line of best fit has the formula:
y = a * x + b
As you can see, it is really easy to write down the linear regression equation! When calculating linear regression, we need to work out the values of the parameters a and b. In the next section, we will explain how to interpret these parameters, and then we will show you how to calculate them efficiently.
⚠ Bear in mind that in this article we restrict our attention to the case with only one explanatory variable. We call such a model simple linear regression.
💡 Simple linear regression is, well, simpler to understand and compute than multiple regression. However, many real-world phenomena require multiple explanatory variables. We will show you a way to calculate simple linear regression which easily extends to multiple linear regression.
Linear regression parameters interpretation
The slope coefficient
The coefficient a is the slope of the regression line. It describes how much the dependent variable y changes (on average!) when the independent variable x changes by one unit. Indeed, let's take a look at the following simple calculation:
a * (x + 1) + b = (a * x + b) + a = y + a
- If a > 0, then y increases by a units whenever x increases by 1 unit. We say there is a positive relationship between the two variables: as one increases, the other increases as well.
- If a < 0, then y decreases by |a| units whenever x increases by 1 unit. We say there is a negative relationship between the two variables: as one increases, the other decreases.
- If a = 0, then there is no linear relationship between the two variables in question: the predicted value of y is the same (constant) for all values of x.
Interestingly, we can express the slope a in terms of the standard deviations of x and y and of their Pearson correlation. We have:
a = corr(x, y) ⋅ sd(y) / sd(x)
where:
- corr(x, y) is the correlation between x and y;
- sd(x) is the standard deviation of x; and
- sd(y) is the standard deviation of y.
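The identity above can be checked numerically. The sketch below is only an illustration: the function names `pearson_corr` and `slope_from_correlation` are made up for this example, and it assumes the sample standard deviation is used for both variables (any consistent choice gives the same ratio). The data points are the ones from the worked example further down the article.

```python
from math import sqrt
from statistics import mean, stdev


def pearson_corr(xs, ys):
    """Pearson correlation computed from centered sums (hypothetical helper)."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def slope_from_correlation(xs, ys):
    """Slope a = corr(x, y) * sd(y) / sd(x), using sample standard deviations."""
    return pearson_corr(xs, ys) * stdev(ys) / stdev(xs)


xs = [1, 2, 3]
ys = [3, 6, 6]
slope = slope_from_correlation(xs, ys)
print(round(slope, 6))  # agrees with the slope 1.5 found in the worked example
```

Because of floating-point rounding, the result should be compared with a tolerance rather than with strict equality.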
The intercept coefficient
It is easy to see that the intercept coefficient b indicates the point on the vertical axis through which the fitted line passes. It has one more interesting property, which is related to the mean values of our observations.
Namely, the intercept coefficient b is such that the regression line passes through the point whose horizontal coordinate is equal to the mean of the x values, and whose vertical coordinate is equal to the mean of the y values.
We call such a point the center of mass of the set of data points.
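Written as a formula, the center-of-mass property gives a direct way to get the intercept once the slope is known. The symbols \bar{x} and \bar{y} (the means of the x and y values) are notation introduced here for illustration; they do not appear elsewhere in the article.

```latex
\bar{y} = a\,\bar{x} + b
\quad\Longleftrightarrow\quad
b = \bar{y} - a\,\bar{x}
```

For instance, for the points (1, 3), (2, 6), (3, 6) used in the example at the end of the article, the means are 2 and 5, so with slope a = 1.5 we get b = 5 - 1.5 * 2 = 2.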
How to use this linear regression calculator?
To use the linear regression calculator, follow the steps below:
- Enter your data, up to 30 points. The calculator needs at least 3 points to fit the linear regression model to your data points.
- We will show you the scatter plot of your data with the regression line.
- Below the plot you can find the linear regression equation for your data.
- Moreover, we tell you the coefficient of determination, R², of the fitted model. It tells you what proportion of the variance in the dependent variable y is explained by the model. Recall that R² ranges from 0 to 1, and the closer it is to 1, the better the fit.
- If you want to increase the precision of calculations, go to the advanced mode of our linear regression calculator. There you can set the number of significant figures.
Keep in mind that our linear regression calculator does not verify the assumptions of linear regression! You have to check them yourself: at the very least, take a look at the residuals to verify whether they are independent, normally distributed, and homoscedastic (i.e., whether they have constant variance).
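To make R² and the residuals concrete, here is a minimal sketch. It assumes the usual definition R² = 1 - SS_res / SS_tot; how the calculator computes it internally is not stated in this article. The helper names `residuals` and `r_squared` are hypothetical, and the data and coefficients are taken from the worked example at the end of the article.

```python
def residuals(xs, ys, a, b):
    """Observed y minus the fitted value a*x + b, for each data point."""
    return [y - (a * x + b) for x, y in zip(xs, ys)]


def r_squared(xs, ys, a, b):
    """Coefficient of determination, assuming R^2 = 1 - SS_res / SS_tot.

    Not defined when all y values are equal (SS_tot would be zero).
    """
    mean_y = sum(ys) / len(ys)
    ss_res = sum(r * r for r in residuals(xs, ys, a, b))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


xs = [1, 2, 3]
ys = [3, 6, 6]
print(residuals(xs, ys, a=1.5, b=2.0))  # [-0.5, 1.0, -0.5]
print(r_squared(xs, ys, a=1.5, b=2.0))  # 0.75
```

With only three points, the residuals say little about normality or constant variance. On a real data set, you would typically plot them against x and look for patterns.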
How to calculate linear regression?
We will show you how to calculate linear regression using the orthogonal projection approach. This approach is very handy, as the calculations are quick and it easily generalizes to multiple linear regression.
We need to introduce some notation:

- Let X be a matrix with two columns and n rows, where n is the number of data points. We fill the first column with ones, and in the second we put the observed values x_{1}, ..., x_{n} of the explanatory variable:
$$X = \begin{bmatrix} 1 & x_{1} \\ 1 & x_{2} \\ \vdots & \vdots \\ 1 & x_{n} \end{bmatrix}$$
- Let y be a column vector filled with the values y_{1}, ..., y_{n} of the dependent variable:
$$y = \begin{bmatrix} y_{1} \\ y_{2} \\ \vdots \\ y_{n} \end{bmatrix}$$
- Also, let β denote the column vector of the linear regression coefficients:
$$\beta = \begin{bmatrix} b \\ a \end{bmatrix}$$
To find the vector β you just need to perform the following matrix multiplication:
β = (X^{T}X)^{-1}X^{T}y
where X^{T} is the transpose of X, and (X^{T}X)^{-1} is the inverse of the matrix X^{T}X.
⚠ We assume that the inverse of X^{T}X exists. In other words, that the columns of X are linearly independent. In our specific case of simple linear regression, this condition means that the observed values x_{1}, ..., x_{n} of the explanatory variable must not all be equal. Otherwise, we wouldn't be able to fit the linear regression.
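To see why equal x values break the formula, it may help to write out X^{T}X and its determinant for the two-column matrix X described above. The symbol \bar{x} (the mean of the x values) is notation introduced here for illustration.

```latex
X^{T}X = \begin{bmatrix} n & \sum_{i} x_{i} \\ \sum_{i} x_{i} & \sum_{i} x_{i}^{2} \end{bmatrix},
\qquad
\det\left(X^{T}X\right) = n \sum_{i} x_{i}^{2} - \Big(\sum_{i} x_{i}\Big)^{2}
= n \sum_{i} \left(x_{i} - \bar{x}\right)^{2}
```

The last expression is zero exactly when every x_{i} equals the mean, i.e., when all observed x values are the same. In that case the inverse does not exist.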
💡 To compute multiple regression, you just need to append additional columns to the matrix X: each column must contain the observed values of a different explanatory variable. Naturally, the vector β then contains more coefficients: the number of coefficients is equal to the number of explanatory variables plus one. Most importantly, the matrix formula for β remains the same! That's the power of the matrix approach to linear regression!
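For the simple (one-variable) case, the matrix formula can be sketched in a few lines of plain Python, because X^{T}X is only a 2×2 matrix and its inverse can be written explicitly. This is an illustrative sketch, not the calculator's actual implementation: the function name `fit_simple_regression` is hypothetical and the sample data are made up.

```python
def fit_simple_regression(xs, ys):
    """Return (b, a) = (intercept, slope) from beta = (X^T X)^{-1} X^T y.

    X is the n-by-2 matrix whose first column is all ones and whose
    second column holds the x values, so X^T X is a 2x2 matrix.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two points and equally long inputs")

    # Entries of X^T X = [[n, sum_x], [sum_x, sum_xx]]
    sum_x = sum(xs)
    sum_xx = sum(x * x for x in xs)
    # Entries of X^T y = [sum_y, sum_xy]
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))

    det = n * sum_xx - sum_x ** 2
    if det == 0:
        # Happens when all x values are equal: X^T X has no inverse.
        raise ValueError("all x values are equal; cannot fit a line")

    # (X^T X)^{-1} = (1/det) * [[sum_xx, -sum_x], [-sum_x, n]]
    b = (sum_xx * sum_y - sum_x * sum_xy) / det
    a = (-sum_x * sum_y + n * sum_xy) / det
    return b, a


# Made-up points lying exactly on the line y = 2x + 1
b, a = fit_simple_regression([0, 1, 2, 3], [1, 3, 5, 7])
print(a, b)  # 2.0 1.0
```

The exact `det == 0` check is fine for small integer inputs like these. With floating-point data, a tolerance-based check (or a linear algebra library) would be a safer choice.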

Linear regression example
We want to find the linear regression model for the observations: (1, 3), (2, 6), (3, 6).
Our data is:
- the matrix X:
$$X = \begin{bmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{bmatrix}$$
- the vector y:
$$y = \begin{bmatrix} 3 \\ 6 \\ 6 \end{bmatrix}$$
So, to find the linear regression model we need to:
- Determine X^{T}:
$$X^{T} = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 3 \end{bmatrix}$$
- Compute X^{T}X:
$$X^{T}X = \begin{bmatrix} 3 & 6 \\ 6 & 14 \end{bmatrix}$$
- Find (X^{T}X)^{-1}:
$$(X^{T}X)^{-1} = \begin{bmatrix} 14/6 & -1 \\ -1 & 1/2 \end{bmatrix}$$
- Perform the final matrix multiplication (X^{T}X)^{-1}X^{T}y. The linear regression coefficients we wanted to find are:
$$\beta = \begin{bmatrix} 2 \\ 1.5 \end{bmatrix}$$
Therefore, the slope of the regression line is 1.5 and the intercept is 2. The linear regression model for our data is:
y = 1.5x + 2
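The final multiplication skips the intermediate product X^{T}y. For readers who want to verify it by hand, the step can be written out as follows, using the inverse with entries 14/6, -1, -1, 1/2 (the off-diagonal entries are negative):

```latex
\begin{aligned}
X^{T}y &= \begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 3 \end{bmatrix}
          \begin{bmatrix} 3 \\ 6 \\ 6 \end{bmatrix}
        = \begin{bmatrix} 15 \\ 33 \end{bmatrix}, \\[4pt]
\beta &= \begin{bmatrix} 14/6 & -1 \\ -1 & 1/2 \end{bmatrix}
         \begin{bmatrix} 15 \\ 33 \end{bmatrix}
       = \begin{bmatrix} 35 - 33 \\ -15 + 16.5 \end{bmatrix}
       = \begin{bmatrix} 2 \\ 1.5 \end{bmatrix}
\end{aligned}
```

The first entry of β is the intercept b = 2 and the second is the slope a = 1.5, matching the order used in the definition of β.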
As you can see, to find the simple linear regression formula by hand, we need to perform a lot of computations. Thankfully, there is our linear regression calculator! 😊