Residual Calculator

Creators

Wei Bin Loo

Wei Bin is a Product Manager based in London, leading a technology company's Product and Data functions. With a keen focus on delivering top-notch technology solutions, Wei Bin empowers businesses to unlock their full potential through innovative products, data-driven insights, and an unwavering commitment to customer value. His passion lies in guiding companies toward growth and success, leveraging the power of technology, data, and customer-centric product solutions. At Omni, Wei Bin leverages his financial expertise as a Strategy Consultant and CFA Level 2 holder to create various financial tools aimed at helping people improve their financial literacy. Outside of his professional pursuits, Wei Bin is an avid wine enthusiast with extensive knowledge and certification in the field. He also enjoys the strategic challenges of chess and poker, as well as swimming in his leisure time.

Reviewers

Dominik Czernia, PhD, Institute of Nuclear Physics PAN

Dominik Czernia, PhD, is a physicist at the Institute of Nuclear Physics in Kraków, specializing in condensed matter physics with a focus on molecular magnetism. He has led several national research projects, pioneering innovative approaches to novel materials for high technology. Passionate about making science accessible, Dominik has created various calculators, mostly in physics and math categories. In his free time, he enjoys family walks, city explorations, mountain hiking, and traveling everywhere by bike.

Steven Wooding

Steven Wooding is a physicist by training with a degree from the University of Surrey specializing in nuclear physics. He loves data analysis and computer programming. He has worked on exciting projects such as environmentally aware radar, using genetic algorithms to tune radar, and building the UK vaccine queue calculator. Steve is now the Editorial Quality Assurance Coordinator here at Omni Calculator, making sure every calculator meets the standards our users expect. In his spare time, he enjoys cycling, photography, wildlife watching, and long walks.

This residual calculator finds the residuals of a linear regression analysis. Residuals are among the most important measures of a linear regression's accuracy: each one shows how far the model's prediction lies from an observed value.

If any of this is unfamiliar, read on to learn what linear regression and residuals are, how to find residuals, and how to calculate the sum of squares residuals in statistics. You will also find practical examples to help you understand the concept, along with an explanation of how to use the residual graph.

What is linear regression?

Linear regression is a statistical approach that attempts to explain the relationship between two variables with a straight line. It can be shown as:

y = a × x + b

where y is the dependent variable, whereas x is the independent variable. Linear regression aims to explain the relationship between y and x. Specifically, it models the change in y for any changes in x.

We define a as the slope: the change in y per unit change in x. The second parameter, b, is the intercept: the value of y when x equals zero.

Linear regression is a very powerful tool as it can help you to predict the "future". For example, we can use linear regression to predict future stock prices. Let's say we model the stock price of Company Alpha using the following model:

stock price = 1.5 × GDP growth + 20

If the expected GDP growth for the following year is 10% (entered into the model as 10, i.e., in percentage points), the predicted stock price of Company Alpha is:

1.5 × 10 + 20 = $35
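The prediction above can be reproduced with a few lines of Python. This is a minimal sketch: the function name `predict` and its parameter names are illustrative choices, not part of the calculator, and GDP growth is assumed to be entered in percentage points.

```python
def predict(x, slope, intercept):
    """Return the value predicted by the line y = slope * x + intercept."""
    return slope * x + intercept


# Company Alpha example from the text: slope 1.5, intercept 20.
# GDP growth is assumed to be in percentage points (10 means 10%).
stock_price = predict(10, slope=1.5, intercept=20)
print(stock_price)  # 35.0
```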

However, it is important to understand that not all relationships are linear. If your data can't be explained by a straight line, you might want to try other regression methods. Please visit our quadratic regression calculator and exponential regression calculator.

What is residual? – The residual definition

Let's say you have now modeled a linear relationship between y and x using linear regression. The next vital step is to estimate the accuracy of your linear model, and this is where the residual comes in. So, how do you find the residual?

A residual is the difference between the observed value and the predicted value at a given point in the model. If the observed value is larger than the predicted value, the residual is positive. If the predicted value is larger than the observed value, the residual is negative. The further a residual is from zero, the less accurately the model predicts that particular point.

However, to assess the performance of the whole linear model, we need to combine the residuals of all the points. Simply adding them up would let positive and negative residuals offset each other, so we calculate the sum of squared residuals instead.

Theory aside, let's see how to calculate residuals in statistics.

How to calculate residual in statistics? – The residual formula

As we mentioned previously, a residual is the difference between the observed value and the predicted value at one point. We can calculate the residual as:

e = y − ŷ

where:

e – Residual;
y – Observed value; and
ŷ – Predicted value.

For instance, say we have a linear model of y = 2 × x + 2. One of the actual data points we have is (2, 7), which means that when x equals 2, the observed value of y is 7. However, according to the model, the predicted value, ŷ, is 2 × 2 + 2 = 6.

Hence, according to the equation above, the residual, e, is 7 − 6 = 1.
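As a quick check, the same single-point calculation can be sketched in Python. The function name `residual` is a hypothetical helper introduced here for illustration.

```python
def residual(observed, predicted):
    """Residual e = y - y_hat: observed value minus predicted value."""
    return observed - predicted


# Model from the text: y = 2 * x + 2, with the data point (2, 7).
x, y = 2, 7
y_hat = 2 * x + 2       # predicted value: 6
e = residual(y, y_hat)
print(e)  # 1
```

A positive result means the observed value lies above the line, and a negative result means it lies below.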

To assess the whole linear model, the residual of a single data point is not enough, since you will probably have many data points. So, we need to combine all the individual residuals. To capture both the positive and negative deviations, we take the sum of e² instead of e: squaring makes every nonzero residual positive, so negative residuals can no longer cancel positive ones. The sum of squares residuals can be calculated using the following equation:

Σ(e²) = e₁² + e₂² + e₃² + … + eₙ²

So, if the model y = 2 × x + 2 has three data points, (1, 4), (2, 7), and (3, 5), the predicted value for each point will be:

ŷ₁ = 2 × 1 + 2 = 4
ŷ₂ = 2 × 2 + 2 = 6
ŷ₃ = 2 × 3 + 2 = 8

And the individual residuals will be:

e₁ = 4 − 4 = 0
e₂ = 7 − 6 = 1
e₃ = 5 − 8 = −3

So, we can calculate the sum of squares residuals as:

Σ(e²) = 0² + 1² + (−3)² = 0 + 1 + 9 = 10
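The worked example can also be written as a short Python sketch. The function name `sum_of_squared_residuals` and its parameters are assumptions made for illustration; the model and data points are the ones used in the example.

```python
def sum_of_squared_residuals(points, slope, intercept):
    """Sum e**2 over all (x, y) points for the line y = slope * x + intercept."""
    total = 0
    for x, y in points:
        predicted = slope * x + intercept
        e = y - predicted          # residual of this point
        total += e ** 2            # squaring removes the sign
    return total


points = [(1, 4), (2, 7), (3, 5)]
print(sum_of_squared_residuals(points, slope=2, intercept=2))  # 10
```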

How to use the residual plot or residual graph?

Now that we have discussed the meaning of the residual and the residual formula, let's talk about what a residual plot is.

A residual graph is a plot of the residuals against the predicted values, i.e., the residuals are on the y-axis, and the predicted values are on the x-axis. So, why do we need to plot the residual graph?

The primary use of the residual plot is to assess whether a linear model is a good model for the data. If a linear model fits well, the residuals should be scattered randomly around zero. So, if the residuals in the residual plot show no pattern, you have got yourself a good model.

On the other hand, if the residuals on the plot seem to follow a certain pattern, it might mean that a linear model is not suitable for your data, and you should consider other models, such as a quadratic model.
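To make the axes concrete, the sketch below builds the (predicted value, residual) pairs that a residual plot would display. The function name `residual_plot_points` is hypothetical. The second dataset (points on the curve y = x²) and the line y = 4 × x − 2 are assumptions chosen only to illustrate a patterned residual plot.

```python
def residual_plot_points(points, slope, intercept):
    """Return (predicted value, residual) pairs: x-axis and y-axis of the plot."""
    pairs = []
    for x, y in points:
        predicted = slope * x + intercept
        pairs.append((predicted, y - predicted))
    return pairs


# Model and data from the earlier example: y = 2 * x + 2.
example = residual_plot_points([(1, 4), (2, 7), (3, 5)], slope=2, intercept=2)
print(example)  # [(4, 0), (6, 1), (8, -3)]

# Hypothetical curved data (y = x**2) described by a straight line.
curved_data = [(x, x ** 2) for x in range(5)]
curved = residual_plot_points(curved_data, slope=4, intercept=-2)
print([e for _, e in curved])  # [2, -1, -2, -1, 2]
```

In the second case, the residuals fall and then rise again in a U shape instead of scattering randomly, which is the kind of pattern that suggests trying a quadratic model.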

FAQs

What is the sum of squares residuals?

The sum of squares residuals is one of the metrics used to analyze the accuracy of your linear model. The larger the sum of squares residuals, the less accurate your model is.

Why do you need to use sum of squares residuals?

The main reason we need to use the sum of squares residuals instead of the sum of residuals is that the negative residuals and positive residuals might offset each other. This would make the linear model appear more accurate than it is.
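A small sketch shows the offsetting effect. The residual values below are made up for illustration and do not come from any dataset in this article.

```python
# Hypothetical residuals: the model misses every point, but the errors
# are evenly split above and below the line.
residuals = [3, -3, 2, -2]

plain_sum = sum(residuals)                     # positives and negatives cancel
squared_sum = sum(e ** 2 for e in residuals)   # every error counts

print(plain_sum)    # 0
print(squared_sum)  # 26
```

The plain sum of 0 would suggest a perfect fit, while the sum of squares of 26 reflects the actual errors.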

Can we explain every relationship with linear regression?

Mathematically speaking, yes, you can fit a straight line to any set of data. However, it might not be wise to do that. Some relationships are not linear, and fitting a linear model to them might lead to poor results.

What is a residual plot?

A residual plot is a graph plotted with residuals on the y-axis and predicted value on the x-axis. It allows you to assess if the linear model is a good fit.