The F-statistic calculator (or F-test calculator) helps you test whether the variances of two normally distributed populations are equal, based on the ratio of the variances of samples drawn from them.
Read further, and learn the following:
- What is an F-statistic;
- What is the F-statistic formula; and
- How to interpret an F-statistic in regression.
What is an F-statistic?
Broadly speaking, an F-statistic is a test statistic formed as a ratio of two variances, and the F-test is the procedure that uses it to compare the variances of two given populations. While an F-test may appear in various statistical or econometric problems, we apply it most frequently to regression analysis containing multiple explanatory variables. In this vein, an F-test is comparable to a T-test; the main difference is that the F-test evaluates several regression coefficients jointly, whereas the T-test evaluates only an individual one.
In the following article, we introduce the F-test in its most basic form using the F-distribution table for better intuition. Then we show how to calculate the F-statistic in linear regressions (see the calculator's Multiple regression mode) and explain how to interpret an F-statistic in regression analysis.
How to calculate the F-statistic using an F-statistic table?
The best way to grasp the essence of the F-test statistic is to consider its most basic form. Let's consider two populations, from each of which we draw a sample with the same number of observations. If we want to test whether the two populations are likely to have the same variance (denoted by $\sigma_1^2$, $\sigma_2^2$), we need to follow these steps:
- Specify the null hypothesis (which in our simple case is that the two variances are equal, $\sigma_1^2 = \sigma_2^2$) and the alternative hypothesis (which supposes that the two variances are different, $\sigma_1^2 \neq \sigma_2^2$).
- Determine the variance of each sample, $s_1^2$ and $s_2^2$ (here you may find our variance calculator useful).
- Calculate the F-test statistic by dividing the two sample variances: $F = s_1^2 / s_2^2$.
- Determine the degrees of freedom of the two samples, $df = n - 1$, with $n$ being the number of observations taken from the two populations in each case.
- Choose the significance level $\alpha$ of the F-test; for example, $\alpha = 0.05$ corresponds to a 95 percent confidence level.
- Check the critical value of the F-statistic in the F-distribution table as follows:
  - Look for the appropriate F-table for the given significance level $\alpha$.
  - Find the column at the top of the F-table that corresponds to the degrees of freedom of your first sample (numerator).
  - Find the row on the side that corresponds to the degrees of freedom of your second sample (denominator).
  - Read the F critical value at the intersection, which marks the start of the shaded area on the F-distribution graph below.
- Compare the critical value to the previously obtained F-value. If the F-value is larger than the critical value taken from the F-table, you can reject the null hypothesis. That is, we can state with high confidence that the variances of the two populations are not equal.
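The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the calculator's own implementation: the two samples are made-up numbers, the function name `variance_ratio_test` is hypothetical, and the critical value 3.18 is the tabulated upper 5% point of the F-distribution with 9 and 9 degrees of freedom, typed in by hand the way you would read it from an F-table.

```python
from statistics import variance


def variance_ratio_test(sample_1, sample_2, critical_value):
    """Return (F, df_1, df_2, reject) for the basic two-sample F-test."""
    s1_sq = variance(sample_1)  # sample variance (divides by n - 1)
    s2_sq = variance(sample_2)
    f_value = s1_sq / s2_sq     # ratio of the two sample variances
    df_1 = len(sample_1) - 1    # numerator degrees of freedom
    df_2 = len(sample_2) - 1    # denominator degrees of freedom
    reject = f_value > critical_value
    return f_value, df_1, df_2, reject


# Hypothetical data: 10 observations drawn from each population.
sample_1 = [12, 15, 11, 18, 14, 20, 9, 16, 13, 17]
sample_2 = [14, 15, 13, 16, 14, 15, 13, 16, 14, 15]

# Assumption: value read from an F-table for alpha = 0.05, df = (9, 9).
CRITICAL_F_05_9_9 = 3.18

f_value, df_1, df_2, reject = variance_ratio_test(
    sample_1, sample_2, CRITICAL_F_05_9_9
)
print(f"F = {f_value:.2f} with ({df_1}, {df_2}) degrees of freedom")
print("Reject H0" if reject else "Fail to reject H0")
```

In this sketch the sample with the larger variance is placed in the numerator, so the F-value is at least 1 and can be compared directly with the upper-tail value from the table.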
How to calculate the F-statistic in linear regression?
Analysts mainly apply the F-statistic to multiple regression models (and so can you, with our F-test statistic calculator in Multiple regression mode). It's therefore a good idea to take a step in this direction from the previous basic analysis.
Let's assume we have a regression model (the full, or unrestricted, model) of the following general form, and we would like to know whether it fits the data significantly better than its reduced form (the restricted model). In other words, we are testing whether the restricted coefficients (or the effects of the restricted variables) are jointly non-significant (equal to zero) in the population:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_{k-1} x_{k-1} + \varepsilon$$

where:
- $\beta_0$ is the constant or intercept;
- $y$ is the dependent variable (also called the regressand, response variable, explained variable, or output variable);
- $x_1, \dots, x_{k-1}$ are the independent variables (also called regressors, explanatory variables, controlled variables, or input variables);
- $\beta_1, \dots, \beta_{k-1}$ are the coefficients; and
- $\varepsilon$ is the residual (or error term).
To conduct the F-test and obtain the F-statistic (or F-value), we need to take the following steps:
- State the hypothesis we want to test. In our case, the null hypothesis is that the last two coefficients are jointly equal to zero in the unrestricted model. Or, stating the same differently, the joint effect of the related independent variables is insignificant:

$$H_0: \beta_{k-m} = \dots = \beta_{k-1} = 0$$

In turn, the alternative hypothesis is that at least one of these coefficients is not equal to zero:

$$H_1: \beta_j \neq 0 \text{ for at least one } j \in \{k-m, \dots, k-1\}$$

where:
- $m$ is the number of restrictions (in the present case, $m = 2$); and
- $k$ is the total number of coefficients in the full model, including the intercept.
- Now, to gain information on which model fits better, we need to obtain the sum of squared residuals ($SSE$) of both models. We expect the sum of squared residuals of the restricted model ($SSE_1$) to be at least as large as that of the full model ($SSE_2$), i.e. $SSE_1 \geq SSE_2$.
- However, the real question is whether the sum of squared residuals of the restricted model is significantly larger than the one of the full model. To find out, we apply the following F-statistic formula to compute the F-ratio:

$$F = \frac{(SSE_1 - SSE_2) / m}{SSE_2 / (n - k)}$$

where:
- $F$ is the F-statistic;
- $SSE_2$ is the sum of squared residuals of the full model;
- $SSE_1$ is the sum of squared residuals of the restricted model;
- $m$ is the number of restrictions;
- $k$ is the total number of coefficients; and
- $n$ is the number of observations in the sample.
Naturally, the larger the F-statistic, the more evidence we have to reject the null hypothesis (note that the F-statistic increases as the gap between the two sums of squared residuals widens). However, to be more precise, we need to find a critical value of the F-statistic to decide on the rejection.
Now, we can proceed in the way we described in the previous section: find the critical F-value in the F-distribution table for a specified significance level $\alpha$ by looking for the intersection corresponding to the degrees of freedom, where $m$ is at the top and $n - k$ is at the side of the table (we can also say that, under the null hypothesis, $F$ has an F-distribution with $m$ and $n - k$ degrees of freedom). If $F$ is larger than its critical value, we can reject the null hypothesis.
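As a rough sketch of the regression F-test described above, the snippet below computes the F-ratio from the two sums of squared residuals and compares it with a table value. Everything specific here is an assumption made for illustration: the function name `regression_f_statistic` is hypothetical, the inputs (120 and 100 for the restricted and full models, 2 restrictions, 65 observations, 5 coefficients) are made-up numbers, and 3.15 is the tabulated upper 5% critical value for 2 and 60 degrees of freedom.

```python
def regression_f_statistic(sse_restricted, sse_full, m, n, k):
    """F-ratio for testing m restrictions in a model with k coefficients.

    sse_restricted: sum of squared residuals of the restricted model
    sse_full:       sum of squared residuals of the full model
    m:              number of restrictions
    n:              number of observations
    k:              total number of coefficients in the full model
    """
    if m <= 0 or n <= k or sse_full <= 0:
        raise ValueError("need m > 0, n > k and a positive full-model SSE")
    numerator = (sse_restricted - sse_full) / m  # fit lost per restriction
    denominator = sse_full / (n - k)             # residual variance, full model
    return numerator / denominator


# Hypothetical inputs, chosen only to illustrate the arithmetic.
f_value = regression_f_statistic(
    sse_restricted=120.0, sse_full=100.0, m=2, n=65, k=5
)

# Assumption: value read from an F-table for alpha = 0.05, df = (2, 60).
CRITICAL_F_05_2_60 = 3.15
reject = f_value > CRITICAL_F_05_2_60

print(f"F = {f_value:.2f}")
print("Reject H0" if reject else "Fail to reject H0")
```

With these inputs the numerator is 20 / 2 = 10 and the denominator is 100 / 60, so the F-ratio is 6, which exceeds the table value and leads to rejecting the null hypothesis.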
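So how to interpret the F-statistic in regression?
The F-test can be interpreted as testing whether the reduction in unexplained variance obtained by moving from the restricted model to the more general model is significant. We may write it formally in the following way:

$$\text{reject } H_0 \text{ if } F > F_{\alpha;\, m,\, n-k}$$

where $\alpha$ is the significance level of the test. For example, at the 5% level ($\alpha = 0.05$), the critical value is the entry of the 5% F-table at $m$ and $n - k$ degrees of freedom.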
What is the difference between an F-test and a T-test?
There are several differences between an F-test and a T-test.
- The T-test is applied to test the significance of one explanatory variable, whereas the F-test tests the joint significance of several explanatory variables or of the whole model.
- While the T-test is used to compare the means of two populations, the F-test is applied to compare two population variances.
- The T-statistic is based on the Student's t-distribution, while the F-statistic follows the F-distribution under the null hypothesis.
- While the T-test is a univariate hypothesis test used when the population standard deviation is unknown, the F-test is applied to test the equality of the variances of two normal populations.
Can an F-statistic be negative?
No. Since variances cannot be negative (they are built from squared values), neither the numerator nor the denominator of the F-statistic formula can be negative, so the F-value is never negative.
What is a high F-statistic?
While a large F-value tends to indicate that the null hypothesis can be rejected, you can confidently reject the null only if the F-value is larger than its critical value.
Is the F-distribution symmetric?
No. The curve of the F-distribution is not symmetrical but skewed to the right (the curve has a long tail on its right side), where the shape of the curve depends on the degrees of freedom.
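The right skew can be illustrated with a small simulation. The sketch below relies on the standard fact that the ratio of two independent chi-square variables, each divided by its degrees of freedom, follows an F-distribution; the function name `simulate_f`, the degrees of freedom (5 and 10), the sample size, and the seed are all arbitrary choices made for illustration.

```python
import random
from statistics import mean, median


def simulate_f(df_1, df_2, size, seed=0):
    """Draw `size` values from an F(df_1, df_2) distribution."""
    rng = random.Random(seed)
    draws = []
    for _ in range(size):
        # A chi-square variable with d degrees of freedom is a gamma
        # variable with shape d / 2 and scale 2.
        chi2_1 = rng.gammavariate(df_1 / 2, 2)
        chi2_2 = rng.gammavariate(df_2 / 2, 2)
        draws.append((chi2_1 / df_1) / (chi2_2 / df_2))
    return draws


draws = simulate_f(df_1=5, df_2=10, size=20_000)

# In a right-skewed distribution the long tail pulls the mean above the median.
print(f"mean   = {mean(draws):.2f}")
print(f"median = {median(draws):.2f}")
print(f"min    = {min(draws):.4f}")
```

The mean landing above the median reflects the long right tail, and the minimum staying above zero matches the fact that an F-value cannot be negative. Changing `df_1` and `df_2` shows how the shape depends on the degrees of freedom.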
How to calculate the F-statistic?
To calculate the F-statistic, in general, you need to follow the steps below.
- State the null hypothesis and the alternative hypothesis.
- Determine the F-value by the formula
F = [(SSE₁ – SSE₂) / m] / [SSE₂ / (n−k)], where
SSE₁ and SSE₂ are the residual sums of squares of the restricted and the full model,
m is the number of restrictions,
n is the number of observations, and
k is the total number of coefficients in the full model.
- If you are comparing group means instead, determine the F-value as
F-statistic = variance of the group means / mean of the within-group variances.
- Find the critical value of the F-statistic in the F-table.
- Reject or fail to reject the null hypothesis.
What is the F-statistic of two populations with variances of 10 and 5?
The F-statistic of two populations with variances of 10 and 5 is 2. To get this result, it suffices to divide the two variances.