Covariance vs. Variance: Key Differences Explained
Variance and covariance are mathematical terms frequently used in statistics and probability theory. Variance measures the spread of a data set around its mean value, while covariance measures the directional relationship between two random variables.
In addition to their general use in statistics, these two terms have specific meanings in finance. Financial experts use variance to measure an asset's volatility, while covariance describes how the returns of two different investments move together over time.
This article will examine these fundamentals in more detail by recalling the mathematical formulas, their definitions, and their properties. But before diving into the covariance vs. variance subject, check out our variance and covariance calculators!
What is variance?
According to the classic definition, variance is the average of the squares of the deviations from the mean. In more mathematical terms, it can be considered a measure used to characterize the dispersion of a distribution or sample.
The formula for variance is as follows:

$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2$$
where:
- $x_i$ — Individual data point;
- $\mu$ — Mean of the data points; and
- $N$ — Total number of data points.
💡 Note that when calculating a sample variance to estimate a population variance, the denominator of the variance equation becomes n − 1. This removes bias from the estimate and prevents the researcher from underestimating the population variance. To learn more about this subject, check out our detailed article: Sample Variance vs. Population Variance: What's the Difference?.
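To make the distinction concrete, here is a minimal Python sketch (using a small made-up data set) that computes both versions by hand and checks them against the standard library's statistics module:

```python
from statistics import pvariance, variance

data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)
mu = sum(data) / n

# Population variance: divide the sum of squared deviations by N.
pop_var = sum((x - mu) ** 2 for x in data) / n          # 4.0

# Sample variance: divide by n - 1 (Bessel's correction) so the
# estimate of the population variance is unbiased.
samp_var = sum((x - mu) ** 2 for x in data) / (n - 1)   # ~4.57

# The standard library agrees with the manual calculations.
print(pop_var, pvariance(data))
print(samp_var, variance(data))
```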
Roughly speaking, you can view variance as the average of the squares minus the square of the average. The formula uses squared deviations so that positive and negative deviations from the mean cannot cancel each other out. Since the units of the variance are the square of the units of the data, we more often use the standard deviation, which is simply the square root of the variance.
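A quick sketch of that shortcut, again with made-up numbers: the "average of the squares minus the square of the average" gives the same result as the definition above, and taking the square root returns the standard deviation in the data's original units.

```python
import math

data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)

mean = sum(data) / n
mean_of_squares = sum(x * x for x in data) / n

# Average of the squares minus the square of the average:
variance = mean_of_squares - mean ** 2      # 4.0, same as the definition
std_dev = math.sqrt(variance)               # 2.0, in the data's own units

print(variance, std_dev)
```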
💡 Want to know more about variance? Check out our dedicated article: What's Variance in Statistics?.
What is covariance?
Covariance is slightly different. While variance allows us to study the variations of a variable in relation to itself, covariance enables us to examine the simultaneous variations of two variables in relation to their respective means. In finance, this concept measures the degree to which the fluctuations of two securities, or of a security and an index, move together.
You can view covariance as the average of the products of the two variables minus the product of their two means. Mathematically, the formula is as follows:

$$\text{cov}(x, y) = \frac{1}{N}\sum_{i=1}^{N}(x_i - \bar{x})(y_i - \bar{y})$$
where:
- $x$ — First variable;
- $y$ — Second variable;
- $N$ — Number of data points;
- $\bar{x}$ — Mean of $x$; and
- $\bar{y}$ — Mean of $y$.
From this equation, we can deduce that the closer the covariance is to zero, the weaker the linear relationship between the two series. A large positive covariance means the series tend to move in the same direction, while a large negative covariance means they tend to move in opposite directions. Two independent variables always have a covariance of zero; note, however, that a covariance of zero does not by itself guarantee independence, since it only rules out a linear relationship.
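The sign convention is easy to see in code. Below is a minimal sketch with two made-up series, one that rises with x and one that falls as x rises; the small helper function is written for illustration, not taken from a library.

```python
def cov(x, y):
    """Population covariance of two equal-length sequences."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / n

x = [1, 2, 3, 4, 5]
y_up = [2, 4, 6, 8, 10]     # moves with x
y_down = [10, 8, 6, 4, 2]   # moves against x

print(cov(x, y_up))    # 4.0  -> positive: the variables rise together
print(cov(x, y_down))  # -4.0 -> negative: one rises as the other falls
```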
Difference between covariance and variance
Although variance and covariance are closely related, they measure different aspects of data. Here are their key differences:
Feature | Variance, $\sigma^2$ | Covariance, $\text{cov}(x, y)$
---|---|---
Definition | Spread of one variable's values around its mean | How two variables change together
Type of measure | Variability within a single dataset | Direction of the relationship between two datasets
Possible values | Always ≥ 0 | Can be positive, negative, or zero
Units | Square of the variable's units | Product of the units of the two variables
Use case | Understanding data spread, risk in finance, quality control | Analyzing relationships, portfolio diversification
What is a covariance matrix?
A covariance matrix is a square matrix that contains the variances and covariances associated with several variables. The matrix's diagonal elements contain the variables' variances, while the off-diagonal elements contain the covariances between all possible pairs of variables.
For example, suppose you create a covariance matrix for three variables X, Y, and Z. The following table shows the variances in bold along the diagonal: the variances of X, Y, and Z are 2.0, 3.4, and 0.82, respectively. The covariance between X and Y is -0.86.
  | X | Y | Z
---|---|---|---
**X** | **2.0** | -0.86 | -0.15
**Y** | -0.86 | **3.4** | 0.48
**Z** | -0.15 | 0.48 | **0.82**
The covariance matrix is symmetric because the covariance between X and Y is identical to that between Y and X. Consequently, the covariance for each pair of variables is displayed twice in the matrix.
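As a sketch, here is how you might build such a matrix with NumPy; the three simulated variables and their coefficients are made up and unrelated to the table above. np.cov treats each row as one variable and returns variances on the diagonal and covariances off it.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000

# Three made-up, partially related variables.
x = rng.normal(size=n)
y = -0.6 * x + rng.normal(size=n)
z = 0.4 * y + rng.normal(size=n)

# One variable per row; np.cov returns the 3x3 covariance matrix.
cov_matrix = np.cov(np.stack([x, y, z]))

print(cov_matrix)
print(np.allclose(cov_matrix, cov_matrix.T))  # True: the matrix is symmetric
```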
Many statistical applications calculate the covariance matrix of a model's parameter estimators. It is often used to compute standard errors of the estimators or of functions of the estimators. For example, logistic regression software produces this matrix for the estimated coefficients, allowing you to inspect the coefficient variances and the covariances between all possible pairs of coefficients.
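As a sketch of that use case, assuming the statsmodels library and entirely made-up data, a fitted logistic regression exposes the coefficient covariance matrix through cov_params(), and the square roots of its diagonal are the coefficients' standard errors.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)

# Made-up data: a binary outcome driven by two predictors.
X = rng.normal(size=(200, 2))
p = 1 / (1 + np.exp(-(0.8 * X[:, 0] - 1.2 * X[:, 1])))
y = rng.binomial(1, p)

result = sm.Logit(y, sm.add_constant(X)).fit(disp=0)

cov = result.cov_params()        # covariance matrix of the coefficients
print(cov)
print(np.sqrt(np.diag(cov)))     # standard errors; matches result.bse
```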
Variance and covariance are powerful statistical tools, but they answer different questions. Variance tells us how much a single dataset varies around its mean, while covariance explains how two datasets move together. Understanding both gives you a fuller picture: variance shows the spread of your data, and covariance shows the direction of relationships.
FAQs
What is the difference between variance and covariance?
Variance is defined as the average of the squares of the deviations from the mean. It quantifies the dispersion of a set of data points around their mean value. Covariance, on the other hand, measures the directional relationship between two random variables: it indicates whether an increase in one variable tends to be accompanied by an increase or a decrease in the other.
What does a correlation of zero mean?
A correlation of zero means that there is no linear relationship between the two variables. If two random variables are independent, their covariance (and therefore their correlation) equals zero; the converse does not hold, because zero correlation rules out only a linear relationship.
What is the difference between covariance and correlation?
Covariance tells you whether two variables change together, but not how strong that relationship is. Correlation tells you both whether they change together and how strong that relationship is.
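A short sketch of the link between the two, with made-up numbers: correlation is just covariance rescaled by the two standard deviations, which is why it is dimensionless and always falls between -1 and 1.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.0, 9.8])

cov_xy = np.cov(x, y)[0, 1]   # sample covariance, in units of x times y

# Dividing by both standard deviations normalizes the covariance:
corr_xy = cov_xy / (np.std(x, ddof=1) * np.std(y, ddof=1))

print(cov_xy)                    # depends on the variables' units
print(corr_xy)                   # dimensionless, in [-1, 1]
print(np.corrcoef(x, y)[0, 1])   # NumPy's built-in gives the same value
```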
Can the variance be negative?
No. The variance is always non-negative because it is the expected value of a squared quantity, and the variance of a constant variable (i.e., a variable that always takes the same value) is equal to zero.
This article was written by Claudia Herambourg and reviewed by Steven Wooding.