Coefficient of Variation Calculator
Once you know the standard deviation and mean of your dataset, the coefficient of variation calculator helps you make decisions about that data. Is the variation 'great' or 'small'? If you make a new random observation, how close to the mean would you expect it to be?
Read on to learn:
- Definition of the coefficient of variation
- Coefficient of variation formula
- When to use the coefficient of variation
- A fun example of how to calculate the coefficient of variation
What is the coefficient of variation?
The coefficient of variation (Cv) is the ratio of the standard deviation to the mean, sometimes expressed as a percentage.
For example, you might say that the ratio between the standard deviation and the mean is 0.1 or 10%.
How to calculate coefficient of variation
The coefficient of variation formula is:
Cv = (σ / μ) * 100%
where σ is the standard deviation of a population, and μ is the mean. The * 100% factor is only used when expressing Cv as a percentage.
This equation can also be applied to sample data, where s is used for the standard deviation, and x̅ for the mean:
Cv = (s / x̅) * 100%
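As a rough illustration, the two formulas above could be computed with Python's standard `statistics` module. The function names `cv_population` and `cv_sample` and the example numbers are made up for this sketch; they are not part of the calculator itself.

```python
import statistics

def cv_population(values):
    """Cv = sigma / mu, treating `values` as the entire population."""
    mu = statistics.fmean(values)
    if mu == 0:
        raise ValueError("Cv is undefined when the mean is zero")
    return statistics.pstdev(values) / mu

def cv_sample(values):
    """Cv = s / x-bar, treating `values` as a sample (n - 1 in the variance)."""
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("Cv is undefined when the mean is zero")
    return statistics.stdev(values) / mean

# Hypothetical data: mean 5, population standard deviation 2.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(f"population Cv: {cv_population(data):.1%}")
print(f"sample Cv:     {cv_sample(data):.1%}")
```

Multiplying by 100% is handled here by the `:.1%` format specifier, so the functions themselves return the plain ratio.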
For sample data, the coefficient of variation is a biased estimate of the population coefficient of variation. The coefficient of variation formula is slightly modified to calculate an unbiased coefficient of variation, represented by Ĉv:
Ĉv = (1 + 1/(4n)) * Cv
where n is the sample size. This formula modifies Cv to be larger when the sample size is small. As n increases, the value of Ĉv approaches the value of Cv.
Think of it this way: a dataset with a large sample size more accurately represents the population compared to a dataset with a small sample size.
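To see how quickly the correction fades, here is a small Python sketch of the factor (1 + 1/(4n)). The helper names `small_sample_factor` and `cv_unbiased` are illustrative assumptions, not an established API.

```python
import statistics

def small_sample_factor(n):
    """Multiplier applied to the sample Cv: 1 + 1/(4n)."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return 1 + 1 / (4 * n)

def cv_unbiased(values):
    """Sample Cv (s / mean) scaled by the small-sample factor."""
    cv = statistics.stdev(values) / statistics.fmean(values)
    return small_sample_factor(len(values)) * cv

# The factor shrinks toward 1 as the sample size grows.
for n in (5, 20, 100, 1000):
    print(n, small_sample_factor(n))
```

With n = 5 the sample Cv is scaled up by 5%, while with n = 100 the adjustment is only 0.25%.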
Coefficient of variation and relative standard deviation
The coefficient of variation (Cv) is very similar to the relative standard deviation (RSD), with the only difference being that the coefficient of variation can be negative, while RSD is always positive.
The coefficient of variation will tell you whether the mean is negative or positive:
- A positive mean results in a positive Cv
- A negative mean results in a negative Cv
Easy to remember, fortunately. The RSD, however, is often what you are looking at when a result is reported as the mean ± a percentage (e.g., 11 cm ± 2%).
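The sign difference can be sketched in a few lines of Python, assuming the RSD is computed by dividing by the absolute value of the mean. The function names and the all-negative dataset are hypothetical, chosen only to illustrate the point.

```python
import statistics

def coefficient_of_variation(values):
    """s / mean: carries the sign of the mean."""
    return statistics.stdev(values) / statistics.fmean(values)

def relative_standard_deviation(values):
    """s / |mean|: never negative."""
    return statistics.stdev(values) / abs(statistics.fmean(values))

# Hypothetical readings that are all negative (mean is -11).
readings = [-9, -11, -13, -11]
print(f"Cv:  {coefficient_of_variation(readings):.1%}")
print(f"RSD: {relative_standard_deviation(readings):.1%}")
```

For a dataset with a positive mean, the two functions return the same value.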
Common uses for the coefficient of variation calculator
The coefficient of variation calculator is commonly used to:
- Conduct quality assurance analysis
- Assess the precision of a technique, such as an assay in analytical chemistry
- Assess the risk/reward ratio for investment options such as stocks and bonds
- Compare variation of two datasets with different means
When not to use the coefficient of variation
You should not use the coefficient of variation for data which are on an interval scale. Interval scales do not have a true zero that indicates an absence of quantity, such as temperature (in degrees Celsius or Fahrenheit) or a calendar year.
It is also inappropriate to use the coefficient of variation when a dataset contains both positive and negative numbers.
For example, if you recorded net daily earnings for a lemonade stand, there could be positive net earnings on some days and negative earnings on others. If net earnings were $5.00 on two days and -$5.00 on another two days, the mean would be 0, and calculating the coefficient of variation would mean dividing by zero. The result is undefined (or, loosely speaking, infinite), which doesn't make a lot of sense.
In any dataset where values are both positive and negative, the positive and negative values partly cancel, pulling the mean toward zero without shrinking the standard deviation. The resulting coefficient of variation is inflated and does not accurately represent the variation of the dataset.
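The lemonade stand case can be reproduced with a short Python sketch. The helper `cv_or_none` is a hypothetical name, and the second dataset (one day changed to -$4.00) is an invented variation added only to show what happens when the mean is close to, but not exactly, zero.

```python
import statistics

def cv_or_none(values):
    """Return s / mean, or None when the mean is zero and the ratio is undefined."""
    mean = statistics.fmean(values)
    if mean == 0:
        return None
    return statistics.stdev(values) / mean

# Net earnings from the example: two days at $5.00, two days at -$5.00.
balanced = [5.00, 5.00, -5.00, -5.00]
# Invented variation: one losing day is -$4.00, so the mean is $0.25.
nearly_balanced = [5.00, 5.00, -5.00, -4.00]

print(cv_or_none(balanced))         # mean is 0, so there is no meaningful Cv
print(cv_or_none(nearly_balanced))  # a very large ratio, roughly 22 (2200%)
```

A one-dollar change on a single day moves the result from undefined to about 2200%, which says more about the mean sitting near zero than about how spread out the earnings are.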
Fun example - Assess a method for estimating jellybeans in a jar
Sofia is getting ready to go to a party. She knows there will be a guess-the-jellybeans game, where whoever guesses closest to the actual number of jellybeans inside the jar wins the prize (probably the jellybeans)!
To prepare, Sofia decides to test the precision of a method she learned from her uncle. The method involves counting the jellybeans down the side of the jar and multiplying that count by the number of jellybeans touching the bottom of the jar.
Sofia fills a cylindrical jar with jellybeans and estimates their number using the counting method, shaking the jar each time before trying again. Then she records the results: 80, 88, 76, 88, 91.
Based on the data, the mean is 84.6 and the sample standard deviation is 6.3.
Sofia then calculates the coefficient of variation:
Cv = 6.3/84.6 * 100% = 7.4%
Since her sample size was small, she modifies the coefficient of variation using the unbiased coefficient of variation formula:
Ĉv = (1 + 1/(4n)) * Cv
Ĉv = (1 + 1/(4 * 5)) * 7.4% = 7.8%
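Sofia's arithmetic can be checked with a few lines of Python using the standard `statistics` module. This is only a sketch; the variable names are illustrative.

```python
import statistics

# Sofia's five estimates from the practice jar.
estimates = [80, 88, 76, 88, 91]
n = len(estimates)

mean = statistics.fmean(estimates)   # sample mean
sd = statistics.stdev(estimates)     # sample standard deviation (n - 1)
cv = sd / mean                       # coefficient of variation as a ratio
cv_hat = (1 + 1 / (4 * n)) * cv      # small-sample corrected value

print(f"mean = {mean:.1f}, s = {sd:.1f}")
print(f"Cv = {cv:.2%}, corrected Cv = {cv_hat:.2%}")
```

Note that carrying the unrounded standard deviation through gives a Cv of roughly 7.46% rather than the 7.4% obtained from the rounded value 6.3; the corrected value rounds to 7.8% either way.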
The actual number of jellybeans is 86, so the mean estimate of 84.6 is reasonably accurate. However, the individual estimates range from 76 to 91, so the technique is not precise enough to rely on a single estimate.
...at the event
To make matters more difficult, shaking the jar is not allowed at the event, and each person can only see the jar for 1 minute. To secure a win, Sofia formulates a strategy:
Sofia estimates the number of jellybeans in the jar to be 120 using the same technique as before.
Based on the empirical rule, she muses that her guess has a 68% chance of being within one standard deviation (7.8%) of the actual number of jellybeans.
Since 7.8% of 120 is about 9 jellybeans (120 * 0.078 ≈ 9), Sofia believes that there is a 68% chance that the actual number of jellybeans is within 120 ± 9, that is, from 111 to 129 jellybeans.
Sofia gathers six friends, and together they cover the range from 111 to 129 in 3-jellybean increments:
111, 114, 117, 120, 123, 126, 129.
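The spread of guesses can be generated with a short Python sketch, assuming the margin is rounded to a whole jellybean as in the text. The variable names are illustrative.

```python
estimate = 120    # Sofia's single estimate at the event
cv_hat = 0.078    # corrected coefficient of variation from the practice jar
step = 3          # spacing between neighbouring guesses

# One "standard deviation" expressed in jellybeans: 120 * 0.078 = 9.36, rounded to 9.
margin = round(estimate * cv_hat)
low, high = estimate - margin, estimate + margin

# One guess for Sofia plus one for each of her six friends.
guesses = list(range(low, high + 1, step))
print(low, high)
print(guesses)
```

Changing `step` or the number of friends shows how the coverage of the range trades off against the gap between neighbouring guesses.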
Will Sofia and her friends win the game? How would you improve this strategy? Better yet, try it yourself!