Confidence Intervals for Proportions Explained

Calculating the confidence interval for a proportion is a typical exercise in statistics. It is even one of the first confidence intervals students learn to calculate. However, while the concept may seem straightforward, its details are more subtle than they first appear. So, to understand how to calculate and interpret confidence intervals for a proportion, we must first define what proportions and confidence intervals are.

If you're not entirely familiar with the basics of interpreting confidence intervals in general, you might want to read our How to Interpret Confidence Intervals guide.

What is a proportion?

Generally speaking, a proportion is a part, share, or number considered in relation to a whole. It can equal 0, 1, or any value between 0 and 1. It can be expressed as a number or a percentage. In statistics, a proportion or a population proportion is a fraction of the population with a particular characteristic.

For example, let's say you have 1,000 people in the population, and 187 have blue eyes. The fraction of people with blue eyes is 187 out of 1,000, or 187/1000.

🙋 If you want to visualize the behavior of sample proportions when sampling repeatedly from a population, try our sampling distribution of the sample proportion calculator.

What is a confidence interval for a proportion?

A confidence interval is a statistical method that estimates the range of values within which a population parameter is likely to lie, based on observations from a sample. In the case of a confidence interval for a proportion, given the proportion $\hat{p}$ observed in the sample, the aim is to determine an interval that is expected to contain the true proportion of the population studied at a predetermined confidence level, usually 95%.

For example, a survey finds that 60% of voters support a candidate, with a 95% confidence interval of [57%, 63%]. This means we are 95% confident that the true proportion of voters who support the candidate lies between 57% and 63%. If you need a quick calculation, use our confidence interval calculator.

To create a confidence interval for a population proportion, you should:

  1. Take a random sample from the population.
  2. Calculate the sample proportion $\hat{p} = \frac{x}{n}$, where:
    $x$: Number of items in the sample with the characteristic of interest.
    $n$: Sample size.
  3. Check that the following conditions hold, so that the sampling distribution is approximately normal:
    • $n\hat{p} \ge 5$; and
    • $n(1 - \hat{p}) \ge 5$.
  4. Choose a confidence level, typically 90%, 95%, or 99%.
  5. Calculate the confidence interval of a proportion using one of the many methods specifically developed for this purpose. Check them below!
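Steps 2 and 3 above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names and the counts in the usage lines are hypothetical, and the threshold of 5 is simply the one quoted in the list.

```python
def sample_proportion(x: int, n: int) -> float:
    """Return the sample proportion p-hat = x / n."""
    if n <= 0 or not 0 <= x <= n:
        raise ValueError("need n > 0 and 0 <= x <= n")
    return x / n


def normal_approximation_ok(x: int, n: int, threshold: int = 5) -> bool:
    """Check n * p-hat >= threshold and n * (1 - p-hat) >= threshold.

    Because n * p-hat = x and n * (1 - p-hat) = n - x, the two
    conditions reduce to comparing the raw counts with the threshold.
    """
    return x >= threshold and (n - x) >= threshold


# Hypothetical counts, for illustration only
print(sample_proportion(30, 120))        # 0.25
print(normal_approximation_ok(30, 120))  # True
print(normal_approximation_ok(2, 120))   # False
```

Working with the counts directly avoids floating-point rounding when the products sit right at the threshold.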

If you would like to understand how this process differs from predicting individual outcomes rather than estimating a proportion of a population, see our article: Prediction Interval vs. Confidence Interval.

💡 Here are the most common methods to construct a confidence interval for a proportion:

  • The Wald interval: This method is the most common. While it produces confidence intervals perfectly centered around the observed proportion, for values of $\hat{p}$ close to 0 or 1, it can produce intervals that extend below 0 or above 1.
  • The Wilson score interval: This confidence interval is not centered on $\hat{p}$ and has the advantage of never producing bounds outside the valid range (values less than 0 or greater than 1).
  • The Wald or Wilson interval with continuity correction: It accounts for the transition from a discrete to a continuous distribution, which slightly modifies the formulas for calculating confidence intervals.
  • The Clopper-Pearson exact method: It is based on inverting two-tailed binomial tests. It is the most commonly cited exact method for finding a confidence interval.
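To make the difference between the first two methods concrete, here is a hedged sketch that computes both for the same sample. The Wilson expression used is the standard score-interval formula, which this article does not spell out; the function names and the sample of 1 success out of 20 are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(x, n, confidence=0.95):
    """Wald interval: p-hat plus or minus z* times the standard error."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p = x / n
    me = z * sqrt(p * (1 - p) / n)
    return p - me, p + me


def wilson_interval(x, n, confidence=0.95):
    """Wilson score interval (standard textbook form, assumed here)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p = x / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


# Hypothetical sample with a proportion close to 0: 1 success out of 20
print(wald_interval(1, 20))    # lower bound falls below 0
print(wilson_interval(1, 20))  # both bounds stay within [0, 1]
```

The Wilson center is pulled toward 0.5, which is why the interval is not symmetric around the observed proportion.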

To show you how to calculate the confidence interval for a population proportion, we introduce the Wald formula, as it is one of the most commonly taught (even though it is not always the most accurate for small samples).

First, you must ensure that the above conditions of normality are met. Then you can use the following formula:

$$\hat{p} \pm z^* \times \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$

where:

  • $\hat{p}$: Sample proportion;
  • $n$: Sample size; and
  • $z^*$: z-score corresponding to the confidence level:
    • 1.645 for 90%;
    • 1.96 for 95%; and
    • 2.576 for 99%.
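The critical values listed above do not have to be memorized. As a small sketch, they can be recovered from the standard normal distribution with Python's standard library; the helper name `z_star` is an assumption made for this illustration.

```python
from statistics import NormalDist


def z_star(confidence: float) -> float:
    """Two-sided critical value of the standard normal distribution."""
    alpha = 1 - confidence
    # Leave alpha / 2 in each tail, so look up the 1 - alpha / 2 quantile
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z* = {z_star(level):.3f}")
# 90%: z* = 1.645
# 95%: z* = 1.960
# 99%: z* = 2.576
```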

By analyzing the formula, we can see that the confidence interval for proportions is based on the margin of error (ME), which is written as follows:

$$ME = z^* \times \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$

Try our margin of error calculator to compute ME directly.

The confidence interval is then obtained by subtracting the margin of error from the sample proportion and adding it to the sample proportion:

$$\text{Lower bound} = \hat{p} - ME$$
$$\text{Upper bound} = \hat{p} + ME$$

In other words, the formula tells us how far we need to move away from the observed proportion to create an interval that, over many repeated samples, would capture the true proportion of the population in approximately 90%, 95%, or 99% of cases (depending on the confidence level).
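The repeated-sampling idea can be checked with a small simulation. The sketch below is illustrative only: the true proportion, sample size, number of trials, and seed are arbitrary assumptions, and the function names are hypothetical. It draws many samples, builds a Wald interval from each, and reports the share of intervals that contain the true proportion.

```python
import random
from math import sqrt
from statistics import NormalDist


def wald_interval(x, n, confidence=0.95):
    """Wald interval: p-hat plus or minus the margin of error."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p = x / n
    me = z * sqrt(p * (1 - p) / n)
    return p - me, p + me


def coverage(true_p, n, confidence=0.95, trials=2000, seed=0):
    """Fraction of simulated intervals that contain true_p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # One simulated sample: count successes among n Bernoulli draws
        x = sum(rng.random() < true_p for _ in range(n))
        lower, upper = wald_interval(x, n, confidence)
        if lower <= true_p <= upper:
            hits += 1
    return hits / trials


# Assumed population proportion 0.3 and sample size 200
print(coverage(true_p=0.3, n=200))
```

With these settings the printed share should land close to the nominal 95%, though the exact figure depends on the seed and the number of trials.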

In a market study of $n = 500$ randomly selected adults, 421 respondents said they owned a smartphone. Here's how to build a confidence interval:

  1. Calculate the sample proportion:
     $$\hat{p} = \frac{421}{500} = 0.842$$
  2. Check the normality conditions:
     $$n\hat{p} = 421 \geq 5,\quad n(1 - \hat{p}) = 79 \geq 5$$
  3. The conditions are met. Now, apply the Wald formula with a 95% confidence level. You get a 95% confidence interval of:
     $$\text{CI} = [0.81, 0.87]$$

You can say that you are 95% confident that between 81% and 87% of all adults in the surveyed population own a smartphone.
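For readers who want to follow the arithmetic behind the last step, the intermediate values can be written out as below. The symbol $SE$ (standard error, the square-root term of the Wald formula) is introduced here only for convenience, and the figures are rounded.

```latex
\begin{aligned}
SE &= \sqrt{\frac{0.842 \times 0.158}{500}} \approx 0.0163 \\
ME &= 1.96 \times 0.0163 \approx 0.032 \\
\text{Lower bound} &= 0.842 - 0.032 = 0.810 \\
\text{Upper bound} &= 0.842 + 0.032 = 0.874
\end{aligned}
```

Rounding both bounds to two decimal places gives the interval [0.81, 0.87].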

Just as with confidence intervals for population means, a confidence interval for a proportion of a population is constructed by taking a sample of a given size from the population, calculating the proportion of the sample, and then adding and subtracting the margin of error to obtain the limits of the confidence interval.

P-hat is a statistic representing the proportion of a particular outcome in a sample, calculated as the number of successes divided by the total number of observations in that sample.

A 99% confidence level means that if we took many samples and created a confidence interval from each of them, approximately 99% of those intervals would contain the true value we are estimating.

This article was written by Claudia Herambourg and reviewed by Steven Wooding.