
# Z-test Calculator

By Anna Szczepanek, PhD
Last updated: Jun 09, 2020

This Z-test calculator is a tool that helps you perform a one-sample Z-test on the population's mean. Two forms of this test exist, a two-tailed Z-test and a one-tailed Z-test, and you can use whichever suits your needs. You can also choose whether the calculator should determine the p-value from the Z-test, or whether you'd rather use the critical value approach!

Read on to learn more about the Z-test in statistics: in particular, when to use Z-tests, what the Z-test formula is, and whether to use a Z-test or a t-test. As a bonus, we give some step-by-step examples of how to perform Z-tests!

## What is a Z-test?

A one-sample Z-test is one of the most popular location tests. The null hypothesis is that the population mean is equal to a given number, μ₀:

H₀: μ = μ₀

We perform a two-tailed Z-test if we want to test whether the population mean is not μ₀:

H₁: μ ≠ μ₀,

and a one-tailed Z-test if we want to test whether the population mean is less/greater than μ₀:

H₁: μ < μ₀ (left-tailed test); and

H₁: μ > μ₀ (right-tailed test).

Let us now discuss the assumptions of a one-sample Z-test.

## When to use Z-tests?

You may use a Z-test if your sample consists of independent data points and:

• the data is normally distributed, and you know the population variance;

or

• the sample is large, and data follows a distribution which has a finite mean and variance. You don't need to know the population variance.

The reason these two possibilities exist is that we want a test statistic that follows the standard normal distribution N(0,1). In the former case, it follows the standard normal distribution exactly, while in the latter, it does so approximately, thanks to the central limit theorem.

The question remains, "When is my sample considered large?" Well, there's no universal criterion. In general, the more data points you have, the better the approximation works. Statistics textbooks recommend having no fewer than 50 data points, while 30 is considered the bare minimum.

## Z-test formula

Let `x1, ..., xn` be an independent sample following the normal distribution N(μ, σ²), i.e., with a mean equal to `μ`, and variance equal to `σ²`.

We pose the null hypothesis, H₀: μ = μ₀

We define the test statistic, Z, as:

`Z = (x̄ - μ0) * √n / σ`

where:

• `x̄` is the sample mean, i.e., `x̄ = (x1 + ... + xn) / n`;

• `μ0` is the mean postulated in H₀;

• `n` is sample size; and

• `σ` is population standard deviation.

In what follows, the uppercase `Z` stands for the test statistic (treated as a random variable), while the lowercase `z` will denote an actual value of `Z`, computed for a given sample drawn from N(μ,σ²).
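As a minimal sketch, the test statistic defined above can be computed in a few lines of Python. The function name `z_statistic` and the numbers in the usage line are illustrative assumptions, not part of the calculator itself.

```python
import math

def z_statistic(sample_mean, mu0, sigma, n):
    """Return z = (x̄ - μ0) * √n / σ for a one-sample Z-test."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    if n < 1:
        raise ValueError("n must be at least 1")
    return (sample_mean - mu0) * math.sqrt(n) / sigma

# Made-up inputs, for illustration only: x̄ = 5.2, μ0 = 5.0, σ = 0.5, n = 25
z = z_statistic(sample_mean=5.2, mu0=5.0, sigma=0.5, n=25)
print(round(z, 4))
```

Note that `sigma` here is the population standard deviation; for a large sample you may pass the sample standard deviation instead, as discussed in this section.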

If H₀ holds, then the sum `Sn = x1 + ... + xn` follows the normal distribution, with mean `n * μ0` and variance `n * σ²`. Consequently, the sample mean `Sn/n` is normal with mean `μ0` and variance `σ²/n`. As `Z` is the standardization (z-score) of `Sn/n`, we can conclude that the test statistic `Z` follows the standard normal distribution N(0,1), provided that H₀ is true.

If our data does not follow a normal distribution, or if the population standard deviation is unknown (and thus in the formula for `Z` we substitute the population standard deviation `σ` with the sample standard deviation), then the test statistic `Z` is not necessarily normal. However, if the sample is sufficiently large, then the central limit theorem guarantees that `Z` is approximately N(0,1).

In the sections below, we will explain how to use the value of the test statistic, `z`, to decide whether or not you should reject the null hypothesis. Two approaches can be used to arrive at that decision: the p-value approach and the critical value approach, and we cover both of them! Which one should you use? In the past, the critical value approach was more popular because it was difficult to calculate the p-value from a Z-test. However, with the help of modern computers, we can do it fairly easily, and with decent precision. In general, you are strongly advised to report the p-value of your tests!

## p-value from Z-test

Formally, the p-value is the smallest level of significance at which the null hypothesis could be rejected. More intuitively, the p-value answers the question: provided that I live in a world where the null hypothesis holds, how probable is it that the value of the test statistic will be at least as extreme as the `z`-value I've got for my sample? Hence, a small p-value means that your result is very improbable under the null hypothesis, and so there is strong evidence against the null hypothesis - the smaller the p-value, the stronger the evidence.

To find the p-value, you have to calculate the probability that the test statistic, `Z`, is at least as extreme as the value we've actually observed, `z`, provided that the null hypothesis is true. (The probability of an event calculated under the assumption that H0 is true will be denoted as `Pr(event | H0)`.) It is the alternative hypothesis which determines what more extreme means:

1. Two-tailed Z-test: extreme values are those whose absolute value exceeds `|z|`, so those smaller than `-|z|` or greater than `|z|`. Therefore, we have:

`p-value = Pr(Z ≤ -|z| | H0) + Pr(Z ≥ |z| | H0).`

The symmetry of the normal distribution gives:

`p-value = 2 Pr(Z ≤ -|z| | H0)`

2. Left-tailed Z-test: extreme values are those smaller than `z`, so

`p-value = Pr(Z ≤ z | H0)`

3. Right-tailed Z-test: extreme values are those greater than `z`, so

`p-value = Pr(Z ≥ z | H0)`

To compute these probabilities, we can use the cumulative distribution function (cdf) of N(0,1), which for a real number, `x`, is defined as:

`Φ(x) = (1/√(2π)) ∫ exp(-t²/2) dt`, where the integral runs over `t` from `-∞` to `x`.

Also, p-values can be nicely depicted as the area under the probability density function (pdf) of N(0,1), due to:

`Pr(Z ≤ x | H0) = Φ(x) = the area to the left of x`

`Pr(Z ≥ x | H0) = 1-Φ(x) = the area to the right of x`

## Two-tailed Z-test and one-tailed Z-test

With all the knowledge you've got from the previous section, you're ready to express the p-value of each type of Z-test in terms of Φ.

1. Two-tailed Z-test: `p-value = Φ(-|z|) + (1 - Φ(|z|))`.

From the fact that `Φ(-z) = 1 - Φ(z)`, we deduce that

`p-value = 2Φ(−|z|) = 2(1 - Φ(|z|))`;

The p-value is the area under the probability density function (pdf) both to the left of `-|z|`, and to the right of `|z|`.

2. Left-tailed Z-test: `p-value = Φ(z)`;

The p-value is the area under the pdf to the left of our `z`.

3. Right-tailed Z-test: `p-value = 1 - Φ(z)`;

The p-value is the area under the pdf to the right of `z`.

The decision as to whether or not you should reject the null hypothesis can be now made at any significance level, α, you desire!

• if the p-value is less than, or equal to, α, the null hypothesis is rejected at this significance level; and

• if the p-value is greater than α, then there is not enough evidence to reject the null hypothesis at this significance level.
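The three p-value formulas and the decision rule above can be sketched as follows. We assume Python 3.8 or later, where `statistics.NormalDist` provides the cdf of N(0,1); the function names `p_value` and `reject_h0` and the `tail` labels are hypothetical choices made for this example.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # N(0, 1)

def p_value(z, tail="two"):
    """p-value of a Z-test for the observed statistic z."""
    if tail == "two":
        return 2.0 * STD_NORMAL.cdf(-abs(z))   # 2Φ(-|z|)
    if tail == "left":
        return STD_NORMAL.cdf(z)               # Φ(z)
    if tail == "right":
        return 1.0 - STD_NORMAL.cdf(z)         # 1 - Φ(z)
    raise ValueError("tail must be 'two', 'left' or 'right'")

def reject_h0(z, alpha, tail="two"):
    """Reject H0 when the p-value is less than or equal to alpha."""
    return p_value(z, tail) <= alpha

print(round(p_value(1.5, "two"), 4), reject_h0(1.5, 0.05, "two"))
```

For the same `z`, the two-tailed p-value is twice the smaller of the two one-tailed p-values, so a result can be significant in a one-tailed test and not in the two-tailed one.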

## Z-test critical values & critical regions

The critical value approach involves comparing the value of the test statistic obtained for our sample, `z`, to the so-called critical values. These values constitute the boundaries of regions where the test statistic is highly unlikely to lie if H₀ is true. Those regions are often referred to as the critical regions, or rejection regions. The decision of whether or not you should reject the null hypothesis is then based on whether or not our `z` belongs to the critical region.

The critical regions depend on the significance level, α, of the test, and on the alternative hypothesis. The choice of α is arbitrary; in practice, the values of 0.1, 0.05, or 0.01 are most commonly used as α.

Once we agree on the value of α, we can easily determine the critical regions of the Z-test:

1. Two-tailed Z-test: `(-∞, Φ⁻¹(α/2)] ∪ [Φ⁻¹(1 - α/2), ∞)`

2. Left-tailed Z-test: `(-∞, Φ⁻¹(α)]`

3. Right-tailed Z-test: `[Φ⁻¹(1 - α), ∞)`

To decide the fate of H₀, check whether or not your `z` falls in the critical region:

• If yes, then reject H₀ and accept H₁; and

• If no, then there is not enough evidence to reject H₀.

As you see, the formulae for the critical values of Z-tests involve the inverse, Φ⁻¹, of the cumulative distribution function (cdf) of N(0,1).
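A minimal sketch of the critical value approach, assuming Python 3.8 or later, where `statistics.NormalDist().inv_cdf` plays the role of Φ⁻¹. The names `critical_region` and `in_critical_region` are our own, introduced only for illustration.

```python
import math
from statistics import NormalDist

def critical_region(alpha, tail="two"):
    """Return the rejection region as a list of closed (low, high) intervals."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    inv = NormalDist().inv_cdf  # inverse cdf of N(0, 1)
    if tail == "two":
        return [(-math.inf, inv(alpha / 2)), (inv(1 - alpha / 2), math.inf)]
    if tail == "left":
        return [(-math.inf, inv(alpha))]
    if tail == "right":
        return [(inv(1 - alpha), math.inf)]
    raise ValueError("tail must be 'two', 'left' or 'right'")

def in_critical_region(z, alpha, tail="two"):
    """True when z falls in the rejection region, i.e. H0 is rejected."""
    return any(low <= z <= high for low, high in critical_region(alpha, tail))

# Two-tailed test at α = 0.05: the boundaries are roughly -1.96 and 1.96
print([(round(a, 2), round(b, 2)) for a, b in critical_region(0.05, "two")])
```

Both approaches lead to the same decision: `z` lies in the critical region for a given `α` exactly when the corresponding p-value does not exceed `α`.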

## How to use the one-sample Z-test calculator?

Our calculator takes care of all the complicated steps for you:

1. Choose the alternative hypothesis: two-tailed or left/right-tailed.

2. In our Z-test calculator, you can decide whether to use the p-value or critical regions approach. In the latter case, set the significance level, `α`.

3. Enter the value of the test statistic, `z`. If you don't know it, then you can enter some data that will allow us to calculate your `z` for you:

• sample mean `x̄` (If you have raw data, go to the average calculator to determine the mean);
• tested mean `μ0`;
• sample size `n`; and
• population standard deviation `σ` (or sample standard deviation if your sample is large).

4. Results appear immediately below the calculator.

If you want to find `z` based on p-value, please remember that in the case of two-tailed tests there are two possible values of `z`: one positive and one negative, and they are opposite numbers. This Z-test calculator returns the positive value in such a case. In order to find the other possible value of `z` for a given p-value, just take the number opposite to the value of `z` displayed by the calculator.

## Z-test examples

To make sure that you've fully understood the essence of Z-test, let's go through some examples:

1. The volume poured by a bottle-filling machine follows a normal distribution. Its standard deviation, as declared by the manufacturer, is equal to 30 ml. A juice seller claims that the volume poured in each bottle is, on average, one liter, i.e., 1000 ml, but we suspect that in fact the average volume is smaller than that...

Formally, the hypotheses that we set are the following:

• H₀: μ = 1000 ml

• H₁: μ < 1000 ml

We went to a shop and bought a sample of 9 bottles. After carefully measuring the volume of juice in each bottle, we've obtained the following sample (in milliliters):

`1020, 970, 1000, 980, 1010, 930, 950, 980, 980`.

• Sample size: `n = 9`;

• Sample mean: `x̄ = 980 ml`;

• Population standard deviation: `σ = 30 ml`;

• So `Z = (980 - 1000) / (30 / √9) = -2`; and

• Therefore, `p-value = Φ(-2) ≈ 0.0228`.

As `0.0228 < 0.05`, we conclude that our suspicions aren't groundless; at the most common significance level, 0.05, we would reject the seller's claim, H₀, and accept the alternative hypothesis, H₁.
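The bottle example can be reproduced with a short script. This is only a sketch using Python's standard library (3.8 or later); the variable names are our own, while the data and the values of `μ0` and `σ` come from the example.

```python
import math
from statistics import NormalDist, fmean

volumes_ml = [1020, 970, 1000, 980, 1010, 930, 950, 980, 980]
mu0 = 1000.0    # mean claimed in H0, in ml
sigma = 30.0    # population standard deviation declared by the manufacturer, in ml

n = len(volumes_ml)
x_bar = fmean(volumes_ml)
z = (x_bar - mu0) * math.sqrt(n) / sigma
p = NormalDist().cdf(z)   # left-tailed test: p-value = Φ(z)

print(n, x_bar, z, round(p, 4))  # prints 9 980.0 -2.0 0.0228
```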

2. We tossed a coin 50 times. We got 20 tails and 30 heads. Is there sufficient evidence to claim that the coin is biased?

Clearly, our data follows a Bernoulli distribution, with some success probability `p` and variance `σ² = p * (1-p)`. However, the sample is large, so we can safely perform a Z-test. We adopt the convention that getting tails is a success.

Let us state the null and alternative hypotheses:

• H₀: p = 0.5 (the coin is fair - the probability of tails is `0.5`)

• H₁: p ≠ 0.5 (the coin is biased - the probability of tails differs from `0.5`)

In our sample we have 20 successes (denoted by ones) and 30 failures (denoted by zeros), so:

• Sample size `n = 50`;

• Sample mean `x̄ = 20/50 = 0.4`;

• Population standard deviation is given by `σ = √(0.5*0.5)` (because `0.5` is the proportion `p` hypothesized in H₀). Hence, `σ = 0.5`;

• So `Z = (0.4 - 0.5) / (0.5 / √50) = -√2 ≈ -1.4142`; and

• Therefore, `p-value ≈ 2 * Φ(-1.4142) ≈ 0.1573`.

Since `0.1573 > 0.1` we don't have enough evidence to reject the claim that the coin is fair, even at such a large significance level as `0.1`. In that case, you may safely toss it to your Witcher.
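The coin example can be checked the same way. Again, this is a sketch with variable names of our own choosing; the counts and the hypothesized proportion are taken from the example.

```python
import math
from statistics import NormalDist

tails, tosses = 20, 50
p0 = 0.5                                # proportion hypothesized in H0

p_hat = tails / tosses                  # sample mean of the 0/1 outcomes
sigma0 = math.sqrt(p0 * (1 - p0))       # standard deviation under H0
z = (p_hat - p0) * math.sqrt(tosses) / sigma0
p = 2 * NormalDist().cdf(-abs(z))       # two-tailed test: 2Φ(-|z|)

print(round(z, 4), round(p, 4))  # prints -1.4142 0.1573
```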

## Z-test vs t-test

Recall that we use a t-test for testing the population mean of a normally distributed dataset that has an unknown population standard deviation. We obtain its test statistic by replacing the population standard deviation in the Z-test statistic formula with the sample standard deviation, which means that this new test statistic follows (provided that H₀ holds) Student's t-distribution with n-1 degrees of freedom instead of N(0,1).

As n grows, Student's t-distribution with n degrees of freedom approaches N(0,1). Hence, as long as there is a sufficient number of data points (at least 30), it does not really matter whether you use the Z-test or the t-test, since the results will be almost identical. However, for small samples with unknown variance, remember to use the t-test instead of the Z-test, because in such a case Student's t-distribution differs quite significantly from N(0,1)!
