Module 10: Hypothesis Testing With Two Samples

# Two Population Means with Known Standard Deviations

Barbara Illowsky & OpenStax et al.

Knowing both population standard deviations is unlikely in practice, but the following example illustrates hypothesis testing for independent means with known population standard deviations. The sampling distribution for the difference between the means is normal, and both populations must be normal. The random variable is [latex]\displaystyle\overline{X}_{1}-\overline{X}_{2}[/latex]. The normal distribution has the following format:

**The normal distribution** is: [latex]\displaystyle\overline{X}_{1}-\overline{X}_{2}\sim N\Bigg[\mu_{1}-\mu_{2},\sqrt{\frac{(\sigma_{1})^{2}}{n_{1}}+\frac{(\sigma_{2})^{2}}{n_{2}}}\Bigg][/latex]

**The standard deviation** is: [latex]\displaystyle\sqrt{\frac{(\sigma_{1})^{2}}{n_{1}}+\frac{(\sigma_{2})^{2}}{n_{2}}}[/latex]

The **test statistic** (*z*-score) is: [latex]\displaystyle z=\frac{(\overline{x}_{1}-\overline{x}_{2})-(\mu_{1}-\mu_{2})}{\sqrt{\frac{(\sigma_{1})^{2}}{n_{1}}+\frac{(\sigma_{2})^{2}}{n_{2}}}}[/latex]
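The test statistic translates directly into a few lines of code. The following is a minimal sketch rather than part of the original text: the function name `two_sample_z_test` and its `tail` parameter are hypothetical, the normal CDF comes from Python's standard-library `statistics.NormalDist`, and the numbers in the sample call are illustrative only.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_test(mean1, mean2, sigma1, sigma2, n1, n2,
                      null_diff=0.0, tail="right"):
    """Return (z, p_value) for two independent means with known sigmas.

    tail="right" tests H_a: mu1 - mu2 > null_diff,
    tail="left"  tests H_a: mu1 - mu2 < null_diff.
    """
    # Standard deviation of the sampling distribution of X1bar - X2bar.
    std_error = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    # Observed difference minus hypothesized difference, in standard-error units.
    z = ((mean1 - mean2) - null_diff) / std_error
    if tail == "right":
        p_value = 1.0 - NormalDist().cdf(z)
    elif tail == "left":
        p_value = NormalDist().cdf(z)
    else:
        raise ValueError("tail must be 'right' or 'left'")
    return z, p_value


# Illustrative values only (not taken from the text).
z, p = two_sample_z_test(10.0, 9.0, 2.0, 2.0, 50, 50)
print(f"z = {z:.4f}, p-value = {p:.4f}")
```

Passing `null_diff=0.0` reflects the usual case in which the hypothesized difference between the population means is zero.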

### Example

**Independent groups, population standard deviations known.** The mean lasting time of two competing floor waxes is to be compared. **Twenty floors** are randomly assigned **to test each wax**. Both populations have normal distributions. The data are recorded in the table.

Wax | Sample Mean Number of Months Floor Wax Lasts | Population Standard Deviation |
---|---|---|
1 | 3 | 0.33 |
2 | 2.9 | 0.36 |

Do the data indicate that **wax 1 is more effective than wax 2**? Test at a 5% level of significance.

Solution:

This is a test of two independent groups, two population means, population standard deviations known.

**Random Variable:** [latex]\displaystyle\overline{X}_{1}-\overline{X}_{2}[/latex] = difference in the mean number of months the competing floor waxes last.

*H*_{0}: *μ*_{1} ≤ *μ*_{2}

*H*_{a}: *μ*_{1} > *μ*_{2}

The words **“is more effective”** mean that **wax 1 lasts longer than wax 2**, on average. “Longer” corresponds to a “>” symbol and goes into *H*_{a}. Therefore, this is a right-tailed test.

**Distribution for the test:** The population standard deviations are known so the distribution is normal. Using the formula, the distribution is:

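[latex]\displaystyle\overline{X}_{1}-\overline{X}_{2}\sim N\Bigg[0,\sqrt{\frac{(0.33)^{2}}{20}+\frac{(0.36)^{2}}{20}}\Bigg][/latex]

Since *μ*_{1} ≤ *μ*_{2}, then *μ*_{1} – *μ*_{2} ≤ 0 and the mean for the normal distribution is zero.

**Calculate the p-value using the normal distribution:** *p*-value = 0.1799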

**Graph:**

[latex]\displaystyle\overline{X}_{1}-\overline{X}_{2}=3-2.9=0.1[/latex]

**Compare α and the p-value:** *α* = 0.05 and *p*-value = 0.1799. Therefore, *α* < *p*-value.

**Make a decision:** Since *α* < *p*-value, do not reject *H*_{0}.

**Conclusion:** At the 5% level of significance, from the sample data, there is not sufficient evidence to conclude that the mean time wax 1 lasts is longer (wax 1 is more effective) than the mean time wax 2 lasts.
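The floor wax example can be reproduced numerically from the table values alone. This is a sketch under the example's own assumptions (normal populations, known standard deviations, hypothesized difference of zero). The variable names are illustrative, and `statistics.NormalDist` from the Python standard library supplies the normal CDF.

```python
from math import sqrt
from statistics import NormalDist

# Summary data from the floor wax table.
mean_1, sigma_1, n_1 = 3.0, 0.33, 20
mean_2, sigma_2, n_2 = 2.9, 0.36, 20
alpha = 0.05

# Standard deviation of X1bar - X2bar with known population sigmas.
std_error = sqrt(sigma_1**2 / n_1 + sigma_2**2 / n_2)

# Under H0 the hypothesized difference mu_1 - mu_2 is taken as 0.
z = ((mean_1 - mean_2) - 0.0) / std_error

# Right-tailed test: the p-value is the area to the right of z.
p_value = 1.0 - NormalDist().cdf(z)

decision = "reject H0" if p_value < alpha else "do not reject H0"
print(f"z = {z:.4f}, p-value = {p_value:.4f}, decision: {decision}")
# prints: z = 0.9157, p-value = 0.1799, decision: do not reject H0
```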

#### Using a Calculator

- Press `STAT`.
- Arrow over to `TESTS` and press `3:2-SampZTest`.
- Arrow over to `Stats` and press `ENTER`.
- Arrow down and enter `.33` for sigma1, `.36` for sigma2, `3` for the first sample mean, `20` for n1, `2.9` for the second sample mean, and `20` for n2.
- Arrow down to *μ*_{1}: and arrow to >*μ*_{2}.
- Press `ENTER`.
- Arrow down to `Calculate` and press `ENTER`.
- The *p*-value is *p* = 0.1799 and the test statistic is 0.9157.
- Do the procedure again, but instead of `Calculate` do `Draw`.

### Try It

The means of the number of revolutions per minute of two competing engines are to be compared. Thirty engines are randomly assigned to be tested. Both populations have normal distributions. The table below shows the result. Do the data indicate that Engine 2 has higher RPM than Engine 1? Test at a 5% level of significance.

Engine | Sample Mean Number of RPM | Population Standard Deviation |
---|---|---|
1 | 1,500 | 50 |
2 | 1,600 | 60 |

The *p*-value is almost zero, so we reject the null hypothesis. There is sufficient evidence to conclude that Engine 2 runs at a higher RPM than Engine 1.
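The claim that the *p*-value is almost zero can be checked with a short sketch. One point is an assumption: the exercise does not say how the thirty engines are divided between the two types, so the code tries both readings (30 per engine type and 15 per engine type). Because the question asks whether Engine 2 runs higher than Engine 1, the alternative is *μ*_{1} < *μ*_{2} and the test on [latex]\displaystyle\overline{X}_{1}-\overline{X}_{2}[/latex] is left-tailed.

```python
from math import sqrt
from statistics import NormalDist

# Summary data from the engine table.
mean_1, sigma_1 = 1500.0, 50.0
mean_2, sigma_2 = 1600.0, 60.0

results = {}
# ASSUMPTION: the text does not state the per-group sample size,
# so both plausible readings are evaluated.
for n in (30, 15):
    std_error = sqrt(sigma_1**2 / n + sigma_2**2 / n)
    # H_a: mu_1 < mu_2, so this is a left-tailed test on X1bar - X2bar.
    z = (mean_1 - mean_2) / std_error
    p_value = NormalDist().cdf(z)  # area to the left of z
    results[n] = (z, p_value)
    print(f"n per engine = {n}: z = {z:.2f}, p-value = {p_value:.2e}")
```

Under either reading the *p*-value is far below 0.05, so the decision to reject the null hypothesis does not depend on which sample size was intended.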

### Example

An interested citizen wanted to know if Democratic U.S. senators are older than Republican U.S. senators, on average. On May 26, 2013, the mean age of 30 randomly selected Republican senators was 61 years 247 days (61.675 years) with a standard deviation of 10.17 years. The mean age of 30 randomly selected Democratic senators was 61 years 257 days (61.704 years) with a standard deviation of 9.55 years.

Do the data indicate that Democratic senators are older than Republican senators, on average? Test at a 5% level of significance.

Solution:

This is a test of two independent groups, two population means. The population standard deviations are unknown, but the sum of the sample sizes is 30 + 30 = 60, which is greater than 30, so we can use the normal approximation to the Student’s *t* distribution.

Subscripts: 1: Democratic senators, 2: Republican senators

**Random variable:** [latex]\displaystyle\overline{X}_{1}-\overline{X}_{2}[/latex] = difference in the mean age of Democratic and Republican U.S. senators.

*H*_{0}: *µ*_{1} ≤ *µ*_{2}, or equivalently *H*_{0}: *µ*_{1} – *µ*_{2} ≤ 0

*H*_{a}: *µ*_{1} > *µ*_{2}, or equivalently *H*_{a}: *µ*_{1} – *µ*_{2} > 0

The words “older than” translate as a “>” symbol and go into *H*_{a}. Therefore, this is a right-tailed test.

**Distribution for the test:** The distribution is the normal approximation to the Student’s *t* for means, independent groups. Using the formula, the distribution is:

[latex]\displaystyle\overline{X}_{1}-\overline{X}_{2}\sim N\Bigg[0,\sqrt{\frac{(9.55)^{2}}{30}+\frac{(10.17)^{2}}{30}}\Bigg][/latex]

Since *µ*_{1} ≤ *µ*_{2}, *µ*_{1} – *µ*_{2} ≤ 0 and the mean for the normal distribution is zero.

**Calculating the p-value using the normal distribution:** The observed difference is 61.704 – 61.675 = 0.029 years and the standard deviation of the difference is about 2.547 years, so *z* ≈ 0.0114 and *p*-value ≈ 0.4955.

**Graph:**

**Compare α and the p-value:** *α* = 0.05 and *p*-value ≈ 0.4955. Therefore, *α* < *p*-value.

**Make a decision:** Since *α* < *p*-value, do not reject *H*_{0}.

**Conclusion:** At the 5% level of significance, from the sample data, there is not sufficient evidence to conclude that the mean age of Democratic senators is greater than the mean age of the Republican senators.
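The senator example can be rechecked from its summary statistics. This is a sketch only: as in the example itself, the sample standard deviations are assumed to stand in for the unknown population standard deviations, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Subscript 1: Democratic senators; subscript 2: Republican senators.
mean_1, s_1, n_1 = 61.704, 9.55, 30
mean_2, s_2, n_2 = 61.675, 10.17, 30
alpha = 0.05

# ASSUMPTION (as in the example): sample standard deviations are used
# in place of the unknown population standard deviations.
std_error = sqrt(s_1**2 / n_1 + s_2**2 / n_2)
z = (mean_1 - mean_2) / std_error
p_value = 1.0 - NormalDist().cdf(z)  # right-tailed: H_a is mu_1 > mu_2

print(f"difference = {mean_1 - mean_2:.3f} years, standard error = {std_error:.3f}")
print(f"z = {z:.4f}, p-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "do not reject H0")
```

The observed difference of a few hundredths of a year is very small next to a standard error of roughly two and a half years, which is why the test finds no evidence of an age difference.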

## Concept Review

A hypothesis test of two population means from independent samples where the population standard deviations are known (typically approximated with the sample standard deviations) will have these characteristics:

- Random variable: [latex]\displaystyle\overline{X}_{1}-\overline{X}_{2}[/latex] = the difference of the means
- Distribution: normal distribution

## Formula Review

Normal Distribution: [latex]\displaystyle\overline{X}_{1}-\overline{X}_{2}\sim N\Bigg[\mu_{1}-\mu_{2},\sqrt{\frac{(\sigma_{1})^{2}}{n_{1}}+\frac{(\sigma_{2})^{2}}{n_{2}}}\Bigg][/latex]

Generally *µ*_{1} – *µ*_{2} = 0.

Test Statistic (*z*-score): [latex]\displaystyle z=\frac{(\overline{x}_{1}-\overline{x}_{2})-(\mu_{1}-\mu_{2})}{\sqrt{\frac{(\sigma_{1})^{2}}{n_{1}}+\frac{(\sigma_{2})^{2}}{n_{2}}}}[/latex]

Generally *µ*_{1} – *µ*_{2} = 0.

where:

- *σ*_{1} and *σ*_{2} are the known population standard deviations.
- *n*_{1} and *n*_{2} are the sample sizes.
- [latex]\displaystyle\overline{x}_{1}[/latex] and [latex]\displaystyle\overline{x}_{2}[/latex] are the sample means.
- *μ*_{1} and *μ*_{2} are the population means.