Module 10: Hypothesis Testing With Two Samples
Two Population Means with Known Standard Deviations
Barbara Illowsky & OpenStax et al.
Knowing the population standard deviations is unlikely in practice, but the following example illustrates hypothesis testing for independent means with known population standard deviations. The sampling distribution for the difference between the means is normal, and both populations must be normal. The random variable is [latex]\displaystyle\overline{X}_1-\overline{X}_2[/latex]. The normal distribution has the following format:
Normal distribution is: [latex]\displaystyle\overline{X}_1-\overline{X}_2\sim N\Bigg[\mu_1-\mu_2,\sqrt{\frac{(\sigma_1)^2}{n_1}+\frac{(\sigma_2)^2}{n_2}}\Bigg][/latex]
The standard deviation is: [latex]\displaystyle\sqrt{\frac{(\sigma_1)^2}{n_1}+\frac{(\sigma_2)^2}{n_2}}[/latex]
The test statistic (z-score) is: [latex]\displaystyle z=\frac{(\overline{x}_1-\overline{x}_2)-(\mu_1-\mu_2)}{\sqrt{\frac{(\sigma_1)^2}{n_1}+\frac{(\sigma_2)^2}{n_2}}}[/latex]
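The formulas above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original text: the function names `two_sample_z` and `p_value` and the parameter `null_diff` (the hypothesized value of μ1 – μ2) are hypothetical, and the numbers in the sanity check are made up purely to exercise the code.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(xbar1, xbar2, sigma1, sigma2, n1, n2, null_diff=0.0):
    """Return (standard_error, z) for two independent means, known sigmas."""
    # Standard deviation of the sampling distribution of Xbar1 - Xbar2.
    standard_error = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    # z-score of the observed difference relative to the hypothesized one.
    z = ((xbar1 - xbar2) - null_diff) / standard_error
    return standard_error, z


def p_value(z, tail="right"):
    """Tail area under the standard normal curve for the chosen alternative."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "right":  # Ha: mu1 - mu2 > null_diff
        return 1 - std_normal.cdf(z)
    if tail == "left":   # Ha: mu1 - mu2 < null_diff
        return std_normal.cdf(z)
    if tail == "two":    # Ha: mu1 - mu2 != null_diff
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("tail must be 'right', 'left', or 'two'")


# Sanity check with made-up inputs: equal sample means give z = 0,
# so half of the standard normal curve lies to the right.
se, z = two_sample_z(10.0, 10.0, sigma1=2.0, sigma2=2.0, n1=8, n2=8)
print(se, z, p_value(z))
```

The standard error and the z-score are returned separately because the standard error is also the standard deviation of the sampling distribution given in the first formula.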
Example
Independent groups, population standard deviations known. The mean lasting time of two competing floor waxes is to be compared. Twenty floors are randomly assigned to test each wax. Both populations have normal distributions. The data are recorded in the table.
| Wax | Sample Mean Number of Months Floor Wax Lasts | Population Standard Deviation |
|---|---|---|
| 1 | 3 | 0.33 |
| 2 | 2.9 | 0.36 |
Do the data indicate that wax 1 is more effective than wax 2? Test at a 5% level of significance.
Solution:
This is a test of two independent groups, two population means, population standard deviations known.
Random Variable: [latex]\displaystyle\overline{X}_1-\overline{X}_2[/latex] = difference in the mean number of months the competing floor waxes last.
H0: μ1 ≤ μ2
Ha: μ1 > μ2
The words “is more effective” say that wax 1 lasts longer than wax 2, on average. “Longer” is a “>” symbol and goes into Ha. Therefore, this is a right-tailed test.
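Distribution for the test: The population standard deviations are known, so the distribution is normal. Using the formula, the distribution is:
[latex]\displaystyle\overline{X}_1-\overline{X}_2\sim N\Bigg[0,\sqrt{\frac{(0.33)^2}{20}+\frac{(0.36)^2}{20}}\Bigg][/latex]
Since μ1 ≤ μ2, then μ1 – μ2 ≤ 0 and the mean for the normal distribution is zero.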
Calculate the p-value using the normal distribution: p-value = 0.1799
Graph:
[latex]\displaystyle\overline{X}_1-\overline{X}_2=3-2.9=0.1[/latex]
Compare α and the p-value: α = 0.05 and p-value = 0.1799. Therefore, α < p-value.
Make a decision: Since α < p-value, do not reject H0.
Conclusion: At the 5% level of significance, from the sample data, there is not sufficient evidence to conclude that the mean time wax 1 lasts is longer (wax 1 is more effective) than the mean time wax 2 lasts.
Using a Calculator
- Press STAT.
- Arrow over to TESTS and press 3:2-SampZTest.
- Arrow over to Stats and press ENTER.
- Arrow down and enter .33 for sigma1, .36 for sigma2, 3 for the first sample mean, 20 for n1, 2.9 for the second sample mean, and 20 for n2.
- Arrow down to μ1: and arrow to > μ2.
- Press ENTER.
- Arrow down to Calculate and press ENTER.
- The p-value is p = 0.1799 and the test statistic is 0.9157.
- Do the procedure again, but instead of Calculate do Draw.
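For readers without a graphing calculator, the same floor wax test can be reproduced with Python's standard library. This is a sketch rather than an equivalent of the calculator's routine: it builds the sampling distribution under H0 with `statistics.NormalDist` and reads the right-tail area directly. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Summary statistics from the floor wax table.
xbar1, sigma1, n1 = 3.0, 0.33, 20  # wax 1
xbar2, sigma2, n2 = 2.9, 0.36, 20  # wax 2

# Sampling distribution of Xbar1 - Xbar2 under H0 (mean difference 0).
standard_error = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
null_dist = NormalDist(mu=0.0, sigma=standard_error)

observed_diff = xbar1 - xbar2
z = observed_diff / standard_error

# Right-tailed test: area to the right of the observed difference.
p = 1 - null_dist.cdf(observed_diff)

print(f"z = {z:.4f}, p-value = {p:.4f}")  # z = 0.9157, p-value = 0.1799
```

Working with the distribution of the difference itself, rather than standardizing first, mirrors the "Distribution for the test" step of the solution; both routes give the same tail area.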
Try It
The means of the number of revolutions per minute of two competing engines are to be compared. Thirty engines are randomly assigned to be tested. Both populations have normal distributions. The table below shows the result. Do the data indicate that Engine 2 has higher RPM than Engine 1? Test at a 5% level of significance.
| Engine | Sample Mean Number of RPM | Population Standard Deviation |
|---|---|---|
| 1 | 1,500 | 50 |
| 2 | 1,600 | 60 |
The p-value is almost zero, so we reject the null hypothesis. There is sufficient evidence to conclude that Engine 2 runs at a higher RPM than Engine 1.
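The stated answer can be checked with a short computation. The sketch below assumes 30 engines of each type, which the problem statement does not say explicitly, and frames the alternative as μ1 < μ2 (Engine 1 minus Engine 2), making it a left-tailed test.

```python
from math import sqrt
from statistics import NormalDist

n = 30  # assumption: 30 engines of each type
xbar1, sigma1 = 1500.0, 50.0  # Engine 1
xbar2, sigma2 = 1600.0, 60.0  # Engine 2

standard_error = sqrt(sigma1**2 / n + sigma2**2 / n)
z = (xbar1 - xbar2) / standard_error

# Ha: mu1 < mu2, so the p-value is the area to the left of z.
p = NormalDist().cdf(z)

print(f"z = {z:.2f}, p-value = {p:.2e}")
```

The test statistic lies about seven standard errors below zero, which is why the p-value is described as almost zero.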
Example
An interested citizen wanted to know if Democratic U.S. senators are older than Republican U.S. senators, on average. On May 26, 2013, the mean age of 30 randomly selected Republican senators was 61 years 247 days old (61.675 years) with a standard deviation of 10.17 years. The mean age of 30 randomly selected Democratic senators was 61 years 257 days old (61.704 years) with a standard deviation of 9.55 years.
Do the data indicate that Democratic senators are older than Republican senators, on average? Test at a 5% level of significance.
Solution:
This is a test of two independent groups, two population means. The population standard deviations are unknown, but the sum of the sample sizes is 30 + 30 = 60, which is greater than 30, so we can use the normal approximation to the Student’s t distribution. Subscripts: 1 = Democratic senators, 2 = Republican senators.
Random variable: [latex]\displaystyle\overline{X}_1-\overline{X}_2[/latex] = difference in the mean age of Democratic and Republican U.S. senators.
H0: µ1 ≤ µ2, equivalently H0: µ1 – µ2 ≤ 0
Ha: µ1 > µ2, equivalently Ha: µ1 – µ2 > 0
The words “older than” translate as a “>” symbol and go into Ha. Therefore, this is a right-tailed test.
Distribution for the test: The distribution is the normal approximation to the Student’s t for means, independent groups. Using the formula, the distribution is:
[latex]\displaystyle\overline{X}_1-\overline{X}_2\sim N\Bigg[0,\sqrt{\frac{(9.55)^2}{30}+\frac{(10.17)^2}{30}}\Bigg][/latex]
Since µ1 ≤ µ2, µ1 – µ2 ≤ 0 and the mean for the normal distribution is zero.
(Calculating the p-value using the normal distribution gives p-value ≈ 0.4955.)
Graph:
Compare α and the p-value: α = 0.05 and p-value ≈ 0.4955. Therefore, α < p-value.
Make a decision: Since α < p-value, do not reject H0.
Conclusion: At the 5% level of significance, from the sample data, there is not sufficient evidence to conclude that the mean age of Democratic senators is greater than the mean age of the Republican senators.
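The senator comparison can be worked through from the summary statistics alone. The sketch below follows the solution in using the sample standard deviations in place of σ1 and σ2 under the normal approximation; the variable names are illustrative. It prints the observed difference next to the standard error, which makes clear how small the 0.029-year gap is relative to the sampling variability of roughly 2.5 years.

```python
from math import sqrt
from statistics import NormalDist

# Summary statistics from the example (ages in years).
xbar1, s1, n1 = 61.704, 9.55, 30   # 1: Democratic senators
xbar2, s2, n2 = 61.675, 10.17, 30  # 2: Republican senators
alpha = 0.05

# Sample standard deviations stand in for the population values here.
standard_error = sqrt(s1**2 / n1 + s2**2 / n2)
observed_diff = xbar1 - xbar2
z = observed_diff / standard_error

# Right-tailed test: Ha is mu1 > mu2.
p = 1 - NormalDist().cdf(z)
decision = "reject H0" if p < alpha else "do not reject H0"

print(f"difference = {observed_diff:.3f} years, "
      f"standard error = {standard_error:.3f} years")
print(f"z = {z:.4f}, p-value = {p:.4f}, decision: {decision}")
```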
Concept Review
A hypothesis test of two population means from independent samples, where the population standard deviations are known (typically approximated with the sample standard deviations), will have these characteristics:
Random variable: [latex]\displaystyle\overline{X}_1-\overline{X}_2[/latex] = the difference of the means
Distribution: normal distribution
Formula Review
Normal Distribution: [latex]\displaystyle\overline{X}_1-\overline{X}_2\sim N\Bigg[\mu_1-\mu_2,\sqrt{\frac{(\sigma_1)^2}{n_1}+\frac{(\sigma_2)^2}{n_2}}\Bigg][/latex]
Generally µ1 – µ2 = 0.
Test Statistic (z-score): [latex]\displaystyle z=\frac{(\overline{x}_1-\overline{x}_2)-(\mu_1-\mu_2)}{\sqrt{\frac{(\sigma_1)^2}{n_1}+\frac{(\sigma_2)^2}{n_2}}}[/latex]
Generally µ1 – µ2 = 0.
where: σ1 and σ2 are the known population standard deviations; n1 and n2 are the sample sizes; [latex]\displaystyle\overline{x}_1[/latex] and [latex]\displaystyle\overline{x}_2[/latex] are the sample means; μ1 and μ2 are the population means.