Ch11-Solns-All Skuce 2e

Instructors Solutions Manual - Chapter 11
Chapter 11 Solutions
Develop Your Skills 11.1
1. These data are collected on a random sample of days. They should be independent, unless the locations are close enough to each other that the foot traffic at each would be affected by the same factors. We will assume this is not the case. Histograms show approximate normality.
[Histogram: Daily Foot Traffic at Location 1 (x-axis: Number of People; y-axis: Number of Days)]
[Histogram: panel title not recovered (x-axis: Number of People; y-axis: Number of Days)]
Copyright 2011 Pearson Canada Inc.
[Histogram: panel title not recovered (x-axis: Number of People; y-axis: Number of Days)]
The histogram for foot traffic at location 1 shows some right-skewness, but sample sizes are reasonable, and close to the same, so we will assume the population data are normally distributed. The largest variance is 478.7 (for location 2), and the smallest is 257.2 (location 1). The largest variance is less than twice as large as the smallest. So, following our rule, we will assume the population variances are approximately equal. Therefore, these data meet the required conditions for one-way ANOVA.
2. Because we don't know the details of how the cashiers made their sample selection, we cannot know if the sample was truly random or independent. We will assume that the sample data were properly collected. Histograms suggest normality.
[Histogram: Winery Purchases for Customers Under 30 Years of Age (x-axis: Value of Purchase; y-axis: Number of Purchases)]
[Histogram: Winery Purchases for Customers Aged 30-50 (x-axis: Value of Purchase; y-axis: Number of Purchases)]
[Histogram: Winery Purchases for Customers Over 50 Years of Age (x-axis: Value of Purchase; y-axis: Number of Purchases)]
The largest variance is 652.9, and the smallest is 555.1, so clearly the sample variances are fairly close in value. We will assume that the population variances are approximately equal. These data appear to meet the requirements for one-way ANOVA.
3. We will presume that the college collected the sample data appropriately, so the data are independent and truly random. The histograms suggest normality.
[Histogram: Annual Salaries of Marketing Graduates (x-axis: Annual Salary; y-axis: Number of Graduates)]
[Histogram: Annual Salaries of Accounting Graduates (x-axis: Annual Salary; y-axis: Number of Graduates)]
[Histogram: Annual Salaries of Human Resources Graduates (x-axis: Annual Salary; y-axis: Number of Graduates)]
[Histogram: Annual Salaries of General Business Graduates (x-axis: Annual Salary; y-axis: Number of Graduates)]
The largest variance is 159,729,974, and the smallest is 70,826,421. The ratio of the largest to the smallest is about 2.3, which meets the requirement (less than four). These data appear to meet the requirements for one-way ANOVA.
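The equal-variance check above is a single ratio, so it is easy to script. The following is a minimal sketch, assuming the "largest variance less than four times the smallest" rule quoted in the text; the function names are hypothetical, and the four variances are the sample values reported for the salary data.

```python
def variance_ratio(variances):
    """Return the ratio of the largest to the smallest sample variance."""
    return max(variances) / min(variances)


def variances_roughly_equal(variances, max_ratio=4.0):
    """Apply the rule of thumb: largest variance less than max_ratio times the smallest."""
    return variance_ratio(variances) < max_ratio


# Sample variances for Marketing, Accounting, Human Resources, General Business.
salary_variances = [159_729_973.68, 70_826_421.05, 116_576_842.11, 76_859_236.84]

ratio = variance_ratio(salary_variances)
print(f"ratio = {ratio:.2f}, rule satisfied: {variances_roughly_equal(salary_variances)}")
```

The threshold is a parameter because the exercises in this chapter quote the rule with different cut-offs in different places.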
4. It appears the data are randomly selected, and independent. The data sets are too small for histograms, but stem-and-leaf displays suggest normality.
[Stem-and-leaf displays: Route 1, Route 2, Route 3]
The largest variance is 94, the smallest is 67, for a ratio of largest-to-smallest of about 1.4. This is within the accepted range, so we will assume the population variances are approximately equal. These data appear to meet the requirements for one-way ANOVA.
5. The histograms appear approximately normal. We have to be a bit cautious about assuming these are random samples. For example, one class may be mostly Accounting students, one may be mostly Marketing students, etc. The students who have selected these programs may have different levels of interest and aptitudes for statistics. We will assume that the classes are approximately randomly selected, in the absence of other information, but should note the caution. The largest variance is not much larger than the smallest variance, so we will assume the population variances are approximately equal.
Develop Your Skills 11.2
6. $H_0$: $\mu_1 = \mu_2 = \mu_3$
$H_1$: At least one $\mu$ differs from the others.
$\alpha = 0.05$
$n_T = 85$, $n_1 = 27$, $n_2 = 30$, $n_3 = 28$, $k = 3$
$\bar{x}_1 = 50.5556$, $\bar{x}_2 = 56.6$, $\bar{x}_3 = 74.3214$
$s_1^2 = 257.1795$, $s_2^2 = 478.7310$, $s_3^2 = 333.5595$
SSbetween = 8475.2497, SSwithin = 29,575.9738
We have already checked for normality and equality of variances.
$$F = \frac{MS_{between}}{MS_{within}} = \frac{SS_{between}/(k-1)}{SS_{within}/(n_T-k)} = \frac{8475.2497/2}{29575.9738/82} = \frac{4237.6249}{360.6826} = 11.749$$
The F-distribution has 2, 82 degrees of freedom. The closest we can come in the table is 2, 80. We see that the p-value is < 1% (Excel provides a p-value of 0.00003). Reject H0. There is sufficient evidence to conclude that at least one of the locations has a different average number of daily passersby than the others. The Excel output for this data set is shown below.
Anova: Single Factor

SUMMARY
Groups      Count  Sum   Average  Variance
Location 1  27     1365  50.5556  257.1794872
Location 2  30     1698  56.6000  478.7310
Location 3  28     2081  74.3214  333.5595238

ANOVA
Source of Variation  SS          df  MS           F       P-value
Between Groups       8475.2497   2   4237.6249    11.749  3.26E-05
Within Groups        29575.9738  82  360.6826074
Total                38051.2235  84
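The sums of squares and the F-score in this output can be rebuilt from the summary rows alone (counts, sums, and variances). The sketch below assumes the usual one-way ANOVA definitions; the function name is hypothetical, and the inputs are the Location 1 to 3 figures from the Excel summary.

```python
def one_way_anova_from_summary(counts, means, variances):
    """Return (ss_between, ss_within, f_stat) from per-group summary statistics."""
    n_total = sum(counts)
    k = len(counts)
    # Grand mean is the count-weighted average of the group means.
    grand_mean = sum(n * m for n, m in zip(counts, means)) / n_total
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(counts, means))
    # Each sample variance times (n - 1) gives that group's within-group sum of squares.
    ss_within = sum((n - 1) * v for n, v in zip(counts, variances))
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ss_between, ss_within, ms_between / ms_within


counts = [27, 30, 28]
means = [1365 / 27, 1698 / 30, 2081 / 28]  # Sum / Count for each location
variances = [257.1794872, 478.7310, 333.5595238]

ss_between, ss_within, f_stat = one_way_anova_from_summary(counts, means, variances)
print(f"SSbetween = {ss_between:.4f}, SSwithin = {ss_within:.4f}, F = {f_stat:.3f}")
```

Working from sums rather than rounded averages keeps the result close to Excel's figures.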
7. $H_0$: $\mu_1 = \mu_2 = \mu_3$
$H_1$: At least one $\mu$ differs from the others.
$\alpha = 0.05$
$n_T = 150$, $n_1 = 50$, $n_2 = 50$, $n_3 = 50$, $k = 3$
$\bar{x}_1 = 77.5684$, $\bar{x}_2 = 119.6708$, $\bar{x}_3 = 132.4674$
$s_1^2 = 652.9145$, $s_2^2 = 555.0899$, $s_3^2 = 625.7846$
SSbetween = 82504.4210, SSwithin = 89855.6606
We have already checked for normality and equality of variances.
F = 67.5
The F-distribution has 2, 147 degrees of freedom. Excel provides a p-value of approximately zero. Reject $H_0$. There is sufficient evidence to conclude that customers in different age groups make different average purchases.
8. $H_0$: $\mu_1 = \mu_2 = \mu_3 = \mu_4$
$H_1$: At least one $\mu$ differs from the others.
$\alpha = 0.025$
$n_T = 80$, $n_1 = 20$, $n_2 = 20$, $n_3 = 20$, $n_4 = 20$, $k = 4$
$\bar{x}_1 = 51{,}395$, $\bar{x}_2 = 71{,}170$, $\bar{x}_3 = 56{,}100$, $\bar{x}_4 = 53{,}885$
$s_1^2 = 159{,}729{,}973.68$, $s_2^2 = 70{,}826{,}421.05$, $s_3^2 = 116{,}576{,}842.11$, $s_4^2 = 76{,}859{,}236.84$
SSbetween = 4,750,850,500, SSwithin = 8,055,857,000
We have already checked for normality and equality of variances.
F = 14.9
The F-distribution has 3, 76 degrees of freedom. Excel provides a p-value of approximately zero. Reject $H_0$. There is sufficient evidence to conclude that at least one of the program streams had an average salary for graduates that differs from that of the other program streams.
Anova: Single Factor

SUMMARY
Groups            Count  Sum      Average  Variance
Marketing         20     1027900  51395    159729973.68
Accounting        20     1423400  71170    70826421.05
Human Resources   20     1122000  56100    116576842.11
General Business  20     1077700  53885    76859236.84

ANOVA
Source of Variation  SS           df  MS        F            P-value
Between Groups       4750850500   3   1.58E+09  14.94004664  9.77E-08
Within Groups        8055857000   76  1.06E+08
Total                12806707500  79
9. $H_0$: $\mu_1 = \mu_2 = \mu_3$
$H_1$: At least one $\mu$ differs from the others.
$\alpha = 0.05$
$n_T = 30$, $n_1 = 10$, $n_2 = 10$, $n_3 = 10$, $k = 3$
$\bar{x}_1 = 47$, $\bar{x}_2 = 34.6$, $\bar{x}_3 = 48.7$
We have already checked for normality and equality of variances.
F = 7.4
The F-distribution has 2, 27 degrees of freedom. Excel provides a p-value of 0.0027. Reject $H_0$. There is sufficient evidence to conclude that the average commuting time for at least one of the routes is different from the others. The Excel output is shown below.
Anova: Single Factor

SUMMARY
Groups   Count  Sum  Average  Variance
Route 1  10     470  47       78.44444
Route 2  10     346  34.6     67.15556
Route 3  10     487  48.7     94.01111

ANOVA
Source of Variation  SS          df  MS        F         P-value
Between Groups       1184.86667  2   592.4333  7.417436  0.002708
Within Groups        2156.5      27  79.87037
Total                3341.36667  29
10. $H_0$: $\mu_1 = \mu_2 = \mu_3$
$H_1$: At least one $\mu$ differs from the others.
$\alpha = 0.05$
$n_T = 135$, $n_1 = 45$, $n_2 = 45$, $n_3 = 45$, $k = 3$
$\bar{x}_1 = 70.1111$, $\bar{x}_2 = 56.6889$, $\bar{x}_3 = 54.0667$
We have already checked for normality and equality of variances.
F = 15.2
The F-distribution has 2, 132 degrees of freedom. Excel provides a p-value of approximately zero. Reject $H_0$. There is sufficient evidence to conclude that differences in the use of the online software are associated with differences in final grades. We should be cautious about interpreting the results: although there is evidence of a difference in the average grades, we cannot necessarily identify the differences in the use of the online software as the cause. There are many potential confounding factors, that is, other factors which could have an effect on the final grades.
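Develop Your Skills 11.3
11. Completed Excel templates are shown below.
For locations 1 and 3:
Tukey-Kramer Confidence Interval
Was the null hypothesis rejected in the ANOVA test? yes
xbar i   xbar j   ni  nj  q (from Appendix 7)  MSwithin    Upper Confidence Limit  Lower Confidence Limit
50.5556  74.3214  27  28  3.4                  360.682607  -11.4505171             -36.0812289
For locations 2 and 3:
Was the null hypothesis rejected in the ANOVA test? yes
xbar i   xbar j   ni  nj  q (from Appendix 7)  MSwithin    Upper Confidence Limit  Lower Confidence Limit
56.6000  74.3214  30  28  3.4                  360.682607  -5.72364915             -29.719208
For locations 1 and 2:
Was the null hypothesis rejected in the ANOVA test? yes
xbar i   xbar j   ni  nj  q (from Appendix 7)  MSwithin    Upper Confidence Limit  Lower Confidence Limit
50.5556  56.6000  27  30  3.4                  360.682607  6.06771106              -18.1565999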
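The template limits above can be reproduced with a few lines of code. This is a minimal sketch, assuming the template implements the standard Tukey-Kramer interval, (xbar i - xbar j) plus or minus q times the square root of (MSwithin / 2)(1/ni + 1/nj); the function name is hypothetical, and q is still read from the table by hand.

```python
import math


def tukey_kramer_interval(xbar_i, xbar_j, n_i, n_j, q, ms_within):
    """Return (lower, upper) limits for the difference xbar_i - xbar_j."""
    diff = xbar_i - xbar_j
    margin = q * math.sqrt((ms_within / 2) * (1 / n_i + 1 / n_j))
    return diff - margin, diff + margin


# Locations 1 and 3: means taken as Sum / Count from the ANOVA summary.
lower, upper = tukey_kramer_interval(1365 / 27, 2081 / 28, 27, 28, 3.4, 360.682607)
print(f"({lower:.4f}, {upper:.4f})")

# An interval that excludes zero indicates a significant difference.
print("significant" if lower > 0 or upper < 0 else "not significant")
```

The same call with the other pairs' means and sample sizes gives the remaining two templates.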
The first two confidence intervals do not contain zero, so it appears that the average number of people passing by location 3 is greater than at the other two locations.
12. Completed Excel templates are shown below (to save space, the row checking for rejection of the null hypothesis in ANOVA is not shown).
For under 30 and over 50:
Tukey-Kramer Confidence Interval
xbar i  xbar j   ni  nj  q (from Appendix 7)  MSwithin     Upper Confidence Limit  Lower Confidence Limit
77.568  132.467  50  50  3.36                 611.2629973  -43.15088123            -66.64711877
For under 30 and 30-50:
Tukey-Kramer Confidence Interval
xbar i  xbar j   ni  nj  q (from Appendix 7)  MSwithin     Upper Confidence Limit  Lower Confidence Limit
77.568  119.671  50  50  3.36                 611.2629973  -30.3542812             -53.8505188
For 30-50 and over 50:
xbar i   xbar j   ni  nj  q (from Appendix 7)  MSwithin     Upper Confidence Limit  Lower Confidence Limit
119.671  132.467  50  50  3.36                 611.2629973  -1.04848123             -24.5447188
None of these confidence intervals contains zero. Certainly the highest average purchase is with those over 50.
13. Completed Excel templates are shown below (to save space, the row checking for rejection of the null hypothesis in ANOVA is not shown).
Marketing and Accounting:
xbar i     xbar j     ni  nj  q (from Appendix 7)  MSwithin           Upper Confidence Limit  Lower Confidence Limit
51395.000  71170.000  20  20  3.74                 105998118.4210530  -11164.94982            -28385.05018
Accounting and General:
xbar i     xbar j     ni  nj  q (from Appendix 7)  MSwithin           Upper Confidence Limit  Lower Confidence Limit
71170.000  53885.000  20  20  3.74                 105998118.4210530  25895.05018             8674.949822
Accounting and Human Resources:
xbar i     xbar j     ni  nj  q (from Appendix 7)  MSwithin           Upper Confidence Limit  Lower Confidence Limit
71170.000  56100.000  20  20  3.74                 105998118.4210530  23680.05018             6459.949822
Marketing and Human Resources:
xbar i     xbar j     ni  nj  q (from Appendix 7)  MSwithin           Upper Confidence Limit  Lower Confidence Limit
51395.000  56100.000  20  20  3.74                 105998118.4210530  3905.050178             -13315.05018
At this point, no further comparisons are necessary. Since this interval contains zero, there does not appear to be a significant difference between the average salaries of Marketing graduates and Human Resources graduates. The differences between the sample means for the remaining pairs are smaller than for this pair, and so we know there will not be a significant difference for those pairs.
To summarize: We have 95% confidence that the interval
(-$28,385.05, -$11,164.95) contains the average difference in the salaries of Marketing graduates, compared to Accounting graduates (in other words, the average salary of Accounting graduates is likely at least $11,164.95 higher);
($8,674.95, $25,895.05) contains the average difference in the salaries of Accounting graduates, compared to General Business graduates;
($6,459.95, $23,680.05) contains the average difference in the salaries of Accounting graduates, compared to Human Resources graduates.
The differences between the average salaries of Human Resources, General Business, and Marketing graduates are not significant.
14. Because of the balanced design, these calculations simplify to:
$$(\bar{x}_i - \bar{x}_j) \pm q\text{-score}\sqrt{\frac{MS_{within}}{n}} = (\bar{x}_i - \bar{x}_j) \pm 3.49\sqrt{\frac{79.8703704}{10}} = (\bar{x}_i - \bar{x}_j) \pm 9.86321$$
For route 2 and route 3:
$(34.6 - 48.7) \pm 9.86321 = -14.1 \pm 9.86321 = (-23.96, -4.24)$
For route 1 and route 2:
$(47 - 34.6) \pm 9.86321 = 12.4 \pm 9.86321 = (2.54, 22.26)$
For route 1 and route 3:
$(47 - 48.7) \pm 9.86321 = -1.7 \pm 9.86321 = (-11.56, 8.16)$
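The balanced-design shortcut used in exercise 14 can be checked against the general Tukey-Kramer margin of error. The general form on the left is an assumption taken from the standard Tukey-Kramer procedure (it is consistent with the template values in exercise 11), not a formula quoted from this section. Setting $n_i = n_j = n$ gives:

```latex
\[
q\sqrt{\frac{MS_{within}}{2}\left(\frac{1}{n_i}+\frac{1}{n_j}\right)}
  = q\sqrt{\frac{MS_{within}}{2}\cdot\frac{2}{n}}
  = q\sqrt{\frac{MS_{within}}{n}}
  = 3.49\sqrt{\frac{79.8703704}{10}}
  \approx 9.86321
\]
```

Every pair of routes therefore shares the same margin of error, which is why only the differences in sample means change from one interval to the next.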
Route 2 would be the recommended route, since its average commuting time is significantly lower than that of each of the other two routes.
15. We have to be careful NOT to answer this question merely by inspection! First we recall that the F-test for ANOVA indicated a rejection of the null hypothesis. We have sample evidence that the population means are not all the same. The completed Excel templates are shown below.
For assigned quizzes and sample tests only:
Was the null hypothesis rejected in the ANOVA test? yes
xbar i   xbar j   ni  nj  q (from Appendix 7)  MSwithin    Upper Confidence Limit  Lower Confidence Limit
70.1111  54.0667  45  45  3.36                 218.900673  23.455099               8.63378989
We have 95% confidence that the interval (8.6, 23.5) contains the amount by which the average mark for all those who used the online software for assigned quizzes exceeds the average mark for all those who used sample tests only. Thus it appears that the average mark is at least 8.6 percentage points higher for those who use the online software for assigned quizzes.
For assigned quizzes for marks, and quizzes for no marks:
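Was the null hypothesis rejected in the ANOVA test? yes
xbar i   xbar j   ni  nj  q (from Appendix 7)  MSwithin    Upper Confidence Limit  Lower Confidence Limit
70.1111  56.6889  45  45  3.36                 218.900673  20.8328768              6.01156767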
Once again, it appears that the average marks are higher when the online software is used for assigned quizzes for marks, compared with quizzes for no marks. We have 95% confidence that the interval (6.0, 20.8) contains the amount by which the average marks are higher when the online software is used for assigned quizzes for marks. We cannot conclude that there is a difference in the average marks when the online software is used for quizzes (no marks) or sample tests only. The confidence interval shown below contains zero.
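Was the null hypothesis rejected in the ANOVA test? yes
xbar i   xbar j   ni  nj  q (from Appendix 7)  MSwithin    Upper Confidence Limit  Lower Confidence Limit
56.6889  54.0667  45  45  3.36                 218.900673  10.0328768              -4.78843233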
We have evidence that assigning quizzes for marks results in the best average marks for students. However, as we cautioned before, we cannot be certain of the cause-and-effect relationship here, because there are many potentially confounding variables.
Chapter Review Exercises
1. The histograms appear approximately normal, although there is some skewness in each one. However, with the large sample sizes, it is not unreasonable to assume the normality requirements are met.
2. The largest variance is 590.65, and the smallest is 370.02. The ratio of the largest to the smallest is not above 4, so it is reasonable to assume that population variances are approximately equal.
3. The missing values are shown below in bold type.

SUMMARY
Groups    Count  Sum   Average      Variance
Class #1  95     5840  61.47368421  370.0179171
Class #2  95     5088  53.55789474  590.6535274
Class #3  95     6075  63.94736842  415.5823068

ANOVA
Source of Variation  SS           df   MS           F
Between Groups       5596.133333  2    2798.066667  6.099311258
Within Groups        129367.8526  282  458.7512505
Total                134963.986   284

4. The appropriate F-distribution has 2, 282 degrees of freedom. We refer to the area in the F table for 2, 120 degrees of freedom and see that an F-score of 6.1 has a p-value less than 0.010. Excel provides a more accurate value of 0.0026.
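When the numerator has exactly 2 degrees of freedom, as it does here, the upper-tail area of the F-distribution has a simple closed form, so the p-value can be checked without a table. The sketch below relies on that special case only (it does not apply to other numerator degrees of freedom, where a statistics package would be needed); the function name is hypothetical.

```python
def f_upper_tail_two_numerator_df(f_stat, df_within):
    """P(F > f_stat) for an F-distribution with (2, df_within) degrees of freedom.

    Valid only when the numerator degrees of freedom equal 2.
    """
    return (1 + 2 * f_stat / df_within) ** (-df_within / 2)


# Review exercise 4: F = 6.099311258 with 2, 282 degrees of freedom.
p_review = f_upper_tail_two_numerator_df(6.099311258, 282)

# Develop Your Skills exercise 6: F = 11.749 with 2, 82 degrees of freedom.
p_locations = f_upper_tail_two_numerator_df(11.749, 82)

print(f"{p_review:.4f}  {p_locations:.2e}")
```

Both results agree with the Excel p-values quoted in the solutions (0.0026 and 3.26E-05) to the precision shown there.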
5. Because of the balanced design, these calculations simplify to:
$$(\bar{x}_i - \bar{x}_j) \pm q\text{-score}\sqrt{\frac{MS_{within}}{n}} = (\bar{x}_i - \bar{x}_j) \pm 3.31\sqrt{\frac{458.7512505}{95}} = (\bar{x}_i - \bar{x}_j) \pm 7.273691$$
For Class 2 and Class 3:
$(53.5579 - 63.9474) \pm 7.273691 = (-17.7, -3.1)$
We have 95% confidence that the interval (-17.7, -3.1) contains the difference between the average marks of Class 2 and Class 3. In other words, it appears that the average marks of those with the Class 3 professor are at least 3 percentage points higher than the average mark for those with the Class 2 professor.
For Class 1 and Class 2:
$(61.4737 - 53.5579) \pm 7.273691 = (0.6, 15.2)$
We have 95% confidence that the interval (0.6, 15.2) contains the difference between the average marks of Class 1 and Class 2. In other words, it appears that the average marks of those with the Class 1 professor are at least 0.6 percentage points higher than the average mark for those with the Class 2 professor.
For Class 1 and Class 3:
$(61.4737 - 63.9474) \pm 7.273691 = (-9.7, 4.8)$
In this case, the interval contains zero, and so there does not appear to be a significant difference between the average marks of those with the Class 1 professor and those with the Class 3 professor.
From these comparisons, it appears that the average marks are lower for the Class 2 professor's classes, and so this class should be avoided. There is no significant difference between the average marks for Class 1 and Class 3. The choice should then be: any professor but the one who led Class 2.
However, this is not a valid method of choosing classes, because there could be many explanations for why the Class 2 marks were significantly lower. It could have to do with the teacher's expertise and evaluation methods. But it could also have arisen because of other factors: the students in Class 2 might have been less well prepared, they may have worked more, or had family responsibilities that prevented them from studying, the class times might have been inconvenient, etc.
6. The conditions for ANOVA are not met, given the information in these three samples. The distribution of monthly balances for Mastercard owners is quite skewed to the left. The distribution of monthly balances for American Express owners is quite skewed to the right. As well, the variance of the American Express data is more than four times as large as the variance for the Mastercard data. It would not be appropriate to use ANOVA techniques in this case. The Kruskal-Wallis test could be used to compare these samples and draw conclusions about the populations (this technique is not covered in this text).
7. The requirement for equal variances is met. The largest variance is 14.757, which is only 2.3 times as large as the smallest variance, which is 6.314. The missing values are shown below, in bold type.

SUMMARY
Groups      Count  Sum  Average   Variance
Employee 1  35     404  11.54286  6.314286
Employee 2  37     462  12.48649  14.75676
Employee 3  32     357  11.15625  10.32964
Employee 4  42     377  8.97619   13.536

ANOVA
Source of Variation  SS        df   MS        F
Between Groups       264.6295  3    88.20984  7.726613
Within Groups        1621.124  142  11.41637
Total                1885.753  145

8. The F-distribution will have 3, 142 degrees of freedom. The closest we can come in the table is 3, 120. The table entry for an upper-tail area of 0.01 is 3.95, which is smaller than our F-score of 7.73, and so we know that the p-value is < 0.01. At the 5% level of significance, the data do suggest that there are differences in the average number of minutes each employee spends with a customer before making a sale.
9. The completed Excel templates are shown below.
Employee 4 and Employee 2:
Was the null hypothesis rejected in the ANOVA test? yes
xbar i      xbar j      ni  nj  q (from Appendix 7)  MSwithin    Upper Confidence Limit  Lower Confidence Limit
8.97619048  12.4864865  42  37  3.68                 11.4163655  -1.52792567             -5.49266635
We have 95% confidence that the interval (-5.5, -1.5) contains the number of minutes by which the average time spent with customers before making a sale for Employee 4 differs from the average time spent by Employee 2. In other words, we expect the average time spent by Employee 4 is at least 1.5 minutes less than Employee 2.
Employee 4 and Employee 1:
Tukey-Kramer Confidence Interval
Was the null hypothesis rejected in the ANOVA test? yes
xbar i      xbar j      ni  nj  q (from Appendix 7)  MSwithin    Upper Confidence Limit  Lower Confidence Limit
8.97619048  11.5428571  42  35  3.68                 11.4163655  -0.55440966             -4.57892367
We have 95% confidence that the interval (-4.5, -0.5) contains the number of minutes by which the average time spent with customers before making a sale for Employee 4 differs from the average time spent by Employee 1. In other words, we expect the average time spent by Employee 4 is at least 0.5 minutes less than Employee 1.
Employee 4 and Employee 3:
Was the null hypothesis rejected in the ANOVA test? yes
xbar i      xbar j    ni  nj  q (from Appendix 7)  MSwithin    Upper Confidence Limit  Lower Confidence Limit
8.97619048  11.15625  42  32  3.68                 11.4163655  -0.11699421             -4.24312484
We have 95% confidence that the interval (-4.2, -0.1) contains the number of minutes by which the average time spent with customers before making a sale for Employee 4 differs from the average time spent by Employee 3. In other words, we expect the average time spent by Employee 4 is at least 0.1 minutes less than Employee 3.
Employee 2 and Employee 3:
Was the null hypothesis rejected in the ANOVA test? yes
xbar i      xbar j    ni  nj  q (from Appendix 7)  MSwithin    Upper Confidence Limit  Lower Confidence Limit
12.4864865  11.15625  37  32  3.68                 11.4163655  3.45272548              -0.79225251
Since this interval contains zero, we conclude there is no significant difference between the average number of minutes Employees 2 and 3 spend with customers before making a sale. At this point, we can conclude that there are no significant differences between the average number of minutes Employees 1, 2 and 3 spend with customers before making a sale (the other differences in sample means among these three employees are smaller than the difference for Employees 2 and 3). This means that the average amount of time spent by Employee 4 is less than the average amount of time spent by the other employees.
10. Without further information, we cannot comment on whether the data are independent random samples. In practice, we should never take this on faith. We will assume this condition is met, with a caution that if it isn't, the results may not be reliable. Histograms of the sample data reassure us that the population data are probably normally distributed.
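[Histogram: Number of Factory Accidents, Training Method #1 (x-axis: Number of Accidents; y-axis: Frequency)]
[Histogram: panel title not recovered (x-axis: Number of Accidents; y-axis: Frequency)]
[Histogram: panel title not recovered (x-axis: Number of Accidents; y-axis: Frequency)]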
The largest variance is 16.5, which is less than twice as large as the smallest variance of 8.3, so we will assume the population variances are approximately equal. It appears that the conditions for one-way ANOVA are met.
11. The Excel output is shown below.

Anova: Single Factor

SUMMARY
Groups                                   Count  Sum  Average   Variance
Number of Accidents, Training Method #1  30     281  9.366667  8.309195
Number of Accidents, Training Method #2  30     331  11.03333  9.757471
Number of Accidents, Training Method #3  30     362  12.06667  16.47816

ANOVA
Source of Variation  SS        df  MS        F         P-value
Between Groups       111.3556  2   55.67778  4.835263  0.010205
Within Groups        1001.8    87  11.51494
Total                1113.156  89
$H_0$: $\mu_1 = \mu_2 = \mu_3$
$H_1$: At least one $\mu$ differs from the others.
$\alpha = 0.025$
$n_T = 90$, $n_1 = 30$, $n_2 = 30$, $n_3 = 30$
$\bar{x}_1 = 9.3667$, $\bar{x}_2 = 11.0333$, $\bar{x}_3 = 12.0667$
$s_1^2 = 8.3092$, $s_2^2 = 9.7575$, $s_3^2 = 16.4782$
SSbetween = 111.3556, SSwithin = 1001.8 (so MSbetween = 55.6778 and MSwithin = 11.5149)
We have already checked for normality and equality of variances.
F = 4.835
Excel provides a p-value of 0.010205. Reject $H_0$. There is sufficient evidence to conclude that the average number of factory accidents is different, according to the training method. However, we cannot be certain that it is the training method that caused these differences. There may be other factors involved.
12. Comparing training method #1 and #3:
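Was the null hypothesis rejected in the ANOVA test? yes
xbar i    xbar j    ni  nj  q (from Appendix 7)  MSwithin  Upper Confidence Limit  Lower Confidence Limit
9.366667  12.06667  30  30  3.4                  11.51494  -0.59356                -4.80644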
We have 95% confidence that the interval (-4.8, -0.6) contains the amount by which the average number of factory accidents for training method #1 differs from the average number of factory accidents for training method #3. In other words, it appears that training method #1 is associated with at least 0.6 fewer accidents, on average.
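Comparing training method #2 and #3:
Was the null hypothesis rejected in the ANOVA test? yes
xbar i  xbar j  ni  nj  q (from Appendix 7)  MSwithin  Upper Confidence Limit  Lower Confidence Limit
11.033  12.067  30  30  3.4                  11.515    1.0731                  -3.1398
Since this confidence interval contains zero, there is not a significant difference in the average number of factory accidents associated with training methods #2 and #3.
Comparing training method #1 and #2:
Was the null hypothesis rejected in the ANOVA test? yes
xbar i      xbar j      ni  nj  q (from Appendix 7)  MSwithin    Upper Confidence Limit  Lower Confidence Limit
9.36666667  11.0333333  30  30  3.4                  11.5149425  0.43977374              -3.77310707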
Since this confidence interval contains zero, there is no significant difference between the average number of accidents that are associated with training methods #1 and #2.
Training method #1 compares favourably to training method #3, but otherwise the differences are not significant. This suggests that training method #3 is the "worst". Again, we should be cautious, because there may be other explanatory factors.
13. Histograms of the sample data show significant skewness for some of the connection times. The data for early morning and late afternoon connection times appear skewed to the right, and the connection times for the evening are skewed to the left. Sample sizes are also relatively small. As a result, it would probably not be wise to proceed with ANOVA here, as the required conditions do not appear to be met.
[Histograms: Connection Times to Online Mutual Fund Account, one panel each for Late Afternoon, Evening, Early Afternoon, Early Morning, and Mid-Day (x-axis: Times in Seconds; y-axis: Frequency)]
14. We are told the data were collected on a random sample of days. Histograms are shown below.
[Histogram: Commuting Times, 6 a.m. Departure (x-axis: Number of Minutes; y-axis: Frequency)]
[Histogram: panel title not recovered (x-axis: Number of Minutes; y-axis: Frequency)]
[Histogram: panel title not recovered (x-axis: Number of Minutes; y-axis: Frequency)]
The histograms appear approximately normal. The Excel ANOVA output is shown below.
Anova: Single Factor

SUMMARY
Groups                                      Count  Sum   Average   Variance
Commuting Time in Minutes, 6 a.m. Departure  24     1097  45.70833  172.3895
Commuting Time in Minutes, 7 a.m. Departure  22     1002  45.54545  175.4026
Commuting Time in Minutes, 8 a.m. Departure  27     1063  39.37037  197.5499

ANOVA
Source of Variation  SS        df  MS        F         P-value   F crit
Between Groups       667.0442  2   333.5221  1.826131  0.168624  3.127676
Within Groups        12784.71  70  182.6387
Total                13451.75  72
We see from the output that the variances are fairly close in value, and certainly the largest is less than four times as large as the smallest. It appears that the conditions for ANOVA are met.
$H_0$: $\mu_1 = \mu_2 = \mu_3$
$H_1$: At least one $\mu$ differs from the others.
$\alpha = 0.05$
$n_T = 73$, $n_1 = 24$, $n_2 = 22$, $n_3 = 27$, $k = 3$
$\bar{x}_1 = 45.7$, $\bar{x}_2 = 45.5$, $\bar{x}_3 = 39.4$
We have already checked for normality and equality of variances.
F = 1.826 (below the critical value of 3.128)
Excel provides a p-value of 0.169. Fail to reject $H_0$. There is not enough evidence to conclude that the mean commuting times are not all equal.
15. First, check conditions. The data are not actually random samples, but could perhaps be considered to be (see the explanation in the exercise). Histograms of the data are shown below.
[Histogram: Classes Scheduled at 8 a.m. Thursday (x-axis: Final Grade; y-axis: Frequency)]
[Histogram: Classes Scheduled at 4 p.m. Friday (x-axis: Final Grade; y-axis: Frequency)]
[Histogram: Classes Scheduled at 2 p.m. Wednesday (x-axis: Final Grade; y-axis: Frequency)]
The histograms appear reasonably normal. The Excel ANOVA output is shown below.
Anova: Single Factor

SUMMARY
Groups                                         Count  Sum   Average   Variance
Marks of Class Scheduled for 8 a.m. Thursdays   20     1257  62.85     268.0289
Marks of Class Scheduled for 4 p.m. Fridays     23     1650  71.73913  305.2016
Marks of Class Scheduled for 2 p.m. Wednesdays  25     1691  67.64     263.99

ANOVA
Source of Variation  SS        df  MS        F         P-value
Between Groups       845.314   2   422.657   1.514253  0.22763
Within Groups        18142.74  65  279.1192
Total                18988.06  67
We can see from the output that the variances are sufficiently similar to allow us to assume the requirements for ANOVA are met (population variances approximately equal).
$H_0$: $\mu_1 = \mu_2 = \mu_3$
$H_1$: At least one $\mu$ differs from the others.
$\alpha = 0.01$
$n_T = 68$, $n_1 = 20$, $n_2 = 23$, $n_3 = 25$, $k = 3$
$\bar{x}_1 = 62.85$, $\bar{x}_2 = 71.74$, $\bar{x}_3 = 67.64$
We have already checked for normality and equality of variances.
F = 1.514
Excel provides a p-value of 0.23. Fail to reject $H_0$. There is not enough evidence to conclude that the mean grades for the students in classes for all three schedules are not equal. It does not appear that the scheduled time for classes affects the marks. However, we should be cautious, because there are many other factors that could be affecting marks. If we could control for them, we would be in a better position to investigate the effects of class schedule on student grades.
16. The first thing to note is that the data are not completely randomly selected. The information is provided by those who enter the contest. These customers may not represent all drugstore customers. Therefore, we must be cautious in interpreting the results. We would need more information about whether most customers entered the contest, before we could apply the results to all customers. As well, we have no way to be sure that the data are correct. Some people may have misrepresented their age or the value of their most recent purchase. With these caveats, we will proceed, but mostly for the practice! Histograms of the data appear approximately normal, and sample sizes, at 45, are fairly large.
[Histogram: Most Recent Drugstore Purchase for Customers Under 18 Years Old (x-axis: Amount of Purchase; y-axis: Frequency)]
[Histogram: Most Recent Drugstore Purchase for Customers 18-25 Years Old (x-axis: Amount of Purchase; y-axis: Frequency)]
[Three histograms: panel titles not recovered (x-axis: Amount of Purchase; y-axis: Frequency)]
[Histogram: Most Recent Drugstore Purchase for Customers 75 or More Years Old (x-axis: Amount of Purchase; y-axis: Frequency)]
Excel's ANOVA output is shown below.

Anova: Single Factor

SUMMARY
Groups       Count  Sum      Average   Variance
Under 18     45     1055.7   23.46     106.4338
18-25        45     1246.36  27.69689  83.09607
26-34        45     1567.82  34.84044  57.77471
35-49        45     1604.26  35.65022  147.776
50-74        45     1647.04  36.60089  121.0066
75 and over  45     1172.11  26.04689  78.81046

ANOVA
Source of Variation  SS        df   MS        F         P-value   F crit
Between Groups       7179.96   5    1435.992  14.48308  1.53E-12  2.248208
Within Groups        26175.49  264  99.1496
Total                33355.46  269
The largest variance is 147.8, and the smallest is 57.8, so the largest variance is less than four times the smallest variance. We will assume that the population variances are sufficiently equal to proceed with ANOVA.
17. $H_0$: $\mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5 = \mu_6$
$H_1$: At least one $\mu$ differs from the others.
$\alpha = 0.05$
$n_T = 270$, $n_1 = 45$, $n_2 = 45$, $n_3 = 45$, $n_4 = 45$, $n_5 = 45$, $n_6 = 45$, $k = 6$
$\bar{x}_1 = 23.46$, $\bar{x}_2 = 27.70$, $\bar{x}_3 = 34.84$, $\bar{x}_4 = 35.65$, $\bar{x}_5 = 36.60$, $\bar{x}_6 = 26.05$
$s_1^2 = 106.43$, $s_2^2 = 83.10$, $s_3^2 = 57.77$, $s_4^2 = 147.78$, $s_5^2 = 121.01$, $s_6^2 = 78.81$
SSbetween = 7179.961, SSwithin = 26175.49
We have already checked for normality and equality of variances.
F = 14.5
Excel provides a p-value of approximately zero. Reject $H_0$. There is enough evidence to conclude that the mean purchases of customers in different age groups are not all equal, when we consider the most recent purchases of those who entered the contest.
18. Because there are so many age groups in this data set, it is not as easy to see where the greatest differences in sample means are, simply by inspection. The easiest way to proceed is to create a table showing the differences in sample means. This is fairly easily constructed in Excel. See an example of such a table, below. Notice that the table shows the absolute value of the differences.

             Under 18  18-25  26-34  35-49  50-74   75 and over
Under 18     0
18-25        4.237     0.000
26-34        11.380    7.144  0.000
35-49        12.190    7.953  0.810  0.000
50-74        13.141    8.904  1.760  0.951  0.000
75 and over  2.587     1.650  8.794  9.603  10.554
By inspection of the table, we can see that we should start by comparing the purchases for customers under 18 and 50-74, then under 18 and 35-49, then under 18 and 26-34, and so on, working from the largest difference downward. We need the q-value for 6, 264 degrees of freedom. We will use the value for 6, 120 degrees of freedom, as the closest entry in Appendix 7.
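The ordering of comparisons described above can also be produced programmatically instead of by inspection. The following is a minimal sketch, assuming the group averages from the Excel summary output; the dictionary and function names are hypothetical.

```python
from itertools import combinations

# Sample means from the Excel ANOVA summary for the six age groups.
group_means = {
    "Under 18": 23.46,
    "18-25": 27.69689,
    "26-34": 34.84044,
    "35-49": 35.65022,
    "50-74": 36.60089,
    "75 and over": 26.04689,
}


def ranked_abs_differences(means):
    """Return (abs_difference, group_a, group_b) tuples, largest difference first."""
    pairs = [
        (abs(means[a] - means[b]), a, b)
        for a, b in combinations(means, 2)
    ]
    return sorted(pairs, reverse=True)


ranked = ranked_abs_differences(group_means)
for diff, group_a, group_b in ranked:
    print(f"{group_a} vs {group_b}: {diff:.3f}")
```

Working down this list and stopping at the first pair whose Tukey-Kramer interval contains zero is valid here because the design is balanced, so every pair shares the same margin of error.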
The completed templates are shown below.
Under 18 and 50-74:
Was the null hypothesis rejected in the ANOVA test? yes
xbar i  xbar j      ni  nj  q (from Appendix 7)  MSwithin    Upper Confidence Limit  Lower Confidence Limit
23.46   36.6008889  45  45  4.1                  99.1496022  -7.05501305             -19.2267647
We have 95% confidence that the interval (-$19.23, -$7.06) contains the amount by which the average most recent purchase of customers under 18 differs from those aged 50-74 (for those who entered the contest).
Under 18 and 35-49:
Tukey-Kramer Confidence Interval
Was the null hypothesis rejected in the ANOVA test? yes
xbar i  xbar j      ni  nj  q (from Appendix 7)  MSwithin    Upper Confidence Limit  Lower Confidence Limit
23.46   35.6502222  45  45  4.1                  99.1496022  -6.10434638             -18.2760981
We have 95% confidence that the interval (-$18.28, -$6.10) contains the amount by which the average most recent purchase of customers under 18 differs from those aged 35-49 (for those who entered the contest).
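Under 18 and 26-34:
Was the null hypothesis rejected in the ANOVA test? yes
xbar i  xbar j      ni  nj  q (from Appendix 7)  MSwithin    Upper Confidence Limit  Lower Confidence Limit
23.46   34.8404444  45  45  4.1                  99.1496022  -5.2945686              -17.4663203
We have 95% confidence that the interval (-$17.47, -$5.29) contains the amount by which the average most recent purchase of customers under 18 differs from those aged 26-34 (for those who entered the contest).
75 and over and 50-74:
Was the null hypothesis rejected in the ANOVA test? yes
xbar i      xbar j      ni  nj  q (from Appendix 7)  MSwithin    Upper Confidence Limit  Lower Confidence Limit
26.0468889  36.6008889  45  45  4.1                  99.1496022  -4.46812416             -16.6398758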
We have 95% confidence that the interval (-$16.64, -$4.47) contains the amount by which the average most recent purchase of customers 75 and over differs from those aged 50-74 (for those who entered the contest).
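75 and over and 35-49:
Was the null hypothesis rejected in the ANOVA test? yes
xbar i      xbar j      ni  nj  q (from Appendix 7)  MSwithin    Upper Confidence Limit  Lower Confidence Limit
26.0468889  35.6502222  45  45  4.1                  99.1496022  -3.51745749             -15.6892092
We have 95% confidence that the interval (-$15.69, -$3.52) contains the amount by which the average most recent purchase of customers 75 and over differs from those aged 35-49 (for those who entered the contest).
50-74 and 26-34:
Was the null hypothesis rejected in the ANOVA test? yes
xbar i      xbar j      ni  nj  q (from Appendix 7)  MSwithin    Upper Confidence Limit  Lower Confidence Limit
36.6008889  34.8404444  45  45  4.1                  99.1496022  7.84632028              -4.3254314
This confidence interval contains zero, so there is not a significant difference between the average purchases of these two age groups. Because the design is balanced, every comparison has the same margin of error (about $6.09), so a pair differs significantly only if its sample means differ by more than this amount. From the table of differences, this is also true for 18-25 and 50-74 (8.904), 75 and over and 26-34 (8.794), 18-25 and 35-49 (7.953), and 18-25 and 26-34 (7.144). For all the remaining comparisons, there is not a significant difference in the average purchases (for those who entered the contest).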
19. This question has already been answered, in the discussion of exercise 16. We proceeded, for practice, but these data do not represent a random sample of data about the drugstore customers.

20. Generally speaking, these data do not meet the requirements for ANOVA. The data sets are non-normal, and quite significantly skewed. The histograms for Canada-wide data are shown below.
[Histograms (vertical axis: Frequency; horizontal axis: Wages and Salaries) for three groups: Canadians with Secondary School Graduation Certificate as Highest Level of Schooling; Canadians with Trades Certificate or Diploma as Highest Level of Schooling; Canadians with College Certificate or Diploma as Highest Level of Schooling.]
21. The professor has selected random samples, from large classes, and there is no immediately obvious reason why the observations would not be independent. The sample data appear to be approximately normally distributed, as the histograms below illustrate.
[Histograms (vertical axis: Frequency; horizontal axis: Final Mark in Microeconomics), one for each group of average weekly working hours. Panel titles include: Students Working <5 Hours per Week on Average; Students Working 5 - <10 Hours per Week on Average; Students Working 20 or More Hours per Week on Average.]
Excel's ANOVA output is shown below.

Anova: Single Factor

SUMMARY
Groups                         Count   Sum     Average   Variance
Less Than 5 Hours Per Week     32      2076    64.88     355.40
5 - <10 Hours Per Week         34      2217    65.21     251.44
10 - <15 Hours Per Week        36      1985    55.14     305.32
15 - <20 Hours Per Week        27      1557    57.67     284.00
20 or More Hours Per Week      24      1261    52.54     256.43

ANOVA
Source of Variation   SS         df    MS         F          P-value
Between Groups        3968.56    4     992.1399   3.392455   0.010938
Within Groups         43283.32   148   292.4549
Total                 47251.88   152
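The Excel figures can be cross-checked from the group counts, sums, and variances alone. The sketch below assumes the standard one-way ANOVA sums of squares and uses only the summary values reported by Excel. Because the variances are rounded to two decimals, SSwithin agrees with Excel only approximately. The variable names are illustrative.

```python
# (count, sum, variance) for each group, taken from the Excel summary
groups = [
    (32, 2076, 355.40),  # less than 5 hours per week
    (34, 2217, 251.44),  # 5 - <10 hours per week
    (36, 1985, 305.32),  # 10 - <15 hours per week
    (27, 1557, 284.00),  # 15 - <20 hours per week
    (24, 1261, 256.43),  # 20 or more hours per week
]

n_total = sum(n for n, _, _ in groups)
k = len(groups)
grand_mean = sum(s for _, s, _ in groups) / n_total

# Between-groups: weighted squared deviations of group means from the grand mean
ss_between = sum(n * (s / n - grand_mean) ** 2 for n, s, _ in groups)
# Within-groups: each sample variance times its degrees of freedom
ss_within = sum((n - 1) * var for n, _, var in groups)

ms_between = ss_between / (k - 1)
ms_within = ss_within / (n_total - k)
f_stat = ms_between / ms_within

print(f"SSbetween = {ss_between:.2f}, SSwithin = {ss_within:.2f}")
print(f"df = ({k - 1}, {n_total - k}), F = {f_stat:.2f}")
```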
The ANOVA output shows the largest variance as 355.40, and the smallest as 251.44, and so the largest variance is less than four times as large as the smallest. We will presume that the population variances are approximately equal.

H0: μ1 = μ2 = μ3 = μ4 = μ5
H1: At least one μ differs from the others.
α = 0.05
nT = 153, n1 = 32, n2 = 34, n3 = 36, n4 = 27, n5 = 24, k = 5
x̄1 = 64.88, x̄2 = 65.21, x̄3 = 55.14, x̄4 = 57.67, x̄5 = 52.54
s1² = 355.40, s2² = 251.44, s3² = 305.32, s4² = 284.00, s5² = 256.43
SSbetween = 3968.56, SSwithin = 43283.32
We have already checked for normality and equality of variances. F = 3.39. Excel provides a p-value of 0.011, which is less than 0.05. Reject H0. There is enough evidence to suggest that the mean marks are not all equal. Again, because there are so many possible comparisons, it is useful to calculate all differences in sample means, so we can see which is largest, second-largest, and so on. Such a summary table is shown below (absolute values of differences are shown).
Hours Per Week    Less Than 5   5 - <10    10 - <15   15 - <20   20 or More
Less Than 5       0
5 - <10           0.330882      0
10 - <15          9.736111      10.06699   0
15 - <20          7.208333      7.539216   2.527778   0
20 or More        12.33333      12.66422   2.597222   5.125      0
So, the first comparison will be the marks of students who work 20 or more hours a week and those who work 5 - <10 hours a week, then students who work 20 or more hours a week and those who work less than 5 hours a week, and so on. We need the q-value from Appendix 7 for 5, 148 degrees of freedom. Note that if we use the table value for 5, 120 degrees of freedom, we get the following result.

For the marks of those who work 20 or more hours a week, and those who work 5 - <10 hours a week:

Tukey-Kramer Confidence Interval
Was the null hypothesis rejected in the ANOVA test?   yes
xbar_i                     52.54
xbar_j                     65.21
n_i                        24
n_j                        34
q (from Appendix 7)        3.92
MSwithin                   292.454883
Upper Confidence Limit     -0.02647542
Lower Confidence Limit     -25.3019559
We have 95% confidence that the interval (-25.30, -0.03) contains the amount by which the average mark of students who work 20 or more hours per week differs from the average mark of those who work 5 - <10 hours per week. Note that although there appears to be a significant difference between the marks of those who work 20 or more hours a week, and those who work 5 - <10 hours a week, the size of the difference may be quite small.
For the marks of those who work 20 or more hours a week, and those who work <5 hours a week:
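Tukey-Kramer Confidence Interval
Was the null hypothesis rejected in the ANOVA test?   yes
xbar_i                     52.54
xbar_j                     64.88
n_i                        24
n_j                        32
q (from Appendix 7)        3.92
MSwithin                   292.454883
Upper Confidence Limit     0.46678284
Lower Confidence Limit     -25.1334495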
This confidence interval contains zero. For this and all remaining comparisons, there is not a significant difference in the average marks.
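The whole sequence of comparisons for exercise 21 can be sketched in a few lines. This is an illustration under stated assumptions, not the Excel template itself: group means are recomputed as sum / count from the ANOVA summary, q = 3.92 is the table value for 5, 120 degrees of freedom used above, and the helper names `tukey_kramer` and `step_down` are hypothetical. The loop mirrors the procedure followed in these solutions by taking pairs from the largest difference downward and stopping at the first interval that contains zero.

```python
from itertools import combinations
from math import sqrt

# (count, sum) for each group, from the Excel summary
groups = {
    "<5": (32, 2076),
    "5-<10": (34, 2217),
    "10-<15": (36, 1985),
    "15-<20": (27, 1557),
    "20+": (24, 1261),
}
MS_WITHIN = 292.454883  # from the ANOVA output
Q = 3.92                # table value for 5, 120 degrees of freedom


def tukey_kramer(mean_i, mean_j, n_i, n_j, q, ms_within):
    """Return (lower, upper) confidence limits for mean_i - mean_j."""
    diff = mean_i - mean_j
    margin = q * sqrt(ms_within / 2 * (1 / n_i + 1 / n_j))
    return diff - margin, diff + margin


def step_down(groups, q, ms_within):
    means = {name: total / n for name, (n, total) in groups.items()}
    # Put the group with the smaller mean first, so differences are negative
    pairs = [tuple(sorted(p, key=means.get)) for p in combinations(groups, 2)]
    # Largest absolute difference first
    pairs.sort(key=lambda p: means[p[1]] - means[p[0]], reverse=True)
    results = []
    for low, high in pairs:
        lower, upper = tukey_kramer(
            means[low], means[high], groups[low][0], groups[high][0], q, ms_within
        )
        significant = not (lower <= 0 <= upper)
        results.append((low, high, lower, upper, significant))
        if not significant:
            break  # stop at the first interval that contains zero
    return results


results = step_down(groups, Q, MS_WITHIN)
for low, high, lower, upper, significant in results:
    print(f"{low} vs {high}: ({lower:.2f}, {upper:.2f}) significant={significant}")
```

With unequal sample sizes the margin of error changes from pair to pair, so a smaller difference is not guaranteed to be non-significant once a larger one is. Dropping the `break` checks every pair.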
