
Probability and Statistics in Engineering

Fourth Edition
William W. Hines
Professor Emeritus
School of Industrial and Systems Engineering
Georgia Institute of Technology

Douglas C. Montgomery
Professor of Engineering and Statistics
Department of Industrial Engineering
Arizona State University

David M. Goldsman
Professor
School of Industrial and Systems Engineering
Georgia Institute of Technology

Connie M. Borror
Senior Lecturer
Department of Industrial Engineering
Arizona State University

WILEY

John Wiley & Sons, Inc.

Acquisitions Editor: Wayne Anderson
Associate Editor: Jenny Welter
Marketing Manager: Katherine Hepburn
Senior Production Editor: Valerie A. Vargas
Senior Designer: Dawn Stanley
Cover Image: Alfredo Pasieka/Photo Researchers
Production Management Services: Argosy Publishing

This book was set in 10/12 Times Roman by Argosy Publishing and printed and bound by Hamilton Printing. The cover was printed by Phoenix Color.

This book is printed on acid-free paper.

Copyright © 2003 John Wiley & Sons, Inc. All rights reserved.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, E-mail: PERMREQ@WILEY.COM.

To order books or for customer service, please call 1-800-CALL-WILEY (225-5945).


Library of Congress Cataloging-in-Publication Data:

Probability and statistics in engineering / William W. Hines ... [et al.]. - 4th ed.
    p. cm.
    Includes bibliographical references.
    1. Engineering--Statistical methods. I. Hines, William W.

TA340.H55 2002
519.2--dc21        2002026703

ISBN 0-471-24087-7 (cloth : acid-free paper)

Printed in the United States of America

10 9 8 7 6 5 4 3 2

PREFACE to the 4th Edition

This book is written for a first course in applied probability and statistics for undergraduate students in engineering, physical sciences, and management science curricula. We have found that the text can be used effectively as a two-semester sophomore- or junior-level course sequence, as well as a one-semester refresher course in probability and statistics for first-year graduate students.

The text has undergone a major overhaul for the fourth edition, especially with regard to many of the statistics chapters. The idea has been to make the book more accessible to a wide audience by including more motivational examples, real-world applications, and useful computer exercises. With the aim of making the course material easier to learn and easier to teach, we have also provided a convenient set of course notes, available on the Web site www.wiley.com/college/hines. For instructors adopting the text, the complete solutions are also available on a password-protected portion of this Web site.
Structurally speaking, we start the book off with probability theory (Chapter 1) and progress through random variables (Chapter 2), functions of random variables (Chapter 3), joint random variables (Chapter 4), discrete and continuous distributions (Chapters 5 and 6), and the normal distribution (Chapter 7). Then we introduce statistics and data description techniques (Chapter 8). The statistics chapters follow the same rough outline as in the previous edition, namely, sampling distributions (Chapter 9), parameter estimation (Chapter 10), hypothesis testing (Chapter 11), single- and multifactor design of experiments (Chapters 12 and 13), and simple and multiple regression (Chapters 14 and 15). Subsequent special-topics chapters include nonparametric statistics (Chapter 16), quality control and reliability engineering (Chapter 17), and stochastic processes and queueing theory (Chapter 18). Finally, there is an entirely new chapter on statistical techniques for computer simulation (Chapter 19), perhaps the first of its kind in this type of statistics text.
The chapters that have seen the most substantial evolution are Chapters 8-14. The discussion in Chapter 8 on descriptive data analysis is greatly enhanced over that of the previous edition. We also expanded the discussion on different types of interval estimation in Chapter 10. In addition, an emphasis has been placed on real-life computer data analysis examples. Throughout the book, we incorporated other structural changes. In all chapters, we included new examples and exercises, including numerous computer-based exercises.
A few words on Chapters 18 and 19. Stochastic processes and queueing theory arise naturally out of probability, and we feel that Chapter 18 serves as a good introduction to the subject, normally taught in operations research, management science, and certain engineering disciplines. Queueing theory has garnered a great deal of use in such diverse fields as telecommunications, manufacturing, and production planning. Computer simulation, the topic of Chapter 19, is perhaps the most widely used tool in operations research and management science, as well as in a number of physical sciences. Simulation marries all the tools of probability and statistics and is used in everything from financial analysis to factory control and planning. Our text provides what amounts to a simulation minicourse, covering the areas of Monte Carlo experimentation, random number and variate generation, and simulation output data analysis.


We are grateful to the following individuals for their help during the process of completing the current revision of the text. Christos Alexopoulos (Georgia Institute of Technology), Michael Caramanis (Boston University), David R. Clark (Kettering University), J. N. Hool (Auburn University), John S. Ramberg (University of Arizona), and Edward J. Williams (University of Michigan-Dearborn) served as reviewers and provided a great deal of valuable feedback. Beatriz Valdes (Argosy Publishing) did a wonderful job supervising the typesetting and page proofing of the text, and Jennifer Welter at Wiley provided great leadership at every turn. Everyone was certainly a pleasure to work with. Of course, we thank our families for their infinite patience and support throughout the endeavor.

Hines, Montgomery, Goldsman, and Borror

Contents

1. An Introduction to Probability 1
  1-1 Introduction 1
  1-2 A Review of Sets 2
  1-3 Experiments and Sample Spaces 5
  1-4 Events 8
  1-5 Probability Definition and Assignment 8
  1-6 Finite Sample Spaces and Enumeration 14
    1-6.1 Tree Diagram 14
    1-6.2 Multiplication Principle 14
    1-6.3 Permutations 15
    1-6.4 Combinations 16
    1-6.5 Permutations of Like Objects 19
  1-7 Conditional Probability 20
  1-8 Partitions, Total Probability, and Bayes' Theorem 25
  1-9 Summary 28
  1-10 Exercises 28

2. One-Dimensional Random Variables 33
  2-1 Introduction 33
  2-2 The Distribution Function 36
  2-3 Discrete Random Variables 38
  2-4 Continuous Random Variables 41
  2-5 Some Characteristics of Distributions 44
  2-6 Chebyshev's Inequality 48
  2-7 Summary 49
  2-8 Exercises 50

3. Functions of One Random Variable and Expectation 52
  3-1 Introduction 52
  3-2 Equivalent Events 52
  3-3 Functions of a Discrete Random Variable 54
  3-4 Continuous Functions of a Continuous Random Variable 55
  3-5 Expectation 58
  3-6 Approximations to E[H(X)] and V[H(X)] 62
  3-7 The Moment-Generating Function 65
  3-8 Summary 67
  3-9 Exercises 68

4. Joint Probability Distributions 71
  4-1 Introduction 71
  4-2 Joint Distribution for Two-Dimensional Random Variables 72
  4-3 Marginal Distributions 75
  4-4 Conditional Distributions 79
  4-5 Conditional Expectation 82
  4-6 Regression of the Mean 85
  4-7 Independence of Random Variables 86
  4-8 Covariance and Correlation 87
  4-9 The Distribution Function for Two-Dimensional Random Variables 91
  4-10 Functions of Two Random Variables 92
  4-11 Joint Distributions of Dimension n > 2 94
  4-12 Linear Combinations 96
  4-13 Moment-Generating Functions and Linear Combinations 99
  4-14 The Law of Large Numbers 99
  4-15 Summary 101
  4-16 Exercises 101

5. Some Important Discrete Distributions 106
  5-1 Introduction 106
  5-2 Bernoulli Trials and the Bernoulli Distribution 106
  5-3 The Binomial Distribution 108
    5-3.1 Mean and Variance of the Binomial Distribution 109
    5-3.2 The Cumulative Binomial Distribution 110
    5-3.3 An Application of the Binomial Distribution 111
  5-4 The Geometric Distribution 112
    5-4.1 Mean and Variance of the Geometric Distribution 113
  5-5 The Pascal Distribution 115
    5-5.1 Mean and Variance of the Pascal Distribution 115
  5-6 The Multinomial Distribution 116
  5-7 The Hypergeometric Distribution 117
    5-7.1 Mean and Variance of the Hypergeometric Distribution 118
  5-8 The Poisson Distribution 118
    5-8.1 Development from a Poisson Process 118
    5-8.2 Development of the Poisson Distribution from the Binomial 120
    5-8.3 Mean and Variance of the Poisson Distribution 120
  5-9 Some Approximations 122
  5-10 Generation of Realizations 123
  5-11 Summary 123
  5-12 Exercises 123

6. Some Important Continuous Distributions 128
  6-1 Introduction 128
  6-2 The Uniform Distribution 128
    6-2.1 Mean and Variance of the Uniform Distribution 129
  6-3 The Exponential Distribution 130
    6-3.1 The Relationship of the Exponential Distribution to the Poisson Distribution 131
    6-3.2 Mean and Variance of the Exponential Distribution 131
    6-3.3 Memoryless Property of the Exponential Distribution 133
  6-4 The Gamma Distribution 134
    6-4.1 The Gamma Function 134
    6-4.2 Definition of the Gamma Distribution 134
    6-4.3 Relationship Between the Gamma Distribution and the Exponential Distribution 135
    6-4.4 Mean and Variance of the Gamma Distribution 135
  6-5 The Weibull Distribution 137
    6-5.1 Mean and Variance of the Weibull Distribution 137
  6-6 Generation of Realizations 138
  6-7 Summary 139
  6-8 Exercises 139

7. The Normal Distribution 143
  7-1 Introduction 143
  7-2 The Normal Distribution 143
    7-2.1 Properties of the Normal Distribution 143
    7-2.2 Mean and Variance of the Normal Distribution 144
    7-2.3 The Normal Cumulative Distribution 145
    7-2.4 The Standard Normal Distribution 145
    7-2.5 Problem-Solving Procedure 146
  7-3 The Reproductive Property of the Normal Distribution 150
  7-4 The Central Limit Theorem 152
  7-5 The Normal Approximation to the Binomial Distribution 155
  7-6 The Lognormal Distribution 157
    7-6.1 Density Function 158
    7-6.2 Mean and Variance of the Lognormal Distribution 158
    7-6.3 Properties of the Lognormal Distribution 159
  7-7 The Bivariate Normal Distribution 160
  7-8 Generation of Normal Realizations 164
  7-9 Summary 165
  7-10 Exercises 165

8. Introduction to Statistics and Data Description 169
  8-1 The Field of Statistics 169
  8-2 Data 173
  8-3 Graphical Presentation of Data 173
    8-3.1 Numerical Data: Dot Plots and Scatter Plots 173
    8-3.2 Numerical Data: The Frequency Distribution and Histogram 175
    8-3.3 The Stem-and-Leaf Plot 178
    8-3.4 The Box Plot 179
    8-3.5 The Pareto Chart 181
    8-3.6 Time Plots 183
  8-4 Numerical Description of Data 183
    8-4.1 Measures of Central Tendency 183
    8-4.2 Measures of Dispersion 186
    8-4.3 Other Measures for One Variable 189
    8-4.4 Measuring Association 190
    8-4.5 Grouped Data 191
  8-5 Summary 192
  8-6 Exercises 193

9. Random Samples and Sampling Distributions 198
  9-1 Random Samples 198
    9-1.1 Simple Random Sampling from a Finite Universe 199
    9-1.2 Stratified Random Sampling of a Finite Universe 200
  9-2 Statistics and Sampling Distributions 201
    9-2.1 Sampling Distributions 202
    9-2.2 Finite Populations and Enumerative Studies 204
  9-3 The Chi-Square Distribution 205
  9-4 The t Distribution 208
  9-5 The F Distribution 211
  9-6 Summary 214
  9-7 Exercises 214

10. Parameter Estimation 216
  10-1 Point Estimation 216
    10-1.1 Properties of Estimators 217
    10-1.2 The Method of Maximum Likelihood 221
    10-1.3 The Method of Moments 224
    10-1.4 Bayesian Inference 226
    10-1.5 Applications to Estimation 227
    10-1.6 Precision of Estimation: The Standard Error 230
  10-2 Single-Sample Confidence Interval Estimation 232
    10-2.1 Confidence Interval on the Mean of a Normal Distribution, Variance Known 233
    10-2.2 Confidence Interval on the Mean of a Normal Distribution, Variance Unknown 236
    10-2.3 Confidence Interval on the Variance of a Normal Distribution 238
    10-2.4 Confidence Interval on a Proportion 239
  10-3 Two-Sample Confidence Interval Estimation 242
    10-3.1 Confidence Interval on the Difference between Means of Two Normal Distributions, Variances Known 242
    10-3.2 Confidence Interval on the Difference between Means of Two Normal Distributions, Variances Unknown 244
    10-3.3 Confidence Interval on μ1 - μ2 for Paired Observations 247
    10-3.4 Confidence Interval on the Ratio of Variances of Two Normal Distributions 248
    10-3.5 Confidence Interval on the Difference between Two Proportions 250
  10-4 Approximate Confidence Intervals in Maximum Likelihood Estimation 251
  10-5 Simultaneous Confidence Intervals 252
  10-6 Bayesian Confidence Intervals 252
  10-7 Bootstrap Confidence Intervals 253
  10-8 Other Interval Estimation Procedures 255
    10-8.1 Prediction Intervals 255
    10-8.2 Tolerance Intervals 257
  10-9 Summary 258
  10-10 Exercises 260

11. Tests of Hypotheses 266
  11-1 Introduction 266
    11-1.1 Statistical Hypotheses 266
    11-1.2 Type I and Type II Errors 267
    11-1.3 One-Sided and Two-Sided Hypotheses 269
  11-2 Tests of Hypotheses on a Single Sample 271
    11-2.1 Tests of Hypotheses on the Mean of a Normal Distribution, Variance Known 271
    11-2.2 Tests of Hypotheses on the Mean of a Normal Distribution, Variance Unknown 278
    11-2.3 Tests of Hypotheses on the Variance of a Normal Distribution 281
    11-2.4 Tests of Hypotheses on a Proportion 283
  11-3 Tests of Hypotheses on Two Samples 286
    11-3.1 Tests of Hypotheses on the Means of Two Normal Distributions, Variances Known 286
    11-3.2 Tests of Hypotheses on the Means of Two Normal Distributions, Variances Unknown 288
    11-3.3 The Paired t-Test 292
    11-3.4 Tests for the Equality of Two Variances 294
    11-3.5 Tests of Hypotheses on Two Proportions 296
  11-4 Testing for Goodness of Fit 300
  11-5 Contingency Table Tests 307
  11-6 Sample Computer Output 309
  11-7 Summary 312
  11-8 Exercises 312

12. Design and Analysis of Single-Factor Experiments: The Analysis of Variance 321
  12-1 The Completely Randomized Single-Factor Experiment 321
    12-1.1 An Example 321
    12-1.2 The Analysis of Variance 323
    12-1.3 Estimation of the Model Parameters 327
    12-1.4 Residual Analysis and Model Checking 330
    12-1.5 An Unbalanced Design 331
  12-2 Tests on Individual Treatment Means 332
    12-2.1 Orthogonal Contrasts 332
    12-2.2 Tukey's Test 335
  12-3 The Random-Effects Model 337
  12-4 The Randomized Block Design 341
    12-4.1 Design and Statistical Analysis 341
    12-4.2 Tests on Individual Treatment Means 344
    12-4.3 Residual Analysis and Model Checking 345
  12-5 Determining Sample Size in Single-Factor Experiments 347
  12-6 Sample Computer Output 348
  12-7 Summary 350
  12-8 Exercises 350

13. Design of Experiments with Several Factors 353
  13-1 Examples of Experimental Design Applications 353
  13-2 Factorial Experiments 355
  13-3 Two-Factor Factorial Experiments 359
    13-3.1 Statistical Analysis of the Fixed-Effects Model 359
    13-3.2 Model Adequacy Checking 364
    13-3.3 One Observation per Cell 364
    13-3.4 The Random-Effects Model 366
    13-3.5 The Mixed Model 367
  13-4 General Factorial Experiments 369
  13-5 The 2^k Factorial Design 373
    13-5.1 The 2^2 Design 373
    13-5.2 The 2^k Design for k >= 3 Factors 379
    13-5.3 A Single Replicate of the 2^k Design 386
  13-6 Confounding in the 2^k Design 390
  13-7 Fractional Replication of the 2^k Design 394
    13-7.1 The One-Half Fraction of the 2^k Design 394
    13-7.2 Smaller Fractions: The 2^(k-p) Fractional Factorial 400
  13-8 Sample Computer Output 403
  13-9 Summary 404
  13-10 Exercises 404

14. Simple Linear Regression and Correlation 409
  14-1 Simple Linear Regression 409
  14-2 Hypothesis Testing in Simple Linear Regression 414
  14-3 Interval Estimation in Simple Linear Regression 417
  14-4 Prediction of New Observations 420
  14-5 Measuring the Adequacy of the Regression Model 421
    14-5.1 Residual Analysis 421
    14-5.2 The Lack-of-Fit Test 422
    14-5.3 The Coefficient of Determination 426
  14-6 Transformations to a Straight Line 426
  14-7 Correlation 427
  14-8 Sample Computer Output 431
  14-9 Summary 431
  14-10 Exercises 432

15. Multiple Regression 437
  15-1 Multiple Regression Models 437
  15-2 Estimation of the Parameters 438
  15-3 Confidence Intervals in Multiple Linear Regression 444
  15-4 Prediction of New Observations 446
  15-5 Hypothesis Testing in Multiple Linear Regression 447
    15-5.1 Test for Significance of Regression 447
    15-5.2 Tests on Individual Regression Coefficients 450
  15-6 Measures of Model Adequacy 452
    15-6.1 The Coefficient of Multiple Determination 453
    15-6.2 Residual Analysis 454
  15-7 Polynomial Regression 456
  15-8 Indicator Variables 458
  15-9 The Correlation Matrix 461
  15-10 Problems in Multiple Regression 464
    15-10.1 Multicollinearity 464
    15-10.2 Influential Observations in Regression 470
    15-10.3 Autocorrelation 472
  15-11 Selection of Variables in Multiple Regression 474
    15-11.1 The Model-Building Problem 474
    15-11.2 Computational Procedures for Variable Selection 474
  15-12 Summary 486
  15-13 Exercises 486

16. Nonparametric Statistics 491
  16-1 Introduction 491
  16-2 The Sign Test 491
    16-2.1 A Description of the Sign Test 491
    16-2.2 The Sign Test for Paired Samples 494
    16-2.3 Type II Error (β) for the Sign Test 494
    16-2.4 Comparison of the Sign Test and the t-Test 496
  16-3 The Wilcoxon Signed Rank Test 496
    16-3.1 A Description of the Test 496
    16-3.2 A Large-Sample Approximation 497
    16-3.3 Paired Observations 498
    16-3.4 Comparison with the t-Test 499
  16-4 The Wilcoxon Rank Sum Test 499
    16-4.1 A Description of the Test 499
    16-4.2 A Large-Sample Approximation 501
    16-4.3 Comparison with the t-Test 501
  16-5 Nonparametric Methods in the Analysis of Variance 501
    16-5.1 The Kruskal-Wallis Test 501
    16-5.2 The Rank Transformation 504
  16-6 Summary 504
  16-7 Exercises 505

17. Statistical Quality Control and Reliability Engineering 507
  17-1 Quality Improvement and Statistics 507
  17-2 Statistical Quality Control 508
  17-3 Statistical Process Control 508
    17-3.1 Introduction to Control Charts 509
    17-3.2 Control Charts for Measurements 510
    17-3.3 Control Charts for Individual Measurements 518
    17-3.4 Control Charts for Attributes 520
    17-3.5 CUSUM and EWMA Control Charts 525
    17-3.6 Average Run Length 532
    17-3.7 Other SPC Problem-Solving Tools 535
  17-4 Reliability Engineering 537
    17-4.1 Basic Reliability Definitions 538
    17-4.2 The Exponential Time-to-Failure Model 541
    17-4.3 Simple Serial Systems 542
    17-4.4 Simple Active Redundancy 544
    17-4.5 Standby Redundancy 545
    17-4.6 Life Testing 547
    17-4.7 Reliability Estimation with a Known Time-to-Failure Distribution 548
    17-4.8 Estimation with the Exponential Time-to-Failure Distribution 548
    17-4.9 Demonstration and Acceptance Testing 551
  17-5 Summary 551
  17-6 Exercises 551

18. Stochastic Processes and Queueing 555
  18-1 Introduction 555
  18-2 Discrete-Time Markov Chains 555
  18-3 Classification of States and Chains 557
  18-4 Continuous-Time Markov Chains 561
  18-5 The Birth-Death Process in Queueing 564
  18-6 Considerations in Queueing Models 567
  18-7 Basic Single-Server Model with Constant Rates 568
  18-8 Single Server with Limited Queue Length 570
  18-9 Multiple Servers with an Unlimited Queue 572
  18-10 Other Queueing Models 573
  18-11 Summary 573
  18-12 Exercises 573

19. Computer Simulation 576
  19-1 Motivational Examples 576
  19-2 Generation of Random Variables 580
    19-2.1 Generating Uniform (0,1) Random Variables 581
    19-2.2 Generating Nonuniform Random Variables 582
  19-3 Output Analysis 586
    19-3.1 Terminating Simulation Analysis 587
    19-3.2 Initialization Problems 589
    19-3.3 Steady State Simulation Analysis 590
  19-4 Comparison of Systems 591
    19-4.1 Classical Confidence Intervals 592
    19-4.2 Common Random Numbers 593
    19-4.3 Antithetic Random Numbers 593
    19-4.4 Selecting the Best System 593
  19-5 Summary 593
  19-6 Exercises 594

Appendix 597
  Table I Cumulative Poisson Distribution 598
  Table II Cumulative Standard Normal Distribution 601
  Table III Percentage Points of the Chi-Square Distribution 603
  Table IV Percentage Points of the t Distribution 604
  Table V Percentage Points of the F Distribution 605
  Chart VI Operating Characteristic Curves 610
  Chart VII Operating Characteristic Curves for the Fixed-Effects Model Analysis of Variance 619
  Chart VIII Operating Characteristic Curves for the Random-Effects Model Analysis of Variance 623
  Table IX Critical Values for the Wilcoxon Two-Sample Test 627
  Table X Critical Values for the Sign Test 629
  Table XI Critical Values for the Wilcoxon Signed Rank Test 630
  Table XII Percentage Points of the Studentized Range Statistic 631
  Table XIII Factors for Quality-Control Charts 633
  Table XIV k Values for One-Sided and Two-Sided Tolerance Intervals 634
  Table XV Random Numbers 636

References 637
Answers to Selected Exercises 639
Index 649
Chapter 1

An Introduction to Probability
1-1 INTRODUCTION

Since professionals working in engineering and applied science are often engaged in both the analysis and the design of systems where system component characteristics are nondeterministic, the understanding and utilization of probability is essential to the description, design, and analysis of such systems. Examples reflecting probabilistic behavior are abundant, and in fact, true deterministic behavior is rare. To illustrate, consider the description of a variety of product quality or performance measurements: the operational lifespan of mechanical and/or electronic systems; the pattern of equipment failures; the occurrence of natural phenomena such as sunspots or tornados; particle counts from a radioactive source; travel times in delivery operations; vehicle accident counts during a given day on a section of freeway; or customer waiting times in line at a branch bank.

The term probability has come to be widely used in everyday life to quantify the degree of belief in an event of interest. There are abundant examples, such as the statements that "there is a 0.2 probability of rain showers" and "the probability that brand X personal computer will survive 10,000 hours of operation without repair is 0.75." In this chapter we introduce the basic structure, elementary concepts, and methods to support precise and unambiguous statements like those above.

The formal study of probability theory apparently originated in the seventeenth and eighteenth centuries in France and was motivated by the study of games of chance. With little formal mathematical understructure, people viewed the field with some skepticism; however, this view began to change in the nineteenth century, when a probabilistic model (description) was developed for the behavior of molecules in a liquid. This became known as Brownian motion, since it was Robert Brown, an English botanist, who first observed the phenomenon in 1827. In 1905, Albert Einstein explained Brownian motion under the hypothesis that particles are subject to the continual bombardment of molecules of the surrounding medium. These results greatly stimulated interest in probability, as did the emergence of the telephone system in the latter part of the nineteenth and early twentieth centuries. Since a physical connecting system was necessary to allow for the interconnection of individual telephones, with call lengths and interdemand intervals displaying large variation, a strong motivation emerged for developing probabilistic models to describe this system's behavior.

Although applications like these were rapidly expanding in the early twentieth century, it is generally thought that it was not until the 1930s that a rigorous mathematical structure for probability emerged. This chapter presents basic concepts leading to and including a definition of probability as well as some results and methods useful for problem solution. The emphasis throughout Chapters 1-7 is to encourage an understanding and appreciation of the subject, with applications to a variety of problems in engineering and science. The reader should recognize that there is a large, rich field of mathematics related to probability that is beyond the scope of this book.


Indeed, our objectives in presenting the basic probability topics considered in the current chapter are threefold. First, these concepts enhance and enrich our basic understanding of the world in which we live. Second, many of the examples and exercises deal with the use of probability concepts to model the behavior of real-world systems. Finally, the probability topics developed in Chapters 1-7 provide a foundation for the statistical methods presented in Chapters 8-16 and beyond. These statistical methods deal with the analysis and interpretation of data, with drawing inference about populations based on a sample of units selected from them, and with the design and analysis of experiments and experimental data. A sound understanding of such methods will greatly enhance the professional capability of individuals working in the data-intensive areas commonly encountered in this twenty-first century.

1-2 A REVIEW OF SETS

To present the basic concepts of probability theory, we will use some ideas from the theory of sets. A set is an aggregate or collection of objects. Sets are usually designated by capital letters, A, B, C, and so on. The members of the set A are called the elements of A. In general, when x is an element of A we write x ∈ A, and if x is not an element of A we write x ∉ A. In specifying membership we may resort either to enumeration or to a defining property. These ideas are illustrated in the following examples. Braces are used to denote a set, and the colon within the braces is shorthand for "such that."

Example 1-1
The set whose elements are the integers 5, 6, 7, 8 is a finite set with four elements. We could denote this by

A = {5, 6, 7, 8}.

Note that 5 ∈ A and 9 ∉ A are both true.

Example 1-2
If we write V = {a, e, i, o, u} we have defined the set of vowels in the English alphabet. We may use a defining property and write this using a symbol as

V = {*: * is a vowel in the English alphabet}.

Example 1-3
If we say that A is the set of all real numbers between 0 and 1 inclusive, we might also denote A by a defining property as

A = {x: x ∈ R, 0 ≤ x ≤ 1},

where R is the set of all real numbers.

Example 1-4
The set B = {-3, +3} is the same set as

B = {x: x ∈ R, x² = 9},

where R is again the set of real numbers.


Example 1-5
In the real plane we can consider points (x, y) that lie on a given line A. Thus, the condition for inclusion for A requires (x, y) to satisfy ax + by = c, so that

A = {(x, y): x ∈ R, y ∈ R, ax + by = c},

where R is the set of real numbers.

The universal set is the set of all objects under consideration, and it is generally denoted by U. Another special set is the null set or empty set, usually denoted by ∅. To illustrate this concept, consider a set

A = {x: x ∈ R, x² = -1}.

The universal set here is R, the set of real numbers. Obviously, set A is empty, since there are no real numbers having the defining property x² = -1. We should point out that the set {∅} ≠ ∅.
If two sets are considered, say A and B, we call A a subset of B, denoted A ⊂ B, if each element in A is also an element of B. The sets A and B are said to be equal (A = B) if and only if A ⊂ B and B ⊂ A. As direct consequences of this we may show the following:

1. For any set A, ∅ ⊂ A.
2. For a given U, A considered in the context of U satisfies the relation A ⊂ U.
3. For a given set A, A ⊂ A (a reflexive relation).
4. If A ⊂ B and B ⊂ C, then A ⊂ C (a transitive relation).

An interesting consequence of set equality is that the order of element listing is immaterial. To illustrate, let A = {a, b, c} and B = {c, a, b}. Obviously A = B by our definition. Furthermore, when defining properties are used, the sets may be equal although the defining properties are outwardly different. As an example of the second consequence, we let A = {x: x ∈ R, where x is an even, prime number} and B = {x: x + 3 = 5}. Since the integer 2 is the only even prime, A = B.
We now consider some operations on sets. Let A and B be any subsets of the universal set U. Then the following hold:

1. The complement of A (with respect to U) is the set made up of the elements of U that do not belong to A. We denote this complementary set as A'. That is,

   A' = {x: x ∈ U, x ∉ A}.

2. The intersection of A and B is the set of elements that belong to both A and B. We denote the intersection as A ∩ B. In other words,

   A ∩ B = {x: x ∈ A and x ∈ B}.

   We should also note that A ∩ B is a set, and we could give this set some designator, such as C.

3. The union of A and B is the set of elements that belong to at least one of the sets A and B. If D represents the union, then

   D = A ∪ B = {x: x ∈ A or x ∈ B (or both)}.

These operations are illustrated in the following examples.


Example 1-6
Let U be the set of letters in the alphabet, that is, U = {*: * is a letter of the English alphabet}; and let A = {*: * is a vowel} and B = {*: * is one of the letters a, b, c}. As a consequence of the definitions,

A' = the set of consonants,
B' = {d, e, f, g, ..., x, y, z},
A ∪ B = {a, b, c, e, i, o, u},
A ∩ B = {a}.

Example 1-7
If the universal set is defined as U = {1, 2, 3, 4, 5, 6, 7}, and three subsets, A = {1, 2, 3}, B = {2, 4, 6}, C = {1, 3, 5, 7}, are defined, then we see immediately from the definitions that

A' = {4, 5, 6, 7},        B' = {1, 3, 5, 7} = C,
A ∪ B = {1, 2, 3, 4, 6},  A ∪ C = {1, 2, 3, 5, 7},
A ∩ B = {2},              A ∩ C = {1, 3},
B ∪ C = U,                B ∩ C = ∅.
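As an illustrative aside, the computations of Example 1-7 map directly onto Python's built-in set type; the following minimal sketch (not part of the original text) reproduces them.

```python
# Universal set and the three subsets from Example 1-7
U = {1, 2, 3, 4, 5, 6, 7}
A = {1, 2, 3}
B = {2, 4, 6}
C = {1, 3, 5, 7}

print(U - A)       # complement of A: {4, 5, 6, 7}
print(U - B == C)  # True: the complement of B equals C
print(A | B)       # union: {1, 2, 3, 4, 6}
print(A & B)       # intersection: {2}
print(B & C)       # set(): B and C are disjoint
```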

The Venn diagram can be used to illustrate certain set operations. A rectangle is drawn to represent the universal set U. A subset A of U is represented by the region within a circle drawn inside the rectangle. Then A' will be represented by the area of the rectangle outside of the circle, as illustrated in Fig. 1-1. Using this notation, the intersection and union are illustrated in Fig. 1-2.

The operations of intersection and union may be extended in a straightforward manner to accommodate any finite number of sets. In the case of three sets, say A, B, and C, A ∪ B ∪ C has the property that A ∪ (B ∪ C) = (A ∪ B) ∪ C, which obviously holds since both sides have identical members. Similarly, we see that A ∩ B ∩ C = (A ∩ B) ∩ C = A ∩ (B ∩ C). Some important laws obeyed by sets relative to the operations previously defined are listed below.
Identity laws:      A ∪ ∅ = A,    A ∪ U = U,
                    A ∩ U = A,    A ∩ ∅ = ∅.
De Morgan's laws:   (A ∪ B)' = A' ∩ B',    (A ∩ B)' = A' ∪ B'.
Associative laws:   A ∪ (B ∪ C) = (A ∪ B) ∪ C,
                    A ∩ (B ∩ C) = (A ∩ B) ∩ C.
Distributive laws:  A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C),
                    A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C).

The reader is asked in Exercise 1-2 to illustrate some of these statements with Venn diagrams. Formal proofs are usually more lengthy.

Figure 1-1 A set in a Venn diagram.


Figure 1-2 The intersection and union of two sets in a Venn diagram. (a) The intersection shaded. (b) The union shaded.

In the case of more than three sets, we use a subscript to generalize. Thus, if n is a positive integer, and E1, E2, ..., En are given sets, then E1 ∩ E2 ∩ ... ∩ En is the set of elements belonging to all of the sets, and E1 ∪ E2 ∪ ... ∪ En is the set of elements that belong to at least one of the given sets.

If A and B are sets, then the set of all ordered pairs (a, b) such that a ∈ A and b ∈ B is called the Cartesian product set of A and B. The usual notation is A × B. We thus have

A × B = {(a, b): a ∈ A and b ∈ B}.

Let r be a positive integer greater than 1, and let A1, A2, ..., Ar represent sets. Then the Cartesian product set is given by

A1 × A2 × ... × Ar = {(a1, a2, ..., ar): aj ∈ Aj for j = 1, 2, ..., r}.
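In Python, for instance, itertools.product yields exactly this set of ordered tuples; the sketch below is illustrative only, with arbitrarily chosen sets A and B.

```python
from itertools import product

# Cartesian product A x B: all ordered pairs (a, b) with a in A and b in B
A = ["H", "T"]
B = [1, 2, 3]
print(list(product(A, B)))
# [('H', 1), ('H', 2), ('H', 3), ('T', 1), ('T', 2), ('T', 3)]
print(len(list(product(A, B))))  # n(A) * n(B) = 6 ordered pairs
```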

Frequently, the number of elements in a set is of some importance, and we denote by n(A) the number of elements in set A. If the number is finite, we say we have a finite set. Should the set be infinite, such that the elements can be put into a one-to-one correspondence with the natural numbers, then the set is called a denumerably infinite set. The nondenumerable set contains an infinite number of elements that cannot be enumerated. For example, if a < b, then the set A = {x: x ∈ R, a ≤ x ≤ b} is a nondenumerable set.
A set of particular interest is called the power set. The elements of this set are the subsets of a set A, and a common notation is {0, 1}^A. For example, if A = {1, 2, 3}, then

{0, 1}^A = {∅, {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, {1, 2, 3}}.
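A short sketch (illustrative only; the helper name power_set is ours) generates the power set of {1, 2, 3} by collecting the subsets of every size:

```python
from itertools import chain, combinations

def power_set(elements):
    """All subsets of a finite set, from the empty set up to the set itself."""
    items = list(elements)
    return list(chain.from_iterable(combinations(items, r)
                                    for r in range(len(items) + 1)))

print(power_set({1, 2, 3}))
# [(), (1,), (2,), (3,), (1, 2), (1, 3), (2, 3), (1, 2, 3)] -- 2^3 = 8 subsets
```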

1-3 EXPERIMENTS AND SAMPLE SPACES

Probability theory has been motivated by real-life situations where an experiment is performed and the experimenter observes an outcome. Furthermore, the outcome may not be predicted with certainty. Such experiments are called random experiments. The concept of a random experiment is considered mathematically to be a primitive notion and is thus not otherwise defined; however, we can note that random experiments have some common characteristics. First, while we cannot predict a particular outcome with certainty, we can describe the set of possible outcomes. Second, from a conceptual point of view, the experiment is one that could be repeated under conditions that remain unchanged, with the outcomes appearing in a haphazard manner; however, as the number of repetitions increases, certain patterns in the frequency of outcome occurrence emerge.

We will often consider idealized experiments. For example, we may rule out the outcome of a coin toss when the coin lands on edge. This is more for convenience than out of necessity. The set of possible outcomes is called the sample space, and these outcomes define the particular idealized experiment. The symbols ℰ and 𝒮 are used to represent the random experiment and the associated sample space.

Following the terminology employed in the review of sets and set operations, we will classify sample spaces (and thus random experiments). A discrete sample space is one in which there is a finite number of outcomes or a countably (denumerably) infinite number of outcomes. Likewise, a continuous sample space has nondenumerable (uncountable) outcomes. These might be real numbers on an interval or real pairs contained in the product of intervals, where measurements are made on two variables following an experiment.

To illustrate random experiments with an associated sample space, we consider several examples.

Example 1-8
ℰ1: Toss a true coin and observe the "up" face.
𝒮1: {H, T}. Note that this set is finite.

Example 1-9
ℰ2: Toss a true coin three times and observe the sequence of heads and tails.
𝒮2: {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}.

Example 1-10
ℰ3: Toss a true coin three times and observe the total number of heads.
𝒮3: {0, 1, 2, 3}.

Example 1-11
ℰ4: Toss a pair of dice and observe the up faces.
𝒮4: {(1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6),
     (2, 1), (2, 2), (2, 3), (2, 4), (2, 5), (2, 6),
     (3, 1), (3, 2), (3, 3), (3, 4), (3, 5), (3, 6),
     (4, 1), (4, 2), (4, 3), (4, 4), (4, 5), (4, 6),
     (5, 1), (5, 2), (5, 3), (5, 4), (5, 5), (5, 6),
     (6, 1), (6, 2), (6, 3), (6, 4), (6, 5), (6, 6)}.

Example 1-12
ℰ5: An automobile door is assembled with a large number of spot welds. After assembly, each weld is inspected, and the total number of defectives is counted.
𝒮5: {0, 1, 2, ..., X}, where X = the total number of welds in the door.

Example 1-13
ℰ6: A cathode ray tube is manufactured, put on life test, and aged to failure. The elapsed time (in hours) at failure is recorded.
𝒮6: {t: t ∈ R, t ≥ 0}. This set is uncountable.

Example 1-14
ℰ7: A monitor records the emission count from a radioactive source in one minute.
𝒮7: {0, 1, 2, ...}. This set is countably infinite.

Example 1-15
ℰ8: Two key solder joints on a printed circuit board are inspected with a probe as well as visually, and each joint is classified as good, G, or defective, D, requiring rework or scrap.
𝒮8: {GG, GD, DG, DD}.

Example 1-16
ℰ9: In a particular chemical plant the volume produced per day for a particular product ranges between a minimum value, b, and a maximum value, c, which corresponds to capacity. A day is randomly selected and the amount produced is observed.
𝒮9: {x: x ∈ R, b ≤ x ≤ c}.

Example 1-17
ℰ10: An extrusion plant is engaged in making up an order for pieces 20 feet long. Inasmuch as the trim operation creates scrap at both ends, the extruded bar must exceed 20 feet. Because of costs involved, the amount of scrap is critical. A bar is extruded, trimmed, and finished, and the total length of scrap is measured.
𝒮10: {x: x ∈ R, x > 0}.

Example 1-18
ℰ11: In a missile launch, the three components of velocity are measured from the ground as a function of time. At 1 minute after launch these are printed for a control unit.
𝒮11: {(vx, vy, vz): vx, vy, vz are real numbers}.

Example 1-19
ℰ12: In the preceding example, the velocity components are continuously recorded for 5 minutes.
𝒮12: The space is complicated here, as we have all possible realizations of the functions vx(t), vy(t), and vz(t) for 0 ≤ t ≤ 5 minutes to consider.


All these examples have the characteristics required of random experiments. With the exception of Example 1-19, the description of the sample space is straightforward, and although repetition is not considered, ideally we could repeat the experiments. To illustrate the phenomenon of random occurrence, consider Example 1-8. Obviously, if ℰ1 is repeated indefinitely, we obtain a sequence of heads and tails. A pattern emerges as we continue the experiment. Notice that since the coin is true, we should obtain heads approximately one-half of the time. In recognizing the idealization in the model, we simply agree on a theoretical possible set of outcomes. In ℰ1, we ruled out the possibility of having the coin land on edge, and in ℰ6, where we recorded the elapsed time to failure, the idealized sample space consisted of all nonnegative real numbers.

1-4 EVENTS

An event, say A, is associated with the sample space of the experiment. The sample space is considered to be the universal set, so that event A is simply a subset of 𝒮. Note that both ∅ and 𝒮 are subsets of 𝒮. As a general rule, a capital letter will be used to denote an event. For a finite sample space, we note that the set of all subsets is the power set, and more generally, we require that if A ⊂ 𝒮, then A' ⊂ 𝒮, and if A1, A2, ... is a sequence of mutually exclusive events in 𝒮, as defined below, then A1 ∪ A2 ∪ ... ⊂ 𝒮. The following events relate to experiments ℰ1, ℰ2, ..., ℰ10 described in the preceding section. These are provided for illustration only; many other events could have been described for each case.
ℰ1. A: The coin toss yields a head {H}.

ℰ2. A: All the coin tosses give the same face {HHH, TTT}.

ℰ3. A: The total number of heads is two {2}.

ℰ4. A: The sum of the "up" faces is seven {(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)}.

ℰ5. A: The number of defective welds does not exceed 5 {0, 1, 2, 3, 4, 5}.

ℰ6. A: The time to failure is greater than 1000 hours {t: t > 1000}.

ℰ7. A: The count is exactly two {2}.

ℰ8. A: Neither weld is bad {GG}.

ℰ9. A: The volume produced is between a > b and c {x: x ∈ R, b < a < x < c}.

ℰ10. A: The scrap does not exceed one foot {x: x ∈ R, 0 < x ≤ 1}.

Since an event is a set, the set operations defined for events and the laws and properties of Section 1-2 hold. If the intersections for all combinations of two or more events among k events considered are empty, then the k events are said to be mutually exclusive (or disjoint). If there are two events, A and B, then they are mutually exclusive if A ∩ B = ∅. With k = 3, we would require A1 ∩ A2 = ∅, A1 ∩ A3 = ∅, A2 ∩ A3 = ∅, and A1 ∩ A2 ∩ A3 = ∅, and this case is illustrated in Fig. 1-3. We emphasize that these multiple events are associated with one experiment.

1-5 PROBABILITY DEFINITION AND ASSIGNMENT

An axiomatic approach is taken to define probability as a set function where the elements of the domain are sets and the elements of the range are real numbers between 0 and 1. If event A is an element in the domain of this function, we use customary functional notation, P(A), to designate the corresponding element in the range.


Figure 1-3 Three mutually exclusive events.

Definition
If an experiment ℰ has sample space 𝒮 and an event A is defined on 𝒮, then P(A) is a real number called the probability of event A or the probability of A, and the function P(·) has the following properties:

1. 0 ≤ P(A) ≤ 1 for each event A of 𝒮.
2. P(𝒮) = 1.
3. For any finite number k of mutually exclusive events defined on 𝒮,

   P\left(\bigcup_{i=1}^{k} A_i\right) = \sum_{i=1}^{k} P(A_i).

4. If A1, A2, A3, ... is a denumerable sequence of mutually exclusive events defined on 𝒮, then

   P\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} P(A_i).
Note that the properties of the definition given do not tell the experimenter how to assign probabilities; however, they do restrict the way in which the assignment may be accomplished. In practice, probability is assigned on the basis of (1) estimates obtained from previous experience or prior observations, (2) an analytical consideration of experimental conditions, or (3) assumption.

To illustrate the assignment of probability based on experience, we consider the repetition of the experiment and the relative frequency of the occurrence of the event of interest. This notion of relative frequency has intuitive appeal, and it involves the conceptual repetition of an experiment and a counting of both the number of repetitions and the number of times the event in question occurs. More precisely, ℰ is repeated m times and two events are denoted A and B. We let mA and mB be the number of times A and B occur in the m repetitions.

Definition
The value fA = mA/m is called the relative frequency of event A. It has the following properties:

1. 0 ≤ fA ≤ 1.
2. fA = 0 if and only if A never occurs, and fA = 1 if and only if A occurs on every repetition.
3. If A and B are mutually exclusive events, then fA∪B = fA + fB.


As m becomes large, fA tends to stabilize. That is, as the number of repetitions of the experiment increases, the relative frequency of event A will vary less and less (from repetition to repetition). The concept of relative frequency and the tendency toward stability lead to one method for assigning probability. If an experiment ℰ has sample space 𝒮 and an event A is defined, and if the relative frequency fA approaches some number pA as the number of repetitions increases, then the number pA is ascribed to A as its probability; that is, as m → ∞,

   f_A \to p_A = P(A).    (1-1)

In practice, something less than infinite replication must obviously be accepted. As an example, consider a simple coin-tossing experiment ℰ in which we sequentially toss a fair coin and observe the outcome, either heads (H) or tails (T), arising from each trial. If the observational process is considered as a random experiment such that for a particular repetition the sample space is 𝒮 = {H, T}, we define the event A = {H}, where this event is defined before the observations are made. Suppose that after m = 100 repetitions of ℰ, we observe mA = 43, resulting in a relative frequency of fA = 0.43, a seemingly low value. Now suppose that we instead conduct m = 10,000 repetitions of ℰ and this time we observe mA = 4,924, so that the relative frequency is in this case fA = 0.4924. Since we now have at our disposal 10,000 observations instead of 100, everyone should be more comfortable in assigning the updated relative frequency of 0.4924 to P(A). The stability of fA as m gets large is only an intuitive notion at this point; we will be able to be more precise later.
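The stabilizing behavior of fA is easy to observe numerically. The following simulation is an illustrative sketch only (with an arbitrary fixed seed); it tosses a fair coin m times for increasing m and reports the relative frequency of heads:

```python
import random

random.seed(1)  # arbitrary seed, fixed so the run is reproducible

# Relative frequency of the event A = {H} for increasing numbers of repetitions m
for m in (100, 10_000, 1_000_000):
    m_A = sum(random.random() < 0.5 for _ in range(m))  # number of heads observed
    print(f"m = {m:>9}: f_A = {m_A / m:.4f}")
# f_A fluctuates for small m but settles near 0.5 as m grows, as in equation 1-1
```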
A method for computing the probability of an event A is as follows. Suppose the sample space has a finite number, n, of elements, ei, and the probability assigned to an outcome is pi = P(Ei), where Ei = {ei} and

   pi ≥ 0,    i = 1, 2, ..., n,

while

   p1 + p2 + ... + pn = 1.

Then

   P(A) = \sum_{i:\, e_i \in A} p_i.    (1-2)

This is a statement that the probability of event A is the sum of the probabilities associated with the outcomes making up event A, and this result is simply a consequence of the definition of probability. The practitioner is still faced with assigning probabilities to the outcomes, ei. It is noted that the sample space will not be finite if, for example, the elements ei of 𝒮 are countably infinite in number. In this case, we require

   pi ≥ 0,    i = 1, 2, ...,

with p1 + p2 + ... = 1, and equation 1-2 may be used without modification.


If the sample space is finite and has n equally likely outcomes, so that p1 = p2 = ... = pn = 1/n, then

   P(A) = \frac{n(A)}{n},    (1-3)

where n(A) outcomes are contained in A. Counting methods useful in determining n and n(A) will be presented in Section 1-6.


Example 1-20
Suppose the coin in Example 1-9 is biased so that the outcomes of the sample space 𝒮 = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT} have probabilities p1 = 1/27, p2 = 2/27, p3 = 2/27, p4 = 4/27, p5 = 2/27, p6 = 4/27, p7 = 4/27, p8 = 8/27, where e1 = HHH, e2 = HHT, etc. If we let event A be the event that all tosses yield the same face, then P(A) = 1/27 + 8/27 = 1/3.

Example 1-21
Suppose that in Example 1-14 we have prior knowledge that

   p_i = \frac{e^{-2}\, 2^{i-1}}{(i-1)!},    i = 1, 2, ...,

and pi = 0 otherwise, where pi is the probability that the monitor will record a count outcome of i - 1 during a 1-minute interval. If we consider event A as the event containing the outcomes 0 and 1, then A = {0, 1}, and P(A) = p1 + p2 = e^{-2} + 2e^{-2} = 3e^{-2} ≈ 0.406.
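A quick numerical check of Example 1-21 (illustrative only; the probabilities follow the formula above):

```python
import math

# p_i = e^(-2) * 2^(i-1) / (i-1)!, for i = 1, 2, ...
p = [math.exp(-2) * 2 ** (i - 1) / math.factorial(i - 1) for i in range(1, 40)]
print(round(sum(p), 6))       # ~1.0: a valid probability assignment
print(round(p[0] + p[1], 3))  # P(A) = p_1 + p_2 = 3e^(-2) = 0.406
```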

Example 1-22
Consider Example 1-9, where a true coin is tossed three times, and consider event A, where all coins show the same face. By equation 1-3,

   P(A) = 2/8 = 1/4,

since there are eight total outcomes and two are favorable to event A. The coin was assumed to be true, so all eight possible outcomes are equally likely.

Example 1-23

Assume that the dice in Example 1-11 are true, and consider an event A where the sum of the up faces is 7. Using the result of equation 1-3, we note that there are 36 outcomes, of which six are favorable to the event in question, so that P(A) = 6/36 = 1/6.
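Both results can be verified by brute-force enumeration of the equally likely outcomes, as in the following sketch (illustrative only):

```python
from itertools import product

# Example 1-22: three tosses of a true coin; A = all faces the same
tosses = list(product("HT", repeat=3))        # 8 equally likely outcomes
print(sum(len(set(t)) == 1 for t in tosses) / len(tosses))  # 2/8 = 0.25

# Example 1-23: two true dice; A = sum of the up faces is 7
rolls = list(product(range(1, 7), repeat=2))  # 36 equally likely outcomes
print(sum(a + b == 7 for a, b in rolls) / len(rolls))       # 6/36, about 0.1667
```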

Note that Examples 1-22 and 1-23 are extremely simple in two respects: the sample space is of a highly restricted type, and the counting process is easy. Combinatorial methods frequently become necessary as the counting becomes more involved. Basic counting methods are reviewed in Section 1-6. Some important theorems regarding probability follow.

Theorem 1-1
If ∅ is the empty set, then P(∅) = 0.

Proof  Note that 𝒮 = 𝒮 ∪ ∅ and that 𝒮 and ∅ are mutually exclusive. Then P(𝒮) = P(𝒮) + P(∅) from property 4; therefore P(∅) = 0.

Theorem 1-2
P(A') = 1 - P(A).

Proof  Note that 𝒮 = A ∪ A' and that A and A' are mutually exclusive. Then P(𝒮) = P(A) + P(A') from property 4; but from property 2, P(𝒮) = 1; therefore P(A') = 1 - P(A).

Theorem 1-3
P(A ∪ B) = P(A) + P(B) - P(A ∩ B).

Proof  Since A ∪ B = A ∪ (B ∩ A'), where A and (B ∩ A') are mutually exclusive, and B = (A ∩ B) ∪ (B ∩ A'), where (A ∩ B) and (B ∩ A') are mutually exclusive, then P(A ∪ B) = P(A) + P(B ∩ A') and P(B) = P(A ∩ B) + P(B ∩ A'). Subtracting, P(A ∪ B) - P(B) = P(A) - P(A ∩ B), and thus P(A ∪ B) = P(A) + P(B) - P(A ∩ B).

The Venn diagram shown in Fig. 1-4 is helpful in following the argument of the proof for Theorem 1-3. We see that the "double counting" of the hatched region in the expression P(A) + P(B) is corrected by subtraction of P(A ∩ B).

Theorem 1-4
P(A ∪ B ∪ C) = P(A) + P(B) + P(C) - P(A ∩ B) - P(A ∩ C) - P(B ∩ C) + P(A ∩ B ∩ C).

Proof  We may write A ∪ B ∪ C = (A ∪ B) ∪ C and use Theorem 1-3, since A ∪ B is an event. The reader is asked to provide the details in Exercise 1-32.

Theorem 1-5

   P(A_1 \cup A_2 \cup \cdots \cup A_k) = \sum_{i=1}^{k} P(A_i) - \sum_{i<j} P(A_i \cap A_j) + \sum_{i<j<r} P(A_i \cap A_j \cap A_r) - \cdots

Proof  Refer to Exercise 1-33.

Theorem 1-6
If A ⊂ B, then P(A) ≤ P(B).

Proof  If A ⊂ B, then B = A ∪ (A' ∩ B), where A and (A' ∩ B) are mutually exclusive, and P(B) = P(A) + P(A' ∩ B) ≥ P(A), since P(A' ∩ B) ≥ 0.

Figure 1-4 Venn diagram for two events.


Example 1-24
If A and B are mutually exclusive events, and if it is known that P(A) = 0.20 while P(B) = 0.30, we can evaluate several probabilities:

1. P(A') = 1 - P(A) = 0.80.
2. P(B') = 1 - P(B) = 0.70.
3. P(A ∪ B) = P(A) + P(B) = 0.2 + 0.3 = 0.5.
4. P(A ∩ B) = 0.
5. P(A' ∩ B') = P((A ∪ B)'), by De Morgan's law,
             = 1 - P(A ∪ B)
             = 1 - [P(A) + P(B)] = 0.5.

Example 1-25
Suppose events A and B are not mutually exclusive, and we know that P(A) = 0.20, P(B) = 0.30, and P(A ∩ B) = 0.10. Then, evaluating the same probabilities as before, we obtain

1. P(A') = 1 - P(A) = 0.80.
2. P(B') = 1 - P(B) = 0.70.
3. P(A ∪ B) = P(A) + P(B) - P(A ∩ B) = 0.2 + 0.3 - 0.1 = 0.4.
4. P(A ∩ B) = 0.1.
5. P(A' ∩ B') = P((A ∪ B)') = 1 - [P(A) + P(B) - P(A ∩ B)] = 0.6.

Example 1-26
Suppose that in a certain city 75% of the residents jog (J), 20% like ice cream (I), and 40% enjoy music (M). Further, suppose that 15% jog and like ice cream, 30% jog and enjoy music, 10% like ice cream and music, and 5% do all three types of activities. We can consolidate all of this information in the simple Venn diagram in Fig. 1-5 by starting from the last piece of data, P(J ∩ I ∩ M) = 0.05, and "working our way out" of the center.

1. Find the probability that a random resident will engage in at least one of the three activities. By Theorem 1-4,

   P(J ∪ I ∪ M) = P(J) + P(I) + P(M) - P(J ∩ I) - P(J ∩ M) - P(I ∩ M) + P(J ∩ I ∩ M)
                = 0.75 + 0.20 + 0.40 - 0.15 - 0.30 - 0.10 + 0.05 = 0.85.

This answer is also immediate by adding up the components of the Venn diagram.

2. Find the probability that a resident engages in precisely one type of activity. By the Venn diagram, we see that the desired probability is

   P(J ∩ I' ∩ M') + P(J' ∩ I ∩ M') + P(J' ∩ I' ∩ M) = 0.35 + 0 + 0.05 = 0.40.

Figure 1-5 Venn diagram for Example 1-26.
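The arithmetic of Example 1-26 is summarized in the sketch below (illustrative only), which applies Theorem 1-4 and then works out the "only one activity" regions of the Venn diagram:

```python
# Given probabilities from Example 1-26
P_J, P_I, P_M = 0.75, 0.20, 0.40     # jog, ice cream, music
P_JI, P_JM, P_IM = 0.15, 0.30, 0.10  # pairwise intersections
P_JIM = 0.05                         # all three activities

# Theorem 1-4: at least one of the three activities
print(P_J + P_I + P_M - P_JI - P_JM - P_IM + P_JIM)  # ~0.85

# Exactly one activity: peel away the overlaps around each set
only_J = P_J - P_JI - P_JM + P_JIM   # 0.35
only_I = P_I - P_JI - P_IM + P_JIM   # 0.00
only_M = P_M - P_JM - P_IM + P_JIM   # 0.05
print(only_J + only_I + only_M)      # ~0.40 (up to floating-point rounding)
```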


1-6 FINITE SAMPLE SPACES AND ENUMERATION

Experiments that give rise to a finite sample space have already been discussed, and the methods for assigning probabilities to events associated with such experiments have been presented. We can use equations 1-1, 1-2, and 1-3 and deal either with "equally likely" or with "not equally likely" outcomes. In some situations we will have to resort to the relative frequency concept and successive trials (experimentation) to estimate probabilities, as indicated in equation 1-1, with some finite m. In this section, however, we deal with equally likely outcomes and equation 1-3. Note that this equation represents a special case of equation 1-2, where p1 = p2 = ... = pn = 1/n.

In order to assign probabilities, P(A) = n(A)/n, we must be able to determine both n, the number of outcomes, and n(A), the number of outcomes favorable to event A. If there are n outcomes in 𝒮, then there are 2^n possible subsets that are the elements of the power set, {0, 1}^𝒮.

The requirement for the n outcomes to be equally likely is an important one, and there will be numerous applications where the experiment will specify that one (or more) item(s) is (are) selected at random from a population group of N items without replacement. If n represents the sample size (n ≤ N) and the selection is random, then each possible selection (sample) is equally likely. It will soon be seen that there are N!/[n!(N - n)!] such samples, so the probability of getting a particular sample must be n!(N - n)!/N!. It should be carefully noted that one sample differs from another if one (or more) item appears in one sample and not the other. The population items must thus be identifiable. In order to illustrate, suppose a population has four chips (N = 4) labeled a, b, c, and d. The sample size is to be two (n = 2). The possible results of the selection, disregarding order, are the elements of

   𝒮 = {ab, ac, ad, bc, bd, cd}.

If the sampling process is random, the probability of obtaining each possible sample is 1/6. The mechanics of selecting random samples vary a great deal, and devices such as pseudorandom number generators, random number tables, and icosahedron dice are frequently used, as will be discussed at a later point.

It becomes obvious that we need enumeration methods for evaluating n and n(A) for experiments yielding equally likely outcomes; the following sections, 1-6.1 through 1-6.5, review basic enumeration techniques and results useful for this purpose.

1-6.1 Tree Diagram

In simple experiments, a tree diagram may be useful in the enumeration of the sample space. Consider Example 1-9, where a true coin is tossed three times. The set of possible outcomes could be found by taking all the paths in the tree diagram shown in Fig. 1-6. It should be noted that there are 2 outcomes to each trial, 3 trials, and 2^3 = 8 outcomes {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}.
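The tree of Fig. 1-6 can also be traversed programmatically; the recursive sketch below (illustrative only; the helper name tree_paths is ours) builds one outcome per root-to-leaf path:

```python
def tree_paths(branches, trials):
    """Enumerate root-to-leaf paths of a tree with the same branching
    (here 'H' or 'T') at every level, as in the tree diagram of Fig. 1-6."""
    if trials == 0:
        return [""]
    return [b + rest for b in branches for rest in tree_paths(branches, trials - 1)]

paths = tree_paths("HT", 3)
print(len(paths))  # 2^3 = 8
print(paths)       # ['HHH', 'HHT', 'HTH', 'HTT', 'THH', 'THT', 'TTH', 'TTT']
```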

1-6.2 Multiplication Principle

If sets A1, A2, ..., Ak have, respectively, n1, n2, ..., nk elements, then there are n1 · n2 · ... · nk ways to select an element first from A1, then from A2, ..., and finally from Ak. In the special case where n1 = n2 = ... = nk = n, there are n^k possible selections. This was the situation encountered in the coin-tossing experiment of Example 1-9.

Suppose we consider some compound experiment ℰ consisting of k experiments, ℰ1, ℰ2, ..., ℰk. If the sample spaces 𝒮1, 𝒮2, ..., 𝒮k contain n1, n2, ..., nk outcomes, respectively, then there are n1 · n2 · ... · nk outcomes to ℰ. In addition, if the nj outcomes of 𝒮j are equally likely for j = 1, 2, ..., k, then the n1 · n2 · ... · nk outcomes of ℰ are equally likely.

Figure 1-6 A tree diagram for tossing a true coin three times. (The three branching levels correspond to the first, second, and third tosses.)

Example 1-27
Suppose we toss a true coin and cast a true die. Since the coin and the die are true, the two outcomes to ℰ1, 𝒮1 = {H, T}, are equally likely, and the six outcomes to ℰ2, 𝒮2 = {1, 2, 3, 4, 5, 6}, are equally likely. Since n1 = 2 and n2 = 6, there are 12 outcomes to the total experiment, and all the outcomes are equally likely. Because of the simplicity of the experiment in this case, a tree diagram permits an easy and complete enumeration. Refer to Fig. 1-7.

Example 1-28
A manufacturing process is operated with very little "in-process inspection." When items are completed they are transported to an inspection area, and four characteristics are inspected, each by a different inspector. The first inspector rates a characteristic according to one of four ratings. The second inspector uses three ratings, and the third and fourth inspectors use two ratings each. Each inspector marks the rating on the item identification tag. There would be a total of 4 · 3 · 2 · 2 = 48 ways in which the item may be marked.
𝒮 = {(H, 1), (H, 2), (H, 3), (H, 4), (H, 5), (H, 6),
     (T, 1), (T, 2), (T, 3), (T, 4), (T, 5), (T, 6)}

Figure 1-7 The tree diagram for Example 1-27.

1-6.3 Permutations

A permutation is an ordered arrangement of distinct objects. One permutation differs from another if the order of arrangement differs or if the content differs. To illustrate, suppose we

again consider four distinct chips labeled a, b, c, and d. Suppose we wish to consider all permutations of these chips taken one at a time. These would be

   a    b    c    d

If we are to consider all permutations taken two at a time, these would be

   ab    ba    ac    ca
   bc    cb    bd    db
   ad    da    cd    dc

Note that permutations ab and ba differ because of a difference in order of the objects, while permutations ac and ab differ because of content differences. In order to generalize, we consider the case where there are n distinct objects from which we plan to select permutations of r objects (r ≤ n). The number of such permutations, P_r^n, is given by

   P_r^n = n(n-1)(n-2)\cdots(n-r+1) = \frac{n!}{(n-r)!}.

This is a result of the fact that there are n ways to select the first object, (n - 1) ways to select the second, ..., [n - (r - 1)] ways to select the rth, and of the application of the multiplication principle. Note that P_n^n = n! and 0! = 1.

Example 1-29
A major league baseball team typically has 25 players. A lineup consists of nine of these players in a particular order. Thus, there are $P_9^{25} = 25!/16! \approx 7.41 \times 10^{11}$ possible lineups.
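The permutation formula is available directly in Python's standard library; this sketch checks it against the definition and reproduces the lineup count:

```python
import math

# P(n, r) = n! / (n - r)!  via the library routine and via the definition
n, r = 25, 9
assert math.perm(n, r) == math.factorial(n) // math.factorial(n - r)
print(math.perm(25, 9))    # 741354768000, about 7.41e11 possible lineups
```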

1-6.4 Combinations
A combination is an arrangement of distinct objects where one combination differs from another only if the content of the arrangement differs; here order does not matter. In the case of the four lettered chips a, b, c, d, the combinations of the chips, taken two at a time, are

ab, ac, ad, bc, bd, cd.
We are interested in determining the number of combinations when there are n distinct
objects to be selected r at a time. Since the number of permutations was the number of ways
to select r objects from the n and then permute the r objects, we note that


$$P_r^n = \binom{n}{r} r!, \qquad (1\text{-}4)$$
where $\binom{n}{r}$ represents the number of combinations. It follows that
$$\binom{n}{r} = \frac{P_r^n}{r!} = \frac{n!}{r!(n-r)!}. \qquad (1\text{-}5)$$
In the illustration with the four chips, where $r = 2$, the reader may readily verify that $P_2^4 = 12$ and $\binom{4}{2} = 6$, as we found by complete enumeration.
For present purposes, $\binom{n}{r}$ is defined where $n$ and $r$ are integers such that $0 \le r \le n$; however, the term $\binom{n}{r}$ may be generally defined for real $n$ and any nonnegative integer $r$. In this case we write
$$\binom{n}{r} = \frac{n(n-1)(n-2)\cdots(n-r+1)}{r!}.$$
The reader will recall the binomial theorem:
$$(a+b)^n = \sum_{r=0}^{n} \binom{n}{r} a^r b^{n-r}. \qquad (1\text{-}6)$$
The numbers $\binom{n}{r}$ are thus called binomial coefficients.


Returning briefly to the definition of random sampling from a finite population without replacement, there were $N$ objects with $n$ to be selected. There are thus $\binom{N}{n}$ different samples. If the sampling process is random, each possible outcome has probability $1/\binom{N}{n}$ of being the one selected.
Two identities that are often helpful in problem solutions are
$$\binom{n}{r} = \binom{n}{n-r} \qquad (1\text{-}7)$$
and
$$\binom{n}{r} = \binom{n-1}{r-1} + \binom{n-1}{r}. \qquad (1\text{-}8)$$
To develop the result shown in equation 1-7, we note that
$$\binom{n}{n-r} = \frac{n!}{(n-r)!\,[n-(n-r)]!} = \frac{n!}{r!(n-r)!} = \binom{n}{r},$$
and to develop the result shown in equation 1-8, we expand the right-hand side and collect terms.
To verify that a finite collection of $n$ elements has $2^n$ subsets, as indicated earlier, we see that
$$2^n = (1+1)^n = \sum_{r=0}^{n} \binom{n}{r} = \binom{n}{0} + \binom{n}{1} + \cdots + \binom{n}{n}$$
from equation 1-6. The right side of this relationship gives the total number of subsets, since $\binom{n}{0}$ is the number of subsets with zero elements, $\binom{n}{1}$ is the number with one element, ..., and $\binom{n}{n}$ is the number with $n$ elements.
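These relations are easy to spot-check numerically; a minimal sketch, assuming equations 1-7 and 1-8 are the symmetry and addition identities as reconstructed above:

```python
import math

n, r = 10, 4
# Equation 1-5: C(n, r) = n! / (r! (n - r)!)
assert math.comb(n, r) == math.factorial(n) // (math.factorial(r) * math.factorial(n - r))
# Equation 1-7: C(n, r) = C(n, n - r)
assert math.comb(n, r) == math.comb(n, n - r)
# Equation 1-8: C(n, r) = C(n-1, r-1) + C(n-1, r)
assert math.comb(n, r) == math.comb(n - 1, r - 1) + math.comb(n - 1, r)
# A set of n elements has 2**n subsets (binomial theorem with a = b = 1)
assert sum(math.comb(n, k) for k in range(n + 1)) == 2 ** n
print("all identities verified for n =", n)
```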


Example 1-30
A production lot of size 100 is known to be 5% defective. A random sample of 10 items is selected without replacement. In order to determine the probability that there will be no defectives in the sample, we resort to counting both the number of possible samples and the number of samples favorable to event $A$, where event $A$ is taken to mean that there are no defectives. The number of possible samples is $\binom{100}{10}$. The number "favorable to $A$" is $\binom{5}{0}\binom{95}{10}$, so that
$$P(A) = \frac{\binom{5}{0}\binom{95}{10}}{\binom{100}{10}} = \frac{\frac{5!}{0!\,5!} \cdot \frac{95!}{10!\,85!}}{\frac{100!}{10!\,90!}} = 0.58375.$$

To generalize the preceding example, we consider the case where the population has $N$ items of which $D$ belong to some class of interest (such as defective). A random sample of size $n$ is selected without replacement. If $A$ denotes the event of obtaining exactly $r$ items from the class of interest in the sample, then
$$P(A) = \frac{\binom{D}{r}\binom{N-D}{n-r}}{\binom{N}{n}}, \qquad r = 0, 1, 2, \ldots, \min(n, D).$$
Problems of this type are often referred to as hypergeometric sampling problems.
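A sketch of this probability as a Python function (the function name is ours); the last line reproduces Example 1-30:

```python
from math import comb

def hypergeom(N, D, n, r):
    """P(exactly r items of the class of interest in a sample of n, without replacement)."""
    return comb(D, r) * comb(N - D, n - r) / comb(N, n)

# Example 1-30: lot of 100 that is 5% defective, sample of 10, no defectives
print(hypergeom(100, 5, 10, 0))  # 0.58375
```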

Example 1-31
An NBA basketball team typically has 12 players. A starting team consists of five of these players in no particular order. Thus, there are $\binom{12}{5} = 792$ possible starting teams.

Example 1-32
One obvious application of counting methods lies in calculating probabilities for poker hands. Before proceeding, we remind the reader of some standard terminology. The rank of a particular card drawn from a standard 52-card deck can be 2, 3, ..., 10, J, Q, K, A, while the possible suits are ♣, ♦, ♥, ♠. In poker, we draw five cards at random from a deck. The number of possible hands is $\binom{52}{5} = 2{,}598{,}960$.

1. We first calculate the probability of obtaining two pairs, for example A♥, A♠, 3♥, 3♦, 10♠. We proceed as follows.
(a) Select two ranks (e.g., A, 3). We can do this $\binom{13}{2}$ ways.
(b) Select two suits for the first pair (e.g., ♥, ♠). There are $\binom{4}{2}$ ways.
(c) Select two suits for the second pair (e.g., ♥, ♦). There are $\binom{4}{2}$ ways.
(d) Select the remaining card to complete the hand. There are 44 ways.
Thus, the number of ways to select two pairs is
$$n(\text{two pairs}) = \binom{13}{2}\binom{4}{2}\binom{4}{2}(44) = 123{,}552,$$
and so
$$P(\text{two pairs}) = \frac{123{,}552}{2{,}598{,}960} \approx 0.0475.$$


2. Here we calculate the probability of obtaining a full house (one pair, one three-of-a-kind), for example A♥, A♣, 3♥, 3♦, 3♠.
(a) Select two ordered ranks (e.g., A, 3). There are $P_2^{13} = 13 \cdot 12$ ways. Indeed, the ranks must be ordered, since "three A's, two 3's" differs from "two A's, three 3's."
(b) Select two suits for the pair (e.g., ♥, ♣). There are $\binom{4}{2}$ ways.
(c) Select three suits for the three-of-a-kind (e.g., ♥, ♦, ♠). There are $\binom{4}{3}$ ways.
Thus, the number of ways to select a full house is
$$n(\text{full house}) = 13 \cdot 12 \binom{4}{2}\binom{4}{3} = 3744,$$
and so
$$P(\text{full house}) = \frac{3744}{2{,}598{,}960} \approx 0.00144.$$

3. Finally, we calculate the probability of a flush (all five cards from the same suit). How many ways can we obtain this event?
(a) Select a suit. There are $\binom{4}{1}$ ways.
(b) Select five cards from that suit. There are $\binom{13}{5}$ ways.
Then we have
$$P(\text{flush}) = \frac{\binom{4}{1}\binom{13}{5}}{2{,}598{,}960} = \frac{5148}{2{,}598{,}960} \approx 0.00198.$$
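The three counts are easy to reproduce with the standard-library binomial coefficient; a sketch (note that, as in the text, the flush count of 5148 is not adjusted to exclude straight flushes):

```python
from math import comb

hands = comb(52, 5)                               # 2,598,960 possible hands

two_pairs = comb(13, 2) * comb(4, 2) * comb(4, 2) * 44
full_house = 13 * 12 * comb(4, 2) * comb(4, 3)    # ordered ranks: pair rank, triple rank
flush = comb(4, 1) * comb(13, 5)

print(two_pairs / hands)    # ~0.0475
print(full_house / hands)   # 3744 / 2,598,960 ~ 0.00144
print(flush / hands)        # 5148 / 2,598,960 ~ 0.00198
```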

1-6.5 Permutations of Like Objects

In the event that there are $k$ distinct classes of objects, and the objects within the classes are not distinct, the following result is obtained, where $n_1$ is the number in the first class, $n_2$ is the number in the second class, ..., $n_k$ is the number in the $k$th class, and $n_1 + n_2 + \cdots + n_k = n$: the number of distinct arrangements is
$$\frac{n!}{n_1!\,n_2!\cdots n_k!}. \qquad (1\text{-}10)$$

Example 1-33
Consider the word "TENNESSEE." The number of ways to arrange the letters in this word is
$$\frac{n!}{n_T!\,n_E!\,n_N!\,n_S!} = \frac{9!}{1!\,4!\,2!\,2!} = 3780,$$
where we use the obvious notation. Problems of this type are often referred to as multinomial sampling problems.
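Equation 1-10 translates into a short routine; a sketch (the function name is ours) that reproduces the TENNESSEE count:

```python
from math import factorial

def arrangements(word):
    """n! / (n1! n2! ... nk!): distinct arrangements of the letters (equation 1-10)."""
    counts = {}
    for ch in word:
        counts[ch] = counts.get(ch, 0) + 1
    result = factorial(len(word))
    for c in counts.values():
        result //= factorial(c)
    return result

print(arrangements("TENNESSEE"))  # 3780
```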

The counting methods presented in this section are primarily to support probability
assignment where there is a finite number of equally likely outcomes. It is important to
remember that this is a special case of the more general types of problems encountered in

probability applications.


1-7 CONDITIONAL PROBABILITY

As noted in Section 1-4, an event is associated with a sample space, and the event is represented by a subset of $\mathscr{S}$. The probabilities discussed in Section 1-5 all relate to the entire sample space. We have used the symbol $P(A)$ to denote the probability of these events; however, we could have used the symbol $P(A \mid \mathscr{S})$, read as "the probability of $A$, given the sample space $\mathscr{S}$." In this section we consider the probability of events where the event is conditioned on some subset of the sample space.
Some illustrations of this idea should be helpful. Consider a group of 100 persons of whom 40 are college graduates, 20 are self-employed, and 10 are both college graduates and self-employed. Let $B$ represent the set of college graduates and $A$ represent the set of self-employed, so that $A \cap B$ is the set of college graduates who are self-employed. From the group of 100, one person is to be randomly selected. (Each person is given a number from 1 to 100, and 100 chips with the same numbers are agitated, with one being selected by a blindfolded outsider.) Then $P(A) = 0.2$, $P(B) = 0.4$, and $P(A \cap B) = 0.1$ if the entire sample space is considered. As noted, it may be more instructive to write $P(A \mid \mathscr{S})$, $P(B \mid \mathscr{S})$, and $P(A \cap B \mid \mathscr{S})$ in such a case. Now suppose the following event is considered: self-employed given that the person is a college graduate ($A \mid B$). Obviously the sample space is reduced in that only college graduates are considered (Fig. 1-8). The probability $P(A \mid B)$ is thus given by
$$P(A \mid B) = \frac{n(A \cap B)}{n(B)} = \frac{n(A \cap B)/n}{n(B)/n} = \frac{P(A \cap B)}{P(B)} = \frac{0.1}{0.4} = 0.25.$$
The reduced sample space consists of exactly those outcomes of $\mathscr{S}$ that belong to $B$. Of course, $A \cap B$ satisfies the condition.

As a second illustration, consider the case where a sample of size 2 is randomly selected from a lot of size 10. It is known that the lot has seven good and three bad items. Let $A$ be the event that the first item selected is good, and $B$ be the event that the second item selected is good. If the items are selected without replacement, that is, the first item is not replaced before the second item is selected, then
$$P(A) = \frac{7}{10}$$
and
$$P(B \mid A) = \frac{6}{9}.$$
If the first item is replaced before the second item is selected, then the conditional probability $P(B \mid A) = P(B) = \frac{7}{10}$, and the events $A$ and $B$ resulting from the two selection experiments comprising $\mathscr{E}$ are said to be independent. A formal definition of conditional probability

Figure 1-8 Conditional probability. (a) Initial sample space. (b) Reduced sample space.


$P(A \mid B)$ will be given later, and independence will be discussed in detail. The following examples will help to develop some intuitive feeling for conditional probability.

Example 1-34
Recall Example 1-11, where two dice are tossed, and assume that each die is true. The 36 possible outcomes were enumerated. If we consider two events $A$ and $B$, with
$$B = \{(d_1, d_2): d_2 \ge d_1\},$$
where $d_1$ is the value of the up face of the first die and $d_2$ is the value of the up face of the second die, then $P(A)$, $P(B)$, $P(A \cap B)$, $P(B \mid A)$, and $P(A \mid B)$ may all be obtained from a direct consideration of the sample space and the counting of outcomes. Note that
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$
and
$$P(B \mid A) = \frac{P(A \cap B)}{P(A)}.$$
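Counting of this kind is easy to mechanize. The sketch below enumerates the 36 outcomes and computes conditional probabilities by counting; $B$ is the event from the example, while $A$ here is a hypothetical illustrative event (the sum of the faces equals 7), since any event defined on the same sample space works the same way:

```python
from fractions import Fraction

# All 36 equally likely outcomes of tossing two true dice (Example 1-11)
space = [(d1, d2) for d1 in range(1, 7) for d2 in range(1, 7)]

B = {(d1, d2) for (d1, d2) in space if d2 >= d1}      # event from the example
A = {(d1, d2) for (d1, d2) in space if d1 + d2 == 7}  # hypothetical event, for illustration

def prob(event):
    return Fraction(len(event), len(space))

print(prob(A & B) / prob(B))  # P(A | B) = 3/21 = 1/7
print(prob(A & B) / prob(A))  # P(B | A) = 3/6  = 1/2
```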

Example 1-35
In World War II an early operations research effort in England was directed at establishing search patterns for U-boats by patrol flights or sorties. For a time, there was a tendency to concentrate the flights on in-shore areas, as it had been believed that more sightings took place in-shore. The research group studied 1000 sortie records with the following result (the data are fictitious):

                 In-shore    Off-shore    Total
Sighting             80          20         100
No sighting         820          80         900
Total sorties       900         100        1000

Let $S_1$: There was a sighting.
$S_2$: There was no sighting.
$B_1$: In-shore sortie.
$B_2$: Off-shore sortie.

We see immediately that
$$P(S_1 \mid B_1) = \frac{80}{900} \approx 0.0889$$
and
$$P(S_1 \mid B_2) = \frac{20}{100} = 0.20,$$
which indicates a search strategy counter to prior practice.

Definition
We may define the conditional probability of event $A$ given event $B$ as
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} \quad \text{if } P(B) > 0. \qquad (1\text{-}11)$$
This definition results from the intuitive notion presented in the preceding discussion. The conditional probability $P(\cdot \mid \cdot)$ satisfies the properties required of probabilities. That is,


1. $0 \le P(A \mid B) \le 1$.
2. $P(\mathscr{S} \mid B) = 1$.
3. $P\left(\bigcup_{i=1}^{n} A_i \,\Big|\, B\right) = \sum_{i=1}^{n} P(A_i \mid B)$ if $A_i \cap A_j = \emptyset$ for $i \ne j$.
4. $P\left(\bigcup_{i=1}^{\infty} A_i \,\Big|\, B\right) = \sum_{i=1}^{\infty} P(A_i \mid B)$ for $A_1, A_2, A_3, \ldots$ a denumerable sequence of disjoint events.


In practice we may solve problems by using equation 1-11 and calculating P(A n B) and
P(B) with respect to the original sample space (as was illustrated in Example 135) or by
considering the probability of A with respect to the reduced sample space B (as was illustrated in Example 134).
A restatement of equation 1-11 leads to what is often called the multiplication rule,
that is
P(A n B) = P(B) . P(AIB),

P(B) >0,

P(A n B) = P(A) . P(BIA),

P(A) > O.

and

(H2)

The second statement is an obvious consequence of equation 1-11 with the conditioning on event $A$ rather than event $B$.
It should be noted that if $A$ and $B$ are mutually exclusive, as indicated in Fig. 1-9, then $A \cap B = \emptyset$, so that $P(A \mid B) = 0$ and $P(B \mid A) = 0$.
In the other extreme, if $B \subset A$, as shown in Fig. 1-10, then $P(A \mid B) = 1$. In the first case, $A$ and $B$ cannot occur simultaneously, so knowledge of the occurrence of $B$ tells us that $A$ does not occur. In the second case, if $B$ occurs, $A$ must occur. On the other hand, there are many cases where the events are totally unrelated, and knowledge of the occurrence of one has no bearing on and yields no information about the other. Consider, for example, the experiment where a true coin is tossed twice. Event $A$ is the event that the first toss results in a "heads," and event $B$ is the event that the second toss results in a "heads." Note that $P(A) = \frac{1}{2}$ since the coin is true, and $P(B \mid A) = \frac{1}{2}$ since the coin is true and it has no memory. The occurrence of event $A$ did not in any way affect the occurrence of $B$, and if we wanted to find the probability of $A$ and $B$ occurring, that is, $P(A \cap B)$, we find that
$$P(A \cap B) = P(A)P(B \mid A) = \frac{1}{2} \cdot \frac{1}{2} = \frac{1}{4}.$$
We may observe that if we had no knowledge about the occurrence or nonoccurrence of $A$, we have $P(B) = P(B \mid A)$, as in this example.

Figure 1-9 Mutually exclusive events.
Figure 1-10 Event B as a subset of A.


Informally speaking, two events are considered to be independent if the probability of the occurrence of one is not affected by the occurrence or nonoccurrence of the other. This leads to the following definition.

Definition
$A$ and $B$ are independent if and only if
$$P(A \cap B) = P(A) \cdot P(B). \qquad (1\text{-}13)$$
An immediate consequence of this definition is that if $A$ and $B$ are independent events, then from equation 1-12,
$$P(A \mid B) = P(A) \quad \text{and} \quad P(B \mid A) = P(B). \qquad (1\text{-}14)$$

The following theorem is sometimes useful. The proof is given here only for the first part.

Theorem 1-7
If $A$ and $B$ are independent events, then the following holds:
1. $A$ and $\bar{B}$ are independent events.
2. $\bar{A}$ and $B$ are independent events.
3. $\bar{A}$ and $\bar{B}$ are independent events.

Proof (Part 1)
$$P(A \cap \bar{B}) = P(A) \cdot P(\bar{B} \mid A) = P(A)\,[1 - P(B \mid A)] = P(A)\,[1 - P(B)] = P(A) \cdot P(\bar{B}).$$

In practice, there are many situations where it may not be easy to determine whether two events are independent; however, there are numerous other cases where the requirements may be either justified or approximated from a physical consideration of the experiment. A sampling experiment will serve to illustrate.

Example 1-36
Suppose a random sample of size 2 is to be selected from a lot of size 100, and it is known that 98 of the 100 items are good. The sample is taken in such a manner that the first item is observed and replaced before the second item is selected. If we let
$A$: First item observed is good,
$B$: Second item observed is good,
and if we want to determine the probability that both items are good, then
$$P(A \cap B) = P(A) \cdot P(B) = \frac{98}{100} \cdot \frac{98}{100} = 0.9604.$$
If the sample is taken "without replacement," so that the first item is not replaced before the second item is selected, then
$$P(A \cap B) = P(A) \cdot P(B \mid A) = \frac{98}{100} \cdot \frac{97}{99} = 0.9602.$$


The results are obviously very close, and one common practice is to assume the events independent when the sampling fraction (sample size/population size) is small, say less than 0.1.

Example 1-37
The field of reliability engineering has developed rapidly since the early 1960s. One type of problem encountered is that of estimating system reliability given subsystem reliabilities. Reliability is defined here as the probability of proper functioning for a stated period of time. Consider the structure of a simple serial system, shown in Fig. 1-11. The system functions if and only if both subsystems function. If the subsystems survive independently, then the system reliability is
$$R_s = R_1 R_2,$$
where $R_1$ and $R_2$ are the reliabilities for subsystems 1 and 2, respectively. For example, if $R_1 = 0.90$ and $R_2 = 0.80$, then $R_s = 0.72$.

Example 1-37 illustrates the need to generalize the concept of independence to more than two events. Suppose the system consisted of three subsystems, or perhaps 20 subsystems. What conditions would be required in order to allow the analyst to obtain an estimate of system reliability by taking the product of the subsystem reliabilities?

Definition
The $k$ events $A_1, A_2, \ldots, A_k$ are mutually independent if and only if the probability of the intersection of any 2, 3, ..., $k$ of these sets is the product of their respective probabilities. Stated more precisely, we require that for $r = 2, 3, \ldots, k$,
$$P(A_{i_1} \cap A_{i_2} \cap \cdots \cap A_{i_r}) = P(A_{i_1}) \cdot P(A_{i_2}) \cdots P(A_{i_r}) = \prod_{j=1}^{r} P(A_{i_j}).$$
In the case of serial system reliability calculations where mutual independence may reasonably be assumed, the system reliability is a product of subsystem reliabilities:
$$R_s = R_1 \cdot R_2 \cdots R_k. \qquad (1\text{-}15)$$
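Under the mutual-independence assumption, equation 1-15 is a one-line computation; the following sketch (the function name is ours) reproduces the numbers of Example 1-37 and of Exercise 1-26 below:

```python
from math import prod

def series_reliability(reliabilities):
    """R_s = R1 * R2 * ... * Rk under mutual independence (equation 1-15)."""
    return prod(reliabilities)

print(series_reliability([0.90, 0.80]))           # 0.72, as in Example 1-37
print(series_reliability([0.995, 0.993, 0.994]))  # ~0.982, as in Exercise 1-26
```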

In the foregoing definition there are $2^k - k - 1$ conditions to be satisfied. Consider three events, $A$, $B$, and $C$. These are mutually independent if and only if $P(A \cap B) = P(A) \cdot P(B)$, $P(A \cap C) = P(A) \cdot P(C)$, $P(B \cap C) = P(B) \cdot P(C)$, and $P(A \cap B \cap C) = P(A) \cdot P(B) \cdot P(C)$. The following example illustrates a case where events are pairwise independent but not mutually independent.

Figure 1-11 A simple serial system.


Example 1-38
Suppose the sample space, with equally likely outcomes, for a particular experiment is as follows:
$$\mathscr{S} = \{(0, 0, 0),\ (0, 1, 1),\ (1, 0, 1),\ (1, 1, 0)\}.$$
Let $A_0$: First digit is zero. $A_1$: First digit is one.
$B_0$: Second digit is zero. $B_1$: Second digit is one.
$C_0$: Third digit is zero. $C_1$: Third digit is one.
It follows that
$$P(A_i) = P(B_j) = P(C_k) = \frac{1}{2}, \qquad i, j, k = 0, 1,$$
and it is easily seen that
$$P(A_i \cap B_j) = \frac{1}{4} = P(A_i) \cdot P(B_j), \qquad i = 0, 1,\ j = 0, 1,$$
$$P(A_i \cap C_j) = \frac{1}{4} = P(A_i) \cdot P(C_j),$$
$$P(B_i \cap C_j) = \frac{1}{4} = P(B_i) \cdot P(C_j).$$
However, we note that
$$P(A_0 \cap B_0 \cap C_0) = \frac{1}{4} \ne P(A_0)P(B_0)P(C_0)$$
and
$$P(A_0 \cap B_0 \cap C_1) = 0 \ne P(A_0)P(B_0)P(C_1),$$
and there are other triplets to which this violation could be extended.
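The pairwise-but-not-mutual behavior of this example can be verified by enumeration; a sketch (the helper names are ours):

```python
from fractions import Fraction

# The four equally likely outcomes of Example 1-38
space = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]

def P(event):
    return Fraction(sum(1 for e in space if event(e)), len(space))

A0 = lambda e: e[0] == 0
B0 = lambda e: e[1] == 0
C0 = lambda e: e[2] == 0

# Pairwise independence holds ...
assert P(lambda e: A0(e) and B0(e)) == P(A0) * P(B0)  # 1/4 = (1/2)(1/2)
assert P(lambda e: A0(e) and C0(e)) == P(A0) * P(C0)
assert P(lambda e: B0(e) and C0(e)) == P(B0) * P(C0)
# ... but mutual independence fails: 1/4 != 1/8
assert P(lambda e: A0(e) and B0(e) and C0(e)) != P(A0) * P(B0) * P(C0)
print("pairwise independent, not mutually independent")
```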

The concept of independent experiments is introduced to complete this section. If we consider two experiments denoted $\mathscr{E}_1$ and $\mathscr{E}_2$, and let $A_1$ and $A_2$ be arbitrary events defined on the respective sample spaces $\mathscr{S}_1$ and $\mathscr{S}_2$ of the two experiments, then the following definition can be given.

Definition
If $P(A_1 \cap A_2) = P(A_1) \cdot P(A_2)$, then $\mathscr{E}_1$ and $\mathscr{E}_2$ are said to be independent experiments.

1-8 PARTITIONS, TOTAL PROBABILITY, AND BAYES' THEOREM

A partition of the sample space may be defined as follows.

Definition
The events $B_1, B_2, \ldots, B_k$ form a partition of the sample space $\mathscr{S}$ if $B_i \cap B_j = \emptyset$ for $i \ne j$ and $\bigcup_{i=1}^{k} B_i = \mathscr{S}$. When the experiment is performed, one and only one of the events $B_i$ occurs if we have a partition of $\mathscr{S}$.


Example 1-39
A particular binary "word" consists of five "bits," $b_1 b_2 b_3 b_4 b_5$, where $b_i = 0, 1$ for $i = 1, 2, 3, 4, 5$. An experiment consists of transmitting a "word," and it follows that there are 32 possible words. The events
$$B_1 = \{(0, 0, 0, 0, 0),\ (0, 0, 0, 0, 1)\},$$
$$B_2 = \{(0, 0, 0, 1, 0),\ (0, 0, 0, 1, 1),\ (0, 0, 1, 0, 0),\ (0, 0, 1, 0, 1),\ (0, 0, 1, 1, 0),\ (0, 0, 1, 1, 1)\},$$
$$B_3 = \{(0, 1, 0, 0, 0),\ (0, 1, 0, 0, 1),\ (0, 1, 0, 1, 0),\ (0, 1, 0, 1, 1),\ (0, 1, 1, 0, 0),\ (0, 1, 1, 0, 1),\ (0, 1, 1, 1, 0),\ (0, 1, 1, 1, 1)\},$$
$$B_4 = \{(1, 0, 0, 0, 0),\ (1, 0, 0, 0, 1),\ (1, 0, 0, 1, 0),\ (1, 0, 0, 1, 1),\ (1, 0, 1, 0, 0),\ (1, 0, 1, 0, 1),\ (1, 0, 1, 1, 0),\ (1, 0, 1, 1, 1)\},$$
$$B_5 = \{(1, 1, 0, 0, 0),\ (1, 1, 0, 0, 1),\ (1, 1, 0, 1, 0),\ (1, 1, 0, 1, 1),\ (1, 1, 1, 0, 0),\ (1, 1, 1, 0, 1),\ (1, 1, 1, 1, 0)\},$$
$$B_6 = \{(1, 1, 1, 1, 1)\}$$
form a partition of the sample space.

In general, if $k$ events $B_i$ ($i = 1, 2, \ldots, k$) form a partition and $A$ is an arbitrary event with respect to $\mathscr{S}$, then we may write
$$A = (A \cap B_1) \cup (A \cap B_2) \cup \cdots \cup (A \cap B_k),$$
so that
$$P(A) = P(A \cap B_1) + P(A \cap B_2) + \cdots + P(A \cap B_k),$$
since the events $A \cap B_i$ are pairwise mutually exclusive. (See Fig. 1-12 for $k = 4$.) It does not matter that $A \cap B_i = \emptyset$ for some or all of the $i$, since $P(\emptyset) = 0$.
Using the results of equation 1-12 we can state the following theorem.

Theorem 1-8
If $B_1, B_2, \ldots, B_k$ represents a partition of $\mathscr{S}$ and $A$ is an arbitrary event on $\mathscr{S}$, then the total probability of $A$ is given by
$$P(A) = P(B_1) \cdot P(A \mid B_1) + P(B_2) \cdot P(A \mid B_2) + \cdots + P(B_k) \cdot P(A \mid B_k) = \sum_{i=1}^{k} P(B_i)P(A \mid B_i).$$

The result of Theorem 1-8, also known as the law of total probability, is very useful, as there are numerous practical situations in which $P(A)$ cannot be computed directly. However,


with the information that $B_i$ has occurred, it is possible to evaluate $P(A \mid B_i)$ and thus determine $P(A)$ when the values $P(B_i)$ are obtained.
Another important result of the total probability law is known as Bayes' theorem.

Theorem 1-9
If $B_1, B_2, \ldots, B_k$ constitute a partition of the sample space $\mathscr{S}$ and $A$ is an arbitrary event on $\mathscr{S}$, then for $r = 1, 2, \ldots, k$,
$$P(B_r \mid A) = \frac{P(B_r)P(A \mid B_r)}{\sum_{i=1}^{k} P(B_i)P(A \mid B_i)}. \qquad (1\text{-}16)$$

Proof
$$P(B_r \mid A) = \frac{P(B_r \cap A)}{P(A)} = \frac{P(B_r)P(A \mid B_r)}{\sum_{i=1}^{k} P(B_i)P(A \mid B_i)}.$$
The numerator is a result of equation 1-12 and the denominator is a result of Theorem 1-8.

Example 1-40
Three facilities supply microprocessors to a manufacturer of telemetry equipment. All are supposedly made to the same specifications. However, the manufacturer has for several years tested the microprocessors, and records indicate the following information:

Supplying Facility    Fraction Defective    Fraction Supplied
        1                   0.02                  0.15
        2                   0.01                  0.80
        3                   0.03                  0.05

The manufacturer has stopped testing because of the costs involved, and it may be reasonably assumed that the fractions that are defective and the inventory mix are the same as during the period of record keeping. The director of manufacturing randomly selects a microprocessor, takes it to the test department, and finds that it is defective. If we let $A$ be the event that an item is defective, and $B_i$ be the event that the item came from facility $i$ ($i = 1, 2, 3$), then we can evaluate $P(B_i \mid A)$. Suppose, for instance, that we are interested in determining $P(B_3 \mid A)$. Then
$$P(B_3 \mid A) = \frac{P(B_3) \cdot P(A \mid B_3)}{P(B_1)P(A \mid B_1) + P(B_2)P(A \mid B_2) + P(B_3)P(A \mid B_3)} = \frac{(0.05)(0.03)}{(0.15)(0.02) + (0.80)(0.01) + (0.05)(0.03)} = \frac{3}{25}.$$
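The arithmetic of Theorem 1-8 and equation 1-16 for this example can be checked directly; a sketch using the table's fractions (the dictionary names are ours):

```python
# Bayes' theorem (equation 1-16) applied to Example 1-40
fraction_supplied = {1: 0.15, 2: 0.80, 3: 0.05}   # P(B_i)
fraction_defective = {1: 0.02, 2: 0.01, 3: 0.03}  # P(A | B_i)

# Theorem 1-8: total probability that a randomly selected item is defective
p_defective = sum(fraction_supplied[i] * fraction_defective[i] for i in (1, 2, 3))

# Posterior probability that a defective item came from facility 3
p_b3_given_a = fraction_supplied[3] * fraction_defective[3] / p_defective
print(p_defective)    # 0.0125
print(p_b3_given_a)   # 0.12 = 3/25
```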


1-9 SUMMARY
This chapter has introduced the concept of random experiments, the sample space and events, and has presented a formal definition of probability. This was followed by methods to assign probability to events. Theorems 1-1 to 1-6 provide results for dealing with the probability of special events. Finite sample spaces with their special properties were discussed, and enumeration methods were reviewed for use in assigning probability to events in the case of equally likely experimental outcomes. Conditional probability was defined and illustrated, along with the concept of independent events. In addition, we considered partitions of the sample space, total probability, and Bayes' theorem. The concepts presented in this chapter form an important background for the rest of the book.

1-10 EXERCISES

1-1. Television sets are given a final inspection following assembly. Three types of defects are identified, critical, major, and minor defects, and are coded A, B, and C, respectively, by a mail-order house. Data are analyzed with the following results.

Sets having only critical defects                2%
Sets having only major defects                   5%
Sets having only minor defects                   7%
Sets having only critical and major defects      3%
Sets having only critical and minor defects      4%
Sets having only major and minor defects         3%
Sets having all three types of defects           1%

(a) What fraction of the sets has no defects?
(b) Sets with either critical defects or major defects (or both) get a complete rework. What fraction falls in this category?

1-2. Illustrate the following properties by shadings or colors on Venn diagrams.
(a) Associative laws: $A \cup (B \cup C) = (A \cup B) \cup C$, $A \cap (B \cap C) = (A \cap B) \cap C$.
(b) Distributive laws: $A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$, $A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$.
(c) If $A \subset B$, then $A \cap B = A$.
(d) If $A \subset B$, then $A \cup B = B$.
(e) If $A \cap B = \emptyset$, then $A \subset \bar{B}$.
(f) If $A \subset B$ and $B \subset C$, then $A \subset C$.

1-3. Consider a universal set consisting of the integers 1 through 10, or $U = \{1, 2, 3, 4, 5, 6, 7, 8, 9, 10\}$. Let $A = \{2, 3, 4\}$, $B = \{3, 4, 5\}$, and $C = \{5, 6, 7\}$. By enumeration, list the membership of the following sets.
(a) $A \cap B$.
(b) $A \cup B$.
(c) $A \cap \bar{B}$.
(d) $A \cap (B \cap C)$.
(e) $A \cap (B \cup C)$.

1-4. A flexible circuit is selected at random from a production run of 1000 circuits. Manufacturing defects are classified into three different types, labeled A, B, and C. Type A defects occur 2% of the time, type B defects occur 1% of the time, and type C defects occur 1.5% of the time. Furthermore, it is known that 0.5% have both type A and B defects, 0.6% have both A and C defects, and 0.4% have B and C defects, while 0.2% have all three defects. What is the probability that the flexible circuit selected has at least one of the three types of defects?
1-5. In a human-factors laboratory, the reaction times of human subjects are measured as the elapsed time from the instant a position number is displayed on a digital display until the subject presses a button located at the position indicated. Two subjects are involved, and times are measured in seconds for each subject ($t_1$, $t_2$). What is the sample space for this experiment? Present the following events as subsets and mark them on a diagram: $(t_1 + t_2)/2 \le 0.15$, $\max(t_1, t_2) \le 0.15$, $|t_1 - t_2| \le 0.06$.

1-6. During a 24-hour period, a computer is to be accessed at time $X$, used for some processing, and exited at time $Y \ge X$. Take $X$ and $Y$ to be measured in hours on the time line with the beginning of the 24-hour period as the origin. The experiment is to observe $X$ and $Y$.
(a) Describe the sample space $\mathscr{S}$.
(b) Sketch the following events in the $X$, $Y$ plane.
(i) The time of use is 1 hour or less.
(ii) The access is before $t_1$ and the exit is after $t_2$, where $0 \le t_1 < t_2 \le 24$.
(iii) The time of use is less than 20% of the period.
1-7. Diodes from a batch are tested one at a time and marked either defective or nondefective. This is continued until either two defective items are found or five items have been tested. Describe the sample space for this experiment.
1-8. A set has four elements, $A = \{a, b, c, d\}$. Describe the power set $\{0, 1\}^A$.


1-9. Describe the sample space for each of the following experiments.
(a) A lot of 120 battery lids for pacemaker cells is known to contain a number of defectives because of a problem with the barrier material applied to the glassed-in feed-through. Three lids are randomly selected (without replacement) and are carefully inspected following a cut down.
(b) A pallet of 10 castings is known to contain one defective and nine good units. Four castings are randomly selected (without replacement) and inspected.

1-10. The production manager of a certain company is interested in testing a finished product, which is available in lots of size 50. She would like to rework a lot if she can be reasonably sure that 10% of the items are defective. She decides to select a random sample of 10 items without replacement and rework the lot if it contains one or more defective items. Does this procedure seem reasonable?
1-11. A trucking firm has a contract to ship a load of goods from city W to city Z. There are no direct routes connecting W to Z, but there are six roads from W to X and five roads from X to Z. How many total routes are there to be considered?
1-12. A state has one million registered vehicles and is considering using license plates with six symbols where the first three are letters and the last three are digits. Is this scheme feasible?

1-13. The manager of a small plant wishes to determine the number of ways he can assign workers to the first shift. He has 15 workers who can serve as operators of the production equipment, eight who can serve as maintenance personnel, and four who can be supervisors. If the shift requires six operators, two maintenance personnel, and one supervisor, how many ways can the first shift be manned?
1-14. A production lot has 100 units of which 20 are known to be defective. A random sample of four units is selected without replacement. What is the probability that the sample will contain no more than two defective units?

1-15. In inspecting incoming lots of merchandise, the following inspection rule is used where the lots contain 300 units: A random sample of 10 items is selected. If there is no more than one defective item in the sample, the lot is accepted. Otherwise it is returned to the vendor. If the fraction defective in the original lot is $p'$, determine the probability of accepting the lot as a function of $p'$.

1-16. In a plastics plant 12 pipes empty different chemicals into a mixing vat. Each pipe has a five-position gauge that measures the rate of flow into the vat. One day, while experimenting with various mixtures, a solution is obtained that emits a poisonous gas. The settings on the gauges were not recorded. What is the probability of obtaining this same solution when randomly experimenting again?
1-17. Eight equally skilled men and women are applying for two jobs. Because the two new employees must work closely together, their personalities should be compatible. To achieve this, the personnel manager has administered a test and must compare the scores for each possibility. How many comparisons must the manager make?

1-18. By accident, a chemist combined two laboratory substances that yielded a desirable product. Unfortunately, her assistant did not record the names of the ingredients. There are forty substances available in the lab. If the two in question must be located by successive trial-and-error experiments, what is the maximum number of tests that might be made?
1-19. Suppose, in the previous problem, a known catalyst was used in the first accidental reaction. Because of this, the order in which the ingredients are mixed is important. What is the maximum number of tests that might be made?

1-20. A company plans to build five additional warehouses at new locations. Ten locations are under consideration. How many total possible choices are there for the set of five locations?
1-21. Washing machines can have five kinds of major and five kinds of minor defects. In how many ways can one major and one minor defect occur? In how many ways can two major and two minor defects occur?
1-22. Consider the diagram at the top of the next page of an electronic system, which shows the probabilities of the system components operating properly. The entire system operates if assembly III and at least one of the components in each of assemblies I and II operates. Assume that the components of each assembly operate independently and that the assemblies operate independently. What is the probability that the entire system operates?
1-23. How is the probability of system operation affected if, in the foregoing problem, the probability of successful operation for the component in assembly III changes from 0.99 to 0.9?
1-24. Consider the series-parallel assembly shown below. The values $R_i$ ($i = 1, 2, 3, 4, 5$) are the reliabilities for the five components shown, that is, $R_i$ = probability that unit $i$ will function properly. The components operate (and fail) in a mutually independent manner and the assembly fails only when the path from A to B is broken. Express the assembly reliability as a function of $R_1$, $R_2$, $R_3$, $R_4$, and $R_5$.
1-25. A political prisoner is to be exiled to either Siberia or the Urals. The probabilities of being sent to these places are 0.6 and 0.4, respectively. It is also known that if a resident of Siberia is selected at random the probability is 0.5 that he will be wearing a fur coat, whereas the probability is 0.7 that a resident of the Urals will be wearing one. Upon arriving in exile, the first person the prisoner sees is not wearing a fur coat. What is the probability he is in Siberia?
1-26. A braking device designed to prevent automobile skids may be broken down into three series subsystems that operate independently: an electronics system, a hydraulic system, and a mechanical activator. On a particular braking, the reliabilities of these units are approximately 0.995, 0.993, and 0.994, respectively. Estimate the system reliability.
1-27. Two balls are drawn from an urn containing $m$ balls numbered from 1 to $m$. The first ball is kept if it is numbered 1 and returned to the urn otherwise. What is the probability that the second ball drawn is numbered 2?
1-28. Two digits are selected at random from the digits 1 through 9 and the selection is without replacement (the same digit cannot be picked on both selections). If the sum of the two digits is even, find the probability that both digits are odd.
1-29. At a certain university, 20% of the men and 1% of the women are over 6 feet tall. Furthermore, 40% of the students are women. If a student is randomly picked and is observed to be over 6 feet tall, what is the probability that the student is a woman?
1-30. At a machine center there are four automatic screw machines. An analysis of past inspection records yields the following data.

Machine    Percent Production    Percent Defectives Produced
   1               15                         4
   2               30                         3
   3               20                         5
   4               35                         2

Machines 2 and 4 are newer, and more production has been assigned to them than to machines 1 and 3. Assume that the current inventory mix reflects the production percentages indicated.
(a) If a screw is randomly picked from inventory, what is the probability that it will be defective?
(b) If a screw is picked and found to be defective, what is the probability that it was produced by machine 3?
1-31. A point is selected at random inside a circle. What is the probability that the point is closer to the center than to the circumference?

1-32. Complete the details of the proof for Theorem 1-4 in the text.

1-33. Prove Theorem 1-5.

1-34. Prove the second and third parts of Theorem 1-7.
1-35. Suppose there are $n$ people in a room. If a list is made of all their birthdays (the specific month and day of the month), what is the probability that two or more persons have the same birthday? Assume there are 365 days in the year and that each day is equally likely to occur for any person's birthday. Let $B$ be the event that two or more persons have the same birthday. Find $P(B)$ and $P(\bar{B})$ for $n$ = 10, 20, 21, 22, 23, 24, 25, 30, 40, 50, and 60.

1-36. In a certain dice game, players continue to throw two dice until they either win or lose. The player wins on the first throw if the sum of the two upturned faces is either 7 or 11, and loses if the sum is 2, 3, or 12. Otherwise, the sum of the faces becomes the player's "point." The player continues to throw until the first succeeding throw on which he makes his point (in which case he wins), or until he throws a 7 (in which case he loses). What is the probability that the player with the dice eventually wins the game?
1-37. The industrial engineering department of the XYZ Company is performing a work sampling study on eight technicians. The engineer wishes to randomize the order in which he visits the technicians' work areas. In how many ways may he arrange these visits?

1-38. A hiker leaves point A shown in the figure below, choosing at random one path from AB, AC, AD, and AE. At each subsequent junction she chooses another path at random. What is the probability that she arrives at point X?

1-39. Three printers do work for the publications office of Georgia Tech. The publications office does not negotiate a contract penalty for late work, and the data below reflect a large amount of experience with these printers.

Printer i    Fraction of Contracts Held    Fraction of Deliveries More than One Month Late
    1                  0.2                                    0.2
    2                  0.3                                    0.5
    3                  0.5                                    0.3

A department observes that its recruiting booklet is more than a month late. What is the probability that the contract is held by printer 3?


1-40. Following aircraft accidents, a detailed investigation is conducted. The probability that an accident due to structural failure is correctly identified is 0.9, and the probability that an accident not due to structural failure is incorrectly identified as being due to structural failure is 0.2. If 25% of all aircraft accidents are due to structural failures, find the probability that an aircraft accident is due to structural failure given that it has been diagnosed as due to structural failure.

Chapter 2
One-Dimensional Random Variables

2-1 INTRODUCTION

The objectives of this chapter are to introduce the concept of random variables, to define and illustrate probability distributions and cumulative distribution functions, and to present useful characterizations for random variables.
When describing the sample space of a random experiment, it is not necessary to specify that an individual outcome be a number. In several examples we observe this, such as in Example 1-9, where a true coin is tossed three times and the sample space is $\mathscr{S}$ = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}, or Example 1-15, where probes of solder joints yield a sample space $\mathscr{S}$ = {GG, GD, DG, DD}.
In most experimental situations, however, we are interested in numerical outcomes. For example, in the illustration involving coin tossing we might assign some real number $x$ to every element of the sample space. In general, we want to assign a real number $x$ to every outcome $e$ of the sample space $\mathscr{S}$. A functional notation will be used initially, so that $x = X(e)$, where $X$ is the function. The domain of $X$ is $\mathscr{S}$, and the numbers in the range are real numbers. The function $X$ is called a random variable. Figure 2-1 illustrates the nature of this function.

Definition
If $\mathscr{E}$ is an experiment having sample space $\mathscr{S}$, and $X$ is a function that assigns a real number $X(e)$ to every outcome $e \in \mathscr{S}$, then $X(e)$ is called a random variable.

Example 2-1
Consider the coin-tossing experiment discussed in the preceding paragraphs. If $X$ is the number of heads showing, then $X(HHH) = 3$, $X(HHT) = 2$, $X(HTH) = 2$, $X(HTT) = 1$, $X(THH) = 2$, $X(THT) = 1$, $X(TTH) = 1$, and $X(TTT) = 0$. The range space is $R_X = \{x: x = 0, 1, 2, 3\}$ in this example (see Fig. 2-2).
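The way $X$ induces probabilities on $R_X$, discussed in the paragraphs that follow, can be computed mechanically; a minimal sketch for the random variable of Example 2-1 (the names are ours):

```python
from itertools import product
from fractions import Fraction

# The eight equally likely outcomes of tossing a true coin three times
space = ["".join(t) for t in product("HT", repeat=3)]

def X(e):
    return e.count("H")  # the random variable of Example 2-1

# Probabilities induced on the range space R_X = {0, 1, 2, 3}
px = {x: Fraction(sum(1 for e in space if X(e) == x), len(space)) for x in range(4)}
print(px)  # {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}
```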

Figure 2-1 The concept of a random variable. (a) $\mathscr{S}$: the sample space of $\mathscr{E}$. (b) $R_X$: the range space of $X$.

Figure 2-2 The number of heads in three coin tosses (each outcome in $\mathscr{S}$ maps to a value 0, 1, 2, or 3 in $R_X$).

The reader should recall that for all functions, and for every element in the domain, there is exactly one value in the range. In the case of the random variable, for every outcome $e \in \mathscr{S}$ there corresponds exactly one value $X(e)$. It should be noted that different values of $e$ may lead to the same $x$, as was the case where $X(TTH) = 1$, $X(THT) = 1$, and $X(HTT) = 1$ in the preceding example.
Where the outcome in $\mathscr{S}$ is already the numerical characteristic desired, then $X(e) = e$, the identity function. Example 1-13, in which a cathode ray tube was aged to failure, is a good example. Recall that $\mathscr{S} = \{t: t \ge 0\}$. If $X$ is the time to failure, then $X(t) = t$. Some authors call this type of sample space a numerical-valued phenomenon.
The range space, $R_X$, is made up of all possible values of $X$, and in subsequent work it will not be necessary to indicate the functional nature of $X$. Here we are concerned with events that are associated with $R_X$, and the random variable $X$ will induce probabilities onto these events. If we return again to the coin-tossing experiment for illustration and assume the coin to be true, there are eight equally likely outcomes: HHH, HHT, HTH, HTT, THH, THT, TTH, and TTT, each having probability $\frac{1}{8}$. Now suppose $A$ is the event "exactly two heads" and, as previously, we let $X$ represent the number of heads (see Fig. 2-2). The event $(X = 2)$ relates to $R_X$, not to $\mathscr{S}$; however, $P_X(X = 2) = P(A) = \frac{3}{8}$, since $A = \{HHT, HTH, THH\}$ is the equivalent event in $\mathscr{S}$, and probability was defined on events in the sample space. The random variable $X$ induced the probability of $\frac{3}{8}$ to the event $(X = 2)$. Note that parentheses will be used to denote an event in the range of the random variable, and in general we will write $P_X(X = x)$.
In order to generalize this notion, consider the following definition.

Definition
If $\mathscr{S}$ is the sample space of an experiment $\mathscr{E}$ and a random variable $X$ with range space $R_X$ is defined on $\mathscr{S}$, and furthermore if event $A$ is an event in $\mathscr{S}$ while event $B$ is an event in $R_X$, then $A$ and $B$ are equivalent events if
$$A = \{e \in \mathscr{S}: X(e) \in B\}.$$
Figure 2-3 illustrates this concept.
More simply, if event $A$ in $\mathscr{S}$ consists of all outcomes in $\mathscr{S}$ for which $X(e) \in B$, then $A$ and $B$ are equivalent events. Whenever $A$ occurs, $B$ occurs; and whenever $B$ occurs, $A$ occurs. Note that $A$ and $B$ are associated with different spaces.

Figure 2-3 Equivalent events.

Definition
If $A$ is an event in the sample space and $B$ is an event in the range space $R_X$ of the random variable $X$, then we define the probability of $B$ as
$$P_X(B) = P(A), \quad \text{where } A = \{e \in \mathscr{S}: X(e) \in B\}.$$
With this definition, we may assign probabilities to events in $R_X$ in terms of probabilities defined on events in $\mathscr{S}$, and we will suppress the function $X$, so that $P_X(X = 2) = \frac{3}{8}$ in the familiar coin-tossing example means that there is an event $A = \{HHT, HTH, THH\} = \{e: X(e) = 2\}$ in the sample space with probability $\frac{3}{8}$. In subsequent work, we will not deal with the nature of the function $X$, since we are interested in the values of the range space and their associated probabilities. While outcomes in the sample space may not be real numbers, it is noted again that all elements of the range of $X$ are real numbers.
An alternative but similar approach makes use of the inverse of the function $X$. We would simply define $X^{-1}(B)$ as
$$X^{-1}(B) = \{e \in \mathscr{S}: X(e) \in B\}$$
so that
$$P_X(B) = P(X^{-1}(B)) = P(A).$$

Table 2-1 Equivalent Events

Some Events in $R_Y$    Equivalent Events in $\mathscr{S}$                           Probability
Y = 2      {(1, 1)}                                                           1/36
Y = 3      {(1, 2), (2, 1)}                                                   2/36
Y = 4      {(1, 3), (2, 2), (3, 1)}                                           3/36
Y = 5      {(1, 4), (2, 3), (3, 2), (4, 1)}                                   4/36
Y = 6      {(1, 5), (2, 4), (3, 3), (4, 2), (5, 1)}                           5/36
Y = 7      {(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)}                   6/36
Y = 8      {(2, 6), (3, 5), (4, 4), (5, 3), (6, 2)}                           5/36
Y = 9      {(3, 6), (4, 5), (5, 4), (6, 3)}                                   4/36
Y = 10     {(4, 6), (5, 5), (6, 4)}                                           3/36
Y = 11     {(5, 6), (6, 5)}                                                   2/36
Y = 12     {(6, 6)}                                                           1/36


The following examples illustrate the sample space-range space relationship, and the
concern with the range space rather than the sample space is evident, since numerical
results are of interest.

Example 2-2
Consider the tossing of two true dice as described in Example 1-11. (The sample space was described in Chapter 1.) Suppose we define a random variable $Y$ as the sum of the "up" faces. Then $R_Y = \{2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12\}$, and the probabilities are $\frac{1}{36}, \frac{2}{36}, \frac{3}{36}, \frac{4}{36}, \frac{5}{36}, \frac{6}{36}, \frac{5}{36}, \frac{4}{36}, \frac{3}{36}, \frac{2}{36}, \frac{1}{36}$, respectively. Table 2-1 shows equivalent events. The reader will recall that there are 36 outcomes, which, since the dice are true, are equally likely.

Example 2-3
One hundred cardiac pacemakers were placed on life test in a saline solution held as close to body temperature as possible. The test is functional, with pacemaker output monitored by a system providing for output signal conversion to digital form for comparison against a design standard. The test was initiated on July 1, 1997. When a pacer output varies from the standard by as much as 10%, this is considered a failure, and the computer records the date and the time of day $(d, t)$. If $X$ is the random variable "time to failure," then $\mathscr{S} = \{(d, t): d = \text{date}, t = \text{time}\}$ and $R_X = \{x: x \ge 0\}$. The random variable $X$ is the total number of elapsed time units since the module went on test. We will deal directly with $X$ and its probability law. This concept will be discussed in the following sections.

2-2 THE DISTRIBUTION FUNCTION

As a convention, we will use the lowercase of the same letter to denote a particular value of a random variable. Thus $(X = x)$, $(X \ge x)$, and $(X \le x)$ are events in the range space of the random variable $X$, where $x$ is a real number. The probability of the event $(X \le x)$ may be expressed as a function of $x$ as
$$F_X(x) = P_X(X \le x). \qquad (2\text{-}1)$$
This function $F_X$ is called the distribution function or cumulative function or cumulative distribution function (CDF) of the random variable $X$.

Example 2-4
In the case of the coin-tossing experiment, the random variable $X$ assumed four values, 0, 1, 2, 3, with probabilities $\frac{1}{8}, \frac{3}{8}, \frac{3}{8}, \frac{1}{8}$. We can state $F_X(x)$ as follows:
$$F_X(x) = 0, \quad x < 0,$$
$$= \tfrac{1}{8}, \quad 0 \le x < 1,$$
$$= \tfrac{4}{8}, \quad 1 \le x < 2,$$
$$= \tfrac{7}{8}, \quad 2 \le x < 3,$$
$$= 1, \quad x \ge 3.$$
A graphical representation is as shown in Fig. 2-4.
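To make the step-function behavior concrete, here is a minimal Python sketch of this CDF (the helper name F is ours); it returns the level of the most recent jump at or below x, so that F is right-continuous:

```python
import bisect

# Step-function CDF of Example 2-4: jumps at 0, 1, 2, 3 with levels 1/8, 4/8, 7/8, 1
jumps = [0, 1, 2, 3]
levels = [1 / 8, 4 / 8, 7 / 8, 1.0]

def F(x):
    i = bisect.bisect_right(jumps, x)  # number of jump points at or below x
    return 0.0 if i == 0 else levels[i - 1]

print(F(-0.5), F(0), F(1.7), F(3.2))  # 0.0 0.125 0.5 1.0
```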

Figure 2-4 A distribution function for the number of heads in tossing three true coins.

Example 2-5
Again recall Example 1-13, where a cathode ray tube is aged to failure. Now $\mathscr{S} = \{t: t \ge 0\}$, and if we let $X$ represent the elapsed time in hours to failure, then the event $(X \le x)$ is in the range space of $X$. A mathematical model that assigns probability to $(X \le x)$ is
$$F_X(x) = 0, \quad x \le 0,$$
$$= 1 - e^{-\lambda x}, \quad x > 0,$$
where $\lambda$ is a positive number called the failure rate (failures/hour). The use of this "exponential" model in practice depends on certain assumptions about the failure process. These assumptions will be presented in more detail later. A graphical representation of the cumulative distribution of the time to failure for the CRT is shown in Fig. 2-5.

Example 2-6
A customer enters a bank where there is a common waiting line for all tellers, with the individual at the head of the line going to the first teller that becomes available. Thus, as the customer enters, the waiting time prior to moving to the teller is assigned to the random variable $X$. If there is no one in line at the time of arrival and a teller is free, the waiting time is zero; but if others are waiting or all tellers are busy, then the waiting time will assume some positive value. Although the mathematical form of $F_X$ depends on assumptions about this service system, a general graphical representation is as illustrated in Fig. 2-6.

Cumulative distribution functions have the following properties, which follow directly from the definition:
1. $0 \le F_X(x) \le 1$, $-\infty < x < \infty$.

Figure 2-5 Distribution function of time to failure for CRT.


Figure 2-6 Waiting-time distribution function (the saltus at zero equals the probability of no wait).

2. $\lim_{x \to \infty} F_X(x) = 1$, $\lim_{x \to -\infty} F_X(x) = 0$.
3. The function is nondecreasing. That is, if $x_1 \le x_2$, then $F_X(x_1) \le F_X(x_2)$.
4. The function is continuous from the right. That is, for all $x$ and $\delta > 0$,
$$\lim_{\delta \to 0} [F_X(x + \delta) - F_X(x)] = 0.$$

0->0

In reviewing the last three examples, we note that in Example 2-4, the values of x for
which there is an increase in F "x) were integers, and where x is not an integer, then F .j.x)
bas the value that it had at the nearest integer X to the left. In this case, Fx(x) bas a saltus or
jump at the values 0, I, 2, and 3 and proceeds from 0 to I in a series of such jumps. Example 2-5 illustrates a different situation, wbere F x<;t) proceeds smoothly from 0 to 1 and is
continuous everywhere but not differentiable at x = O. Finally Example 2~6 illustrates a situation wbere there is a saltus at X = 0, and for x > O. F;;'x) is continuous.
Using a simplified [ann of results from the !:z..besgue decomposition thect~ it is noted
that we can represent F X<x) as the sum 6!' two component functions, say GP) and HP), or
FP) = Gx(x) + Hx(x),

(2-2)

where GX<x) is eontinuous and Hix) is a right-band continuous step function with jumps
coinciding with those of FP), andHX<- =) =O.lf Gx<;t) =0 for all x, then X is called adiscrete random variable, and if Hx(x) " 0, then X is called a continuous random variable.
Where neither situation bolds, X is called a mixed type random variable, and although this
was illustrated in Example 2-6, this u.'Xt will concentrate on purely discrete and continuous
random variables, since most of the engineering and management applications of statistics
and probability in this book relare either to counting or to simple measurement,

2-3 DISCRETE RANDOM VARIABLES

Although discrete random variables may result from a variety of experimental situations, in engineering and applied science they are often associated with counting. If $X$ is a discrete random variable, then $F_X(x)$ will have at most a countably infinite number of jumps, and $R_X = \{x_1, x_2, \ldots, x_k, \ldots\}$.

Example 2-7
Suppose that the number of working days in a particular year is 250 and that the records of employees are marked for each day they are absent from work. An experiment consists of randomly selecting a record to observe the days marked absent. The random variable $X$ is defined as the number of days absent, so that $R_X = \{0, 1, 2, \ldots, 250\}$. This is an example of a discrete random variable with a finite number of possible values.

Example 2-8
A Geiger counter is connected to a gas tube in such a way that it will record the background radiation count for a selected time interval $[0, t]$. The random variable of interest is the count. If $X$ denotes the random variable, then $R_X = \{0, 1, 2, \ldots, k, \ldots\}$, and we have, at least conceptually, a countably infinite range space (outcomes can be placed in a one-to-one correspondence with the natural numbers), so that the random variable is discrete.

Definition
If $X$ is a discrete random variable, we associate a number $p_X(x_i) = P_X(X = x_i)$ with each outcome $x_i$ in $R_X$ for $i = 1, 2, \ldots, n, \ldots$, where the numbers $p_X(x_i)$ satisfy the following:
1. $p_X(x_i) \ge 0$ for all $i$.
2. $\sum_{i=1}^{\infty} p_X(x_i) = 1$.

We note immediately that
$$p_X(x_i) = F_X(x_i) - F_X(x_{i-1}) \qquad (2\text{-}3)$$
and
$$F_X(x_i) = P_X(X \le x_i) = \sum_{x \le x_i} p_X(x). \qquad (2\text{-}4)$$

The function $p_X$ is called the probability function or probability mass function or probability law of the random variable, and the collection of pairs $[(x_i, p_X(x_i)),\ i = 1, 2, \ldots]$ is called the probability distribution of $X$. The function $p_X$ is usually presented in either tabular, graphical, or mathematical form, as illustrated in the following examples.

Example 2-9
For the coin-tossing experiment of Example 1-9, where $X$ = the number of heads, the probability distribution is given in both tabular and graphical form in Fig. 2-7. It will be recalled that $R_X = \{0, 1, 2, 3\}$.

x          0      1      2      3
p_X(x)    1/8    3/8    3/8    1/8

Figure 2-7 Probability distribution for the coin-tossing experiment.


Example 2-10
Suppose we have a random variable $X$ with a probability distribution given by the relationship
$$p_X(x) = \binom{n}{x} p^x (1-p)^{n-x}, \quad x = 0, 1, \ldots, n,$$
$$= 0, \quad \text{otherwise}, \qquad (2\text{-}5)$$
where $n$ is a positive integer and $0 < p < 1$. This relationship is known as the binomial distribution, and it will be studied in more detail later. Although it would be possible to display this model in graphical or tabular form for particular $n$ and $p$ by evaluating $p_X(x)$ for $x = 0, 1, 2, \ldots, n$, this is seldom done in practice.
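A direct transcription of equation 2-5 into Python, as a sketch; the particular choice $n = 3$, $p = 0.5$ recovers the coin-tossing distribution of Fig. 2-7:

```python
from math import comb

def binomial_pmf(x, n, p):
    """Equation 2-5: p_X(x) = C(n, x) p^x (1 - p)^(n - x)."""
    if x < 0 or x > n:
        return 0.0
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

print([binomial_pmf(x, 3, 0.5) for x in range(4)])  # [0.125, 0.375, 0.375, 0.125]
```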

Example 2-11
Recall the earlier discussion of random sampling from a finite population without replacement. Suppose there are $N$ objects of which $D$ are defective. A random sample of size $n$ is selected without replacement, and if we let $X$ represent the number of defectives in the sample, then
$$p_X(x) = \frac{\binom{D}{x}\binom{N-D}{n-x}}{\binom{N}{n}}, \quad x = 0, 1, 2, \ldots, \min(n, D),$$
$$= 0, \quad \text{otherwise}. \qquad (2\text{-}6)$$
This distribution is known as the hypergeometric distribution. In a particular case, suppose $N = 100$ items, $D = 5$ items, and $n = 4$; then
$$p_X(x) = \frac{\binom{5}{x}\binom{95}{4-x}}{\binom{100}{4}}, \quad x = 0, 1, 2, 3, 4,$$
$$= 0, \quad \text{otherwise}.$$
In the event that either tabular or graphical presentation is desired, this would be as shown in Fig. 2-8; however, unless there is some special reason to use these forms, we will use the mathematical relationship.

Figure 2-8 Some hypergeometric probabilities, $N = 100$, $D = 5$, $n = 4$: $p_X(0) \approx 0.805$, $p_X(1) \approx 0.178$, $p_X(2) \approx 0.014$, $p_X(3) \approx 0.003$, $p_X(4) \approx 0.000$.


Example 2-12
In Example 2-8, where the Geiger counter was prepared for detecting the background radiation count, we might use the following relationship, which has been experimentally shown to be appropriate:
$$p_X(x) = \frac{e^{-\lambda t}(\lambda t)^x}{x!}, \quad x = 0, 1, 2, \ldots, \quad \lambda > 0,$$
$$= 0, \quad \text{otherwise}. \qquad (2\text{-}7)$$
This is called the Poisson distribution, and at a later point it will be derived analytically. The parameter $\lambda$ is the mean rate of "hits" per unit time, and $x$ is the number of these "hits."
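A sketch of equation 2-7 in Python; the rate lam and interval t below are arbitrary illustrative values, and the final check confirms the probabilities sum to essentially 1:

```python
from math import exp, factorial

def poisson_pmf(x, lam, t=1.0):
    """Equation 2-7: p_X(x) = e^(-lam*t) (lam*t)^x / x! for x = 0, 1, 2, ..."""
    if x < 0:
        return 0.0
    m = lam * t
    return exp(-m) * m ** x / factorial(x)

print(poisson_pmf(0, lam=2.0, t=1.5))                          # e^-3 ~ 0.0498
print(sum(poisson_pmf(x, lam=2.0, t=1.5) for x in range(60)))  # ~1.0
```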

These examples have illustrated some discrete probability distributions and alternate means of presenting the pairs $[(x_i, p_X(x_i)),\ i = 1, 2, \ldots]$. In later sections a number of probability distributions will be developed, each from a set of postulates motivated from considerations of real-world phenomena.
A general graphical presentation of the discrete distribution from Example 2-12 is given in Fig. 2-9. This geometric interpretation is often useful in developing an intuitive feeling for discrete distributions. There is a close analogy to mechanics if we consider the probability distribution as a mass of one unit distributed over the real line in amounts $p_X(x_i)$ at points $x_i$, $i = 1, 2, \ldots, n$. Also, utilizing equation 2-4, we note the following useful result, where $b \ge a$:
$$P_X(a < X \le b) = F_X(b) - F_X(a). \qquad (2\text{-}8)$$
Figure 2-9 Geometric interpretation of a probability distribution.

2-4 CONTINUOUS RANDOM VARIABLES

Recall from Section 2-2 that where $H_X(x) \equiv 0$, $X$ is called continuous. Then $F_X(x)$ is continuous, $F_X(x)$ has derivative $f_X(x) = (d/dx)F_X(x)$ for all $x$ (with the exception of possibly a countable number of values), and $f_X(x)$ is piecewise continuous. Under these conditions, the range space $R_X$ will consist of one or more intervals.
An interesting difference from the case of discrete random variables is that for $\delta > 0$,
$$P_X(X = x_0) = \lim_{\delta \to 0}\,[F_X(x_0) - F_X(x_0 - \delta)] = 0. \qquad (2\text{-}9)$$
We define the probability density function $f_X(x)$ as
$$f_X(x) = \frac{d}{dx}F_X(x), \qquad (2\text{-}10)$$


and it follows that
$$F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt. \qquad (2\text{-}11)$$
We also note the close correspondence of this form with equation 2-4, with an integral replacing a summation symbol, and the following properties of $f_X(x)$:
1. $f_X(x) \ge 0$ for all $x \in R_X$.
2. $\int_{R_X} f_X(x)\,dx = 1$.
3. $f_X(x)$ is piecewise continuous.
4. $f_X(x) = 0$ if $x$ is not in the range $R_X$.
These concepts are illustrated in Fig. 2-10. This definition of a density function stipulates a function $f_X$ defined on $R_X$ such that
$$P(e \in \mathscr{S}: a \le X(e) \le b) = \int_a^b f_X(x)\,dx, \qquad (2\text{-}12)$$
where $e$ is an outcome in the sample space. We are concerned only with $R_X$ and $f_X$. It is important to realize that $f_X(x)$ does not represent the probability of anything, and that only when the function is integrated between two points does it yield a probability.
Some comments about equation 2-9 may be useful, as this result may be counterintuitive. If we allow $X$ to assume all values in some interval, then $P_X(X = x_0) = 0$ is not equivalent to saying that the event $(X = x_0)$ in $R_X$ is impossible. Recall that if $A = \emptyset$, then $P(A) = 0$; however, although $P_X(X = x_0) = 0$, the fact that the set $A = \{x: X = x_0\}$ is not empty clearly indicates that the converse is not true.
An immediate result of this is that $P_X(a \le X \le b) = P_X(a < X \le b) = P_X(a < X < b) = P_X(a \le X < b)$, where $X$ is continuous, all given by $F_X(b) - F_X(a)$.

Figure 2-10 Hypothetical probability density function.

Example 2-13
The time to failure of the cathode ray tube described in Example 1-13 has the following probability density function:
$$f_T(t) = \lambda e^{-\lambda t}, \quad t \ge 0,$$
$$= 0, \quad \text{otherwise},$$
where $\lambda > 0$ is a constant known as the failure rate. This probability density function is called the exponential density, and experimental evidence has indicated that it is appropriate to describe the time


to failure (a real-world occurrence) for some types of components. In this example suppose we want to find $P_T(T \ge 100 \text{ hours})$. This is equivalent to stating $P_T(100 \le T < \infty)$, and
$$P_T(T \ge 100) = \int_{100}^{\infty} \lambda e^{-\lambda t}\,dt = e^{-100\lambda}.$$
We might again employ the concept of conditional probability and determine $P_T(T \ge 100 \mid T > 99)$, the probability that the tube lives at least 100 hours given that it has lived beyond 99 hours. From our earlier work,
$$P_T(T \ge 100 \mid T > 99) = \frac{P_T(T \ge 100 \text{ and } T > 99)}{P_T(T > 99)} = \frac{e^{-100\lambda}}{e^{-99\lambda}} = e^{-\lambda}.$$
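These probabilities are easy to evaluate numerically from the closed forms above; in this sketch, $\lambda = 0.001$ is an arbitrary illustrative failure rate:

```python
from math import exp

lam = 0.001  # arbitrary illustrative failure rate (failures per hour)

p_at_least_100 = exp(-100 * lam)  # P(T >= 100) = e^(-100*lam)
p_beyond_99 = exp(-99 * lam)      # P(T > 99)  = e^(-99*lam)

print(p_at_least_100)                # ~0.9048
print(p_at_least_100 / p_beyond_99)  # the conditional probability above
print(exp(-lam))                     # matches e^(-lam), the memoryless property
```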

Example 2-14
A random variable $X$ has the triangular probability density function given below and shown graphically in Fig. 2-11:
$$f_X(x) = x, \quad 0 \le x < 1,$$
$$= 2 - x, \quad 1 \le x < 2,$$
$$= 0, \quad \text{otherwise}.$$
The following are calculated for illustration:
1. $P_X\left(-1 < X \le \frac{1}{2}\right) = \int_{-1}^{0} 0\,dx + \int_{0}^{1/2} x\,dx = \frac{1}{8}$.
2. $P_X\left(X \le \frac{3}{2}\right) = \int_{0}^{1} x\,dx + \int_{1}^{3/2} (2-x)\,dx = \frac{1}{2} + \left(2x - \frac{x^2}{2}\right)\Big|_{1}^{3/2} = \frac{7}{8}$.
3. $P_X(X \le 3) = 1$.
4. $P_X(X \ge 2.5) = 0$.
5. $P_X\left(\frac{1}{4} < X < \frac{3}{2}\right) = \int_{1/4}^{1} x\,dx + \int_{1}^{3/2} (2-x)\,dx = \frac{15}{32} + \frac{3}{8} = \frac{27}{32}$.

Figure 2-11 Triangular density function.
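The five calculations can be verified numerically; this sketch integrates the triangular density with a simple midpoint rule (the function names are ours):

```python
# Numerical check of Example 2-14 by integrating the triangular density
def f(x):
    if 0 <= x < 1:
        return x
    if 1 <= x < 2:
        return 2 - x
    return 0.0

def integrate(a, b, n=100_000):
    """Midpoint rule; adequate for this piecewise-linear density."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

print(integrate(-1, 0.5))    # ~0.125   = 1/8   (calculation 1)
print(integrate(0, 1.5))     # ~0.875   = 7/8   (calculation 2)
print(integrate(0.25, 1.5))  # ~0.84375 = 27/32 (calculation 5)
```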


In describing probability density functions, a mathematical model is usually employed. A graphical or geometric presentation may also be useful. The area under the density function corresponds to probability, and the total area is one. Again the student familiar with mechanics might consider the probability of one to be distributed over the real line according to $f_X$. In Fig. 2-12, the intervals $(a, b)$ and $(b, c)$ are of the same length; however, the probability associated with $(a, b)$ is greater.

2-5

SOM:E CHARACTERISTICS OF DISTRIBUTIONS

[(Xl' Pxex;

While a discrete distribution is completely specified by the pairs


i = I, 2, "', n,
... ], and a probability density function is likewise specified by [ex./xex; x E Rx], itis often
convenient to work with some descriptive characteristics of the random variable. In this sec-

tion we introduce two widely used descriptive measures, as well as a general expression for
other similar measures. The first of these is the first moment about the origin. TIris is called
the mean of the random variable and is denoted by the Greek letter J1, where
for discrete X,
for continuous X.

(2-13)

This measure provides an indication of central tendency in the random variable.

Returning to the cointossing experiment where X represents the number of heads and the probabil
ity distribution is as sho'NIl in Fig. 27, the calculation of J1 yields

1'=

X;PX(XI)=0m+l-(%J+2HJ+3(~J=%,
,=1

as indicated in Fig. 2-13. In this particular example, because of symmetry, the value J1. eould have been
easily detenninedfrom inspection. Note that the mean value in this example cannot be obtained as the
output from a single trial,

;E#~r;,.1'!~:1
In Example 2-14, a density Ix was defined as

AN = X.

=2->;
=0,

a
b
Figure 2~12 A density function.

O';;X';; I,

I,;;x ';;2,

otherwise.

2-5 Some Characteristics QfDistribunons

316

45

3/8

118

1/8

o
~

3/2

Figure 213 Calculation of the mean.

The mean is determined by

JJ.= f~x.Xdt>-fX.{2-X)dx

+ fOdx+ftOdx=4
another result that we could have dete:rmined via symmetry,

Another measure describes the spread or dispersion of the prebability associated with
elements in RJ(: 'This measure is called the variance, denoted by cr,
--.:..t and is defined as
follows:
~ ____ ~ ___ ~_~~

_ _ -i_

for discrete X,
for continuous X.

(2-14)

This is the second moment about the mean, and it corresponds to the moment of .inertia.in
mechanics. Consider Fig. 2-14, where two hypothetical discrete distributions are shown in

graphical fonn. ~ote that the mean is one in both cases. The variance for the discrete random variable shov.n in Fig. 2-14a is
0"

= (0-1)'

{)+(1-1)' (i)+(2-1)'() =i,

and the variance of the discrete random variable shown in Fig. 2~14b is
0"

=(_1_1)' . .1:.+(0_1)' 1 +(1_1)'.


5

,I
),1
+ (2-1) '5+(3-1 5 2,
WIDch is four times as great as the variance of the random variable shown in Fig. 214a.
If the units on the random variable are, say feet, fuen the units of the mean are the same,
but the units on the variance would be feet squared. Another measure of dispersion, called the
standard deviation, is defined as the positive square root of the variance and denoted o;where

(j={;;2.

(2-15)

It is noted that the units of (jare the same as those of the random variable, and a small value
for (jindicates little 'dispersion whereas a large value indicates greater dispersion.

46

Cbllpter 2

One-Di:mensiorull Random Variables

PxIA
1;2

114

114

o
(al,u~'

-1

and 0'0 0,5

Figure 2-14 Some hypothetical distributions.

An alternate form of equation 2-14 is obtained by algebraic manipulation as

for discrete X,

1-

= _x 2 f x (x)dx-/1 ,

for continuous X.

(2-16)

This simply indicates. that the second moment about the mean is equal to the second
moment about the origin less the square of the mean. The reader familiar with engineering
mechanics will recognize that the development leading to equation 2-16 is of the same
nature as that leading to the theorem of moments in mechanics.

Using the alternate form,


2
(f

:- 2

1J (3)' 3

=lO '8+ 1 '8+ 2 8+ 3 '8 - \2 ='4'

which is only slightly easier.

2. Binomial distribution-Example 'J.alO. From equation 2-13 we may show that jJ.= "P, and

or

,,' Jtx"(
n11'"(1_ p)n-z1-(np)2,
I
,:x
r=<l

wbich (after some algebta) simplifies '"

,,'=np(l-p).
3. JExpo_.en1iaJ dism'bution-Euvnpie 213. Consider the density functionftx), where

N.x)=2e-",
=::

x;,; 0,
otherwise.

2-5

Some Characteristics of Distributions

47

T'nen, using integration by parts,

o
and

4. Another density is exCx}, where

. g.,(x) = 16.re-4':
;;; 0,

xl:O,
otherwise.

9x(X)

rnen

and

Note that the mean is the same foe the densities in parts 3 and 4. with part 3 having a variance
twice that of part 4,

In the development of the mean and variance, we used the terminology "mean of the
random variable" and ~'variance of the random variable." Some authors use the terminology
"mean of the distribution" and "variance of the distribution:' Either terminology is acceptable, Also, where several random variables are being considered, it is often convenient to
use subscript on If. and (Y, for example, If.x and (Yx'
In addition to the mean and variance. other mOments are also frequently used to
describe distributions, That is, the moments of a distribution describe that distribution,
measure its properties. and, in certain circumstances? specify it. ~1oments about the origin
are called on'gin moments and are denoted J.4. for the kth origin moment, where .

48

Chapter 2

One-Dimensional Random Va.'iables

for discrete X,
for continuous X.

= [ , / fx(x}dx
k = 0,1,2,,,,,

(2.17)

Moments about the mean are called central moments and are ~noted Ilk> where
P.= I,(X,_p)kpX(x,)

for discrete X,
for continuous X.
(2-18)

Note that the mean j.J. == j.J.~ and the \'ariance is 0

-:=

i12. Central moments may be expressed

ill terms of origin moments by fue relationship


,

/'"

ik\

ILk=I,H)'l,jpJp;-j,

26

k=O,I,2",,,

(219)

J.1

j"'O

CHEBYSHEV'S Th'EQUALITY
In earlier sections of this chapter, it was pointed out that a small variance, 0 2, indicates that
large deviations from the mean, J.ll are improbable. Chebyshev's inequality gives us a way
of understanding how fue variance measures the probability of deviations about IL'

Thwrem 21
Let X be a random variable (discrete or continuous), and let k be some positive number,
Then
(1,..20)

Since

it follows that
(f

1 ~ JP'./K (x- IL)1 'fx(x)dx+ JW ~(x- IL)1 . fx(x)dx.


--

'J.t+..JK

Now, (x - ILl' ~ K if and only if b: - JLi ~

,r "J--

Jl-fi

VK; fuerefore,

Kfx(x}dx+J

""

mKfX<x)dx

f.tTvK

= K[Px( X,;; IL -.fK) + px(x;;: p ~.fK)]


and

2-7

Summary

49

so that if k = YK/cr, then

The proof for discrete X is quite similar.


An alternate form of this inequality,

Px (lX-IlI<kcr)

1- :'

(2-21)

or

Px(ll-kcr<X<Il+kcr) ~

1
I-I?'

is often useful.
The usefulness of Chebyshev's inequality stems from the fact that so little knowledge
about the distribution of X is required. Only J1 and 0- 2 must be knO\Vll. However, Cheby-

shev's inequality is a weak. statement, and this detracts from its usefulness. For example,
px(ix - 111 ~ cr)" 1, which we knew before we started! lithe precise form oflx(x) or Px(x)
is known, then a more powerful statement can be made.

From an analysis of company records, a materials control manager estimates that the mean and standard deviation of the "lead time" required in ordering a small valve are 8 days and 1.5 days, respeetively. He does not know the distribution of lead time, but he is willing to assume the estimates of the

mean and standard deviation to be absolutely correct. The manager would like to determine a time

interval such that the probability is at least ~ that the order will be received during that time. That is,

1
k'

8
9

1--=so that k = 3 and j1 ka gives 8 3(1.5) or [3.5 days to 12.5 days]. It is noted that this interval may
very well be too large to be of any value to the manager, in which case he may elect to learn more
about the distribution of lead times.

27

SUMMARY
TItis chapter has introduced the idea of random variables. In most engineering and management applications, these are either discrete or continuous; however, Section 2-2 illustrates a more general case. A vast majority of the discrete variables to be considered in this
book result from counting processes, whereas the continuous variables are employed to
model a variety of measurements. The mean and variance as measures of central tendency
and dispersion, and as characterizations of random variables, were presented along with
more general moments of higher order. The Chebyshev inequality is presented as a bounding probability that a random variable lies between Ji- ka and Ji + ka.

50

Ch2.pter 2

One-Dimensional Rando:::a Variables

2-8 EXERCISES
2-1. A five-a.rd poker hand may contain from zero to
f~~ aces, If X is the random variable denoting the
number of aces, enumerate the nmge space of X Vlhat
are the probabilities associated with each possible
vitue of Xl
2~2.

A car rental agency has either 0. 1, 2. 3, 4, or 5


",
th
bahil"lUes 6'
I I ,
J
ears returned each U4y, Wi pro
6' J' 12'
and
respectively. Fmd the mean and the vari~
ance of the number of cars returned..

t. -h.

.~~ random variable X has

the probability density


function ce-x, Find the proper value of c, assuming 0 $
X < Z<:I. Find the mean and the variance of X

"

2.4t The cumulative distribution function that a television tube will fail in! hours is 1 - e-.:t, where c is a
parameter dependent on the manufacturer and t 2::: 0.
Find the probability density function of T. the life of

the tuhe.

~Consider the three functions given below. Determine which functions are distribution functions
(CDFs).
0-7J...

>" (a)

O<X<<><>.

_~. (h)

O.s:;x<oo,
x<O.

Fodx) 1
GJ.x) ~ e~.
=0,
(e) HJ.x)=e'.

"

= 1.

x=0.1.2 .....

=0,

otherwise.

Find the probability that he will have suits left over at


the season's end.
'2 J

: '3 ,

z"lO. A ra.:1dom variable X has a CDF of the form


(1 V+1

Fx(x) = 1-(2:)

x=0,1,2, ....

=0.

x<{}.

(a) Find the probability function for X.


(h) Find PodO < X ~ 8).

2Ml1. Consider the following probability density


f';![Lction:

fxf.x)=/a;
O~x<2,
=k{4-x).
2';x';4.
:= 0,
o1heJ1lf1.se.
(a) Fmd the value of k for which f is a prob.bility
demity function.
(b) Find the mean and variance of X
(c) Find the C1J.l):J,ulative distributionfuncdon,

2--12. Rework the above problem, except let the probability density function be defined as

-_<xS;O.
x>O.

f/..x)=lo:.
O::;'x<a"
2,6. Refer to Exercise 2-5. FInd the probability den= k{2a -x).
a";x'; 2a.
sity function corresponding to the functions given, if .' ;,:. ; ~:) '/
0,
otherwise
they are distritaltion functions.
~:
.a.:l~~The manager of a job shop does not "know the
(l!J)Wbich of the f'ollowillg functions are ~screte pr05ability distribution of the time required to com~
probability distributioDs?
"
plete an order. Howe'ler. from past pcdo:rmance she
1
has been able to estimate the mean and variance as :!.4
(
va) Px ()
x =-3'
x=O.
"
2

-'1l

=0,

x=l,

othenvise.

(5\1 2"
J,3 j l::d .
/1,5-,1:'

,/(h)
'i

Px(x) = x
=0.

deys and 2 (deys) respectively. Fmd an interval such


that the probability is at least 0.15 that an. order is finished during that time.
2~14. The continuous

x = 0-1.2,3,4,5.
otherwise.

2..8. The demandfof a product is-l. 0, +1) +2 per day


J 2 3
. 1
W1'th prob a biliti'
'es SJ 'IO'5'lD,respectlvey.

Ad emand

of -1 implies a Ul1it is returned. Find the expected


demand and the variax:.ce, Sketch the distribution
~n(CDF).
~.
f"
.
"~ manager 0 a men s clothing store 1$ con~
\~'"Iled over the inventory of suits, which is currently
30 (all sizes). The number of suits sold from nOw to
t.~e end of the seasor. is distributed as

:r

random variable Thas the probability density funcnonj(') = ki'- for -I ,; ,:5 O. Find

1hc following:
(a) The appropriate value of k.
(h) The mean and variancc of T.
(c) The cumulative distribution function.

2--15. A discrete random variable X has probability


functionpX<x), where

pl.x) = k(1/2t.
=0,

x=I.2.3,
otherwise.

(a) Fmd k;
(h) Fmd the mean and variance of X.
(c) Fmd the cumulative distribt.>tion ~ctionFxf.x).

2-8
2-16. The discrete random variable N (N =. 0. 1, ... )
has probabilities of OCC1i.'TCnce of kf1 (0 < r < 1). Find
the appropriate value of k.

51

2-.21. A continuous random 'fanabIe X has a density

function
2x
f.(x)~9'

/-~

. ,,2--11:I'he postf. service requires, on the average, 2


days to deliver' a letter across town. The variance is
estimated to be 0.4 (dayyz. If a business executive
wants at least 99% of his letters deL'vered on time,
how early should he mail them?

Exercises

o<x<3,

~O

otherwise,

(a) Deve!op the CDF for X. V

I,l1) Find the ~ of X and the VlIIWnce of X . /


2-18. Two different real estate developers, A and B, :!lc)~ Find 14. ':.jr,...
O""'n pa..--cels of land being offered for sale, The proba'rd) F1nda value m such that PI-X l!:m) =Px(X 5m),
bility distributions of selling prices per parcel are
This is called the median of X 5f,f-2.,
sho'W"TI in the following ~ab1e,
2-22. Suppose X takes on the values 5 and -5 with

t.

Price
Sl000 $1050 Sl100 $1150 $1200 $1350
A

0.2

0.3

0.1

0.3

0.1

0.1

0.3

0.3

0.05
0.1

0.05
0.1

probabilities Plot the quantity P[JX - 1'1'" k<r] as a


ftmction of k (for k > 0). On the same set of axes, plot
the same probability determined by Chebyshev's
inequality.

~:;l;:mLnd the cumulative distribc.tion function assodated with

Assuming that A and B are operating independently,


co:rrqJute the fo:Cowi."lg:
(a) The expected selling price of A and of B.
(b) The expeeted selling price of A given that the B
selling price is $1150.
(c) The probability marA and B both have the same

selling price.

If

\ 2t

fx(x)='::r exp
O.
2~24.

x \
- - 2 :"

t>0,;;:2:0,

otherwise.

Find the cumulative distribution function asso-

ciated with

2-19. Show that the probability function for the sum of


values obtained L"l tossing two dice may be written as
x-I

Px(X) =36'
13-x

~36'

x=2.3, ... ,6,


x~7.8 .... ,12.

of

2~2(}. Fbd the mean and variance


the rando.!1! vari
able whose probability fur.ction is defined in the previous problem.

2-25. Coasider t.i.e probability density :unctioa


f,{y) ~ k sin y. 0 ;; y S 7If2. What is the appropriate
ruue of k? Find the mean of the distribution,

2-26. Show tba~ central momen!S can be expressed in


of origin moments by eq:;:aUon 2-19, Hint See
Chapter 3 of Kendall and Stuart (1963).

ter!'!1s

Chapter

Functions of One Random


Variable and Expectation
3-1 INTRODUCTION
Engineers and management scientists are frequently interested in the behavior of some
function. say H, of a random variable X. For example, suppose the circular cross-sectional
area of a copper wire is of interest. The relationship Y = 1tX2/4, where X is the diameter,
gives the cross-sectional area. Since X is a random variable, Y also is a random variable, and
we would expect to be able to determine the probability distribution of Y =H(X) if the distribution of X is known. The first portion of this chapter will be concerned 'With problems
of this type. This is followed by the concept of expectation, a notion employed extensively
throughout the remaining chapters of tlris book. Approximations are developed for the
mean and variance of functions of random variables, and the moment-generating function,
a mathematical device for producing moments and describing diytributions, is presented
with some example illustrations.
(

3-2 EQUIVALENT EVENTS


Before presenting some specific methods used in determining the probability distribution of
a function of a random variable, the concepts involved should be more precisely formulated.
Consider an experiment ~ with sample space g. The random variable X is defined On
g, assigning values to the outcomes e in g, X(e) = x, where the values x are in the range
space Rx of X. Now if Y H(X) is defined so that the values y H(;c) in Ry, the range space
of Y, are real, then Y is a random variable, since for every outcome e E g, a value y of the
random variable Yis determined; that is, y = HlX(e)]. This notion is iliustrated in Fig. 3-1.
If C is an event associated 'With Ry and B is an event in Rx. then B and C are equivalent
eventsiftbey occur together, that is, ifB= (XE Rx:H(x) E C). In addition, if A is an event associated 'With g and, furthermore, A and B are equivalent, thenA and C are equivalent events.

RX

Figure 3~1 A function of a random variable.

52

Ry

+----"'----f-+ y = H(x)

3-2

Equivalent Events

53

Definition
If X is a random variable (defined on ff) having range space Rx , and if H is a real-valued
function, so that Y = H(X) is a random \lariable with range space R y. then for any event
C c R", we define
(3-1)

It is noted that these probabilities relate to probabilities in the sample space. We could
write

P,.(C)=P(ee 9': H[X(e))

C)).

However, equation 3-1 indicates the method to be used in problem solutiou. We find the
event Bin Rx that is equivalent to event C in Ry; then we find the probability of event B.

:!:~~~~fl@~
In the case of the cross-sectional area Yof a ..vire. suppose we know t.'mt the diamete: of a Vvire has
density function

LOOO ,; x " L0Il5,


other.vise.
Let Y~ (1t14)X' be the cross-sectional area of the wire, and suppose we want to find p,.(y,; (1.01)r.i4).
The equivalent event is deteIpJ.ined Pr(Y" (1.0J)r.i4) = Px[(r.i4)X' :;; (1.01)r.i4j = px<iX1 ".JiJiI).
The event {x E Rx:lxl" .JJ:ij1} is in the range space Rx,:md sinceJx<x) = 0, for all x < l.0.
we calculate

~)'
r;-;-) ~
=PxILO:;;X';vLOI

PxliXi';~1.01
" '

W200dx
OI

1.000

0,9975,

rj;:1#Yl~_'~~J
In the case of the Geigercouater experiment of Example 2-12. we used the distribution given in equation 2-7:
~

p,!.x)= '-"(MY/X!,
= O.

x ~ 0, 1,2,
or.."lerwise.

"'>

Recall that .A.. where A > 0, represents the Clean "hit" rate and l is the tir!J.e interval for which the
counteI: is operated. Now suppose we wish to find P~Y:5 5), where
Y~2X+2,

Proceeding as in the previous example,

= [Px(O) +Px(l)j=

[e-" (M)O 1O!] +le-" (M)' !1!]

=e-~[I+MJ.

The event {XE Rx:

xst} ism the rnngespace ofX. and we have thefuoctlonpx to work with in that

-=---..- . .- . .- - - - - - - - - - - -.. . .
space,

54

Chapter 3

Functions of One Random Variable and Expectation


,

3-3 FIDlCTIONS OF A DISCRETE RANDOM VARIABLE


Suppose that both X and Yare discrete random variables, and letxi "x/21 , xi.!:""j represent _
the values of X such that H(x,) Yi for some set of index values, D., = U: j = 1, 2... s,}.
The probability distribution for Y is denoted !'!EiY~ given by
'~i:
!PY(Yi)= py(Y = y,) L,pxtxd'I'
(3-2)

I
\

L-.-____jfJ.Q;
For example, in Fig, 3A where s, = 4, the probability of y,ls phD = Px(x\) + px<xJ + Pl.x,,)
+ p,!.xJ.
'
~

In the special case where H is such that for each y there is exactly one x, then

ph) = Px(x). where y, = H(x;). To illustrate these concepts, consider the following examples.

In the coin-tossing experiment where X represented the number of heads, recall that X assmned four
values, 0, 1,2, 3. ""'ith probabilities

3, 5. and p,(- 1)
exactly one x.

i}~21~,~i1

t. t. t, t. If Y:::: 2X ~ I, then the possible values of Y a.::e - 1, 1.

=t. py(l) ~ 1. py(3) =1, py(S) =}. In this case. H is such that for each y there is

Xis as in the previollS example; however. suppose now that Y:::: ~ -

Ya:e 0, 1, 2, as indiCllted in Fig. 33. In this case

py(O) = Px(2)=t,
4
py(l) = Px(l)+ px(3)=S'
1

pA2)= Px(O)=g'

Rx
H
Xi;

I
)

Xi::!_

Xl::;_

Xil\_

Figure 32 Probabilities in R y

Ry

21, so that the possible values of

3-4 Conr:I::.uous F-;uerioas of a Coatinuous Random Variable

Ax

S5

Ay

y=H{x)=:x-21

(~+=============rrl!~"'101

\111'

Figure 3-3 An example r.mccon H.

In the event that X is continuous but Y is discrete, the formulation for p.f.y) is
(3-3)

where the event B is the event

iX;;~~I~~-.~l

that is equivalent to tlre eveot (Y = y) in Ryo

\//

Suppose X has the exponential probability density function given by


N?:) = kr",
=.

X" 0,
otherwise.

Furthermore, if

y=o

forXS l/)"

for X > 1/}...

and

34 CONTI/lilJOUS FUNCTIONS OF A CONTINUOUS


R~OMVARIABLE

lfX is a continuous'random variable with probability density functionf", and His also continuous, tben Y =H(X) is a continuous random variable, The probability density function for
the random variable Ywill be denotedfyand it may be found by performing three steps.
1. Obtain the CDFof Y. F,(y) = P,(Y ';'y). by finding the eventBinRx which isequivalent to the ovem (Y;; y) in Ry.

2. Differentiate F,(y) with respect to y to obtain the probability density functionf,ty).


3. Find the range space of the new random variable.

56

Chapter 3

Functions of One Random Variable and Expectation

~x~~ie.
Suppose that "fue random variable X has the following density function:
fx{x) = xJg,
:;: : 0,

0';; x;!; A,
ome::wise.

If y= H(X) is therendom variable for wbich the density ilis desired, and H(>:) =0:+ 8, as shov;min
Fig. 3-4, then we proceed according to the steps given above.

,... f,h(y'
= R'(Y')
}
r = 32 41
3. If .x-=O.),= 8, and ifx=4,)':=. 16, so then we have
y

g<'y<,16,

= 0,

otb.:TNise.

Y= H(x) 1

Hex)

tr(Y) = 32 -4'

2x+8

16~
i

FiJlUl'" 3-4 The function H(x) = 2x + S.

y= H(x}
H(X)::; (x- 2)2

Figure 3-5 The function Hex) = (x - 2)'.

3-4 Continuous Functions of a Continuous Random Variable

57

.:~~l~;~:i;
Consider the random variable X defined in Example 3-6, and suppose Y=H(X)
in Fig. 3-5. Proceeding as in Example 3-6, we find the following:

= (X -

2?, as sbown

1. Fh) ~ P!.Y~y) ~ PxX- 21' ~y) ~px(-.JY ~ X -2 ~.JY)


~Px(2-.JY ~H2+.JY)
x 2 [+-!Y
c

2+-5 x

~t-!Y8dx~16
~ I~ [(4 +4.JY

2-'0'

i+(4-4.JY

+ Y)]

~!...JY.
2

2. fy(y) ~ Fy(y) ~ Ir::4~y

3. If x = 2, Y = O. and if x = 0 or x = 4, then y = 4. However,fy is not defined for y = 0; therefore,


I

fy(Y)~4JY'

O<y~4,

=0

othenvise.

In Example 3-6, the event in Rx equivalent to (y" y) in Ry was [X'; (y - 8)12]; and in
X ,; 2
In the
Example 3-7, the event in Rx equivalent to (Y'; y) in Ry was (2
first example, the funetion H is a strictly increasing function of x, while in the second
example this is not the case.

-.JY ,;

+.JY).

Theorem 3-1
If X is a continuous random variable with probability density function/x that satisfies/X<:<)
> 0 for a < x < b, and if y = H(x) is a continuous strictly increasing or strictly decreasing
function of x, then the random variable Y ~ H(X) has density function

l~fX~~rI,I

(3-4)

with X~H-l(y) expressed in terms ofy. IfHitiiicreasmg, then/,,(y) > 0 if H(a) <y <H(b);
and if H is decreasing, then/,,(y) > 0 if H(b) < y <H(a).

Proof (Given only for H increasing. A similar argument holds for H decreasing.)
F,,(y)

p,(Y'; y)

P,(H(X)'; y)

~pX[X';Wl(y)]

=Fx[W1(y)].
'() dFx{x) dx
/y ()
y = Fy y =~. dy'

=/x{x)- dx,
dy

by the chain rule,

wherex=W'(y).

58

Chapter 3

Functions of One Ramlom V_ble aru:\ Expectation

In Example 36, we hall

Ux) =xl8,

O';x'; 4,

; ; ;: O.

otherwise,

andFJ(x) =: 2x + 8, which is a strictly increasing function. Using equation 3-4,

ty(y)= tX(X)'idxi= y-8 . .!.


dy

since x = (y - 8)/2. R(O)

16

Z'

8 and R(4) = 16; therefore,

/r(y) =-"-_!c

8';y';16,

:::;; 0,

otherwise,

32

4'

3-5 EXPECTATION
If X is a random variable and Y = H(X) is a function of X, !bell the expected value of H(X)
is defined as follows:
\

(3-5)

H(x) fx(x)dx for X conti~':..


(3-6)
.. _.--- -~ ~-" - ---~....--~
In the case where X is continuous, we restrict H so that Y;;:;:. H(X) is a continuous random
variable, In reality. we ought to regard equations 3-5 and 36 as theorems (not definitions),
and these results have come to be known as the law of the unconscious statistician.
The mean and variance, presented earlier, are special applications of equations 3-5 and
36. If H(X) = X, we see that
. E[H(X)] =

----~-,.~---

E[H(X)

=E(X) =/1.

(37)

Therefore, the expected value of !be random variable X is just lhe mean, p..
If H(X) = (X - J1.?, then
E[H(X) = EX - /11') = d'.

(3-8)

Thus, lhe variance of the random variable X may be defmed in terms of expectation. Since
the variance is utilized extensively. it is customary to introduce ~a variance operator V that
is defined in terms of the expected value operator E;
V[H(X)] = E([H(X) - E(H(X)]~.

(3-9)

Again, in the case where H(X) = X,


VeX) ~ E[(X - E(X)il

=E(X') - [E(X)]'.

(HO)

which is the van"ante of X, denoted dl.


The origin moments and central moments discussed in the previous chapter may also
be expressed using !be expected value operator as

J1.' = E(X'l

(311)

3-5

Expectation

59

and

ilk = E[(X - E(X))''j.


There is a special linear function H that should be considered at this point. Suppose that
H(X) = aX + b, where a and b are constants. Then for discrete X, we have
-._____c

'I E(aX + bl = I'(ax, + b )Px(x,)

= a 2,xi Px(x')+b 2,Px(x,)


i, .. ~

faE(X)+b'J

(3-12)

and the same result is obtained for continuous X; namely,

~= [(ax+b)fx(x)dx
=a[ xfx(x)dx+b[fx(x)dx
=aE(X)+b,

(3-13)

Using equation 3-9,


V(aX + b) = E[(aX + b - E(aX + b))']
E[(aX + b - aE(X) - bJ']
= aE[ (X - E(X) )2]

ti'v(X).]

(3-14)

In the further special case where H X) = b, a constant, the reader may readily verify that

E(b) = b

(3-15)

= O.

(3-16)

and

V(b)

These results show that a linear shift of size b only affects the expected value, not the variance.

Suppose X is a random variable such that E(X) = 3 and VeX) = 5. In addition, let H(X) = 2X -7. Then
EIH(X)] =E(2X -7) = 2E(X) -7 =-1

and
V[H(X)] = V(2X -7) = 4V(X) = 20.

The next examples give additional illustrations of calculations involving expectation


and variance.

Suppose a contractor is about to bid on ajob requiring X days to complete, where X is a random variable denoting the number of days for job completion. Her profit, P, depends on X; that is, P = H(X).
The probability distribution of X, (x, p/..x)), is as follows:

60

Chapter 3

Functions of One Random Variable and Expectation

Px(xl
[

8
5

ii

otherwise,

O.

Using the notion of expected value, we calculate the mean and variance of X as

1
8

5
8

2
8

33
8

E(X) = 3-+4-+5-=and

V(X)=[3,.2.+4,.~+52.~J_(33)2 =
8

23

64

If the function HUG is given as

then the expected value of H(X) is

E[ H(X)] = 10,000

H(x)

3
4
5

$10,000
2,500
-7,000

+ 2500 (%)-7000 .

(i)

= $1062.50

and the contractor would view this as the average profit that she would obtain if she bid this job many,
many times (actually an infinite number of times), where H remained the same and the random vari~
able X behaved according to the probability function Px. The variance of P = H(X) can readily be
calculated as

V[ H(X}] = [(IO,OOO)' .2.+ (2500}2 . ~+ (-7000)' '~J- (1062.5}2


888

= $27.53 10'.

A well-kn~ simple inventory problem is "the newsboy problem," described as follows. A newsboy
buys papers for 15 cents each and sells them for 25 cents each, and he cannot return unsold papers.
Daily demand has the following distribution and each day's demand is independent of the previous
day's demand.

Number of customers x

Probability, pIx)

23

24

25

26

27

28

29

30

0.01

0.04

0.10

0.10

0.25

0.25

0.15

0.10

If the newsboy stocks too many papers, he suffers a loss attributable to the excess supply. If he stocks
too few papers, he loses profit because of the exeess demand. It seems reasonable for the newsboy to

35

Ex.pec:ation

61

stock some number'bf ;tapers so as to ;ninirnize the expected loss. If we let s represent the number of
papers stocKed, Xtbe daLy demand, andL(X, s) the newsboy's loss for a pa..>1:icular stock level $, then
the loss is simply
L(X, s) = O.lO(X -s)

= 0.15(s-X)

ifX>s,
ifX,;s,

and for a given stock level, s, the expected loss is


,

3ll

E[L{X.s)] = 2:,0.15(s-x) pxCx)+ 2:,O.IO(x-s)' Px(x)


.:o~

.r--,:+l

and the E[L(X, s)I is evaluated for some different values of s.

For 3=26.
E[L(X. 26)] = 0.15[(26 - 23)(0.01) + (26 - 24)(0.04).,. (26 - 25)(0.10)
+ (26 - 26)(0.10)1 + 0.10((27 - 26)(0.25) + (28 - 26)(0.25)
... (29 - 26)(0.15) - (30 - 26)(0.10):
=$0.1915.

Fors=27.
E[L(X, 27)] = 0.15i(27 - 23)(0.01) + (27 - 24)(0.04) + 07 - 25)(0.10)
+ (27 - 26)(0.10) ... (27 - 27)(0.25): ... 0.10[(28 - 27)(0.25)
+ (29 - 27)(0.15) + (30 - 27)(0.10),
= $0. 154().

For $

28,

BIL(X, 28)J =0,15[(28 - 23)(0.01) + (28 - 24)(0.04) + (28 - 25)(0.10)


(28 - 26)(0.10) + (28 - 27)(0.25) + (28 - 28)(0.25)1

+ 0.10[(29 -

28)(0.15) + (30 - 28)(0.10))

$0.1790

Thus, the newsboy's policy should be to stock 27 papers if he desires to minimize his expected loss.

Consider the redundant system sbam in the diagram below.

At least one of the units must function, the redundancy is standby (meaning that the second unit does
not operate until the first falls), sMtching is perfect, and the system is nonmaintained. It can be
shown that ~det' certain conditions, when the time to failure for each of the units of this system has
an exponential distribution, then the time ttl failure for the system has the following probability den
sity function.

62

Chapter 3

Functions of One Random Variable and Expectation

f,!x) = J.'xe-'-',
= 0,

x> 0, J. > 0,
otherwise,

where A is the "failure rate" parameter of the component exponential models. The mean time to
ure (MTTF) for this system is

fail~

Hence, the redundancy doubles the expected life. The terms "mean time to failure" and "expected
life" are synonymous.

3-6 APPROXIMATIONS TO E[H(X)] AND V[H(X)]


In cases where HeX) is very complicated, the evaluation of the expectation and variance
may be difficult. Often, approximations to E[H(X)] and V[H(X)] may be obtained by utilizing a Taylor series expansion. This technique is sometimes called the delta method. To
estimate the mean, we expand the function H to three terms, where the expansion is about
x =IL If Y =H(X), then

Y = H(Il) + (X - Il)H'(Il) + (X - Il)'


2

. W(Il) + R,

where R is the remainder. We use equations 3-12 through 3-16 to perform

E(Y) = E[H(Il)] + E[H'(Il)(X -

Il)]

+ E[~W(Il)(X - Il)' ] + E(R)


= H(Il) + !W(Il)V(X) + E(R)

'" H(Il)

+!2 W(1l )a'.

(3-17)

Using only the first two terms and grouping the third into the remainder so that

Y =HC!l) + (X -

Il) . H'C!l) + R I,

where

then an approximation for the variance of Y is determined as

V(y) = V[HC!l)] + V[(X - Il) . H'C!l)] + V(RI)


= 0 + veX) . [H'C!l)]'
= [H'C!l)]'. a'.

(3-18)

If the variance of X, 02, is large and the mean, J.1, is small, there may be a rather large error
in this approximation.

The surface tension of a liquid is represented by T (dyne/centimeter), and under certain eonditions,
Too 2(1- 0.005X)1.2, where X is the liquid temperature in degrees centigrade. If X has probability den~
sity functionjx, where

A;>proximations to E,H(X)] and V(H(X)]

3-6

fx(x) ~ 30oox"\

63

x;" 10,

;;: 0

otherwise,

then

E(T)= [2(1-0.oo5x)1.2 3OO0x" dx


_to

and

V(T)

.r~, 4(I-D.005x)'' .3000x-4dx-(E(T)' .

In order to determine these values, it is necessary to evaluate

r(1-0-OO5x)"

J:o

ax.

X4

Since the evaluation is difficult., we use the approximations given by equations 3-17 and 3-13. Note
that

Since

H(X) = 2(1 - O.OOSX)'',


then

H'(X) =- 0.012(1

0,005X)"z

H"(X) = 0.000012(1- O.005X)""'.

Thus,
H(lS) = 2[I-O"005(15)lt~= 1.82,
H'(lS)

=- 0.012,

li"(15)

= O.

Using equations 3-17 and 3-18

E{T) =H(15)+tW(15).cr' = 1.82

and
V(1) = [H'(I5)]" d'= [- 0.012]'75 = 0.Q108.

An alternative approach to this sort of approximation utilizes digital simulation and


statistical methods to estimate E(y) and V(l:'). Simulation will be discussed in genem! in
Chapter 19, but the essence of this approacb is as follows.

64

Chapter 3

Functions of One Random Variable and Expeetation

1. Produce n independent realizations of the random variable X, where X has proba.


bility distribution Px orfx. Call iliese realizations x,. x" '''' x,.. (The notion of inde
pendence will be discussed further in Chapter 4.)
2. Use Xi to compute independent realizations of Y; namely, y, =H(x,), Y2 =H(x-J, ... ,
y, =H(;<,).
3. Estimate ErY) and V(l:') from ilie y"Yz, ... ,y, values. For example, the natural esti
matorior E(y) is the sample mean y '" 2:~., y,/n. (Such statistical estimation problems will be rreated in detail in Chapters 9 and 10.)
As a preview of this approach. we give a taste of some of the details. FIrst, how do we
generate the n realizations of X that are subsequently used to obtain Y1' h, ,,+, Y11? The roOst
important technique, called the imerse transform method, relie..<;. on a remarkable result

Theorem 32
Suppuse that X is a random variable with CDF F ,{x). Then the random variable F,{X) has
a uoiforrn distribution on [0, 1]; that is, it has probability density function

Jul.u) = 1,
=0

OS uS

I,
(319)

otherwise.

Proof Suppose that X is a continuous random variable. (The discrete case is similar.)
Since X is continuous, its CDF Fx(x) has a unique inverse. Thus, the CDF of the random
variable F x(X) is
P(FiX) S u) = P(X S F!i(u) = F,{F!i(u = u.

Taking ilie derivative wiili respect to u, we obtain ilie probability density function of F ,{X),
d

duP(Fx(X),;;u)~I,
which matches equation 319, and completes ilie proof.
We are now in a position to describe the inverse transform method for generating random
variables. According to Theorem 32, ilie random variable F,!X) has a uoiforrn distribution
on [0,1]. Suppose iliatwe can somehow generate such a uoiforrn [O,IJ random variate U, The
inverse transform meiliod proceeds by setting F,{X) ~ U and solving for X via the equaJion
X

=F!i(U).

(320)

We can ilien generare independent realizations of Y by


Y, = H(X i) H(Fi, (Ui
,
....
U,
are
independent
realizations
of
the uoiforrn [0,1] dis
2
tribution. We defer the question of generating U until Chapter 19. when we discuss computer simulation techniques. Suffice it for now to say iliat iliere are a variety of methods
available for this task, the most widely used acmally being a pseudouniform generation
meiliod-one iliat generates numbers that appear to be indepen.dent and uniform on [0, I]
but that are actually calculated from a derern:tinistic algorithm,
i

=1,2, "', n, where U,. U

Suppose that U is unifO,ITn on [0,1]. We will show how to use the inverse transfonn method
to generate aD exponential random variate with parameter ).,. that is, thc continuous distribution

3~ 7

l.c:.e Moment~Genera.ting Funetion

65

having probability density functionfx.Cx) :;; A,e-.tr. for x 2: 0. lne CDF of the exponential random

va..-iable is

lnerefore, by the inverse rransform theorem. we can Set

and solve for X,


X =F;' (U) =Iln(l-U).
where we obtain the inverse after a bit of algebra. In other words, -(1/A.) In(1 - [1) yields au exponential random va..-iable with parameter A.

We remark that it is not always possible to find Pi in closed form, so the usefulr.ess
of equation 3-20 is sometimes limited. Luckily, many other random variate generation
schemes are available, and taken together, they span all of the co=only used probability
distributions.

37

THE MO?vIENT-GEJIo'ERATING FTNCTION


It is often convenient to utilize a special function in finding the moments of a probability distribution. TIlls special function, called the moment~ger.eratingfo.nction) is de:finedas follows.

Definition
Given a random variable X, the moment-generating funetionMJt) of its probability distribution is the expected value of ex. Expressed mathematically,
M,ft) = E(ex)
= Le'"
=

(3-21)

Px{x,)

discreteX.

(3-22)
(3.23)

e" Ix (x)dx continuous X.

For certain probability distributions, the moment-generating funetion may not exist for
all real values of t. However, for the probability distributions treated in this book, the
moment-generating function always exists.
Expanding etX as a power series in t we obtain

On taking expectations we see that


2

. ()
'2',I-+,,+EIX
t
' ') ._.,.
t . ...
X t+ElX
( ) Ber~ IX] =hE
, 2
\
r!

Mx t
so that

(324)

66

Chapter 3

Functim:.s of One Random Variable and Expectation

Thus, we see that when ~41yj..t) is written as a power series in r~ the coefficient of fir! in the
expansion is the 7th moment about the origin. One procedure, then, for using the moment~
generating function would be the following:
1. Find Mxit) analytically for the particular distribution.
2. Expand M,(t) as a power series in t and obtain the coefficient of fir! as the 7th origin
moment.
'The main difficulty in using this procedure is the expansion of Mit) as a power series in t.
If we are only interested in the first few momenTS of the dislribution, then the process
of derenninlng these moments is usually made easier by noting that the rth derivative of
Mxil), with respect to t, evaluated at t= 0, is just

d'

_.Mx(t)1
0 =
dt T
t=

EflX'e"'] t=O = J.l;,

(3-25)

assuming we can interchange the operations of differentiation and expectation. So, a second procedure for using the moment-generating function is the following:
1. Determine M/..,) analytically for fue particular distribution.
2. Find J.l; =

!',

Mx(tll,.o

.Moment-generating functions have many interesting and useful properties. Perhaps fue
most important of these properties is that the momentgenerating function is unique when
it exists, so that if we know the moment-generating function. We may be able to determine
the form of the distribution.
In cases where the moment-generating funetion does not exist, we may utilize tbe
characteristic function, C/"t), which is defined to be the expectation of ,fiX, where i = H.
There are several advantages to using the characteristic function rather than the momentgenerating function. but the principal One is that Cx{t) always exists for all t. However. for
simplicitYj we will use only the momentgenerating function.

!i[(~mRf~~tI~~:j
Suppose that X has a binomial distrfbutif:m., that is.
(It~

Px(x)

=l

X'
/

=0,
where 0 < p

~l

..
_ ..
p (I-p) ,

x:= 0,1,2,. '".!t,.

otherwise.

and n is a pOSitive integer, The moment-generating fimction MJ,t) is

This last summation is re<:ognize<! as the binomial expansion of[pe' + (I-p))". so that

M,(tl = [pe' + (1- pl]".


Taking the derivatives. we obtain

3-8 Summary

67

and
M~(t) = npe'(l- p +npe')(: - p(e' -l)l~'.

Thus

and

14 =M~(t)I",,= np(l- p'" npl.


The second cenrral moment may be oblained using cr;;;;;; J.4 - [il;;;;;; np(1- p).
;i!~p!~~~~E'
Assume X to have the following gamma distribution:
a
b-l-~
fxt")=-x e
b

reb)

wherer(b)~

O:5x<-,a>O,b>O,

== O.

otherwise.

e-'y"-ldy isrhegammajunction.

The mOll'lellt~gen.cratiJlg fmctlon is

which. if v.-e let y ;;:;:. x(a - t), becomes

Since the integral on the right Is just r(b), we obtain

)-'

Mx(t)=_a_;;;;;;J( 1-!..
(a-t)' \
a

Now using the power series expansion for

fort < a.

r.

(1--'-

aJ

we find
t b(b.,.l) ")' + ... ,
Mx(I)=hb-"'--la
21
a

which gives the moments

u'

. 1

b
a

and

,_0(0+1)
!1i ---,-.
a

3-8 SUMl\HRY
This chapter first introduced methods for determining the probability distribution of a random variable that arises as a function of another random variable with known distribution.
That is, where Y=H(X), and either Xis discrete with knowndistributionp.,{x) or X is COIl-

68

Chapter 3

Functions of One Random Variable and Expectation

tinuous with known density !x(x), methods were presented for obtaiIDng the probability distribution of Y.
The expected value operator was introduced in general terms for E[H(Xl], and it was
shown that E(Xl = J1, the mean, and E[(X - J1l'] = ri', the variance. The variance operator V
was given as V(Xl = E(X') - [E(Xl]'. Approximations were developed for E[H(Xl] and
VlH(K)] that are useful when exacr methods prove difficult. We also showed how to use the
inverse transform method to generate real.izations of random variables and to estimate their

means.
The moment~generating function was presented and illustrated for the moments f.l~ of
a probability distribution. It was noted that E(X') = 1';.

3-9 EXERCISES
3~1. A robot positions 10 units .t.... a chuck for machin
iog as the chuck is indexed. If the robot positions the
v

unit improperly, the unit falls away and the chuek


position rCl'.l:lains open, thus resulting in a cyele that
produces fewer than 10 units. A study of the robot's
past performance indicates that if X =number of open
positions,
px(x)=O.6,

x=o,

=0.3,

x= 1,

0.1,

=0.0,

(b) If the profit per sale is $200 and thc replacement

of a picture tube eosts $200, tind the expected


profit of the business.
34~ A contractor is going to bid a project, and the
number of days, X, required for completion fonows
the probability distribution given as

pix) =0.1.
=0.3,
=0.4,
=0.1,
=0.1,

x 2,
otherwise.

'=

If the loss due to empty positions is given by Y= 20;(-!,


find the following:
(a) p,.iy).
(b) E(Y) and VCy).

3-2. The content of magnesium in an alloy i.s a random


variable given by the following probability density
function:

x= 10.
x= 11,
x=12,
x=13,
x=14.
otherwise.

0,

The contractor's profit is Y"", 2000(12 - X).


(a) Find the probability distribution of Y.
(b) Find E(X). V(X). E(y). and V(Y).
3-5. Assume thaI: a continuous random variable X has
probability density function

fx(x) =2xer r ,

X~O,

otherwise.

=0,

Find the probability distribution of Z = X'.


=:

0,

otherwise.

The profit obtained from this alloy is P = 10 + 2X,

Ca) Find the probability distribution of P.


(b) \That is thc expected profit?

3-.6. In developing a random digit generator, an impor~


tant property sought is rhat each digit D; follows the
follov.ing discrete unifonn distribution.,

pJ),(d)

3~3. A manufactu.rerof color television

sets offers a 1year warracty of free replacement if the pictu.."'e tube


fails. He estimates the time to failure, T, to be a random variable with the following probability dist:ribu~
non (in units of years):
(\_.!.e-/!4
4
f T,,-

= O.

t> 0,

otherwise.

(a) Vlhat percentage of the sets will he have to


service?

1
= 10'

= O.

d=0.L2,3..... 9,
otherwise.

(a) Find E(D,) and V(D,).

=l

J,

lJ

(b) If y
D, - 4.5 where
is the greatest integer
("round down") function, find Pr0'}, (Y), V(l').

3-7. The percentage of a certain additive in gasoline


determines the wholesale price. If A is a random. variable representing the percentage, then 0 $A ~ 1. If the
percentage of A is less than 0,70 the gasoline is lowtest and sells for 92 ee.tl.tS per gallon. If the percentage
of A is greater tha:n or equal to 0.70 the gasoline is

39 Exexcises

69

bighwtest and sells for 98 cents per gallon, Find the


expected re'>'enue per gallon wberefA(a) == 1. O!::a S 1;

Evaluate E(A) :and V(A) by using the approximations


derived in this chap~.

otherwise.k(a)

3~13~

O.

3..8. The probability function of the random variable X.

Suppose that X bas the uniform probability


density function

- I -(ve)(x-p)
,
f x (x) -Be
=0,

1 :5'x $; 2,

otherwise.
othenvise

is knO'tVtl as the two~parameter exponential distrfbu,


non. Find the moment-generating function of X Eval~
uate E(X) and V(X) using the moment-generating

function.
3--9~ A random

variable X has the following


ity density function:

fl.x) = a-',
O.

probabil~

3-14. Suppose that X has the exponer.'tial probability


density function

x:::: 0,

X:>O.

otherwise.

othel;'Wise,

(al Develop the density function for Y= 2 X'.


(b) Develop the density for V = Xn.
(cJ Develop the density for U = In X
3--10. A two-sided rotating antenna receives signals.
The rotanonal position (angle) of the antenna is denoted
X. and It may be assumed that this position at the tlme
a IUgnal is received is a ra.'1dom variable with the den~
sity below. Actually, the randomness lies in the signal,
I

Find the probability density function of r = FI(X; where

H(X)=~),.

tl+x

~ used-car salesman finds that he sells either 1.


2,3,4,5, or 6 ca.."'S per week with equal probabilIty.

~ind the moment-generallng function of X.

(b/usin~ the moment~ger.e:ating function fincC.E('X':


~ and VgJ/
' I

'.,..>'

3~16. 1.<:.t X be a random variable with probability


density function

fx(x)=-,
2"
=0,

(aJ Flnd the probability density funetion of Y = H(X)


wbere H(x) = 4 - x'.
(b) Find the probability density function of Y= H(X)
where H(;<) = e,

otherwise.

flx) = ai'e-b~,

=O.

The signal can be received if Y:> Yo> where Y:;:;; tan X

For instance, Yo ::;;; 1 corresponds to < X < ~ and


~ < X < 3;' Find the density function for Y.
3--11. The demand for antifreeze m.a season is considered to be a uniform random variable X, with density
fx(x)

= lO'' "
.e: 0.

10' ,; x S; 2 x 10',
otherwise,

wbere X is measured in liters. If the manufacturer


t:lakes a 50 eent profit an eacb liter shc sells in the fall
of the year, and if she must carry any excess aver to

x> 0,
othet'.Vise.

(a) Evaluate the constant a..


(b) Suppose a new function Y = lax:! is of interest.
/'/ Find an approximate value for EeY) and for V(Y).

: 3-17. Assume that Y has the exponential probability


":denS'ity function
y>o,
otherwise.

Find the approximate values of E(X) and 'V(X) where

the nex.t year at a cost of 25 OOnt'> per liter, find the


"optimum" stock level for a particular full season,
3--12. The acidity of a certain product. measured on an
arbitrary scale, is given by the relation.ship
A = (3 .,. 0.050)',

where G is the amount of one of the constituentS baving probability distribution

fc(g)=t,
= 0,

O';g';4,
otherwise.

The concentration of reactant in a chemical


process is a random variable having probability

3~1S.

distribution

fk) = 6r(1- r),


=0,

05r51,

otherwise.

The profit associated with the final product is P =


$1,00 +$3.00R. Find the expected value of P. 'Vtbat is
the probability &stribution of P?

70

Chapter 3

Functions of One Random Variable and Expectation

,/,' 3~19~ The repair time (in hours) for a certain e1ectron~

'-..-iafuy controlled milling machine follows the density


function

f:!--') ~ wr2I,

7:>0.

1"')\,

1:(!}l If Y = (X - 2)', find the CDFfor 1:


3~24. The third moment about the mean is related to
the asymmetry, or skewness, of the distribution :and is
defined as

othenvise.

=0,

Determine the moment~generating function for X and


use this fu.'1crion to evaluate E(X) and V{X).

3..20. The .ctQS$*secrional diameter of a piece of bar


stock is circular with diameter X. It is known that
E{X'J = 2 em and V(X) = 25 x 1~ cm2 A cutoff tOOl
cuts wafers that are e.x:actly 1 Cm thick.. and this is con~
sta(l\Fmd the expected volume of a wafer.

~2V.)I a random variable X has moment-generating


1/" Tunction Mx(t), prove that the random variable Y"" aX

+ b has moment-generating function e ib Mx (at).


3-22. Consider the beta distribution probability
density vJJ1Ction
Ix (x);;;;;; k(l- X),,-I x b-l.

O~xsl,a>O,b>O,

!J., =E(X - J1;)',


:= p.; - 3,u2jl~ + 2(P~)J. Show that for a
symmetric distribution, fJ.J = 0,

Show that fJ.J

3--25. LetJbe a probability density function for which


the rth order moment)1~ exists. Prove that all moments

of order less than r also exist


3--26. A set of COllStaIlts k,.. called cumuIants, may be
used instead of moments to characterize a probability
distribution. If .'1(J..r) is the moment-generating func~
rion of a random variable X, then the cumulants are
defined by the generating function

vW)

Tuus, the rth cumulant is given by

othCI".'tlse,

=0,

k = d''If xCl)1 ,
r
dt' Ir-O

{a) Evaluate the constant k.


(b) Find the me~.

f")fmd the variance,


f3~~) The probability distribution of a random vari-

Find the cumulants of the f1Orrr.a1 distributl:on whose


densty function is

able X is. given by

1/4,

x = 0,
.t;;;;;; 1,

=118,
=118,

7:=3.

pIx) = 112,

I
'.

\ fJ)

()~i1

x~2,

-. '- 0,
otherwise.
Determine the mean and vadan~e of X from the

\ / ' moment-generating function.

log MI'),

f(')=-= exp
a-v2r.

li'x-u"l:.q

--1--'
i J'-~<~<~'
2 \. (J /

327~ Using the inverse transform method. produce 20


realizations of the variable X described by Px(x) in

Exercise

3~23,

3--28. Using the inverse transform method, produce 10


realizations of the random variable Tin Exercise 3-3.

Chapter

Joint Probability Distributions


~.

r--'

1,41 ) INTRODUCTIO:--;
In many situations we must deal with two or more random variables simultaneously. For
example, we might select fabricated
sheet steel specimens
measure
strength
.....
- - and
-_
... shear
.._
- 'and
..
~--'---

__

-.----~-

2!]'ld.diameter 01 sP9.!.lilelds. Thus, both weld !'lieru:.str!'llW..?,ll..Q weld di.'!!lleter:."!.e the ran~jlbl"'u;>.f mt",re5t. Or we may select people from a certain population and
their height and weigbL
'.
. . . .. -- - - - - . ' , . . ...'
.

mcasure

-The objective "filii, chapter is to formulate joint probability distributions for two or
more random variables and to present methods for obta1.Ung both marginal and coruItiwnal
distributions. Conditional ~~"~~(tas w.tl.ll~!pe regre~~91tpfm[~~!1' We
also present a definition of irul.ependence for random variables, and covariance and correlation are defined. Functions of two or more random variables are p~:;;Sentecf~~d a specIal
~f linear combinations is presented with its corresponding moment~generating function. Finally the law of large numbers is discussed.

Definition
If if is the sample space associated with an experiment ~, and Xl' X1;> '. Xk are functions.
each assigning a rea! number X,(e), .'I:,(e) .... Xie) to every outcome e, we call [Xl' X" ....
XJ a k-dimensional rtJJ]dom vector (see Fig. 4-1).
The range space of the random vector [Xl' X" "', X,l is the set of all possible values of
the random vector. This may be represented as Rx, XX1Y.". xXe where
Rx,xx~x ... xx.. = ([Xl'Xz., ... , x,J:XI

Rx,>.t2 E Rx~. ""Xk

Rx)

This is the Cartesian product of the range space sets for the componentS, In the case where
k =2, that is, where we have a two-dimensional random vector, as in the earlier illustrations,
RXI xX. is a subset of the Euclidean plane.

e.~~__~__________~~__~~R~
r-~./~_.,Rx,

~
71

72

Chapter 4

Joint Probability Distributions

42 JOINT DISTRIBUTION FOR TWODIMENSIONAL


J RAt'IDOM VARL.ffiLES
In most of our considerations here, we will be concerned with two-dimensional random
vectors. Sometimes the equivalent term. two-dimensional random variables will be used.
If the possible values of [X" x"J are either finite or countably infinite in number, then
[XI' Xz] will be a two-dimensional discrete random vector. The possible values of [XI' X z]
are {x!;. xljJ. i == I, 2, ... ,i =. 11 2, ....
If the possible values of [XI' X,l are some uncountable set in the Euclidean plane, then
{Xl' X:J ~-ill be a nvo-dimensional continuous random vector. For example, if a
S b and
c::;:~ s d, we would have RXI xXz == {[xl> xJ: a 5. Xl:':;: b, c 5. Xi. ~ d}.
It is also possible for one component to be discrete and the other continuous; however,
here we consider only the case where both are discrete or both are continuous.

Consider the case where weld shear s::rength a.'1d weld diameter are meas:lred. If we let Xl represent
diameter in inches andX2 represen.t strength In pounds, and if we know 0 ~x! <.0.25 inch while 0 ~
x, S 2000 poUlJds, then me ra:nge space for [X" X,] is the ser {[x" x,]: 0 ~ XI < 0.25, 0 "-"z ;; 2000].
This space is shoWD graphically in Fig. 4~2.

1'E,~~~~~1:'
A small pump is inspe..-red for four quality-co;ltrol characteristics. Each charaCteristic is classified as

good, mino::: defect (not affecting operation) or IIJ.ajor defect (affecting operation). A pump is to be
selected and the defects counted. IfXt :::::: the number of minor defects andX2 = the number of major
defects, we know that Xl =. 0, 1, 2. 3. 4 and Xz =. O. I, ,.,' 4 - X I because only four characteristics are
inspected. The range space for [X" X,] is thus ([O, 0], [0, I), (0, 2], [0, 3), (0,4). (1, 0], [I, IJ, (1,2],
[1,3], [2, 0], [2, 1], [2, 2], [3,0], [3, 1], [4, 0]). These possible outcomes are shown in Fig. 4-3.

X2

(pounds)
2,000

Figure 4--2 The range space of (Xl' XJ. where Xl is


weld diareeter andX2 is shear strength.

2~'--~--~--+---+--Figure 4-<3 The range space of [Xl' X:i~, where Xl is


the nUlllOc::: of minor defects andX2 is the number of
major defects, The range space is indicated by heavy

X"

dots.

4-2 10im Distribution for Two-Dimensional Random Variables

73

In presenting the joint distributions in the definition that follows and througbout the
remaining sections of this c!tapter, where no ambiguity is introduced. we shall simplify the
notation by omitting the subscript on the symbols used to specify these joint distributions.
Thus. if X = [X,. X,J.Px(x,. x,) = p(x" x.J and/x(x,. x.J =/(x;, x,).

Definition
Bivariate probability functions are as follows.

o
~

!?iscrete c"'!i!; To each outcome [x;, Xzl of [X,. X,l. we a'sociate a number.
p(x,.x.;J =P(X, =x, and X, =x,),

where

and
(4-1)

The values ([x,. x,l. p(X,.


[X,.

Xz)

for all i, j make up the probability distribution of

X,J.

(3) Euclidean
Contim{OHS ~I! [Xl' Xi] is a continuous random vector with range space R in the
plane, tlienf, the joint density fwtction, has the follO\Ving properties:
for all (x" x.,) E R
and

A probability statement is then of the form

P(al ~X,"I1.a, ~X, $b,)= t'J"/(x\,Xz)dxjdx,


0z <1,
(see Fig. 4-4).

'"

Figore4-4 A bivariate den..~tyfunct:ion. w!:.ere peal ::;;X:::; oJ' az S:Xz ~07) is given by the shaded
volume.

74

Chapter 4

Joint Probability Distributions

It should again be noted thatj(x" x,) does not represent the probability of anything. and
the convention thatj(x\, x,) ~ 0 for (Xl'>;') '" R will be employed so that the second prop"
erty may be ",'ritten

In the case where [X,. X,J is discrete, we migbt present the probability distribution of
[Xl' X 2] in tabular, graphical, or mathematical fonn. in the case where [Xl' X,J is continuous, we usually employ a mathematical relationship to present the probability distribution;
however, a graphical presentation may occasionally be helpful.

;ll~~~f~'
A hypothetical probability distribution is show::t in both tabular and graphical form in Fig. 4--5 for the
random variables defined In Example 4-2.

~i>l;;;~:;(
In the casc of the weld diameters represented by Xl and tensile strength represcnted by Xl' we might
have a uniform distribution as by

!~

1/30

1/30

2/;!{)

1/30

i 1/30

1/30 , 2/30

1/30

I 3/30

,3/30 ,

L=-J

I 3/30 I 1/30

i 4/30 I
.'lI3Oi
I
3130

\
!

i
I

(a)

pix;, xz}J.
3/30

X,

----- .'lI3O
1/30
----- ----3/30
----4/30

~30-- ~.
0

x,

(b)

Figure 4~5 Tabular and graphical presentation of a bivariate probability distribution. (a) Tabulated
values are p(x1 xJ. (b) Graphical presentatiQIl of discrete bivariate distribution.

Uf
4-3

"

O$X, <0,25,0$'2

;:; 0,

Ma..--ginal Distributions

75

~2000,

other"<Y1Se,

The range spare was shO'Vi'D. in Fig. +2, and if we add another dimension to display graphically
y;. fix l ,;0, then the distribution would appear as in Fig, 4-6.ln the univariate case, area corresponded
to probability~ in the bivariate case. volume under the surface represents the probability.
For example, suppose we wish to find P(O, 1
,,0,2, 100 ~X2:;; 200), Tnis probability would
befound by integratingj(x" x,) over Ihe region OJ
,,0,2, 100:5 X, :> 200, That is,

"X,

rlOO r"~ 1
J100 JO1 500

/-

,G MARGDlAL DISTRIBUTIOKS
,~

Having defined the bivariate probability distribution, sometimes called the joint probabil,
ity distribution (or in the continuous case the joint density), a natural question arises as to
the distribution of Xl Of X 2 alone. These distributions are called marginal distributions. In
the discrete case, the marginal distribution of Xl is

p,h)= LP(X',x,) 1for all


,r,

x,

(4-2)

-J

(4-3)

Jf~~R!~1f~;
In Example 4-2 we considered the joint discrete distribution shov.~ in Fig, 4-5. The marginal distributions are shown in Fig. 4-7. We see that [Xl' Pl(X;)] is a univariate distribution a11d it is the distribu~
tiQn of XI (the number of mL.1.OC defects) alone. Likewise [.l'.1. p:.(~)] is a unl\<1.irlate distribution and it
is the distrib>J.lion
(the number of major defects) alone.

If [X" X,J is a continuous randnm vector, the marginal distribution of X, is


(4-4)

= 2.000

F.gnrc 4-6 A bivariate uniform deosity,

76

Chapter 4

Joint Probability Distributions

Ni
1

p,(x.)

8130

SIS(]

2/S(] : 3130

6/30

3/30 :

41S(]

o I 1130

liS(]

2130

illS(]

1/30

3/30

11/30

ill30

I 3130

lP,(X,) [ 7/S(]

8130
i

1/30 :

4/30

31S0

1713~ 8/30.

7/30

1130

~(')"1[

ai30

9130

(a)

P,('I) +

! /7130
o

La
1

P,(X,)
,1/30
4
X,

lCL
o

6130
1

L
.

4130

j3/30,

'2

2 --3:--- 4
(0)

(b)

Figure 4~7 Marginal distributions for discrete [XI' X:.J. (a) Marginal distributions-tabular fann,
ginal dislribution (x,.p,(x,, (el Marginal distribution (x,. p,(x,),

(b) Mar~

and the marginal distribution of X, is


(4-5)

The function/, is the probability density function for Xl alone, and the fUllctionh is the density function for X2 alone.

Example 4-6
In Example 4-4, the joint density of [X1, X2] was given by

f(x1, x2) = 1/500,  0 ≤ x1 < 0.25, 0 ≤ x2 ≤ 2000,
          = 0,      otherwise.

The marginal distributions of X1 and X2 are

f1(x1) = ∫_0^2000 (1/500) dx2 = 4,  0 ≤ x1 < 0.25,
       = 0,                         otherwise,

and

f2(x2) = ∫_0^0.25 (1/500) dx1 = 1/2000,  0 ≤ x2 ≤ 2000,
       = 0,                              otherwise.

These are shown graphically in Fig. 4-8.

[Figure 4-8 Marginal distributions for the bivariate uniform vector [X1, X2]. (a) Marginal distribution of X1, equal to 4 on [0, 0.25). (b) Marginal distribution of X2, equal to 1/2000 on [0, 2000].]

Sometimes the marginals do not come out so nicely. This is the case, for example, when the range space of [X1, X2] is not rectangular.

Example 4-7
Suppose that the joint density of [X1, X2] is given by

f(x1, x2) = 6x1,  0 < x1 < x2 < 1,
          = 0,    otherwise.

Then the marginal of X1 is

f1(x1) = ∫_{x1}^{1} 6x1 dx2 = 6x1(1 - x1)   for 0 < x1 < 1,

and the marginal of X2 is

f2(x2) = ∫_0^{x2} 6x1 dx1 = 3x2²   for 0 < x2 < 1.

The expected values and variances of X1 and X2 are determined from the marginal distributions exactly as in the univariate case. Where [X1, X2] is discrete, we have

E(X1) = μ1 = Σ_{x1} x1 p1(x1) = Σ_{x1} Σ_{x2} x1 p(x1, x2)    (4-6)

and

V(X1) = σ1² = Σ_{x1} (x1 - μ1)² p1(x1) = Σ_{x1} Σ_{x2} x1² p(x1, x2) - μ1²,    (4-7)



and, similarly,
(4-8)

(~9)

Example 4-8
In Example 4-5 and Fig. 4-7, marginal distributions for X1 and X2 were given. Working with the marginal distribution of X1 shown in Fig. 4-7b, we may calculate

E(X1) = μ1 = 0(7/30) + 1(7/30) + 2(8/30) + 3(7/30) + 4(1/30) = 8/5

and

V(X1) = σ1² = [0²(7/30) + 1²(7/30) + 2²(8/30) + 3²(7/30) + 4²(1/30)] - (8/5)² = 118/30 - 64/25 = 103/75.

The mean and variance of X2 could also be determined using the marginal distribution of X2.
Equations 4-6 through 4-9 show that the mean and variance of X1 and X2, respectively, may be determined from the marginal distributions or directly from the joint distribution. In practice, if the marginal distribution has already been determined, it is usually easier to make use of it.
In the case where [X1, X2] is continuous, then

E(X1) = μ1 = ∫_{-∞}^{∞} x1 f1(x1) dx1 = ∫_{-∞}^{∞} ∫_{-∞}^{∞} x1 f(x1, x2) dx2 dx1,    (4-10)

V(X1) = σ1² = ∫_{-∞}^{∞} (x1 - μ1)² f1(x1) dx1 = ∫_{-∞}^{∞} x1² f1(x1) dx1 - μ1²,    (4-11)

and

E(X2) = μ2 = ∫_{-∞}^{∞} x2 f2(x2) dx2 = ∫_{-∞}^{∞} ∫_{-∞}^{∞} x2 f(x1, x2) dx1 dx2,    (4-12)

V(X2) = σ2² = ∫_{-∞}^{∞} (x2 - μ2)² f2(x2) dx2 = ∫_{-∞}^{∞} x2² f2(x2) dx2 - μ2².    (4-13)

Again, in equations 4-10 through 4-13, observe that we may use either the marginal densities or the joint density in the calculations.
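As a numerical check on equations 4-10 through 4-13, the joint density can be integrated directly. A minimal sketch using SciPy (an assumed tool, not part of the text) and the density f(x1, x2) = 6x1 on 0 < x1 < x2 < 1 from Example 4-7:

from scipy.integrate import dblquad

f = lambda x1, x2: 6.0 * x1  # joint density on 0 < x1 < x2 < 1

# dblquad integrates func(inner, outer): the outer variable here is x2
# in (0, 1) and the inner variable is x1 in (0, x2).
E1, _ = dblquad(lambda x1, x2: x1 * f(x1, x2), 0, 1, 0, lambda x2: x2)
E1sq, _ = dblquad(lambda x1, x2: x1**2 * f(x1, x2), 0, 1, 0, lambda x2: x2)

print(E1)            # 0.5, agreeing with the marginal calculation
print(E1sq - E1**2)  # 0.05 = 1/20, the variance V(X1)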

Example 4-9
In Example 4-4, the joint density of weld diameters, X1, and shear strength, X2, was given as

f(x1, x2) = 1/500,  0 ≤ x1 < 0.25, 0 ≤ x2 ≤ 2000,
          = 0,      otherwise,

and the marginal densities for X1 and X2 were given in Example 4-6 as

f1(x1) = 4,  0 ≤ x1 < 0.25,
       = 0,  otherwise,

and

f2(x2) = 1/2000,  0 ≤ x2 ≤ 2000,
       = 0,       otherwise.

Working with the marginal densities, the mean and variance of X1 are thus

E(X1) = ∫_0^0.25 4x1 dx1 = 1/8

and

V(X1) = ∫_0^0.25 4x1² dx1 - (1/8)² = 1/48 - 1/64 = 1/192.

4-4 CONDITIONAL DISTRIBUTIONS

When dealing with two jointly distributed random variables, it may be of interest to find the distribution of one of these variables given a particular value of the other. That is, we may seek the distribution of X1 given that X2 = x2. For example, what is the distribution of a person's weight given that he is a particular height? This probability distribution would be called the conditional distribution of X1 given that X2 = x2.
Suppose that the random vector [X1, X2] is discrete. From the definition of conditional probability, it is easily seen that the conditional probability distributions are

p_{X2|x1}(x2) = p(x1, x2) / p1(x1)    (4-14)

and

p_{X1|x2}(x1) = p(x1, x2) / p2(x2),    (4-15)

where p1(x1) > 0 and p2(x2) > 0.
It should be noted that there are as many conditional distributions of X2 for given x1 as there are values x1 with p1(x1) > 0, and there are as many conditional distributions of X1 for given x2 as there are values x2 with p2(x2) > 0.


Example 4-10
Consider the counting of minor and major defects of the small pumps in Example 4-2 and Fig. 4-7. There will be five conditional distributions of X2, one for each value of x1. They are shown in Fig. 4-9. The distribution p_{X2|0}(x2), for x1 = 0, is shown in Fig. 4-9a. Figure 4-9b shows the distribution p_{X2|1}(x2). Other conditional distributions could likewise be determined for x1 = 2, 3, and 4, respectively. The conditional distribution of X1 given that X2 = 3 is determined in the same manner from equation 4-15.

If [X1, X2] is a continuous random vector, the conditional densities are

f_{X2|x1}(x2) = f(x1, x2) / f1(x1)    (4-16)

and

f_{X1|x2}(x1) = f(x1, x2) / f2(x2).    (4-17)
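In the discrete case, equations 4-14 and 4-15 amount to normalizing one row (or column) of the joint table by the corresponding marginal probability. A short sketch, reusing the hypothetical table from the earlier marginal example:

import numpy as np

p = np.array([[0.10, 0.05, 0.05],
              [0.20, 0.10, 0.10],
              [0.15, 0.15, 0.10]])  # hypothetical p(x1, x2)

p1 = p.sum(axis=1)                  # marginal of X1
cond_X2_given_x1 = p / p1[:, None]  # row i is p_{X2|x1=i}(x2), equation 4-14

print(cond_X2_given_x1[0])          # [0.5 0.25 0.25]; each row sums to 1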

Example 4-11
Suppose the joint density of [X1, X2] is the function f presented here and shown in Fig. 4-10:

f(x1, x2) = x1² + x1x2/3,  0 < x1 ≤ 1, 0 ≤ x2 ≤ 2,
          = 0,             otherwise.

[Figure 4-9 Some examples of conditional distributions: (a) p_{X2|0}(x2) = p(0, x2)/p1(0); (b) p_{X2|1}(x2) = p(1, x2)/p1(1).]

[Figure 4-10 A bivariate density function.]

The marginal densities are f1(x1) and f2(x2). These are determined as

f1(x1) = ∫_0^2 (x1² + x1x2/3) dx2 = 2x1² + (2/3)x1,  0 < x1 ≤ 1,
       = 0,                                          otherwise,

and

f2(x2) = ∫_0^1 (x1² + x1x2/3) dx1 = 1/3 + x2/6,  0 ≤ x2 ≤ 2,
       = 0,                                      otherwise.

The marginal densities are shown in Fig. 4-11.


The conditional densities may be determined using equations 4-16 and 4-17 as

f_{X2|x1}(x2) = (x1² + x1x2/3) / (2x1² + (2/3)x1),  0 ≤ x2 ≤ 2,
             = 0,                                   otherwise,

and

f_{X1|x2}(x1) = x1(3x1 + x2) / (1 + x2/2),  0 < x1 ≤ 1,
             = 0,                           otherwise.

[Figure 4-11 The marginal densities for Example 4-11.]

Note that for f_{X2|x1}(x2) there are an infinite number of these conditional densities, one for each value 0 < x1 ≤ 1. Two of these, f_{X2|1/2}(x2) and f_{X2|1}(x2), are shown in Fig. 4-12. Also, for f_{X1|x2}(x1) there are an infinite number of these conditional densities, one for each value 0 ≤ x2 ≤ 2. Three of these are shown in Fig. 4-13.

[Figure 4-12 Two conditional densities, f_{X2|1/2} and f_{X2|1}, from Example 4-11.]

4-5 CONDITIONAL EXPECTATION

In this section we are interested in such questions as determining a person's expected weight given that he is a particular height. More generally, we want to find the expected value of X1 given information about X2. If [X1, X2] is a discrete random vector, the conditional expectations are

E(X"x,) = }:;XtPx,lx, (xt)

(4-18)

x,

and

E(X,IX1J = }:;x2PX,lz, (X2)'


x,

(4-19)

Note that there will be an E(X,f<,) for each value of x,. The value of each E(X,f<,) will
depend on the value x" which is in turn govemed by the probability function. Similarly,
there will be as many values of E(X,Ix,) as there are values X" and the value of
will
depend on the value determined by the probability function.

X,

E(X,Ix,)


Example 4-12
Consider the probability distribution of the discrete random vector [X1, X2], where X1 represents the number of orders for a large turbine in July and X2 represents the number of orders in August. The joint distribution as well as the marginal distributions are given in Fig. 4-14. We consider the three conditional distributions p_{X2|0}, p_{X2|1}, and p_{X2|2}, and the conditional expected value of each:

p_{X2|0}(x2) = 1/6,  x2 = 0,
             = 1/3,  x2 = 1,
             = 1/3,  x2 = 2,
             = 1/6,  x2 = 3,
             = 0,    otherwise;
E(X2|0) = 0(1/6) + 1(1/3) + 2(1/3) + 3(1/6) = 1.5.

p_{X2|1}(x2) = 1/10,  x2 = 0,
             = 1/2,   x2 = 1,
             = 3/10,  x2 = 2,
             = 1/10,  x2 = 3,
             = 0,     otherwise;
E(X2|1) = 0(1/10) + 1(1/2) + 2(3/10) + 3(1/10) = 1.4.

p_{X2|2}(x2) = 1/2,  x2 = 0,
             = 1/4,  x2 = 1,
             = 1/4,  x2 = 2,
             = 0,    x2 = 3,
             = 0,    otherwise;
E(X2|2) = 0(1/2) + 1(1/4) + 2(1/4) + 3(0) = 0.75.
If [X1, X2] is a continuous random vector, the conditional expectations are

E(X1|x2) = ∫_{-∞}^{∞} x1 f_{X1|x2}(x1) dx1    (4-20)

and

E(X2|x1) = ∫_{-∞}^{∞} x2 f_{X2|x1}(x2) dx2,    (4-21)

and in each case there will be an infinite number of values that the expected value may take. In equation 4-20, there will be one value of E(X1|x2) for each value x2, and in equation 4-21, there will be one value of E(X2|x1) for each value x1.

Figure 4-14 Joint and marginal distributions of [X1, X2]. Values in the body of the table are p(x1, x2):

x2\x1  |  0     1     2   | p2(x2)
  0    | 0.05  0.05  0.10 |  0.2
  1    | 0.10  0.25  0.05 |  0.4
  2    | 0.10  0.15  0.05 |  0.3
  3    | 0.05  0.05  0.00 |  0.1
p1(x1) | 0.30  0.50  0.20 |

Example 4-13
In Example 4-11, we considered a joint density f, where

f(x1, x2) = x1² + x1x2/3,  0 < x1 ≤ 1, 0 ≤ x2 ≤ 2,
          = 0,             otherwise.


The conditional densities were

f_{X2|x1}(x2) = (x1² + x1x2/3) / (2x1² + (2/3)x1),  0 ≤ x2 ≤ 2,
             = 0,                                   otherwise,

and

f_{X1|x2}(x1) = x1(3x1 + x2) / (1 + x2/2),  0 < x1 ≤ 1,
             = 0,                           otherwise.

Then, using equation 4-21, E(X2|x1) is determined as

E(X2|x1) = ∫_0^2 x2 · (3x1 + x2)/(2(3x1 + 1)) dx2 = (9x1 + 4)/(9x1 + 3).

It should be noted that this is a function of x1. For the two conditional densities shown in Fig. 4-12, where x1 = 1/2 and x1 = 1, the corresponding expected values are E(X2|1/2) = 17/15 and E(X2|1) = 13/12.

Since E(X2|x1) is a function of x1, and x1 is a realization of the random variable X1, E(X2|X1) is a random variable, and we may consider the expected value of E(X2|X1), that is, E[E(X2|X1)]. The inner operator is the expectation of X2 given X1 = x1, and the outer expectation is with respect to the marginal density of X1. This suggests the following "double expectation" result.

Theorem 4-1

E[E(X2|X1)] = E(X2) = μ2    (4-22)

and

E[E(X1|X2)] = E(X1) = μ1.    (4-23)

Proof Suppose that X1 and X2 are continuous random variables. (The discrete case is similar.) Since the random variable E(X2|X1) is a function of X1, the law of the unconscious statistician (see Section 3-5) says that

E[E(X2|X1)] = ∫_{-∞}^{∞} E(X2|x1) f1(x1) dx1
            = ∫_{-∞}^{∞} [∫_{-∞}^{∞} x2 f_{X2|x1}(x2) dx2] f1(x1) dx1
            = ∫_{-∞}^{∞} x2 ∫_{-∞}^{∞} f(x1, x2) dx1 dx2
            = ∫_{-∞}^{∞} x2 f2(x2) dx2
            = E(X2),

which is equation 4-22. Equation 4-23 is derived similarly, and the proof is complete.
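Theorem 4-1 is also easy to check by simulation: draw from the joint density, average x2, and compare with E(X2) = 10/9 obtained in Example 4-14 below. A sketch using simple rejection sampling for the Example 4-11 density (the bound M = 5/3 is the maximum of f on the rectangle):

import numpy as np

rng = np.random.default_rng(1)
f = lambda x1, x2: x1**2 + x1 * x2 / 3.0   # Example 4-11 density
M = 5.0 / 3.0                              # max of f on (0, 1] x [0, 2]

# Rejection sampling: propose uniformly on the rectangle, accept a
# proposal (x1, x2) with probability f(x1, x2) / M.
n = 200_000
x1 = rng.uniform(0, 1, n)
x2 = rng.uniform(0, 2, n)
keep = rng.uniform(0, M, n) < f(x1, x2)

print(x2[keep].mean())   # ~1.111 = 10/9 = E(X2) = E[E(X2|X1)]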

Example 4-14
Consider yet again the joint probability density function from Example 4-11,

f(x1, x2) = x1² + x1x2/3,  0 < x1 ≤ 1, 0 ≤ x2 ≤ 2,
          = 0,             otherwise.

In that example, we derived expressions for the marginal densities f1(x1) and f2(x2). We also derived E(X2|x1) = (9x1 + 4)/(9x1 + 3) in Example 4-13. Thus,

E[E(X2|X1)] = ∫_0^1 [(9x1 + 4)/(9x1 + 3)] (2x1² + (2/3)x1) dx1 = 10/9.

Note that this is also E(X2), since

E(X2) = ∫_0^2 x2 f2(x2) dx2 = ∫_0^2 x2 (1/3 + x2/6) dx2 = 10/9.

The variance operator may be applied to conditional distributions exactly as in the univariate case.

4-6 REGRESSION OF THE MEAN

It has been observed previously that E(X2|x1) is a value of the random variable E(X2|X1) for a particular X1 = x1, and it is a function of x1. The graph of this function is called the regression of X2 on X1. Alternatively, the function E(X1|x2) would be called the regression of X1 on X2. This is demonstrated in Fig. 4-15.

Example 4-15
In Example 4-13 we found E(X2|x1) for the bivariate density of Example 4-11, that is,

f(x1, x2) = x1² + x1x2/3,  0 < x1 ≤ 1, 0 ≤ x2 ≤ 2,
          = 0,             otherwise.

The result was

E(X2|x1) = (9x1 + 4)/(9x1 + 3).

In a like manner, we may find E(X1|x2).

[Figure 4-15 Some regression curves. (a) Regression of X2 on X1. (b) Regression of X1 on X2.]

Regression will be discussed further in Chapters 14 and 15.

4-7 INDEPENDENCE OF RANDOM VARIABLES

The notions of independence and independent random variables are very useful and important statistical concepts. In Chapter 1 the idea of independent events was introduced and a formal definition of this concept was presented. We are now concerned with defining independent random variables. When the outcome of one variable, say X1, does not influence the outcome of X2, and vice versa, we say the random variables X1 and X2 are independent.

Definition
1. If [X1, X2] is a discrete random vector, then we say that X1 and X2 are independent if and only if

p(x1, x2) = p1(x1) · p2(x2)    (4-24)

for all x1 and x2.

2. If [X1, X2] is a continuous random vector, then we say that X1 and X2 are independent if and only if

f(x1, x2) = f1(x1) · f2(x2)    (4-25)

for all x1 and x2.


Utilizing this definition and the properties of conditional probability distributions, we may extend the concept of independence to a theorem.

Theorem 4-2
1. Let [X1, X2] be a discrete random vector. Then

p_{X2|x1}(x2) = p2(x2)

and

p_{X1|x2}(x1) = p1(x1)

for all x1 and x2 if and only if X1 and X2 are independent.

2. Let [X1, X2] be a continuous random vector. Then

f_{X2|x1}(x2) = f2(x2)


and

f_{X1|x2}(x1) = f1(x1)

for all x1 and x2 if and only if X1 and X2 are independent.

Proof We consider here only the continuous case. We see that

f_{X2|x1}(x2) = f2(x2) for all x1 and x2

if and only if

f(x1, x2)/f1(x1) = f2(x2) for all x1 and x2,

if and only if

f(x1, x2) = f1(x1) f2(x2) for all x1 and x2,

if and only if X1 and X2 are independent, and we are done.
Note that the requirement for the joint distribution to be factorable into the respective marginal distributions is somewhat similar to the requirement that, for independent events, the probability of the intersection of events equals the product of the event probabilities.
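A quick computational test of equation 4-24 for a discrete table is to compare the joint table with the outer product of its marginals. A sketch, using the wrecker-call table of Fig. 4-16 in Example 4-16 below, which factors exactly:

import numpy as np

def is_independent(p, tol=1e-12):
    """Check p(x1, x2) == p1(x1) * p2(x2) for every cell (equation 4-24)."""
    p1 = p.sum(axis=1)
    p2 = p.sum(axis=0)
    return np.allclose(p, np.outer(p1, p2), atol=tol)

p = np.array([[0.02, 0.04, 0.06, 0.04, 0.04],
              [0.02, 0.04, 0.06, 0.04, 0.04],
              [0.01, 0.02, 0.03, 0.02, 0.02],
              [0.04, 0.08, 0.12, 0.08, 0.08],
              [0.01, 0.02, 0.03, 0.02, 0.02]])
print(is_independent(p))   # True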

Example 4-16
A city transit service receives calls from broken-down buses, and a wrecker crew must haul the buses in for service. The joint distribution of the number of calls received on Mondays and Tuesdays is given in Fig. 4-16, along with the marginal distributions. The variable X1 represents the number of calls on Mondays and X2 represents the number of calls on Tuesdays. A quick inspection will show that X1 and X2 are independent, since the joint probabilities are the product of the appropriate marginal probabilities.

4-8 COVARIANCE AND CORRELATION

We have noted that E(X1) = μ1 and V(X1) = σ1² are the mean and variance of X1. They may be determined from the marginal distribution of X1. In a similar manner, μ2 and σ2² are the mean and variance of X2. Two measures used in describing the degree of association between X1 and X2 are the covariance of [X1, X2] and the correlation coefficient.

Figure 4-16 Joint probabilities for wrecker calls:

x2\x1  |  0     1     2     3     4   | p2(x2)
  0    | 0.02  0.04  0.06  0.04  0.04 |  0.2
  1    | 0.02  0.04  0.06  0.04  0.04 |  0.2
  2    | 0.01  0.02  0.03  0.02  0.02 |  0.1
  3    | 0.04  0.08  0.12  0.08  0.08 |  0.4
  4    | 0.01  0.02  0.03  0.02  0.02 |  0.1
p1(x1) | 0.1   0.2   0.3   0.2   0.2  |


Definition
If [X1, X2] is a two-dimensional random variable, the covariance, denoted σ12, is

Cov(X1, X2) = σ12 = E[(X1 - E(X1))(X2 - E(X2))],    (4-26)

and the correlation coefficient, denoted ρ, is

ρ = Cov(X1, X2) / [√V(X1) · √V(X2)] = σ12 / (σ1 σ2).    (4-27)

The covariance is measured in the units of X1 times the units of X2. The correlation coefficient is a dimensionless quantity that measures the linear association between two random variables. By performing the multiplication operations in equation 4-26 before distributing the outside expected value operator across the resulting quantities, we obtain an alternate form for the covariance as follows:

Cov(X1, X2) = E(X1X2) - E(X1) · E(X2).    (4-28)

Theorem 4-3
If X1 and X2 are independent, then ρ = 0.

Proof We again prove the result for the continuous case. If X1 and X2 are independent,

E(X1X2) = ∫_{-∞}^{∞} ∫_{-∞}^{∞} x1x2 f(x1, x2) dx1 dx2
        = ∫_{-∞}^{∞} ∫_{-∞}^{∞} x1 f1(x1) · x2 f2(x2) dx1 dx2
        = [∫_{-∞}^{∞} x1 f1(x1) dx1] · [∫_{-∞}^{∞} x2 f2(x2) dx2]
        = E(X1) · E(X2).

Thus Cov(X1, X2) = 0 from equation 4-28, and ρ = 0 from equation 4-27. A similar argument would be used for [X1, X2] discrete.
The converse of the theorem is not necessarily true, and we may have ρ = 0 without the variables being independent. If ρ = 0, the random variables are said to be uncorrelated.

Theorem 4-4
The value of ρ will be on the interval [-1, +1], that is,

-1 ≤ ρ ≤ +1.

Proof Consider the function Q defined below and illustrated in Fig. 4-17:

Q(t) = E[(X1 - E(X1)) + t(X2 - E(X2))]²
     = E[X1 - E(X1)]² + 2tE[(X1 - E(X1))(X2 - E(X2))] + t²E[X2 - E(X2)]².

Since Q(t) ≥ 0, the discriminant of Q(t) must be ≤ 0, so that

{2E[(X1 - E(X1))(X2 - E(X2))]}² - 4E[X1 - E(X1)]² E[X2 - E(X2)]² ≤ 0.


[Figure 4-17 The quadratic Q(t).]

It follows that

4[Cov(X1, X2)]² - 4V(X1) · V(X2) ≤ 0,

so

[Cov(X1, X2)]² / [V(X1)V(X2)] ≤ 1    (4-29)

and

-1 ≤ ρ ≤ +1.

A correlation of ρ ≈ +1 indicates a "high positive correlation," such as one might find between the stock prices of Ford and General Motors (two related companies). On the other hand, a "high negative correlation," ρ ≈ -1, might exist between snowfall and temperature.
~~"li'4?i~
"_";..."'-:;:,.t~L=;W"",,,,,\.
Rocall Example 4-7, which examined a continuous random voctor rXj' X2] with the joint probability

density function

O<X1 <X1 < 1.


otherwise.
The marginal of Xl is

f,(x,l =6x,([ -x;),


-:::;: 0,

forO <""; < [,


otherwise,

and the marginal of Xl is

f,(x,) =

3:s,

= O.

for 0 < x, < [,


o:hci:wise.

These facts yield the following results:

E(X;) = f>X~([-Xl)dxl =1/2.

r"

I ') =10 6x;(1-x dx =3/[0,


E\X~
ll 1

V(Xl)=E(XJ)-[E(X,)]' =[/20.


E(X2) = ∫_0^1 3x2³ dx2 = 3/4,

E(X2²) = ∫_0^1 3x2⁴ dx2 = 3/5,

V(X2) = E(X2²) - [E(X2)]² = 3/5 - 9/16 = 3/80.

Further, we have

E(X1X2) = ∫_0^1 ∫_0^{x2} 6x1²x2 dx1 dx2 = ∫_0^1 2x2⁴ dx2 = 2/5.

This implies that

Cov(X1, X2) = E(X1X2) - E(X1)E(X2) = 2/5 - (1/2)(3/4) = 1/40,

and then

ρ = Cov(X1, X2) / √(V(X1) V(X2)) = (1/40) / √((1/20)(3/80)) = 1/√3 ≈ 0.577.

Example 4-18
A continuous random vector [X1, X2] has density function f as given below:

f(x1, x2) = 1,  -x2 < x1 < +x2, 0 < x2 < 1,
          = 0,  otherwise.

This function is shown in Fig. 4-18.

The marginal densities are

f1(x1) = 1 - x1,  for 0 < x1 < 1,
       = 1 + x1,  for -1 < x1 < 0,
       = 0,       otherwise,

and

f2(x2) = 2x2,  for 0 < x2 < 1,
       = 0,    otherwise.

[Figure 4-18 A joint density from Example 4-18.]

Since f(x1, x2) ≠ f1(x1) · f2(x2), the variables are not independent. If we calculate the covariance, we obtain

Cov(X1, X2) = E(X1X2) - E(X1)E(X2) = 0 - (0) · E(X2) = 0,

and thus ρ = 0, so the variables are uncorrelated although they are not independent.

Finally, it is noted that if X2 is related to X1 linearly, that is, X2 = A + BX1, then ρ² = 1. If B > 0, then ρ = +1; and if B < 0, ρ = -1. Thus, as we observed earlier, the correlation coefficient is a measure of linear association between two random variables.
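The distinction between "uncorrelated" and "independent" in Example 4-18 is easy to see by simulation: the sample correlation is near zero even though the spread of X1 visibly depends on X2. A sketch (X2 is drawn from f2(x2) = 2x2 by inversion; then X1 is uniform on (-x2, x2)):

import numpy as np

rng = np.random.default_rng(7)
n = 100_000
x2 = np.sqrt(rng.uniform(0, 1, n))   # f2(x2) = 2x2 on (0, 1)
x1 = rng.uniform(-x2, x2)            # X1 | x2 ~ uniform(-x2, x2)

print(np.corrcoef(x1, x2)[0, 1])     # ~0: uncorrelated
# ...but dependent: the conditional spread of X1 grows with x2.
print(x1[x2 < 0.3].std(), x1[x2 > 0.9].std())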

4-9 THE DISTRIBUTION FUNCTION FOR TWO-DIMENSIONAL RANDOM VARIABLES

The distribution function of the random vector [X1, X2] is F, where

F(x1, x2) = P(X1 ≤ x1, X2 ≤ x2).    (4-30)

This is the probability over the shaded region in Fig. 4-19.


If [X1, X2] is discrete, then

F(x1, x2) = Σ_{t1 ≤ x1} Σ_{t2 ≤ x2} p(t1, t2),    (4-31)

and if [X1, X2] is continuous, then

F(x1, x2) = ∫_{-∞}^{x2} ∫_{-∞}^{x1} f(t1, t2) dt1 dt2.    (4-32)

Example 4-19
Suppose X1 and X2 have the following density:

f(x1, x2) = 24x1x2,  x1 > 0, x2 > 0, x1 + x2 < 1,
          = 0,       otherwise.

Looking at the Euclidean plane shown in Fig. 4-20, we see several cases that must be considered.
1. x1 ≤ 0: F(x1, x2) = 0.
2. x2 ≤ 0: F(x1, x2) = 0.

[Figure 4-19 Domain of integration or summation for F(x1, x2).]


[Figure 4-20 The domain of F, Example 4-19.]

3. x1 > 0, x2 > 0, x1 + x2 < 1:

F(x1, x2) = ∫_0^{x2} ∫_0^{x1} 24 t1 t2 dt1 dt2 = 6x1²x2².

5. 0 < x1 < 1 and x2 > 1:

F(x1, x2) = ∫_0^{x1} ∫_0^{1-t1} 24 t1 t2 dt2 dt1 = 6x1² - 8x1³ + 3x1⁴.

The remaining cases are determined in the same manner.
The function F has properties analogous to those discussed in the one-dimensional case. We note that when X1 and X2 are continuous,

f(x1, x2) = ∂²F(x1, x2)/∂x1 ∂x2

if the derivatives exist.

4-10 FUNCTIONS OF TWO RANDOM VARIABLES

Often we will be interested in functions of several random variables; however, at present, this section will concentrate on functions of two random variables, say Y = H(X1, X2). Since X1 = X1(e) and X2 = X2(e), we see that Y = H[X1(e), X2(e)] clearly depends on the outcome of the original experiment, and thus Y is a random variable with range space R_Y.
The problem of finding the distribution of Y is somewhat more involved than in the case of functions of one variable; however, if [X1, X2] is discrete, the procedure is straightforward if X1 and X2 take on a relatively small number of values.


Example 4-20
If X1 represents the number of defective units produced by machine No. 1 in 1 hour and X2 represents the number of defective units produced by machine No. 2 in the same hour, then the joint distribution might be presented as in Fig. 4-21. Furthermore, suppose the random variable Y = H(X1, X2), where H(x1, x2) = 2x1 + x2. It follows that R_Y = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}. In order to determine, say, P(Y = 0) = p_Y(0), we note that Y = 0 if and only if X1 = 0 and X2 = 0; therefore, p_Y(0) = 0.02.
We note that Y = 1 if and only if X1 = 0 and X2 = 1; therefore, p_Y(1) = 0.06. We also note that Y = 2 if and only if either X1 = 0, X2 = 2 or X1 = 1, X2 = 0; so p_Y(2) = 0.10 + 0.03 = 0.13. Using similar logic, we obtain the rest of the distribution, as follows:

  y     | 0     1     2     3     4     5     6     7     8     9     otherwise
p_Y(y)  | 0.02  0.06  0.13  0.11  0.19  0.15  0.21  0.07  0.05  0.01  0
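The enumeration in Example 4-20 is mechanical and easy to automate: walk over the joint table of Fig. 4-21 and accumulate probability by the value of y = 2x1 + x2. A sketch:

from collections import defaultdict

# p[x2][x1] from Fig. 4-21 (rows x2 = 0..3, columns x1 = 0..3).
p = [[0.02, 0.03, 0.04, 0.01],
     [0.06, 0.09, 0.12, 0.03],
     [0.10, 0.15, 0.20, 0.05],
     [0.02, 0.03, 0.04, 0.01]]

py = defaultdict(float)
for x2 in range(4):
    for x1 in range(4):
        py[2 * x1 + x2] += p[x2][x1]

print([round(py[y], 2) for y in range(10)])
# [0.02, 0.06, 0.13, 0.11, 0.19, 0.15, 0.21, 0.07, 0.05, 0.01]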

In the case where the random vector is continuous with joint density function f(x1, x2), and H(x1, x2) is continuous, then Y = H(X1, X2) is a continuous, one-dimensional random variable. The general procedure for the determination of the density function of Y is outlined below.

1. We are given Y = H1(X1, X2).
2. Introduce a second random variable Z = H2(X1, X2). The function H2 is selected for convenience, but we want to be able to solve y = H1(x1, x2) and z = H2(x1, x2) for x1 and x2 in terms of y and z.
3. Find x1 = G1(y, z) and x2 = G2(y, z).
4. Find the following partial derivatives (we assume they exist and are continuous):

∂x1/∂y,  ∂x1/∂z,  ∂x2/∂y,  ∂x2/∂z.

Figure 4-21 Joint distribution of defectives produced on two machines, p(x1, x2):

x2\x1  |  0     1     2     3   | p2(x2)
  0    | 0.02  0.03  0.04  0.01 |  0.1
  1    | 0.06  0.09  0.12  0.03 |  0.3
  2    | 0.10  0.15  0.20  0.05 |  0.5
  3    | 0.02  0.03  0.04  0.01 |  0.1
p1(x1) | 0.2   0.3   0.4   0.1  |


5. The joint density of [Y, Z], denoted t(y, z), is found as follows:

t(y, z) = f(G1(y, z), G2(y, z)) |J(y, z)|,    (4-33)

where J(y, z), called the Jacobian of the transformation, is given by the determinant

J(y, z) = | ∂x1/∂y  ∂x1/∂z |
          | ∂x2/∂y  ∂x2/∂z |.    (4-34)

6. The density of Y, say g_Y, is then found as

g_Y(y) = ∫_{-∞}^{∞} t(y, z) dz.    (4-35)

Example 4-21
Consider the continuous random vector [X1, X2] with the following density:

f(x1, x2) = 4e^{-2(x1+x2)},  x1 > 0, x2 > 0,
          = 0,               otherwise.

Suppose we are interested in the distribution of Y = X1/X2. We will let y = x1/x2 and choose z = x1 + x2, so that x1 = yz/(1+y) and x2 = z/(1+y). It follows that

∂x1/∂y = z/(1+y)²,    ∂x1/∂z = y/(1+y),
∂x2/∂y = -z/(1+y)²,   ∂x2/∂z = 1/(1+y).

Therefore,

J(y, z) = [z/(1+y)²] · [1/(1+y)] + [y/(1+y)] · [z/(1+y)²] = z/(1+y)²,

and

f[G1(y, z), G2(y, z)] = 4e^{-2[yz/(1+y) + z/(1+y)]} = 4e^{-2z}.

Thus,

t(y, z) = 4e^{-2z} · z/(1+y)²,  y > 0, z > 0,

and

g_Y(y) = [1/(1+y)²] ∫_0^∞ 4z e^{-2z} dz = 1/(1+y)²,  y > 0,
       = 0,                                          otherwise.
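Since f(x1, x2) = 4e^{-2(x1+x2)} factors into two exponential densities with rate 2, the result g_Y(y) = 1/(1+y)² is easy to check by simulation; for instance, P(Y ≤ 1) = ∫_0^1 (1+y)^{-2} dy = 1/2. A sketch:

import numpy as np

rng = np.random.default_rng(3)
n = 1_000_000
x1 = rng.exponential(scale=0.5, size=n)  # rate-2 exponential, 2e^{-2x}
x2 = rng.exponential(scale=0.5, size=n)
y = x1 / x2

print((y <= 1).mean())   # ~0.5, the integral of 1/(1+y)^2 from 0 to 1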

4-11 JOINT DISTRIBUTIONS OF DIMENSION n > 2

Should we have three or more random variables, the random vector will be denoted [X1, X2, ..., Xn], and extensions will follow from the two-dimensional case. We will assume that the variables are continuous; however, the results may readily be extended to the discrete case by substituting the appropriate summation operations for integrals. We assume the existence of a joint density f such that

f(x1, x2, ..., xn) ≥ 0    (4-36)

and

∫_{-∞}^{∞} ··· ∫_{-∞}^{∞} f(x1, x2, ..., xn) dx1 dx2 ··· dxn = 1.

Thus,

P[(X1, X2, ..., Xn) ∈ A] = ∫ ··· ∫_A f(x1, x2, ..., xn) dx1 dx2 ··· dxn.    (4-37)

The marginal densities are determined as follows:

f1(x1) = ∫_{-∞}^{∞} ··· ∫_{-∞}^{∞} f(x1, x2, ..., xn) dxn ··· dx2,

f2(x2) = ∫_{-∞}^{∞} ··· ∫_{-∞}^{∞} f(x1, x2, ..., xn) dxn ··· dx3 dx1,
...
fn(xn) = ∫_{-∞}^{∞} ··· ∫_{-∞}^{∞} f(x1, x2, ..., xn) dx_{n-1} ··· dx2 dx1.

The integration is over all variables having a subscript different from the one for which the marginal density is required.

Definition
The variables [X1, X2, ..., Xn] are independent random variables if and only if, for all [x1, x2, ..., xn],

f(x1, x2, ..., xn) = f1(x1) · f2(x2) ··· fn(xn).    (4-38)

The expected value of, say, X1 is

E(X1) = μ1 = ∫_{-∞}^{∞} ··· ∫_{-∞}^{∞} x1 f(x1, x2, ..., xn) dx1 dx2 ··· dxn = ∫_{-∞}^{∞} x1 f1(x1) dx1,    (4-39)

and the variance is

V(X1) = σ1² = ∫_{-∞}^{∞} (x1 - μ1)² f1(x1) dx1.    (4-40)

We recognize these as the mean and variance, respectively, of the marginal distribution of X1.
In the two-dimensional case considered earlier, geometric interpretations were instructive; however, in dealing with n-dimensional random vectors, the range space is Euclidean n-space, and graphical presentations are thus not possible. The marginal distributions are, however, in one dimension, and the conditional distribution for one variable given values for the other variables is in one dimension. The conditional density of X1 given values (x2, x3, ..., xn) is denoted

f_{X1|x2, ..., xn}(x1),    (4-41)

and the expected value of X1 for given (x2, ..., xn) is

E(X1|x2, x3, ..., xn) = ∫_{-∞}^{∞} x1 f_{X1|x2, ..., xn}(x1) dx1.    (4-42)

The hypothetical graph of E(X1|x2, x3, ..., xn) as a function of the vector [x2, x3, ..., xn] is called the regression of X1 on (X2, X3, ..., Xn).

4-12 LINEAR COMBINATIONS

The consideration of general functions of random variables, say X1, X2, ..., Xn, is beyond the scope of this text. However, there is one particular function of the form Y = H(X1, ..., Xn), where

Y = a0 + a1X1 + a2X2 + ··· + anXn,    (4-43)

that is of interest. The ai are real constants for i = 0, 1, 2, ..., n. This is called a linear combination of the variables X1, X2, ..., Xn. A special situation occurs when a0 = 0 and a1 = a2 = ··· = an = 1, in which case we have a sum Y = X1 + X2 + ··· + Xn.

Example 4-22
Four resistors are connected in series as shown in Fig. 4-22. Each resistor has a resistance that is a random variable. The resistance of the assembly may be denoted Y, where Y = X1 + X2 + X3 + X4.

Example 4-23
Two parts are to be assembled as shown in Fig. 4-23. The clearance can be expressed as Y = X1 - X2, or Y = (1)X1 + (-1)X2; of course, a negative clearance would mean interference. This is a linear combination with a0 = 0, a1 = 1, and a2 = -1.

Example 4-24
A sample of 10 items is randomly selected from the output of a process that manufactures a small shaft used in electric fan motors, and the diameters are to be measured, with a value called the sample mean calculated as

X̄ = (1/10)(X1 + X2 + ··· + X10).

The value X̄ = (1/10)X1 + (1/10)X2 + ··· + (1/10)X10 is a linear combination with a0 = 0 and a1 = a2 = ··· = a10 = 1/10.

[Figure 4-22 Resistors in series.]
[Figure 4-23 A simple assembly.]


Let us next consider how to determine the mean and variance of linear combinations. Consider the sum of two random variables,

Y = X1 + X2.    (4-44)

The mean of Y, μY = E(Y), is given as

E(Y) = E(X1) + E(X2) = μ1 + μ2.    (4-45)

However, the variance calculation is not so obvious:

V(Y) = E[Y - E(Y)]² = E(Y²) - [E(Y)]²
     = E[(X1 + X2)²] - [E(X1 + X2)]²
     = E[X1² + 2X1X2 + X2²] - [E(X1) + E(X2)]²
     = E(X1²) + 2E(X1X2) + E(X2²) - [E(X1)]² - 2E(X1) · E(X2) - [E(X2)]²
     = {E(X1²) - [E(X1)]²} + {E(X2²) - [E(X2)]²} + 2[E(X1X2) - E(X1) · E(X2)]
     = V(X1) + V(X2) + 2Cov(X1, X2),

or

σY² = σ1² + σ2² + 2σ12.    (4-46)

These results generalize to any linear combination

Y = a0 + a1X1 + a2X2 + ··· + anXn    (4-47)

as follows:

E(Y) = a0 + Σ_{i=1}^{n} ai E(Xi) = a0 + Σ_{i=1}^{n} ai μi,    (4-48)

where E(Xi) = μi, and

V(Y) = Σ_{i=1}^{n} ai² V(Xi) + Σ_{i=1}^{n} Σ_{j=1, j≠i}^{n} ai aj Cov(Xi, Xj),    (4-49)

or

σY² = Σ_{i=1}^{n} ai² σi² + Σ_{i=1}^{n} Σ_{j=1, j≠i}^{n} ai aj σij.
If the variables are independent, the expression for the variance of Y is greatly simplified, as all the covariance terms are zero. In this situation the variance of Y is simply

V(Y) = Σ_{i=1}^{n} ai² V(Xi),  or  σY² = Σ_{i=1}^{n} ai² σi².    (4-50)
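Equations 4-48 and 4-49 are one line of matrix arithmetic: with mean vector μ and covariance matrix Σ, E(Y) = a0 + aᵀμ and V(Y) = aᵀΣa. A sketch with hypothetical numbers for the clearance Y = X1 - X2 of Example 4-23:

import numpy as np

a0 = 0.0
a = np.array([1.0, -1.0])        # Y = X1 - X2
mu = np.array([1.5, 1.0])        # hypothetical means
Sigma = np.array([[0.04, 0.01],  # hypothetical variances and covariance
                  [0.01, 0.09]])

EY = a0 + a @ mu                 # equation 4-48
VY = a @ Sigma @ a               # equation 4-49, covariance terms included
print(EY, VY)                    # 0.5 0.11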


In Exa,"l1ple 4-22. four resistors were connected in series so that Y = Xj + Xl + X, + X4 was the .Ie:>'ist~
ance of the assembly, where Xl was the resistance of the first resistor, and so on. 'The mean and vana.'1ce of Y m terms of the me3llS and variances of the components may be easily calc'J.lated. If the
resistors are selected randomly for the assembly, it is reasonable to assume that Xlt Xl' X3 and ~ are

independent,

We have said nothlng yet about the distribution of Y; however, given the mean and variance of XI'
Xz> X3> and X4 we may readily calculate the mean and the variance of Y smce the variab!es are
independent.

Example 4-26
In Example 4-23, where two components were to be assembled, suppose the joint distribution of [X1, X2] is

f(x1, x2) = 8e^{-(2x1 + 4x2)},  x1 ≥ 0, x2 ≥ 0,
          = 0,                  otherwise.

Since f(x1, x2) can be easily factored as

f(x1, x2) = [2e^{-2x1}] · [4e^{-4x2}] = f1(x1) · f2(x2),

X1 and X2 are independent. Furthermore, E(X1) = μ1 = 1/2 and E(X2) = μ2 = 1/4. We may calculate the variances

V(X1) = ∫_0^∞ x1² · 2e^{-2x1} dx1 - (1/2)² = 1/4

and

V(X2) = ∫_0^∞ x2² · 4e^{-4x2} dx2 - (1/4)² = 1/16.

It follows that for the clearance Y = X1 - X2,

E(Y) = E(X1) - E(X2) = 1/2 - 1/4 = 1/4

and

V(Y) = (1)²V(X1) + (-1)²V(X2) = 1/4 + 1/16 = 5/16.

Example 4-27
In Example 4-24, we might expect the random variables X1, X2, ..., X10 to be independent because of the random sampling process. Furthermore, the distribution for each variable Xi is identical. This is shown in Fig. 4-24. In the earlier example, the linear combination of interest was the sample mean

X̄ = (1/10)X1 + (1/10)X2 + ··· + (1/10)X10.

It follows that

E(X̄) = (1/10)E(X1) + (1/10)E(X2) + ··· + (1/10)E(X10)
     = (1/10)μ + (1/10)μ + ··· + (1/10)μ = μ.

[Figure 4-24 Some identical distributions.]

Furthermore,

V(X̄) = (1/10)²V(X1) + (1/10)²V(X2) + ··· + (1/10)²V(X10) = 10(1/100)σ² = σ²/10.
4-13 MOMENT-GENERATING FUNCTIONS AND LINEAR COMBINATIONS

In the case where Y = aX, it is easy to show that

M_Y(t) = M_X(at).    (4-51)

For sums of independent random variables Y = X1 + X2 + ··· + Xn,

M_Y(t) = M_{X1}(t) · M_{X2}(t) ··· M_{Xn}(t).    (4-52)

This property has considerable use in statistics. If the linear combination is of the general form Y = a0 + a1X1 + ··· + anXn and the variables X1, ..., Xn are independent, then

M_Y(t) = e^{a0 t}[M_{X1}(a1 t) · M_{X2}(a2 t) ··· M_{Xn}(an t)].

Linear combinations are to be of particular significance in later chapters, and we will discuss them again at greater length.


4-14 THE LAW OF LARGE NUMBERS

A special case arises in dealing with sums of independent random variables where each variable may take only two values, 0 and 1. Consider the following formulation. An experiment E consists of n independent experiments (trials) Ej, j = 1, 2, ..., n. There are only two outcomes, success {S} and failure {F}, on each trial, so that the sample space is Sj = {S, F}. The probabilities

P(S) = p  and  P(F) = 1 - p = q

remain constant for j = 1, 2, ..., n. We let

Xj = 0, if the jth trial results in failure,
   = 1, if the jth trial results in success,


and

Y = X1 + X2 + ··· + Xn.

Thus Y represents the number of successes in n trials, and Y/n is an approximation (or estimator) for the unknown probability p. For convenience, we will let p̂ = Y/n. Note that this value corresponds to the term f_A used in the relative frequency definition of Chapter 1.
The law of large numbers states that

P(|p̂ - p| ≤ ε) ≥ 1 - p(1 - p)/(nε²),    (4-53)

or, equivalently,

P(|p̂ - p| > ε) ≤ p(1 - p)/(nε²).    (4-54)

To indicate the proof, we note that

E(Y) = nE(Xj) = n[(0 · q) + (1 · p)] = np

and

V(Y) = nV(Xj) = n[(0² · q) + (1² · p) - p²] = np(1 - p).

Since p̂ = Y/n, we have

E(p̂) = (1/n)E(Y) = p    (4-55)

and

V(p̂) = (1/n²)V(Y) = p(1 - p)/n.

Using Chebyshev's inequality,

P(|p̂ - p| ≤ k σ_{p̂}) ≥ 1 - 1/k²,    (4-56)

so if we take ε = k σ_{p̂} = k√(p(1 - p)/n), then 1/k² = p(1 - p)/(nε²) and we obtain equation 4-53.


Thus for arbitrary ε > 0, as n → ∞, P(|p̂ - p| ≤ ε) → 1.
Equation 4-53 may be rewritten, with an obvious notation, as

P(|p̂ - p| ≤ ε) ≥ 1 - α.    (4-57)

We may now fix both ε and α in equation 4-57 and determine the value of n required to satisfy the probability statement as

n ≥ p(1 - p)/(ε²α).    (4-58)
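Equation 4-58 translates directly into a worst-case sample-size calculation, since p(1 - p) ≤ 1/4. A sketch:

import math

def n_required(eps, alpha, p=0.5):
    """Smallest n satisfying equation 4-58; p = 0.5 is the worst case."""
    return math.ceil(p * (1 - p) / (eps**2 * alpha))

print(n_required(0.01, 0.05))   # 50000, as in Example 4-28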


Example 4-28
A manufacturing process operates so that there is a probability p that each item produced is defective, and p is unknown. A random sample of n items is to be selected to estimate p. The estimator to be used is p̂ = Y/n, where

Xj = 0, if the jth item is good,
   = 1, if the jth item is defective,

and

Y = X1 + X2 + ··· + Xn.

It is desired that the probability be at least 0.95 that the error, |p̂ - p|, not exceed 0.01. In order to determine the required value of n, we note that ε = 0.01 and α = 0.05; however, p is unknown. Equation 4-58 indicates that

n ≥ p(1 - p) / [(0.01)²(0.05)].

Since p is unknown, the worst possible case must be assumed [note that p(1 - p) is maximum when p = 1/2]. This yields

n ≥ (0.5)(0.5) / [(0.01)²(0.05)] = 50,000,

a very large number indeed.

Example 4-28 demonstrates why the law of large numbers sometimes requires large sample sizes. The requirements of ε = 0.01 and α = 0.05 to give a probability of 0.95 of the departure |p̂ - p| being less than 0.01 seem reasonable; however, the resulting sample size is very large. In order to resolve problems of this nature we must know the distribution of the random variables involved (p̂ in this case). The next three chapters will consider in detail a number of the more frequently encountered distributions.

4-15 SUMMARY
This chapter has presented a number of topics related to jointly distributed random variables and functions of jointly distributed variables. The examples presented illustrated these topics, and the exercises that follow will allow the student to reinforce these concepts.
A great many situations encountered in engineering, science, and management involve situations where several related random variables simultaneously bear on the response being observed. The approach presented in this chapter provides the structure for dealing with several aspects of such problems.

4-16 EXERCISES

4-1. A refrigerator manufacturer subjects his finished products to a final inspection. Of interest are two categories of defects: scratches or flaws in the porcelain finish, and mechanical defects. The number of each type of defect is a random variable. The results of inspecting 50 refrigerators are shown in the following table, where X represents the occurrence of finish defects and Y represents the occurrence of mechanical defects.

X\Y |   0      1      2      3      4      5
 0  | 11/50   4/50   2/50   1/50   1/50   1/50
 1  |  6/50   5/50   2/50   1/50   1/50
 2  |  4/50   3/50   2/50   1/50
 3  |  3/50   1/50   1/50

(a) Find the marginal distributions of X and Y.
(b) Find the probability distribution of mechanical defects given that there are no finish defects.
(c) Find the probability distribution of finish defects given that there are no mechanical defects.

4-2. An inventory manager has accumulated records of demand for her company's product over the last 100 days. The random variable X represents the number of orders received per day and the random variable Y represents the number of units per order. Her data are shown in a table of joint frequencies in hundredths. [Table entries not recoverable in this copy.]
(a) Find the marginal distributions of X and Y.
(b) Find all conditional distributions for Y given X.

4-3. Let X1 and X2 be the scores on a general intelligence test and an occupational preference test, respectively. The probability density function of the random variables [X1, X2] is given by

f(x1, x2) = k,  0 ≤ x1 ≤ 100, 0 ≤ x2 ≤ 10,
          = 0,  otherwise.

(a) Find the appropriate value of k.
(b) Find the marginal densities of X1 and X2.
(c) Find an expression for the cumulative distribution function F(x1, x2).

4-4. Consider a situation in which the surface tension and acidity of a chemical product are measured. These variables are coded such that surface tension is measured on a scale 0 ≤ x1 ≤ 2, and acidity is measured on a scale 2 ≤ x2 ≤ 4. The probability density function of [X1, X2] is

f(x1, x2) = k(6 - x1 - x2),  0 ≤ x1 ≤ 2, 2 ≤ x2 ≤ 4,
          = 0,               otherwise.

(a) Find the appropriate value of k.
(b) Calculate the probability that X1 < 1, X2 < 3.
(c) Calculate the probability that X1 + X2 ≤ 4.
(d) Find the probability that X1 < 1.5.
(e) Find the marginal densities of both X1 and X2.

4-5. Consider the density function

f(w, x, y, z) = 16wxyz,  0 ≤ w, x, y, z ≤ 1,
             = 0,        otherwise.

(a) Compute the probability that W ≤ 1/2 and Y ≤ 1/4.
(b) Compute the probability that X ≤ 1/2 and Z ≤ 1/4.
(c) Find the marginal density of Z.

4-6. Suppose the joint density of [X, Y] is

f(x, y) = (1/8)(6 - x - y),  0 ≤ x ≤ 2, 2 ≤ y ≤ 4,
        = 0,                 otherwise.

Find the conditional densities f_{X|y}(x) and f_{Y|x}(y).

4-7. For the data in Exercise 4-2, find the expected number of units per order, given that there are three orders per day.

4-8. Consider the probability distribution of the discrete random vector [X1, X2], where X1 represents the number of orders for aspirin in August at the neighborhood drugstore and X2 represents the number of orders in September. The joint distribution is given in a table over the values 51, 52, 53, 54, 55 for each variable. [Table entries not recoverable in this copy.]
(a) Find the marginal distributions.
(b) Find the expected sales in September, given that sales in August were either 51, 52, 53, 54, or 55.

4-9. Assume that X1 and X2 are coded scores on two intelligence tests, and the probability density function of [X1, X2] is given by

f(x1, x2) = (2/3)(x1 + 2x2),  0 ≤ x1 ≤ 1, 0 ≤ x2 ≤ 1,
          = 0,                otherwise.

Find the expected value of the score on test No. 2 given the score on test No. 1. Also, find the expected value of the score on test No. 1 given the score on test No. 2.

4-10. Let

f(x1, x2) = 4x1x2 e^{-(x1² + x2²)},  x1 ≥ 0, x2 ≥ 0,
          = 0,                       otherwise.

(a) Find the marginal distributions of X1 and X2.
(b) Find the conditional probability distributions of X1 and X2.
(c) Find expressions for the conditional expectations of X1 and X2.

4-11. Assume that [X, Y] is a continuous random vector and that X and Y are independent, such that f(x, y) = g(x)h(y). Define a new random variable Z = XY. Show that the probability density function of Z, t_Z(z), is given by

t_Z(z) = ∫_{-∞}^{∞} g(t) h(z/t) (1/|t|) dt.

Hint: Let Z = XY and T = X, and find the Jacobian for the transformation to the joint probability density function of Z and T, say r(z, t). Then integrate r(z, t) with respect to t.

4-12. Use the result of the previous problem to find the probability density function of the area of a rectangle A = S1S2, where the sides are of random length. Specifically, the sides are independent random variables such that

g_{S1}(s1) = 2s1,  0 ≤ s1 ≤ 1,
           = 0,    otherwise,

and

h_{S2}(s2) = s2/8,  0 ≤ s2 ≤ 4,
           = 0,     otherwise.

Some care must be taken in determining the limits of integration, because the variable of integration cannot assume negative values.

4-13. Assume that [X, Y] is a continuous random vector and that X and Y are independent, such that f(x, y) = g(x)h(y). Define a new random variable Z = X/Y. Show that the probability density function of Z, t_Z(z), is given by

t_Z(z) = ∫_{-∞}^{∞} g(uz) h(u) |u| du.

Hint: Let Z = X/Y and U = Y, and find the Jacobian for the transformation to the joint probability density function of Z and U, say r(z, u). Then integrate r(z, u) with respect to u.

4-14. Suppose we have a simple electrical circuit in which Ohm's law V = IR holds. We wish to find the probability distribution of resistance, given that the probability distributions of voltage (V) and current (I) are known to be

g_V(v) = e^{-v},  v ≥ 0,
       = 0,       otherwise,

and

h_I(i) = ...,  i ≥ 0,
       = 0,    otherwise.

Use the results of the previous problem, and assume that V and I are independent random variables.

4-15. Demand for a certain product is a random variable having a mean of 20 units per day and a variance of 9. We define the lead time to be the time that elapses between the placement of an order and its arrival. The lead time for the product is fixed at 4 days. Find the expected value and the variance of lead-time demand, assuming demands to be independently distributed.

4-16. Prove the discrete case of Theorem 4-2.

4-17. Let X1 and X2 be random variables such that X2 = A + BX1. Show that ρ² = 1, and that ρ = -1 if B < 0 while ρ = +1 if B > 0.

4-18. Let X1 and X2 be random variables such that X2 = A + BX1. Show that the moment-generating function for X2 is

M_{X2}(t) = e^{At} M_{X1}(Bt).

4-19. Let X1 and X2 be distributed according to

f(x1, x2) = 2,  0 ≤ x1 ≤ x2 ≤ 1,
          = 0,  otherwise.

Find the correlation coefficient between X1 and X2.

4-20. Let X1 and X2 be random variables with correlation coefficient ρ_{X1,X2}. Suppose we define two new random variables U = A + BX1 and V = C + DX2, where A, B, C, and D are constants. Show that ρ_{U,V} = (BD/|BD|)ρ_{X1,X2}.

4-21. Consider the data shown in Exercise 4-1. Are X and Y independent? Calculate the correlation coefficient.

4-22. A couple wishes to sell their house. The minimum price that they are willing to accept is a random variable, say X, where s1 ≤ X ≤ s2. A population of buyers is interested in the house. Let Y, where p1 ≤ Y ≤ p2, denote the maximum price they are willing to pay; Y is also a random variable. Assume that the joint distribution of [X, Y] is f(x, y).
(a) Under what circumstances will a sale take place?
(b) Write an expression for the probability of a sale taking place.
(c) Write an expression for the expected price of the transaction.

4-23. Let [X, Y] be uniformly distributed over a semicircle of unit radius, so that f(x, y) = 2/π if (x, y) is in the semicircle.
(a) Find the marginal distributions of X and Y.
(b) Find the conditional probability distributions.
(c) Find the conditional expectations.

4-24. Let X and Y be independent random variables. Prove that E(X|y) = E(X) and that E(Y|x) = E(Y).

4-25. Show that, in the discrete case,

E[E(X|Y)] = E(X),
E[E(Y|X)] = E(Y).

4-26. Consider the two independent random variables S and D, whose probability densities are

f_S(s) = 1/30,  10 ≤ s ≤ 40,
       = 0,     otherwise,

and

g_D(d) = 1/20,  10 ≤ d ≤ 30,
       = 0,     otherwise.

Find the probability distribution of the new random variable W = S + D.

4-27. If

f(x, y) = x + y,  0 ≤ x ≤ 1, 0 ≤ y ≤ 1,
        = 0,      otherwise,

find the following:
(a) E(X|y).
(b) E(XY).
(c) E(Y).

4-28. For the bivariate distribution

f(x, y) = k(1 + x + y) / [(1 + x)²(1 + y)²],  0 ≤ x < ∞, 0 ≤ y < ∞,
        = 0,                                  otherwise,

(a) evaluate the constant k;
(b) find the marginal distribution of X.

4-29. For the bivariate distribution

f(x, y) = k / (1 + x + y)^n,  x ≥ 0, y ≥ 0, n > 2,
        = 0,                  otherwise,

(a) evaluate the constant k;
(b) find F(x, y).

4-30. The manager of a small bank wishes to determine the proportion of the time a particular teller is busy. He decides to observe the teller at n randomly spaced intervals. The estimator of the degree of gainful employment is to be Y/n, where

Xi = 0, if on the ith observation the teller is idle,
   = 1, if on the ith observation the teller is busy,

and Y = X1 + X2 + ··· + Xn. It is desired to estimate p = P(Xi = 1) so that the error of the estimate does not exceed 0.05 with probability 0.95. Determine the necessary value of n.

4-31. Given the following joint distributions, determine whether X and Y are independent.
(a) g(x, y) = 4xy e^{-(x² + y²)},  x ≥ 0, y ≥ 0.
(b) f(x, y) = 3x²y^{-3},  0 ≤ x ≤ y ≤ 1.
(c) f(x, y) = 6(1 + x + y)^{-4},  x ≥ 0, y ≥ 0.

4-32. Let f(x, y, z) = h(x)h(y)h(z), x ≥ 0, y ≥ 0, z ≥ 0. Determine the probability that a point drawn at random will have a coordinate (x, y, z) that does not satisfy either x > y > z or x < y < z.

4-33. Suppose that X and Y are random variables denoting the fraction of a day at which a request for merchandise occurs and the receipt of a shipment occurs, respectively. The joint probability density function is

f(x, y) = 1,  0 ≤ x ≤ 1, 0 ≤ y ≤ 1,
        = 0,  otherwise.

(a) What is the probability that both the request for merchandise and the receipt of an order occur during the first half of the day?
(b) What is the probability that a request for merchandise occurs after its receipt? Before its receipt?

4-34. Suppose that in Problem 4-33 the merchandise is highly perishable and must be requested during the 1/2-day interval after it arrives. What is the probability that merchandise will not spoil?

4-35. Let X be a continuous random variable with probability density function f(x). Find a general expression for the probability density function of the new random variable Z, where
(a) Z = a + bX.
(b) Z = 1/X.
(c) Z = ln X.
(d) Z = e^X.

Chapter 5
Some Important Discrete Distributions

5-1 INTRODUCTION
In this chapter we present several discrete probability distributions, developing their analytical form from certain basic assumptions about real-world phenomena. We also present some examples of their application. The distributions presented have found extensive application in engineering, operations research, and management science. Four of the distributions, the binomial, the geometric, the Pascal, and the negative binomial, stem from a random process made up of sequential Bernoulli trials. The hypergeometric distribution, the multinomial distribution, and the Poisson distribution will also be presented in this chapter.
When we are dealing with one random variable and no ambiguity is introduced, the symbol for the random variable will once again be omitted in the specification of the probability distributions and cumulative distribution function; thus, p_X(x) = p(x) and F_X(x) = F(x). This practice will be continued throughout the text.

5-2 BERNOULLI TRIALS AND THE BERNOULLI DISTRIBUTION

There are many problems in which the experiment consists of n trials or subexperiments. Here we are concerned with an individual trial that has as its two possible outcomes success, S, or failure, F. For each trial we thus have the following:

Ej: Perform an experiment (the jth) and observe the outcome.
Sj: {S, F}.

For convenience, we will define a random variable Xj = 1 if Ej results in S and Xj = 0 if Ej results in F (see Fig. 5-1).
The n Bernoulli trials E1, E2, ..., En are called a Bernoulli process if the trials are independent, each trial has only two possible outcomes, say S or F, and the probability of success remains constant from trial to trial. That is,

p(xj) = p,          xj = 1, j = 1, 2, ..., n,
      = 1 - p = q,  xj = 0, j = 1, 2, ..., n,
      = 0,          otherwise.    (5-1)

[Figure 5-1 A Bernoulli trial.]

For one trial, the distribution given in equation 5-1 and Fig. 5-2 is called the Bernoulli distribution.
The mean and variance are

E(X) = (0 · q) + (1 · p) = p

and

V(X) = [(0² · q) + (1² · p)] - p² = p(1 - p) = pq.    (5-2)

The moment-generating function may be shown to be

M_X(t) = q + pe^t.    (5-3)

Example 5-1
Suppose we consider a manufacturing process in which a small steel part is produced by an automatic machine. Furthermore, each part in a production run of 1000 parts may be classified as defective or good when inspected. We can think of the production of a part as a single trial that results in success (say a defective) or failure (a good item). If we have reason to believe that the machine is just as likely to produce a defective on one run as on another, and if the production of a defective on one run is neither more nor less likely because of the results on the previous runs, then it would be quite reasonable to assume that the production run is a Bernoulli process with 1000 trials. The probability, p, of a defective being produced on one trial is called the process average fraction defective.

Note that in the preceding example the assumption of a Bernoulli process is a mathematical idealization of the actual real-world situation. Effects of tool wear, machine adjustment, and instrumentation difficulties were ignored. The real world was approximated by a model that did not consider all factors, but nevertheless, the approximation is good enough for useful results to be obtained.

[Figure 5-2 The Bernoulli distribution, with mass q at x = 0 and p at x = 1.]

We are going to be primarily concerned with a series of Bernoulli trials. In this case the experiment E is denoted {(E1, E2, ..., En): Ej are independent Bernoulli trials, j = 1, 2, ..., n}. The sample space is

S = {(x1, ..., xn): xi = S or F, i = 1, ..., n}.

Example 5-2
Suppose an experiment consists of three Bernoulli trials and the probability of success is p on each trial (see Fig. 5-3). The random variable X is given by X = Σ_{j=1}^{3} Xj. The distribution of X can be determined as follows:

x    p(x)
0    P{FFF} = q · q · q = q³
1    P{FFS} + P{FSF} + P{SFF} = 3pq²
2    P{FSS} + P{SFS} + P{SSF} = 3p²q
3    P{SSS} = p³

5-3 THE BINOMIAL DISTRIBUTION

The random variable X that denotes the number of successes in n Bernoulli trials has a binomial distribution given by p(x), where

p(x) = (n choose x) p^x (1 - p)^{n-x},  x = 0, 1, 2, ..., n,
     = 0,                               otherwise.    (5-4)

Example 5-2 illustrates a binomial distribution with n = 3. The parameters of the binomial distribution are n and p, where n is a positive integer and 0 ≤ p ≤ 1. A simple derivation is outlined below. Let

p(x) = P["x successes in n trials"].

[Figure 5-3 Three Bernoulli trials.]

The probability of the particular outcome in S with Ss for the first x trials and Fs for the last n - x trials is

P[SS···S FF···F] = p^x q^{n-x}

(where q = 1 - p), due to the independence of the trials. There are

(n choose x) = n! / [x!(n - x)!]

outcomes having exactly x Ss and (n - x) Fs; therefore,

p(x) = (n choose x) p^x q^{n-x},  x = 0, 1, 2, ..., n,
     = 0,                         otherwise.

Since q = 1 - p, this last expression is the binomial distribution.

5-3.1 Mean and Variance of the Binomial Distribution

The mean of the binomial distribution may be determined as

E(X) = Σ_{x=0}^{n} x · n!/[x!(n - x)!] · p^x q^{n-x}
     = np Σ_{x=1}^{n} (n - 1)!/[(x - 1)!(n - x)!] · p^{x-1} q^{n-x},

and letting y = x - 1,

E(X) = np Σ_{y=0}^{n-1} (n - 1)!/[y!(n - 1 - y)!] · p^y q^{n-1-y} = np,

so that

E(X) = np.    (5-5)

Using a similar approach, we can find

E[X(X - 1)] = Σ_{x=0}^{n} x(x - 1) · n!/[x!(n - x)!] · p^x q^{n-x}
            = n(n - 1)p² Σ_{x=2}^{n} (n - 2)!/[(x - 2)!(n - x)!] · p^{x-2} q^{n-x}
            = n(n - 1)p² Σ_{y=0}^{n-2} (n - 2)!/[y!(n - y - 2)!] · p^y q^{n-y-2}
            = n(n - 1)p²,

so that

V(X) = E(X²) - [E(X)]²
     = E[X(X - 1)] + E(X) - [E(X)]²
     = n(n - 1)p² + np - (np)²
     = npq.    (5-6)


An easier approach to find the mean and variance is to consider X as a sum of n independent Bernoulli random variables, each with mean p and variance pq, so that X = X1 + X2 + ··· + Xn. Then

E(X) = p + p + ··· + p = np

and

V(X) = pq + pq + ··· + pq = npq.

The moment-generating function for the binomial distribution is

M_X(t) = (pe^t + q)^n.    (5-7)

Example 5-3
A production process represented schematically by Fig. 5-4 produces thousands of parts per day. On the average, 1% of the parts are defective, and this average does not vary with time. Every hour, a random sample of 100 parts is selected from a conveyor and several characteristics are observed and measured on each part; however, the inspector classifies the part as either good or defective. If we consider the sampling as n = 100 Bernoulli trials with p = 0.01, the total number of defectives in the sample, X, would have a binomial distribution

p(x) = (100 choose x)(0.01)^x (0.99)^{100-x},  x = 0, 1, 2, ..., 100,
     = 0,                                      otherwise.

Suppose the inspector has instructions to stop the process if the sample has more than two defectives. Then, P(X > 2) = 1 - P(X ≤ 2), and we may calculate

P(X ≤ 2) = Σ_{x=0}^{2} (100 choose x)(0.01)^x (0.99)^{100-x}
         = (0.99)^{100} + 100(0.01)(0.99)^{99} + 4950(0.01)²(0.99)^{98}
         ≈ 0.92.

Thus, the probability of the inspector stopping the process is approximately 1 - 0.92 = 0.08. The mean number of defectives that would be found is E(X) = np = 100(0.01) = 1, and the variance is V(X) = npq = 0.99.

5-3.2 The Cumulative Binomial Distribution

The cumulative binomial distribution, or the distribution function, F, is

F(x) = Σ_{k=0}^{⌊x⌋} (n choose k) p^k q^{n-k}.    (5-8)

[Figure 5-4 A sampling situation with attribute measurement: conveyor to warehouse, with hourly samples of n = 100.]
The function is readily calculated by such packages as Excel and Minitab. For example, suppose that n = 10, p = 0.6, and we are interested in calculating F(6) = P(X ≤ 6). Then the Excel function call BINOMDIST(6,10,0.6,TRUE) gives the result F(6) = 0.6177. In Minitab, all we need do is go to Calc/Probability Distributions/Binomial, and click on "Cumulative probability" to obtain the same result.

5-3.3 An Application of the Binomial Distribution

Another random variable, first noted in the law of large numbers, is frequently of interest. It is the proportion of successes and is denoted by

p̂ = X/n,    (5-9)

where X has a binomial distribution with parameters n and p. The mean, variance, and moment-generating function are

E(p̂) = (1/n)E(X) = (1/n)np = p,    (5-10)

V(p̂) = (1/n)² V(X) = (1/n)² npq = pq/n,    (5-11)

M_{p̂}(t) = M_X(t/n) = (pe^{t/n} + q)^n.    (5-12)

In order to evaluate, say, P(p̂ ≤ p0), where p0 is some number between 0 and 1, we note that P(p̂ ≤ p0) = P(X ≤ np0). Since np0 is possibly not an integer,

P(p̂ ≤ p0) = P(X ≤ np0) = Σ_{k=0}^{⌊np0⌋} (n choose k) p^k q^{n-k},    (5-13)

where ⌊ ⌋ indicates the "greatest integer contained in" function.

Example 5-4
From a flow of product on a conveyor belt between production operations J and J + 1, a random sample of 200 units is taken every 2 hours (see Fig. 5-5). Past experience has indicated that if the unit is not properly degreased, the painting operation will not be successful, and, furthermore, on the average 5% of the units are not properly degreased. The manufacturing manager has grown accustomed to accepting the 5%, but he strongly feels that 6% is bad performance and 7% is totally unacceptable. He decides to plot the fraction defective in the samples, that is, p̂. If the process average stays at 5%, he would know that E(p̂) = 0.05. Knowing enough about probability to understand that p̂ will vary, he asks the quality-control department to determine P(p̂ > 0.07 | p = 0.05). This is done as follows:

[Figure 5-5 Sequential production operations: operation J (degreasing), operation J + 1 (painting), operation J + 2 (packaging), with n = 200 sampled every two hours.]

P(p̂ > 0.07 | p = 0.05) = 1 - P(p̂ ≤ 0.07 | p = 0.05)
                        = 1 - P(X ≤ 200(0.07) | p = 0.05)
                        = 1 - Σ_{k=0}^{14} (200 choose k)(0.05)^k (0.95)^{200-k}
                        = 1 - 0.922 = 0.078.
:~!,!pi~~1~
An indusmal engineer is concerned about the excessive "avoidable delay" time that one machine
operator seems to have. The engineer considers two acti.-ities as '"avoidable delay time" and "not
avoidable delay time." She identifies: a t.i.n1e-dependent variable as follows:
X(t) = 1,

avoidable delay,
otherwise.

=0,

A pMticular realization of X{t) for 2 days (%0 minutes) is shown in Fig, 5~6,
Rather thanbave a time study technician contln'.lously analyze this operation, the engineer electS
to USe '"work sa.'llpling," randomly selects rt points ou the 960~minute span., and estimates the fraction
of time the '"avoidable delay" category rusts, She lets XI::: 1 if XCr) 1 at the time of the ith observation and X" =:; 0 if X(:) =:; 0 at the time of the ith observation. The statistic

Ix;

P=~

"

is to be evaluated, Of course, Pis anmdom variable having a mean equal to p. variance equal to pqlrt.
and a standard de.iation equal to ~. The procedure oudined is not necessarily the best way to
go about such a study, but it does illusrrate one utilization of the rmdom variable P.

In summary, analysts must be sure that the phenomenon they are studying may be reasonably considered to be a series of Bernoulli trials in order to use the binomial distribution to describe X, the number of successes in n trials. It is often useful to visualize the graphical presentation of the binomial distribution, as shown in Fig. 5-7. The values p(x) increase to a point and then decrease. More precisely, p(x) > p(x - 1) for x < (n + 1)p, and p(x) < p(x - 1) for x > (n + 1)p. If (n + 1)p is an integer, say m, then p(m) = p(m - 1).

5-4 THE GEOMETRIC DISTRIBUTION

The geometric distribution is also related to a sequence of Bernoulli trials, except that the number of trials is not fixed, and, in fact, the random variable of interest, denoted X, is defined to be the number of trials required to achieve the first success. The sample space

[Figure 5-6 A realization of X(t), Example 5-5, over 0 to 960 minutes.]

[Figure 5-7 The binomial distribution.]

and range space for X are illustrated in Fig. 5-8. The range space for X is R_X = {1, 2, 3, ...}, and the distribution of X is given by

p(x) = q^{x-1} p,  x = 1, 2, ...,
     = 0,          otherwise.    (5-14)

It is easy to verify that this is a probability distribution, since

Σ_{x=1}^{∞} pq^{x-1} = p Σ_{k=0}^{∞} q^k = p[1/(1 - q)] = 1

and

p(x) ≥ 0 for all x.

5-4.1 Mean and Variance of the Geometric Distribution

The mean and variance of the geometric distribution are easily found as follows:

E(X) = Σ_{x=1}^{∞} x p q^{x-1} = p Σ_{x=1}^{∞} (d/dq) q^x,

or

μ = p (d/dq)[q/(1 - q)] = p · 1/(1 - q)² = 1/p.    (5-15)

[Figure 5-8 Sample space and range space for X.]

or, after some algebra,

V(X) = σ² = q/p².    (5-16)

The moment-generating function is

M_X(t) = pe^t / (1 - qe^t).    (5-17)

Example 5-6
A certain experiment is to be performed until a successful result is obtained. The trials are independent and the cost of performing the experiment is $25,000; however, if a failure results, it costs $5000 to "set up" for the next trial. The experimenter would like to determine the expected cost of the project. If X is the number of trials required to obtain a successful experiment, then the cost function would be

C(X) = $25,000X + $5000(X - 1) = 30,000X - 5000.

Then

E[C(X)] = $30,000 E(X) - $5000 = $30,000(1/p) - $5000.

If the probability of success on a single trial is, say, 0.25, then E[C(X)] = $30,000/0.25 - $5000 = $115,000. This may or may not be acceptable to the experimenter. It should also be recognized that it is possible to continue indefinitely without having a successful experiment. Suppose that the experimenter has a maximum of $500,000. He may wish to find the probability that the experimental work would cost more than this amount, that is,

P(C(X) > $500,000) = P($30,000X - $5000 > $500,000)
                   = P(X > 505,000/30,000)
                   = P(X > 16.833)
                   = 1 - P(X ≤ 16)
                   = 1 - Σ_{x=1}^{16} 0.25(0.75)^{x-1}
                   ≈ 0.01.

The experimenter may not be at all willing to run the risk (probability 0.01) of spending the available $500,000 without getting a successful run.
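The tail probability at the end of Example 5-6 is a geometric survivor function, P(X > 16) = (0.75)^16; a sketch, assuming SciPy:

from scipy.stats import geom

p = 0.25
print(geom.sf(16, p))       # P(X > 16) = 0.75**16 ~ 0.0100
print(30_000 / p - 5_000)   # expected cost E[C(X)] = $115,000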

The geometric distribution decreases; that is, p(x) < p(x - 1) for x = 2, 3, .... This is shown graphically in Fig. 5-9.
An interesting and useful property of the geometric distribution is that it has no memory; that is,

P(X > x + s | X > s) = P(X > x).    (5-18)

The geometric distribution is the only discrete distribution having this memoryless property.

Figure 5-9 The geometric distribution.

Example 5-7
Let X denote the number of tosses of a fair die until we observe a 6. Suppose we have already tossed the die five times without seeing a 6. The probability that more than two additional tosses will be required is

P(X > 7 | X > 5) = P(X > 2)
                 = 1 − P(X ≤ 2)
                 = 1 − Σ_{x=1}^2 p(x)
                 = 1 − (1/6)(1 + 5/6)
                 = 25/36.

5-5 THE PASCAL DISTRIBUTION


The Pascal distribution also has its basis in Bernoulli trials. It is a logical extension of the geometric distribution. In this case, the random variable X denotes the trial on which the rth success occurs, where r is an integer. The probability mass function of X is

p(x) = C(x − 1, r − 1) p^r q^(x−r),   x = r, r + 1, r + 2, ...,
     = 0,                             otherwise,                     (5-19)

where C(a, b) denotes the binomial coefficient "a choose b."

The term p^r q^(x−r) arises from the probability associated with exactly one outcome that has (x − r) F's (failures) and r S's (successes). In order for this outcome to occur, there must be r − 1 successes in the x − 1 repetitions before the last outcome, which is always a success. There are thus C(x − 1, r − 1) arrangements satisfying this condition, and therefore the distribution is as shown in equation 5-19.

The development thus far has been for integer values of r. If we have arbitrary r > 0 and 0 < p < 1, the distribution of equation 5-19 is known as the negative binomial distribution.

5-5.1 Mean and Variance of the Pascal Distribution


If X has a Pascal distribution, as illustrated in Fig. 5-10, the mean, variance, and moment-generating function are

μ = r/p,                                                             (5-20)


Figure 5-10 An example of the Pascal distribution.

σ² = rq/p²,                                                          (5-21)

and

M_X(t) = [pe^t/(1 − qe^t)]^r.                                        (5-22)

Example 5-8
The president of a large corporation makes decisions by throwing darts at a board. The center section is marked "yes" and represents a success. The probability of his hitting a "yes" is 0.6, and this probability remains constant from throw to throw. The president continues to throw until he has three "hits." We denote X as the number of the trial on which he experiences the third hit. The mean is 3/0.6 = 5, meaning that on the average it will take five throws. The president's decision rule is simple. If he gets three hits on or before the fifth throw, he decides in favor of the question. The probability that he will decide in favor is therefore

P(X ≤ 5) = p(3) + p(4) + p(5)
         = C(2, 2)(0.6)³(0.4)⁰ + C(3, 2)(0.6)³(0.4)¹ + C(4, 2)(0.6)³(0.4)²
         = 0.6826.
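The same probability can be computed with SciPy (an assumption of this sketch; SciPy is not used in the text). SciPy's nbinom counts the number of failures before the rth success, so the trial number X corresponds to X − r failures:

# P(X <= 5) for the Pascal variable with r = 3, p = 0.6, assuming scipy.
from scipy.stats import nbinom

r, p = 3, 0.6
prob = nbinom.cdf(5 - r, r, p)   # k = X - r failures before the 3rd success
print(round(prob, 4))            # 0.6826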

5-6 THE MULTINOMIAL DISTRIBUTION


An important and useful higher-dimensional random variable has a distribution known as the multinomial distribution. Assume an experiment E with sample space S is partitioned into k mutually exclusive events, say B₁, B₂, ..., B_k. We consider n independent repetitions of E and let pᵢ = P(Bᵢ) be constant from trial to trial, for i = 1, 2, ..., k. If k = 2, we have Bernoulli trials, as described earlier. The random vector [X₁, X₂, ..., X_k] has the following distribution, where Xᵢ is the number of times Bᵢ occurs in the n repetitions of E, i = 1, 2, ..., k:

p(x₁, x₂, ..., x_k) = [n!/(x₁!x₂!⋯x_k!)] p₁^(x₁) p₂^(x₂) ⋯ p_k^(x_k)      (5-23)

for xᵢ = 0, 1, 2, ..., n, i = 1, 2, ..., k, and where

Σ_{i=1}^k xᵢ = n.

It should be noted that X₁, X₂, ..., X_k are not independent random variables, since Σ_{i=1}^k Xᵢ = n for any n repetitions. It turns out that the mean and variance of Xᵢ, a particular component, are


E(Xᵢ) = npᵢ                                                          (5-24)

and

V(Xᵢ) = npᵢ(1 − pᵢ).                                                 (5-25)

Example 5-9
Mechanical pencils are manufactured by a process involving a large amount of labor in the assembly operations. This is highly repetitive work and incentive pay is involved. Final inspection has revealed that 85% of the product is good, 10% is defective but may be reworked, and 5% is defective and must be scrapped. These percentages remain constant over time. A random sample of 20 items is selected, and if we let

X₁ = number of good items,
X₂ = number of defective but reworkable items,
X₃ = number of items to be scrapped,

then

p(x₁, x₂, x₃) = [20!/(x₁!x₂!x₃!)] (0.85)^(x₁) (0.10)^(x₂) (0.05)^(x₃).

Suppose we want to evaluate the probability function for x₁ = 18, x₂ = 2, and x₃ = 0 (we must have x₁ + x₂ + x₃ = 20); then

p(18, 2, 0) = [20!/(18!2!0!)] (0.85)^18 (0.10)² (0.05)⁰ ≈ 0.102.
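The multinomial pmf of equation 5-23 is also available in software. A sketch assuming SciPy (scipy.stats.multinomial is not part of the text):

# Evaluating the multinomial probability p(18, 2, 0), assuming scipy.
from scipy.stats import multinomial

prob = multinomial.pmf([18, 2, 0], n=20, p=[0.85, 0.10, 0.05])
print(round(prob, 3))   # 0.102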

5-7 THE HYPERGEOMETRIC DISTRIBUTION


In an earlier section an example presented the hypergeometric distribution. We will now formally develop this distribution and further illustrate its application. Suppose there is some finite population with N items. Some number D (D ≤ N) of the items fall into a class of interest. The particular class will, of course, depend on the situation under consideration. It might be defectives (vs. nondefectives) in the case of a production lot, or persons with blue eyes (vs. not blue eyed) in a classroom with N students. A random sample of size n is selected without replacement, and the random variable of interest, X, is the number of items in the sample that belong to the class of interest. The distribution of X is

p(x) = C(D, x)·C(N − D, n − x)/C(N, n),   x = 0, 1, 2, ..., min(n, D),
     = 0,                                 otherwise.                 (5-26)

The hypergeometric probability mass function is available in many popular software packages. For instance, suppose that N = 20, D = 8, n = 4, and x = 1. Then the Excel function

HYPGEOMDIST(x, n, D, N) = HYPGEOMDIST(1, 4, 8, 20) = 0.3633.
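The same value is returned by SciPy's hypergeometric distribution (a sketch assuming SciPy; note that the argument order differs from Excel's):

# HYPGEOMDIST(1, 4, 8, 20) recomputed with scipy's hypergeom(M=N, n=D, N=n).
from scipy.stats import hypergeom

print(round(hypergeom.pmf(1, 20, 8, 4), 4))   # 0.3633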


5-7.1 Mean and Variance of the Hypergeometric Distribution


The mean and variance of the hypergeometric distribution are

E(X) = n(D/N)                                                        (5-27)

and

V(X) = n(D/N)(1 − D/N)[(N − n)/(N − 1)].                             (5-28)

Example 5-10
In a receiving inspection department, lots of a pump shaft are periodically received. The lots contain 100 units and the following acceptance sampling plan is used. A random sample of 10 units is selected without replacement. The lot is accepted if the sample has no more than one defective. Suppose a lot is received that is p′(100) percent defective. What is the probability that it will be accepted?

P(accept lot) = P(X ≤ 1) = Σ_{x=0}^1 C(100p′, x)·C(100[1 − p′], 10 − x)/C(100, 10)
              = [C(100p′, 0)·C(100[1 − p′], 10) + C(100p′, 1)·C(100[1 − p′], 9)]/C(100, 10).

Obviously the probability of accepting the lot is a function of the lot quality, p′. If p′ = 0.05, then

P(accept lot) = 0.923.
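A sketch of this calculation in Python (assuming SciPy) also makes it easy to trace out the acceptance probability as a function of the lot quality p′:

# P(accept lot) = P(X <= 1) for the n = 10, N = 100 sampling plan, assuming scipy.
from scipy.stats import hypergeom

def p_accept(p_prime, N=100, n=10):
    D = round(N * p_prime)             # number of defectives in the lot
    return hypergeom.cdf(1, N, D, n)   # P(X <= 1)

print(round(p_accept(0.05), 3))        # 0.923
for pp in (0.01, 0.05, 0.10, 0.20):
    print(pp, round(p_accept(pp), 3))  # acceptance probability falls as quality worsens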

5-8 THE POISSON DISTRIBUTION


One of the most useful discrete distributions is the Poisson distribution. The Poisson distribution may be developed in two ways, and both are instructive insofar as they indicate the circumstances where this random variable may be expected to apply in practice. The first development involves the definition of a Poisson process. The second development shows the Poisson distribution to be a limiting form of the binomial distribution.

5-8.1 Development from a Poisson Process


In. defining the Poisson process, we initially consider a collection of arbitrary, time-oriented
occurrences, often called arrivals" or "births" (see Fig. 5-11), The random variable of
interest, say X,. is the number of arrivals that occur on the interval (0, t). The range space
Rx, = {O, 1, 2, ... ). In developing the distribution of X, it is neeess3I)' to make some assumptions, the plausibility of which is supported by considerable empirical evidence,

Figure 5-11 The time axis.

The first assumption is that the numbers of arrivals during nonoverlapping time intervals are independent random variables. Second, we make the assumption that there exists a positive quantity λ such that for any small time interval Δt, the following postulates are satisfied.

1. The probability that exactly one arrival will occur in an interval of width Δt is approximately λ·Δt. The approximation is in the sense that the probability is (λ·Δt) + o₁(Δt), where the function o₁(Δt)/Δt → 0 as Δt → 0.

2. The probability that exactly zero arrivals will occur in the interval is approximately 1 − (λ·Δt). Again this is in the sense that it is equal to 1 − (λ·Δt) + o₂(Δt), where o₂(Δt)/Δt → 0 as Δt → 0.

3. The probability that two or more arrivals occur in the interval is equal to a quantity o₃(Δt), where o₃(Δt)/Δt → 0 as Δt → 0.
The parameter λ is sometimes called the mean arrival rate or mean occurrence rate. In the development to follow, we let

p_x(t) = P(X_t = x),   x = 0, 1, 2, ....                             (5-29)

We fix time at t and obtain

p₀(t + Δt) = [1 − λ·Δt]·p₀(t),

so that

[p₀(t + Δt) − p₀(t)]/Δt = −λ·p₀(t),

and

lim_{Δt→0} [p₀(t + Δt) − p₀(t)]/Δt = p₀′(t) = −λ·p₀(t).              (5-30)

For x > 0,

p_x(t + Δt) = λ·Δt·p_{x−1}(t) + [1 − λ·Δt]·p_x(t),

so that

lim_{Δt→0} [p_x(t + Δt) − p_x(t)]/Δt = p_x′(t) = −λ·p_x(t) + λ·p_{x−1}(t).   (5-31)

Summarizing, we have a system of differential equations:

p₀′(t) = −λ·p₀(t)                                                    (5-32a)

and

p_x′(t) = −λ·p_x(t) + λ·p_{x−1}(t),   x = 1, 2, ....                 (5-32b)


The solution to these equations is

p_x(t) = (λt)^x e^(−λt)/x!,   x = 0, 1, 2, ....                      (5-33)

Thus, for fixed t, we let c = λt and obtain the Poisson distribution as

p(x) = c^x e^(−c)/x!,   x = 0, 1, 2, ...,
     = 0,               otherwise.                                   (5-34)

Note that this distribution was developed as a consequence of certain assumptions; thus, when the assumptions hold or approximately hold, the Poisson distribution is an appropriate model. There are many real-world phenomena for which the Poisson model is appropriate.

5-8.2 Development of the Poisson Distribution from the Binomial

To show how the Poisson distribution may also be developed as a limiting form of the binomial distribution with c = np, we return to the binomial distribution

p(x) = [n!/(x!(n − x)!)] p^x (1 − p)^(n−x),   x = 0, 1, 2, ..., n.

If we let np = c, so that p = c/n and 1 − p = 1 − c/n = (n − c)/n, and if we then replace terms involving p with the corresponding terms involving c, we obtain

p(x) = [n(n − 1)(n − 2)⋯(n − x + 1)/x!]·(c/n)^x·[(n − c)/n]^(n−x)
     = (c^x/x!)·[(1)(1 − 1/n)(1 − 2/n)⋯(1 − (x − 1)/n)]·(1 − c/n)^n·(1 − c/n)^(−x).   (5-35)

In letting n → ∞ and p → 0 in such a way that np = c remains fixed, the terms (1 − 1/n), (1 − 2/n), ..., (1 − (x − 1)/n) all approach 1, as does (1 − c/n)^(−x). Now we know that (1 − c/n)^n → e^(−c) as n → ∞. Thus, the limiting form of equation 5-35 is p(x) = (c^x/x!)·e^(−c), which is the Poisson distribution.

5-8.3 Mean and Variance of the Poisson Distribution

The mean of the Poisson distribution is c and the variance is also c, as seen below:

E(X) = Σ_{x=0}^∞ x·c^x e^(−c)/x! = c Σ_{x=1}^∞ c^(x−1) e^(−c)/(x − 1)! = c.   (5-36)

Similarly, it may be shown that E(X²) = c² + c, so that

V(X) = E(X²) − (E(X))² = c.                                          (5-37)


The moment-generating function is

M_X(t) = e^(c(e^t − 1)).                                             (5-38)

The utility of this generating function is illustrated in the proof of the following theorem.

Theorem 5-1
If X₁, X₂, ..., X_k are independently distributed random variables, each having a Poisson distribution with parameter cᵢ, i = 1, 2, ..., k, and Y = X₁ + X₂ + ⋯ + X_k, then Y has a Poisson distribution with parameter c = c₁ + c₂ + ⋯ + c_k.

Proof  The moment-generating function of Xᵢ is

M_{Xᵢ}(t) = e^(cᵢ(e^t − 1)),

and since M_Y(t) = M_{X₁}(t)·M_{X₂}(t)⋯M_{X_k}(t), then

M_Y(t) = e^((c₁ + c₂ + ⋯ + c_k)(e^t − 1)),

which is recognized as the moment-generating function of a Poisson random variable with parameter c = c₁ + c₂ + ⋯ + c_k.
This reproductive property of the Poisson distribution is highly useful. Simply put, it states that sums of independent Poisson random variables are distributed according to the Poisson distribution.

A brief tabulation of the Poisson distribution is given in Table I of the Appendix. Most statistical software packages automatically calculate Poisson probabilities.

Example 5-11
Suppose a retailer determines that the number of orders for a certain home appliance in a particular period has a Poisson distribution with parameter c. She would like to determine the stock level K for the beginning of the period so that there will be a probability of at least 0.95 of supplying all customers who order the appliance during the period. She does not wish to back-order merchandise or resupply the warehouse during the period. If X represents the number of orders, the dealer wishes to determine K such that

P(X ≤ K) ≥ 0.95,

or

P(X > K) ≤ 0.05,

so that

Σ_{x=K+1}^∞ e^(−c) c^x/x! ≤ 0.05.

The solution may be determined directly from tables of the Poisson distribution and is obviously a function of c.
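With modern software the search over Poisson tables is a one-liner. The sketch below assumes SciPy, and the value c = 10 is chosen here for illustration only; it is not from the text. poisson.ppf returns the smallest K whose CDF meets the target:

# Smallest stock level K with P(X <= K) >= 0.95 for Poisson demand, assuming scipy.
from scipy.stats import poisson

c = 10                          # illustrative mean demand
K = int(poisson.ppf(0.95, c))   # smallest K with F(K) >= 0.95
print(K, poisson.cdf(K, c))     # 15 0.9513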

Example 5-12
The attainable sensitivity for electronic amplifiers and apparatus is limited by noise or spontaneous current fluctuations. In vacuum tubes, one noise source is shot noise due to the random emission of electrons from the heated cathode. Assume that the potential difference between the anode and cathode is large enough to ensure that all electrons emitted by the cathode have high velocity, high enough to preclude space charge (accumulation of electrons between the cathode and anode). Under these conditions, and defining an arrival to be an emission of an electron from the cathode, Davenport and Root (1958) showed that the number of electrons, X, emitted from the cathode in time t has a Poisson distribution given by

p(x) = (λt)^x e^(−λt)/x!,   x = 0, 1, 2, ...,
     = 0,                   otherwise.

The parameter λ is the mean rate of emission of electrons from the cathode.

5-9 SOME APPROXIMATIONS

It is often useful to approximate one distribution using another, particularly when the approximation is easier to manipulate. The two approximations considered in this section are as follows:

1. The binomial approximation to the hypergeometric distribution.
2. The Poisson approximation to the binomial distribution.

For the hypergeometric distribution, if the sampling fraction n/N is small, say less than 0.1, then the binomial distribution with parameters p = D/N and n provides a good approximation. The smaller the ratio n/N, the better the approximation.

Example 5-13
A production lot of 200 units has eight defectives. A random sample of 10 units is selected, and we want to find the probability that the sample will contain exactly one defective. The true probability is

P(X = 1) = C(8, 1)·C(192, 9)/C(200, 10) = 0.288.

Since n/N = 10/200 = 0.05 is small, we let p = 8/200 = 0.04 and use the binomial approximation

p(1) = C(10, 1)(0.04)¹(0.96)⁹ = 0.277.
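A quick numerical check of the approximation (a sketch assuming SciPy):

# Exact hypergeometric probability versus the binomial approximation.
from scipy.stats import hypergeom, binom

exact = hypergeom.pmf(1, 200, 8, 10)      # N = 200, D = 8, n = 10
approx = binom.pmf(1, 10, 8 / 200)        # p = D/N = 0.04
print(round(exact, 3), round(approx, 3))  # 0.288 0.277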

In the case of the Poisson approximation to the binomial, we indicated earlier that for large n and small p, the approximation is satisfactory. In utilizing this approximation we let c = np. In general, p should be less than 0.1 in order to apply the approximation. The smaller p and the larger n, the better the approximation.

Example 5-14
The probability that a particular rivet in the wing surface of a new aircraft is defective is 0.001. There are 4000 rivets in the wing. What is the probability that not more than six defective rivets will be installed?

P(X ≤ 6) = Σ_{x=0}^6 C(4000, x)(0.001)^x (0.999)^(4000−x).

Using the Poisson approximation, c = 4000(0.001) = 4, and

P(X ≤ 6) ≈ Σ_{x=0}^6 e^(−4) 4^x/x! = 0.889.
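The exact binomial tail and its Poisson approximation agree to three decimal places, as the following sketch (assuming SciPy) shows:

# Exact binomial tail versus the Poisson approximation with c = np = 4.
from scipy.stats import binom, poisson

exact = binom.cdf(6, 4000, 0.001)
approx = poisson.cdf(6, 4.0)
print(round(exact, 3), round(approx, 3))   # 0.889 0.889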

5-10 GENERATION OF REALIZATIONS

Schemes exist for using random numbers, as described in Section 3-6, to generate realizations of most common random variables.

With Bernoulli trials, we might first generate a value uᵢ as the ith realization of a uniform [0, 1] random variable U, where

f(u) = 1,   0 ≤ u ≤ 1,
     = 0,   otherwise,

and independence among the sequence Uᵢ is maintained. Then if uᵢ ≤ p, we let xᵢ = 1, and if uᵢ > p, xᵢ = 0. Thus if Y = Σ_{i=1}^n Xᵢ, Y will follow a binomial distribution with parameters n and p, and this entire process might be repeated to produce a series of values of Y, that is, realizations from the binomial distribution with parameters n and p.

Similarly, we could produce geometric variates by sequentially generating values uᵢ and counting the number of trials until uᵢ ≤ p. At the point this condition is met, the trial number is assigned to the random variable X, and the entire process is repeated to produce a series of realizations of a geometric random variable.

Also, a similar scheme may be used for Pascal random variable realizations, where we proceed testing uᵢ ≤ p until this condition has been satisfied r times, at which point the trial number is assigned to X, and once again, the entire process is repeated to obtain subsequent realizations.

Realizations from a Poisson distribution with parameter λt = c may be obtained by employing a technique based on the so-called acceptance-rejection method. The approach is to sequentially generate values uᵢ as described above until the product u₁·u₂ ⋯ u_{k+1} < e^(−c) is obtained, at which point we assign X ← k, and once again, this process is repeated to obtain a sequence of realizations.

See Chapter 19 for more details on the generation of discrete random variables.
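A compact sketch of all four schemes in Python (standard library only; random.random plays the role of the uᵢ, and the function names are this sketch's, not the text's):

# Sketches of the generation schemes of Section 5-10 using uniform [0,1] values.
import math
import random

def bernoulli(p):                 # x = 1 if u <= p, else 0
    return 1 if random.random() <= p else 0

def binomial(n, p):               # sum of n Bernoulli trials
    return sum(bernoulli(p) for _ in range(n))

def geometric(p):                 # trials until the first u <= p
    trials = 1
    while random.random() > p:
        trials += 1
    return trials

def pascal(r, p):                 # trials until u <= p has occurred r times
    return sum(geometric(p) for _ in range(r))

def poisson(c):                   # multiply uniforms until the product < e^(-c)
    k, product = 0, random.random()
    while product >= math.exp(-c):
        k += 1
        product *= random.random()
    return k

print([binomial(5, 0.5) for _ in range(5)], geometric(0.4), pascal(3, 0.6), poisson(4.0))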


5-11 SUMMARY

The distributions presented in this chapter have wide use in engineering, scientific, and management applications. The selection of a specific discrete distribution will depend on how well the assumptions underlying the distribution are met by the phenomenon to be modeled. The distributions presented here were selected because of their wide applicability. A summary of these distributions is presented in Table 5-1.

5-12 EXERCISES

5-1. An experiment consists of four independent Bernoulli trials with probability of success p on each trial. The random variable X is the number of successes. Enumerate the probability distribution of X.

5-2. Six independent space missions to the moon are planned. The estimated probability of success on each mission is 0.95. What is the probability that at least five of the planned missions will be successful?

Table 5-1 Summary of Discrete Distributions

Bernoulli
  Parameters: 0 < p < 1 (q = 1 − p)
  p(x) = p^x q^(1−x), x = 0, 1; = 0 otherwise
  Mean: p    Variance: pq
  Moment-generating function: pe^t + q

Binomial
  Parameters: n = 1, 2, ...; 0 < p < 1
  p(x) = C(n, x) p^x q^(n−x), x = 0, 1, 2, ..., n; = 0 otherwise
  Mean: np    Variance: npq
  Moment-generating function: (pe^t + q)^n

Geometric
  Parameters: 0 < p < 1
  p(x) = pq^(x−1), x = 1, 2, ...; = 0 otherwise
  Mean: 1/p    Variance: q/p²
  Moment-generating function: pe^t/(1 − qe^t)

Pascal (Neg. binomial)
  Parameters: 0 < p < 1; r = 1, 2, ... (r > 0 for the negative binomial)
  p(x) = C(x − 1, r − 1) p^r q^(x−r), x = r, r + 1, r + 2, ...; = 0 otherwise
  Mean: r/p    Variance: rq/p²
  Moment-generating function: [pe^t/(1 − qe^t)]^r

Hypergeometric
  Parameters: N = 1, 2, ...; n = 1, 2, ..., N; D = 1, 2, ..., N
  p(x) = C(D, x)·C(N − D, n − x)/C(N, n), x = 0, 1, 2, ..., min(n, D); = 0 otherwise
  Mean: n(D/N)    Variance: n(D/N)(1 − D/N)[(N − n)/(N − 1)]
  Moment-generating function: see Kendall and Stuart (1963)

Poisson
  Parameters: c > 0
  p(x) = c^x e^(−c)/x!, x = 0, 1, 2, ...; = 0 otherwise
  Mean: c    Variance: c
  Moment-generating function: e^(c(e^t − 1))

5-3. The XYZ Company has planned sales presentations to a dozen important customers. The probability of receiving an order as a result of such a presentation is estimated to be 0.5. What is the probability of receiving four or more orders as the result of the meetings?

5-4. A stockbroker calls her 20 most important customers every morning. If the probability is one in three of making a transaction as the result of such a call, what are the chances of her handling 10 or more transactions?

5-5. A production process that manufactures transistors operates, on the average, at 2% fraction defective. Every 2 hours a random sample of size 50 is taken from the process. If the sample contains more than two defectives the process must be stopped. Determine the probability that the process will be stopped by the sampling scheme.

5-6. Find the mean and variance of the binomial distribution using the moment-generating function (see equation 5-7).

5-7. A production process manufacturing turn-indicator dash lights is known to produce lights that are 1% defective. Assume this value remains unchanged and assume a sample of 100 such lights is randomly selected. Find P(p̂ ≤ 0.03), where p̂ is the sample fraction defective.

5-8. Suppose a random sample of size 200 is taken from a process that is 0.07 fraction defective. What is the probability that p̂ will exceed the true fraction defective by one standard deviation? By two standard deviations? By three standard deviations?


5-9. Five cruise missiles have been built by an aerospace company. The probability of a successful firing is, on any one test, 0.95. Assuming independent firings, what is the probability that the first failure occurs on the fifth firing?

5-10. A real estate agent estimates his probability of selling a house to be 0.10. He has to see four clients today. If he is successful on the first three calls, what is the probability that his fourth call is unsuccessful?

5-11. Suppose five independent identical laboratory experiments are to be undertaken. Each experiment is extremely sensitive to environmental conditions, and there is only a probability p that it will be completed successfully. Plot, as a function of p, the probability that the fifth experiment is the first failure. Find mathematically the value of p that maximizes the probability of the fifth trial being the first unsuccessful experiment.

5-12. The XYZ Company plans to visit potential customers until a substantial sale is made. Each sales presentation costs $1000. It costs $4000 to travel to the next customer and set up a new presentation.
(a) What is the expected cost of making a sale if the probability of making a sale after any presentation is 0.10?
(b) If the expected profit from each sale is $15,000, should the trips be undertaken?
(c) If the budget for advertising is only $100,000, what is the probability that this sum will be spent without getting an order?

5-13. Find the mean and variance of the geometric distribution using the moment-generating function.

5-14. A submarine's probability of sinking an enemy ship with any one of its torpedoes is 0.8. If the firings are independent, determine the probability of a sinking within the first two firings. Within the first three.

5-15. In Atlanta the probability that a thunderstorm will occur on any day during the spring is 0.05. Assuming independence, what is the probability that the first thunderstorm occurs on April 25? Assume spring begins on March 21.

5-16. A potential customer enters an automobile dealership every hour. The probability of a salesperson concluding a transaction is 0.10. She is determined to keep working until she has sold three cars. What is the probability that she will have to work exactly 8 hours? More than 8 hours?

5-17. A personnel manager is interviewing potential employees in order to fill two jobs. The probability of an interviewee having the necessary qualifications and accepting an offer is 0.8. What is the probability that exactly four people must be interviewed? What is the probability that fewer than four people must be interviewed?

5-18. Show that the moment-generating function of the Pascal random variable is as given by equation 5-22. Use it to determine the mean and variance of the Pascal distribution.

5-19. The probability that an experiment has a successful outcome is 0.80. The experiment is to be repeated until five successful outcomes have occurred. What is the expected number of repetitions required? What is the variance?

5-20. A military commander wishes to destroy an enemy bridge. Each flight of planes he sends out has a probability of 0.8 of scoring a direct hit on the bridge. It takes four direct hits to completely destroy the bridge. If he can mount seven assaults before the bridge becomes tactically unimportant, what is the probability that the bridge will be destroyed?

5-21. Three companies, X, Y, and Z, have probabilities of obtaining an order for a particular type of merchandise of 0.4, 0.3, and 0.3, respectively. Three orders are to be awarded independently. What is the probability that one company receives all the orders?

5-22. Four companies are interviewing five college students for positions after graduation. Assuming all five receive offers from each company and assuming the probabilities of the companies hiring a new employee are equal, what is the probability that one company gets all of the new employees? None of them?

5-23. We are interested in the weight of bags of feed. Specifically, we need to know if any of the four events below has occurred:

T₁ = (X ≤ 10),           P(T₁) = 0.2,
T₂ = (10 < X ≤ 11),      P(T₂) = 0.2,
T₃ = (11 < X ≤ 11.5),    P(T₃) = 0.2,
T₄ = (11.5 < X),         P(T₄) = 0.4.

If 10 bags are selected at random, what is the probability of four being less than or equal to 10 pounds, one being greater than 10 but less than or equal to 11 pounds, and two being greater than 11.5 pounds?

5-24. In Problem 5-23, what is the probability that all 10 bags weigh more than 11.5 pounds? What is the probability that five bags weigh more than 11.5 pounds and the remaining five weigh less than 10 pounds?
5-25. A lot of 25 color television tubes is subjected to an acceptance testing procedure. The procedure consists of drawing five tubes at random, without replacement, and testing them. If two or fewer tubes fail, the remaining ones are accepted. Otherwise the lot is rejected. Assume the lot contains four defective tubes.
(a) What is the exact probability of lot acceptance?
(b) What is the probability of lot acceptance computed from the binomial distribution with p = 4/25?

5-26. Suppose that in Exercise 5-25 the lot size had been 100. Would the binomial approximation be satisfactory in this case?

5-27. A purchaser receives small lots (N = 25) of a high-precision device. She wishes to reject the lot 95% of the time if it contains as many as seven defectives. Suppose she decides that the presence of one defective in the sample is sufficient to cause rejection. How large should her sample size be?

5-28. Show that the moment-generating function of the Poisson random variable is as given by equation 5-38.

5-29. The number of automobiles passing through a particular intersection per hour is estimated to be 25. Find the probability that fewer than 10 vehicles pass through during any 1-hour interval. Assume that the number of vehicles follows a Poisson distribution.

5-30. Calls arrive at a telephone switchboard such that the number of calls per hour follows a Poisson distribution with a mean of 10. The current equipment can handle up to 20 calls without becoming overloaded. What is the probability of such an overload occurring?

5-31. The number of red blood cells per square unit visible under a microscope follows a Poisson distribution with a mean of 4. Find the probability that more than five such blood cells are visible to the observer.

5-32. Let X_t be the number of vehicles passing through an intersection during a length of time t. The random variable X_t is Poisson distributed with parameter λt. Suppose an automatic counter has been installed to count the number of passing vehicles. However, this counter is not functioning properly, and each passing vehicle has a probability p of not being counted. Let Y_t be the number of vehicles counted during t. Find the probability distribution of Y_t.

5-33. A large insurance company has discovered that 0.2% of the U.S. population is injured as a result of a particular type of accident. This company has 15,000 policyholders carrying coverage against such an accident. What is the probability that three or fewer claims will be filed against those policies next year? Five or more claims?

5-34. Maintenance crews arrive at a tool crib requesting a particular spare part according to a Poisson distribution with parameter λ = 2. Three of these spare parts are normally kept on hand. If more than three orders occur, the crews must journey a considerable distance to central stores.
(a) On a given day, what is the probability that such a journey must be made?
(b) What is the expected demand per day for spare parts?
(c) How many spare parts must be carried if the tool crib is to service all incoming crews 90% of the time?
(d) What is the expected number of crews serviced daily at the tool crib?
(e) What is the expected number of crews making the journey to central stores?

5-35. A loom experiences one yarn breakage approximately every 10 hours. A particular style of cloth is being produced that will take 25 hours on this loom. If three or more breaks are required to render the product unsatisfactory, find the probability that this style of cloth is finished with acceptable quality.

5-36. The number of people boarding a bus at each stop follows a Poisson distribution with parameter λ. The bus company is surveying its usage for scheduling purposes and has installed an automatic counter on each bus. However, if more than 10 people board at any one stop, the counter cannot record the excess and merely registers 10. If X is the number of riders recorded, find the probability distribution of X.

5-37. A mathematics textbook has 200 pages on which typographical errors in the equations could occur. If there are in fact five errors randomly dispersed among these 200 pages, what is the probability that a random sample of 50 pages will contain at least one error? How large must the random sample be to assure that at least three errors will be found with 90% probability?

5-38. The probability of a vehicle having an accident at a particular intersection is 0.0001. Suppose that 10,000 vehicles per day travel through this intersection. What is the probability of no accidents occurring? What is the probability of two or more accidents?

5-39. If the probability of being involved in an auto accident is 0.01 during any year, what is the probability of having two or more accidents during any 10-year driving period?

5-40. Suppose that the number of accidents to employees working on high-explosive shells over a period of time (say 5 weeks) is taken to follow a Poisson distribution with parameter λ = 2.
(a) Find the probabilities of 1, 2, 3, 4, or 5 accidents.
(b) The Poisson distribution has been freely applied in the area of industrial accidents. However, it frequently provides a poor "fit" to actual historical data. Why might this be true? Hint: See Kendall and Stuart (1963), pp. 128-130.

5-41. Use either your favorite computer language or the random integers in Table XV in the Appendix (and scale them by multiplying by 10⁻⁵ to get uniform [0, 1] random number realizations) to do the following:
(a) Produce five realizations of a binomial random variable with n = 5, p = 0.5.
(b) Produce ten realizations of a geometric distribution with p = 0.4.
(c) Produce five realizations of a Poisson random variable with c = 0.15.

5-42. If Y = X^1.3 and X follows a geometric distribution with a mean of 6, use uniform [0, 1] random number realizations to produce five realizations of Y.

5-43. With Exercise 5-42 above, use your computer to do the following:
(a) Produce 500 realizations of Y.
(b) Calculate ȳ = (y₁ + y₂ + ⋯ + y₅₀₀)/500, the mean of this sample.

5-44. Prove the memoryless property of the geometric distribution.


Chapter 6

Some Important Continuous Distributions
6-1 INTRODUCTION
We will now study several important continuous probability distributions. They are the uniform, exponential, gamma, and Weibull distributions. In Chapter 7 the normal distribution, and several other probability distributions closely related to it, will be presented. The normal distribution is perhaps the most important of all continuous distributions; the reason for postponing its study is that it is important enough to warrant a separate chapter.

It has been noted that the range space for a continuous random variable X consists of an interval or a set of intervals. This was illustrated in an earlier chapter, and it was observed that an idealization is involved. For example, if we are measuring the time to failure for an electronic component or the time to process an order through an information system, the measurement devices used are such that there are only a finite number of possible outcomes; however, we will idealize and assume that time may take any value on some interval. Once again we will simplify the notation where no ambiguity is introduced, and we let f(x) = f_X(x) and F(x) = F_X(x).

6-2 THE UNIFORM DISTRIBUTION

The uniform density function is defined as

f(x) = 1/(β − α),   α ≤ x ≤ β,
     = 0,           otherwise,                                       (6-1)

where α and β are real constants with α < β. The density function is shown in Fig. 6-1. Since a uniformly distributed random variable has a probability density function that is
Since a uniformly distributed random variable has a probability density function that is

Figure 6-1 A uniform density.


constant over some interval of definition, the constant must be the reciprocal of the length of the interval in order to satisfy the requirement that

∫_{−∞}^∞ f(x) dx = 1.

A uniformly distributed random variable represents the continuous analog to equally likely outcomes in the sense that for any subinterval [a, b], where α ≤ a < b ≤ β, the probability P(a ≤ X ≤ b) depends only on the length b − a:

P(a ≤ X ≤ b) = ∫_a^b dx/(β − α) = (b − a)/(β − α).

The statement that we choose a point at random on [α, β] simply means that the value chosen, say Y, is uniformly distributed on [α, β].

6-2.1 Mean and Variance of the Uniform Distribution

The mean and variance of the uniform distribution are

E(X) = ∫_α^β x dx/(β − α) = (α + β)/2                                (6-2)

(which is obvious by symmetry) and

V(X) = ∫_α^β x² dx/(β − α) − [(β + α)/2]² = (β − α)²/12.             (6-3)

The moment-generating function M_X(t) is found as follows:

M_X(t) = ∫_α^β e^(tx) dx/(β − α) = (e^(tβ) − e^(tα))/[t(β − α)].     (6-4)

For a uniformly distributed random variable, the distribution function F(x) = P(X ≤ x) is given by equation 6-5, and its graph is shown in Fig. 6-2:

F(x) = 0,                 x < α,
     = (x − α)/(β − α),   α ≤ x < β,
     = 1,                 x ≥ β.                                     (6-5)

Figure 6-2 Distribution function for the uniform random variable.


Example 6-1
A point is chosen at random on the interval [0, 10]. Suppose we wish to find the probability that the point lies between 1/3 and 1/2. The density of the random variable X is f(x) = 1/10, 0 ≤ x ≤ 10, and f(x) = 0 otherwise. Hence,

P(1/3 ≤ X ≤ 1/2) = ∫_{1/3}^{1/2} (1/10) dx = 1/60.
Example 6-2
Numbers of the form NN.N are "rounded off" to the nearest integer. The round-off procedure is such that if the decimal part is less than 0.5, the round is "down" by simply dropping the decimal part; however, if the decimal part is greater than 0.5, the round is "up," that is, the new number is ⌊NN.N⌋ + 1, where ⌊·⌋ is the "greatest integer contained in" function. If the decimal part is exactly 0.5, a coin is tossed to determine which way to round. The round-off error, X, is defined as the difference between the number before rounding and the number after rounding. These errors are commonly distributed according to the uniform distribution on the interval [−0.5, +0.5]. That is,

f(x) = 1,   −0.5 ≤ x ≤ +0.5,
     = 0,   otherwise.

Example 6-3
One of the special features of many simulation languages is a simple automatic procedure for using the uniform distribution. The user declares a mean and modifier (e.g., 500, 100). The compiler immediately creates a routine to produce realizations of a random variable X uniformly distributed on [400, 600].

In the special case where α = 0, β = 1, the uniform variable is said to be uniform on [0, 1], and the symbol U is often used to describe this special variable. Using the results from equations 6-2 and 6-3, we note that E(U) = 1/2 and V(U) = 1/12. If U₁, U₂, ..., U_k is a sequence of such variables, where the variables are mutually independent, the values U₁, U₂, ..., U_k are called random numbers, and a realization u₁, u₂, ..., u_k is properly called a random number realization; however, in common usage, the term "random numbers" is often given to the realizations.

6-3 THE EXPONENTIAL DISTRIBUTION

The exponential distribution has density function

f(x) = λe^(−λx),   x ≥ 0,
     = 0,          otherwise,                                        (6-6)

where the parameter λ is a real, positive constant. A graph of the exponential density is shown in Fig. 6-3.

Figure 6-3 The exponential density function.


6-3.1 The Relationship of the Exponential Distribution to the Poisson Distribution

The exponential distribution is closely related to the Poisson distribution, and an explanation of this relationship should help the reader develop an understanding of the kinds of situations for which the exponential density is appropriate.

In developing the Poisson distribution from the Poisson postulates and the Poisson process, we fixed time at some value t, and we developed the distribution of the number of occurrences in the interval (0, t]. We denoted this random variable X_t, and the distribution was

p(x) = e^(−λt)(λt)^x/x!,   x = 0, 1, 2, ...,
     = 0,                  otherwise.                                (6-7)

Now consider p(0), which is the probability of no occurrences on [0, t]. This is given by

p(0) = e^(−λt).                                                      (6-8)

Recall that we originally fixed time at t. Another interpretation of p(0) = e^(−λt) is that this is the probability that the time to the first occurrence is greater than t. Considering this time as a random variable T, we note that

p(0) = P(T > t) = e^(−λt),   t ≥ 0.                                  (6-9)

If we now let time vary and consider the random variable T as the time to occurrence, then

F(t) = P(T ≤ t) = 1 − e^(−λt),   t ≥ 0.                              (6-10)

And since f(t) = F′(t), we see that the density is

f(t) = λe^(−λt),   t ≥ 0,
     = 0,          otherwise.                                        (6-11)

This is the exponential density of equation 6-6. Thus, the relationship between the exponential and Poisson distributions may be stated as follows: if the number of occurrences has a Poisson distribution as shown in equation 6-7, then the time between successive occurrences has an exponential distribution as shown in equation 6-11. For example, if the number of orders for a certain item received per week has a Poisson distribution, then the time between orders would have an exponential distribution. One variable is discrete (the count) and the other (time) is continuous.
In order to verify that f is a density function, we note that f(x) ≥ 0 for all x and

∫₀^∞ λe^(−λx) dx = [−e^(−λx)]₀^∞ = 1.

6-3.2 Mean and Variance of the Exponential Distribution

The mean and variance of the exponential distribution are

E(X) = ∫₀^∞ x·λe^(−λx) dx = 1/λ                                      (6-12)

and

V(X) = ∫₀^∞ x²·λe^(−λx) dx − (1/λ)²
     = [−x²e^(−λx)|₀^∞ + 2∫₀^∞ x e^(−λx) dx] − (1/λ)² = 1/λ².        (6-13)


Figure 6-4 Distribution function for the exponential.

The standard deviation is 1/λ, and thus the mean and the standard deviation are equal. The moment-generating function is

M_X(t) = (1 − t/λ)^(−1),                                             (6-14)

provided t < λ.

The cumulative distribution function F can be obtained by integrating equation 6-6 as follows:

F(x) = 0,                                x < 0,
     = ∫₀^x λe^(−λt) dt = 1 − e^(−λx),   x ≥ 0.                      (6-15)

Figure 6-4 depicts the distribution function of equation 6-15.

Example 6-4
An electronic component is known to have a useful life represented by an exponential density with failure rate of 10⁻⁵ failures per hour (i.e., λ = 10⁻⁵). The mean time to failure, E(X), is thus 10⁵ hours. Suppose we want to determine the fraction of such components that would fail before the mean life or expected life:

P(X ≤ 1/λ) = ∫₀^{1/λ} λe^(−λx) dx = 1 − e^(−1) ≈ 0.63212.

This result holds for any value of λ greater than zero. In our example, 63.212% of the items would fail before 10⁵ hours (see Fig. 6-5).

Figure 6-5 The mean of an exponential distribution.


Example 6-5
Suppose a designer is to make a decision between two manufacturing processes for the manufacture of a certain component. Process A costs C dollars per unit to manufacture a component. Process B costs k·C dollars per unit to manufacture a component, where k > 1. Components have an exponential time-to-failure density with a failure rate of 200⁻¹ failures per hour for process A, while components from process B have a failure rate of 300⁻¹ failures per hour. The mean lives are thus 200 hours and 300 hours, respectively, for the two processes. Because of a warranty clause, if a component lasts for fewer than 400 hours, the manufacturer must pay a penalty of K dollars. Let X be the time to failure of each component. Thus, the component costs are

C_A = C,        if X ≥ 400,
    = C + K,    if X < 400,

and

C_B = kC,       if X ≥ 400,
    = kC + K,   if X < 400.

The expected costs are

E(C_A) = (C + K)∫₀^400 (1/200)e^(−x/200) dx + C∫_400^∞ (1/200)e^(−x/200) dx
       = (C + K)[1 − e^(−2)] + C[e^(−2)]
       = C + K(1 − e^(−2)),

and

E(C_B) = (kC + K)∫₀^400 (1/300)e^(−x/300) dx + kC∫_400^∞ (1/300)e^(−x/300) dx
       = (kC + K)[1 − e^(−4/3)] + kC[e^(−4/3)]
       = kC + K(1 − e^(−4/3)).

Therefore, if k < 1 + (K/C)(e^(−4/3) − e^(−2)), then E(C_A) > E(C_B), and it is likely that the designer would select process B.
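The algebra above is easy to verify numerically. The sketch below is plain Python, and the values C = 100, K = 50, k = 1.2 are illustrative choices of this sketch, not from the text:

# Expected per-unit costs for processes A and B in Example 6-5.
import math

def expected_cost_A(C, K):
    return C + K * (1 - math.exp(-2))          # mean life 200 h, warranty limit 400 h

def expected_cost_B(C, K, k):
    return k * C + K * (1 - math.exp(-4 / 3))  # mean life 300 h, warranty limit 400 h

C, K, k = 100.0, 50.0, 1.2                     # illustrative numbers only
print(expected_cost_A(C, K), expected_cost_B(C, K, k))
# True exactly when E(C_A) > E(C_B), i.e., when process B is preferred:
print(k < 1 + (K / C) * (math.exp(-4 / 3) - math.exp(-2)))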

6-3.3 Memoryless Property of the Exponential Distribution

The exponential distribution has an interesting and unique memoryless property for continuous variables; that is,

P(X > x + s | X > x) = P(X > x + s)/P(X > x) = e^(−λ(x+s))/e^(−λx) = e^(−λs),

so that

P(X > x + s | X > x) = P(X > s).                                     (6-16)

For example, if a cathode ray tube has an exponential time-to-failure distribution and at time x it is observed to be still functioning, then the remaining life has the same exponential failure distribution as the tube had at time zero.


6-4 THE GAMMA DISTRIBUTION

6-4.1 The Gamma Function

A function used in the definition of a gamma distribution is the gamma function, defined by

Γ(n) = ∫₀^∞ x^(n−1) e^(−x) dx,   for n > 0.                          (6-17)

An important recursive relationship that may easily be shown on integrating equation 6-17 by parts is

Γ(n) = (n − 1)Γ(n − 1).                                              (6-18)

If n is a positive integer, then

Γ(n) = (n − 1)!,                                                     (6-19)

since Γ(1) = ∫₀^∞ e^(−x) dx = 1. Thus, the gamma function is a generalization of the factorial. The reader is asked in Exercise 6-17 to verify that

Γ(1/2) = ∫₀^∞ x^(−1/2) e^(−x) dx = √π.                               (6-20)

6-4.2 Definition of the Gamma Distribution

With the use of the gamma function, we are now able to introduce the gamma probability density function as

f(x) = (λ/Γ(r))(λx)^(r−1) e^(−λx),   x > 0,
     = 0,                            otherwise.                      (6-21)

The parameters are r > 0 and λ > 0. The parameter r is usually called the shape parameter, and λ is called the scale parameter. Figure 6-6 shows several gamma distributions, for λ = 1 and various r. It should be noted that f(x) ≥ 0 for all x, and

∫_{−∞}^∞ f(x) dx = ∫₀^∞ (λ/Γ(r))(λx)^(r−1) e^(−λx) dx = (1/Γ(r))∫₀^∞ y^(r−1) e^(−y) dy = Γ(r)/Γ(r) = 1.

The cumulative distribution function (CDF) of the gamma distribution is analytically intractable but is readily obtained from software packages such as Excel and Minitab. In particular, the Excel function GAMMADIST(x, r, 1/λ, TRUE) gives the CDF F(x). For example, GAMMADIST(15, 5.5, 4.2, TRUE) returns a value of F(15) = 0.2126 for the case r = 5.5, λ = 1/4.2 = 0.238. The Excel function GAMMAINV gives the inverse of the CDF.

Figure 6-6 Gamma distributions for λ = 1.
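The same evaluation can be done in SciPy (an assumption of this sketch; SciPy's gamma takes the shape r and a scale equal to 1/λ):

# F(15) for the gamma distribution with r = 5.5, lambda = 1/4.2, assuming scipy.
from scipy.stats import gamma

r, scale = 5.5, 4.2   # scale = 1/lambda, matching GAMMADIST(15, 5.5, 4.2, TRUE)
print(round(gamma.cdf(15, a=r, scale=scale), 4))      # 0.2126
# The inverse of the CDF (Excel's GAMMAINV) is gamma.ppf:
print(round(gamma.ppf(0.2126, a=r, scale=scale), 2))  # approximately 15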

6-4.3 Relationship Between the Gamma Distribution and the Exponential Distribution

There is a close relationship between the exponential distribution and the gamma distribution. Namely, if r = 1 the gamma distribution reduces to the exponential distribution. This follows from the more general result that if the random variable X is the sum of r independent, exponentially distributed random variables, each with parameter λ, then X has a gamma density with parameters r and λ. That is to say, if

X = X₁ + X₂ + ⋯ + X_r,                                               (6-22)

where X_j has probability density function

g(x) = λe^(−λx),   x ≥ 0,
     = 0,          otherwise,

and where the X_j are mutually independent, then X has the density given in equation 6-21. In many applications of the gamma distribution that we will consider, r will be a positive integer, and we may use this knowledge to good advantage in developing the distribution function. Some authors refer to the special case in which r is a positive integer as the Erlang distribution.

6-4.4 Mean and Variance of the Gamma Distribution

We may show that the mean and variance of the gamma distribution are

E(X) = r/λ                                                           (6-23)

and

V(X) = r/λ².                                                         (6-24)

Equations 6-23 and 6-24 represent the mean and variance regardless of whether or not r is an integer; however, when r is an integer and the interpretation given in equation 6-22 is made, it is obvious that

E(X) = Σ_{j=1}^r E(X_j) = r·(1/λ) = r/λ

and

V(X) = Σ_{j=1}^r V(X_j) = r·(1/λ²) = r/λ²

from a direct application of the expected value and variance operators to the sum of independent random variables.

The moment-generating function for the gamma distribution is

M_X(t) = (1 − t/λ)^(−r).                                             (6-25)


Recalling that the moment-generating function for the exponential distribution was [1 − (t/λ)]^(−1), this result is expected, since

M_{(X₁+X₂+⋯+X_r)}(t) = ∏_{j=1}^r M_{X_j}(t) = [(1 − t/λ)^(−1)]^r = (1 − t/λ)^(−r).   (6-26)

The distribution function, F, is

F(x) = 1 − ∫_x^∞ (λ/Γ(r))(λt)^(r−1) e^(−λt) dt,   x > 0,
     = 0,                                         x ≤ 0.             (6-27)

If r is a positive integer, then equation 6-27 may be integrated by parts, giving

F(x) = 1 − Σ_{k=0}^{r−1} e^(−λx)(λx)^k/k!,   x > 0,                  (6-28)

which is the sum of Poisson terms with mean λx. Thus, tables of the cumulative Poisson may be used to evaluate the distribution function of the gamma.

Example 6-6
A standby redundant system operates as shown in Fig. 6-7. Initially unit 1 is on line, while unit 2 and unit 3 are on standby. When unit 1 fails, the decision switch (DS) switches unit 2 on until it fails, and then unit 3 is switched on. The decision switch is assumed to be perfect, so that the system life X may be represented as the sum of the subsystem lives, X = X₁ + X₂ + X₃. If the subsystem lives are independent of one another, and if the subsystems each have a life X_j, j = 1, 2, 3, having density g(x) = (1/100)e^(−x/100), x ≥ 0, then X will have a gamma density with r = 3 and λ = 0.01. That is,

f(x) = (0.01/2!)(0.01x)² e^(−0.01x),   x > 0,
     = 0,                              otherwise.

The probability that the system will operate at least x hours is denoted R(x) and is called the reliability function. Here,

R(x) = 1 − F(x) = Σ_{k=0}^2 e^(−0.01x)(0.01x)^k/k!
     = e^(−0.01x)[1 + (0.01x) + (0.01x)²/2].

Figure 6-7 A standby redundant system.
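Since r = 3 is an integer, R(x) is a cumulative Poisson sum and must agree with the gamma survival function. The sketch below (assuming SciPy) checks this at x = 100 hours, a value chosen here only for illustration:

# Reliability of the standby system: Poisson-sum form versus gamma survival function.
import math
from scipy.stats import gamma

def R(x, lam=0.01):   # R(x) = e^(-lam x) [1 + lam x + (lam x)^2 / 2]
    return math.exp(-lam * x) * (1 + lam * x + (lam * x) ** 2 / 2)

x = 100.0
print(round(R(x), 4))                              # 0.9197
print(round(gamma.sf(x, a=3, scale=1 / 0.01), 4))  # same value from the gamma CDF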


Example 6-7
For a gamma distribution with λ = 1/2 and r = v/2, where v is a positive integer, the chi-square distribution with v degrees of freedom results:

f(x) = [1/(2^(v/2) Γ(v/2))] x^((v/2)−1) e^(−x/2),   x > 0,
     = 0,                                           otherwise.

This distribution will be discussed further in Chapter 8.

6-5 THE WEIBULL DISTRIBUTION

The Weibull distribution has been widely applied to many random phenomena. The principal utility of the Weibull distribution is that it affords an excellent approximation to the probability law of many random variables. One important area of application has been as a model for time to failure in electrical and mechanical components and systems. This is discussed in Chapter 17. The density function is

f(x) = (β/δ)[(x − γ)/δ]^(β−1) exp{−[(x − γ)/δ]^β},   x ≥ γ,
     = 0,                                            otherwise.      (6-29)

Its parameters are γ (−∞ < γ < ∞), the location parameter; δ > 0, the scale parameter; and β > 0, the shape parameter. By appropriate selection of these parameters, this density function will closely approximate many observational phenomena.

Figure 6-8 shows some Weibull densities for γ = 0, δ = 1, and β = 1, 2, 3, 4. Note that when γ = 0 and β = 1, the Weibull distribution reduces to an exponential density with λ = 1/δ. Although the exponential distribution is a special case of both the gamma and Weibull distributions, the gamma and Weibull in general are noninterchangeable.

6-5.1 Mean and Variance of the Weibull Distribution

The mean and variance of the Weibull distribution can be shown to be

E(X) = γ + δΓ(1/β + 1)                                               (6-30)

Figure 6-8 Weibull densities for γ = 0, δ = 1, and β = 1, 2, 3, 4.


and

V(X) = δ²[Γ(2/β + 1) − (Γ(1/β + 1))²].                               (6-31)

The distribution function has the relatively simple form

F(x) = 1 − exp{−[(x − γ)/δ]^β},   x ≥ γ.                             (6-32)

The Weibull CDF F(x) is conveniently provided by software packages such as Excel and Minitab. For the case γ = 0, the Excel function WEIBULL(x, β, δ, TRUE) returns F(x).

Example 6-8
The time-to-failure distribution for electronic subassemblies is known to have a Weibull density with γ = 0, β = 1/2, and δ = 100. The fraction expected to survive to, say, 400 hours is thus

1 − F(400) = exp[−(400/100)^(1/2)] = e^(−2) = 0.1353.

The same result could have been obtained in Excel via the function call 1 − WEIBULL(400, 1/2, 100, TRUE). The mean time to failure is

E(X) = 0 + 100·Γ(3) = 100(2) = 200 hours.
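SciPy's weibull_min gives the same numbers (a sketch, assuming SciPy; weibull_min takes the shape β, location γ, and scale δ):

# Survival to 400 hours and the mean for the Weibull with gamma = 0, beta = 1/2, delta = 100.
from scipy.stats import weibull_min

beta, gamma_loc, delta = 0.5, 0.0, 100.0
print(round(weibull_min.sf(400, beta, loc=gamma_loc, scale=delta), 4))  # 0.1353
print(round(weibull_min.mean(beta, loc=gamma_loc, scale=delta), 1))     # 200.0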

Example 6-9
Berrettoni (1964) presented a number of applications of the Weibull distribution. The following are examples of natural processes having a probability law closely approximated by the Weibull distribution. The random variable is denoted X in the examples.

1. Corrosion resistance of magnesium alloy plates.
   X: Corrosion weight loss (10⁻² mg/(cm²)(day)) when magnesium alloy plates are immersed in an inhibited aqueous 20% solution of MgBr₂.

2. Return goods classified according to number of weeks after shipment.
   X: Length of period (10⁻¹ weeks) until a customer returns the defective product after shipment.

3. Number of downtimes per shift.
   X: Number of downtimes per shift (times 10⁻¹) occurring in a continuous automatic and complicated assembly line.

4. Leakage failure in dry-cell batteries.
   X: Age (years) when leakage starts.

5. Reliability of capacitors.
   X: Life (hours) of 3.3-μF, 50-V solid tantalum capacitors operating at an ambient temperature of 125°C, where the rated catalogue voltage is 33 V.


6-6 GENERATION OF REALIZATIONS


Suppose for now that U₁, U₂, ... are independent uniform [0, 1] random variables. We will show how to use these uniforms to generate other random variables.


If we desire to produce realizations of a uniform random variable on [α, β], this is simply accomplished using

xᵢ = α + uᵢ(β − α),   i = 1, 2, ....                                 (6-33)

If we seek realizations of an exponential random variable with parameter λ, the inverse transform method yields

xᵢ = (−1/λ) ln(uᵢ),   i = 1, 2, ....                                 (6-34)

Similarly, using the same method, realizations of a Weibull random variable with parameters γ, β, δ are obtained using

xᵢ = γ + δ[−ln(uᵢ)]^(1/β),   i = 1, 2, ....                          (6-35)
The generation of gamma variable realizations usually employs a technique known as the acceptance-rejection method, and a variety of these methods have been used. If we wish to produce realizations from a gamma variable with parameters r > 1 and λ > 0, one approach, suggested by Cheng (1977), is as follows:

Step 1. Let a = (2r − 1)^(1/2) and b = 2r − ln 4 + 1/a.
Step 2. Generate u₁, u₂ as uniform [0, 1] random number realizations.
Step 3. Let y = r[u₁/(1 − u₁)]^a.
Step 4a. If y > b − ln(u₁²u₂), reject y and return to Step 2.
Step 4b. If y ≤ b − ln(u₁²u₂), assign x ← y/λ.

For more details on these and other random-variate generation techniques, see Chapter 19.
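The inverse transform formulas of equations 6-33 through 6-35 translate directly into code. The sketch below is standard-library Python (the function names are this sketch's own); 1 − u is used in place of u to keep the logarithm's argument strictly positive:

# Inverse transform generation of uniform, exponential, and Weibull realizations.
import math
import random

def uniform_ab(a, b):
    return a + random.random() * (b - a)             # equation 6-33

def exponential(lam):
    u = 1.0 - random.random()                        # u in (0, 1], avoids log(0)
    return -math.log(u) / lam                        # equation 6-34

def weibull(gamma_loc, beta, delta):
    u = 1.0 - random.random()
    return gamma_loc + delta * (-math.log(u)) ** (1 / beta)   # equation 6-35

print([round(uniform_ab(10, 20), 2) for _ in range(3)])
print([round(exponential(0.5), 2) for _ in range(3)])
print([round(weibull(0.0, 2.0, 100.0), 2) for _ in range(3)])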

6-7 SUMMARY
This chapter has presented four widely used density functions for continuous random variables. The uniform, exponential, gamma, and Weibull distributions were presented along with underlying assumptions and example applications. Table 6-1 presents a summary of these distributions.

6-8 EXERCISES
6-1. A point is chosen at random on the line segment [0, 4]. What is the probability that it lies between 1/2 and 3/4? Between 2 1/4 and 3 1/2?

6-2. The opening price of a particular stock is uniformly distributed on the interval [35 1/2, 44 1/2]. What is the probability that, on any given day, the opening price is less than 40? Between 40 and 42?

6-3. The random variable X is uniformly distributed on the interval [0, 2]. Find the distribution of the random variable Y = 5 + 2X.

6-4. A real estate broker charges a fixed fee of $50 plus a 6% commission on the landowner's profit. If this profit is uniformly distributed between $0 and $2000, find the probability distribution of the broker's total fees.

6-5. Use the moment-generating function for the uniform density (as given by equation 6-4) to generate the mean and variance.

6-6. Let X be uniformly distributed and symmetric about zero with variance 1. Find the appropriate values for α and β.

6-7. Show how the uniform density function can be used to generate variates from the empirical probability distribution described below. Hint: Apply the inverse transform method.

y       1     2     3     4
p(y)    0.3   0.2   0.4   0.1

Table 6-1 Summary of Continuous Distributions

Uniform
  Parameters: α, β (β > α)
  f(x) = 1/(β − α), α ≤ x ≤ β; = 0 otherwise
  Mean: (α + β)/2    Variance: (β − α)²/12
  Moment-generating function: (e^(tβ) − e^(tα))/[t(β − α)]

Exponential
  Parameters: λ > 0
  f(x) = λe^(−λx), x ≥ 0; = 0 otherwise
  Mean: 1/λ    Variance: 1/λ²
  Moment-generating function: (1 − t/λ)^(−1)

Gamma
  Parameters: r > 0, λ > 0
  f(x) = (λ/Γ(r))(λx)^(r−1) e^(−λx), x > 0; = 0 otherwise
  Mean: r/λ    Variance: r/λ²
  Moment-generating function: (1 − t/λ)^(−r)

Weibull
  Parameters: −∞ < γ < ∞; δ > 0; β > 0
  f(x) = (β/δ)[(x − γ)/δ]^(β−1) exp{−[(x − γ)/δ]^β}, x ≥ γ; = 0 otherwise
  Mean: γ + δΓ(1/β + 1)
  Variance: δ²[Γ(2/β + 1) − (Γ(1/β + 1))²]



6-8. The random variable X is uniformly distributed over the interval [0, 4]. What is the probability that the roots of y² + 4Xy + X + 1 = 0 are real?

6-9. Verify that the moment-generating function of the exponential distribution is as given by equation 6-14. Use it to generate the mean and variance.

6-10. The engine and drive train of a new car are guaranteed for 1 year. The mean life of an engine and drive train is estimated to be 3 years, and the time to failure has an exponential density. The realized profit on a new car is $1000. Including costs of parts and labor, the dealer must pay $250 to repair each failure. What is the expected profit per car?

6-11. For the data in Exercise 6-10, what percentage of cars will experience failure in the engine and drive train during the first 6 months of use?

6-12. Let the length of time a machine will operate be an exponentially distributed random variable with probability density function f(t) = θe^(−θt), t ≥ 0. Suppose an operator for this machine must be hired for a predetermined and fixed length of time, say Y. She is paid d dollars per time period during this interval. The net profit from operating this machine, exclusive of labor costs, is r dollars per time period that it is operating. Find the value of Y that maximizes the expected total profit obtained.

6-13. The time to failure of a television tube is estimated to be exponentially distributed with a mean of 3 years. A company offers insurance on these tubes for the first year of usage. On what percentage of policies will they have to pay a claim?

6-14. Is there an exponential density that satisfies the following condition?

P(X ≤ 2) = (4/5)·P(X ≤ 3)

If so, find the value of λ.

6-15. Two manufacturing processes are under consideration. The per-unit cost for process I is C, while for process II it is 3C. Products from both processes have exponential time-to-failure densities with mean rates of 25⁻¹ failures per hour and 35⁻¹ failures per hour from I and II, respectively. If a product fails before 15 hours it must be replaced at a cost of Z dollars. Which process would you recommend?

6-16. A transistor has an exponential time-to-failure distribution with a mean time to failure of 20,000 hours. The transistor has already lasted 20,000 hours in a particular application. What is the probability that the transistor fails by 30,000 hours?

6-17. Show that Γ(1/2) = ∫₀^∞ x^(−1/2) e^(−x) dx = √π.

6-18. Prove the gamma function properties given by equations 6-18 and 6-19.

6-19. A ferry boat will take its customers across a river when 10 cars are aboard. Experience shows that cars arrive at the ferry boat independently and at a mean rate of seven per hour. Find the probability that the time between consecutive trips will be at least 1 hour.

6-20. A box of candy contains 24 bars. The time between demands for these candy bars is exponentially distributed with a mean of 10 minutes. What is the probability that a box of candy bars opened at 8:00 A.M. will be empty by noon?

6-21. Use the moment-generating function of the gamma distribution (as given by equation 6-25) to find the mean and variance.

6-22. The life of an electronic system is Y = X₁ + X₂ + X₃, the sum of the subsystem component lives. The subsystems are independent, each having exponential failure densities with a mean time between failures of 4 hours. What is the probability that the system will operate for at least 24 hours?

6-23. The replenishment time for a certain product is known to be gamma distributed with a mean of 40 and a variance of 400. Find the probability that an order is received within the first 20 days after it is ordered. Within the first 60 days.

6-24. Suppose a gamma distributed random variable is defined over the interval u ≤ x < ∞ with density function

f(x) = (λ/Γ(r))[λ(x − u)]^(r−1) e^(−λ(x−u)),   x ≥ u, λ > 0, r > 0,
     = 0,                                      otherwise.

Find the mean of this three-parameter gamma distribution.

6-25. The beta probability distribution is defined by

f(x) = [Γ(λ + r)/(Γ(λ)Γ(r))] x^(λ−1) (1 − x)^(r−1),   0 ≤ x ≤ 1, λ > 0, r > 0,
     = 0,                                             otherwise.

(a) Graph the distribution for λ > 1, r > 1.
(b) Graph the distribution for λ < 1, r < 1.
(c) Graph the distribution for λ < 1, r ≥ 1.
(d) Graph the distribution for λ ≥ 1, r < 1.
(e) Graph the distribution for λ = r.


6-26. Show that when λ = r = 1 the beta distribution reduces to the uniform distribution.

6-27. Show that when λ = 2, r = 1 or λ = 1, r = 2 the beta distribution reduces to a triangular probability distribution. Graph the density function.

6-28. Show that if λ = r = 2 the beta distribution reduces to a parabolic probability distribution. Graph the density function.

6-29. Find the mean and variance of the beta distribution.

6-30. Find the mean and variance of the Weibull distribution.

6-31. The diameter of steel shafts is Weibull distributed with parameters γ = 1.0 inches, β = 2, and δ = 0.5. Find the probability that a randomly selected shaft will not exceed 1.5 inches in diameter.
6-32. The time to failure of a certain transistor is known to be Weibull distributed with parameters γ = 0, β = 1/2, and δ = 400. Find the fraction expected to survive 600 hours.

6-33. The time to leakage failure in a certain type of dry-cell battery is expected to have a Weibull distribution with parameters γ = 0, β = 1/2, and δ = 400. What is the probability that a battery will survive beyond 800 hours of use?

6-34. Graph the Weibull distribution with γ = 0, δ = 1, and β = 1, 2, 3, and 4.

6-35. The time-to-failure density for a small computer system has a Weibull density with γ = 0, β = 1/3, and δ = 200.
(a) What fraction of these units will survive to 1000 hours?
(b) What is the mean time to failure?

6-36. A manufacturer of a commercial television monitor guarantees the picture tube for 1 year (8760 hours). The monitors are used in airport terminals for flight schedules, and they are in continuous use with power on. The mean life of the tubes is 20,000 hours, and they follow an exponential time-to-failure density. It costs the manufacturer $300 to make, sell, and deliver a monitor that will be sold for $400. It costs $150 to replace a failed tube, including materials and labor. The manufacturer has no replacement obligation beyond the first replacement. What is the manufacturer's expected profit?

6-37. The lead time for orders of diodes from a certain manufacturer is known to have a gamma distribution with a mean of 20 days and a standard deviation of 10 days. Determine the probability of receiving an order within 15 days of the placement date.
6-38. Use random numbers generated from your favorite computer language, or from scaling the random integers in Table XV of the Appendix by multiplying by 10⁻⁵, and do the following:
(a) Produce 10 realizations of a variable that is uniform on [10, 20].
(b) Produce five realizations of an exponential random variable with a parameter of λ = 2 × 10⁻⁵.
(c) Produce five realizations of a gamma variable with r = 2 and λ = 4.
(d) Produce 10 realizations of a Weibull variable with γ = 0, β = 1/2, δ = 100.

6-39. Use the random number generation schemes suggested in Exercise 6-38, and do the following:
(a) Produce 10 realizations of Y = 2X^{1/2}, where X follows an exponential distribution with a mean of 10.
(b) Produce 10 realizations of Y = √X₁/U₂, where X₁ is gamma with r = 2, λ = 4, and U₂ is uniform on [0, 1].

Chapter 7

The Normal Distribution

7-1 INTRODUCTION
In this chapter we consider the normal distribution. This distribution is very important in both the theory and application of statistics. We also discuss the lognormal and bivariate normal distributions.

The normal distribution was first studied in the eighteenth century, when the patterns in errors of measurement were observed to follow a symmetrical bell-shaped distribution. It was first presented in mathematical form in 1733 by DeMoivre, who derived it as a limiting form of the binomial distribution. The distribution was also known to Laplace no later than 1775. Through historical error, it has been attributed to Gauss, whose first published reference to it appeared in 1809, and the term Gaussian distribution is frequently employed. Various attempts were made during the eighteenth and nineteenth centuries to establish this distribution as the underlying probability law for all continuous random variables; thus, the name normal came to be applied.

7-2 THE NORMAL DISTRIBUTION

The normal distribution is in many respects the cornerstone of statistics. A random variable X is said to have a normal distribution with mean μ (-∞ < μ < ∞) and variance σ² > 0 if it has the density function

f(x) = (1/(σ√(2π))) e^{-(1/2)[(x-μ)/σ]²}, -∞ < x < ∞. (7-1)

The distribution is illustrated graphically in Fig. 7-1. The normal distribution is used so extensively that the shorthand notation X ~ N(μ, σ²) is often employed to indicate that the random variable X is normally distributed with mean μ and variance σ².

7-2.1 Properties of the Normal Distribution

The normal distribution has several important properties.

1. ∫_{-∞}^{∞} f(x) dx = 1; required of all density functions.
2. f(x) ≥ 0 for all x.
3. lim_{x→∞} f(x) = 0 and lim_{x→-∞} f(x) = 0.
4. f(μ + x) = f(μ - x). The density is symmetric about μ. (7-2)
5. The maximum value of f occurs at x = μ.
6. The points of inflection of f are at x = μ ± σ.

Figure 7-1 The normal distribution.

Property 1 may be demonstrated as follows. Let y = (x - μ)/σ in equation 7-1 and denote the integral by I. That is,

I = (1/√(2π)) ∫_{-∞}^{∞} e^{-(1/2)y²} dy.

Our proof that ∫_{-∞}^{∞} f(x) dx = 1 will consist of showing that I² = 1 and then inferring that I = 1, since f must be everywhere positive. Defining a second normally distributed variable, Z, we have

I² = [(1/√(2π)) ∫_{-∞}^{∞} e^{-(1/2)y²} dy] · [(1/√(2π)) ∫_{-∞}^{∞} e^{-(1/2)z²} dz]
   = (1/(2π)) ∫_{-∞}^{∞} ∫_{-∞}^{∞} e^{-(1/2)(y² + z²)} dy dz.

On changing to polar coordinates with the transformation of variables y = r sin θ and z = r cos θ, the integral becomes

I² = (1/(2π)) ∫_{0}^{2π} ∫_{0}^{∞} r e^{-(1/2)r²} dr dθ = (1/(2π)) ∫_{0}^{2π} dθ = 1,

completing the proof.

7-2.2 Mean and Variance of the Normal Distribution

The mean of the normal distribution may be determined easily. Since

E(X) = ∫_{-∞}^{∞} x (1/(σ√(2π))) e^{-(1/2)[(x-μ)/σ]²} dx,

and if we let z = (x - μ)/σ, we obtain

E(X) = (1/√(2π)) ∫_{-∞}^{∞} (μ + σz) e^{-z²/2} dz
     = μ ∫_{-∞}^{∞} (1/√(2π)) e^{-z²/2} dz + σ ∫_{-∞}^{∞} (1/√(2π)) z e^{-z²/2} dz.

Since the integrand of the first integral is that of a normal density with μ = 0 and σ² = 1, the value of the first integral is one. The second integral has value zero, that is,

(1/√(2π)) ∫_{-∞}^{∞} z e^{-z²/2} dz = 0,

and thus

E(X) = μ[1] + σ[0] = μ. (7-3)

In retrospect, this result makes sense via a symmetry argument.

To find the variance we must evaluate

V(X) = E[(X - μ)²] = ∫_{-∞}^{∞} (x - μ)² (1/(σ√(2π))) e^{-(1/2)[(x-μ)/σ]²} dx,

and letting z = (x - μ)/σ, we obtain

V(X) = σ² (1/√(2π)) ∫_{-∞}^{∞} z² e^{-z²/2} dz,

so that

V(X) = σ². (7-4)

In summary, the mean and variance of the normal density given in equation 7-1 are μ and σ², respectively.

The moment-generating function for the normal distribution can be shown to be

M_X(t) = exp(μt + σ²t²/2). (7-5)

For the development of equation 7-5, see Exercise 7-10.

7-2.3 The Normal Cumulative Distribution Function

The distribution function F is

F(x) = P(X ≤ x) = ∫_{-∞}^{x} (1/(σ√(2π))) e^{-(1/2)[(u-μ)/σ]²} du. (7-6)

It is impossible to evaluate this integral without resorting to numerical methods, and even then the evaluation would have to be accomplished for each pair (μ, σ²). However, a simple transformation of variables, z = (x - μ)/σ, allows the evaluation to be independent of μ and σ. That is,

F(x) = P(X ≤ x) = P(Z ≤ (x - μ)/σ) = ∫_{-∞}^{(x-μ)/σ} (1/√(2π)) e^{-z²/2} dz. (7-7)

7-2.4 The Standard Normal Distribution

The probability density function in equation 7-7 above,

φ(z) = (1/√(2π)) e^{-z²/2}, -∞ < z < ∞,

is that of a normal distribution with mean 0 and variance 1; that is, Z ~ N(0, 1), and we say that Z has a standard normal distribution. A graph of the probability density function is shown in Fig. 7-2. The corresponding distribution function is Φ, where

Φ(z) = ∫_{-∞}^{z} (1/√(2π)) e^{-u²/2} du, (7-8)

and this function has been well tabulated. A table of the integral in equation 7-8 has been provided in Table II of the Appendix. In fact, many software packages such as Excel and Minitab provide functions to evaluate Φ(z). For instance, the Excel call NORMSDIST(z) does just this task. As an example, we find that NORMSDIST(1.96) = 0.9750. The Excel function NORMSINV returns the inverse CDF. For example, NORMSINV(0.975) = 1.960. The functions NORMDIST(x, μ, σ, TRUE) and NORMINV give the CDF and inverse CDF of the N(μ, σ²) distribution.
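Equivalent functions are available in most computing environments. As a quick illustration, not tied to any package mentioned in the text, the following sketch uses the NormalDist class from Python's standard library to reproduce the values quoted above.

```python
from statistics import NormalDist

Z = NormalDist()                 # standard normal: mu = 0, sigma = 1
print(Z.cdf(1.96))               # Phi(1.96), approximately 0.9750
print(Z.inv_cdf(0.975))          # inverse CDF at 0.975, approximately 1.960

# A general N(mu, sigma^2) variable; note that NormalDist takes the
# standard deviation sigma, not the variance sigma^2.
X = NormalDist(mu=100, sigma=2)  # X ~ N(100, 4)
print(X.cdf(104))                # F(104) = Phi(2), approximately 0.9772
```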

7-2.5 Problem-Solving Procedure

The procedure for solving practical problems involving the evaluation of cumulative normal probabilities is actually very simple. For example, suppose that X ~ N(100, 4) and we wish to find the probability that X is less than or equal to 104; that is, P(X ≤ 104) = F(104). Since the standard normal random variable is

Z = (X - μ)/σ,

we can standardize the point of interest x = 104 to obtain

z = (x - μ)/σ = (104 - 100)/2 = 2.

Now the probability that the standard normal random variable Z is less than or equal to 2 is equal to the probability that the original normal random variable X is less than or equal to 104. Expressed mathematically,

P(X ≤ 104) = P(Z ≤ 2),

or

F(104) = Φ(2).

Figure 7-2 The standard normal distribution.


Appendix Table II contains cumulative standard normal probabilities for various values of z. From this table, we can read

Φ(2) = 0.9772.

Note that in the relationship z = (x - μ)/σ, the variable z measures the departure of x from the mean μ in standard deviation (σ) units. For instance, in the case just considered, F(104) = Φ(2), which indicates that 104 is two standard deviations (σ = 2) above the mean. In general, x = μ + σz. In solving problems, we sometimes need to use the symmetry property of φ in addition to the tables. It is helpful to make a sketch if there is any confusion in determining exactly which probabilities are required, since the area under the curve and over the interval of interest is the probability that the random variable will lie on the interval.

Example 7-1
The breaking strength (in newtons) of a synthetic fabric is denoted X, and it is distributed as N(800, 144). The purchaser of the fabric requires the fabric to have a strength of at least 772 N. A fabric sample is randomly selected and tested. To find P(X ≥ 772), we first calculate

P(X < 772) = P((X - μ)/σ < (772 - 800)/12) = P(Z < -2.33) = Φ(-2.33) = 0.01.

Hence the desired probability, P(X ≥ 772), equals 0.99. Figure 7-3 shows the calculated probability relative to both X and Z. We have chosen to work with the random variable Z because its distribution function is tabulated.

Example 7-2
The time required to repair an automatic loading machine in a complex food-packaging operation of a production process is X minutes. Studies have shown that the approximation X ~ N(120, 16) is quite good. A sketch is shown in Fig. 7-4.

Figure 7-3 P(X < 772), where X ~ N(800, 144).

Figure 7-4 P(X > 125), where X ~ N(120, 16).

If the process is down for more than 125 minutes, all equipment

must be cleaned, with the loss of all product in process. The total cost of product loss and cleaning associated with the long downtime is $10,000. In order to determine the probability of this occurring, we proceed as follows:

P(X > 125) = P(Z > (125 - 120)/4) = P(Z > 1.25) = 1 - Φ(1.25) = 1 - 0.8944 = 0.1056.

Thus, given a breakdown of the packaging machine, the expected cost is E(C) = 0.1056(10,000 + C_R1) + 0.8944(C_R1), where C is the total cost and C_R1 is the repair cost. Simplified, E(C) = C_R1 + 1056. Suppose the management can reduce the mean of the service time distribution to 115 minutes by adding more maintenance personnel. The new cost for repair will be C_R2 > C_R1; however,

P(X > 125) = P(Z > (125 - 115)/4) = P(Z > 2.5) = 1 - Φ(2.5) = 1 - 0.9938 = 0.0062,

so that the new expected cost would be C_R2 + 62, and one would logically make the decision to add to the maintenance crew if

C_R2 + 62 < C_R1 + 1056,

or

C_R2 - C_R1 < $994.

It is assumed that the frequency of breakdowns remains unchanged.

Example 7-3
The pitch diameter of the thread on a fitting is normally distributed with a mean of 0.4008 cm and a standard deviation of 0.0004 cm. The design specifications are 0.4000 ± 0.0010 cm. This is illustrated in Fig. 7-5. Notice that the process is operating with the mean not equal to the nominal specification. We desire to determine what fraction of product is within tolerance. Using the approach employed previously,

P(0.3990 ≤ X ≤ 0.4010) = P((0.3990 - 0.4008)/0.0004 ≤ Z ≤ (0.4010 - 0.4008)/0.0004)
= P(-4.5 ≤ Z ≤ 0.5)
= Φ(0.5) - Φ(-4.5)
= 0.6915 - 0.0000
= 0.6915.

Figure 7-5 Distribution of thread pitch diameters (lower spec. limit 0.3990, upper spec. limit 0.4010).


As process engineers study the results of such calculations, they decide to replace a worn cutting tool and adjust the machine producing the fittings so that the new mean falls directly at the nominal value of 0.4000. Then,

P(0.3990 ≤ X ≤ 0.4010) = P((0.3990 - 0.4)/0.0004 ≤ Z ≤ (0.4010 - 0.4)/0.0004)
= P(-2.5 ≤ Z ≤ +2.5)
= Φ(2.5) - Φ(-2.5)
= 0.9938 - 0.0062
= 0.9876.

We see that with the adjustments, 98.76% of the fittings will be within tolerance. The distribution of adjusted machine pitch diameters is shown in Fig. 7-6.

The previous example illustrates a concept important in quality engineering. Operating a process at the nominal level is generally superior to operating the process at some other level, if there are two-sided specification limits.

Example 7-4
Another type of problem involving the use of tables of the normal distribution sometimes arises. Suppose, for example, that X ~ N(50, 4). Furthermore, suppose we want to determine a value of X, say x, such that P(X > x) = 0.025. Then

P(X > x) = P(Z > (x - 50)/2) = 0.025,

so that, reading the normal table "backward," we obtain

(x - 50)/2 = 1.96 = Φ⁻¹(0.975)

or

x = 50 + 2(1.96) = 53.92.

There are several symmetric intervals that arise frequently. Their probabilities are

P(μ - 1.00σ ≤ X ≤ μ + 1.00σ) = 0.6826,
P(μ - 1.645σ ≤ X ≤ μ + 1.645σ) = 0.90,
P(μ - 1.96σ ≤ X ≤ μ + 1.96σ) = 0.95,
P(μ - 2.57σ ≤ X ≤ μ + 2.57σ) = 0.99,
P(μ - 3.00σ ≤ X ≤ μ + 3.00σ) = 0.9973. (7-9)

Figure 7-6 Distribution of adjusted machine pitch diameters.
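The interval probabilities in equation 7-9 are easy to reproduce numerically. A minimal check, again using Python's standard library:

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal
for k in (1.00, 1.645, 1.96, 2.57, 3.00):
    # P(mu - k*sigma <= X <= mu + k*sigma) reduces to Phi(k) - Phi(-k)
    print(f"{k:5.3f}  {Z.cdf(k) - Z.cdf(-k):.4f}")
```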


7-3 THE REPRODUCTIVE PROPERTY OF THE NORMAL DISTRIBUTION

Suppose we have n independent, normal random variables X₁, X₂, ..., X_n, where X_i ~ N(μ_i, σ_i²) for i = 1, 2, ..., n. It was shown earlier that if

Y = X₁ + X₂ + ... + X_n,

then

E(Y) = μ_Y = ∑_{i=1}^{n} μ_i (7-10)

and

V(Y) = σ_Y² = ∑_{i=1}^{n} σ_i². (7-11)

Using moment-generating functions, we see that

M_Y(t) = M_{X1}(t) · M_{X2}(t) ··· M_{Xn}(t)
       = exp[(μ₁ + μ₂ + ... + μ_n)t + (σ₁² + σ₂² + ... + σ_n²)t²/2]. (7-12)

Therefore,

M_Y(t) = exp[μ_Y t + σ_Y² t²/2], (7-13)

which is the moment-generating function of a normally distributed random variable with mean μ₁ + μ₂ + ... + μ_n and variance σ₁² + σ₂² + ... + σ_n². Therefore, by the uniqueness property of the moment-generating function, we see that Y is normal with mean μ_Y and variance σ_Y².

Example 7-5
An assembly consists of three linkage components, as shown in Fig. 7-7. The properties of X₁, X₂, and X₃ are given below, with means in centimeters and variances in square centimeters.

X₁ ~ N(12, 0.02)
X₂ ~ N(24, 0.03)
X₃ ~ N(18, 0.04)

Links are produced by different machines and operators, so we have reason to assume that X₁, X₂, and X₃ are independent. Suppose we want to determine P(53.8 ≤ Y ≤ 54.2). Since Y = X₁ + X₂ + X₃, Y is distributed normally with mean μ_Y = 12 + 24 + 18 = 54 and variance σ_Y² = σ₁² + σ₂² + σ₃² = 0.02 + 0.03 + 0.04 = 0.09. Thus,

P(53.8 ≤ Y ≤ 54.2) = P((53.8 - 54)/0.3 ≤ Z ≤ (54.2 - 54)/0.3)
= Φ(0.667) - Φ(-0.667)
= 0.748 - 0.252
= 0.496.

Figure 7-7 A linkage assembly.
These results can be generalized to linear combinations of independent normal variables. Linear combinations of the form

Y = a₀ + a₁X₁ + a₂X₂ + ... + a_nX_n (7-14)

were presented earlier, and we found that μ_Y = a₀ + ∑_{i=1}^{n} a_iμ_i and σ_Y² = ∑_{i=1}^{n} a_i²σ_i². Again, if X₁, X₂, ..., X_n are independent, then Y is normally distributed with mean μ_Y and variance σ_Y².

Example 7-6
A shaft is to be assembled into a bearing, as shown in Fig. 7-8. The clearance is Y = X₁ - X₂. Suppose

X₁ ~ N(1.500, 0.0016)

and

X₂ ~ N(1.480, 0.0009).

Then,

μ_Y = a₁μ₁ + a₂μ₂ = (1)(1.500) + (-1)(1.480) = 0.02

and

σ_Y² = a₁²σ₁² + a₂²σ₂² = (1)²(0.0016) + (-1)²(0.0009) = 0.0025,

so that

σ_Y = 0.05.

When the parts are assembled, there will be interference if Y < 0, so

P(interference) = P(Y < 0) = P(Z < (0 - 0.02)/0.05) = Φ(-0.4) = 0.3446.

This indicates that 34.46% of all assemblies attempted would meet with failure. If the designer feels that the nominal clearance μ_Y = 0.02 is as large as it can be made for the assembly, then the only way to reduce the 34.46% figure is to reduce the variance of the distributions. In many cases, this can be accomplished by the overhaul of production equipment, better training of production operators, and so on.

Figure 7-8 An assembly.
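A short numerical check of this interference calculation follows; it simply encodes Y = X₁ - X₂ with the means and variances given in Example 7-6.

```python
from math import sqrt
from statistics import NormalDist

mu_y = 1.500 - 1.480       # mean clearance = 0.02
var_y = 0.0016 + 0.0009    # variances add for independent X1 and X2
Y = NormalDist(mu_y, sqrt(var_y))

# P(interference) = P(Y < 0)
print(Y.cdf(0.0))          # approximately 0.3446
```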


7-4 THE CENTRAL LIMIT THEOREM

If a random variable Y is the sum of n independent random variables that satisfy certain general conditions, then for sufficiently large n, Y is approximately normally distributed. We state this as a theorem; it is the most important theorem in all of probability and statistics.

Theorem 7-1 Central Limit Theorem

If X₁, X₂, ..., X_n is a sequence of n independent random variables with E(X_i) = μ_i and V(X_i) = σ_i² (both finite) and Y = X₁ + X₂ + ... + X_n, then under some general conditions

Z_n = (Y - ∑_{i=1}^{n} μ_i) / √(∑_{i=1}^{n} σ_i²) (7-15)

has an approximate N(0, 1) distribution as n approaches infinity. If F_n is the distribution function of Z_n, then

lim_{n→∞} F_n(z) = Φ(z). (7-16)

The "general conditions" mentioned in the theorem are informally summarized as follows. The terms X_i, taken individually, contribute a negligible amount to the variance of the sum, and it is not likely that a single term makes a large contribution to the sum.

The proof of this theorem, as well as a rigorous discussion of the necessary assumptions, is beyond the scope of this presentation. There are, however, several observations that should be made. The fact that Y is approximately normally distributed when the X_i terms may have essentially any distribution is the basic underlying reason for the importance of the normal distribution. In numerous applications, the random variable being considered may be represented as the sum of n independent random variables, some of which may be measurement error, some due to physical considerations, and so on, and thus the normal distribution provides a good approximation.
A special case of the central limit theorem arises when each of the components has the same distribution.

Theorem 7-2

If X₁, X₂, ..., X_n is a sequence of n independent, identically distributed random variables with E(X_i) = μ and V(X_i) = σ², and Y = X₁ + X₂ + ... + X_n, then

Z_n = (Y - nμ)/(σ√n) (7-17)

has an approximate N(0, 1) distribution in the same sense as equation 7-16.

Under the restriction that M_{X_i}(t) exists for real t, a straightforward proof may be presented for this form of the central limit theorem. Many mathematical statistics texts present such a proof.
The question immediately encountered in practice is the following: How large must n be to get reasonable results using the normal distribution to approximate the distribution of Y? This is not an easy question to answer, since the answer depends on the characteristics of the distribution of the X_i terms as well as the meaning of "reasonable results." From a practical standpoint, some very crude rules of thumb can be given where the distribution of the X_i terms falls into one of three arbitrarily selected groups, as follows:

1. Well behaved: The distribution of X_i does not radically depart from the normal distribution. There is a bell-shaped density that is nearly symmetric. For this case, practitioners in quality control and other areas of application have found that n should be at least 4. That is, n ≥ 4.
2. Reasonably behaved: The distribution of X_i has no prominent mode, and it appears much as a uniform density. In this case, n ≥ 12 is a commonly used rule.
3. Ill behaved: The distribution has most of its measure in the tails, as in Fig. 7-9. In this case it is most difficult to say; however, in many practical applications, n ≥ 100 should be satisfactory.

Example 7-7
Small parts are packaged 250 to the crate. Part weights are independent random variables with a mean of 0.5 pound and a standard deviation of 0.10 pound. Twenty crates are loaded to a pallet. Suppose we wish to find the probability that the parts on a pallet will exceed 2510 pounds in weight. (Neglect both pallet and crate weight.) Let

Y = X₁ + X₂ + ... + X₅₀₀₀

represent the total weight of the parts, so that

μ_Y = 5000(0.5) = 2500,
σ_Y² = 5000(0.01) = 50,

and

σ_Y = √50 = 7.071.

Then

P(Y > 2510) = P(Z > (2510 - 2500)/7.071) = 1 - Φ(1.41) = 0.08.

Note that we did not know the distribution of the individual part weights.
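A small simulation makes the point of Example 7-7 concrete. The part-weight distribution is unknown; purely as an illustrative assumption, the sketch below draws the 5000 weights from a uniform distribution chosen to have mean 0.5 and standard deviation 0.1, and estimates P(Y > 2510).

```python
import random
from math import sqrt

# Uniform on [0.5 - a, 0.5 + a] has standard deviation a / sqrt(3),
# so a = 0.1 * sqrt(3) matches the stated mean and standard deviation.
a = 0.1 * sqrt(3)
trials, exceed = 2000, 0
for _ in range(trials):
    total = sum(random.uniform(0.5 - a, 0.5 + a) for _ in range(5000))
    if total > 2510:
        exceed += 1
print(exceed / trials)  # should be near 1 - Phi(1.41), about 0.08
```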

Figure 7-9 Ill-behaved distributions.

Table 7-1 Activity Mean Times and Variances (in Weeks and Weeks²)

[Mean time and variance for each of the 16 critical-path activities; the means total 49 weeks and the variances total 16 weeks².]
Example 7-8
In a construction project, a network of major activities has been constructed to serve as the basis for planning and scheduling. On a critical path there are 16 activities. The means and variances are given in Table 7-1.

The activity times may be considered independent and the project time is the sum of the activity times on the critical path, that is, Y = X₁ + X₂ + ... + X₁₆, where Y is the project time and X_i is the time for the ith activity. Although the distributions of the X_i are unknown, the distributions are fairly well behaved. The contractor would like to know (a) the expected completion time, and (b) a project time corresponding to a probability of 0.90 of having the project completed. Calculating μ_Y and σ_Y², we obtain

μ_Y = 49 weeks,
σ_Y² = 16 weeks².

The expected completion time for the project is thus 49 weeks. In determining the time y₀ such that the probability is 0.9 of having the project completed by that time, Fig. 7-10 may be helpful. We may calculate

P(Y ≤ y₀) = 0.90,

or

P(Z ≤ (y₀ - 49)/4) = 0.90,

so that

(y₀ - 49)/4 = Φ⁻¹(0.90) = 1.282

and

y₀ = 49 + 1.282(4) = 54.128 weeks.

Figure 7-10 Distribution of project times.
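The 0.90 project time in Example 7-8 is a single inverse CDF evaluation. For instance, in Python:

```python
from statistics import NormalDist

Y = NormalDist(mu=49, sigma=4)  # project time, approximately N(49, 16)
print(Y.inv_cdf(0.90))          # y0, approximately 54.13 weeks
```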


7-5 THE NORMAL APPROXIMATION TO THE BINOMIAL DISTRIBUTION

In Chapter 5, the binomial approximation to the hypergeometric distribution was presented, as was the Poisson approximation to the binomial distribution. In this section we consider the normal approximation to the binomial distribution. Since the binomial is a discrete probability distribution, this may seem to go against intuition; however, a limiting process is involved, keeping p of the binomial distribution fixed and letting n → ∞. The approximation is known as the DeMoivre-Laplace approximation.

We recall the binomial distribution as

p(x) = [n!/(x!(n - x)!)] p^x q^{n-x}, x = 0, 1, 2, ..., n,
     = 0, otherwise.

Stirling's approximation to n! is

n! ≅ √(2π) n^{n+1/2} e^{-n}. (7-18)

The error

[n! - √(2π) n^{n+1/2} e^{-n}]/n! → 0 (7-19)

as n → ∞. Using Stirling's formula to approximate the terms involving n! in the binomial model, we eventually find that, for large n,

p(x) ≅ (1/√(2πnpq)) e^{-(1/2)[(x-np)/√(npq)]²}, (7-20)

so that

P(a ≤ X ≤ b) ≅ Φ((b - np)/√(npq)) - Φ((a - np)/√(npq)). (7-21)

This result makes sense in light of the central limit theorem and the fact that X is the sum of independent Bernoulli trials (so that E(X) = np and V(X) = npq). Thus, the quantity (X - np)/√(npq) approximately has a N(0, 1) distribution. If p is close to 1/2 and n > 10, the approximation is fairly good; however, for other values of p, the value of n must be larger. In general, experience indicates that the approximation is fairly good as long as np > 5 for p ≤ 1/2, or when nq > 5 when p > 1/2.
Example 7-9
In sampling from a production process that produces items of which 20% are defective, a random sample of 100 items is selected each hour of each production shift. The number of defectives in a sample is denoted X. To find, say, P(X ≤ 15) we may use the normal approximation as follows:

P(X ≤ 15) ≅ P(Z ≤ (15 - 100(0.2))/√(100(0.2)(0.8))) = P(Z ≤ -1.25) = Φ(-1.25) = 0.1056.

Since the binomial distribution is discrete and the normal distribution is continuous, it is common practice to use a half-interval correction or continuity correction. In fact, this is a necessity in calculating P(X = x). The usual procedure is to go a half-unit on either side of the integer, depending on the interval of interest. Several cases are shown in Table 7-2.


Table 7-2 Continuity Corrections

Quantity Desired from the Binomial Distribution | With Continuity Correction | In Terms of the Distribution Function Φ
P(X = x) | P(x - 1/2 ≤ X ≤ x + 1/2) | Φ((x + 1/2 - np)/√(npq)) - Φ((x - 1/2 - np)/√(npq))
P(X ≤ x) | P(X ≤ x + 1/2) | Φ((x + 1/2 - np)/√(npq))
P(X < x) = P(X ≤ x - 1) | P(X ≤ x - 1/2) | Φ((x - 1/2 - np)/√(npq))
P(X ≥ x) | P(X ≥ x - 1/2) | 1 - Φ((x - 1/2 - np)/√(npq))
P(a ≤ X ≤ b) | P(a - 1/2 ≤ X ≤ b + 1/2) | Φ((b + 1/2 - np)/√(npq)) - Φ((a - 1/2 - np)/√(npq))

Example 7-10
Using the data from Example 7-9, where we had n = 100 and p = 0.2, we evaluate P(X = 15), P(X ≤ 15), P(X < 18), P(X ≥ 22), and P(18 < X < 21).

P(X = 15) ≅ P(14.5 ≤ X ≤ 15.5) = Φ((15.5 - 20)/4) - Φ((14.5 - 20)/4)
= Φ(-1.125) - Φ(-1.375) = 0.046.

P(X ≤ 15) ≅ Φ((15.5 - 20)/4) = 0.130.

P(X < 18) = P(X ≤ 17) ≅ Φ((17.5 - 20)/4) = 0.266.

P(X ≥ 22) ≅ 1 - Φ((21.5 - 20)/4) = 0.354.

P(18 < X < 21) = P(19 ≤ X ≤ 20) ≅ Φ((20.5 - 20)/4) - Φ((18.5 - 20)/4)
= 0.550 - 0.354 = 0.196.
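Because the exact binomial probabilities are easy to compute for n = 100, the accuracy of these continuity-corrected approximations can be checked directly. A minimal sketch using only the standard library:

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 100, 0.2
mu, sigma = n * p, sqrt(n * p * (1 - p))  # 20 and 4
Z = NormalDist()

def binom_cdf(x):
    # exact P(X <= x) for X ~ binomial(n, p)
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

# P(X <= 15): exact value versus the continuity-corrected approximation
print(binom_cdf(15))               # approximately 0.129
print(Z.cdf((15.5 - mu) / sigma))  # approximately 0.130
```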


As discussed in Chapter 5, the random variable p̂ = X/n, where X has a binomial distribution with parameters p and n, is often of interest. Interest in this quantity stems primarily from sampling applications, where a random sample of n observations is made, with each observation classified success or failure, and where X is the number of successes in the sample. The quantity p̂ is simply the sample fraction of successes. Recall that we showed that

E(p̂) = p

and

V(p̂) = pq/n. (7-22)

In addition to the DeMoivre-Laplace approximation, note that the quantity

Z = (p̂ - p)/√(pq/n) (7-23)

has an approximate N(0, 1) distribution. This result has proved useful in many applications, including those in the areas of quality control, work measurement, reliability engineering, and economics. The results are much more useful than those from the law of large numbers.

Example 7-11
Instead of timing the activity of a maintenance mechanic over the period of a week to determine the fraction of his time spent in an activity classification called "secondary but necessary," a technician elects to use a work sampling study, randomly picking 400 time points over the week, taking a flash observation at each, and classifying the activity of the maintenance mechanic. The value X will represent the number of times the mechanic was involved in a "secondary but necessary" activity, and p̂ = X/400. If the true fraction of time that he is involved in this activity is 0.2, we determine the probability that p̂, the estimated fraction, falls between 0.15 and 0.25. That is,

P(0.15 ≤ p̂ ≤ 0.25) = Φ((0.25 - 0.2)/√(0.16/400)) - Φ((0.15 - 0.2)/√(0.16/400))
= Φ(2.5) - Φ(-2.5)
= 0.9876.
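The work-sampling calculation in Example 7-11 is equally direct in code:

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.2, 400
se = sqrt(p * (1 - p) / n)  # standard deviation of p-hat: sqrt(pq/n) = 0.02
Z = NormalDist()
print(Z.cdf((0.25 - p) / se) - Z.cdf((0.15 - p) / se))  # approximately 0.9876
```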

7-6 THE LOGNORMAL DISTRIBUTION

The lognormal distribution is the distribution of a random variable whose logarithm follows the normal distribution. Some practitioners hold that the lognormal distribution is as fundamental as the normal distribution. It arises from the combination of random terms by a multiplicative process.

The lognormal distribution has been applied in a wide variety of fields, including the physical sciences, life sciences, social sciences, and engineering. In engineering applications, the lognormal distribution has been used to describe "time to failure" in reliability engineering and "time to repair" in maintainability engineering.


7-6.1 Density Function

We consider a random variable X with range space R_X = {x: 0 < x < ∞}, where Y = ln X is normally distributed with mean μ_Y and variance σ_Y²; that is,

E(Y) = μ_Y and V(Y) = σ_Y².

The density function of X is

f(x) = (1/(xσ_Y√(2π))) e^{-(1/2)[(ln x - μ_Y)/σ_Y]²}, x > 0,
     = 0, otherwise. (7-24)

The lognormal distribution is shown in Fig. 7-11. Notice that, in general, the distribution is skewed, with a long tail to the right. The Excel functions LOGNORMDIST and LOGINV provide the CDF and inverse CDF, respectively, of the lognormal distribution.

7-6.2 Mean and Variance of the Lognormal Distribution

The mean and variance of the lognormal distribution are

E(X) = μ_X = e^{μ_Y + (1/2)σ_Y²} (7-25)

and

V(X) = σ_X² = e^{2μ_Y + σ_Y²}(e^{σ_Y²} - 1). (7-26)

In some applications of the lognormal distribution, it is important to know the values of the median and the mode. The median, which is the value x̃ such that P(X ≤ x̃) = 0.5, is

x̃ = e^{μ_Y}. (7-27)

The mode is the value of x for which f(x) is maximum, and for the lognormal distribution, the mode is

x_mode = e^{μ_Y - σ_Y²}. (7-28)

Figure 7-11 shows the relative location of the mean, median, and mode for the lognormal distribution. Since the distribution has a right skew, generally we will find that mode < median < mean.

Figure 7-11 The lognormal distribution.


7-6.3 Properties of the Lognormal Distribution

While the normal distribution has additive reproductive properties, the lognormal distribution has multiplicative reproductive properties. Some of the more important properties are as follows:

1. If X has a lognormal distribution with parameters μ_Y and σ_Y², and if a, b, and d are constants such that b = e^d, then W = bX^a has a lognormal distribution with parameters (d + aμ_Y) and (aσ_Y)².

2. If X₁ and X₂ are independent lognormal variables, with parameters (μ_Y1, σ_Y1²) and (μ_Y2, σ_Y2²) respectively, then W = X₁·X₂ has a lognormal distribution with parameters [(μ_Y1 + μ_Y2), (σ_Y1² + σ_Y2²)].

3. If X₁, X₂, ..., X_n is a sequence of n independent lognormal variates, with parameters (μ_Yj, σ_Yj²), j = 1, 2, ..., n, respectively, and {a_j} is a sequence of constants while b = e^d is a single constant, then the product

W = b ∏_{j=1}^{n} X_j^{a_j}

has a lognormal distribution with parameters

d + ∑_{j=1}^{n} a_j μ_Yj and ∑_{j=1}^{n} a_j² σ_Yj².

4. If X₁, X₂, ..., X_n are independent lognormal variates, each with the same parameters (μ_Y, σ_Y²), then the geometric mean

(∏_{j=1}^{n} X_j)^{1/n}

has a lognormal distribution with parameters μ_Y and σ_Y²/n.

Example 7-12
The random variable Y = ln X has a N(10, 4) distribution, so X has a lognormal distribution with a mean and variance of

E(X) = e^{10 + (1/2)(4)} = e^{12} ≅ 162,754.79

and

V(X) = e^{2(10)+4}(e^4 - 1) = e^{24}(e^4 - 1) ≅ 53.598e^{24},

respectively. The mode and median are

mode = e^{10-4} = e^6 ≅ 403.43

and

median = e^{10} ≅ 22,026.47.

In order to determine a specific probability, say P(X ≤ 1000), we use the transform P(ln X ≤ ln 1000) = P(Y ≤ ln 1000):

P(Y ≤ ln 1000) = P(Z ≤ (ln 1000 - 10)/2) = Φ(-1.546) = 0.0611.
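The lognormal quantities in Example 7-12 follow directly from equations 7-25 through 7-28; a sketch:

```python
from math import exp, log, sqrt
from statistics import NormalDist

mu_y, var_y = 10.0, 4.0                              # Y = ln X ~ N(10, 4)
mean = exp(mu_y + var_y / 2)                         # about 162,754.79
variance = exp(2 * mu_y + var_y) * (exp(var_y) - 1)
median = exp(mu_y)                                   # about 22,026.47
mode = exp(mu_y - var_y)                             # about 403.43
print(mean, variance, median, mode)

# P(X <= 1000) = P(Y <= ln 1000)
print(NormalDist(mu_y, sqrt(var_y)).cdf(log(1000)))  # approximately 0.0611
```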


Example 7-13
Suppose

Y₁ = ln X₁ ~ N(4, 1),
Y₂ = ln X₂ ~ N(3, 0.5),
Y₃ = ln X₃ ~ N(2, 0.4),
Y₄ = ln X₄ ~ N(1, 0.01),

and furthermore suppose X₁, X₂, X₃, and X₄ are independent random variables. The random variable W defined as follows represents a critical performance variable on a telemetry system:

W = e^{1.5}[X₁^{2.5} X₂^{0.2} X₃^{0.7} X₄^{3.1}].

By reproductive property 3, W will have a lognormal distribution with parameters

1.5 + (2.5·4 + 0.2·3 + 0.7·2 + 3.1·1) = 16.6

and

(2.5)²·1 + (0.2)²·0.5 + (0.7)²·0.4 + (3.1)²·(0.01) = 6.562,

respectively. That is to say, ln W ~ N(16.6, 6.562). If the specifications on W are, say, 20,000 to 600,000, we could determine the probability that W would fall within specifications as follows:

P(20,000 ≤ W ≤ 600,000) = P(ln 20,000 ≤ ln W ≤ ln 600,000)
= Φ((ln 600,000 - 16.6)/√6.562) - Φ((ln 20,000 - 16.6)/√6.562)
= Φ(-1.290) - Φ(-2.621) = 0.0985 - 0.0044
= 0.0941.
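The same transform handles Example 7-13 numerically, since ln W ~ N(16.6, 6.562):

```python
from math import log, sqrt
from statistics import NormalDist

lnW = NormalDist(16.6, sqrt(6.562))  # ln W, by reproductive property 3
print(lnW.cdf(log(600_000)) - lnW.cdf(log(20_000)))  # approximately 0.094
```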

7-7 THE BIVARIATE NORMAL DISTRIBUTION

Up to this point, all of the continuous random variables have been of one dimension. A very important two-dimensional probability law that is a generalization of the one-dimensional normal probability law is called the bivariate normal distribution. If [X₁, X₂] is a bivariate normal random vector, then the joint density function of [X₁, X₂] is

f(x₁, x₂) = (1/(2πσ₁σ₂√(1 - ρ²))) exp{-(1/(2(1 - ρ²)))[((x₁ - μ₁)/σ₁)² - 2ρ((x₁ - μ₁)/σ₁)((x₂ - μ₂)/σ₂) + ((x₂ - μ₂)/σ₂)²]} (7-29)

for -∞ < x₁ < ∞ and -∞ < x₂ < ∞.

The joint probability P(a₁ ≤ X₁ ≤ b₁, a₂ ≤ X₂ ≤ b₂) is defined as

P(a₁ ≤ X₁ ≤ b₁, a₂ ≤ X₂ ≤ b₂) = ∫_{a₂}^{b₂} ∫_{a₁}^{b₁} f(x₁, x₂) dx₁ dx₂ (7-30)

and is represented by the volume under the surface and over the region {(x₁, x₂): a₁ ≤ x₁ ≤ b₁, a₂ ≤ x₂ ≤ b₂}, as shown in Fig. 7-12. Owen (1962) has provided a table of probabilities.

The bivariate normal density has five parameters. These are μ₁, μ₂, σ₁, σ₂, and ρ, the correlation coefficient between X₁ and X₂, such that -∞ < μ₁ < ∞, -∞ < μ₂ < ∞, σ₁ > 0, σ₂ > 0, and -1 < ρ < 1.

Figure 7-12 The bivariate normal density.

The marginal densities f₁ and f₂ are given, respectively, as

f₁(x₁) = ∫_{-∞}^{∞} f(x₁, x₂) dx₂ = (1/(σ₁√(2π))) e^{-(1/2)[(x₁ - μ₁)/σ₁]²} (7-31)

for -∞ < x₁ < ∞, and

f₂(x₂) = ∫_{-∞}^{∞} f(x₁, x₂) dx₁ = (1/(σ₂√(2π))) e^{-(1/2)[(x₂ - μ₂)/σ₂]²} (7-32)

for -∞ < x₂ < ∞.

We note that these marginal densities are normal; that is,

X₁ ~ N(μ₁, σ₁²) (7-33)

and

X₂ ~ N(μ₂, σ₂²),

so that

E(X₁) = μ₁, E(X₂) = μ₂, V(X₁) = σ₁², V(X₂) = σ₂². (7-34)

The correlation coefficient ρ is the ratio of the covariance to [σ₁·σ₂]. The covariance is

Cov(X₁, X₂) = ∫_{-∞}^{∞} ∫_{-∞}^{∞} (x₁ - μ₁)(x₂ - μ₂) f(x₁, x₂) dx₁ dx₂.

Thus

ρ = Cov(X₁, X₂)/(σ₁σ₂). (7-35)


The conditional distributions f_{X2|x1}(x₂) and f_{X1|x2}(x₁) are also important. These conditional densities are normal, as shown here:

f_{X2|x1}(x₂) = (1/(σ₂√(2π(1 - ρ²)))) exp{-(1/2)[(x₂ - μ₂ - ρ(σ₂/σ₁)(x₁ - μ₁))/(σ₂√(1 - ρ²))]²} (7-36)

for -∞ < x₂ < ∞, and

f_{X1|x2}(x₁) = (1/(σ₁√(2π(1 - ρ²)))) exp{-(1/2)[(x₁ - μ₁ - ρ(σ₁/σ₂)(x₂ - μ₂))/(σ₁√(1 - ρ²))]²} (7-37)

for -∞ < x₁ < ∞. Figure 7-13 illustrates some of these conditional densities.

We first consider the distribution f_{X2|x1}. The mean and variance are

E(X₂|x₁) = μ₂ + ρ(σ₂/σ₁)(x₁ - μ₁) (7-38)

Figure 7-13 Some typical conditional distributions. (a) Some example conditional distributions of X₂ for a few values of x₁. (b) Some example conditional distributions of X₁ for a few values of x₂.


and

V(X₂|x₁) = σ₂²(1 - ρ²). (7-39)

Furthermore, X₂|x₁ is normal; that is,

X₂|x₁ ~ N[μ₂ + ρ(σ₂/σ₁)(x₁ - μ₁), σ₂²(1 - ρ²)]. (7-40)

The locus of expected values of X₂ for x₁, as shown in equation 7-38, is called the regression of X₂ on X₁, and it is linear. Also, the variance in the conditional distributions is constant for all x₁.

In the case of the distribution f_{X1|x2}, the results are similar. That is,

E(X₁|x₂) = μ₁ + ρ(σ₁/σ₂)(x₂ - μ₂), (7-41)

V(X₁|x₂) = σ₁²(1 - ρ²), (7-42)

and

X₁|x₂ ~ N[μ₁ + ρ(σ₁/σ₂)(x₂ - μ₂), σ₁²(1 - ρ²)]. (7-43)

In the bivariate normal distribution we observe that if ρ = 0, the joint density may be factored into the product of the marginal densities, and so X₁ and X₂ are independent. Thus, for a bivariate normal density, zero correlation and independence are equivalent. If planes parallel to the x₁, x₂ plane are passed through the surface shown in Fig. 7-12, the contours cut from the bivariate normal surface are ellipses. The student may wish to show this property.

Example 7-14
In an attempt to substitute a nondestructive testing procedure for a destructive test, an extensive study was made of shear strength, X₂, and weld diameter, X₁, of spot welds, with the following findings.

1. [X₁, X₂] ~ bivariate normal.
2. μ₁ = 0.20 inch, μ₂ = 1100 pounds, σ₁² = 0.02 inch², σ₂² = 525 pounds², and ρ = 0.9.

The regression of X₂ on X₁ is thus

E(X₂|x₁) = μ₂ + ρ(σ₂/σ₁)(x₁ - μ₁)
= 1100 + 0.9(√(525/0.02))(x₁ - 0.2)
= 145.8x₁ + 1070.84,

and the variance is

V(X₂|x₁) = σ₂²(1 - ρ²) = 525(0.19) = 99.75.

In studying these results, the manager of manufacturing notes that since ρ = 0.9, that is, close to 1, weld diameter is highly correlated with shear strength. The specification on shear strength calls for a value greater than 1080. If a weld has a diameter of 0.18, he asks: "What is the probability that the strength specification will be met?" The process engineer notes that E(X₂|0.18) = 1097.08; therefore,

P(X₂ ≥ 1080) = P(Z > (1080 - 1097.08)/√99.75) = 1 - Φ(-1.71) = 0.9564,

and he recommends a policy such that if the weld diameter is not less than 0.18, the weld will be classified as satisfactory.
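The conditional-normal computation in Example 7-14 can be scripted as follows; the parameter values are those assumed in the example.

```python
from math import sqrt
from statistics import NormalDist

mu1, mu2 = 0.20, 1100.0     # weld diameter (inches), shear strength (pounds)
var1, var2 = 0.02, 525.0
rho = 0.9

x1 = 0.18
cond_mean = mu2 + rho * sqrt(var2 / var1) * (x1 - mu1)  # about 1097.08
cond_var = var2 * (1 - rho**2)                          # 99.75

# P(X2 >= 1080 | X1 = 0.18)
X2_given_x1 = NormalDist(cond_mean, sqrt(cond_var))
print(1 - X2_given_x1.cdf(1080))                        # approximately 0.956
```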


Example 7-15
In developing an admissions policy for a large university, the office of student testing and evaluation has noted that X₁, the combined score on the college board examinations, and X₂, the student grade point average at the end of the freshman year, have a bivariate normal distribution. A grade point of 4.0 corresponds to A. A study indicates that

μ₁ = 1300, μ₂ = 2.3, σ₁² = 6400, σ₂² = 0.25, ρ = 0.6.

Any student with a grade point average less than 1.5 is automatically dropped at the end of the freshman year; however, an average of 2.0 is considered to be satisfactory.

An applicant takes the college board exams, receives a combined score of 900, and is not accepted. An irate parent argues that the student will do satisfactory work and, specifically, will have better than a 2.0 grade point average at the end of the freshman year. Considering only the probabilistic aspects of the problem, the director of admissions wants to determine P(X₂ ≥ 2.0|x₁ = 900).

Noting that

E(X₂|900) = 2.3 + (0.6)(0.5/80)(900 - 1300) = 0.8

and

V(X₂|900) = 0.25(1 - 0.36) = 0.16,

the director calculates

P(X₂ ≥ 2.0|900) = 1 - Φ((2.0 - 0.8)/0.4) = 1 - Φ(3.0) = 0.0013,

which predicts only a very slim chance of the parent's claim being valid.

7-8 GENERATION OF NORMAL REALIZATIONS

We will consider both direct and approximate methods for generating realizations of a standard normal variable Z, where Z ~ N(0, 1). Recall that X = μ + σZ, so realizations of X ~ N(μ, σ²) are easily obtained as x = μ + σz.

The direct method calls for generating uniform [0, 1] random number realizations in pairs: u₁ and u₂. Then, using the methods of Chapter 4, it turns out that

z₁ = (-2 ln u₁)^{1/2} cos(2πu₂),
z₂ = (-2 ln u₁)^{1/2} sin(2πu₂) (7-44)

are realizations of independent N(0, 1) variables. The values x = μ + σz follow directly, and the process is repeated until the desired number of realizations of X are obtained.

An approximate method that makes use of the Central Limit Theorem is as follows:

z = ∑_{i=1}^{12} u_i - 6. (7-45)

With this procedure, we would begin by generating 12 uniform [0, 1] random number realizations, adding them, and subtracting 6. This entire process is repeated until the desired number of realizations is obtained.

Although the direct method is exact and usually preferable, approximate values are sometimes acceptable.
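Both generation schemes are easy to implement. The sketch below codes the direct method of equation 7-44 (often called the Box-Muller method) and the approximate central-limit method of equation 7-45, using Python's uniform generator.

```python
import math
import random

def standard_normal_pair(u1, u2):
    """Direct method (equation 7-44): two uniform (0, 1) realizations
    yield two independent N(0, 1) realizations."""
    r = math.sqrt(-2.0 * math.log(u1))
    return r * math.cos(2 * math.pi * u2), r * math.sin(2 * math.pi * u2)

def approx_standard_normal():
    """Approximate method (equation 7-45): sum of 12 uniforms minus 6."""
    return sum(random.random() for _ in range(12)) - 6.0

# Realizations of X ~ N(mu, sigma^2) follow as x = mu + sigma * z.
mu, sigma = 100.0, 2.0
u1 = 1.0 - random.random()   # shift to (0, 1] so log(u1) is defined
z1, z2 = standard_normal_pair(u1, random.random())
print(mu + sigma * z1, mu + sigma * z2)       # direct method
print(mu + sigma * approx_standard_normal())  # approximate method
```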

7-9 SUMMARY

This chapter has presented the normal distribution with a number of example applications. The normal distribution, the related standard normal, and the lognormal distributions are univariate, while the bivariate normal gives the joint density of two related normal random variables.

The normal distribution forms the basis on which a great deal of the work in statistical inference rests. The wide application of the normal distribution makes it particularly important.

7-10 EXERCISES

7-1. Let Z be a standard normal random variable and calculate the following probabilities, using sketches where appropriate:
(a) P(0 ≤ Z ≤ 2).
(b) P(-1 ≤ Z ≤ +1).
(c) P(Z ≤ 1.65).
(d) P(Z ≥ -1.96).
(e) P(|Z| > 1.5).
(f) P(-1.9 ≤ Z ≤ 2).
(g) P(Z ≤ 1.37).
(h) P(|Z| ≤ 2.57).

7-2. Let X ~ N(10, 9). Find P(X ≥ 8), P(X ≤ 12), and P(2 ≤ X ≤ 10).

7-3. In each part below, find the value of c that makes the probability statement true.
(a) Φ(c) = 0.94062.
(b) P(|Z| ≤ c) = 0.95.
(c) P(|Z| ≤ c) = 0.99.
(d) P(Z ≥ c) = 0.05.

7-4. If P(Z ≥ z_α) = α, determine z_α for α = 0.025, α = 0.05, and α = 0.0014.

7-5. If X ~ N(80, 10²), compute the following:
(a) P(X ≤ 100).
(b) P(X ≤ 80).
(c) P(75 ≤ X ≤ 100).
(d) P(75 ≤ X).
(e) P(|X - 80| ≤ 19.6).

7-6. The life of a particular type of dry-cell battery is normally distributed with a mean of 600 days and a standard deviation of 60 days. What fraction of these batteries would be expected to survive beyond 680 days? What fraction would be expected to fail before 560 days?

7-7. The personnel manager of a large company requires job applicants to take a certain test and achieve a score of 500. If the test scores are normally distributed with a mean of 485 and a standard deviation of 30, what percentage of the applicants pass the test?

7-8. Experience indicates that the development time for a photographic printing paper is distributed as N(30 seconds, 1.21 seconds²). Find:
(a) The probability that X is at least … .
(b) The probability that X is at most 31 seconds.
(c) The probability that X differs from its expected value by more than 2 seconds.

7-9. A certain type of light bulb has an output known to be normally distributed with a mean of 2500 footcandles and a standard deviation of 75 footcandles. Determine a lower specification limit such that only 5% of the manufactured bulbs will be defective.

7-10. Show that the moment-generating function for the normal distribution is as given by equation 7-5. Use it to generate the mean and variance.

7-11. If X ~ N(μ, σ²), show that Y = aX + b, where a and b are real constants, is also normally distributed. Use the methods outlined in Chapter 3.

7-12. The inside diameter of a piston ring is normally distributed with a mean of 12 cm and a standard deviation of 0.02 cm.
(a) What fraction of the piston rings will have diameters exceeding 12.05 cm?
(b) What inside diameter value c has a probability of 0.90 of being exceeded?
(c) What is the probability that the inside diameter falls between 11.95 and 12.05?

7-13. A plant manager orders a process shutdown and setting readjustment whenever the pH of the final product falls above 7.20 or below 6.80. The sample pH is normally distributed with unknown μ and a standard deviation of σ = 0.10. Determine the following probabilities:
(a) Of readjusting when the process is operating as intended with μ = 7.0.
(b) Of readjusting when the process is slightly off target with the mean pH 7.05.
(c) Of failing to readjust when the process is too alkaline and the mean pH is μ = 7.25.
(d) Of failing to readjust when the process is too acidic and the mean pH is μ = 6.75.

7-14. The price being asked for a certain security is distributed normally with a mean of $50.00 and a standard deviation of $5.00. Buyers are willing to pay an amount that is also normally distributed with a mean of $45.00 and a standard deviation of $2.50. What is the probability that a transaction will take place?

7-15. The specifications for a capacitor are that its life must be between 1000 and 5000 hours. The life is known to be normally distributed with a mean of 3000 hours. The revenue realized from each capacitor is $9.00; however, a failed unit must be replaced at a cost of $3.00 to the company. Two manufacturing processes can produce capacitors having satisfactory mean lives. The standard deviation for process A is 1000 hours and for process B it is 500 hours. However, process A manufacturing costs are only half those for B. What value of process manufacturing cost is critical, so far as dictating the use of process A or B?

7-16. The diameter of a ball bearing is a normally distributed random variable with mean μ and a standard deviation of 1. Specifications for the diameter are 6 ≤ X ≤ 8, and a ball bearing within these limits yields a profit of C dollars. However, if X < 6 the profit is -R₁ dollars, while if X > 8 the profit is -R₂ dollars. Find the value of μ that maximizes the expected profit.

7-17. In the preceding exercise, find the optimum value of μ if R₁ = R₂ = R.

7-18. Use the results of Exercise 7-16 with C = $8.00, R₁ = $2.00, and R₂ = $4.00. What is the value of μ that maximizes the expected profit?

7-19. The Rockwell hardness of a particular alloy is normally distributed with a mean of 70 and a standard deviation of 4.
(a) If a specimen is acceptable only if its hardness is between 62 and 72, what is the probability that a randomly chosen specimen has an acceptable hardness?
(b) If the acceptable range of hardness was (70 - c, 70 + c), for what value of c would 95% of all specimens have acceptable hardness?
(c) If the acceptable range is as in (a) and the hardness of each of nine randomly selected specimens is independently determined, what is the expected number of acceptable specimens among the nine specimens?

7-20. Prove that E(Z_n) = 0 and V(Z_n) = 1, where Z_n is as defined in Theorem 7-2.

7-21. Let X_i (i = 1, 2, ..., n) be independent and identically distributed random variables with mean μ and variance σ². Consider the sample mean,

X̄ = (1/n) ∑_{i=1}^{n} X_i.

Show that E(X̄) = μ and V(X̄) = σ²/n.

7-22. A shaft with an outside diameter (O.D.) ~ N(1.20, 0.0016) is inserted into a sleeve bearing having an inside diameter (I.D.) that is N(1.25, 0.0009). Determine the probability of interference.

7-23. An assembly consists of three components placed side by side. The length of each component is normally distributed with a mean of 2 inches and a standard deviation of 0.2 inch. Specifications require that all assemblies be between 5.7 and 6.3 inches long. How many assemblies will pass these requirements?

7-24. Find the mean and variance of the linear combination

Y = X₁ + 2X₂ + X₃ + X₄,

where X₁ ~ N(4, 3), X₂ ~ N(4, 4), X₃ ~ N(1, 4), and X₄ ~ N(3, 2). What is the probability that 15 ≤ Y ≤ 20?

7-25. A round-off error has a uniform distribution on [-0.5, +0.5] and round-off errors are independent. A sum of 50 numbers is calculated where each is rounded before adding. What is the probability that the total round-off error exceeds 5?

7-26. One hundred small bolts are packed in a box. Each bolt weighs 1 ounce, with a standard deviation of 0.01 ounce. Find the probability that a box weighs more than 102 ounces.

7-27. An automatic machine is used to fill boxes with soap powder. Specifications require that the boxes weigh between 11.8 and 12.2 ounces. The only data available about machine performance concern the average content of groups of nine boxes. It is known that the average content is 11.9 ounces with a standard deviation of 0.05 ounce. What fraction of the boxes produced is defective? Where should the mean be located in order to minimize this fraction defective? Assume the weight is normally distributed.

7-28. A bus travels between two cities, but visits six intermediate cities on the route. The means and standard deviations of the travel times are as follows:

[Table of mean travel times and standard deviations, in hours, for the seven legs 1-2 through 7-8 of the route.]

What is the probability that the bus completes its journey within 32 hours?

7-29. A production process produces items, of which …% are defective. A random sample of … items is selected every day and the number of defective items, say X, is counted. Using the normal approximation to the binomial, find the following:
(a) P(X ≤ 16).
(b) P(X = 15).
(c) P(X ≤ 20).
(d) P(X ≥ 14).

7-30. In a work-sampling study it is often desired to find the necessary number of observations. Given that p = 0.1, find the necessary n such that P(0.05 ≤ p̂ ≤ 0.15) = 0.95.

7-31. Use random numbers generated from your favorite computer package, or from scaling the random integers in Table XV of the Appendix by multiplying by 10⁻⁵, to generate six realizations of a N(100, 4) variable, using the following:
(a) The direct method.
(b) The approximate method.

7-32. Consider a linear combination Y = 3X₁ - 2X₂, where X₁ is N(10, 3) and X₂ is uniformly distributed on [0, 1]. Generate six realizations of the random variable Y, where X₁ and X₂ are independent.

7-33. If Z ~ N(0, 1), generate five realizations of Z².

7-34. If Y = ln X and Y ~ N(μ_Y, σ_Y²), develop a procedure for generating realizations of X.

7-35. If Y = X₁/X₂, where X₁ ~ N(μ₁, σ₁²) and X₂ ~ N(μ₂, σ₂²), and where X₁ and X₂ are independent, develop a generator for producing realizations of Y.

7-36. The brightness of light bulbs is normally distributed with a mean of 2500 footcandles and a standard deviation of 50 footcandles. The bulbs are tested and all those brighter than 2600 footcandles are placed in a special high-quality lot. What is the probability distribution of the remaining bulbs? What is their expected brightness?

7-37. The random variable Y = ln X has a N(50, 25) distribution. Find the mean, variance, mode, and median of X.

7-38. Suppose independent random variables Y₁, Y₂, Y₃ are such that

Y₁ = ln X₁ ~ N(4, 1),
Y₂ = ln X₂ ~ N(3, 1),
Y₃ = ln X₃ ~ N(2, 0.5).

Find the mean and variance of W = e²X₁^{…}X₂^{…}X₃^{…}. Determine a set of specifications L and R such that P(L ≤ W ≤ R) = 0.90.

7-39. Show that the density function for a lognormally distributed random variable X is given by equation 7-24.

7-40. Consider the bivariate normal density

f(x₁, x₂) = …, -∞ < x₁ < ∞, -∞ < x₂ < ∞,

where the constant is chosen so that f is a probability distribution. Are the random variables X₁ and X₂ independent? Define two new random variables

Y₁ = (1/(1 - ρ²)^{1/2})(X₁/σ₁ - ρX₂/σ₂),
Y₂ = X₂/σ₂.

Show that the two new random variables are independent.

7-41. The life of a tube (X₁) and the filament diameter (X₂) are distributed as a bivariate normal random variable with the parameters μ₁ = 2000 hours, μ₂ = 0.10 inch, σ₁² = 2500 hours², σ₂² = 0.01 inch², and ρ = 0.87. The quality-control manager wishes to determine the life of each tube by measuring the filament diameter. If a filament diameter is 0.098, what is the probability that the tube will last 1950 hours?

7-42. A college professor has noticed that grades on each of two quizzes have a bivariate normal distribution with the parameters μ₁ = 75, μ₂ = 83, σ₁² = 25, σ₂² = 16, and ρ = 0.8. If a student receives a grade of 80 on the first quiz, what is the probability that she will do better on the second one? How is the answer affected by making ρ = -0.8?

7-43. Consider the surface y = f(x₁, x₂), where f is the bivariate normal density function.
(a) Prove that y = constant cuts the surface in an ellipse.
(b) Prove that y = constant with ρ = 0 and σ₁² = σ₂² cuts the surface as a circle.

7-44. Let X₁ and X₂ be independent random variables, each following a normal density with a mean of zero and variance σ². Find the distribution of

C = X₁/X₂.

We say that C follows the Cauchy distribution. Try to compute E(C).

7-45. Using a method similar to that in Exercise 7-44, obtain the distribution of

R = √(X₁² + X₂²),

where X_i ~ N(0, σ²) and the X_i are independent. The resulting distribution is known as the Rayleigh distribution and is frequently used to model the distribution of radial error in a plane. Hint: Let X₁ = R cos θ and X₂ = R sin θ. Obtain the joint probability distribution of R and θ, and then integrate out θ.

7-46. Let the independent random variables X_i ~ N(0, σ²) for i = 1, 2. Find the probability distribution of … .

7-47. If X ~ N(0, 1), find the probability distribution of Y = X². Y is said to follow the chi-square distribution with one degree of freedom. It is an important distribution in statistical methodology.

7-48. Let the independent random variables X_i ~ N(0, 1) for i = 1, 2, ..., n. Show that the probability distribution of

Y = ∑_{i=1}^{n} X_i²

follows a chi-square distribution with n degrees of freedom.

7-49. Let X ~ N(0, 1). Define a new random variable Y = |X|. Then find the probability distribution of Y. This is often called the half-normal distribution.

Chapter 8

Introduction to Statistics
and Data Description

8-1 THE FIELD OF STATISTICS
Statistics deals with the collection, presentation, analysis, and use of data to solve problems, make decisions, develop estimates, and design and develop both products and procedures. An understanding of basic statistics and statistical methods would be useful to anyone in this information age; however, since engineers, scientists, and those working in management science are routinely engaged with data, a knowledge of statistics and basic statistical methods is particularly vital. In this intensely competitive, high-tech world economy of the first decade of the twenty-first century, the expectations of consumers regarding product quality, performance, and reliability have increased significantly over expectations of the recent past. Furthermore, we have come to expect high levels of performance from logistical systems at all levels, and the operation as well as the refinement of these systems is largely dependent on the collection and use of data. While those who are employed in various "service industries" deal with somewhat different problems, at a basic level they, too, are collecting data to be used in solving problems and improving the "service" so as to become more competitive in attracting market share.

Statistical methods are used to present, describe, and understand variability. In observing a variable value or several variable values repeatedly, where these values are assigned to units by a process, we note that these repeated observations tend to yield different results. While this chapter will deal with data presentation and description issues, the following chapters will utilize the probability concepts developed in prior chapters to model and develop an understanding of variability and to utilize this understanding in presenting inferential statistics topics and methods.

Virtually all real-world processes exhibit variability. For example, consider situations where we select several castings from a manufacturing process and measure a critical dimension (such as a vane opening) on each part. If the measuring instrument has sufficient resolution, the vane openings will be different (there will be variability in the dimension). Alternatively, if we count the number of defects on printed circuit boards, we will find variability in the counts, as some boards will have few defects and others will have many defects. This variability extends to all environments. There is variability in the thickness of oxide coatings on silicon wafers, the hourly yield of a chemical process, the number of errors on purchase orders, the flow time required to assemble an aircraft engine, and the therms of natural gas billed to the residential customers of a distributing utility in a given month.


Almost all experimental activity reflects similar variability in the data observed, and in Chapter 12 we will employ statistical methods not only to analyze experimental data but also to construct effective experimental designs for the study of processes.

Why does variability occur? Generally, variability is the result of changes in the conditions under which observations are made. In a manufacturing context, these changes may be differences in specimens of material, differences in the way people do the work, differences in process variables, such as temperature, pressure, or holding time, and differences in environmental factors, such as relative humidity. Variability also occurs because of the measurement system. For example, the measurement obtained from a scale may depend on where the test item is placed on the pan. The process of selecting units for observation may also cause variability. For example, suppose that a lot of 1000 integrated circuit chips has exactly 100 defective chips. If we inspected all 1000 chips, and if our inspection process were perfect (no inspection or measurement error), we would find all 100 defective chips. However, suppose that we select 50 chips. Now some of the chips will likely be defective, and we would expect the sample to be about 10% defective, but it could be 0% or 2% or 12% defective, depending on the specific chips selected.
The field of statistics consists of methods for describing and modeling variability, and for making decisions when variability is present. In inferential statistics, we usually want to make a decision about some population. The term population refers to the collection of measurements on all elements of a universe about which we wish to draw conclusions or make decisions. In this text, we make a distinction between the universe and the populations, in that the universe is composed of the set of elementary units, or simply units, while a population is the set of numerical or categorical variable values for one variable associated with each of the universe units. Obviously, there may be several populations associated with a given universe. An example is the universe consisting of the residential-class customers of an electric power company whose accounts are active during part of the month of August 2003. Example populations might be the set of energy consumption (kilowatt-hour) values billed to these customers in the August 2003 bill, the set of customer demands (kilowatts) at the instant the company experiences the August peak demand, and the set made up of dwelling category, such as single-family unattached, apartment, mobile home, etc.

Another example is a universe which consists of all the power supplies for a personal computer manufactured by an electronics company during a given period. Suppose that the manufacturer is interested specifically in the output voltage of each power supply. We may think of the output voltage levels in the power supplies as such a population. In this case, each population value is a numerical measurement, such as 5.10 or 5.24. The data in this case would be referred to as measurement data. On the other hand, the manufacturer may be interested in whether or not each power supply produces an output voltage that conforms to the requirements. We may then visualize the population as consisting of attribute data, in which each power supply is assigned a value of one if the unit is nonconforming and a value of zero if it conforms to requirements. Both measurement data and attribute data are called numerical data. Furthermore, it is convenient to consider measurement data as being either continuous data or discrete data, depending on the nature of the process assigning values to unit variables. Yet another type of data is called categorical data. Examples are gender, day of the week when the observation is taken, make of automobile, etc. Finally, we have unit-identifying data, which are alphanumeric and used to identify the universe and sample units. These data might neither exist nor have statistical interpretation; however, in some situations they are essential identifiers for universe and sample units. Examples would be social security numbers, account numbers in a bank, VIN numbers for automobiles, and serial numbers of cardiac pacemakers. In this book we will present techniques for dealing with both measurement and attribute data; however, categorical data are also considered.


In most applications of statistics, the available data result from a sample of units selected from a universe of interest, and these data reflect measurement or classification of one or more variables associated with the sampled units. The sample is thus a subset of the units, and the measurement or classification values of these units are subsets of the respective universe populations.

Figure 8-1 presents an overview of the data acquisition activity. It is convenient to think of a process that produces units and assigns values to the variables associated with the units. An example would be the manufacturing process for the power supplies. The power supplies in this case are the units, and the output voltage values (perhaps along with other variable values) may be thought of as being assigned by the process. Oftentimes, a probabilistic model or a model with some probabilistic components is assumed to represent the value assignment process. As indicated earlier, the set of units is referred to as the universe, and this set may have a finite or an infinite membership of units. Furthermore, in some cases this set exists only in concept; however, we can describe the elements of the set without enumerating them, as was the case with the sample space, S, associated with a random experiment presented in Chapter 1. The set of values assigned or that may be assigned to a specific variable becomes the population for that variable.
Now, we illustrate with a few additional examples that reflect several universe structures and different aspects of observing processes and population data. First, continuing with the power supplies, consider the production from one specific shift that consists of 300 units, each having some voltage output. To begin, consider a sample of 10 units selected from these 300 units and tested with voltage measurements, as previously described for sample units. The universe here is finite, and the set of voltage values for these universe units (the population) is thus finite. If our interest is simply to describe the sample results, we have done that by enumeration of the results, and both graphical and quantitative methods for further description are given in the following sections of this chapter.

Figure 8-1 From process to data. (Diagram: process, unit generation, sample units, observation, data.)


On the other hand, if we wish to employ the sample results to draw statistical inference about the population consisting of 300 voltage values, this is called an enumerative study, and careful attention must be given to the method of sample selection if valid inference is to be made about the population. The key to much of what may be accomplished in such a study involves either simple or more complex forms of random sampling or at least probability sampling. While these concepts will be developed further in the next chapter, it is noted at this point that the application of simple random sampling from a finite universe results in an equal inclusion probability for each unit of the universe. In the case of a finite universe, where the sampling is done without replacement and the sample size, usually denoted n, is the same as the universe size, N, this sample is called a census.

Suppose our interest lies not in the voltage values for units produced in this specific shift but rather in the process or process variable assignment model. Not only must great care be given to sampling methods, we must also make assumptions about the stability of the process and the structure of the process model during the period of sampling. In essence, the universe unit variable values may be thought of as a realization of the process, and a random sample from such a universe, measuring voltage values, is equivalent to a random sample on the process or process model voltage variable or population. With our example, we might assume that unit voltage, E, is distributed as $N(\mu, \sigma^2)$ during sample selection. This is often called an analytic study, and our objective might be, for example, to estimate $\mu$ and/or $\sigma^2$. Once again, random sampling is to be rigorously defined in the following chapter.
Even though a universe sometimes exists only in concept, defined by a verbal description of the membership, it is generally useful to think carefully about the entire process of observation activity and the conceptual universe. Consider an example where the ingredients of concrete specimens have been specified to achieve high strength with early cure time. A batch is formulated, five specimens are poured into cylindrical molds, and these are cured according to test specifications. Following the cure, these test cylinders are subjected to longitudinal load until rupture, at which point the rupture strength is recorded. These test units, with the resulting data, are considered sample data from the strength measurements associated with the universe units (each with an assigned population value) that might have been, but were never actually, produced. Again, as in the prior example, inference often relates to the process or process variable assignment model, and model assumptions may become crucial to attaining meaningful inference in this analytic study.

Finally, consider measurements taken on a fluid in order to measure some variable or characteristic. Specimens are drawn from well-mixed (we hope) fluid, with each specimen placed in a specimen container. Some difficulty arises in universe identification. A convention here is to again view the universe as made up of all possible specimens that might have been selected. Similarly, an alternative view may be taken that the sample values represent a realization of the process value assignment model. Getting meaningful results from such an analytic study once again requires close attention to sampling methods and to process or process model assumptions.
Descriptive statistics is the branch of statistics that deals with organization, summarization, and presentation of data. Many of the techniques of descriptive statistics have been in use for over 200 years, with origins in surveys and census activity. Modern computer technology, particularly computer graphics, has greatly expanded the field of descriptive statistics in recent years. The techniques of descriptive statistics can be applied either to entire finite populations or to samples, and these methods and techniques are illustrated in the following sections of this chapter. A wide selection of software is available, ranging in focus, sophistication, and generality from simple spreadsheet functions such as those found in Microsoft Excel®, to the more comprehensive but "user friendly" Minitab, to large, comprehensive, flexible systems such as SAS. Many other options are also available.


In enumerative and analytic studies, the objective is to make a conclusion or draw inference about a finite population or about a process or variable assignment model. This activity is called inferential statistics, and most of the techniques and methods employed have been developed within the past 90 years. Subsequent chapters will focus on these topics.

8-2 DATA

Data are collected and stored electronically as well as by human observation using traditional records or files. Formats differ depending on the observer or observational process and reflect individual preference and ease of recording. In large-scale studies, where unit identification exists, data must often be obtained by stripping the required information from several files and merging it into a file format suitable for statistical analysis.

A table or spreadsheet-type format is often convenient, and it is also compatible with most software analysis systems. Rows are typically assigned to units observed, and columns present numerical or categorical data on one or more variables. Furthermore, a column may be assigned for a sequence or order index as well as for other unit-identifying data. In the case of the power supply voltage measurements, no order or sequence was intended, so any permutation of the 10 data elements is the same. In other contexts, for example where the index relates to the time of observation, the position of the element in the sequence may be quite important. A common practice in both situations described is to employ an index, say i = 1, 2, ..., n, to serve as a unit identifier where n units are observed. Also, an alphabetical character is usually employed to represent a given variable value; thus if $e_i$ equals the value of the ith voltage measurement in the example, $e_1 = 5.10$, $e_2 = 5.24$, $e_3 = 5.14$, ..., $e_{10} = 5.11$. In a table format for these data there would be 10 rows (11 if a headings row is employed) and one or two columns, depending on whether the index is included and presented in a column.
Where several variables and/or categories are to be associated with each unit, each of these may be assigned a column, as shown in Table 8-1 (from Montgomery and Runger, 2003), which presents measurements of pull strength, wire length, and die height made on each of 25 sampled units in a semiconductor manufacturing facility. In situations where categorical classification is involved, a column is provided for each categorical variable and a category code or identifier is recorded for the unit. An example is shift of manufacture, with identifiers D, N, G for day, night, and graveyard shifts, respectively.

8-3 GRAPHICAL PRESENTATION OF DATA


In this section we will present a few of the many graphical and tabular methods for summarizing and displaying data. In recent years, the availability of computer graphics has resulted in a rapid expansion of visual displays for observational data.

8-3.1 Numerical Data: Dot Plots and Scatter Plots


When we are concerned with one of the variables associated with the observed units, the data are often called univariate data, and dot plots provide a simple, attractive display that reflects spread, extremes, centering, and voids or gaps in the data. A horizontal line is scaled so that the range of data values is accommodated. Each observation is then plotted as a dot directly above this scaled line, and where multiple observations have the same value, the dots are simply stacked vertically at that scale point. Where the number of units is relatively small, say n < 30, or where there are relatively few distinct values represented



Table 8-1 Wire Bond Pull Strength Data

Observation   Pull Strength   Wire Length   Die Height
  Number           (y)            (x1)         (x2)
     1             9.95             2            50
     2            24.45             8           110
     3            31.75            11           120
     4            35.00            10           550
     5            25.02             8           295
     6            16.86             4           200
     7            14.38             2           375
     8             9.60             2            52
     9            24.35             9           100
    10            27.50             8           300
    11            17.08             4           412
    12            37.00            11           400
    13            41.95            12           500
    14            11.66             2           360
    15            21.65             4           205
    16            17.89             4           400
    17            69.00            20           600
    18            10.30             1           585
    19            34.93            10           540
    20            46.59            15           250
    21            44.88            15           290
    22            54.12            16           510
    23            56.63            17           590
    24            22.13             6           100
    25            21.15             5           400

in the data set, dot plots are effective displays. Figure 8-2 shows the univariate dot plots, or marginal dot plots, for each of the three variables with data presented in Table 8-1. These plots were produced using Minitab®.

Where we wish to jointly display results for two variables, the bivariate equivalent of the dot plot is called a scatter plot. We construct a simple rectangular coordinate graph, assigning the horizontal axis to one of the variables and the vertical axis to the other. Each observation is then plotted as a point in this plane. Figure 8-3 presents scatter plots for pull strength vs. wire length and for pull strength vs. die height for the data in Table 8-1. In order to accommodate data pairs that are identical, and thus that fall at the same point on the plane, one convention is to employ alphabet characters as plot symbols; so A is the displayed plot point where one data point falls at a specific point on the plane, B is the displayed plot point if two fall at a specific point of the plane, etc. Another approach, which is useful where values are close but not identical, is to assign a randomly generated, small, positive or negative quantity sometimes called jitter to one or both variables in order to make the plots better displays. While scatter plots show the region of the plane where the data points fall, as well as the data density associated with this region, they also suggest possible association between the variables. Finally, we note that the usefulness of these plots is not limited to small data sets.
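These displays are easy to reproduce in software. The following minimal Python sketch (using NumPy and matplotlib; it is not part of the original text) draws a dot plot and a jittered scatter plot for the first ten observations of Table 8-1; the jitter amplitude of 0.1 and the seed are arbitrary choices.

```python
# A minimal sketch of a dot plot and a jittered scatter plot.
import numpy as np
import matplotlib.pyplot as plt

# First ten observations from Table 8-1.
wire_length = np.array([2, 8, 11, 10, 8, 4, 2, 2, 9, 8])
pull_strength = np.array([9.95, 24.45, 31.75, 35.00, 25.02,
                          16.86, 14.38, 9.60, 24.35, 27.50])

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(9, 3))

# Dot plot: stack repeated values vertically above a scaled line.
for v in np.unique(wire_length):
    count = int(np.sum(wire_length == v))
    ax1.plot([v] * count, range(1, count + 1), "ko")
ax1.set_xlabel("Wire length")
ax1.set_yticks([])

# Scatter plot, with small random jitter added to the x values so that
# identical or near-identical pairs do not overplot one another.
rng = np.random.default_rng(1)
jitter = rng.uniform(-0.1, 0.1, size=wire_length.size)
ax2.plot(wire_length + jitter, pull_strength, "ko")
ax2.set_xlabel("Wire length")
ax2.set_ylabel("Pull strength")

plt.tight_layout()
plt.show()
```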

Figure 8-2 Dot plots for pull strength, wire length, and die height.

Figure 8-3 Scatter plots for pull strength vs. wire length and for pull strength vs. die height (from Minitab®).

To extend the dimensionality and graphically display the joint data pattern for three variables, a three-dimensional scatter plot may be employed, as illustrated in Figure 8-4 for the data in Table 8-1. Another option for a display, not illustrated here, is the bubble plot, which is presented in two dimensions, with the third variable reflected in the dot (now called bubble) diameter, which is assigned to be proportional to the magnitude of the third variable. As was the case with scatter plots, these plots also suggest possible associations between the variables involved.
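The text does not illustrate the bubble plot, but a hedged sketch is easy to give: marker area is made proportional to the third variable. The divisor used to scale the areas below is an arbitrary choice that affects only the appearance.

```python
# A sketch of a bubble plot: pull strength vs. wire length, with marker
# area proportional to the third variable (die height). First ten rows
# of Table 8-1.
import matplotlib.pyplot as plt

wire_length = [2, 8, 11, 10, 8, 4, 2, 2, 9, 8]
pull_strength = [9.95, 24.45, 31.75, 35.00, 25.02,
                 16.86, 14.38, 9.60, 24.35, 27.50]
die_height = [50, 110, 120, 550, 295, 200, 375, 52, 100, 300]

sizes = [d / 2 for d in die_height]   # arbitrary scaling of bubble area
plt.scatter(wire_length, pull_strength, s=sizes, alpha=0.5)
plt.xlabel("Wire length")
plt.ylabel("Pull strength")
plt.show()
```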

8-3.2 Numerical Data: The Frequency Distribution and Histogram


Consider the data in Table 8-2. These data are the strengths in pounds per square inch (psi) of 100 glass, nonreturnable 1-liter soft drink bottles. These observations were obtained by testing each bottle until failure occurred. The data were recorded in the order in which the bottles were tested, and in this format they do not convey very much information about the bursting strength of the bottles. Questions such as "What is the average bursting strength?"

Figure 8-4 Three-dimensional scatter plot for pull strength, wire length, and die height.

Table 8-2 Bursting Strength in Pounds per Square Inch for 100 Glass, 1-Liter, Nonreturnable Soft Drink Bottles

265  197  346  280  281  200  263  265  261  278
205  286  317  242  294  235  231  262  248  250
263  274  242  260  299  221  334  271  260  265
307  243  258  321  300  246  280  245  274  270
220  231  276  265  258  328  265  301  337  298
268  267  308  228  235  296  235  280  250  257
260  281  208  223  283  276  272  274  278  210
234  265  187  250  277  264  290  253  254  280
299  214  264  260  267  269  283  287  274  269
215  318  271  254  293  248  176  258  275  251

Or "what percentage of the bottles burst below 230 psi?" are not easy to answer when the
data are presented in this form,
A frequeney distribution is a more useful summary of data than the simple enumeration given in 1able 8-2. To construct a frequency distribution, we must divide the range of
the data into intervals, which are usually called class intervals. If possible, the class intervals should be of equal width, to eohance the visual infonnation in the frequency distribution, Some judgment must be used in selecting the number of class intervals in order ro give
a reasonable display. The number of class intervals used depends on the number of observations and the amount of scatter or dispersion in the data. A frequency distribution that
uses either too few or too many class intervals will not be very informative. \Ve generally
find that between 5 and 20' intervals is satisfactory in most cases, llIld that the number of
class intervals should increase with n. Choosing a number of class intervals approximately
equal to the square root of the nwnber of observations often works well in practice.
A frequency distribution for the bursting s!l'ength data in Table 8-2 is abown in Table
8-3. Since the data set contains 100 observations, we suspect that about ,fWo = 10 class
intervals will give a satisfactory frequency distribution. The largest and smallest data val~
ues are 346 and 176, respectively, so the class intervals must cover at least 346 - 176 = 170
psi units on the scale. If we want the lower limit for the first interval to begin slightly below
tile smallest data value and the upper limit for the last cell to be slightly abeve the largest
data value, then we might start the frequency distribution at 170 and end it at 350. This is
an interval of 180 psi units. Xine class intervals, each of width 20 psi, gives a reasonable
frequency distribution, ani! the frequency distribution in Table 83 is thus based on nine
class intervals.
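This bookkeeping is easy to mechanize. The sketch below (Python with NumPy; not from the text) bins the Table 8-2 values, listed row by row, into the nine 20-psi classes and prints the three columns of Table 8-3.

```python
import numpy as np

# Bursting strengths of Table 8-2, row by row.
strength = np.array([
    265, 197, 346, 280, 281, 200, 263, 265, 261, 278,
    205, 286, 317, 242, 294, 235, 231, 262, 248, 250,
    263, 274, 242, 260, 299, 221, 334, 271, 260, 265,
    307, 243, 258, 321, 300, 246, 280, 245, 274, 270,
    220, 231, 276, 265, 258, 328, 265, 301, 337, 298,
    268, 267, 308, 228, 235, 296, 235, 280, 250, 257,
    260, 281, 208, 223, 283, 276, 272, 274, 278, 210,
    234, 265, 187, 250, 277, 264, 290, 253, 254, 280,
    299, 214, 264, 260, 267, 269, 283, 287, 274, 269,
    215, 318, 271, 254, 293, 248, 176, 258, 275, 251])

edges = np.arange(170, 351, 20)               # 170, 190, ..., 350
freq, _ = np.histogram(strength, bins=edges)
rel = freq / freq.sum()
cum = np.cumsum(rel)
for lo, hi, f, r, c in zip(edges[:-1], edges[1:], freq, rel, cum):
    print(f"{lo} <= x < {hi}: {f:3d}  {r:5.2f}  {c:5.2f}")
```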

Table 8-3 Frequency Distribution for the Bursting Strength Data in Table 8-2

Class Interval (psi)   Frequency   Relative Frequency   Cumulative Relative Frequency
170 ≤ x < 190               2            0.02                    0.02
190 ≤ x < 210               4            0.04                    0.06
210 ≤ x < 230               7            0.07                    0.13
230 ≤ x < 250              13            0.13                    0.26
250 ≤ x < 270              32            0.32                    0.58
270 ≤ x < 290              24            0.24                    0.82
290 ≤ x < 310              11            0.11                    0.93
310 ≤ x < 330               4            0.04                    0.97
330 ≤ x < 350               3            0.03                    1.00
                          100            1.00

The third column in Table 8-3 contains the relative frequency distribution. The relative frequencies are found by dividing the observed frequency in each class interval by the total number of observations. The last column in Table 8-3 expresses the relative frequencies on a cumulative basis. Frequency distributions are often easier to interpret than tables of data. For example, from Table 8-3 it is very easy to see that most of the bottles burst between 230 and 290 psi and that 13% of the bottles burst below 230 psi.

It is also helpful to present the frequency distribution in graphical form, as shown in Fig. 8-5. Such a display is called a histogram. To draw a histogram, use the horizontal axis to represent the measurement scale and draw the boundaries of the class intervals. The vertical axis represents the frequency (or relative frequency) scale. If the class intervals are of equal width, then the heights of the rectangles drawn on the histogram are proportional to the frequencies. If the class intervals are of unequal width, then it is customary to draw rectangles whose areas are proportional to the frequencies. In this case, the result is called a density histogram.

Figure 8-5 Histogram of bursting strength for 100 glass, 1-liter, nonreturnable soft drink bottles.


For a histogram displaying relative frequency on the vertical axis, the rectangle heights are calculated as

$$\text{rectangle height} = \frac{\text{class relative frequency}}{\text{class width}}.$$

When we find there are multiple empty class intervals after grouping data into equal-width intervals, one option is to merge the empty intervals with contiguous intervals, thus creating some wider intervals. The density histogram resulting from this may produce a more attractive display. However, histograms are easier to interpret when the class intervals are of equal width. The histogram provides a visual impression of the shape of the distribution of the measurements, as well as information about centering and the scatter or dispersion of the data.
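To make the rectangle-height rule concrete, the sketch below merges the sparse tail classes of Table 8-3 into wider intervals and verifies that the density-histogram rectangle areas still sum to one.

```python
import numpy as np

# Class boundaries and frequencies from Table 8-3, with the two sparse
# tails merged into single wider classes to create unequal widths.
edges = np.array([170, 230, 250, 270, 290, 310, 350])
freq = np.array([13, 13, 32, 24, 11, 7])

rel = freq / freq.sum()
heights = rel / np.diff(edges)             # relative frequency / class width
print(heights)
print((heights * np.diff(edges)).sum())    # rectangle areas sum to 1.0
```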
In passing from the original data to either a frequency distribution or a histogram, a certain amount of information has been lost in that we no longer have the individual observations. On the other hand, this information loss is small compared to the ease of interpretation gained in using the frequency distribution and histogram. In cases where the data assume only a few distinct values, a dot plot is perhaps a better graphical display.

Where observed data are of a discrete nature, such as is found in counting processes, two choices are available for constructing a histogram. One option is to center the rectangles on the integers reflected in the count data, and the other is to collapse the rectangle into a vertical line placed directly over these integers. In both cases, the height of the rectangle or the length of the line is either the frequency or relative frequency of occurrence of the value in question.

In summary, the histogram is a very useful graphic display. A histogram can give the decision maker a good understanding of the data and is very useful in displaying the shape, location, and variability of the data. However, the histogram does not allow individual data points to be identified, because all observations falling in a cell are indistinguishable.

8-3.3 The Stem-and-Leaf Plot


Suppose that the data are represented by $x_1, x_2, \ldots, x_n$ and that each number $x_i$ consists of at least two digits. To construct a stem-and-leaf plot, we divide each number $x_i$ into two parts: a stem, consisting of one or more of the leading digits, and a leaf, consisting of the remaining digits. For example, if the data consist of percent-defective information between 0 and 100 on lots of semiconductor wafers, then we could divide the value 76 into the stem 7 and the leaf 6. In general, we should choose relatively few stems in comparison with the number of observations. It is usually best to choose between 5 and 20 stems. Once a set of stems has been chosen, they are listed along the left-hand margin of the display, and beside each stem all leaves corresponding to the observed data values are listed in the order in which they are encountered in the data set.
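Such a display can be generated with a few lines of code. The sketch below (not from the text) takes the last digit of each value as the leaf and the remaining digits as the stem, listing leaves in order of appearance as described above.

```python
def stem_and_leaf(data):
    """Print a stem-and-leaf display: stem = leading digits, leaf = last digit."""
    plot = {}
    for x in data:
        stem, leaf = divmod(int(x), 10)
        plot.setdefault(stem, []).append(leaf)   # leaves in encounter order
    for stem in sorted(plot):
        leaves = ",".join(str(leaf) for leaf in plot[stem])
        print(f"{stem:3d} | {leaves}  ({len(plot[stem])})")

stem_and_leaf([265, 197, 346, 280, 281, 200, 263, 265, 261, 278])
```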

Example 8-1
To illustrate the construction of a stem-and-leaf plot, consider the bottle-bursting-strength data in Table 8-2. We select as stem values the numbers 17, 18, 19, ..., 34. The resulting stem-and-leaf plot is presented in Fig. 8-6. Inspection of this display immediately reveals that most of the bursting strengths lie between 220 and 330 psi and that the central value is somewhere between 260 and 270 psi. Furthermore, the bursting strengths are distributed approximately symmetrically about the central value. Therefore, the stem-and-leaf plot, like the histogram, allows us to determine quickly some important features of the data that were not immediately obvious in the original display, Table 8-2. Note that here the original numbers are not lost, as occurs in a

Stem | Leaf                                                            | Frequency
 17  | 6                                                               |    1
 18  | 7                                                               |    1
 19  | 7                                                               |    1
 20  | 0, 5, 8                                                         |    3
 21  | 0, 4, 5                                                         |    3
 22  | 1, 0, 8, 3                                                      |    4
 23  | 5, 1, 1, 4, 5, 5                                                |    6
 24  | 2, 8, 2, 6, 8, 3, 5                                             |    7
 25  | 4, 0, 8, 0, 0, 7, 8, 3, 4, 8, 1                                 |   11
 26  | 5, 5, 5, 1, 2, 3, 0, 0, 5, 3, 8, 7, 0, 0, 4, 5, 9, 5, 4, 7, 9   |   21
 27  | 8, 4, 1, 4, 0, 6, 6, 4, 8, 2, 4, 1, 7, 5                        |   14
 28  | 0, 6, 1, 0, 1, 0, 0, 3, 7, 3                                    |   10
 29  | 4, 6, 8, 9, 9, 3, 0                                             |    7
 30  | 7, 1, 0, 8                                                      |    4
 31  | 7, 8                                                            |    2
 32  | 1, 8                                                            |    2
 33  | 7, 4                                                            |    2
 34  | 6                                                               |    1
                                                                          100

Figure 8-6 Stem-and-leaf plot for the bottle-bursting-strength data in Table 8-2.

histogram. Sometimes, in order to assist in finding percentiles, we order the leaves by magnitude, producing an ordered stem-and-leaf plot, as in Fig. 8-7. For instance, since n = 100 is an even number, the median, or "middle" observation (see Section 8-4.1), is the average of the two observations with ranks 50 and 51, or

$$\tilde{x} = (265 + 265)/2 = 265.$$

The tenth percentile is the observation with rank (0.1)(100) + 0.5 = 10.5 (halfway between the 10th and 11th observations), or (220 + 221)/2 = 220.5. The first quartile is the observation with rank (0.25)(100) + 0.5 = 25.5 (halfway between the 25th and 26th observations), or (248 + 248)/2 = 248, and the third quartile is the observation with rank (0.75)(100) + 0.5 = 75.5 (halfway between the 75th and 76th observations), or (280 + 280)/2 = 280. The first and third quartiles are occasionally denoted by the symbols Q1 and Q3, respectively, and the interquartile range IQR = Q3 − Q1 may be used as a measure of variability. For the bottle-bursting-strength data, the interquartile range is IQR = Q3 − Q1 = 280 − 248 = 32. The stem-and-leaf displays in Figs. 8-6 and 8-7 are equivalent to a histogram with 18 class intervals. In some situations, it may be desirable to provide more classes or stems. One way to do this would be to modify the original stems as follows: divide stem 5 (say) into two new stems, 5* and 5·. Stem 5* has leaves 0, 1, 2, 3, and 4, and stem 5· has leaves 5, 6, 7, 8, and 9. This will double the number of original stems. We could increase the number of original stems by five by defining five new stems: 5z with leaves 0 and 1, 5t (for twos and threes) with leaves 2 and 3, 5f (for fours and fives) with leaves 4 and 5, 5s (for sixes and sevens) with leaves 6 and 7, and 5e with leaves 8 and 9.
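The rank convention used in this example, rank = p·n + 0.5 with adjacent order statistics averaged when the rank is fractional, can be captured in a few lines. The sketch below applies it to a small sample of six bursting strengths (the same six values used again in Section 8-4.2); the function and variable names are illustrative only.

```python
import math

def quantile(data, p):
    """100p-th percentile via the rank p*n + 0.5 convention used above."""
    x = sorted(data)
    r = p * len(x) + 0.5                 # 1-based rank, possibly fractional
    lo, hi = math.floor(r), math.ceil(r)
    return 0.5 * (x[lo - 1] + x[hi - 1])

sample = [190, 228, 305, 240, 265, 260]
q1 = quantile(sample, 0.25)
med = quantile(sample, 0.50)
q3 = quantile(sample, 0.75)
print(med, q3 - q1)                      # median and interquartile range
```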

8-3.4 The Box Plot


A box plot displays the three quartiles, the minimum, and the maximum of the data on a rectangular box, aligned either horizontally or vertically. The box encloses the interquartile range with the left (or lower) line at the first quartile Q1 and the right (or upper) line at the third quartile Q3. A line is drawn through the box at the second quartile (which is the 50th

Stem | Leaf                                                            | Frequency
 17  | 6                                                               |    1
 18  | 7                                                               |    1
 19  | 7                                                               |    1
 20  | 0, 5, 8                                                         |    3
 21  | 0, 4, 5                                                         |    3
 22  | 0, 1, 3, 8                                                      |    4
 23  | 1, 1, 4, 5, 5, 5                                                |    6
 24  | 2, 2, 3, 5, 6, 8, 8                                             |    7
 25  | 0, 0, 0, 1, 3, 4, 4, 7, 8, 8, 8                                 |   11
 26  | 0, 0, 0, 0, 1, 2, 3, 3, 4, 4, 5, 5, 5, 5, 5, 5, 7, 7, 8, 9, 9   |   21
 27  | 0, 1, 1, 2, 4, 4, 4, 4, 5, 6, 6, 7, 8, 8                        |   14
 28  | 0, 0, 0, 0, 1, 1, 3, 3, 6, 7                                    |   10
 29  | 0, 3, 4, 6, 8, 9, 9                                             |    7
 30  | 0, 1, 7, 8                                                      |    4
 31  | 7, 8                                                            |    2
 32  | 1, 8                                                            |    2
 33  | 4, 7                                                            |    2
 34  | 6                                                               |    1
                                                                          100

Figure 8-7 Ordered stem-and-leaf plot for the bottle-bursting-strength data.

percentile or the median), $Q_2 = \tilde{x}$. A line at either end extends to the extreme values. These lines, sometimes called whiskers, may extend only to the 10th and 90th percentiles or the 5th and 95th percentiles in large data sets. Some authors refer to the box plot as the box-and-whisker plot. Figure 8-8 presents the box plot for the bottle-bursting-strength data. This box plot indicates that the distribution of bursting strengths is fairly symmetric around the central value, because the left and right whiskers and the lengths of the left and right boxes around the median are about the same.

The box plot is useful in comparing two or more samples. To illustrate, consider the data in Table 8-4. The data, taken from Messina (1987), represent viscosity readings on three different mixtures of raw material used on a manufacturing line. One of the objectives of the study that Messina discusses is to compare the three mixtures. Figure 8-9 presents the box plots for the viscosity data. This display permits easy interpretation of the data. Mixture 1 has higher viscosity than mixture 2, and mixture 2 has higher viscosity than mixture 3. The distribution of viscosity is not symmetric, and the maximum viscosity reading from mixture 3 seems unusually large in comparison to the other readings. This observation may be an outlier, and it possibly warrants further examination and analysis.
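A comparative display like Fig. 8-9 can be sketched with matplotlib from the Table 8-4 values; whisker conventions differ between packages, so the rendering may differ in detail from Minitab's.

```python
import matplotlib.pyplot as plt

# Viscosity readings from Table 8-4.
mixture1 = [22.02, 23.83, 26.67, 25.38, 25.49, 23.50, 25.90, 24.98]
mixture2 = [21.49, 22.67, 24.62, 24.18, 22.78, 22.56, 24.46, 23.79]
mixture3 = [20.33, 21.67, 24.67, 22.45, 22.28, 21.95, 20.49, 21.81]

plt.boxplot([mixture1, mixture2, mixture3])
plt.xticks([1, 2, 3], ["Mixture 1", "Mixture 2", "Mixture 3"])
plt.ylabel("Viscosity")
plt.show()
```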

Figure 8-8 Box plot for the bottle-bursting-strength data.


Table 8-4 Viscosity Measurements for Three Mixtures

Mixture 1   Mixture 2   Mixture 3
  22.02       21.49       20.33
  23.83       22.67       21.67
  26.67       24.62       24.67
  25.38       24.18       22.45
  25.49       22.78       22.28
  23.50       22.56       21.95
  25.90       24.46       20.49
  24.98       23.79       21.81

Figure 8-9 Box plots for the mixture-viscosity data in Table 8-4.

8-3.5 The Pareto Chart


A Pareto diagram is a bar graph for count data. It displays the frequency of each count on the vertical axis and the category of classification on the horizontal axis. We always arrange the categories in descending order of frequency of occurrence; that is, the most frequently occurring category is on the left, followed by the next most frequently occurring type, and so on.

Figure 8-10 presents a Pareto diagram for the production of transport aircraft by the Boeing Commercial Airplane Company in the year 2000. Notice that the 737 was the most popular model, followed by the 777, the 757, the 767, the 717, the 747, the MD-11, and the MD-90. The line on the Pareto chart connects the cumulative percentages of the k most frequently produced models (k = 1, 2, 3, ...). In this example, the two most frequently produced models account for approximately 69% of the total airplanes manufactured in 2000. One feature of these charts is that the horizontal scale is not necessarily numeric. Usually, categorical classifications are employed, as in the airplane production example.

The Pareto chart is named for an Italian economist who theorized that in certain economies the majority of the wealth is held by a minority of the people. In count data, the "Pareto principle" frequently occurs, hence the name for the chart.

Pareto charts are very useful in the analysis of defect data in manufacturing systems. Figure 8-11 presents a Pareto chart showing the frequency with which various types of defects

Model      737    777    757    767    717    747   MD-11   MD-90
Count      281     55     45     44     32     25       4       3
Percent   57.5   11.2    9.2    9.0    6.5    5.1     0.8     0.6
Cum %     57.5   68.7   77.9   86.9   93.5   98.6    99.4   100.0

Figure 8-10 Airplane production in 2000. (Source: Boeing Commercial Airplane Company.)

occur on metal parts used in a structural component of an automobile door frame. Notice how the Pareto chart highlights the relatively few types of defects that are responsible for most of the observed defects in the part. The Pareto chart is an important part of a quality-improvement program because it allows management and engineering to focus attention on the most critical defects in a product or process. Once these critical defects are identified, corrective actions to reduce or eliminate them must be developed and implemented. This is easier to do, however, when we are sure that we are attacking a legitimate problem: it is much easier to reduce or eliminate frequently occurring defects than rare ones.
Figure 8-11 Pareto chart of defects in door structural elements. (Defect types on the horizontal axis include out of contour, undertrimmed, parts not greased, out of sequence, missing holes/slots, and other.)
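A chart in the style of Fig. 8-10 is a bar chart in descending order of frequency with a cumulative-percentage line on a second axis; the sketch below uses the production counts from that figure.

```python
import numpy as np
import matplotlib.pyplot as plt

# Airplane production counts from Fig. 8-10.
models = ["737", "777", "757", "767", "717", "747", "MD-11", "MD-90"]
counts = np.array([281, 55, 45, 44, 32, 25, 4, 3])
cum_pct = 100 * np.cumsum(counts) / counts.sum()

fig, ax = plt.subplots()
ax.bar(models, counts)
ax.set_ylabel("Count")
ax2 = ax.twinx()                     # second axis for the cumulative line
ax2.plot(models, cum_pct, "k-o")
ax2.set_ylim(0, 100)
ax2.set_ylabel("Cumulative percent")
plt.show()
```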


8-3.6 Time Plots


Virtually everyone should be familiar with time plots, since we view them daily in media presentations. Examples are historical temperature profiles for a given city, the closing Dow Jones Industrials Index for each trading day, each month, each quarter, etc., and the plot of the yearly Consumer Price Index for all urban consumers published by the Bureau of Labor Statistics. Many other time-oriented data are routinely gathered to support inferential statistical activity. Consider the electric power demand, measured in kilowatts, for a given office building and presented as hourly data for each of the 24 hours of the day in which the supplying utility experiences a summer peak demand from all customers. Demand data such as these are gathered using a time-of-use or "load research" meter. In concept, a kilowatt is a continuous variable. The meter sampling interval is very short, however, and the hourly data that are obtained are truly averages over the meter sampling intervals contained within each hour.

Usually with time plots, time is represented on the horizontal axis, and the vertical scale is calibrated to accommodate the range of values represented in the observational results. The kilowatt hourly demand data are displayed in Fig. 8-12. Ordinarily, when series like these display averages over a time interval, the variation displayed in the data is a function of the length of the averaging interval, with shorter intervals producing more variability. For example, in the case of the kilowatt data, if we used a 15-minute interval, there would be 96 points to plot, and the variation in the data would appear much greater.
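A time plot is simply the series plotted against its time index. In the sketch below the 24 hourly demand values are invented for illustration (the values behind Fig. 8-12 are not listed in the text); only the plotting pattern matters.

```python
import matplotlib.pyplot as plt

hours = list(range(1, 25))
demand_kw = [32, 30, 29, 29, 31, 38, 49, 62, 70, 74, 76, 78,   # hypothetical
             79, 80, 79, 76, 70, 62, 55, 50, 46, 41, 37, 34]   # values only

plt.plot(hours, demand_kw, "k-o")
plt.xlabel("Hour")
plt.ylabel("Demand (kW)")
plt.show()
```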

8-4 NUMERICAL DESCRIPTION OF DATA


Just as graphs can improve the display of data, numerical descriptions are also of value. In this section, we present several important numerical measures for describing the characteristics of data.

8-4.1 Measures of Central Tendency


The most common measure of central tendency, or location, of the data is the ordinary arithmetic mean. Because we usually think of the data as being obtained from a sample of units,

Figure 8-12 Summer peak day hourly kilowatt (kW) demand data for an office building.


we will refer to the arithmetic mean as the sample mean. If the observations in a sample of size n are $x_1, x_2, \ldots, x_n$, then the sample mean is

$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{1}{n}\sum_{i=1}^{n} x_i. \qquad (8\text{-}1)$$

For the bottle-bursting-strength data in Table 8-2, the sample mean is

$$\bar{x} = \frac{1}{100}\sum_{i=1}^{100} x_i = \frac{26{,}406}{100} = 264.06.$$

From examination of Fig. 8-5, it seems that the sample mean 264.06 psi is a "typical" value of bursting strength, since it occurs near the middle of the data, where the observations are concentrated. However, this impression can be misleading. Suppose that the histogram looked like Fig. 8-13. The mean of these data is still a measure of central tendency, but it does not necessarily imply that most of the observations are concentrated around it. In general, if we think of the observations as having unit mass, the sample mean is just the center of mass of the data. This implies that the histogram will exactly balance if it is supported at the sample mean.

The sample mean represents the average value of all the observations in the sample. We can also think of calculating the average value of all the observations in a finite population. This average is called the population mean, and as we have seen in previous chapters, it is denoted by the Greek letter $\mu$. When there are a finite number of possible observations (say N) in the population, then the population mean is

$$\mu = \frac{T_x}{N}, \qquad (8\text{-}2)$$

where $T_x = \sum_{i=1}^{N} x_i$ is the finite population total for the population.

In the following chapters dealing with statistical inference, we will present methods for making inferences about the population mean that are based on the sample mean. For example, we will use the sample mean as a point estimate of $\mu$.
Another measure of central tendency is the median, or the point at which the sample is divided into two equal halves. Let $x_{(1)}, x_{(2)}, \ldots, x_{(n)}$ denote a sample arranged in increasing order of magnitude; that is, $x_{(1)}$ denotes the smallest observation, $x_{(2)}$ denotes the second
Figure 8-13 A histogram.

smallest observation, ..., and $x_{(n)}$ denotes the largest observation. Then the median is defined mathematically as

$$\tilde{x} = \begin{cases} x_{((n+1)/2)}, & n \text{ odd}, \\ \dfrac{x_{(n/2)} + x_{(n/2+1)}}{2}, & n \text{ even}. \end{cases} \qquad (8\text{-}3)$$

The median has the advantage that it is not influenced very much by extreme values. For example, suppose that the sample observations are

1, 3, 4, 2, 7, 6, and 8.

The sample mean is 4.43, and the sample median is 4. Both quantities give a reasonable measure of the central tendency of the data. Now suppose that the next-to-last observation is changed, so that the data are

1, 3, 4, 2, 7, 2519, and 8.

For these data, the sample mean is 363.43. Clearly, in this case the sample mean does not tell us very much about the central tendency of most of the data. The median, however, is still 4, and this is probably a much more meaningful measure of central tendency for the majority of the observations.
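The numbers in this example are easily checked with Python's standard statistics module:

```python
import statistics

data = [1, 3, 4, 2, 7, 6, 8]
print(statistics.mean(data), statistics.median(data))   # about 4.43, and 4

# Replace the next-to-last observation with the outlier 2519: the mean
# jumps to about 363.43, but the median is unchanged.
data_outlier = [1, 3, 4, 2, 7, 2519, 8]
print(statistics.mean(data_outlier), statistics.median(data_outlier))
```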
Just as $\tilde{x}$ is the middle value in a sample, there is a middle value in the population. We define $\tilde{\mu}$ as the median of the population; that is, $\tilde{\mu}$ is a value of the associated random variable such that half the population lies below $\tilde{\mu}$ and half lies above.

The mode is the observation that occurs most frequently in the sample. For example, the mode of the sample data

2, 4, 6, 2, 5, 6, 2, 9, 4, 5, 2, and 1

is 2, since it occurs four times, and no other value occurs as often. There may be more than one mode.

If the data are symmetric, then the mean and median coincide. If, in addition, the data have only one mode (we say the data are unimodal), then the mean, median, and mode may all coincide. If the data are skewed (asymmetric, with a long tail to one side), then the mean, median, and mode will not coincide. Usually we find that mode < median < mean if the distribution is skewed to the right, while mode > median > mean if the distribution is skewed to the left (see Fig. 8-14).

The distribution of the sample mean is well known and relatively easy to work with. Furthermore, the sample mean is usually more stable than the sample median, in the sense that it does not vary as much from sample to sample. Consequently, many analytical statistical techniques use the sample mean. However, the median and mode may also be helpful descriptive measures.

Figure 8-14 The mean and median for symmetric and skewed distributions: (a) negative or left skew, (b) symmetric, (c) positive or right skew.


8-4.2 Measures of Dispersion


Central tendency does not necessarily provide enough information to describe data adequately. For example, consider the bursting strengths obtained from two samples of six bottles each:

Sample 1:  230  250  245  258  265  240
Sample 2:  190  228  305  240  265  260

The mean of both samples is 248 psi. However, note that the scatter or dispersion of Sample 2 is much greater than that of Sample 1 (see Fig. 8-15). In this section, we define several widely used measures of dispersion.
The most important measure of dispersion is the sample variance. If $x_1, x_2, \ldots, x_n$ is a sample of n observations, then the sample variance is

$$s^2 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1} = \frac{S_{xx}}{n-1}. \qquad (8\text{-}4)$$

Note that computation of $s^2$ requires calculation of $\bar{x}$, n subtractions, and n squaring and adding operations. The deviations $x_i - \bar{x}$ may be rather tedious to work with, and several decimals may have to be carried to ensure numerical accuracy. A more efficient (yet equivalent) computational formula for calculating $S_{xx}$ is

$$S_{xx} = \sum_{i=1}^{n} x_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)^2. \qquad (8\text{-}5)$$
The formula for $S_{xx}$ presented in equation 8-5 requires only one computational pass, but care must again be taken to keep enough decimals to prevent round-off error.

To see how the sample variance measures dispersion or variability, refer to Fig. 8-16, which shows the deviations $x_i - \bar{x}$ for the second sample of six bottle-bursting strengths. The greater the amount of variability in the bursting-strength data, the larger in absolute magnitude some of the deviations $x_i - \bar{x}$. Since the deviations $x_i - \bar{x}$ will always sum to zero, we must use a measure of variability that changes the negative deviations to nonnegative quantities. Squaring the deviations is the approach used in the sample variance. Consequently, if $s^2$ is small, there is relatively little variability in the data, but if $s^2$ is large, the variability is relatively large.

The units of measurement for the sample variance are the square of the original units of the variable. Thus, if x is measured in pounds per square inch (psi), the units for the sample variance are (psi)².
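Equations 8-4 and 8-5 are easy to compare numerically. The sketch below applies both to the six Sample 2 bursting strengths and reproduces the 1502 (psi)² obtained by hand in the example that follows.

```python
sample2 = [190, 228, 305, 240, 265, 260]
n = len(sample2)
xbar = sum(sample2) / n

# Equation 8-4: definitional, two-pass form.
sxx_two_pass = sum((x - xbar) ** 2 for x in sample2)

# Equation 8-5: one-pass computational form.
sxx_one_pass = sum(x * x for x in sample2) - sum(sample2) ** 2 / n

s2 = sxx_two_pass / (n - 1)
print(sxx_two_pass, sxx_one_pass, s2)   # 7510.0  7510.0  1502.0
```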

Example 8-2
We will calculate the sample variance of the bottle-bursting strengths for the second sample in Fig. 8-15. The deviations $x_i - \bar{x}$ for this sample are shown in Fig. 8-16.

Figure 8-15 Bursting-strength data.
Figure 8-16 How the sample variance measures variability through the deviations $x_i - \bar{x}$.

Observations        $x_i - \bar{x}$     $(x_i - \bar{x})^2$
$x_1 = 190$             −58                  3364
$x_2 = 228$             −20                   400
$x_3 = 305$              57                  3249
$x_4 = 240$              −8                    64
$x_5 = 265$              17                   289
$x_6 = 260$              12                   144
$\bar{x} = 248$           0                  7510

From equation 8-4,

$$s^2 = \frac{\sum_{i=1}^{6}(x_i - \bar{x})^2}{n-1} = \frac{S_{xx}}{n-1} = \frac{7510}{5} = 1502\ (\text{psi})^2.$$

We may also calculate $S_{xx}$ from the formulation given in equation 8-5, so that

$$s^2 = \frac{S_{xx}}{n-1} = \frac{\sum_{i=1}^{6} x_i^2 - \frac{1}{6}\Bigl(\sum_{i=1}^{6} x_i\Bigr)^2}{n-1} = \frac{376{,}534 - (1488)^2/6}{5} = 1502\ (\text{psi})^2.$$

If we calculate the sample variance of the bursting strength for the Sample 1 values, we find that $s^2 = 158$ (psi)². This is considerably smaller than the sample variance of Sample 2, confirming our initial impression that Sample 1 has less variability than Sample 2.

Because $s^2$ is expressed in the square of the original units, it is not easy to interpret. Furthermore, variability is a more difficult and unfamiliar concept than location or central tendency. However, we can solve this "curse of dimensionality" by working with the (positive) square root of the variance, s, called the sample standard deviation. This gives a measure of dispersion expressed in the same units as the original variable.

Example 8-3
The sample standard deviation of the bottle-bursting strengths for the Sample 2 bottles in Example 8-2 and Fig. 8-15 is

$$s = \sqrt{1502} = 38.76 \text{ psi}.$$

For the Sample 1 bottles, the standard deviation of bursting strength is

$$s = \sqrt{158} = 12.57 \text{ psi}.$$


Example 8-4
Compute the sample variance and sample standard deviation of the bottle-bursting-strength data in Table 8-2. Note that

$$\sum_{i=1}^{100} x_i = 26{,}406 \qquad \text{and} \qquad \sum_{i=1}^{100} x_i^2 = 7{,}074{,}258.$$

Consequently,

$$S_{xx} = 7{,}074{,}258 - \frac{(26{,}406)^2}{100} = 101{,}489.64 \qquad \text{and} \qquad s^2 = \frac{101{,}489.64}{99} = 1025.15\ (\text{psi})^2,$$

so that the sample standard deviation is

$$s = \sqrt{1025.15} = 32.02 \text{ psi}.$$

When the population is finite and consists of N values, we may define the population variance as

$$\sigma^2 = \frac{\sum_{i=1}^{N}(x_i - \mu)^2}{N}, \qquad (8\text{-}6)$$

which is simply the mean of the squared departures of the data values from the population mean. A closely related quantity, $\sigma'^2$, is also sometimes called the population variance and is defined as

$$\sigma'^2 = \frac{\sum_{i=1}^{N}(x_i - \mu)^2}{N-1}. \qquad (8\text{-}7)$$

Obviously, as N gets large, $\sigma'^2 \to \sigma^2$, and oftentimes the use of $\sigma'^2$ simplifies some of the algebraic formulation presented in Chapters 9 and 10. Where several populations are to be observed, a subscript may be employed to identify the population characteristics and descriptive measures, e.g., $\mu_x$, $\sigma_x^2$, etc., if the variable x is being described.
We noted that the sample mean may be used to make inferences about the population mean. Similarly, the sample variance may be used to make inferences about the population variance. We observe that the divisor for the sample variance, $s^2$, is the sample size minus 1, (n − 1). If we actually knew the true value of the population mean $\mu$, then we could define the sample variance as the average squared deviation of the sample observations about $\mu$. In practice, the value of $\mu$ is almost never known, and so the sum of the squared deviations about the sample average must be used instead. However, the observations $x_i$ tend to be closer to their average $\bar{x}$ than to the population mean $\mu$, so to compensate for this we use as a divisor n − 1 rather than n.

Another way to think about this is to consider the sample variance $s^2$ as being based on n − 1 degrees of freedom. The term degrees of freedom results from the fact that the n deviations $x_1 - \bar{x}, x_2 - \bar{x}, \ldots, x_n - \bar{x}$ always sum to zero, so specifying the values of any n − 1 of these quantities automatically determines the remaining one. Thus, only n − 1 of the n deviations $x_i - \bar{x}$ are independent.
Another useful measure of dispersion is the sample range

$$R = \max(x_i) - \min(x_i). \qquad (8\text{-}8)$$

The sample range is very simple to compute, but it ignores all the information in the sample between the smallest and largest observations. For small sample sizes, say n ≤ 10, this information loss is not too serious in some situations. The range traditionally has had


widespread application in statistical quality control, where sample sizes of 4 or 5 are common and computational simplicity is a major consideration; however, that advantage has been largely diminished by the widespread use of electronic measurement and data storage and analysis systems, and as we will later see, the sample variance (or standard deviation) provides a "better" measure of variability. We will briefly discuss the use of the range in statistical quality-control problems in Chapter 17.

Example 8-5
Calculate the ranges of the two samples of bottle-bursting-strength data from Section 8-4.2, shown in Fig. 8-15. For the first sample, we find that

$$R_1 = 265 - 230 = 35,$$

whereas for the second sample

$$R_2 = 305 - 190 = 115.$$

Note that the range of the second sample is much larger than the range of the first, implying that the second sample has greater variability than the first.

Occasionally, it is desirable to express variation as a fraction of the mean. A measure of relative variation called the sample coefficient of variation is defined as

$$CV = \frac{s}{\bar{x}}. \qquad (8\text{-}9)$$

The coefficient of variation is useful when comparing the variability of two or more data sets that differ considerably in the magnitude of the observations. For example, the coefficient of variation might be useful in comparing the variability of daily electricity usage within samples of single-family residences in Atlanta, Georgia, and Butte, Montana, during July.

8-4.3 Other Measures for One Variable


Two other measures, both dimensionless, are provided by spreadsheet or statistical software systems and are called skewness and kurtosis estimates. The notion of skewness was graphically illustrated in Fig. 8-14. These characteristics are population or population model characteristics, and they are defined in terms of moments, $\mu_k$, as described in Chapter 2, Section 2-5:

$$\text{skewness } \beta_3 = \frac{\mu_3}{\sigma^3} \qquad \text{and} \qquad \text{kurtosis } \beta_4 = \frac{\mu_4}{\sigma^4} - 3, \qquad (8\text{-}10)$$

where, in the case of finite populations, the kth central moment is defined as

$$\mu_k = \frac{\sum_{i=1}^{N}(x_i - \mu)^k}{N}. \qquad (8\text{-}11)$$
As discussed earlier, skewness reflects the degree of symmetry about the mean; negative skew results from an asymmetric tail toward smaller values of the variable, while positive skew results from an asymmetric tail extending toward the larger values of the variable. Symmetric variables, such as those described by the normal and uniform distributions, have skewness equal to zero. The exponential distribution, for example, has skewness $\beta_3 = 2$.

Kurtosis describes the relative peakedness of a distribution as compared to a normal distribution, where a negative value is associated with a relatively flat distribution and a positive value is associated with relatively peaked distributions. For example, the kurtosis measure for a uniform distribution is −1.2, while for a normal variable the kurtosis is zero.

If the data being analyzed represent measurements of a variable made on sample units, the sample estimates of $\beta_3$ and $\beta_4$ are as shown in equations 8-12 and 8-13. These values may be calculated from Excel® worksheet functions using the SKEW and KURT functions:

$$\text{skewness } \hat{\beta}_3 = \frac{n}{(n-1)(n-2)}\sum_{i=1}^{n}\left(\frac{x_i - \bar{x}}{s}\right)^3, \qquad n > 2, \qquad (8\text{-}12)$$

$$\text{kurtosis } \hat{\beta}_4 = \frac{n(n+1)}{(n-1)(n-2)(n-3)}\sum_{i=1}^{n}\left(\frac{x_i - \bar{x}}{s}\right)^4 - \frac{3(n-1)^2}{(n-2)(n-3)}, \qquad n > 3. \qquad (8\text{-}13)$$

If the data represent measurements on all the units of a finite population or a census, then equations 8-10 and 8-11 should be utilized directly to determine these measures. For the pull strength data shown in Table 8-1, the worksheet functions return values of 0.865 and 0.161 for the skewness and kurtosis measures, respectively.
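Assuming equations 8-12 and 8-13 are the Excel-style adjusted estimates given above, they translate directly into code; applied to the Table 8-1 pull strengths, the results should be near the 0.865 and 0.161 reported in the text.

```python
import math

def skew_kurt(data):
    """Sample skewness and kurtosis per equations 8-12 and 8-13."""
    n = len(data)
    xbar = sum(data) / n
    s = math.sqrt(sum((x - xbar) ** 2 for x in data) / (n - 1))
    z3 = sum(((x - xbar) / s) ** 3 for x in data)
    z4 = sum(((x - xbar) / s) ** 4 for x in data)
    b3 = n / ((n - 1) * (n - 2)) * z3
    b4 = (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * z4
          - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))
    return b3, b4

# Pull strength values from Table 8-1.
pull = [9.95, 24.45, 31.75, 35.00, 25.02, 16.86, 14.38, 9.60, 24.35,
        27.50, 17.08, 37.00, 41.95, 11.66, 21.65, 17.89, 69.00, 10.30,
        34.93, 46.59, 44.88, 54.12, 56.63, 22.13, 21.15]
print(skew_kurt(pull))
```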

8-4.4 Measuring Association


One measure of association between two numerical variables in sample data is called the Pearson or simple correlation coefficient, and it is usually denoted r. Where data sets contain a number of variables and the correlation coefficient for only one pair of variables, designated x and y, is to be presented, a subscript notation such as $r_{xy}$ may be used to designate the simple correlation between variable x and variable y. The correlation coefficient is

$$r_{xy} = \frac{S_{xy}}{\sqrt{S_{xx}S_{yy}}}, \qquad (8\text{-}14)$$
where $S_{xx}$ is as shown in equation 8-5 and $S_{yy}$ is similarly defined for the y variable, replacing x with y in equation 8-5, while

$$S_{xy} = \sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y}) = \sum_{i=1}^{n} x_i y_i - \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right). \qquad (8\text{-}15)$$

Now we will return to the wire bond pull strength data as presented in Table 8-1, with the scatter plots shown in Fig. 8-3 for both pull strength (variable y) vs. wire length (variable x1) and pull strength vs. die height (variable x2). For the sake of distinguishing between the two correlation coefficients, we let r1 be used for y vs. x1 and r2 for y vs. x2. The correlation coefficient is a dimensionless measure which lies on the interval [−1, +1]. It is a measure of linear association between the two variables of the pair. As the strength of linear association increases, |r| → 1. A positive association means that larger x1 values have larger y values associated with them, and the same is true for x2 and y. In other sets of data, where the larger x values have smaller y values associated with them, the correlation coefficient is negative. When the data reflect no linear association, the correlation is zero. In the case of the pull strength data, r1 = 0.982 and r2 = 0.493. The calculations were made


using Minitab, selecting Stat > Basic Statistics > Correlation. Both are obviously positive. It is important to note, however, that the above result does not imply causality. We cannot claim that increasing x1 or x2 causes an increase in y. This important point is often missed, with the potential for major misinterpretation, as it may well be that a fourth, unobserved variable is the causative variable influencing all the observed variables.

In the case where the data represent measures on variables associated with an entire finite universe, equations 8-14 and 8-15 may be employed after replacing $\bar{x}$ by $\mu_x$, the finite population mean for the x population, and $\bar{y}$ by $\mu_y$, the finite population mean for the y population, while replacing the sample size n in the formulation by N, the universe size. In this case, it is customary to use the symbol $\rho_{xy}$ to represent this finite population correlation coefficient between variables x and y.
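Equations 8-14 and 8-15 also translate directly into code. The sketch below computes r1, the correlation between pull strength and wire length in Table 8-1, which the text reports as 0.982.

```python
import math

def corr(x, y):
    """Simple correlation coefficient per equations 8-14 and 8-15."""
    n = len(x)
    sxy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n
    sxx = sum(a * a for a in x) - sum(x) ** 2 / n
    syy = sum(b * b for b in y) - sum(y) ** 2 / n
    return sxy / math.sqrt(sxx * syy)

wire = [2, 8, 11, 10, 8, 4, 2, 2, 9, 8, 4, 11, 12, 2, 4, 4, 20, 1,
        10, 15, 15, 16, 17, 6, 5]
pull = [9.95, 24.45, 31.75, 35.00, 25.02, 16.86, 14.38, 9.60, 24.35,
        27.50, 17.08, 37.00, 41.95, 11.66, 21.65, 17.89, 69.00, 10.30,
        34.93, 46.59, 44.88, 54.12, 56.63, 22.13, 21.15]
print(corr(wire, pull))     # approximately 0.982
```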

8-4.5 Grouped Data


If the data are in a frequency distribution, it is necessary to modify the computing formulas for the measures of central tendency and dispersion given in Sections 8-4.1 and 8-4.2. Suppose that for each of p distinct values of x, say $x_1, x_2, \ldots, x_p$, the observed frequency is $f_j$. Then the sample mean and sample variance may be computed as

$$\bar{x} = \frac{\sum_{j=1}^{p} f_j x_j}{n} \qquad (8\text{-}16)$$

and

$$s^2 = \frac{\sum_{j=1}^{p} f_j x_j^2 - \frac{1}{n}\Bigl(\sum_{j=1}^{p} f_j x_j\Bigr)^2}{n-1}, \qquad (8\text{-}17)$$

respectively, where $n = \sum_{j=1}^{p} f_j$.
A similar situation arises where original data have been either lost or destroyed but the grouped data have been preserved. In such cases, we can approximate the important data moments by using a convention which assigns each data value the value representing the midpoint of the class interval into which the observation was classified, so that all sample values falling in a particular interval are assigned the same value. This may be done easily, and the resulting approximate mean and variance values are as follows in equations 8-18 and 8-19. If $m_j$ denotes the midpoint of the jth class interval and there are c class intervals, then the sample mean and sample variance are approximately

$$\bar{x} \cong \frac{\sum_{j=1}^{c} f_j m_j}{n} \qquad (8\text{-}18)$$

and

$$s^2 \cong \frac{\sum_{j=1}^{c} f_j m_j^2 - \frac{1}{n}\Bigl(\sum_{j=1}^{c} f_j m_j\Bigr)^2}{n-1}. \qquad (8\text{-}19)$$


Example 8-6
To illustrate the use of equations 8-18 and 8-19, we compute the mean and variance of bursting strength for the data in the frequency distribution of Table 8-3. Note that there are c = 9 class intervals and that $m_1 = 180$, $f_1 = 2$; $m_2 = 200$, $f_2 = 4$; $m_3 = 220$, $f_3 = 7$; $m_4 = 240$, $f_4 = 13$; $m_5 = 260$, $f_5 = 32$; $m_6 = 280$, $f_6 = 24$; $m_7 = 300$, $f_7 = 11$; $m_8 = 320$, $f_8 = 4$; and $m_9 = 340$, $f_9 = 3$. Thus

$$\bar{x} \cong \frac{\sum_{j=1}^{9} f_j m_j}{n} = \frac{26{,}460}{100} = 264.60 \text{ psi}$$

and

$$s^2 \cong \frac{\sum_{j=1}^{9} f_j m_j^2 - \frac{1}{100}\Bigl(\sum_{j=1}^{9} f_j m_j\Bigr)^2}{99} = \frac{7{,}103{,}600 - (26{,}460)^2/100}{99} = 1033.17\ (\text{psi})^2.$$

Notice that these values are very close to the values obtained from the ungrouped data (264.06 and 1025.15).

When the data are grouped in class intervals, it is also possible to approximate the median and mode. The median is approximately

    x̃ ≈ L_M + [(n/2 − T) / f_M] Δ,    (8-20)

where L_M is the lower limit of the class interval containing the median (called the median class), f_M is the frequency in the median class, T is the total of all frequencies in the class intervals preceding the median class, and Δ is the width of the median class. The mode, say MO, is approximately

    MO ≈ L_MO + [a / (a + b)] C,    (8-21)

where L_MO is the lower limit of the modal class (the class interval with the greatest frequency), a is the absolute value of the difference in frequency between the modal class and the preceding class, b is the absolute value of the difference in frequency between the modal class and the following class, and C is the width of the modal class.
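As a sketch of how equations 8-20 and 8-21 are applied in practice, here is a short Python snippet (our own helper names; the class limits 250-270 are assumed from the midpoint 260 and width 20 of the bursting-strength example):

```python
def grouped_median(L_M, n, T, f_M, delta):
    """Approximate median from grouped data, equation 8-20.
    L_M: lower limit of the median class; n: total frequency;
    T: frequency total in classes preceding the median class;
    f_M: frequency in the median class; delta: class width."""
    return L_M + (n / 2 - T) / f_M * delta

def grouped_mode(L_MO, a, b, C):
    """Approximate mode from grouped data, equation 8-21.
    a, b: absolute frequency differences between the modal class
    and the preceding/following classes; C: class width."""
    return L_MO + a / (a + b) * C

# Bursting-strength distribution: the median class is assumed to be
# 250-270 (midpoint 260), with T = 2 + 4 + 7 + 13 = 26 observations below it.
print(grouped_median(250, 100, 26, 32, 20))          # about 265.0 psi
print(grouped_mode(250, abs(32 - 13), abs(32 - 24), 20))  # about 264.1 psi
```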

8-5 SUMMARY

This chapter has provided an introduction to the field of statistics, including the notions of process, universe, population, sampling, and sample results called data. Furthermore, a variety of commonly used displays have been described and illustrated. These were the dot plot, frequency distribution, histogram, stem-and-leaf plot, Pareto chart, box plot, scatter plot, and time plot.

We have also introduced quantitative measures for summarizing data. The mean, median, and mode describe central tendency, or location, while the variance, standard deviation, range, and interquartile range describe dispersion or spread in the data. Furthermore, measures of skewness and kurtosis were presented to describe asymmetry and peakedness, respectively. We also presented the correlation coefficient to describe the strength of linear association between two variables. Subsequent chapters will focus on utilizing sample results to draw inferences about the process or process model or about the universe.

8-6 Exc::"Cises

193

8-6 EXERCISES

8-1. The shelf life of a high-speed photographic film is being investigated by the manufacturer. The following data are available.

Life (days):
126  131  129  132  134  136  130  134
120  116  128  125  138  126  127  120
122  125  111  150  130  148  120  149
117  141  145  162  129  129  147  126
117  143  133  129  140  131  127  133

(a) Construct a frequency distribution and histogram.
(b) Find the sample mean, sample variance, and sample standard deviation.

8-2. The percentage of cotton in a material used to manufacture men's shirts is given below. Construct a histogram for the data. Comment on the properties of the data.

32.6  33.1  34.6  35.9  34.7  33.6  32.9  33.5
35.8  37.6  37.3  34.6  35.5  32.8  32.1  34.5
34.6  33.6  34.1  34.7  35.7  36.8  34.3  32.7
37.8  36.6  35.4  34.6  33.8  37.1  34.0  34.1
34.7  33.6  32.5  34.1  35.1  36.8  37.9  36.4
33.8  34.2  33.4  34.7  34.6  35.2  35.0  34.9
33.6  34.7  35.0  35.4  36.2  36.8  35.1  35.3
34.2  33.1  34.5  35.6  34.3  35.1  34.7  33.6

8-3. The following data represent the yield on 90 consecutive batches of ceramic substrate to which a metal coating has been applied by a vapor-deposition process. Construct a histogram for these data and comment on the properties of the data.

94.1  93.1  97.8  93.1  86.4  94.6  96.3  94.7  87.6  91.1
94.1  93.2  90.6  91.4  88.2  86.1  95.1  90.0  92.4  87.3
87.3  84.1  90.1  95.2  86.1  94.3  93.2  86.7  83.0  95.3
94.1  92.1  96.4  88.2  86.4  85.0  84.9  87.3  89.6  90.3
92.4  90.6  89.1  88.8  86.4  85.1  84.0  93.7  87.7  90.6
96.1  98.0  87.3  86.4  84.5  82.9  94.4  83.7  96.8  97.3
89.4  88.6  84.1  82.6  83.1  86.6  91.2  86.1  90.4  89.1
84.6  85.4  83.6  85.4  89.7  87.6  86.6  85.1  89.6  90.0
90.1  94.3  91.7  87.5  84.2  85.1  90.5  95.6  88.3  84.1

8-4. An electronics company manufactures power supplies for a personal computer. They produce several hundred power supplies each shift, and each unit is subjected to a 12-hour burn-in test. The number of units failing during this 12-hour test each shift is shown below. Construct a histogram and comment on the properties of the data.

5   6   10  5   4   2   4   8   10  10
10  13  14  8   10  12  9   16  11  10
4   5   14  2   2   7   2   4   8   6
10  4   2   4   10  7   14  8   6   4
6   3   2   8   10  9   11  13  12  5
4   6   5   15  4   7   5   3   2   6
11  13  3   13  3   7   3   4   13  12
10  2   5   7   10  4   8   7   6   7
8   4   12  2   2   17  10  2   9   10
14  9   11  7   2   8   3   3   6   5
4   8   10  10  9   2

8-5. Consider the shelf life data in Exercise 8-1. Compute the sample mean, sample variance, and sample standard deviation.

8-6. Consider the cotton percentage data in Exercise 8-2. Find the sample mean, sample variance, sample standard deviation, sample median, and sample mode.


8-7. Consider the yield data in Exercise 8-3. Calculate the sample mean, sample variance, and sample standard deviation.

8-8. An article in Computers and Industrial Engineering (2001, p. 51) describes time-to-failure data (in hours) for jet engines. Some of the data are reproduced below.

Engine #   Failure Time    Engine #   Failure Time
1          150             14         171
2          291             15         197
3          93              16         200
4          53              17         262
5          2               18         255
6          65              19         286
7          183             20         206
8          144             21         179
9          223             22         232
10         197             23         165
11         187             24         155
12         197             25         203
13         213

(a) Construct a frequency distribution and histogram for these data.
(b) Calculate the sample mean, sample median, sample variance, and sample standard deviation.

8-9. For the time-to-failure data in Exercise 8-8, suppose the fifth observation (2 hours) is discarded. Construct a frequency distribution and a histogram for the remaining data, and calculate the sample mean, sample median, sample variance, and sample standard deviation. Compare the results with those obtained in Exercise 8-8. What impact has removal of this observation had on the summary statistics?

8-10. An article in Technometrics (Vol. 19, 1977, p. 425) presents the following data on motor fuel octane ratings of several blends of gasoline:

88.5, 87.7, 83.4, 86.7, 87.5, 91.5, 88.6, 100.3,
95.6, 93.3, 94.7, 91.1, 91.0, 94.2, 87.8, 89.9,
88.3, 87.6, 84.3, 86.7, 88.2, 90.8, 88.3, 98.8,
94.2, 92.7, 93.2, 91.0, 90.3, 93.4, 88.5, 90.1,
89.2, 83.3, 85.3, 87.9, 88.6, 90.9, 89.0, 96.1,
93.3, 91.8, 92.3, 90.4, 90.1, 93.0, 88.7, 89.9,
89.8, 89.6, 87.4, 88.4, 83.9, 91.2, 89.3, 94.4,
92.7, 91.8, 91.6, 90.4, 91.1, 92.6, 89.8, 90.6,
91.1, 90.4, 89.3, 89.7, 90.3, 91.6, 90.5, 93.7,
92.7, 92.2, 92.2, 91.2, 91.0, 92.2, 90.0, 90.7.

(a) Construct a stem-and-leaf plot.
(b) Construct a frequency distribution and histogram.
(c) Calculate the sample mean, sample variance, and sample standard deviation.
(d) Find the sample median and sample mode.
(e) Determine the skewness and kurtosis measures.
8-11. Consider the shelf-life data in Exercise 8-1. Construct a stem-and-leaf plot for these data. Construct an ordered stem-and-leaf plot. Use this plot to find the 65th and 95th percentiles.

8-12. Consider the cotton percentage data in Exercise 8-2.
(a) Construct a stem-and-leaf plot.
(b) Calculate the sample mean, sample variance, and sample standard deviation.
(c) Construct an ordered stem-and-leaf plot.
(d) Find the median and the first and third quartiles.
(e) Find the interquartile range.
(f) Determine the skewness and kurtosis measures.
8-13. Consider the yield data in Exercise 8-3.
(a) Construct an ordered stem-and-leaf plot.
(b) Find the median and the first and third quartiles.
(c) Calculate the interquartile range.

8-14. Construct a box plot for the shelf-life data in Exercise 8-1. Interpret the data using this plot.

8-15. Construct a box plot for the cotton percentage data in Exercise 8-2. Interpret the data using this plot.

8-16. Construct a box plot for the yield data in Exercise 8-3. Compare it to the histogram (Exercise 8-3) and the stem-and-leaf plot (Exercise 8-13). Interpret the data.
8-17. An article in the Electrical Manufacturing & Coil Winding Conference Proceedings (1995, p. 829) presents the results for the number of returned shipments for a record-of-the-month club. The company is interested in the reason for a returned shipment. The results are shown below. Construct a Pareto chart and interpret the data.

Reason             Number of Customers
Refused            195,000
Wrong selection    50,000
Wrong answer       68,000
Canceled           5,000
Other              15,000

&-.6 Exercise'S
8~18. The following table contains the frequency of
occurrence of final letters in an adcle in the Atlcmta
Journal. Construct a histogram from these data. Do
any of the numerical descripto;:s in this :.:hapter have

any meatiliig for these dam?

12

b
c
d

II
Jl
20

25
13
12
12
8
0

11
12

f
g
h

q
r
s
u
v

w
x
Y

8-19. Show the following:

19
13
1
0
15
18
20
0
0
41
0
15

Cal That ,",,' (x;-)=O.


~"."1
(b) That

~
. x/
2:"(x. -xl-'2:'2
[",1

'

,,,,1

8-20. The weight of bearings produced by a forging process is being investigated. A sample of six bearings provided the weights 1.18, 1.21, 1.19, 1.17, 1.20, and 1.21 pounds. Find the sample mean, sample variance, sample standard deviation, and sample median.

8-21. The diameter of eight automotive piston rings is shown below. Calculate the sample mean, sample variance, and sample standard deviation.

74.001 mm    74.003 mm
73.998       74.006
74.005       74.001
74.000       74.002

8-22. The thickness of printed circuit boards is a very important characteristic. A sample of eight boards had the following thicknesses (in thousandths of an inch): 63, 61, 65, 62, 61, 64, 60, and 66. Calculate the sample mean, sample variance, and sample standard deviation. What are the units of measurement for each statistic?

8-23. Coding the Data. Consider the printed circuit board thickness data in Exercise 8-22.
(a) Suppose that we subtract a constant 63 from each number. How are the sample mean, sample variance, and sample standard deviation affected?
(b) Suppose that we multiply each number by 100. How are the sample mean, sample variance, and sample standard deviation affected?


8-24. Coding the Data. Let yi = a + bxi, i = 1, 2, ..., n, where a and b are nonzero constants. Find the relationship between x̄ and ȳ, and between sx and sy.

8-25. Consider the quantity Σ_{i=1}^{n} (xi − a)². For what value of a is this quantity minimized?

8-26. The Trimmed Mean. Suppose that the data are arranged in increasing order, 100α% of the observations are removed from each end, and the sample mean of the remaining numbers is calculated. The resulting quantity is called a trimmed mean. The trimmed mean generally lies between the sample mean x̄ and the sample median x̃ (why?).
(a) Calculate the 10% trimmed mean for the yield data in Exercise 8-3.
(b) Calculate the 20% trimmed mean for the yield data in Exercise 8-3 and compare it with the quantity found in part (a).

8-27. The Trimmed Mean. Suppose that αn is not an integer. Develop a procedure for obtaining a trimmed mean.
8-28. Consider the shelf-life data in Exercise 8-1. Construct a frequency distribution and histogram using a class interval width of 2. Compute the approximate mean and standard deviation from the frequency distribution and compare them with the exact values found in Exercise 8-5.
8-29. Consider the following frequency distribution.
(a) Calculate the sample mean, variance, and standard deviation.
(b) Calculate the median and mode.

xj    10  11  12  13  14  15  16
fj    13  15  19  20  18  15  10

8-30. Consider the following frequency distribution.
(a) Calculate the sample mean, variance, and standard deviation.
(b) Calculate the median and mode.

xj    -4  -3  -2  -1   0   1   2   3   4
fj    60  120 180 200 240 190 160 90  30

8-31. For the two sets of data in Exercises 8-29 and 8-30, compute the sample coefficients of variation.

8-32. Compute the approximate sample mean, sample variance, sample median, and sample mode from the data in the following frequency distribution:


Class Interval    Frequency
10 ≤ x < 20       121
20 ≤ x < 30       165
30 ≤ x < 40       184
40 ≤ x < 50       173
50 ≤ x < 60       142
60 ≤ x < 70       120
70 ≤ x < 80       118
80 ≤ x < 90       110
90 ≤ x < 100      90

8-33. Compute the approximate sample mean, sample variance, median, and mode from the data in the following frequency distribution:

Class Interval    Frequency
-10 ≤ x < 0
0 ≤ x < 10
10 ≤ x < 20       12
20 ≤ x < 30       16
30 ≤ x < 40       9
40 ≤ x < 50       4
50 ≤ x < 60       2

8-34. Compute the approximate sample mean, sample standard deviation, sample variance, median, and mode for the data in the following frequency distribution:

Class Interval      Frequency
600 ≤ x < 650       41
650 ≤ x < 700       46
700 ≤ x < 750       50
750 ≤ x < 800       52
800 ≤ x < 850       60
850 ≤ x < 900       64
900 ≤ x < 950       65
950 ≤ x < 1000      70
1000 ≤ x < 1050     72

8-35. An article in the International Journal of Industrial Ergonomics (1999, p. 483) describes a study conducted to determine the relationship between exhaustion time and distance covered until exhaustion for several wheelchair exercises performed on a 400-m outdoor track. The times and distances for ten participants are as follows:

Exhaustion Time    Distance Covered until
(in seconds)       Exhaustion (in meters)
610                1373
310                698
720                1440
990                2228
1820               4550
475                713
890                2003
390                488
745                1118
885                1991

Determine the simple (Pearson) correlation between time and distance. Interpret your results.

8-36. An electric utility that serves 850,332 residential customers on May 1, 2002, selects a sample of 120 customers randomly and installs a time-of-use meter or "load research meter" at each selected residence. At the time of installation, the technician also records the residence size (sq. ft.). During the summer peak demand period, the company experienced peak demand at 5:31 P.M. on July 30, 2002. July bills for usage (kwh) were sent to all customers, including those in the sample group. Due to cyclical billing, the July bills do not reflect the same time-of-use period for all customers; however, each customer is assigned a usage for July billing. The time-of-use meters have memory, and they record time-specific demand (kw) by time interval, so for the sampled customers, average 15-minute demand is available for the time interval 5:00-5:45 P.M. on July 30, 2002. The data file Load data contains the kwh, kw, and sq. ft. data for each of the 120 sampled residences. This file is available at www.wiley.com/college/hines. Using Minitab or other software, do the following:
(a) Construct scatter plots of
  1. kw vs. sq. ft. and
  2. kw vs. kwh,
and comment on the observed displays, specifically in regard to the nature of the association in parts 1 and 2 above, and, for each part, the general pattern of observed variation in kw across the range of sq. ft. and the range of kwh.
(b) Construct the three-dimensional plot of kw vs. kwh and sq. ft.
(c) Construct histograms for the kwh, kw, and sq. ft. data, and comment on the patterns observed.
(d) Construct a stem-and-leaf plot for the kwh data and compare it to the histogram for the kwh data.
(e) Determine the sample mean, median, and mode for kwh, kw, and sq. ft.
(f) Determine the sample standard deviation for kwh, kw, and sq. ft.
(g) Determine the first and third quartiles for kwh, kw, and sq. ft.
(h) Determine the skewness and kurtosis measures for kwh, kw, and sq. ft., and compare to the measures for a normal distribution.
(i) Determine the simple (Pearson) correlation between kw and sq. ft. and between kw and kwh. Interpret.

Chapter 9

Random Samples and Sampling Distributions

In this chapter, we begin our study of statistical inference. Recall that statistics is the science of drawing conclusions about a population based on an analysis of sample data from that population. There are many different ways to take a sample from a population. Furthermore, the conclusions that we can draw about the population often depend on how the sample is selected. Generally, we want the sample to be representative of the population. One important method of selecting a sample is random sampling. Most of the statistical techniques that we present in the book assume that the sample is a random sample. In this chapter we will define a random sample and introduce several probability distributions useful in analyzing the information in sample data.

9-1 RANDOM SAMPLES

To define a random sample, let X be a random variable with probability distribution f(x). Then the set of n observations X1, X2, ..., Xn, taken on the random variable X and having numerical outcomes x1, x2, ..., xn, is called a random sample if the observations are obtained by observing X independently under unchanging conditions n times. Note that the observations X1, X2, ..., Xn in a random sample are independent random variables with the same probability distribution f(x). That is, the marginal distributions of X1, X2, ..., Xn are f(x1), f(x2), ..., f(xn), respectively, and by independence the joint probability distribution of the random sample is

    g(x1, x2, ..., xn) = f(x1) · f(x2) · ... · f(xn).    (9-1)

Definition
X1, X2, ..., Xn is a random sample of size n if (a) the Xi are independent random variables, and (b) every observation Xi has the same probability distribution.

To illustrate this definition, suppose that we are investigating the bursting strength of glass, 1-liter soft drink bottles, and that bursting strength in the population of bottles is normally distributed. Then we would expect each of the observations on bursting strength X1, X2, ..., Xn in a random sample of n bottles to be independent random variables with exactly the same normal distribution.

It is not always easy to obtain a random sample. Sometimes we may use tables of uniform random numbers. At other times, the engineer or scientist cannot easily use formal procedures to help ensure randomness and must rely on other selection methods. A judgment sample is one chosen from the population by the judgment of an individual. Since the accuracy and statistical behavior of judgment samples cannot be described, they should be avoided.

Suppose we ",ish to take a random sample of five batches of raw material out of 25 available hatches.
We may n1l!llber the batches 'With the integers 1 to 25. Now, using Table XV of the Appendix, arbi~
tra."ily choose a row and colum."l as a st.arting point Read do';O.'D the chosen column, obtaining rNO digits at a time, until five acceptable numbers are found (an acceptable number lies be!ween 1 and 25),

To illustrate, suppose the above process gives t:s a sequence ofnumben5 that rea.ds 37, ~8, 55,02.17,
61.70.43,21.82.13.13,60.25. The bold numbe;s spec:Iy which batches of raw material arc to be

chosen as the random s<L'llp:e.

We first present some specifics on sampling from finite universes. Subsequent sections will describe various sampling distributions. Sections marked by an asterisk may be omitted without loss of continuity.

*9-1.1 Simple Random Sampling from a Finite Universe

When sampling n items without replacement from a universe of size N, there are C(N, n) = N!/[n!(N − n)!] possible samples, and if the selection probabilities are πk = 1/C(N, n), for k = 1, 2, ..., C(N, n), then this is simple random sampling. Note that each universe unit appears in exactly C(N − 1, n − 1) of the possible samples, so each unit has an inclusion probability of C(N − 1, n − 1)/C(N, n) = n/N.

As will be seen later, sampling without replacement is more "efficient" than sampling with replacement for estimating the finite population mean or total; however, we briefly discuss simple random sampling with replacement for a basis of comparison. In this case, there are Nⁿ possible samples, and we select each with probability πk = 1/Nⁿ, for k = 1, 2, ..., Nⁿ. In this situation, a universe unit may appear in no samples or in as many as n, so the notion of inclusion probability is less meaningful; however, if we consider the probability that a specific unit will be selected at least once, that is obviously 1 − (1 − 1/N)ⁿ, since for each unit the probability of selection on a given observation is 1/N, a constant, and the n selections are independent, so that these observations may be considered Bernoulli trials.

Example 9-2
Consider a universe consisting of five units numbered 1, 2, 3, 4, 5. In sampling without replacement, we will employ a sample of size two and enumerate the possible samples as

(1,2), (1,3), (1,4), (1,5), (2,3), (2,4), (2,5), (3,4), (3,5), (4,5).

Note that there are C(5, 2) = 10 possible samples. If we select one from these, where each has an equal selection probability of 0.1 assigned, that is simple random sampling. Consider these possible samples to be numbered 1, 2, ..., 0, where 0 represents number 10. Now, go to Table XV (in the Appendix), showing random integers, and with eyes closed place a finger down. Read the first digit from the five-digit integer presented. Suppose we pick the integer which is in row 7 of column 4. The first digit is 6, so the sample consists of units 2 and 4. An alternative to using the table is to cast a single icosahedron die and pick the sample corresponding to the outcome. Notice also that each unit appears in exactly four of the possible samples, and thus the inclusion probability for each unit is 0.4, which is simply n/N = 2/5.

*May be omitted on first reading.


It is usually not feasible to enumerate the set of possible samples. For example, if N = 100 and n = 25, there would be more than 2.43 × 10²³ possible samples, so other selection procedures which maintain the properties described in the definition must be employed. The most commonly used procedure is to first number the universe units from 1 to N, then use realizations from a random number process to sequentially pick numbers from 1 to N, discarding duplicates until n units have been picked if we are sampling without replacement, keeping duplicates if sampling is with replacement. Recall from Chapter 6 that the term random numbers is used to describe a sequence of mutually independent variables U1, U2, ..., which are identically distributed as uniform on [0, 1). We employ a realization u1, u2, ..., which in sequentially selecting units as members of the sample is roughly outlined as

    (Unit Number)_i = ⌊N uj⌋ + 1,    j = 1, 2, ..., J,  i = 1, 2, ..., n,  i ≤ j,

where J is the trial number on which the nth, or final, unit is selected. In sampling without replacement, J ≥ n, and ⌊·⌋ is the greatest integer contained function. When sampling with replacement, J = n.
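A minimal Python sketch of this selection procedure (our own code, using numpy's generator to supply the uniform realizations):

```python
import numpy as np

def simple_random_sample(N, n, rng, with_replacement=False):
    """Sequentially pick unit numbers in 1..N from uniform realizations,
    discarding duplicates when sampling without replacement."""
    chosen = []
    while len(chosen) < n:
        u = rng.uniform()          # realization of U ~ uniform on [0, 1)
        unit = int(N * u) + 1      # (Unit Number) = floor(N * u) + 1
        if with_replacement or unit not in chosen:
            chosen.append(unit)
    return chosen

rng = np.random.default_rng(seed=1)
print(simple_random_sample(25, 5, rng))  # e.g., five batches out of 25
```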

*9-1.2 Stratified Random Sampling of a Finite Universe

In finite-universe sampling, sometimes an explanatory or auxiliary variable(s) is available that has known value(s) for each unit of the universe. These may be either numerical or categorical variables or both. If we are able to use the auxiliary variables to establish criteria for assigning each universe unit to exactly one of the resulting strata before sampling begins, then simple random samples may be selected within each stratum, and the sampling is independent from stratum to stratum, which allows us to later combine, with appropriate weights, stratum "statistics" such as the means and variances obtained from the various strata. In general, this is an "efficient" scheme in the sense of estimation of population means and totals if, following the classification, the variance in the variable we seek to measure is small within strata while the differences in the stratum mean values are large. If L strata are formed, then the stratum sizes are N1, N2, ..., NL, and N1 + N2 + ... + NL = N, while the sample sizes are n1, n2, ..., nL, and n1 + n2 + ... + nL = n. It is noted that the inclusion probabilities are constant within strata, as nh/Nh for stratum h, but they may differ greatly across strata. Two commonly used methods for allocating the overall sample to strata are proportional allocation and Neyman optimal allocation, where proportional allocation is

    nh = n(Nh/N),    h = 1, 2, ..., L,    (9-2)

and Neyman allocation is

    nh = n · (Nh σh) / (Σ_{h=1}^{L} Nh σh),    h = 1, 2, ..., L.    (9-3)

The values σh are standard deviation values within strata, and when designing the sampling study, they are usually unknown for the variables to be observed in the study; however, they may often be calculated for an explanatory or auxiliary variable where at least one of these is numeric, and if there is a "reasonable" correlation, |ρ| > 0.6, between the variable to be measured and such an auxiliary variable, then these surrogate standard deviation values will produce an allocation which will be reasonably close to optimal.
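To make equations 9-2 and 9-3 concrete, here is a small Python sketch (helper names are ours) that computes both allocations; the stratum sizes and claim-amount standard deviations are those of Example 9-3 (Table 9-1, below):

```python
def proportional_allocation(n, N_h):
    # Equation 9-2: n_h = n * (N_h / N)
    N = sum(N_h)
    return [round(n * Nh / N) for Nh in N_h]

def neyman_allocation(n, N_h, sigma_h):
    # Equation 9-3: n_h = n * N_h*sigma_h / sum(N_h*sigma_h)
    products = [Nh * sh for Nh, sh in zip(N_h, sigma_h)]
    total = sum(products)
    return [round(n * p / total) for p in products]

N_h = [132365, 41321, 10635, 96422, 31869, 6163, 82332, 33793, 6457]
sigma_h = [42, 255, 6781, 31, 210, 5128, 57, 310, 7674]
print(proportional_allocation(1000, N_h))
print(neyman_allocation(1000, N_h, sigma_h))
# Close to the allocations in Table 9-1; individual strata may differ by 1
# because of how fractional allocations are rounded.
```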


Example 9-3
In seeking to address growing consumer concerns regarding the quality of claim processing, a national managed health care/health insurance company has identified several characteristics to be monitored on a monthly basis. The most important of these to the company are, first, the size of the mean "financial error," which is the absolute value of the error (overpay or underpay) for a claim, and, second, the fraction of correctly filed claims paid to providers within 14 days. The three processing centers are eastern, mid-America, and western. Together, these centers process about 450,000 claims per month, and it has been observed that they differ in accuracy and timeliness. Furthermore, the correlation between the total dollar amount of the claim and the financial error overall is historically about 0.65-0.80. If a sample of 1000 claims is to be drawn monthly, strata might be formed using centers E, MA, and W and total claim sizes as $0-$200, $201-$900, and $901-$max claim. Therefore there would be nine strata. And if for a given month there were 441,357 claims, these may be easily assigned to strata, since they are naturally grouped by center, and each center uses the same processing system to record data by claim amount, thus allowing ease of classification by claim size. At this point, the 1000-unit planned sample is allocated to the nine strata as shown in Table 9-1. The allocation shown first in each cell (in italics in the original) employs proportional allocation, while the second one uses "optimal" allocation. The σ values represent the standard deviation in the claim-amount metric within the stratum, since the standard deviation in "financial error" is unknown. After deciding on the allocation to use, nine independent, simple random samples, without replacement, would be selected, yielding 1000 claim forms to be inspected. Stratum identifying subscripts are not shown in Table 9-1.

It is noted that proportional allocation results in an equal inclusion probability for all units, not just within strata, while an optimal allocation draws larger samples from strata in which the product of the internal variability as measured by standard deviation (often in an auxiliary variable) and the number of units in the stratum is large. Thus, inclusion probabilities are the same for all units assigned to a stratum but may differ greatly across strata.

9-2 STATISTICS AND SAMPLING DISTRIBUTIONS

A statistic is any function of the observations in a random sample that does not depend on unknown parameters.

Table 9-1  Data for Health Insurance Example 9-3

                 Claim Amount
Center   $0-$200           $201-$900         $901-$Max Claim    All
E        N = 132,365       N = 41,321        N = 10,635         N = 184,321
         σ = $42           σ = $255          σ = $6781
         n = 300/29        n = 94/54         n = 24/371         n = 418/454
MA       N = 96,422        N = 31,869        N = 6,163          N = 134,454
         σ = $31           σ = $210          σ = $5128
         n = 218/15        n = 72/35         n = 14/163         n = 304/213
W        N = 82,332        N = 33,793        N = 6,457          N = 122,582
         σ = $57           σ = $310          σ = $7674
         n = 187/24        n = 76/54         n = 15/255         n = 278/333
All      N = 311,119       N = 106,983       N = 23,255         N = 441,357
         n = 705/68        n = 242/143       n = 53/789         n = 1000/1000

(In each cell the first n is the proportional allocation; the second is the Neyman "optimal" allocation.)


The process of drawing conclusions about populations based on sample data makes considerable use of statistics. The procedures require that we understand the probabilistic behavior of certain statistics. In general, we call the probability distribution of a statistic a sampling distribution. There are several important sampling distributions that will be used extensively in subsequent chapters. In this section, we define and briefly illustrate these sampling distributions. First, we give some relevant definitions and additional motivation.

A statistic is now defined as a value determined by a function of the values observed in a sample. For example, if X1, X2, ..., Xn represent values to be observed in a probability sample of size n on a single random variable X, then X̄ and S², as described in Chapter 8, are statistics. Furthermore, the same is true for the median, the mode, the sample range, the sample skewness measure, and the sample kurtosis. Note that capital letters are used here as this reference is to random variables, not specific numerical results, as was the case in Chapter 8.

9-2.1 Sampling Distributions

Definition
The sampling distribution of a statistic is the density function or probability function that describes the probabilistic behavior of the statistic in repeated sampling from the same universe or on the same process variable assignment model.

Examples have been presented earlier, in Chapters 5-7. Recall that random sampling with sample size n on a process variable X provides results X1, X2, ..., Xn, which are mutually independent random variables, all with a common distribution function. Thus, the sample mean, X̄, is a linear combination of n independent variables. If E(X) = μ and V(X) = σ², then recall that E(X̄) = μ and V(X̄) = σ²/n. And if X is a measurement variable, the density function of X̄ is the sampling distribution of this statistic.

The form of a sampling distribution depends on the stability assumption as well as on the form of the process variable model. In Chapter 7, we observed that if X ~ N(μ, σ²), then X̄ ~ N(μ, σ²/n), and this is the sampling distribution of X̄ for n ≥ 1. Now, if the process variable assignment model takes some form other than the normal model illustrated here (e.g., the exponential model), and if all stationarity assumptions hold, then mathematical analysis may yield a closed form for the sampling distribution of X̄. In this example case, if we employ an exponential process model for X, the resulting sampling distribution for X̄ is a form of the gamma distribution. In Chapter 7, the important Central Limit Theorem was presented. Recall that if the moment-generating function M_X(t) exists for all t, and if

    Zn = (X̄ − μ)/(σ/√n),  then  lim_{n→∞} Fn(z) = Φ(z),    (9-4)

where Fn(z) is the cumulative distribution function of Zn and Φ(z) is the CDF for the standard normal variable, Z. Simply stated, as n → ∞, Zn approaches an N(0, 1) random variable, and this result has enormous utility in applied statistics. However, in applied work, a question arises regarding how large the sample size n must be to employ the N(0, 1) model as the sampling distribution of Zn, or, equivalently stated, to describe the sampling distribution of X̄ as N(μ, σ²/n). This is an important question, as the exact form of the process variable assignment model is usually unknown. Furthermore, any response must be conditioned, even when it is based on simulation evidence, experience, or "accepted practice." Assuming process stability, a general suggestion is that if the skewness measure is close to zero (implying that the variable assignment model is symmetric or very nearly so), then X̄ approaches normality


quickly, say for n ≥ 10, but this depends also on the standardized kurtosis of X. For example, if β1 ≈ 0 and |β2| < 0.75, then a sample size of n = 5 may be quite adequate for many applications. However, it should be noted that the tail behavior of X̄ may deviate somewhat from that predicted by a normal model.

Where there is considerable skew present, application of the Central Limit Theorem in describing the sampling distribution must be interpreted with care. A "rule of thumb" that has been successfully used in survey sampling, where such variable behavior is common (β1 > 0), is that

    n > 25(β1)².    (9-5)

For instance, returning to an exponential process variable assignment model, this rule suggests that a sample size of n > 100 is required for employing a normal distribution to describe the behavior of X̄, since β1 = 2.
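The rule of thumb in equation 9-5 is easy to explore by simulation. The sketch below (our own, not from the text) draws repeated samples from an exponential model, for which β1 = 2, and compares an upper percentile of the standardized sample mean with the N(0, 1) value:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=42)
mu, sigma = 1.0, 1.0              # exponential with rate 1: mean 1, sd 1
for n in (10, 100):
    zs = []
    for _ in range(20000):
        x = rng.exponential(scale=1.0, size=n)
        zs.append((x.mean() - mu) / (sigma / np.sqrt(n)))
    # Compare the simulated upper 5% point of Z_n with the normal 1.645;
    # the agreement should be visibly better at n = 100 than at n = 10.
    print(n, np.quantile(zs, 0.95), stats.norm.ppf(0.95))
```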

Definition
The standard error of a statistic is the standard deviation of its sampling distribution. If the standard error involves unknown parameters whose values can be estimated, substitution of these estimates into the standard error results in an estimated standard error.

To illustrate this definition, suppose we are sampling from a normal distribution with mean μ and variance σ². Now the distribution of X̄ is normal with mean μ and variance σ²/n, and so the standard error of X̄ is

    σ_X̄ = σ/√n.

If we did not know σ but substituted the sample standard deviation S into the above, then the estimated standard error of X̄ is

    σ̂_X̄ = S/√n.

Example 9-4
Suppose we collect data on the tension bond strength of a modified portland cement mortar. The ten observations are

16.85, 16.40, 17.21, 16.35, 16.52, 17.04, 16.96, 17.15, 16.59, 16.57,

where tension bond strength is measured in units of kgf/cm². We assume that the tension bond strength is well described by a normal distribution. The sample average is

    x̄ = 16.76 kgf/cm².

First, suppose we know (or are willing to assume) that the standard deviation of tension bond strength is σ = 0.25 kgf/cm². Then the standard error of the sample average is

    σ/√n = 0.25/√10 = 0.079 kgf/cm².

If we are unwilling to assume that σ = 0.25 kgf/cm², we could use the sample standard deviation s = 0.316 kgf/cm² to obtain the estimated standard error as follows:

    s/√n = 0.316/√10 = 0.0999 kgf/cm².
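A quick check of these computations in Python (our own snippet, using only the data listed in Example 9-4):

```python
import math

strength = [16.85, 16.40, 17.21, 16.35, 16.52,
            17.04, 16.96, 17.15, 16.59, 16.57]   # kgf/cm^2
n = len(strength)
xbar = sum(strength) / n
s = math.sqrt(sum((x - xbar) ** 2 for x in strength) / (n - 1))

print(round(xbar, 2))                 # 16.76
print(round(0.25 / math.sqrt(n), 3))  # 0.079, standard error with sigma known
print(round(s / math.sqrt(n), 4))     # about 0.10 (the text's 0.0999 uses s = 0.316)
```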


*9-2.2 Finite Populations and Enumerative Studies

In the case where sampling may be conceptually repeated on the same finite universe of units, the sampling distribution of X̄ is interpreted in a manner similar to that of sampling a process, except the issue of process stability is not a concern. Any inference to be drawn is about the specific collection of unit population values. Ordinarily, in such situations, sampling is without replacement, as this is more efficient. The general notion of expectation is different in such studies in that the expected value of a statistic θ̂ is defined as

    E(θ̂) = Σ_{k=1}^{C(N,n)} πk θ̂k,    (9-6)

where θ̂k is the value of the statistic if possible sample k is selected. In simple random sampling, recall that πk = 1/C(N, n).

Now in the case of the sample mean statistic, X̄, we have E(X̄) = μX, where μX is the finite population mean of the random variable X. Also, under simple random sampling without replacement,

    V(X̄) = (σ²_N / n) · (N − n)/(N − 1),    (9-7)

where σ²_N is as defined in equation 8-7. The ratio n/N is called the sampling fraction, and it represents the fraction of the population measures to be included in the sample. Concise proofs of the results shown in equations 9-6 and 9-7 are given by Cochran (1977, p. 22). Oftentimes, in studies of this sort, the objective is to estimate the population total (see equation 8-2) as well as the mean. The "mean per unit," or mpu, estimate of the total is simply T̂X = N X̄, and the variance of this statistic is obviously V(T̂X) = N² V(X̄). While necessary and sufficient conditions for the distribution of X̄ to approach normality have been developed, these are of little practical utility, and the rule given by equation 9-5 has been widely employed.

In sampling with replacement, the corresponding quantities are E(X̄) = μX and V(X̄) = σ²_N/n.

Where stratification has been employed in a finite population enumeration study, there are two statistics of common interest. These are the aggregate sample mean and the estimate of the population total. The mean is given by

    X̄st = Σ_{h=1}^{L} Wh X̄h,    (9-8)

and the estimate of the population total is

    T̂X = N X̄st = Σ_{h=1}^{L} Nh X̄h.    (9-9)

In these formulations, X̄h is the sample mean for stratum h, and Wh, given by (Nh/N), is called the stratum weight for stratum h. Note that both of these statistics are expressed as simple linear combinations of the independent stratum statistics X̄h, and both are mpu estimators. The variance of these statistics is given by

    V(X̄st) = Σ_{h=1}^{L} Wh² (σ²_{Nh}/nh) · (Nh − nh)/(Nh − 1)    (9-10)

and V(T̂X) = N² V(X̄st).

The within-stratum variance terms, σ²_{Nh}, may be estimated by the sample variance terms, S²h, for the respective strata. The sampling distributions of the aggregate mean and total estimate statistics for such stratified, enumeration studies are stated only for situations where sample sizes are large enough to employ the limiting normality indicated by the Central Limit Theorem. Note that these statistics are linear combinations across observations within strata and across strata. The result is that in such cases we take, approximately,

    X̄st ~ N(μX, V(X̄st))    (9-11)

and

    T̂X ~ N(TX, N² V(X̄st)).
Example 9-5
Suppose the sampling plan in Example 9-3 utilizes the "optimal" allocation as shown in Table 9-1. The stratum sample means and sample standard deviations on the financial-error variable are calculated, with the results shown in Table 9-2, where the units of measurement are error $/claim. Then, utilizing the results presented in equations 9-8 and 9-9, the aggregate statistics when evaluated are x̄st = $18.36 and T̂x = $8,103,532. Utilizing equation 9-10 and employing the within-stratum sample variance values S²h to estimate the within-stratum variances σ²_{Nh}, the estimates for the variance in the sampling distribution of the sample mean and the estimate of total are V̂(X̄st) ≈ 0.574 and V̂(T̂x) ≈ 1.118 × 10¹¹, and the estimates of the respective standard errors are thus $0.758 and $334,408 for the mean and total estimator distributions.
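The aggregate computations in Example 9-5 can be reproduced from the stratum means in Table 9-2 with a few lines of Python (a sketch of ours; variable names are not from the text):

```python
# Stratum sizes and mean financial errors ($/claim) from Table 9-2.
N_h = [132365, 41321, 10635, 96422, 31869, 6163, 82332, 33793, 6457]
xbar_h = [6.25, 34.10, 91.65, 5.30, 22.00, 72.00, 10.52, 46.28, 124.91]

N = sum(N_h)                                  # 441,357 claims
W_h = [Nh / N for Nh in N_h]                  # stratum weights, equation 9-8
xbar_st = sum(W * xb for W, xb in zip(W_h, xbar_h))
total = N * xbar_st                           # mpu estimate, equation 9-9

print(round(xbar_st, 2))   # 18.36  ($ per claim)
print(round(total))        # about 8,103,532  (estimated total financial error, $)
```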

9-3 THE CHI-SQUARE DISTRIBUTION

Many other useful sampling distributions can be defined in terms of normal random variables. The chi-square distribution is defined below.

Table 9-2  Sample Statistics for Health Insurance Example

                 Claim Amount
Center   $0-$200              $201-$900              $901-$Max Claim          All
E        N = 132,365          N = 41,321             N = 10,635               N = 184,321
         n = 29               n = 54                 n = 371                  n = 454
         x̄ = 6.25             x̄ = 34.10              x̄ = 91.65
MA       N = 96,422           N = 31,869             N = 6,163                N = 134,454
         n = 15               n = 35                 n = 163                  n = 213
         x̄ = 5.30             x̄ = 22.00, s = 16.39   x̄ = 72.00, s = 56.67
W        N = 82,332           N = 33,793             N = 6,457                N = 122,582
         n = 24               n = 54                 n = 255                  n = 333
         x̄ = 10.52, s = 9.86  x̄ = 46.28, s = 31.23   x̄ = 124.91, s = 109.42
All      N = 311,119          N = 106,983            N = 23,255               N = 441,357
         n = 68               n = 143                n = 789                  n = 1,000


Theorem 9-1
Let Z1, Z2, ..., Zk be normally and independently distributed random variables with mean μ = 0 and variance σ² = 1. Then the random variable

    χ²k = Z1² + Z2² + ... + Zk²

has the probability density function

    f(u) = \frac{1}{2^{k/2}\Gamma(k/2)} u^{(k/2)-1} e^{-u/2},  u > 0,
         = 0,  otherwise,    (9-12)

and is said to follow the chi-square distribution with k degrees of freedom, abbreviated χ²k. For the proof of Theorem 9-1, see Exercises 7-47 and 7-48.

The mean and variance of the χ²k distribution are

    μ = k    (9-13)

and

    σ² = 2k.    (9-14)

Several chi-square distributions are shown in Fig. 9-1. Note that the chi-square random variable is nonnegative and that the probability distribution is skewed to the right. However, as k increases, the distribution becomes more symmetric. As k → ∞, the limiting form of the chi-square distribution is the normal distribution.

The percentage points of the χ²k distribution are given in Table III of the Appendix. Define χ²_{α,k} as the percentage point or value of the chi-square random variable with k degrees of freedom such that the probability that χ²k exceeds this value is α. That is,

    P{χ²k ≥ χ²_{α,k}} = ∫_{χ²_{α,k}}^{∞} f(u) du = α.

Figure 9-1  Several χ² distributions.


This probability is shown as the shaded area in Fig. 9-2. To illustrate the use of Table III, note that

    P{χ²10 ≥ χ²_{0.05,10}} = P{χ²10 ≥ 18.31} = 0.05.

That is, the 5% point of the chi-square distribution with 10 degrees of freedom is χ²_{0.05,10} = 18.31.
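Software can substitute for Table III; for instance, using scipy (a snippet of ours, not from the text):

```python
from scipy import stats

# Upper 5% point of the chi-square distribution with 10 degrees of freedom:
print(stats.chi2.isf(0.05, df=10))   # 18.307..., i.e., chi-square_{0.05,10} = 18.31

# Equivalently, via the inverse CDF:
print(stats.chi2.ppf(0.95, df=10))   # same value
```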
Like the normal distribution, the chi-square distribution has an important reproductive property.

Theorem 9-2  Additivity Theorem of Chi-Square

Let χ²1, χ²2, ..., χ²p be independent chi-square random variables with k1, k2, ..., kp degrees of freedom, respectively. Then the quantity

    Y = χ²1 + χ²2 + ... + χ²p

follows the chi-square distribution with degrees of freedom equal to

    k = Σ_{i=1}^{p} ki.

Proof  Note that each chi-square random variable χ²i can be written as the sum of the squares of ki independent standard normal random variables, say

    χ²i = Σ_{j=1}^{ki} Z²ij,    i = 1, 2, ..., p.

Therefore,

    Y = Σ_{i=1}^{p} χ²i = Σ_{i=1}^{p} Σ_{j=1}^{ki} Z²ij,

and since all the random variables Zij are independent because the χ²i are independent, Y is just the sum of the squares of k = Σ_{i=1}^{p} ki independent standard normal random variables. From Theorem 9-1, it follows that Y is a chi-square random variable with k degrees of freedom.

Figure 9-2  Percentage point χ²_{α,k} of the chi-square distribution.

Example 9-6
As an example of a statistic that follows the chi-square distribution, suppose that X1, X2, ..., Xn is a random sample from a normal population with mean μ and variance σ². The function of the sample variance

    (n − 1)S²/σ²

is distributed as χ²_{n−1}. We will use this random variable extensively in Chapters 10 and 11. We will see in those chapters that because the distribution of this random variable is chi-square, we can construct confidence interval estimates and test statistical hypotheses about the variance of a normal population.
To illustrate heuristically why the distribution of the random variable in Example 9-6, (n − 1)S²/σ², is chi-square, note that

    (n − 1)S²/σ² = Σ_{i=1}^{n} (Xi − X̄)²/σ².    (9-15)

If X̄ in equation 9-15 were replaced by μ, then the distribution of

    Σ_{i=1}^{n} (Xi − μ)²/σ²

is χ²n, because each term (Xi − μ)²/σ² is the square of an independent standard normal random variable. Now consider the following:

    Σ_{i=1}^{n} (Xi − μ)² = Σ_{i=1}^{n} [(Xi − X̄) + (X̄ − μ)]²
                          = Σ_{i=1}^{n} (Xi − X̄)² + Σ_{i=1}^{n} (X̄ − μ)² + 2(X̄ − μ) Σ_{i=1}^{n} (Xi − X̄)
                          = Σ_{i=1}^{n} (Xi − X̄)² + n(X̄ − μ)².

Therefore, dividing through by σ²,

    Σ_{i=1}^{n} (Xi − μ)²/σ² = (n − 1)S²/σ² + (X̄ − μ)²/(σ²/n).    (9-16)

Since X̄ is normally distributed with mean μ and variance σ²/n, the quantity (X̄ − μ)²/(σ²/n) is distributed as χ²1. Furthermore, it can be shown that the random variables X̄ and S² are independent. Therefore, since Σ_{i=1}^{n} (Xi − μ)²/σ² is distributed as χ²n, it seems logical to use the additivity property of the chi-square distribution (Theorem 9-2) and conclude that the distribution of (n − 1)S²/σ² is χ²_{n−1}.
9-4 THE t DISTRIBUTION

Another important sampling distribution is the t distribution, sometimes called the Student t distribution.

Theorem 9-3
Let Z ~ N(0, 1) and V be a chi-square random variable with k degrees of freedom. If Z and V are independent, then the random variable

    T = Z/√(V/k)

has the probability density function

    f(t) = \frac{\Gamma[(k+1)/2]}{\sqrt{\pi k}\,\Gamma(k/2)} \cdot [(t^2/k) + 1]^{-(k+1)/2},  −∞ < t < ∞,    (9-17)

and is said to follow the t distribution with k degrees of freedom, abbreviated tk.
Proof  Since Z and V are independent, their joint density function is

    f(z, v) = \frac{v^{(k/2)-1}}{\sqrt{2\pi}\, 2^{k/2}\Gamma(k/2)} e^{-(z^2+v)/2},  −∞ < z < ∞, 0 < v < ∞.

Using the method of Section 4-10, we define a new random variable U = V. Thus, the inverse solutions of t = z/√(v/k) and u = v are

    z = t√(u/k)  and  v = u.

The Jacobian of this transformation is

    J = √(u/k),

and so the joint probability density function of T and U is

    g(t, u) = \frac{u^{(k/2)-1}\sqrt{u/k}}{\sqrt{2\pi}\, 2^{k/2}\Gamma(k/2)} e^{-(u/2)[(t^2/k)+1]},  −∞ < t < ∞, 0 < u < ∞.    (9-18)

Now, since v > 0 we must require u > 0, and since −∞ < z < ∞, we have −∞ < t < ∞. Because f(t) = ∫_{0}^{∞} g(t, u) du, integrating out u gives

    f(t) = \frac{\Gamma[(k+1)/2]}{\sqrt{\pi k}\,\Gamma(k/2)} [(t^2/k) + 1]^{-(k+1)/2},  −∞ < t < ∞,

which is equation 9-17, completing the proof.
Primarily because of historical usage, many authors make no distinction between the random variable T and the symbol t. The mean and variance of the t distribution are μ = 0 and σ² = k/(k − 2) for k > 2, respectively. Several t distributions are shown in Fig. 9-3. The general appearance of the t distribution is similar to the standard normal distribution, in that both distributions are symmetric and unimodal, and the maximum ordinate value is reached at the mean μ = 0. However, the t distribution has heavier tails than the normal; that is, it has more probability further out. As the number of degrees of freedom k → ∞, the limiting form of the t distribution is the standard normal distribution. In visualizing the t distribution it is sometimes useful to know that the ordinate of the density at the mean μ = 0 is approximately four to five times larger than the ordinate at the 5th and 95th percentiles. For example, with 10 degrees of freedom for t this ratio is 4.8, with 20 degrees of freedom this factor is 4.3, and with 30 degrees of freedom this factor is 4.1. By comparison, for the normal distribution, this factor is 3.9.
The percentage points of the t distribution are given in Table IV of the Appendix. Let t_{α,k} be the percentage point or value of the t random variable with k degrees of freedom such that

    P{T ≥ t_{α,k}} = ∫_{t_{α,k}}^{∞} f(t) dt = α.

This percentage point is illustrated in Fig. 9-4. Note that since the t distribution is symmetric about zero, we find t_{1−α,k} = −t_{α,k}. This relationship is useful, since Table IV gives only upper-tail percentage points, that is, values of t_{α,k} for α ≤ 0.50. To illustrate the use of the table, note that

    P{T ≥ t_{0.05,10}} = P{T ≥ 1.812} = 0.05.

Thus, the upper 5% point of the t distribution with 10 degrees of freedom is t_{0.05,10} = 1.812. Similarly, the lower-tail point t_{0.95,10} = −t_{0.05,10} = −1.812.

Figure 9-4  Percentage points of the t distribution.
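As with the chi-square table, these points are available in software; for example, with scipy (our snippet):

```python
from scipy import stats

print(stats.t.isf(0.05, df=10))   # 1.812..., the upper 5% point t_{0.05,10}
print(stats.t.ppf(0.05, df=10))   # -1.812..., the lower-tail point t_{0.95,10}
```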

Figure 9-3  Several t distributions.

Example 9-7
As an example of a random variable that follows the t distribution, suppose that X1, X2, ..., Xn is a random sample from a normal distribution with mean μ and variance σ², and let X̄ and S² denote the sample mean and variance. Consider the statistic

    T = (X̄ − μ)/(S/√n).    (9-19)

Dividing both the numerator and denominator of equation 9-19 by σ, we obtain

    T = [(X̄ − μ)/(σ/√n)] / √(S²/σ²).

Since (X̄ − μ)/(σ/√n) ~ N(0, 1) and S²/σ² ~ χ²_{n−1}/(n − 1), and since X̄ and S² are independent, we see from Theorem 9-3 that

    T = (X̄ − μ)/(S/√n)    (9-20)

follows a t distribution with ν = n − 1 degrees of freedom. In Chapters 10 and 11 we will use the random variable in equation 9-20 to construct confidence intervals and test hypotheses about the mean of a normal distribution.

9-5 THE F DISTRIBUTION

A very useful sampling distribution is the F distribution.

Theorem 9-4
Let W and Y be independent chi-square random variables with u and v degrees of freedom, respectively. Then the ratio

    F = (W/u)/(Y/v)

has the probability density function

    h(f) = \frac{\Gamma\left(\frac{u+v}{2}\right)\left(\frac{u}{v}\right)^{u/2} f^{(u/2)-1}}{\Gamma\left(\frac{u}{2}\right)\Gamma\left(\frac{v}{2}\right)\left[\left(\frac{u}{v}\right)f + 1\right]^{(u+v)/2}},  0 < f < ∞,    (9-21)

and is said to follow the F distribution with u degrees of freedom in the numerator and v degrees of freedom in the denominator. It is usually abbreviated F_{u,v}.

Proof  Since W and Y are independent, their joint probability density function is

    f(w, y) = \frac{w^{(u/2)-1} y^{(v/2)-1}}{2^{u/2}\Gamma(u/2)\, 2^{v/2}\Gamma(v/2)} e^{-(w+y)/2},  0 < w, y < ∞.

Proceeding as in Section 4-10, define the new random variable M = Y. The inverse solutions of f = (w/u)/(y/v) and m = y are

    w = (u/v)mf  and  y = m.

Therefore, the Jacobian is

    J = (u/v)m.

Thus, the joint probability density function of F and M is

    g(f, m) = \frac{\left[(u/v)mf\right]^{(u/2)-1} m^{(v/2)-1} (u/v)m}{2^{(u+v)/2}\Gamma(u/2)\Gamma(v/2)} e^{-(m/2)[(u/v)f+1]},  0 < f, m < ∞,

and since h(f) = ∫_{0}^{∞} g(f, m) dm, we obtain equation 9-21, completing the proof.

The mean and variance of the F distribution are μ = v/(v − 2) for v > 2, and

    σ² = \frac{2v^2(u+v-2)}{u(v-2)^2(v-4)},  v > 4.

Several F distributions are shown in Fig. 9-5. The F random variable is nonnegative, and the distribution is skewed to the right. The F distribution looks very similar to the chi-square distribution in Fig. 9-1; however, the parameters u and v provide extra flexibility regarding shape.

Figure 9-5  The F distribution.


The percentage points of the F distribution are given in Table V of the Appendix. Let F_{α,u,v} be the percentage point of the F distribution with u and v degrees of freedom such that the probability that the random variable F exceeds this value is α. This is illustrated in Fig. 9-6. For example, if u = 5 and v = 10, we find from Table V of the Appendix that

    P{F ≥ F_{0.05,5,10}} = P{F ≥ 3.33} = 0.05.

That is, the upper 5% point of F_{5,10} is F_{0.05,5,10} = 3.33. Table V contains only upper-tail percentage points (values of F_{α,u,v} for α ≤ 0.50). The lower-tail percentage points F_{1−α,u,v} can be found as follows:

    F_{1−α,u,v} = 1/F_{α,v,u}.    (9-22)

For example, to find the lower-tail percentage point F_{0.95,5,10}, note that

    F_{0.95,5,10} = 1/F_{0.05,10,5} = 1/4.74 = 0.211.
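Equation 9-22 can be checked numerically with scipy (our snippet, not from the text):

```python
from scipy import stats

upper = stats.f.isf(0.05, dfn=5, dfd=10)   # F_{0.05,5,10} = 3.33
lower = stats.f.ppf(0.05, dfn=5, dfd=10)   # F_{0.95,5,10}
# The lower-tail point equals 1/F_{0.05,10,5} = 1/4.74 = 0.211 (equation 9-22).
print(upper, lower, 1 / stats.f.isf(0.05, dfn=10, dfd=5))
```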

Example 9-8
As an example of a statistic that follows the F distribution, suppose we have two normal populations with variances σ1² and σ2². Let independent random samples of sizes n1 and n2 be taken from populations 1 and 2, respectively, and let S1² and S2² be the sample variances. Then the ratio

    F = (S1²/σ1²)/(S2²/σ2²)    (9-23)

has an F distribution with n1 − 1 numerator degrees of freedom and n2 − 1 denominator degrees of freedom. This follows directly from the facts that (n1 − 1)S1²/σ1² ~ χ²_{n1−1} and (n2 − 1)S2²/σ2² ~ χ²_{n2−1} and from Theorem 9-4. The random variable in equation 9-23 plays a key role in Chapters 10 and 11, where we address the problems of confidence interval estimation and hypothesis testing about the variances of two independent normal populations.

Figure 9-6  Upper and lower percentage points of the F distribution.


9-6 SUMMARY

This chapter has presented the concept of random sampling and introduced sampling distributions. In repeated sampling from a population, sample statistics of the sort discussed in Chapter 8 vary from sample to sample, and the probability distribution of such statistics (or functions of the statistics) is called the sampling distribution. The normal, chi-square, Student t, and F distributions have been presented in this chapter and will be employed extensively in later chapters to describe sampling variation.

9-7 EXERCISES

9-1. Suppose that a random variable is normally distributed with mean μ and variance σ². Draw a random sample of five observations. What is the joint density function of the sample?

9-2. Transistors have a life that is exponentially distributed with parameter λ. A random sample of n transistors is taken. What is the joint density function of the sample?

9-3. Suppose that X is uniformly distributed on the interval from 0 to 1. Consider a random sample of size 4 from X. What is the joint density function of the sample?

9-4. A lot consists of N transistors, and of these M (M ≤ N) are defective. We randomly select two transistors without replacement from this lot and determine whether they are defective or nondefective. The random variable

    Xi = 1 if the ith transistor is nondefective,
       = 0 if the ith transistor is defective,    i = 1, 2.

Determine the joint probability function for X1 and X2. What are the marginal probability functions for X1 and X2? Are X1 and X2 independent random variables?

9-5. A population of power supplies for a personal computer has an output voltage that is normally distributed with a mean of 5.00 V and a standard deviation of 0.10 V. A random sample of eight power supplies is selected. Specify the sampling distribution of X̄.

9-6. Consider the power supply problem described in Exercise 9-5. What is the standard error of X̄?

9-7. Consider the power supply problem described in Exercise 9-5. Suppose that the population standard deviation is unknown. How would you obtain the estimated standard error?

9-8. A procurement specialist has purchased 25 resistors from vendor 1 and 30 resistors from vendor 2. Let X1,1, X1,2, ..., X1,25 represent the vendor 1 observed resistances, assumed normally and independently distributed with a mean of 100 Ω and a standard deviation of 1.5 Ω. Similarly, let X2,1, X2,2, ..., X2,30 represent the vendor 2 observed resistances, assumed normally and independently distributed with a mean of 105 Ω and a standard deviation of 2.0 Ω. What is the sampling distribution of X̄1 − X̄2?

9-9. Consider the resistor problem in Exercise 9-8. Find the standard error of X̄1 − X̄2.

9-10. Consider the resistor problem in Exercise 9-8. If we could not assume that resistance is normally distributed, what could be said about the sampling distribution of X̄1 − X̄2?

9-11. Suppose that independent random samples of sizes n1 and n2 are taken from two normal populations with means μ1 and μ2 and variances σ1² and σ2², respectively. If X̄1 and X̄2 are the sample means, find the sampling distribution of the statistic

    Z = [X̄1 − X̄2 − (μ1 − μ2)] / √(σ1²/n1 + σ2²/n2).

9-12. A manufacturer of semiconductor devices takes a random sample of 100 chips and tests them, classifying each chip as defective or nondefective. Let Xi = 0(1) if the ith chip is nondefective (defective). The sample fraction defective is

    p̂ = (X1 + X2 + ... + X100)/100.

What is the sampling distribution of p̂?

9-13. For the semiconductor problem in Exercise 9-12, find the standard error of p̂. Also find the estimated standard error of p̂.

9-14. Develop the moment-generating function of the chi-square distribution.

9-15. Derive the mean and variance of the chi-square random variable with k degrees of freedom.

9-16. Derive the mean and variance of the t distribution.

9-17. Derive the mean and variance of the F distribution.

9-18. Order Statistics. Let X1, X2, ..., Xn be a random sample of size n from X, a random variable having distribution function F(x). Rank the elements in order of increasing numerical magnitude, resulting in X(1), X(2), ..., X(n), where X(1) is the smallest sample element (X(1) = min{X1, X2, ..., Xn}) and X(n) is the largest sample element (X(n) = max{X1, X2, ..., Xn}). X(i) is called the ith order statistic. Often, the distribution of some of the order statistics is of interest, particularly the minimum and maximum sample values, X(1) and X(n), respectively. Prove that the distribution functions of X(1) and X(n), denoted respectively by F_{X(1)}(t) and F_{X(n)}(t), are

    F_{X(1)}(t) = 1 − [1 − F(t)]ⁿ,
    F_{X(n)}(t) = [F(t)]ⁿ.

Prove that if X is continuous with probability distribution f(x), then the probability distributions of X(1) and X(n) are

    f_{X(1)}(t) = n[1 − F(t)]^{n−1} f(t),
    f_{X(n)}(t) = n[F(t)]^{n−1} f(t).

9-19. Continuation of Exercise 9-18. Let X1, X2, ..., Xn be a random sample of a Bernoulli random variable with parameter p. Show that

    P(X(n) = 1) = 1 − (1 − p)ⁿ,
    P(X(1) = 0) = 1 − pⁿ.

Use the results of Exercise 9-18.

9-20. Continuation of Exercise 9-18. Let X1, X2, ..., Xn be a random sample of a normal random variable with mean μ and variance σ². Using the results of Exercise 9-18, derive the density functions of X(1) and X(n).

9-21. Continuation of Exercise 9-18. Let X1, X2, ..., Xn be a random sample of an exponential random variable with parameter λ. Derive the distribution functions and probability distributions for X(1) and X(n). Use the results of Exercise 9-18.

9-22. Let X1, X2, ..., Xn be a random sample of a continuous random variable. Find E[F(X(n))] and E[F(X(1))].

9-23. Using Table III of the Appendix, find the following values:
(a) χ²_{0.05,10}
(b) χ²_{0.025,15}
(c) χ²_{0.50,10}
(d) χ²_{α,10} such that P{χ²10 ≥ χ²_{α,10}} = 0.975.

9-24. Using Table IV of the Appendix, find the following values:
(a) t_{0.25,10}
(b) t_{0.05,20}
(c) t_{α,10} such that P{T10 ≤ t_{α,10}} = 0.95.

9-25. Using Table V of the Appendix, find the following values:
(a) F_{0.25,4,9}
(b) F_{0.05,15,10}
(c) F_{0.95,6,8}
(d) F_{0.90,24,24}.

9-26. Let F_{1−α,u,v} denote a lower-tail point (α ≤ 0.50) of the F_{u,v} distribution. Prove that F_{1−α,u,v} = 1/F_{α,v,u}.

Chapter 10

Parameter Estimation

Statistical inference is the process by which information from sample data is used to draw conclusions about the population from which the sample was selected. The techniques of statistical inference can be divided into two major areas: parameter estimation and hypothesis testing. This chapter treats parameter estimation, and hypothesis testing is presented in Chapter 11.

As an example of a parameter estimation problem, suppose that civil engineers are analyzing the compressive strength of concrete. There is a natural variability in the strength of each individual concrete specimen. Consequently, the engineers are interested in estimating the average strength for the population consisting of this type of concrete. They may also be interested in estimating the variability of compressive strength in this population. We present methods for obtaining point estimates of parameters such as the population mean and variance, and we also discuss methods for obtaining certain kinds of interval estimates of parameters called confidence intervals.
lO-l

POINT ESTIMATION
A point estimate of a population parameter is a single numerical value of a statistic that corresponds to that parameter. That is, the point estimate is a unique selection for the value of an unknown parameter. More precisely, if X is a random variable with probability distribution f(x), characterized by the unknown parameter θ, and if X_1, X_2, ..., X_n is a random sample of size n from X, then the statistic θ̂ = h(X_1, X_2, ..., X_n) corresponding to θ is called the estimator of θ. Note that the estimator θ̂ is a random variable, because it is a function of sample data. After the sample has been selected, θ̂ takes on a particular numerical value called the point estimate of θ.
As an example, suppose that the random variable X is normally distributed with unknown mean μ and known variance σ². The sample mean X̄ is a point estimator of the unknown population mean μ. That is, μ̂ = X̄. After the sample has been selected, the numerical value x̄ is the point estimate of μ. Thus, if x_1 = 2.5, x_2 = 3.1, x_3 = 2.8, and x_4 = 3.0, then the point estimate of μ is

x̄ = (2.5 + 3.1 + 2.8 + 3.0)/4 = 2.85.

Similarly, if the population variance σ² is also unknown, a point estimator for σ² is the sample variance S², and the numerical value s² = 0.07 calculated from the sample data is the point estimate of σ².
Estimation problems occur frequently in engineering. We often need to estimate the following parameters:

The mean μ of a single population


The variance σ² (or standard deviation σ) of a single population

The proportion p of items in a population that belong to a class of interest

The difference between means of two populations, μ_1 − μ_2

The difference between two population proportions, p_1 − p_2

Reasonable point estimates of these parameters are as follows:

For μ, the estimate is μ̂ = x̄, the sample mean.

For σ², the estimate is σ̂² = s², the sample variance.

For p, the estimate is p̂ = x/n, the sample proportion, where x is the number of items in a random sample of size n that belong to the class of interest.

For μ_1 − μ_2, the estimate is μ̂_1 − μ̂_2 = x̄_1 − x̄_2, the difference between the sample means of two independent random samples.

For p_1 − p_2, the estimate is p̂_1 − p̂_2, the difference between two sample proportions computed from two independent random samples.
There may be several different potential point estimators for a parameter. For example, if we wish to estimate the mean of a random variable, we might consider the sample mean, the sample median, or perhaps the average of the smallest and largest observations in the sample as point estimators. In order to decide which point estimator of a particular parameter is the best one to use, we need to examine their statistical properties and develop some criteria for comparing estimators.

10-1.1 Properties of Estimators


A desirable property of an estimator is that it should be "close" in some sense to the true value of the unknown parameter. Formally, we say that θ̂ is an unbiased estimator of the parameter θ if

E(θ̂) = θ.   (10-1)

That is, θ̂ is an unbiased estimator of θ if, "on the average," its values are equal to θ. Note that this is equivalent to requiring that the mean of the sampling distribution of θ̂ be equal to θ.

Example 10-1
Suppose that X is a random variable with mean μ and variance σ². Let X_1, X_2, ..., X_n be a random sample of size n from X. Show that the sample mean X̄ and sample variance S² are unbiased estimators of μ and σ², respectively. Consider

E(X̄) = E[(1/n) Σ_{i=1}^n X_i] = (1/n) Σ_{i=1}^n E(X_i),

and since E(X_i) = μ for all i = 1, 2, ..., n,

E(X̄) = (1/n) Σ_{i=1}^n μ = μ.

Therefore, the sample mean X̄ is an unbiased estimator of the population mean μ. Now consider

E(S²) = E[ (1/(n−1)) Σ_{i=1}^n (X_i − X̄)² ]
      = (1/(n−1)) E[ Σ_{i=1}^n (X_i² + X̄² − 2X̄X_i) ]
      = (1/(n−1)) E[ Σ_{i=1}^n X_i² − nX̄² ]
      = (1/(n−1)) [ Σ_{i=1}^n E(X_i²) − nE(X̄²) ].

However, since E(X_i²) = μ² + σ² and E(X̄²) = μ² + σ²/n, we have

E(S²) = (1/(n−1)) [ n(μ² + σ²) − n(μ² + σ²/n) ]
      = (1/(n−1)) (nμ² + nσ² − nμ² − σ²)
      = σ².

Therefore, the sample variance S² is an unbiased estimator of the population variance σ². However, the sample standard deviation S is a biased estimator of the population standard deviation σ. For large samples this bias is negligible.

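These unbiasedness results are easy to check by simulation. The following sketch is written in Python with NumPy (tools not used elsewhere in this text); the population parameters, sample size, and number of replications are arbitrary choices for illustration.

import numpy as np

# Simulation check of Example 10-1: mu = 10 and sigma = 2 are arbitrary,
# as are the sample size n = 5 and the number of replications.
rng = np.random.default_rng(1)
mu, sigma, n, reps = 10.0, 2.0, 5, 100_000

x = rng.normal(mu, sigma, size=(reps, n))
xbar = x.mean(axis=1)          # sample means
s2 = x.var(axis=1, ddof=1)     # sample variances S^2 (divisor n - 1)

print(xbar.mean())             # close to 10.0:  X-bar is unbiased for mu
print(s2.mean())               # close to 4.0:   S^2 is unbiased for sigma^2
print(np.sqrt(s2).mean())      # about 1.88 < 2: S is biased low for sigma

The third average illustrates the final remark above: S systematically underestimates σ in small samples, although the bias shrinks as n grows.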

The mean square error of an estimator θ̂ is defined as

MSE(θ̂) = E(θ̂ − θ)².   (10-2)

The mean square error can be rewritten as follows:

MSE(θ̂) = E[θ̂ − E(θ̂)]² + [θ − E(θ̂)]²
        = V(θ̂) + (bias)².   (10-3)

That is, the mean square error of θ̂ is equal to the variance of the estimator plus the squared bias. If θ̂ is an unbiased estimator of θ, the mean square error of θ̂ is equal to the variance of θ̂.

The mean square error is an important criterion for comparing two estimators. Let θ̂_1 and θ̂_2 be two estimators of the parameter θ, and let MSE(θ̂_1) and MSE(θ̂_2) be the mean square errors of θ̂_1 and θ̂_2. Then the relative efficiency of θ̂_2 to θ̂_1 is defined as

MSE(θ̂_1)/MSE(θ̂_2).

If this relative efficiency is less than one, we would conclude that θ̂_1 is a more efficient estimator of θ than θ̂_2, in the sense that it has smaller mean square error. For example, suppose that we wish to estimate the mean μ of a population. We have a random sample of


n observations X_1, X_2, ..., X_n, and we wish to compare two possible estimators for μ: the sample mean X̄ and a single observation from the sample, say X_i. Note that both X̄ and X_i are unbiased estimators of μ; consequently, the mean square error of both estimators is simply the variance. For the sample mean, we have MSE(X̄) = V(X̄) = σ²/n, where σ² is the population variance; for an individual observation, we have MSE(X_i) = V(X_i) = σ². Therefore, the relative efficiency of X_i to X̄ is

MSE(X̄)/MSE(X_i) = (σ²/n)/σ² = 1/n.

Since 1/n < 1 for sample sizes n ≥ 2, we would conclude that the sample mean is a better estimator of μ than a single observation X_i.
Within the class of unbiased estimators, we would like to find the estimator that has the smallest variance. Such an estimator is called a minimum variance unbiased estimator. Figure 10-1 shows the probability distributions of two unbiased estimators θ̂_1 and θ̂_2, with θ̂_1 having smaller variance than θ̂_2. The estimator θ̂_1 is more likely than θ̂_2 to produce an estimate that is close to the true value of the unknown parameter θ.

It is possible to obtain a lower bound on the variance of all unbiased estimators of θ. Let θ̂ be an unbiased estimator of the parameter θ, based on a random sample of n observations, and let f(x, θ) denote the probability distribution of the random variable X. Then a lower bound on the variance of θ̂ is¹

V(θ̂) ≥ 1 / ( n E{ [∂ ln f(X, θ)/∂θ]² } ).   (10-4)

This inequality is called the Cramér-Rao lower bound. If an unbiased estimator satisfies equation 10-4 as an equality, it is the minimum variance unbiased estimator of θ.

Example 10-2

We will show that the sample mean X̄ is the minimum variance unbiased estimator of the mean of a normal distribution with known variance.

Figure 10-1 The probability distributions of two unbiased estimators, θ̂_1 and θ̂_2.

¹Certain conditions on the function f(X, θ) are required for obtaining the Cramér-Rao inequality (for example, see Tucker 1962). These conditions are satisfied by most of the standard probability distributions.


From Example 10-1 we observe that X̄ is an unbiased estimator of μ. For the normal distribution,

ln f(x, μ) = −ln(σ√(2π)) − (1/2)[(x − μ)/σ]²,

so that

∂ ln f(x, μ)/∂μ = (x − μ)/σ².

Substituting into equation 10-4, we obtain

V(X̄) ≥ 1 / ( n E{ [(X − μ)/σ²]² } ) = σ⁴ / [ n E(X − μ)² ] = σ⁴/(nσ²) = σ²/n.

Since we know that, in general, the variance of the sample mean is V(X̄) = σ²/n, we see that V(X̄) satisfies the Cramér-Rao lower bound as an equality. Therefore X̄ is the minimum variance unbiased estimator of μ for the normal distribution where σ² is known.

Sometimes we find that biased estimators are preferable to unbiased estimators because they have smaller mean square error. That is, we can reduce the variance of the estimator considerably by introducing a relatively small amount of bias. As long as the reduction in variance is greater than the squared bias, an improved estimator in the mean square error sense will result. For example, Fig. 10-2 shows the probability distribution of a biased estimator θ̂_1 with smaller variance than the unbiased estimator θ̂_2. An estimate based on θ̂_1 would more likely be close to the true value of θ than would an estimate based on θ̂_2. We will see an application of biased estimation in Chapter 15.
An estimator θ̂* that has a mean square error that is less than or equal to the mean square error of any other estimator θ̂, for all values of the parameter θ, is called an optimal estimator of θ.

Another way to define the closeness of an estimator θ̂ to the parameter θ is in terms of consistency. If θ̂_n is an estimator of θ based on a random sample of size n, we say that θ̂_n is consistent for θ if, for ε > 0,

lim_{n→∞} P(|θ̂_n − θ| ≤ ε) = 1.   (10-5)

Consistency is a large-sample property, since it describes the limiting behavior of the estimator θ̂_n as the sample size tends to infinity. It is usually difficult to prove that an estimator

Figure 10-2 A biased estimator, θ̂_1, that has smaller variance than the unbiased estimator θ̂_2.

is consistent using the definition of equation 10-5. However, estimators whose mean square error (or variance, if the estimator is unbiased) tends to zero as the sample size approaches infinity are consistent. For example, X̄ is a consistent estimator of the mean of a normal distribution, since X̄ is unbiased and lim_{n→∞} V(X̄) = lim_{n→∞} (σ²/n) = 0.

10-1.2 The Method of Maximum Likelihood


One of the best methods for obtaining a point estimator is the method of maximum likelihood. Suppose that X is a random variable with probability distribution f(x, θ), where θ is a single unknown parameter. Let x_1, x_2, ..., x_n be the observed values in a random sample of size n. Then the likelihood function of the sample is

L(θ) = f(x_1, θ) · f(x_2, θ) ··· f(x_n, θ).   (10-6)

Note that the likelihood function is now a function of only the unknown parameter θ. The maximum likelihood estimator (MLE) of θ is the value of θ that maximizes the likelihood function L(θ). Essentially, the maximum likelihood estimator is the value of θ that maximizes the probability of occurrence of the sample results.

Example 10-3

Let X be a Bernoulli random variable. The probability mass function is

p(x) = p^x (1 − p)^{1−x},   x = 0, 1,
     = 0,                   otherwise,

where p is the parameter to be estimated. The likelihood function of a sample of size n is

L(p) = ∏_{i=1}^n p^{x_i} (1 − p)^{1−x_i} = p^{Σ x_i} (1 − p)^{n − Σ x_i}.

We observe that if p̂ maximizes L(p), then p̂ also maximizes ln L(p), since the logarithm is a monotonically increasing function. Therefore,

ln L(p) = (Σ_{i=1}^n x_i) ln p + (n − Σ_{i=1}^n x_i) ln(1 − p).


Now

d ln L(p)/dp = (Σ_{i=1}^n x_i)/p − (n − Σ_{i=1}^n x_i)/(1 − p).

Equating this to zero and solving for p yields the MLE

p̂ = (1/n) Σ_{i=1}^n x_i = x̄,

an intuitively pleasing answer. Of course, one should also perform a second derivative test, but we have foregone that here.

Example 10-4
Let X be normally distributed with unknown mean μ and known variance σ². The likelihood function of a sample of size n is

L(μ) = ∏_{i=1}^n (σ√(2π))⁻¹ e^{−(x_i − μ)²/(2σ²)} = (2πσ²)^{−n/2} e^{−(1/(2σ²)) Σ_{i=1}^n (x_i − μ)²}.

Now

ln L(μ) = −(n/2) ln(2πσ²) − (2σ²)⁻¹ Σ_{i=1}^n (x_i − μ)²

and

d ln L(μ)/dμ = (σ²)⁻¹ Σ_{i=1}^n (x_i − μ).

Equating this last result to zero and solving for μ yields

μ̂ = (1/n) Σ_{i=1}^n X_i = X̄.

It may not always be possible to use calculus methods to determine the maximum of L(θ). This is illustrated in the following example.

Example 10-5

Let X be uniformly distributed on the interval 0 to a. The likelihood function of a random sample X_1, X_2, ..., X_n of size n is

L(a) = ∏_{i=1}^n (1/a) = 1/aⁿ,   0 ≤ x_i ≤ a.

Note that the slope of this function is not zero anywhere, so we cannot use calculus methods to find the maximum likelihood estimator â. However, notice that the likelihood function increases as a decreases. Therefore, we would maximize L(a) by setting â to the smallest value that it could reasonably assume. Clearly, a can be no smaller than the largest sample value, so we would use the largest observation as â. Thus, â = max_i X_i is the MLE for a.


The method of maximum likelihood can be used in situations where there are several unknown parameters, say θ_1, θ_2, ..., θ_k, to estimate. In such cases, the likelihood function is a function of the k unknown parameters θ_1, θ_2, ..., θ_k, and the maximum likelihood estimators {θ̂_i} would be found by equating the k first partial derivatives ∂L(θ_1, θ_2, ..., θ_k)/∂θ_i, i = 1, 2, ..., k, to zero and solving the resulting system of equations.

Example 10-6

Let X be normally distributed with mean μ and variance σ², where both μ and σ² are unknown. Find the maximum likelihood estimators of μ and σ². The likelihood function for a random sample of size n is

L(μ, σ²) = ∏_{i=1}^n (σ√(2π))⁻¹ e^{−(x_i − μ)²/(2σ²)} = (2πσ²)^{−n/2} e^{−(1/(2σ²)) Σ_{i=1}^n (x_i − μ)²}

and

ln L(μ, σ²) = −(n/2) ln(2πσ²) − (2σ²)⁻¹ Σ_{i=1}^n (x_i − μ)².

Now

∂ ln L(μ, σ²)/∂μ = (σ²)⁻¹ Σ_{i=1}^n (x_i − μ) = 0,
∂ ln L(μ, σ²)/∂(σ²) = −n/(2σ²) + (2σ⁴)⁻¹ Σ_{i=1}^n (x_i − μ)² = 0.

The solutions to the above equations yield the maximum likelihood estimators

μ̂ = X̄

and

σ̂² = (1/n) Σ_{i=1}^n (X_i − X̄)²,

which is closely related to the unbiased sample variance S². Namely, σ̂² = ((n − 1)/n)S².

Maximum likelihood estimators are not necessarily unbiased (see the maximum likelihood estimator of σ² in Example 10-6), but they usually may be easily modified to make them unbiased. Further, the bias approaches zero for large samples. In general, maximum likelihood estimators have good large-sample or asymptotic properties. Specifically, they are asymptotically normally distributed, unbiased, and have a variance that approaches the Cramér-Rao lower bound for large n. More precisely, if θ̂ is the maximum likelihood estimator for θ, then √n(θ̂ − θ) is normally distributed with mean zero and variance

V[√n(θ̂ − θ)] = 1 / E{ [∂ ln f(X, θ)/∂θ]² }

for large n. Maximum likelihood estimators are also consistent. In addition, they possess the invariance property; that is, if θ̂ is the maximum likelihood estimator of θ and u(θ) is a function of θ that has a single-valued inverse, then the maximum likelihood estimator of u(θ) is u(θ̂).

It can be shown graphically that the maximum of the likelihood will occur at the value of the maximum likelihood estimator. Consider a sample of size n = 10 from a normal distribution:

14.15, 32.07, 32.30, 25.01, 21.86, 23.70, 25.92, 25.19, 22.59, 26.47.

Assume that the population variance is known to be 4. The MLE for the mean μ of a normal distribution has been shown to be X̄. For this set of data, x̄ ≈ 25. Figure 10-3 displays the log-likelihood for various values of the mean. Notice that the maximum value of the log-likelihood function occurs at approximately x̄ = 25. Sometimes the likelihood function is relatively flat in the region around the maximum. This may be due to the size of the sample taken from the population. A small sample size can lead to a fairly flat log-likelihood, implying less precision in the estimate of the parameter of interest.

Figure 10-3 Log-likelihood for various means.
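The calculation behind Fig. 10-3 is easy to reproduce. A minimal sketch (Python with NumPy, assuming only the ten observations and the known variance of 4 given above) evaluates the normal log-likelihood on a grid of candidate means and locates its maximum.

import numpy as np

x = np.array([14.15, 32.07, 32.30, 25.01, 21.86,
              23.70, 25.92, 25.19, 22.59, 26.47])
sigma2 = 4.0                         # population variance, assumed known

mus = np.linspace(20.0, 30.0, 1001)  # grid of candidate values for mu
# ln L(mu) = -(n/2) ln(2 pi sigma^2) - (1/(2 sigma^2)) sum_i (x_i - mu)^2
loglik = (-(len(x) / 2) * np.log(2 * np.pi * sigma2)
          - ((x[:, None] - mus) ** 2).sum(axis=0) / (2 * sigma2))

print(mus[loglik.argmax()])          # about 24.93 ...
print(x.mean())                      # ... the sample mean, 24.926

As expected, the grid maximizer coincides (to grid resolution) with the sample mean.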

10-1.3 The Method of Moments

Suppose that X is either a continuous random variable with probability density f(x; θ_1, θ_2, ..., θ_k) or a discrete random variable with distribution p(x; θ_1, θ_2, ..., θ_k) characterized by k unknown parameters. Let X_1, X_2, ..., X_n be a random sample of size n from X, and define the first k sample moments about the origin as

m'_t = (1/n) Σ_{i=1}^n X_i^t,   t = 1, 2, ..., k.   (10-7)

The first k population moments about the origin are

μ'_t = E(X^t) = ∫_{−∞}^{∞} x^t f(x; θ_1, θ_2, ..., θ_k) dx,   X continuous,
             = Σ_{x∈R_X} x^t p(x; θ_1, θ_2, ..., θ_k),        X discrete,

for t = 1, 2, ..., k.   (10-8)


The population moments {μ'_t} will, in general, be functions of the k unknown parameters {θ_i}. Equating sample moments and population moments will yield k simultaneous equations in k unknowns (the θ_i); that is,

μ'_t = m'_t,   t = 1, 2, ..., k.   (10-9)

The solution to equation 10-9, denoted θ̃_1, θ̃_2, ..., θ̃_k, yields the moment estimators of θ_1, θ_2, ..., θ_k.

Example 10-7

Let X ~ N(μ, σ²), where μ and σ² are unknown. To derive estimators for μ and σ² by the method of moments, recall that for the normal distribution

μ'_1 = μ,
μ'_2 = σ² + μ².

The sample moments are m'_1 = (1/n) Σ_{i=1}^n X_i and m'_2 = (1/n) Σ_{i=1}^n X_i². From equation 10-9 we obtain

μ = (1/n) Σ_{i=1}^n X_i,   σ² + μ² = (1/n) Σ_{i=1}^n X_i²,

which have the solution

μ̃ = X̄,   σ̃² = (1/n) Σ_{i=1}^n X_i² − X̄² = (1/n) Σ_{i=1}^n (X_i − X̄)².

Example 10-8

Let X be uniformly distributed on the interval (0, a). To find an estimator of a by the method of moments, we note that the first population moment about zero is

μ'_1 = ∫_0^a x (1/a) dx = a/2.

The first sample moment is just X̄. Therefore ã = 2X̄; that is, the moment estimator of a is just twice the sample mean.

The method of moments often yields estimators that are reasonably good. In Example 10-7, for instance, the moment estimators are identical to the maximum likelihood estimators. In general, moment estimators are asymptotically normally distributed (approximately) and consistent. However, their variance may be larger than the variance of estimators derived by other methods, such as the method of maximum likelihood. Occasionally, the method of moments yields estimators that are very poor, as in Example 10-8. The estimator in that example does not always generate an estimate that is compatible with our knowledge of the situation. For example, if our sample observations were x_1 = 60, x_2 = 10, and x_3 = 5, then ã = 50, which is unreasonable, since we know that a ≥ 60.
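This incompatibility is easy to verify numerically. A short sketch (Python with NumPy) computes both estimators for the three observations just mentioned.

import numpy as np

x = np.array([60.0, 10.0, 5.0])

a_moments = 2 * x.mean()   # moment estimator a-tilde = 2 x-bar (Example 10-8)
a_mle = x.max()            # maximum likelihood estimator (Example 10-5)

print(a_moments)           # 50.0 -- impossible, since we already observed x = 60
print(a_mle)               # 60.0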


10-1.4 Bayesian Inference


In the preceding chapters we made an extensive study of the use of probability. Until now, we have interpreted these probabilities in the frequency sense; that is, they refer to an experiment that can be repeated an indefinite number of times, and if the probability of occurrence of an event A is 0.6, then we would expect A to occur in about 60% of the experimental trials. This frequency interpretation of probability is often called the objectivist or classical viewpoint.

Bayesian inference requires a different interpretation of probability, called the subjective viewpoint. We often encounter subjective probabilistic statements, such as "There is a 30% chance of rain today." Subjective statements measure a person's "degree of belief" concerning some event, rather than a frequency interpretation. Bayesian inference requires us to make use of subjective probability to measure our degree of belief about a state of nature. That is, we must specify a probability distribution to describe our degree of belief about an unknown parameter. This procedure is totally unlike anything we have discussed previously. Until now, parameters have been treated as unknown constants. Bayesian inference requires us to think of parameters as random variables.

Suppose we let f(θ) be the probability distribution of the parameter or state of nature θ. The distribution f(θ) summarizes our information about θ prior to obtaining sample information. Obviously, if we are reasonably certain about the value of θ, we will choose f(θ) with a small variance, while if we are less certain about θ, f(θ) will be chosen with a larger variance. We call f(θ) the prior distribution of θ.
Now consider the distribution of the random variable X. We denote the distribution of X by f(x|θ) to indicate that the distribution depends on the unknown parameter θ. Suppose we take a random sample from X, say X_1, X_2, ..., X_n. The joint density, or likelihood, of the sample is

f(x_1, x_2, ..., x_n|θ) = f(x_1|θ) f(x_2|θ) ··· f(x_n|θ).

We define the posterior distribution of θ as the conditional distribution of θ, given the sample results. This is just

f(θ|x_1, x_2, ..., x_n) = f(x_1, x_2, ..., x_n; θ) / f(x_1, x_2, ..., x_n).   (10-10)

The joint distribution of the sample and θ in the numerator of equation 10-10 is the product of the prior distribution of θ and the likelihood, or

f(x_1, x_2, ..., x_n; θ) = f(θ) f(x_1, x_2, ..., x_n|θ).

The denominator of equation 10-10, which is the marginal distribution of the sample, is just a normalizing constant obtained by

f(x_1, x_2, ..., x_n) = ∫ f(θ) f(x_1, x_2, ..., x_n|θ) dθ.   (10-11)

Consequently, we may write the posterior distribution of θ as

f(θ|x_1, x_2, ..., x_n) = f(θ) f(x_1, x_2, ..., x_n|θ) / f(x_1, x_2, ..., x_n).   (10-12)

We note that Bayes' theorem has been used to transform or update the prior distribution to the posterior distribution. The posterior distribution reflects our degree of belief about θ given the sample information. Furthermore, the posterior distribution is proportional to the product of the prior distribution and the likelihood, the constant of proportionality being the normalizing constant f(x_1, x_2, ..., x_n).

Thus, the posterior density for θ expresses our degree of belief about the value of θ given the result of the sample.

Example 10-9

The time to failure of a transistor is known to be exponentially distributed with parameter λ. For a random sample of n transistors, the joint density of the sample elements, given λ, is

f(x_1, x_2, ..., x_n|λ) = λⁿ e^{−λ Σ_{i=1}^n x_i}.

Suppose we feel that the prior distribution for λ is also exponential,

f(λ) = k e^{−kλ},   λ > 0,
     = 0,           otherwise,

where k would be chosen depending on the exact knowledge or degree of belief we have about the value of λ. The joint density of the sample and λ is

f(x_1, x_2, ..., x_n; λ) = k λⁿ e^{−λ(Σ x_i + k)},

and the marginal density of the sample is

f(x_1, x_2, ..., x_n) = ∫_0^∞ k λⁿ e^{−λ(Σ x_i + k)} dλ = k Γ(n + 1) / (Σ x_i + k)^{n+1}.

Therefore, the posterior density for λ, by equation 10-12, is

f(λ|x_1, x_2, ..., x_n) = λⁿ (Σ x_i + k)^{n+1} e^{−λ(Σ x_i + k)} / Γ(n + 1),

and we see that the posterior density for λ is a gamma distribution with parameters n + 1 and Σ x_i + k.

10-1.5 Applications to Estimation


In this section, we discuss the application of Bayesian inference to the problem of estimating an unknown parameter of a probability distribution. Let X_1, X_2, ..., X_n be a random sample of the random variable X having density f(x|θ). We want to obtain a point estimate of θ. Let f(θ) be the prior distribution for θ, and let ℓ(θ̂; θ) be the loss function. The loss function is a penalty function reflecting the "payment" we must make for misidentifying θ by a realization of its point estimator θ̂. Common choices for ℓ(θ̂; θ) are (θ̂ − θ)² and |θ̂ − θ|. Generally, the less accurate a realization of θ̂ is, the more we must pay. In conjunction with a particular loss function, the risk is defined as the expected value of the loss function with respect to the random variables X_1, X_2, ..., X_n comprising θ̂. In other words, the risk is

R(θ̂; θ) = E[ℓ(θ̂; θ)]
        = ∫_{−∞}^{∞} ··· ∫_{−∞}^{∞} ℓ{d(x_1, x_2, ..., x_n); θ} f(x_1, x_2, ..., x_n|θ) dx_1 dx_2 ··· dx_n,

where the function d(x_1, x_2, ..., x_n), an alternate notation for the estimator θ̂, is simply a function of the observations. Since θ is considered to be a random variable, the risk is itself a random variable. We would like to find the function d that minimizes the expected risk. We write the expected risk as


B(d) = E[R(d; θ)] = ∫ R(d; θ) f(θ) dθ
     = ∫ { ∫ ··· ∫ ℓ{d(x_1, x_2, ..., x_n); θ} f(x_1, x_2, ..., x_n|θ) dx_1 dx_2 ··· dx_n } f(θ) dθ.   (10-13)

We define the Bayes estimator of the parameter θ to be the function d of the sample X_1, X_2, ..., X_n that minimizes the expected risk. On interchanging the order of integration in equation 10-13 we obtain

B(d) = ∫ ··· ∫ { ∫ ℓ{d(x_1, x_2, ..., x_n); θ} f(x_1, x_2, ..., x_n|θ) f(θ) dθ } dx_1 dx_2 ··· dx_n.   (10-14)

The function B will be minimized if we can find a function d that minimizes the quantity within the large braces in equation 10-14 for every set of the x values. That is, the Bayes estimator of θ is a function d of the x_i that minimizes

∫ ℓ{d(x_1, x_2, ..., x_n); θ} f(x_1, x_2, ..., x_n|θ) f(θ) dθ
  = ∫ ℓ(θ̂; θ) f(x_1, x_2, ..., x_n; θ) dθ
  = f(x_1, x_2, ..., x_n) ∫ ℓ(θ̂; θ) f(θ|x_1, x_2, ..., x_n) dθ.   (10-15)

Thus, the Bayes estimator of θ is the value θ̂* that minimizes

∫ ℓ(θ̂; θ) f(θ|x_1, x_2, ..., x_n) dθ.   (10-16)

If the loss function ℓ(θ̂; θ) is the squared-error loss (θ̂ − θ)², then we may show that the Bayes estimator of θ, say θ̂*, is the mean of the posterior density for θ (refer to Exercise 10-78).

Example 10-10

Consider the situation in Example 10-9, where it was shown that if the random variable X is exponentially distributed with parameter λ, and if the prior distribution for λ is exponential with parameter k, then the posterior distribution for λ is a gamma distribution with parameters n + 1 and Σ_{i=1}^n x_i + k. Therefore, if a squared-error loss function is assumed, the Bayes estimator for λ is the mean of this gamma distribution,

λ̂ = (n + 1) / (Σ_{i=1}^n x_i + k).

Suppose that in the time-to-failure problem in Example 10-9 a reasonable exponential prior distribution for λ has parameter k = 140. This is equivalent to saying that the prior estimate for λ is λ_0 = 1/140 = 0.00714. A random sample of size n = 10 yields Σ_{i=1}^{10} x_i = 1500. The Bayes estimate of λ is

λ̂ = (n + 1) / (Σ_{i=1}^{10} x_i + k) = (10 + 1)/(1500 + 140) = 0.00671.

We may compare this with the results that would have been obtained by classical methods. The maximum likelihood estimator of the parameter λ in an exponential distribution is

λ̂′ = n / Σ_{i=1}^n x_i.

Consequently, the maximum likelihood estimate of λ, based on the foregoing sample data, is

λ̂′ = 10/1500 = 0.00667.

Note that the results produced by the two methods differ somewhat. The Bayes estimate is slightly closer to the prior estimate than is the maximum likelihood estimate.
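Both estimates are one-line computations. A minimal sketch (Python) reproduces them from the quantities given in the example.

n = 10          # number of transistors tested
sum_x = 1500.0  # total observed time to failure, in hours
k = 140.0       # parameter of the exponential prior

lam_bayes = (n + 1) / (sum_x + k)   # posterior (gamma) mean
lam_mle = n / sum_x                 # classical maximum likelihood estimate

print(round(lam_bayes, 5))          # 0.00671
print(round(lam_mle, 5))            # 0.00667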

Example 10-11

Let X_1, X_2, ..., X_n be a random sample from the normal density with mean μ and variance 1, where μ is unknown. Assume that the prior density for μ is normal with mean 0 and variance 1; that is,

f(μ) = (1/√(2π)) e^{−(1/2)μ²},   −∞ < μ < ∞.

The joint conditional density of the sample, given μ, is

f(x_1, x_2, ..., x_n|μ) = (2π)^{−n/2} e^{−(1/2) Σ(x_i − μ)²}
                        = (2π)^{−n/2} e^{−(1/2)(Σ x_i² − 2μ Σ x_i + nμ²)}.

Thus, the joint density of the sample and μ is

f(x_1, x_2, ..., x_n; μ) = (2π)^{−(n+1)/2} e^{−(1/2) Σ x_i²} e^{−(1/2)[(n+1)μ² − 2μ Σ x_i]}.

The marginal density of the sample is

f(x_1, x_2, ..., x_n) = (2π)^{−(n+1)/2} exp{−(1/2) Σ x_i²} ∫_{−∞}^{∞} exp{−(1/2)[(n+1)μ² − 2μ Σ x_i]} dμ.

By completing the square in the exponent under the integral, we obtain

f(x_1, x_2, ..., x_n) = (n + 1)^{−1/2} (2π)^{−n/2} exp{ −(1/2)[ Σ x_i² − n²x̄²/(n + 1) ] },

using the fact that the integral is (2π)^{1/2}/(n + 1)^{1/2} (since a normal density has to integrate to 1). Now the posterior density for μ is

f(μ|x_1, x_2, ..., x_n) = [(n + 1)/(2π)]^{1/2} exp{ −((n + 1)/2)[ μ − Σ x_i/(n + 1) ]² };

that is, the posterior density for μ is normal with mean nx̄/(n + 1) and variance 1/(n + 1).

There is a relationship between the Bayes estimator for a parameter and the maximum likelihood estimator of the same parameter. For large sample sizes the two are nearly equivalent. In general, the difference between the two estimators is small compared to 1/√n. In practical problems, a moderate sample size will produce approximately the same estimate by either the Bayes or the maximum likelihood method, if the sample results are consistent with the assumed prior information. If the sample results are inconsistent with the prior assumptions, then the Bayes estimate may differ considerably from the maximum likelihood estimate. In these circumstances, if the sample results are accepted as being correct, the prior information must be incorrect. The maximum likelihood estimate would then be the better estimate to use.

If the sample results do not agree with the prior information, the Bayes estimator will tend to produce an estimate that is between the maximum likelihood estimate and the prior assumptions. If there is more inconsistency between the prior information and the sample, there will be a greater difference between the two estimates. For an illustration of this, refer to Example 10-10.

10-1.6 Precision of Estimation: The Standard Error


When we report the value of a point estimate, it is usually necessary to give some idea of its precision. The standard error is the usual measure of precision employed. If θ̂ is an estimator of θ, then the standard error of θ̂ is just the standard deviation of θ̂, or

σ_θ̂ = √V(θ̂).   (10-17)

If σ_θ̂ involves any unknown parameters, then if we substitute estimates of these parameters into equation 10-17, we obtain the estimated standard error of θ̂, say σ̂_θ̂. A small standard error implies that a relatively precise estimate has been reported.

Example 10-12

An article in the Journal of Heat Transfer (Trans. ASME, Ser. C, 96, 1974, p. 59) describes a method of measuring the thermal conductivity of Armco iron. Using a temperature of 100°F and a power input of 550 W, the following 10 measurements of thermal conductivity (in Btu/hr-ft-°F) were obtained:

41.60, 41.48, 42.34, 41.95, 41.86,
42.18, 41.72, 42.26, 41.81, 42.04.

A point estimate of mean thermal conductivity at 100°F and 550 W is the sample mean, or

x̄ = 41.924 Btu/hr-ft-°F.

The standard error of the sample mean is σ_X̄ = σ/√n, and since σ is unknown, we may replace it with the sample standard deviation s = 0.284 to obtain the estimated standard error of X̄,

σ̂_X̄ = s/√n = 0.284/√10 = 0.0898.

Notice that the standard error is about 0.2% of the sample mean, implying that we have obtained a relatively precise point estimate of thermal conductivity.
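The point estimate and its estimated standard error are a short computation. A Python/NumPy sketch:

import numpy as np

x = np.array([41.60, 41.48, 42.34, 41.95, 41.86,
              42.18, 41.72, 42.26, 41.81, 42.04])

se = x.std(ddof=1) / np.sqrt(len(x))   # estimated standard error s / sqrt(n)
print(x.mean(), se)                    # 41.924 and about 0.0898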


When the distribution of θ̂ is unknown or complicated, the standard error of θ̂ may be difficult to estimate using standard statistical theory. In this case, a computer-intensive technique called the bootstrap can be used. Efron and Tibshirani (1993) provide an excellent introduction to the bootstrap technique.

Suppose that the standard error of θ̂ is denoted σ_θ̂. Further, assume the population probability density function is given by f(x; θ). A bootstrap estimate of σ_θ̂ can be easily constructed.

1. Given a random sample X_1, X_2, ..., X_n from f(x; θ), estimate θ; denote the estimate by θ̂.

2. Using the estimate θ̂, generate a sample of size n from the distribution f(x; θ̂). This is the bootstrap sample.

3. Using the bootstrap sample, estimate θ. Denote this estimate by θ̂*.

4. Generate B bootstrap samples to obtain bootstrap estimates θ̂*_i, for i = 1, 2, ..., B (B = 100 or 200 is often used).

5. Let θ̄* = Σ_{i=1}^B θ̂*_i / B represent the sample mean of the bootstrap estimates.

6. The bootstrap standard error of θ̂ is found with the usual standard deviation formula:

s_θ̂ = √[ Σ_{i=1}^B (θ̂*_i − θ̄*)² / (B − 1) ].

In the literature, B − 1 is often replaced by B; for large values of B, however, there is little practical difference in the estimate obtained.

Example 10-13

The failure times X of an electronic component are known to follow an exponential distribution with unknown parameter λ. A random sample of ten components resulted in the following failure times (in hours):

195.2, 201.4, 183.0, 175.1, 205.1, 191.7, 188.6, 173.5, 200.8, 210.0.

The mean of the exponential distribution is given by E(X) = 1/λ. A reasonable estimate for λ is then λ̂ = 1/X̄. From the sample data, we find x̄ = 192.44, resulting in λ̂ = 1/192.44 = 0.00520. B = 100 bootstrap samples of size n = 10 were generated using Minitab® with f(x; 0.00520) = 0.00520e^{−0.00520x}. Some of the bootstrap estimates are shown in Table 10-1. The average of the bootstrap estimates is found to be

λ̄* = Σ_{i=1}^{100} λ̂*_i / 100 = 0.00551.

The standard error of the estimate is

s_λ̂ = √[ Σ_{i=1}^{100} (λ̂*_i − λ̄*)² / 99 ] = 0.00169.

Table 10-1 Bootstrap Estimates for Example 10-13

Sample    Sample Mean, x̄*_i    λ̂*_i
1         243.407               0.00411
2         153.821               0.00650
3         126.554               0.00790
...       ...                   ...
100       204.390               0.00489
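The six-step procedure is straightforward to program. The following sketch implements the parametric bootstrap for this example in Python with NumPy rather than Minitab; the random seed is an arbitrary choice, so the numbers will differ slightly from the run reported above.

import numpy as np

x = np.array([195.2, 201.4, 183.0, 175.1, 205.1,
              191.7, 188.6, 173.5, 200.8, 210.0])
rng = np.random.default_rng(10)       # arbitrary seed

lam_hat = 1 / x.mean()                # step 1: 1/192.44 = 0.00520
B, n = 100, len(x)

boot = np.empty(B)
for i in range(B):
    # steps 2-4: draw a bootstrap sample from the fitted exponential
    # (NumPy parameterizes the exponential by its mean, 1/lambda)
    xb = rng.exponential(scale=1 / lam_hat, size=n)
    boot[i] = 1 / xb.mean()           # re-estimate lambda

print(boot.mean())                    # step 5: average of the bootstrap estimates
print(boot.std(ddof=1))               # step 6: bootstrap standard error
                                      # (the run in the text gave 0.00169)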


10-2 SINGLE-SAMPLE CONFIDENCE INTERVAL ESTIMATION


In many situations, a point estimate does not provide enough information about the parameter of interest. For example, if we are interested in estimating the mean compression strength of concrete, a single number may not be very meaningful. An interval estimate of the form L ≤ μ ≤ U might be more useful. The end points of this interval will be random variables, since they are functions of sample data.

In general, to construct an interval estimator of the unknown parameter θ, we must find two statistics, L and U, such that

P{L ≤ θ ≤ U} = 1 − α.   (10-18)

The resulting interval

L ≤ θ ≤ U   (10-19)

is called a 100(1 − α)% confidence interval for the unknown parameter θ. L and U are called the lower- and upper-confidence limits, respectively, and 1 − α is called the confidence coefficient. The interpretation of a confidence interval is that if many random samples are collected and a 100(1 − α)% confidence interval on θ is computed from each sample, then 100(1 − α)% of these intervals will contain the true value of θ. The situation is illustrated in Fig. 10-4, which shows several 100(1 − α)% confidence intervals for the mean μ of a distribution. The dots at the center of each interval indicate the point estimate of μ (in this case x̄). Notice that one of the 15 intervals fails to contain the true value of μ. If this were a 95% confidence level, in the long run only 5% of the intervals would fail to contain μ.

Now in practice, we obtain only one random sample and calculate one confidence interval. Since this interval either will or will not contain the true value of θ, it is not reasonable to attach a probability level to this specific event. The appropriate statement would be that θ lies in the observed interval [L, U] with confidence 100(1 − α). This statement has

Figure 10-4 Repeated construction of a confidence interval for μ.


a frequency interpretation; that is, we do not know if the statement is true for this specific sample, but the method used to obtain the interval [L, U] yields correct statements 100(1 − α)% of the time.

The confidence interval in equation 10-19 might be more properly called a two-sided confidence interval, as it specifies both a lower and an upper limit on θ. Occasionally, a one-sided confidence interval might be more appropriate. A one-sided 100(1 − α)% lower-confidence interval on θ is given by the interval

L ≤ θ,   (10-20)

where the lower-confidence limit L is chosen so that

P{L ≤ θ} = 1 − α.   (10-21)

Similarly, a one-sided 100(1 − α)% upper-confidence interval on θ is given by the interval

θ ≤ U,   (10-22)

where the upper-confidence limit U is chosen so that

P{θ ≤ U} = 1 − α.   (10-23)

The length of the observed two-sided confidence interval is an important measure of the quality of the information obtained from the sample. The half-interval length θ − L or U − θ is called the accuracy of the estimator. The longer the confidence interval, the more confident we are that the interval actually contains the true value of θ. On the other hand, the longer the interval, the less information we have about the true value of θ. In an ideal situation, we obtain a relatively short interval with high confidence.

10-2.1 Confidence Interval on the Mean of a Normal Distribution, Variance Known
Let X be a normal random variable with unknown mean μ and known variance σ², and suppose that a random sample of size n, X_1, X_2, ..., X_n, is taken. A 100(1 − α)% confidence interval on μ can be obtained by considering the sampling distribution of the sample mean X̄. In Section 9-3 we noted that the sampling distribution of X̄ is normal if X is normal, and approximately normal if the conditions of the Central Limit Theorem are met. The mean of X̄ is μ, and the variance is σ²/n. Therefore, the distribution of the statistic

Z = (X̄ − μ)/(σ/√n)

is taken to be a standard normal distribution.

The distribution of Z = (X̄ − μ)/(σ/√n) is shown in Fig. 10-5. From examination of this figure we see that

P{−Z_{α/2} ≤ Z ≤ Z_{α/2}} = 1 − α

Figure 10-5 The distribution of Z.


or

P{ −Z_{α/2} ≤ (X̄ − μ)/(σ/√n) ≤ Z_{α/2} } = 1 − α.

This can be rearranged as

P{ X̄ − Z_{α/2} σ/√n ≤ μ ≤ X̄ + Z_{α/2} σ/√n } = 1 − α.   (10-24)

Comparing equations 10-24 and 10-18, we see that the 100(1 − α)% two-sided confidence interval on μ is

X̄ − Z_{α/2} σ/√n ≤ μ ≤ X̄ + Z_{α/2} σ/√n.   (10-25)

Example 10-14

Consider the thermal conductivity data in Example 10-12. Suppose that we want to find a 95% confidence interval on the mean thermal conductivity of Armco iron. Suppose we know that the standard deviation of thermal conductivity at 100°F and 550 W is σ = 0.10 Btu/hr-ft-°F. If we assume that thermal conductivity is normally distributed (or that the conditions of the Central Limit Theorem are met), then we can use equation 10-25 to construct the confidence interval. A 95% interval implies that 1 − α = 0.95, so α = 0.05, and from Table II in the Appendix, Z_{α/2} = Z_{0.05/2} = Z_{0.025} = 1.96. The lower confidence limit is

L = x̄ − Z_{α/2} σ/√n = 41.924 − 1.96(0.10)/√10 = 41.924 − 0.062 = 41.862,

and the upper confidence limit is

U = x̄ + Z_{α/2} σ/√n = 41.924 + 1.96(0.10)/√10 = 41.924 + 0.062 = 41.986.

Thus the 95% two-sided confidence interval is

41.862 ≤ μ ≤ 41.986.

This is our interval of reasonable values for mean thermal conductivity at 95% confidence.
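A sketch of this calculation in Python with SciPy, whose normal percentage point plays the role of Table II:

import numpy as np
from scipy import stats

xbar, sigma, n, alpha = 41.924, 0.10, 10, 0.05

z = stats.norm.ppf(1 - alpha / 2)     # 1.96
half = z * sigma / np.sqrt(n)         # 0.062

print(xbar - half, xbar + half)       # 41.862 and 41.986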

Confidence Level and Precision of Estimation

Notice that in the previous example our choice of the 95% level of confidence was essentially arbitrary. What would have happened if we had chosen a higher level of confidence, say 99%? In fact, doesn't it seem reasonable that we would want the higher level of confidence? At α = 0.01, we find Z_{α/2} = Z_{0.01/2} = Z_{0.005} = 2.58, while for α = 0.05, Z_{0.025} = 1.96. Thus, the length of the 95% confidence interval is

2(1.96σ/√n) = 3.92σ/√n,

whereas the length of the 99% confidence interval is

2(2.58σ/√n) = 5.15σ/√n.


The 99% confidence interval is longer than the 95% confidence interval. This is why we have a higher level of confidence in the 99% confidence interval. Generally, for a fixed sample size n and standard deviation σ, the higher the confidence level, the longer the resulting confidence interval.

Since the length of the confidence interval measures the precision of estimation, we see that precision is inversely related to the confidence level. As noted earlier, it is highly desirable to obtain a confidence interval that is short enough for decision-making purposes and that also has adequate confidence. One way to achieve this is by choosing the sample size n to be large enough to give a confidence interval of specified length with prescribed confidence.

Choice of Sample Size

The accuracy of the confidence interval in equation 10-25 is Z_{α/2} σ/√n. This means that in using x̄ to estimate μ, the error E = |x̄ − μ| is less than Z_{α/2} σ/√n with confidence 100(1 − α). This is shown graphically in Fig. 10-6. In situations where the sample size can be controlled, we can choose n to be 100(1 − α)% confident that the error in estimating μ is less than a specified error E. The appropriate sample size is

n = (Z_{α/2} σ / E)².   (10-26)

If the right-hand side of equation 10-26 is not an integer, it must be rounded up. Notice that 2E is the length of the resulting confidence interval.

To illustrate the use of this procedure, suppose that we wanted the error in estimating the mean thermal conductivity of Armco iron in Example 10-14 to be less than 0.05 Btu/hr-ft-°F, with 95% confidence. Since σ = 0.10 and Z_{0.025} = 1.96, we may find the required sample size from equation 10-26 to be

n = (Z_{α/2} σ / E)² = [ (1.96)(0.10)/0.05 ]² = 15.37 ≈ 16.
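Equation 10-26 together with the round-up rule is easily packaged as a small helper. A sketch (Python with SciPy; the function name is our own choice):

import math
from scipy import stats

def n_for_mean(sigma, E, alpha=0.05):
    """Smallest n with |x-bar - mu| < E at 100(1 - alpha)% confidence (eq. 10-26)."""
    z = stats.norm.ppf(1 - alpha / 2)
    return math.ceil((z * sigma / E) ** 2)

print(n_for_mean(sigma=0.10, E=0.05))   # 16, matching the hand calculation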
Notice how, in general, the sample size behaves as a function of the length of the confidence interval 2E, the confidence level 100(1 − α)%, and the standard deviation σ, as follows:

As the desired length of the interval 2E decreases, the required sample size n increases for a fixed value of σ and specified confidence.

As σ increases, the required sample size n increases for a fixed length 2E and specified confidence.

As the level of confidence increases, the required sample size n increases for fixed length 2E and standard deviation σ.

Figure 10-6 Error in estimating μ with x̄.


One-Sided Confidence Intervals

It is also possible to obtain one-sided confidence intervals for μ by setting either L = −∞ or U = ∞ and replacing Z_{α/2} by Z_α. The 100(1 − α)% upper-confidence interval for μ is

μ ≤ X̄ + Z_α σ/√n,   (10-27)

and the 100(1 − α)% lower-confidence interval for μ is

X̄ − Z_α σ/√n ≤ μ.   (10-28)

10-2.2 Confidence Interval on the Mean of a Normal Distribution, Variance Unknown
Suppose that we wish to find a confidence interval on the mean of a distribution, but the variance is unknown. Specifically, a random sample of size n, X_1, X_2, ..., X_n, is available, and X̄ and S² are the sample mean and sample variance, respectively. One possibility would be to replace σ in the confidence interval formulas for μ with known variance (equations 10-25, 10-27, and 10-28) with the sample standard deviation S. If the sample size n is relatively large, say n > 30, then this is an acceptable procedure. Consequently, we often call the confidence intervals in Sections 10-2.1 and 10-2.2 large-sample confidence intervals, because they are approximately valid even if the unknown population variances are replaced by the corresponding sample variances.

When sample sizes are small, this approach will not work, and we must use another procedure. To produce a valid confidence interval, we must make a stronger assumption about the underlying population. The usual assumption is that the underlying population is normally distributed. This leads to confidence intervals based on the t distribution. Specifically, let X_1, X_2, ..., X_n be a random sample from a normal distribution with unknown mean μ and unknown variance σ². In Section 9-4 we noted that the sampling distribution of the statistic

t = (X̄ − μ)/(S/√n)

is the t distribution with n − 1 degrees of freedom. We now show how the confidence interval on μ is obtained.

The distribution of t = (X̄ − μ)/(S/√n) is shown in Fig. 10-7. Letting t_{α/2,n−1} be the upper α/2 percentage point of the t distribution with n − 1 degrees of freedom, we observe from Fig. 10-7 that

P{ −t_{α/2,n−1} ≤ t ≤ t_{α/2,n−1} } = 1 − α

Figure 10-7 The t distribution.


or

P{ −t_{α/2,n−1} ≤ (X̄ − μ)/(S/√n) ≤ t_{α/2,n−1} } = 1 − α.

Rearranging this last equation yields

P{ X̄ − t_{α/2,n−1} S/√n ≤ μ ≤ X̄ + t_{α/2,n−1} S/√n } = 1 − α.   (10-29)

Comparing equations 10-29 and 10-18, we see that a 100(1 − α)% two-sided confidence interval on μ is

X̄ − t_{α/2,n−1} S/√n ≤ μ ≤ X̄ + t_{α/2,n−1} S/√n.   (10-30)

A 100(1 − α)% lower-confidence interval on μ is given by

X̄ − t_{α,n−1} S/√n ≤ μ,   (10-31)

and a 100(1 − α)% upper-confidence interval on μ is

μ ≤ X̄ + t_{α,n−1} S/√n.   (10-32)
Remember that these procedures assume that we are sampling from a normal population. This assumption is important for small samples. Fortunately, the normality assumption holds in many practical situations. When it does not, we must use distribution-free or nonparametric confidence intervals. Nonparametric methods are discussed in Chapter 16. However, when the population is normal, the t-distribution intervals are the shortest possible 100(1 − α)% confidence intervals, and are therefore superior to the nonparametric methods.

Selecting the sample size n required to give a confidence interval of required length is not as easy as in the known-σ case, because the length of the interval depends on the value of S (unknown before the data are collected) and on n. Furthermore, n enters the confidence interval through both 1/√n and t_{α/2,n−1}. Consequently, the required n must be determined through trial and error.

Example 10-15

An article in the Journal of Testing and Evaluation (Vol. 10, No. 4, 1982, p. 133) presents the following 20 measurements on residual flame time (in seconds) of treated specimens of children's nightwear:

9.85, 9.93, 9.75, 9.77, 9.67,
9.87, 9.67, 9.94, 9.85, 9.75,
9.83, 9.92, 9.74, 9.99, 9.88,
9.95, 9.95, 9.93, 9.92, 9.89.

We wish to find a 95% confidence interval on the mean residual flame time. The sample mean and standard deviation are

x̄ = 9.8475,   s = 0.0954.

From Table IV of the Appendix we find t_{0.025,19} = 2.093. The lower and upper 95% confidence limits are

L = x̄ − t_{α/2,n−1} s/√n = 9.8475 − 2.093(0.0954)/√20 = 9.8029 seconds

and

U = x̄ + t_{α/2,n−1} s/√n = 9.8475 + 2.093(0.0954)/√20 = 9.8921 seconds.

Therefore the 95% confidence interval is

9.8029 sec ≤ μ ≤ 9.8921 sec.

We are 95% confident that the mean residual flame time is between 9.8029 and 9.8921 seconds.
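The same interval can be computed directly from the data. A Python/SciPy sketch, with the t percentage point replacing Table IV:

import numpy as np
from scipy import stats

x = np.array([9.85, 9.93, 9.75, 9.77, 9.67,
              9.87, 9.67, 9.94, 9.85, 9.75,
              9.83, 9.92, 9.74, 9.99, 9.88,
              9.95, 9.95, 9.93, 9.92, 9.89])

n = len(x)
t = stats.t.ppf(0.975, df=n - 1)          # 2.093
half = t * x.std(ddof=1) / np.sqrt(n)

print(x.mean() - half, x.mean() + half)   # close to the limits computed above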

10-2.3 Confidence Interval on the Variance of a Normal Distribution


Suppose that X is normally distributed with unknown mean μ and unknown variance σ². Let X_1, X_2, ..., X_n be a random sample of size n, and let S² be the sample variance. It was shown in Section 9-3 that the sampling distribution of

χ² = (n − 1)S²/σ²

is chi-square with n − 1 degrees of freedom. This distribution is shown in Fig. 10-8.

To develop the confidence interval, we note from Fig. 10-8 that

P{ χ²_{1−α/2,n−1} ≤ χ² ≤ χ²_{α/2,n−1} } = 1 − α

or

P{ χ²_{1−α/2,n−1} ≤ (n − 1)S²/σ² ≤ χ²_{α/2,n−1} } = 1 − α.

This last equation can be rearranged to yield

P{ (n − 1)S²/χ²_{α/2,n−1} ≤ σ² ≤ (n − 1)S²/χ²_{1−α/2,n−1} } = 1 − α.   (10-33)

Comparing equations 10-33 and 10-18, we see that a 100(1 − α)% two-sided confidence interval for σ² is

(n − 1)S²/χ²_{α/2,n−1} ≤ σ² ≤ (n − 1)S²/χ²_{1−α/2,n−1}.   (10-34)

Figure 10-8 The χ² distribution.

To find a 100(1 − α)% lower-confidence interval on σ², set U = ∞ and replace χ²_{α/2,n−1} with χ²_{α,n−1}, giving

(n − 1)S²/χ²_{α,n−1} ≤ σ².   (10-35)

The 100(1 − α)% upper-confidence interval is found by setting L = 0 and replacing χ²_{1−α/2,n−1} with χ²_{1−α,n−1}, resulting in

σ² ≤ (n − 1)S²/χ²_{1−α,n−1}.   (10-36)

Example 10-16

A manufacturer of soft drink beverages is interested in the uniformity of the machine used to fill cans. Specifically, it is desirable that the standard deviation σ of the filling process be less than 0.2 fluid ounces; otherwise there will be a higher than allowable percentage of cans that are underfilled. We will assume that fill volume is approximately normally distributed. A random sample of 20 cans results in a sample variance of s² = 0.0225 (fluid ounces)². A 95% upper-confidence interval is found from equation 10-36 as follows:

σ² ≤ (n − 1)s²/χ²_{0.95,19}

or

σ² ≤ (19)(0.0225)/10.117 = 0.0423 (fluid ounces)².

This last statement may be converted into a confidence interval on the standard deviation σ by taking the square root of both sides, resulting in

σ ≤ 0.21 fluid ounces.

Therefore, at the 95% level of confidence, the data do not support the claim that the process standard deviation is less than 0.20 fluid ounces.
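A sketch of the upper-limit calculation (Python with SciPy; note that SciPy indexes chi-square percentage points by lower-tail area, while this book indexes them by upper-tail area):

import numpy as np
from scipy import stats

n, s2 = 20, 0.0225
# chi2.ppf(0.05, 19) is the lower 5% point, i.e., chi-square_{0.95,19} = 10.117
chi2_lower = stats.chi2.ppf(0.05, df=n - 1)

var_upper = (n - 1) * s2 / chi2_lower
print(var_upper, np.sqrt(var_upper))      # about 0.0423 and 0.21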

10-2.4 Confidence Interval on a Proportion


It is often necessary to construct a 100(1 − α)% confidence interval on a proportion. For example, suppose that a random sample of size n has been taken from a large (possibly infinite) population, and X (≤ n) observations in this sample belong to a class of interest. Then p̂ = X/n is the point estimator of the proportion of the population that belongs to this class. Note that n and p are the parameters of a binomial distribution. Furthermore, in Section 7-5 we saw that the sampling distribution of p̂ is approximately normal with mean p and variance p(1 − p)/n, if p is not too close to either 0 or 1 and if n is relatively large. Thus, the distribution of

Z = (p̂ − p) / √(p(1 − p)/n)

is approximately standard normal.

To construct the confidence interval on p, note that

P{−Z_{α/2} ≤ Z ≤ Z_{α/2}} ≈ 1 − α


or

P{ −Z_{α/2} ≤ (p̂ − p)/√(p(1 − p)/n) ≤ Z_{α/2} } ≈ 1 − α.

This may be rearranged as

P{ p̂ − Z_{α/2} √(p(1 − p)/n) ≤ p ≤ p̂ + Z_{α/2} √(p(1 − p)/n) } ≈ 1 − α.   (10-37)

We recognize the quantity √(p(1 − p)/n) as the standard error of the point estimator p̂. Unfortunately, the upper and lower limits of the confidence interval obtained from equation 10-37 contain the unknown parameter p. However, a satisfactory solution is to replace p by p̂ in the standard error, giving an estimated standard error. Therefore,

P{ p̂ − Z_{α/2} √(p̂(1 − p̂)/n) ≤ p ≤ p̂ + Z_{α/2} √(p̂(1 − p̂)/n) } ≈ 1 − α,   (10-38)

and the approximate 100(1 − α)% two-sided confidence interval on p is

p̂ − Z_{α/2} √(p̂(1 − p̂)/n) ≤ p ≤ p̂ + Z_{α/2} √(p̂(1 − p̂)/n).   (10-39)

An approximate 100(1 − α)% lower-confidence interval is

p̂ − Z_α √(p̂(1 − p̂)/n) ≤ p,   (10-40)

and an approximate 100(1 − α)% upper-confidence interval is

p ≤ p̂ + Z_α √(p̂(1 − p̂)/n).   (10-41)

Example 10-17

In a random sample of 75 axle shafts, 12 have a surface finish that is rougher than the specifications will allow. Therefore, a point estimate of the proportion p of shafts in the population that exceed the roughness specifications is p̂ = x/n = 12/75 = 0.16. A 95% two-sided confidence interval for p is computed from equation 10-39 as

0.16 − 1.96 √(0.16(0.84)/75) ≤ p ≤ 0.16 + 1.96 √(0.16(0.84)/75),

which simplifies to

0.08 ≤ p ≤ 0.24.


Define the error in estimating p by p̂ as E = |p̂ − p|. Note that we are approximately 100(1 − α)% confident that this error is less than Z_{α/2} √(p(1 − p)/n). Therefore, in situations where the sample size can be selected, we may choose n to be 100(1 − α)% confident that the error is less than some specified value E. The appropriate sample size is

n = (Z_{α/2}/E)² p(1 − p).   (10-42)

This function is relatively flat from p = 0.3 to p = 0.7. An estimate of p is required to use equation 10-42. If an estimate p̂ from a previous sample is available, it could be substituted for p in equation 10-42, or perhaps a subjective estimate could be made. If these alternatives are unsatisfactory, a preliminary sample could be taken, p̂ computed, and then equation 10-42 used to determine how many additional observations are required to estimate p with the desired accuracy. The sample size from equation 10-42 will always be a maximum for p = 0.5 [that is, p(1 − p) = 0.25], and this can be used to obtain an upper bound on n. In other words, we are at least 100(1 − α)% confident that the error in estimating p using p̂ is less than E if the sample size is

n = (Z_{α/2}/E)² (0.25).

In order to maintain at least a 100(1 − α)% level of confidence, the value for n is always rounded up to the next integer.

Example 10-18

Consider the data in Example 10-17. How large a sample is required if we want to be 95% confident that the error in using p̂ to estimate p is less than 0.05? Using p̂ = 0.16 as an initial estimate of p, we find from equation 10-42 that the required sample size is

n = (Z_{0.025}/E)² p̂(1 − p̂) = (1.96/0.05)² (0.16)(0.84) ≈ 207.

We note that the procedures developed in this section depend on the normal approximation to the binomial. In situations where this approximation is inappropriate, particularly cases where n is small, other methods must be used. Tables of the binomial distribution could be used to obtain a confidence interval for p. If n is large but p is small, then the Poisson approximation to the binomial could be used to construct confidence intervals. These procedures are illustrated by Duncan (1986).

Agresti and Coull (1998) present an alternative form of a confidence interval on the population proportion p, based on a large-sample hypothesis test on p (see Chapter 11 of this text). Agresti and Coull show that the upper and lower limits of an approximate 100(1 − α)% confidence interval on p are

[ p̂ + Z²_{α/2}/(2n) ± Z_{α/2} √( p̂(1 − p̂)/n + Z²_{α/2}/(4n²) ) ] / [ 1 + Z²_{α/2}/n ].

The authors refer to this as the score confidence interval. One-sided confidence intervals can be constructed simply by replacing Z_{α/2} with Z_α.


To illustrate this confidence interval, reconsider Example 10-17, which discusses the surface finish of axle shafts, with n = 75 and p̂ = 0.16. The lower and upper limits of a 95% confidence interval using the approach of Agresti and Coull are

[ 0.16 + (1.96)²/(2(75)) ± 1.96 √( 0.16(0.84)/75 + (1.96)²/(4(75)²) ) ] / [ 1 + (1.96)²/75 ]
  = (0.186 ± 0.087)/1.051
  = 0.177 ± 0.083.

The resulting lower- and upper-confidence limits are 0.094 and 0.260, respectively.
Agresti and Coull argue that the more complicated confidence interval has several advantages over the standard large-sample interval (given in equation 10-39). One advantage is that their confidence interval tends to maintain the stated level of confidence better than the standard large-sample interval. Another advantage is that the lower-confidence limit will always be non-negative. The large-sample confidence interval can result in negative lower-confidence limits, which the practitioner will generally then set to 0. A method that can report a negative lower limit on a parameter that is inherently non-negative (such as a proportion p) is often considered an inferior method. Lastly, the requirements that p not be close to 0 or 1 and that n be relatively large are not requirements for the approach suggested by Agresti and Coull. In other words, their approach results in an appropriate confidence interval for any combination of n and p̂.

10-3 TWO-SAMPLE CONFIDENCE INTERVAL ESTIMATION


10-3.1 Confidence Interval on the Difference between Means of Two Normal Distributions, Variances Known
Consider two independent random variables, X_1 with unknown mean μ_1 and known variance σ_1², and X_2 with unknown mean μ_2 and known variance σ_2². We wish to find a 100(1 − α)% confidence interval on the difference in means μ_1 − μ_2. Let X_11, X_12, ..., X_1n_1 be a random sample of n_1 observations from X_1, and X_21, X_22, ..., X_2n_2 be a random sample of n_2 observations from X_2. If X̄_1 and X̄_2 are the sample means, the statistic

Z = [ X̄_1 − X̄_2 − (μ_1 − μ_2) ] / √(σ_1²/n_1 + σ_2²/n_2)

is standard normal if X_1 and X_2 are normal, or approximately standard normal if the conditions of the Central Limit Theorem apply. From Fig. 10-5, this implies that

P{−Z_{α/2} ≤ Z ≤ Z_{α/2}} = 1 − α

or


This can be rearranged as

P{ X̄_1 − X̄_2 − Z_{α/2} √(σ_1²/n_1 + σ_2²/n_2) ≤ μ_1 − μ_2 ≤ X̄_1 − X̄_2 + Z_{α/2} √(σ_1²/n_1 + σ_2²/n_2) } = 1 − α.   (10-43)

Comparing equations 10-43 and 10-18, we note that the 100(1 − α)% confidence interval for μ_1 − μ_2 is

X̄_1 − X̄_2 − Z_{α/2} √(σ_1²/n_1 + σ_2²/n_2) ≤ μ_1 − μ_2 ≤ X̄_1 − X̄_2 + Z_{α/2} √(σ_1²/n_1 + σ_2²/n_2).   (10-44)

One-sided confidence intervals on μ_1 − μ_2 may also be obtained. A 100(1 − α)% upper-confidence interval on μ_1 − μ_2 is

μ_1 − μ_2 ≤ X̄_1 − X̄_2 + Z_α √(σ_1²/n_1 + σ_2²/n_2),   (10-45)

and a 100(1 − α)% lower-confidence interval is

X̄_1 − X̄_2 − Z_α √(σ_1²/n_1 + σ_2²/n_2) ≤ μ_1 − μ_2.   (10-46)

Example 10-19

Tensile strength tests were performed on two different grades of aluminum spars used in manufacturing the wing of a commercial transport aircraft. From past experience with the spar manufacturing process and the testing procedure, the standard deviations of tensile strengths are assumed to be known. The data obtained are shown in Table 10-2.

If μ_1 and μ_2 denote the true mean tensile strengths for the two grades of spars, then we may find a 90% confidence interval on the difference in mean strength μ_1 − μ_2 as follows:

L = x̄_1 − x̄_2 − Z_{α/2} √(σ_1²/n_1 + σ_2²/n_2)
  = 87.6 − 74.5 − 1.645 √((1.0)²/10 + (1.5)²/12)
  = 13.1 − 0.88
  = 12.22 kg/mm²

Table 10-2 Tensile Strength Test Results for Aluminum Spars

Spar Grade    Sample Size    Sample Mean Tensile Strength (kg/mm²)    Standard Deviation (kg/mm²)
1             n_1 = 10       x̄_1 = 87.6                               σ_1 = 1.0
2             n_2 = 12       x̄_2 = 74.5                               σ_2 = 1.5


and

U = x̄_1 − x̄_2 + Z_{α/2} √(σ_1²/n_1 + σ_2²/n_2)
  = 87.6 − 74.5 + 1.645 √((1.0)²/10 + (1.5)²/12)
  = 13.1 + 0.88
  = 13.98 kg/mm².

Therefore the 90% confidence interval on the difference in mean tensile strength is

12.22 kg/mm² ≤ μ_1 − μ_2 ≤ 13.98 kg/mm².

We are 90% confident that the mean tensile strength of grade 1 aluminum exceeds that of grade 2 aluminum by between 12.22 and 13.98 kg/mm².
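A sketch of the interval calculation for these data (Python with SciPy):

import numpy as np
from scipy import stats

x1bar, sig1, n1 = 87.6, 1.0, 10    # grade 1
x2bar, sig2, n2 = 74.5, 1.5, 12    # grade 2

z = stats.norm.ppf(0.95)                        # 1.645 for 90% confidence
half = z * np.sqrt(sig1**2 / n1 + sig2**2 / n2)
diff = x1bar - x2bar

print(diff - half, diff + half)                 # about 12.22 and 13.98 kg/mm^2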

If the standard deviations σ_1 and σ_2 are known (at least approximately), and if the sample sizes n_1 and n_2 are equal (n_1 = n_2 = n, say), then we can determine the sample size required so that the error in estimating μ_1 − μ_2 using X̄_1 − X̄_2 will be less than E at 100(1 − α)% confidence. The required sample size from each population is

n = (Z_{α/2}/E)² (σ_1² + σ_2²).   (10-47)

Remember to round up if n is not an integer.

10-3.2 Confidence Interval on the Difference between Means of Two Normal Distributions, Variances Unknown
We now extend the results of Section 10-2.2 to the case of two populations with unknown means and variances, and we wish to find confidence intervals on the difference in means μ_1 − μ_2. If the sample sizes n_1 and n_2 both exceed 30, then the normal-distribution intervals in Section 10-3.1 can be used, with the sample variances replacing the known variances. However, when small samples are taken, we must assume that the underlying populations are normally distributed with unknown variances and base the confidence intervals on the t distribution.

Case I. σ_1² = σ_2² = σ²

Consider two independent normal random variables, say X_1 with mean μ_1 and variance σ_1², and X_2 with mean μ_2 and variance σ_2². Both the means μ_1 and μ_2 and the variances σ_1² and σ_2² are unknown. However, suppose it is reasonable to assume that both variances are equal; that is, σ_1² = σ_2² = σ². We wish to find a 100(1 − α)% confidence interval on the difference in means μ_1 − μ_2.
Random samples of size n_1 and n_2 are taken on X_1 and X_2, respectively. Let the sample means be denoted X̄_1 and X̄_2 and the sample variances be denoted S_1² and S_2². Since both S_1² and S_2² are estimates of the common variance σ², we may obtain a combined (or "pooled") estimator of σ²:

S_p² = [ (n_1 − 1)S_1² + (n_2 − 1)S_2² ] / (n_1 + n_2 − 2).   (10-48)


To develop the confidence interval for μ_1 − μ_2, note that the distribution of the statistic

t = [ X̄_1 − X̄_2 − (μ_1 − μ_2) ] / [ S_p √(1/n_1 + 1/n_2) ]

is the t distribution with n_1 + n_2 − 2 degrees of freedom. Therefore,

P{ −t_{α/2,n_1+n_2−2} ≤ t ≤ t_{α/2,n_1+n_2−2} } = 1 − α

or

P{ −t_{α/2,n_1+n_2−2} ≤ [X̄_1 − X̄_2 − (μ_1 − μ_2)] / [S_p √(1/n_1 + 1/n_2)] ≤ t_{α/2,n_1+n_2−2} } = 1 − α.

This may be rearranged as

P{ X̄_1 − X̄_2 − t_{α/2,n_1+n_2−2} S_p √(1/n_1 + 1/n_2) ≤ μ_1 − μ_2 ≤ X̄_1 − X̄_2 + t_{α/2,n_1+n_2−2} S_p √(1/n_1 + 1/n_2) } = 1 − α.   (10-49)

Therefore, a 100(1 − α)% two-sided confidence interval for the difference in means μ_1 − μ_2 is

X̄_1 − X̄_2 − t_{α/2,n_1+n_2−2} S_p √(1/n_1 + 1/n_2) ≤ μ_1 − μ_2 ≤ X̄_1 − X̄_2 + t_{α/2,n_1+n_2−2} S_p √(1/n_1 + 1/n_2).   (10-50)

A one-sided 100(1 − α)% lower-confidence interval on μ_1 − μ_2 is

X̄_1 − X̄_2 − t_{α,n_1+n_2−2} S_p √(1/n_1 + 1/n_2) ≤ μ_1 − μ_2,   (10-51)

and a one-sided 100(1 − α)% upper-confidence interval on μ_1 − μ_2 is

μ_1 − μ_2 ≤ X̄_1 − X̄_2 + t_{α,n_1+n_2−2} S_p √(1/n_1 + 1/n_2).   (10-52)

Example 10-20

In a batch chemical process used for etching printed circuit boards, two different catalysts are being compared to determine whether they require different immersion times for removal of identical quantities of photoresist material. Twelve batches were run with catalyst 1, resulting in a sample mean immersion time of x̄1 = 24.6 minutes and a sample standard deviation of s1 = 0.85 minutes. Fifteen batches were run with catalyst 2, resulting in a mean immersion time of x̄2 = 22.1 minutes and a standard deviation of s2 = 0.98 minutes. We will find a 95% confidence interval on the difference in means


μ1 − μ2, assuming that the standard deviations (or variances) of the two populations are equal. The pooled estimate of the common variance is found using equation 10-48 as follows:

$$ s_p^2 = \frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2} = \frac{11(0.85)^2 + 14(0.98)^2}{12+15-2} = 0.8557. $$

The pooled standard deviation is s_p = √0.8557 = 0.925. Since t_{α/2,n1+n2−2} = t_{0.025,25} = 2.060, we may calculate the 95% lower- and upper-confidence limits as

$$ \bar{x}_1 - \bar{x}_2 - t_{0.025,25}\,s_p\sqrt{\tfrac{1}{n_1}+\tfrac{1}{n_2}} = 24.6 - 22.1 - 2.060(0.925)\sqrt{\tfrac{1}{12}+\tfrac{1}{15}} = 1.76 \ \text{minutes} $$

and

$$ \bar{x}_1 - \bar{x}_2 + t_{0.025,25}\,s_p\sqrt{\tfrac{1}{n_1}+\tfrac{1}{n_2}} = 24.6 - 22.1 + 2.060(0.925)\sqrt{\tfrac{1}{12}+\tfrac{1}{15}} = 3.24 \ \text{minutes}. $$

That is, the 95% confidence interval on the difference in mean immersion times is

1.76 minutes ≤ μ1 − μ2 ≤ 3.24 minutes.

We are 95% confident that catalyst 1 requires an immersion time that is between 1.76 minutes and 3.24 minutes longer than that required by catalyst 2.
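A minimal Python sketch of the pooled-t interval in Example 10-20 follows (our own code, not part of the text).

    # Pooled two-sample t interval (equations 10-48 and 10-50).
    from scipy.stats import t

    n1, x1bar, s1 = 12, 24.6, 0.85
    n2, x2bar, s2 = 15, 22.1, 0.98
    alpha = 0.05

    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)  # pooled variance
    sp = sp2 ** 0.5                                              # 0.925
    tval = t.ppf(1 - alpha / 2, n1 + n2 - 2)                     # t_{0.025,25} = 2.060
    half = tval * sp * (1 / n1 + 1 / n2) ** 0.5
    print(x1bar - x2bar - half, x1bar - x2bar + half)            # approx. 1.76, 3.24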

Case II. σ1² ≠ σ2²

In many situations it is not reasonable to assume that σ1² = σ2². When this assumption is unwarranted, one may still find a 100(1 − α)% confidence interval for μ1 − μ2 using the fact that the statistic

$$ t^* = \frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{\sqrt{S_1^2/n_1 + S_2^2/n_2}} $$

is distributed approximately as t with degrees of freedom given by

$$ v = \frac{\left(S_1^2/n_1 + S_2^2/n_2\right)^2}{\dfrac{\left(S_1^2/n_1\right)^2}{n_1+1} + \dfrac{\left(S_2^2/n_2\right)^2}{n_2+1}} - 2. $$   (10-53)

Consequently, an approximate 100(1 − α)% two-sided confidence interval for μ1 − μ2, when σ1² ≠ σ2², is

$$ \bar{X}_1 - \bar{X}_2 - t_{\alpha/2,v}\sqrt{\frac{S_1^2}{n_1}+\frac{S_2^2}{n_2}} \le \mu_1 - \mu_2 \le \bar{X}_1 - \bar{X}_2 + t_{\alpha/2,v}\sqrt{\frac{S_1^2}{n_1}+\frac{S_2^2}{n_2}}. $$   (10-54)


Upper (lower) one-sided confidence limits may be found by replacing the lower (upper) confidence limit with −∞ (∞) and changing α/2 to α.
10-3.3 Confidence Interval on μ1 − μ2 for Paired Observations

In Sections 10-3.1 and 10-3.2 we developed confidence intervals for the difference in means where two independent random samples were selected from the two populations of interest. That is, n1 observations were selected at random from the first population and a completely independent sample of n2 observations was selected at random from the second population. There are also a number of experimental situations where there are only n different experimental units and the data are collected in pairs; that is, two observations are made on each unit.

For example, the journal Human Factors (1962, p. 375) reports a study in which 14 subjects were asked to park two cars having substantially different wheelbases and turning radii. The time in seconds was recorded for each car and subject, and the resulting data are shown in Table 10-3. Notice that each subject is the "experimental unit" referred to earlier. We wish to obtain a confidence interval on the difference in mean time to park the two cars, say μ1 − μ2.

In general, suppose that the data consist of n pairs (X11, X21), (X12, X22), ..., (X1n, X2n). Both X1 and X2 are assumed to be normally distributed with means μ1 and μ2, respectively. The random variables within different pairs are independent. However, because there are two measurements on the same experimental unit, the two measurements within the same pair may not be independent. Consider the n differences D1 = X11 − X21, D2 = X12 − X22, ..., Dn = X1n − X2n. Now the mean of the differences D, say μD, is

$$ \mu_D = E(D) = E(X_1 - X_2) = E(X_1) - E(X_2) = \mu_1 - \mu_2, $$

because the expected value of X1 − X2 is the difference in expected values regardless of whether X1 and X2 are independent. Consequently, we can construct a confidence interval for μ1 − μ2 just by finding a confidence interval on μD. Since the differences Di are normally and independently distributed, we can use the t-distribution procedure described in Section
Table 10-3 Time in Seconds to Parallel Park Two Automobiles

Subject   Automobile 1   Automobile 2   Difference
   1          37.0           17.8          19.2
   2          25.8           20.2           5.6
   3          16.2           16.8          -0.6
   4          24.2           41.4         -17.2
   5          22.0           21.4           0.6
   6          33.4           38.4          -5.0
   7          23.8           16.8           7.0
   8          58.2           32.2          26.0
   9          33.6           27.8           5.8
  10          24.4           23.2           1.2
  11          23.4           29.6          -6.2
  12          21.2           20.6           0.6
  13          36.2           32.2           4.0
  14          29.8           53.8         -24.0


10-2.2 to find the confidence interval on μD. By analogy with equation 10-30, the 100(1 − α)% confidence interval on μD = μ1 − μ2 is

$$ \bar{D} - t_{\alpha/2,\,n-1}\,S_D/\sqrt{n} \le \mu_D \le \bar{D} + t_{\alpha/2,\,n-1}\,S_D/\sqrt{n}, $$   (10-55)

where D̄ and S_D are the sample mean and sample standard deviation of the differences Di, respectively. This confidence interval is valid for the case where σ1² ≠ σ2², because S_D² estimates σ_D² = V(X1 − X2). Also, for large samples (say n ≥ 30 pairs), the assumption of normality is unnecessary.

Example 10-21

We now return to the data in Table 10-3 concerning the time for n = 14 subjects to parallel park two cars. From the column of observed differences we calculate d̄ = 1.21 and s_d = 12.68. The 90% confidence interval for μD = μ1 − μ2 is found from equation 10-55 as follows:

$$ \bar{d} - t_{0.05,13}\,s_d/\sqrt{n} \le \mu_D \le \bar{d} + t_{0.05,13}\,s_d/\sqrt{n}, $$
$$ 1.21 - 1.771(12.68)/\sqrt{14} \le \mu_D \le 1.21 + 1.771(12.68)/\sqrt{14}, $$
$$ -4.79 \le \mu_D \le 7.21. $$

Notice that the confidence interval on μD includes zero. This implies that, at the 90% level of confidence, the data do not support the claim that the two cars have different mean parking times μ1 and μ2. That is, the value μD = μ1 − μ2 = 0 is not inconsistent with the observed data.

Note that when pairing data, degrees of freedom are lost in comparison to the two-sample confidence intervals, but typically a gain in precision of estimation is achieved because S_D is smaller than S_p.
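As a computational check (our own sketch, not from the text), the paired interval of Example 10-21 can be reproduced directly from the Table 10-3 differences.

    # Paired-t confidence interval (equation 10-55).
    import numpy as np
    from scipy.stats import t

    d = np.array([19.2, 5.6, -0.6, -17.2, 0.6, -5.0, 7.0, 26.0,
                  5.8, 1.2, -6.2, 0.6, 4.0, -24.0])
    n = len(d)
    dbar, sd = d.mean(), d.std(ddof=1)        # 1.21 and 12.68
    half = t.ppf(0.95, n - 1) * sd / n**0.5   # 90% interval: t_{0.05,13} = 1.771
    print(dbar - half, dbar + half)           # approx. -4.79 and 7.21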

10-3.4 Confidence Interval on the Ratio of Variances of Two Normal Distributions

Suppose that X1 and X2 are independent normal random variables with unknown means μ1 and μ2 and unknown variances σ1² and σ2², respectively. We wish to find a 100(1 − α)% confidence interval on the ratio σ1²/σ2². Let two random samples of sizes n1 and n2 be taken on X1 and X2, and let S1² and S2² denote the sample variances. To find the confidence interval, we note that the sampling distribution of

$$ F = \frac{S_2^2/\sigma_2^2}{S_1^2/\sigma_1^2} $$

is F with n2 − 1 and n1 − 1 degrees of freedom. This distribution is shown in Fig. 10-9. From Fig. 10-9, we see that

$$ P\{F_{1-\alpha/2,\,n_2-1,\,n_1-1} \le F \le F_{\alpha/2,\,n_2-1,\,n_1-1}\} = 1 - \alpha. $$

Hence

$$ P\left\{\frac{S_1^2}{S_2^2}F_{1-\alpha/2,\,n_2-1,\,n_1-1} \le \frac{\sigma_1^2}{\sigma_2^2} \le \frac{S_1^2}{S_2^2}F_{\alpha/2,\,n_2-1,\,n_1-1}\right\} = 1 - \alpha. $$   (10-56)


Figure 10-9 The distribution of F with n2 − 1 and n1 − 1 degrees of freedom.

Comparing equations 10-56 and 10-18, we see that a 100(1 − α)% two-sided confidence interval for σ1²/σ2² is

$$ \frac{S_1^2}{S_2^2}F_{1-\alpha/2,\,n_2-1,\,n_1-1} \le \frac{\sigma_1^2}{\sigma_2^2} \le \frac{S_1^2}{S_2^2}F_{\alpha/2,\,n_2-1,\,n_1-1}, $$   (10-57)

where the lower 1 − α/2 tail point of the F distribution with n2 − 1 and n1 − 1 degrees of freedom is given by (see equation 9-22)

$$ F_{1-\alpha/2,\,n_2-1,\,n_1-1} = \frac{1}{F_{\alpha/2,\,n_1-1,\,n_2-1}}. $$   (10-58)

We may also construct one-sided confidence intervals. A 100(1 − α)% lower-confidence limit on σ1²/σ2² is

$$ \frac{S_1^2}{S_2^2}F_{1-\alpha,\,n_2-1,\,n_1-1} \le \frac{\sigma_1^2}{\sigma_2^2}, $$   (10-59)

while a 100(1 − α)% upper-confidence interval on σ1²/σ2² is

$$ \frac{\sigma_1^2}{\sigma_2^2} \le \frac{S_1^2}{S_2^2}F_{\alpha,\,n_2-1,\,n_1-1}. $$   (10-60)

Example 10-22

Consider the batch chemical etching process described in Example 10-20. Recall that two catalysts are being compared to measure their effectiveness in reducing immersion times for printed circuit boards. n1 = 12 batches were run with catalyst 1 and n2 = 15 batches were run with catalyst 2, yielding s1 = 0.85 minutes and s2 = 0.98 minutes. We will find a 90% confidence interval on the ratio of variances σ1²/σ2². From equation 10-57, we find that

$$ \frac{s_1^2}{s_2^2}F_{0.95,14,11} \le \frac{\sigma_1^2}{\sigma_2^2} \le \frac{s_1^2}{s_2^2}F_{0.05,14,11}, $$

$$ \frac{(0.85)^2}{(0.98)^2}(0.39) \le \frac{\sigma_1^2}{\sigma_2^2} \le \frac{(0.85)^2}{(0.98)^2}(2.74), $$

or

$$ 0.29 \le \frac{\sigma_1^2}{\sigma_2^2} \le 2.06, $$


using the fact that F_{0.95,14,11} = 1/F_{0.05,11,14} = 1/2.58 = 0.39. Since this confidence interval includes unity, we could not claim that the standard deviations of the immersion times for the two catalysts are different at the 90% level of confidence.
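A Python sketch of Example 10-22 follows (ours); SciPy's f.ppf returns lower-tail quantiles, so the book's upper-tail notation F_{0.05,14,11} corresponds to f.ppf(0.95, 14, 11).

    # Confidence interval on a variance ratio (equation 10-57).
    from scipy.stats import f

    n1, s1 = 12, 0.85
    n2, s2 = 15, 0.98
    alpha = 0.10

    ratio = s1**2 / s2**2
    lo = ratio * f.ppf(alpha / 2, n2 - 1, n1 - 1)      # uses F_{0.95,14,11} = 0.39
    hi = ratio * f.ppf(1 - alpha / 2, n2 - 1, n1 - 1)  # uses F_{0.05,14,11} = 2.74
    print(lo, hi)                                      # approx. 0.29 and 2.06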

1()"3.5 Confidence Interval on tbe Difference between Two Proportions


Ii there are two proportions of interest, say P, and P,. it is possible to obtain a 100(1- a)%
confidence interval on their difference, PI - Pl' If two independent samples of size n[ and
n., are taken from infinite populations so that X, and X, are independent. binomial random
variables mth parameters (n 11 PI) and (~, pz), respectively, where Xl represents the num~
ber of sample observations from the first population that belong to a class ofinterest and X,
represents the number of sample observations from the second population that belong to the
class of interes~ then fit = X,/n, and P2 = X,In., are independent estimators of p, and p"
respectively. Furthermore, under the assumption that the normal approximation to the binomial applies, the statistic

p, - ,0, -(p, - P2)

Z--r-~"='~~~~"'~

jP,(l- p,) , p,(l- p,)

~n,~n.,
is distributed approximately as standard nonnaL Using an approach analogous to that oftlle
previous section, it follows that an approximate 100(1 - a)% two-sided confidence interval forp,-pzis

$$ \hat{p}_1 - \hat{p}_2 - z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}} \le p_1 - p_2 \le \hat{p}_1 - \hat{p}_2 + z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}. $$   (10-61)

An approximate 100(1 − α)% lower-confidence interval for p1 − p2 is

$$ \hat{p}_1 - \hat{p}_2 - z_{\alpha}\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}} \le p_1 - p_2, $$   (10-62)

and an approximate 100(1 − α)% upper-confidence interval for p1 − p2 is

$$ p_1 - p_2 \le \hat{p}_1 - \hat{p}_2 + z_{\alpha}\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}. $$   (10-63)

Example 10-23

Consider the data in Example 10-17. Suppose that a modification is made in the surface finishing process and subsequently a second random sample of 85 axle shafts is obtained. The number of defective shafts in this second sample is 10. Therefore, since n1 = 75, p̂1 = 0.16, n2 = 85, and p̂2 = 10/85 = 0.12, we can obtain an approximate 95% confidence interval on the difference in the proportions of defectives produced under the two processes from equation 10-61 as

$$ 0.16 - 0.12 - 1.96\sqrt{\frac{0.16(0.84)}{75} + \frac{0.12(0.88)}{85}} \le p_1 - p_2 \le 0.16 - 0.12 + 1.96\sqrt{\frac{0.16(0.84)}{75} + \frac{0.12(0.88)}{85}}. $$

This simplifies to

$$ -0.07 \le p_1 - p_2 \le 0.15. $$

This interval includes zero, so, based on the sample data, it seems unlikely that the changes made in the surface finish process have reduced the proportion of defective axle shafts being produced.
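A short Python sketch of Example 10-23 (ours) follows.

    # Approximate interval on a difference in proportions (equation 10-61).
    from scipy.stats import norm

    n1, p1 = 75, 0.16
    n2, p2 = 85, 10 / 85
    z = norm.ppf(0.975)                       # 1.96 for a 95% interval

    se = (p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2) ** 0.5
    d = p1 - p2
    print(d - z * se, d + z * se)             # approx. -0.07 and 0.15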

10-4 APPROXIMATE CONFIDENCE INTERVALS IN MAXIMUM LIKELIHOOD ESTIMATION

If the method of maximum likelihood is used for parameter estimation, the asymptotic properties of these estimators may be used to obtain approximate confidence intervals. Let θ̂ be the maximum likelihood estimator of θ. For large samples, θ̂ is approximately normally distributed with mean θ and variance V(θ̂) given by the Cramér-Rao lower bound (equation 10-4). Therefore, an approximate 100(1 − α)% confidence interval for θ is

$$ \hat{\theta} - z_{\alpha/2}\sqrt{V(\hat{\theta})} \le \theta \le \hat{\theta} + z_{\alpha/2}\sqrt{V(\hat{\theta})}. $$   (10-64)

Usually, V(θ̂) is a function of the unknown parameter θ. In these cases, replace θ with θ̂.

Example 10-24

Recall Example 10-3, where it was shown that the maximum likelihood estimator of the parameter p of a Bernoulli distribution is p̂ = (1/n)Σ_{i=1}^{n} Xi = X̄. Using the Cramér-Rao lower bound, we may verify that the lower bound for the variance of p̂ is

$$ V(\hat{p}) \ge \frac{1}{nE\left[\dfrac{\partial \ln\!\left(p^X(1-p)^{1-X}\right)}{\partial p}\right]^2} = \frac{1}{nE\left[\dfrac{X}{p} - \dfrac{1-X}{1-p}\right]^2} = \frac{1}{nE\left[\dfrac{X^2}{p^2} + \dfrac{(1-X)^2}{(1-p)^2} - \dfrac{2X(1-X)}{p(1-p)}\right]}. $$

For the Bernoulli distribution, we observe that E(X) = p and E(X²) = p. Therefore, this last expression simplifies to

$$ V(\hat{p}) \ge \frac{1}{n\left[\dfrac{1}{p} + \dfrac{1}{1-p}\right]} = \frac{p(1-p)}{n}. $$

This result should not be surprising, since we know directly that for the Bernoulli distribution V(X̄) = V(X1)/n = p(1 − p)/n. In any case, replacing p in V(p̂) by p̂, the approximate 100(1 − α)% confidence interval for p is found from equation 10-64 to be

$$ \hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \le p \le \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}. $$


10-5 SIMULTANEOUS CONFIDENCE INTERVALS

Occasionally it is necessary to construct several confidence intervals on more than one parameter, and we wish the probability to be (1 − α) that all such confidence intervals simultaneously produce correct statements. For example, suppose that we are sampling from a normal population with unknown mean and variance, and we wish to construct confidence intervals for μ and σ² such that the probability is (1 − α) that both intervals simultaneously yield correct conclusions. Since X̄ and S² are independent, we could ensure this result by constructing 100(1 − α)^{1/2}% confidence intervals for each parameter separately, and both intervals would simultaneously produce correct conclusions with probability [(1 − α)^{1/2}]² = (1 − α).

If the sample statistics on which the confidence intervals are based are not independent random variables, then the confidence intervals are not independent, and other methods must be used. In general, suppose that m confidence intervals are required. The Bonferroni inequality states that

$$ P\{\text{all } m \text{ statements are simultaneously correct}\} \ge 1 - \sum_{i=1}^{m}\alpha_i \equiv 1 - \alpha, $$   (10-65)

where 1 − α_i is the confidence level used in the ith confidence interval. In practice, we select a value for the simultaneous confidence level 1 − α, and then choose the individual α_i such that Σ_{i=1}^{m} α_i = α. Usually, we set α_i = α/m.

As an illustration, suppose we wished to construct two confidence intervals on the means of two normal distributions such that we are at least 90% confident that both statements are simultaneously correct. Therefore, since 1 − α = 0.90, we have α = 0.10, and since two confidence intervals are required, each of these should be constructed with α_i = α/2 = 0.10/2 = 0.05, i = 1, 2. That is, two individual 95% confidence intervals on μ1 and μ2 will simultaneously lead to correct statements with probability at least 0.90.
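The Bonferroni allocation is simple enough to express in a few lines of Python (our own sketch).

    # Bonferroni allocation for m simultaneous confidence intervals.
    def individual_alpha(alpha, m):
        """Split an overall error rate alpha evenly across m intervals."""
        return alpha / m

    m, alpha = 2, 0.10
    ai = individual_alpha(alpha, m)
    print(1 - ai)        # each interval built at 95% confidence
    print(1 - m * ai)    # joint confidence is then at least 90%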

10-6 BAYESIAN CONFIDENCE INTERVALS

Previously, we presented Bayesian techniques for point estimation. In this section, we will present the Bayesian approach to constructing confidence intervals.

We may use Bayesian methods to construct interval estimates of parameters that are similar to confidence intervals. If the posterior density for θ has been obtained, we can construct an interval, usually centered at the posterior mean, that contains 100(1 − α)% of the posterior probability. Such an interval is called the 100(1 − α)% Bayes interval for the unknown parameter θ.

While in many cases the Bayes interval estimate for θ will be quite similar to a classical confidence interval with the same confidence coefficient, the interpretation of the two is very different. A confidence interval is an interval that, before the sample is taken, will include the unknown θ with probability 1 − α. That is, the classical confidence interval relates to the relative frequency of an interval including θ. On the other hand, a Bayes interval is an interval that contains 100(1 − α)% of the posterior probability for θ. Since the posterior probability density measures a degree of belief about θ given the sample results,


the Bayes interval provides a subjective degree of belief about θ rather than a frequency interpretation. The Bayes interval estimate of θ is affected by the sample results but is not completely determined by them.

Example 10-25

Suppose that the random variable X is normally distributed with mean μ and variance 4. The value of μ is unknown, but a reasonable prior density would be normal with mean 2 and variance 1. That is,

$$ f(\mu) = \frac{1}{\sqrt{2\pi}}\,e^{-(\mu-2)^2/2}. $$

We can show that the posterior density for μ is

$$ f(\mu\,|\,x_1, x_2, \ldots, x_n) \propto \exp\left\{-\frac{1}{2}\left(\frac{n+4}{4}\right)\left(\mu - \frac{n\bar{x}+8}{n+4}\right)^2\right\}, $$

using the methods of Section 10-1.4. Thus, the posterior distribution for μ is normal with mean (nx̄ + 8)/(n + 4) and variance 4/(n + 4). A 95% Bayes interval for μ, which is symmetric about the posterior mean, would be

$$ \frac{n\bar{X}+8}{n+4} - z_{0.025}\sqrt{\frac{4}{n+4}} \le \mu \le \frac{n\bar{X}+8}{n+4} + z_{0.025}\sqrt{\frac{4}{n+4}}. $$   (10-66)

If a random sample of size 16 is taken and we find that x̄ = 2.5, equation 10-66 reduces to

$$ 1.52 \le \mu \le 3.28. $$

If we ignore the prior information, the classical confidence interval for μ is

$$ 1.52 \le \mu \le 3.48. $$

We see that the Bayes interval is slightly shorter than the classical confidence interval, because the prior information acts like a slight increase in the effective sample size relative to the case in which no prior knowledge is assumed.
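A Python sketch of the posterior calculation in Example 10-25 follows (ours).

    # Normal data (variance 4), normal prior with mean 2 and variance 1.
    from scipy.stats import norm

    n, xbar = 16, 2.5
    post_mean = (n * xbar + 8) / (n + 4)      # (n*xbar + 8)/(n + 4) = 2.4
    post_sd = (4 / (n + 4)) ** 0.5            # sqrt(4/(n + 4))
    z = norm.ppf(0.975)
    print(post_mean - z * post_sd, post_mean + z * post_sd)   # approx. 1.52, 3.28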

10-7 BOOTSTRAP CONFIDENCE INTERVALS

In Section 10-1.6 we introduced the bootstrap technique for estimating the standard error of an estimate θ̂ of a parameter θ. The bootstrap technique can also be used to construct a confidence interval on θ. For an arbitrary parameter θ, general 100(1 − α)% lower and upper limits are, respectively,

$$ L = \hat{\theta} - \left[100(1-\alpha/2) \text{ percentile of } (\hat{\theta}^* - \hat{\theta})\right], $$
$$ U = \hat{\theta} - \left[100(\alpha/2) \text{ percentile of } (\hat{\theta}^* - \hat{\theta})\right]. $$


Bootstrap samples can be generated to estimate the values of L and U.

Suppose B bootstrap samples are generated and θ̂*₁, θ̂*₂, ..., θ̂*_B are calculated. From these estimates, we then compute the differences θ̂*₁ − θ̂*, θ̂*₂ − θ̂*, ..., θ̂*_B − θ̂*, arrange the differences in increasing order, and find the necessary percentiles 100(1 − α/2) and 100(α/2) for L and U. For example, if B = 200 and a 90% confidence interval is desired, then the 100(1 − 0.10/2) = 95th percentile and the 100(0.10/2) = 5th percentile would be the 190th difference and the 10th difference, respectively.

A"l electronic device consists of four components. The time ~o failure for each component follows ar.
exponential distribution and the co:nponents are identical to and bdependen! of one another. The
elcctronic device will fail only after all four components have failed. ine times to failure for the electronic components have been collected for lS such devices. The total times to failure are
78.7778, 13.5260, 6.8291, 47.3746, 16.2033, 27.5387, 28.2515, 385826,
35.4363, 80.2757, 50.3861. 81.3155. 42,2532, 33.9970, 57.4312.

It is of interest to construct a 90% co::Jiidence mterval on the exponential paramcter ;L By definition,


Ll],e sum of r indepcadent and identically distrlbuted exponential ra:.dom variables follows a gamma"
distribution and is defined as gamma(r, It). Therefore, r=4, bllt A. needs to be estir:lated. A bootstrap
estimate for Acan be found using the technique given in Section 10-1.6, Using the time-tofailure data
above. we find the average time to faibrc to be x::; 42.545. 1he mean of a gamma distribution is E(X)
""" riA 2.I1d A is calculated for each bootstrap sample. Ruuui.ng M:nilablS for B """ 100 bootstraps, we
found that the bootstrap estimate X"", 0.0949. Using the bootstrap estimates for each sample, the differences can be calculated and some of the calculations are showr. in Table 10-4.
\\Then the 100 differences are arranged in increasing order, the 5th percentile and the 95th per~
centiles turn out to be -0.0205 and 0,0232, respectively. Therefore, the resulting confidence limits are
L= 0.0949 -0.0232 =0.0717,
U= 0.0949 - (-{).0205) = 0.1154.

We are approximately 90% confident that the :rue value of Alles bet\1.'cen O.fYl 17 a:.d 0.1154. Figure

10-10 displays the histogram of the bootstrap estimates 1~ while Fig. lQ..ll depicts the Cift"'erences
A: - T. The bootstrap esf..JIlates are reasonable when the estimator is unbiased and the stmdm"d eITO!
is approximately constant.

Table 10-4 Bootstrap Estimates for Example 10-26

Sample      λ̂*_i          λ̂*_i − λ̄*
   1       0.087316      -0.0075392
   2       0.090689      -0.0041660
   3       0.096664       0.0018094
   ⋮           ⋮               ⋮
 100       0.090193      -0.0046623


Figure 10-10 Histogram of the bootstrap estimates λ̂*_i.


Figure 10-11 Histogram of the differences λ̂*_i − λ̄*.
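The bootstrap procedure of Example 10-26 is easy to reproduce in Python; the sketch below is our own code, and because resampling is random, its limits will differ slightly from the Minitab values quoted in the text.

    # Bootstrap confidence interval for the exponential parameter lambda.
    import numpy as np

    times = np.array([78.7778, 13.5260, 6.8291, 47.3746, 16.2033,
                      27.5387, 28.2515, 38.5826, 35.4363, 80.2757,
                      50.3861, 81.3155, 42.2532, 33.9970, 57.4312])
    r, B = 4, 100
    rng = np.random.default_rng(1)            # seed chosen arbitrarily

    # lambda-hat = r / xbar for each bootstrap resample (E(X) = r/lambda)
    lam_star = np.array([r / rng.choice(times, size=times.size, replace=True).mean()
                         for _ in range(B)])
    lam_bar = lam_star.mean()                 # bootstrap estimate of lambda
    diffs = lam_star - lam_bar

    L = lam_bar - np.percentile(diffs, 95)    # 90% interval: 95th and 5th
    U = lam_bar - np.percentile(diffs, 5)     #   percentiles of the differences
    print(lam_bar, L, U)                      # near 0.095, 0.072, 0.115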

10-8 OTHER INTERVAL ESTIMATION PROBLEMS

10-8.1 Prediction Intervals

So far in this chapter we have presented interval estimators on population parameters, such as the mean μ. There are many situations where the practitioner would like to predict a single future observation of the random variable of interest instead of predicting or estimating the average of this random variable. A prediction interval can be constructed for any single observation at some future time.

Consider a given random sample of size n, X1, X2, ..., Xn, from a normal population with mean μ and variance σ². Let the sample average be denoted by X̄. Suppose we wish to predict the future observation X_{n+1}. Since X̄ is the point predictor for this observation, the prediction error is given by X_{n+1} − X̄. The expected value and variance of the prediction error are

$$ E(X_{n+1} - \bar{X}) = \mu - \mu = 0 $$

and

$$ V(X_{n+1} - \bar{X}) = \sigma^2 + \frac{\sigma^2}{n} = \sigma^2\left(1 + \frac{1}{n}\right). $$

Since X_{n+1} and X̄ are independent, normally distributed random variables, the prediction error is also normally distributed, and

$$ Z = \frac{(X_{n+1} - \bar{X}) - 0}{\sqrt{\sigma^2\left(1 + \dfrac{1}{n}\right)}} $$

is standard normal. If σ² is unknown, it can be estimated by the sample variance S², and then

$$ T = \frac{X_{n+1} - \bar{X}}{\sqrt{S^2\left(1 + \dfrac{1}{n}\right)}} $$

follows the t distribution with n − 1 degrees of freedom.


Following the usual procedure for constructing confidence intervals, the two-sided 100(1 − α)% prediction interval is obtained from

$$ P\{-t_{\alpha/2,\,n-1} \le T \le t_{\alpha/2,\,n-1}\} = 1 - \alpha. $$

By rearranging the inequality we obtain the final form for the two-sided 100(1 − α)% prediction interval:

$$ \bar{X} - t_{\alpha/2,\,n-1}\sqrt{S^2\left(1+\frac{1}{n}\right)} \le X_{n+1} \le \bar{X} + t_{\alpha/2,\,n-1}\sqrt{S^2\left(1+\frac{1}{n}\right)}. $$   (10-67)

The lower one-sided 100(1 − α)% prediction interval on X_{n+1} is given by

$$ \bar{X} - t_{\alpha,\,n-1}\sqrt{S^2\left(1+\frac{1}{n}\right)} \le X_{n+1}. $$   (10-68)

The upper one-sided 100(1 − α)% prediction interval on X_{n+1} is given by

$$ X_{n+1} \le \bar{X} + t_{\alpha,\,n-1}\sqrt{S^2\left(1+\frac{1}{n}\right)}. $$   (10-69)

Example 10-27

Maximum forces experienced by a transport aircraft for an airline on a particular route for 10 flights are (in units of gravity, g)

1.15, 1.23, 1.56, 1.69, 1.71, 1.83, 1.83, 1.85, 1.90, 1.91.


The sample average and sample standard deviation are calculated to be x̄ = 1.666 and s = 0.273, respectively. It may be of importance to predict the next maximum force experienced by the aircraft. Since t_{0.025,9} = 2.262, the 95% prediction interval on X_{11} is

$$ 1.666 - 2.262\sqrt{(0.273)^2\left(1+\frac{1}{10}\right)} \le X_{11} \le 1.666 + 2.262\sqrt{(0.273)^2\left(1+\frac{1}{10}\right)}, $$

$$ 1.018 \le X_{11} \le 2.314. $$

10-8.2 Tolerance Intervals

As presented earlier in this chapter, confidence intervals are intervals in which we expect the true population parameter, such as μ, to lie. In contrast, tolerance intervals are intervals in which we expect a percentage of the population values to lie.

Suppose that X is a normally distributed random variable with mean μ and variance σ². We would expect approximately 90% of all values of X to be contained within the interval μ ± 1.645σ. But what if μ and σ are unknown and must be estimated? Using the point estimates x̄ and s for a sample of size n, we can construct the interval x̄ ± 1.645s. Unfortunately, due to the variability in estimating μ and σ, the resulting interval may contain less than 90% of the values. In this particular instance, a value larger than 1.645 will be needed to guarantee 90% coverage when using point estimates for the population parameters. We can construct an interval that will contain the stated percentage of population values and be relatively confident in the result. For example, we may want to be 90% confident that the resulting interval covers at least 95% of the population values. This type of interval is referred to as a tolerance interval and can be constructed easily for various confidence levels.

In general, for 0 < q < 100, the two-sided tolerance interval for covering at least q% of the values from a normal population with 100(1 − α)% confidence is x̄ ± ks. The value k is a constant tabulated for various combinations of q and 100(1 − α). Values of k are given in Table XIV of the Appendix for q = 90, 95, and 99 and for 100(1 − α) = 90, 95, and 99.

The lower one-sided tolerance interval for covering at least q% of the values from a normal population with 100(1 − α)% confidence is x̄ − ks. The upper one-sided tolerance interval for covering at least q% of the values from a normal population with 100(1 − α)% confidence is x̄ + ks. Various values of k for one-sided tolerance intervals were calculated using the technique given in Odeh and Owens (1980) and are provided in Table XIV of the Appendix.

Example 10-28

Reconsider the maximum forces for the transport aircraft in Example 10-27. A two-sided tolerance interval is desired that would cover 99% of all maximum forces with 95% confidence. From Table XIV (Appendix), with 100(1 − α) = 95, q = 99, and n = 10, we find that k = 4.433. The sample average and sample standard deviation were calculated as x̄ = 1.666 and s = 0.273, respectively. The resulting tolerance interval is then

1.666 ± 4.433(0.273)

or

(0.456, 2.876).

Therefore, we conclude that we are 95% confident that at least 99% of all maximum forces would lie between 0.456 g and 2.876 g.


It is possible to construct nonparametric tolerance intervals that are based on the extreme values in a random sample of size n from any continuous population. If P is the minimum proportion of the population contained between the largest and smallest observations with confidence 1 − α, then it can be shown that

$$ nP^{n-1} - (n-1)P^n = \alpha. $$

Further, the required n is approximately

$$ n = \frac{1}{2} + \frac{1+P}{1-P}\cdot\frac{\chi^2_{\alpha,4}}{4}. $$   (10-70)

Thus, in order to be 95% certain that at least 90% of the population will be included between the extreme values of the sample, we require a sample of size

$$ n = \frac{1}{2} + \frac{1.9}{0.1}\cdot\frac{9.488}{4} \approx 46. $$

Note that there is a fundamental difference between confidence limits and tolerance limits. Confidence limits (and thus confidence intervals) are used to estimate a parameter of a population, while tolerance limits (and tolerance intervals) are used to indicate the limits between which we can expect to find a proportion of a population. As n approaches infinity, the length of a confidence interval approaches zero, while tolerance limits approach the corresponding quantiles for the population.
10-9 SUMMARY

This chapter has introduced the point and interval estimation of unknown parameters. A number of methods of obtaining point estimators were discussed, including the method of maximum likelihood and the method of moments. The method of maximum likelihood usually leads to estimators that have good statistical properties. Confidence intervals were derived for a variety of parameter estimation problems. These intervals have a frequency interpretation. The two-sided confidence intervals developed in Sections 10-2 and 10-3 are summarized in Table 10-5. In some instances, one-sided confidence intervals may be appropriate. These may be obtained by setting one confidence limit in the two-sided confidence interval equal to the lower (or upper) limit of a feasible region for the parameter, and using α instead of α/2 as the probability level on the remaining upper (or lower) confidence limit. Confidence intervals using a bootstrapping technique were introduced. Tolerance intervals were also presented. Approximate confidence intervals in maximum likelihood estimation and simultaneous confidence intervals were also briefly introduced.

Table 10-5 Summary of Confidence Interval Procedures

Each entry gives the problem type, the point estimator, and the two-sided 100(1 − α)% confidence interval.

Mean μ of a normal distribution, variance σ² known. Point estimator: X̄.
$$ \bar{X} - z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} $$

Difference in means of two normal distributions μ1 − μ2, variances σ1² and σ2² known. Point estimator: X̄1 − X̄2.
$$ \bar{X}_1 - \bar{X}_2 - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1}+\frac{\sigma_2^2}{n_2}} \le \mu_1 - \mu_2 \le \bar{X}_1 - \bar{X}_2 + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1}+\frac{\sigma_2^2}{n_2}} $$

Mean μ of a normal distribution, variance σ² unknown. Point estimator: X̄.
$$ \bar{X} - t_{\alpha/2,\,n-1}\,\frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2,\,n-1}\,\frac{S}{\sqrt{n}} $$

Difference in means of two normal distributions μ1 − μ2, variances σ1² = σ2² unknown. Point estimator: X̄1 − X̄2.
$$ \bar{X}_1 - \bar{X}_2 - t_{\alpha/2,\,n_1+n_2-2}\,S_p\sqrt{\tfrac{1}{n_1}+\tfrac{1}{n_2}} \le \mu_1 - \mu_2 \le \bar{X}_1 - \bar{X}_2 + t_{\alpha/2,\,n_1+n_2-2}\,S_p\sqrt{\tfrac{1}{n_1}+\tfrac{1}{n_2}}, $$
where
$$ S_p = \sqrt{\frac{(n_1-1)S_1^2 + (n_2-1)S_2^2}{n_1+n_2-2}}. $$

Difference in means of two normal distributions for paired samples, μD = μ1 − μ2. Point estimator: D̄.
$$ \bar{D} - t_{\alpha/2,\,n-1}\,\frac{S_D}{\sqrt{n}} \le \mu_D \le \bar{D} + t_{\alpha/2,\,n-1}\,\frac{S_D}{\sqrt{n}} $$

Variance σ² of a normal distribution. Point estimator: S².
$$ \frac{(n-1)S^2}{\chi^2_{\alpha/2,\,n-1}} \le \sigma^2 \le \frac{(n-1)S^2}{\chi^2_{1-\alpha/2,\,n-1}} $$

Ratio of the variances σ1²/σ2² of two normal distributions. Point estimator: S1²/S2².
$$ \frac{S_1^2}{S_2^2}F_{1-\alpha/2,\,n_2-1,\,n_1-1} \le \frac{\sigma_1^2}{\sigma_2^2} \le \frac{S_1^2}{S_2^2}F_{\alpha/2,\,n_2-1,\,n_1-1} $$

Proportion or parameter of a binomial distribution p. Point estimator: p̂.
$$ \hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \le p \le \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} $$

Difference in two proportions or two binomial parameters p1 − p2. Point estimator: p̂1 − p̂2.
$$ \hat{p}_1 - \hat{p}_2 - z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1}+\frac{\hat{p}_2(1-\hat{p}_2)}{n_2}} \le p_1 - p_2 \le \hat{p}_1 - \hat{p}_2 + z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1}+\frac{\hat{p}_2(1-\hat{p}_2)}{n_2}} $$


10-10 EXERCISES
10-1. Suppose we have a random sample of size 2n from a population denoted X, and E(X) = μ and V(X) = σ². Let

$$ \bar{X}_1 = \frac{1}{2n}\sum_{i=1}^{2n}X_i \quad\text{and}\quad \bar{X}_2 = \frac{1}{n}\sum_{i=1}^{n}X_i $$

be two estimators of μ. Which is the better estimator of μ? Explain your choice.

10-2. Let X1, X2, ..., X7 denote a random sample from a population having mean μ and variance σ². Consider the following estimators of μ:

$$ \hat{\theta}_1 = \frac{X_1 + X_2 + \cdots + X_7}{7}, \qquad \hat{\theta}_2 = \frac{2X_1 - X_6 + X_4}{2}. $$

Is either estimator unbiased? Which estimator is "better"? In what sense is it better?

10-3. Suppose that θ̂1 and θ̂2 are estimators of the parameter θ. We know that E(θ̂1) = θ, E(θ̂2) = θ/2, V(θ̂1) = 10, and V(θ̂2) = 4. Which estimator is "better"? In what sense is it better?

10-4. Suppose that θ̂1, θ̂2, and θ̂3 are estimators of θ. We know that E(θ̂1) = E(θ̂2) = θ, E(θ̂3) ≠ θ, V(θ̂1) = 12, V(θ̂2) = 10, and E(θ̂3 − θ)² = 6. Compare these three estimators. Which do you prefer? Why?

10-5. Let three random samples of sizes n1 = 10, n2 = 8, and n3 = 6 be taken from a population with mean μ and variance σ². Let S1², S2², and S3² be the sample variances. Show that

$$ S^2 = \frac{10S_1^2 + 8S_2^2 + 6S_3^2}{24} $$

is an unbiased estimator of σ².

10-6. Best Linear Unbiased Estimators. An estimator θ̂ is called a linear estimator if it is a linear combination of the observations in the sample. θ̂ is called a best linear unbiased estimator if, of all linear functions of the observations, it both is unbiased and has minimum variance. Show that the sample mean X̄ is the best linear unbiased estimator of the population mean μ.

10-7. Find the maximum likelihood estimator of the parameter c of the Poisson distribution, based on a random sample of size n.

10-8. Find the estimator of c in the Poisson distribution by the method of moments, based on a random sample of size n.

10-9. Find the maximum likelihood estimator of the parameter λ in the exponential distribution, based on a random sample of size n.

10-10. Find the estimator of λ in the exponential distribution by the method of moments, based on a random sample of size n.

10-11. Find moment estimators of the parameters r and λ of the gamma distribution, based on a random sample of size n.

10-12. Let X be a geometric random variable with parameter p. Find an estimator of p by the method of moments, based on a random sample of size n.

10-13. Let X be a geometric random variable with parameter p. Find the maximum likelihood estimator of p, based on a random sample of size n.

10-14. Let X be a Bernoulli random variable with parameter p. Find an estimator of p by the method of moments, based on a random sample of size n.

10-15. Let X be a binomial random variable with parameters n (known) and p. Find an estimator of p by the method of moments, based on a random sample of size N.

10-16. Let X be a binomial random variable with parameters n and p, both unknown. Find estimators of n and p by the method of moments, based on a random sample of size N.

10-17. Let X be a binomial random variable with parameters n (unknown) and p. Find the maximum likelihood estimator of p, based on a random sample of size N.

10-18. Set up the likelihood function for a random sample of size n from a Weibull distribution. What difficulties would be encountered in obtaining the maximum likelihood estimators of the three parameters of the Weibull distribution?

10-19. Prove that if θ̂ is an unbiased estimator of θ, and if lim_{n→∞} V(θ̂) = 0, then θ̂ is a consistent estimator of θ.

10-20. Let X be a random variable with mean μ and variance σ². Given two random samples of sizes n1 and n2 with sample means X̄1 and X̄2, respectively, show that

$$ \bar{X} = a\bar{X}_1 + (1-a)\bar{X}_2, \qquad 0 < a < 1, $$

is an unbiased estimator of μ. Assuming X̄1 and X̄2 to be independent, find the value of a that minimizes the variance of X̄.

10-21. Suppose that the random variable X has the probability distribution

f(x) = (γ + 1)x^γ, 0 < x < 1,
     = 0, otherwise.

Let X1, X2, ..., Xn be a random sample of size n. Find the maximum likelihood estimator of γ.

10-22. Let X have the truncated (on the left at x_t) exponential distribution

f(x) = λ exp[−λ(x − x_t)], x > x_t > 0,
     = 0, otherwise.

Let X1, X2, ..., Xn be a random sample of size n. Find the maximum likelihood estimator of λ.

10-23. Assume that λ in the previous exercise is known but x_t is unknown. Obtain the maximum likelihood estimator of x_t.

10-24. Let X be a random variable with mean μ and variance σ², and let X1, X2, ..., Xn be a random sample of size n from X. Show that the estimator G = K Σ_{i=1}^{n−1} (X_{i+1} − X_i)² is an unbiased estimator of σ² for an appropriate choice of K. Find the appropriate value for K.

10-25. Let X be a normally distributed random variable with mean μ and variance σ². Assume that σ² is known and μ unknown. The prior density for μ is assumed to be normal with mean μ0 and variance σ0². Determine the posterior density for μ, given a random sample of size n from X.

10-26. Let X be normally distributed with known mean μ and unknown variance σ². Assume that the prior density for 1/σ² is a gamma distribution with parameters m + 1 and mσ0². Determine the posterior density for 1/σ², given a random sample of size n from X.

10-27. Let X be a geometric random variable with parameter p. Suppose we assume a beta distribution with parameters a and b as the prior density for p. Determine the posterior density for p, given a random sample of size n from X.

10-28. Let X be a Bernoulli random variable with parameter p. If the prior density for p is a beta distribution with parameters a and b, determine the posterior density for p, given a random sample of size n from X.

10-29. Let X be a Poisson random variable with parameter λ. The prior density for λ is a gamma distribution with parameters m + 1 and (m + 1)/λ0. Determine the posterior density for λ, given a random sample of size n from X.

10-30. Suppose that X ~ N(μ, 40), and let the prior density for μ be N(4, 8). For a random sample of size 25, the value x̄ = 4.85 is obtained. What is the Bayes estimate of μ, assuming a squared-error loss?

10-31. A process manufactures printed circuit boards. A locating notch is drilled a distance X from a component hole on the board. The distance is a random variable X ~ N(μ, 0.01). The prior density for μ is uniform between 0.98 and 1.20 inches. A random sample of size 4 produces the value x̄ = 1.05. Assuming a squared-error loss, determine the Bayes estimate of μ.

10-32. The time between failures of a milling machine is exponentially distributed with parameter λ. Suppose we assume an exponential prior on λ with a mean of 3000 hours. Two machines are observed and the average time between failures is x̄ = 3135 hours. Assuming a squared-error loss, determine the Bayes estimate of λ.

10-33. The weight of boxes of candy is normally distributed with mean μ and variance 1/10. It is reasonable to assume a prior density for μ that is normal with a mean of 10 pounds and a variance of 1/25. Determine the Bayes estimate of μ given that a sample of size 25 produces x̄ = 10.05 pounds. If boxes that weigh less than 9.95 pounds are defective, what is the probability that defective boxes will be produced?

10-34. The number of defects that occur on a silicon wafer used in integrated circuit manufacturing is known to be a Poisson random variable with parameter λ. Assume that the prior density for λ is exponential with a parameter of 0.25. A total of 45 defects were observed on 10 wafers. Set up an integral that defines a 95% Bayes interval for λ. What difficulties would you encounter in evaluating this integral?

10-35. The random variable X has a density function

f(x|θ) = 2x/θ², 0 < x < θ,

and the prior density for θ is

f(θ) = 1, 0 < θ < 1.

(a) Find the posterior density for θ, assuming n = 1.
(b) Find the Bayes estimator for θ, assuming the loss function ℓ(θ̂; θ) = θ̂²(θ̂ − θ)² and n = 1.

10-36. Let X follow the Bernoulli distribution with parameter p. Assume a reasonable prior density for p to be

f(p) = 6p(1 − p), 0 ≤ p ≤ 1,
     = 0, otherwise.

If the loss function is squared error, find the Bayes estimator of p if one observation is available. If the loss function is

ℓ(p̂; p) = 2(p̂ − p)²,

find the Bayes estimator of p for n = 1.

10-37. Consider the confidence interval for μ with known standard deviation σ:

$$ \bar{X} - z_{\alpha_1}\,\sigma/\sqrt{n} \le \mu \le \bar{X} + z_{\alpha_2}\,\sigma/\sqrt{n}, $$

where α1 + α2 = α. Let α = 0.05 and find the interval for α1 = α2 = α/2 = 0.025. Now find the interval for the case α1 = 0.01 and α2 = 0.04. Which interval is shorter? Is there any advantage to a "symmetric" confidence interval?


10-38. When X1, X2, ..., Xn are independent Poisson random variables, each with parameter λ, and when n is relatively large, the sample mean X̄ is approximately normal with mean λ and variance λ/n.
(a) What is the distribution of the statistic (X̄ − λ)/√(λ/n)?
(b) Use the results of (a) to find a 100(1 − α)% confidence interval for λ.

10-39. A manufacturer produces piston rings for an automobile engine. It is known that ring diameter is approximately normally distributed and has standard deviation σ = 0.001 mm. A random sample of 15 rings has a mean diameter of x̄ = 74.036 mm.
(a) Construct a 99% two-sided confidence interval on the mean piston ring diameter.
(b) Construct a 95% lower-confidence limit on the mean piston ring diameter.

10-40. The life in hours of a 75-W light bulb is known to be approximately normally distributed, with a standard deviation σ = 25 hours. A random sample of 20 bulbs has a mean life of x̄ = 1014 hours.
(a) Construct a 95% two-sided confidence interval on the mean life.
(b) Construct a 95% lower-confidence interval on the mean life.

10-41. A civil engineer is analyzing the compressive strength of concrete. Compressive strength is approximately normally distributed with a variance σ² = 1000 (psi)². A random sample of 12 specimens has a mean compressive strength of x̄ = 3250 psi.
(a) Construct a 95% two-sided confidence interval on mean compressive strength.
(b) Construct a 99% two-sided confidence interval on mean compressive strength. Compare the width of this confidence interval with the width of the one found in part (a).

10-42. Suppose that in Exercise 10-40 we wanted to be 95% confident that the error in estimating the mean life is less than 5 hours. What sample size should be used?

10-43. Suppose that in Exercise 10-40 we wanted the total width of the confidence interval on mean life to be 8 hours. What sample size should be used?

10-44. Suppose that in Exercise 10-41 it is desired to estimate the compressive strength with an error that is less than 15 psi. What sample size is required?

10-45. Two machines are used to fill plastic bottles with dishwashing detergent. The standard deviations of fill volume are known to be σ1 = 0.15 fluid ounces and σ2 = 0.18 fluid ounces for the two machines, respectively. Two random samples of n1 = 12 bottles from machine 1 and n2 = 10 bottles from machine 2 are selected, and the sample mean fill volumes are x̄1 = 30.87 fluid ounces and x̄2 = 30.68 fluid ounces.
(a) Construct a 90% two-sided confidence interval on the mean difference in fill volume.
(b) Construct a 95% two-sided confidence interval on the mean difference in fill volume. Compare the width of this interval to the width of the interval in part (a).
(c) Construct a 95% upper-confidence interval on the mean difference in fill volume.

10-46. The burning rates of two different solid-fuel rocket propellants are being studied. It is known that both propellants have approximately the same standard deviation of burning rate; that is, σ1 = σ2 = 3 cm/s. Two random samples of n1 = 20 and n2 = 20 specimens are tested, and the sample mean burning rates are x̄1 = 18 cm/s and x̄2 = 24 cm/s. Construct a 99% confidence interval on the mean difference in burning rate.

10-47. Two different formulations of a lead-free gasoline are being tested to study their road octane numbers. The variance of road octane number for formulation 1 is σ1² = 1.5 and for formulation 2 it is σ2² = 1.2. Two random samples of size n1 = 15 and n2 = 20 are tested, and the mean road octane numbers observed are x̄1 = 89.6 and x̄2 = 92.5. Construct a 95% two-sided confidence interval on the difference in mean road octane number.

10-48. The compressive strength of concrete is being tested by a civil engineer. He tests 16 specimens and obtains the following data:

2216   2237   2249   2204
2225   2301   2281   2263
2318   2255   2275   2295
2250   2238   2300   2217

(a) Construct a 95% two-sided confidence interval on the mean strength.
(b) Construct a 95% lower-confidence interval on the mean strength.
(c) Construct a 95% two-sided confidence interval on the mean strength assuming that σ = 36. Compare this interval with the one from part (a).
(d) Construct a 95% two-sided prediction interval for a single compressive strength.
(e) Construct a two-sided tolerance interval that would cover 99% of all compressive strengths with 95% confidence.

10-49. An article in Annual Reviews of Materials Research (2001, p. 291) presents bond strengths for various energetic materials (explosives, propellants, and pyrotechnics). Bond strengths for 15 such materials are shown below. Construct a two-sided 95% confidence interval on the mean bond strength.

323, 312, 300, 284, 283, 261, 207, 183, 180, 179, 174, 167, 167, 157, 120.
10-50. The wall thickness of 25 glass 2-liter bottles was measured by a quality-control engineer. The sample mean was x̄ = 4.05 mm and the sample standard deviation was s = 0.08 mm. Find a 90% lower-confidence interval on the mean wall thickness.

10-51. An industrial engineer is interested in estimating the mean time required to assemble a printed circuit board. How large a sample is required if the engineer wishes to be 95% confident that the error in estimating the mean is less than 0.25 minutes? The standard deviation of assembly time is 0.45 minutes.

10-52. A random sample of size 15 from a normal population has mean x̄ = 550 and variance s² = 49. Find the following:
(a) A 95% two-sided confidence interval on μ.
(b) A 95% lower-confidence interval on μ.
(c) A 95% upper-confidence interval on μ.
(d) A 95% two-sided prediction interval for a single observation.
(e) A two-sided tolerance interval that would cover 90% of all observations with 99% confidence.

10-53. An article in Computers in Cardiology (1993, p. 317) presents the results of a heart stress test in which the stress is induced by a particular drug. The heart rates (in beats per minute) of nine male patients after the drug is administered are recorded. The average heart rate was found to be x̄ = 102.9 (bpm) with a sample standard deviation of s = 13.9 (bpm). Find a 90% confidence interval on the mean heart rate after the drug is administered.

10-54. Two independent random samples of sizes n1 = 18 and n2 = 20 are taken from two normal populations. The sample means are x̄1 = 200 and x̄2 = 190. We know that the variances are σ1² = 15 and σ2² = 12. Find the following:
(a) A 95% two-sided confidence interval on μ1 − μ2.
(b) A 95% lower-confidence interval on μ1 − μ2.
(c) A 95% upper-confidence interval on μ1 − μ2.

10-55. The output voltage from two different types of transformers is being investigated. Ten transformers of each type are selected at random and the voltage measured. The sample means are x̄1 = 12.13 volts and x̄2 = 12.05 volts. We know that the variances of output voltage for the two types of transformers are σ1² = 0.7 and σ2² = 0.8, respectively. Construct a 95%


two-sided confidence interval on the difference in mean voltage.
10-56. Random samples of size 20 were drawn from two independent normal populations. The sample means and standard deviations were x̄1 = 22.0, s1 = 1.8, x̄2 = 21.5, and s2 = 1.5. Assuming that σ1² = σ2², find the following:
(a) A 95% two-sided confidence interval on μ1 − μ2.
(b) A 95% upper-confidence interval on μ1 − μ2.
(c) A 95% lower-confidence interval on μ1 − μ2.

10-57. The diameter of steel rods manufactured on two different extrusion machines is being investigated. Two random samples of sizes n1 = 15 and n2 = 18 are selected, and the sample means and sample variances are x̄1 = 8.73, s1² = 0.30, x̄2 = 8.68, and s2² = 0.34, respectively. Assuming that σ1² = σ2², construct a 95% two-sided confidence interval on the difference in mean rod diameter.

10-58. Random samples of sizes n1 = 15 and n2 = 10 are drawn from two independent normal populations. The sample means and variances are x̄1 = 300, s1² = 16, x̄2 = 325, s2² = 49. Assuming that σ1² ≠ σ2², construct a 95% two-sided confidence interval on μ1 − μ2.

10-59. Consider the data in Exercise 10-48. Construct the following:
(a) A 95% two-sided confidence interval on σ².
(b) A 95% lower-confidence interval on σ².
(c) A 95% upper-confidence interval on σ².

10-60. Consider the data in Exercise 10-49. Construct the following:
(a) A 99% two-sided confidence interval on σ².
(b) A 99% lower-confidence interval on σ².
(c) A 99% upper-confidence interval on σ².

10-61. Construct a 95% two-sided confidence interval on the variance of the wall thickness data in Exercise 10-50.

10-62. In a random sample of 100 light bulbs, the sample standard deviation of bulb life was found to be 12.6 hours. Compute a 90% upper-confidence interval on the variance of bulb life.

10-63. Consider the data in Exercise 10-56. Construct a 95% two-sided confidence interval on the ratio of the population variances σ1²/σ2².

10-64. Consider the data in Exercise 10-57. Construct the following:
(a) A 90% two-sided confidence interval on σ1²/σ2².
(b) A 95% two-sided confidence interval on σ1²/σ2². Compare the width of this interval with the width of the interval in part (a).


(c) A 90% lower-confidence interval on σ1²/σ2².
(d) A 90% upper-confidence interval on σ1²/σ2².

10-65. Construct a 95% two-sided confidence interval on the ratio of the variances σ1²/σ2² using the data in Exercise 10-58.

10-66. Of 400 randomly selected motorists, 48 were found to be uninsured. Construct a 95% two-sided confidence interval on the uninsured rate for motorists.

10-67. How large a sample would be required in Exercise 10-66 to be 95% confident that the error in estimating the uninsured rate for motorists is less than 0.03?

10-68. A manufacturer of electronic calculators is interested in estimating the fraction of defective units produced. A random sample of 8000 calculators contains 18 defectives. Compute a 99% upper-confidence interval on the fraction defective.

10-69. A study is to be conducted of the percentage of homeowners who own at least two television sets. How large a sample is required if we wish to be 99% confident that the error in estimating this quantity is less than 0.01?

10-70. A study is conducted to determine whether there is a significant difference in union membership based on sex of the person. A random sample of 5000 factory-employed men were polled, and of this group, 785 were members of a union. A random sample of 3000 factory-employed women were also polled, and of this group, 327 were members of a union. Construct a 99% confidence interval on the difference in proportions p1 − p2.

10-71. The fraction of defective product produced by two production lines is being analyzed. A random sample of 1000 units from line 1 has 10 defectives, while a random sample of 1200 units from line 2 has 25 defectives. Find a 99% confidence interval on the difference in fraction defective produced by the two lines.

10-72. The results of a study on powered wheelchair driving performance were presented in the Proceedings of the IEEE 24th Annual Northeast Bioengineering Conference (1998, p. 130). In this study, the effects of two types of joysticks, force sensing (FSJ) and position sensing (PSJ), on power wheelchair control were investigated. Each of 10 subjects was asked to test both joysticks. One response of interest is the time (in seconds) to complete a predetermined course. Data typical of this type of experiment are as follows:

Subject    PSJ     FSJ
   1      25.9    33.4
   2      30.2    37.4
   3      33.7    48.0
   4      27.6    30.5
   5      33.3    27.8
   6      34.6    27.5
   7      33.1    36.9
   8      30.6    31.1
   9      30.5    27.1
  10      25.4    38.0

Find a 95% confidence interval on the difference in mean completion times. Is there any indication that one joystick is preferable?

10-73. The manager of a fleet of automobiles is testing two brands of radial tires. He assigns one tire of each brand at random to the two rear wheels of eight cars and runs the cars until the tires wear out. The data (in kilometers) are shown below:

Car    Brand 1    Brand 2
 1      36,925     34,318
 2      45,300     42,280
 3      36,240     35,500
 4      32,100     31,950
 5      37,210     38,015
 6      48,360     47,800
 7      38,200     37,810
 8      33,500     33,215

Find a 95% confidence interval on the difference in mean mileage. Which brand do you prefer?

10-74. Consider the data in Exercise 10-50. Find confidence intervals on μ and σ² such that we are at least 90% confident that both intervals simultaneously lead to correct conclusions.

10-75. Consider the data in Exercise 10-56. Suppose that a random sample of size n3 = 15 is obtained from a third normal population, with x̄3 = 20.5 and s3 = 1.2. Find two-sided confidence intervals on μ1 − μ2, μ1 − μ3, and μ2 − μ3 such that the probability is at least 0.95 that all three intervals simultaneously lead to correct conclusions.

10-76. A random variable X is normally distributed with mean μ and variance σ² = 10. The prior density for μ is uniform between 6 and 12. A random sample

of size 16 yields x̄ = 8. Construct a 90% Bayes interval for μ. Could you reasonably accept the hypothesis that μ = 9?
10-77. Let X be a normally distributed random variable with mean μ = 5 and unknown variance σ². The prior density for 1/σ² is a gamma distribution with parameters r = 3 and λ = 1.0. Determine the posterior


density for 1/σ². If a random sample of size 10 yields Σ(xi − 5)² = 4.92, determine the Bayes estimate of 1/σ², assuming a squared-error loss. Set up an integral that defines a 90% Bayes interval for 1/σ².

10-78. Prove that if a squared-error loss function is used, the Bayes estimator of θ is the mean of the posterior distribution for θ.

Chapter 11

Tests of Hypotheses
Many problems require that we decide whether to accept or reject a statement about some parameter. The statement is usually called a hypothesis, and the decision-making procedure about the hypothesis is called hypothesis testing. This is one of the most useful aspects of statistical inference, since many types of decision problems can be formulated as hypothesis-testing problems. This chapter will develop hypothesis-testing procedures for several important situations.

11-1 INTRODUCTION

11-1.1 Statistical Hypotheses

A statistical hypothesis is a statement about the probability distribution of a random variable. Statistical hypotheses often involve one or more parameters of this distribution. For example, suppose that we are interested in the mean compressive strength of a particular type of concrete. Specifically, we are interested in deciding whether or not the mean compressive strength (say μ) is 2500 psi. We may express this formally as

H0: μ = 2500 psi,
H1: μ ≠ 2500 psi.    (11-1)

The statement H0: μ = 2500 psi in equation 11-1 is called the null hypothesis, and the statement H1: μ ≠ 2500 psi is called the alternative hypothesis. Since the alternative hypothesis specifies values of μ that could be either greater than 2500 psi or less than 2500 psi, it is called a two-sided alternative hypothesis. In some situations, we may wish to formulate a one-sided alternative hypothesis, as in

H0: μ = 2500 psi,
H1: μ > 2500 psi.    (11-2)

It is important to remember that hypotheses are always statements about the population or distribution under study, not statements about the sample. The value of the population parameter specified in the null hypothesis (2500 psi in the above example) is usually determined in one of three ways. First, it may result from past experience or knowledge of the process, or even from prior experimentation. The objective of hypothesis testing then is usually to determine whether the experimental situation has changed. Second, this value may be determined from some theory or model regarding the process under study. Here the objective of hypothesis testing is to verify the theory or model. A third situation arises when the value of the population parameter results from external considerations, such as design or engineering specifications, or from contractual obligations. In this situation, the usual objective of hypothesis testing is conformance testing.


We are interested in making a decision about the truth or falsity of a hypothesis. A procedure leading to such a decision is called a test of a hypothesis. Hypothesis-testing procedures rely on using the information in a random sample from the population of interest. If this information is consistent with the hypothesis, then we would conclude that the hypothesis is true; however, if this information is inconsistent with the hypothesis, we would conclude that the hypothesis is false.

To test a hypothesis, we must take a random sample, compute an appropriate test statistic from the sample data, and then use the information contained in this test statistic to make a decision. For example, in testing the null hypothesis concerning the mean compressive strength of concrete in equation 11-1, suppose that a random sample of 10 concrete specimens is tested and the sample mean x̄ is used as a test statistic. If x̄ > 2550 psi or if x̄ < 2450 psi, we will consider the mean compressive strength of this particular type of concrete to be different from 2500 psi. That is, we would reject the null hypothesis H0: μ = 2500. Rejecting H0 implies that the alternative hypothesis, H1, is true. The set of all possible values of x̄ that are either greater than 2550 psi or less than 2450 psi is called the critical region or rejection region for the test. Alternatively, if 2450 psi ≤ x̄ ≤ 2550 psi, then we would accept the null hypothesis H0: μ = 2500. Thus, the interval [2450 psi, 2550 psi] is called the acceptance region for the test. Note that the boundaries of the critical region, 2450 psi and 2550 psi (often called the critical values of the test statistic), have been determined somewhat arbitrarily. In subsequent sections we will show how to construct an appropriate test statistic to determine the critical region for several hypothesis-testing situations.

11-1.2 Type I and Type II Errors


The decision to accept or reject the null h}-pomesis is based on a test statistic computed
from the data in a random sample. /nen a decision is made using the information.in a random sample, this decision is subject to error, Two kinds of errors may be made when test~
ing hypotheses. If the null hypothesis is rejected when it is true, then a type I ettor has been
made. If the null hypothesis is accepted when it is false, then a type It error has been made.
The situation is described in Table ll-!.
The probabilities of occurrence of type I and type II errors are given special symbols:
α = P(type I error) = P(reject H_0 | H_0 is true),   (11-3)

β = P(type II error) = P(accept H_0 | H_0 is false).   (11-4)

Sometimes it is more convenient to work with the power of the test, where

Power = 1 − β = P(reject H_0 | H_0 is false).   (11-5)

Note that the power of the test is the probability that a false null hypothesis is correctly rejected. Because the results of a test of a hypothesis are subject to error, we cannot "prove" or "disprove" a statistical hypothesis. However, it is possible to design test procedures that control the error probabilities α and β to suitably small values.
Table 11-1   Decisions in Hypothesis Testing

                  H_0 is True      H_0 is False
  Accept H_0      No error         Type II error
  Reject H_0      Type I error     No error



The probability of type I error α is often called the significance level or size of the test. In the concrete-testing example, a type I error would occur if the sample mean x̄ > 2550 psi or if x̄ < 2450 psi when in fact the true mean compressive strength μ = 2500 psi. Generally, the type I error probability is controlled by the location of the critical region. Thus, it is usually easy in practice for the analyst to set the type I error probability at (or near) any desired value. Since the probability of wrongly rejecting H_0 is directly controlled by the decision maker, rejection of H_0 is always a strong conclusion. Now suppose that the null hypothesis H_0: μ = 2500 psi is false. That is, the true mean compressive strength μ is some value other than 2500 psi. The probability of type II error is not a constant but depends on the true mean compressive strength of the concrete. If μ denotes the true mean compressive strength, then β(μ) denotes the type II error probability corresponding to μ. The function β(μ) is evaluated by finding the probability that the test statistic (in this case X̄) falls in the acceptance region given a particular value of μ. We define the operating characteristic curve (or OC curve) of a test as the plot of β(μ) against μ. An example of an operating characteristic curve for the concrete-testing problem is shown in Fig. 11-1. From this curve, we see that the type II error probability depends on the extent to which H_0: μ = 2500 psi is false. For example, note that β(2700) < β(2600). Thus we can think of the type II error probability as a measure of the ability of the test procedure to detect a particular deviation from the null hypothesis H_0. Small deviations are harder to detect than large ones. We also observe that since this is a two-sided alternative hypothesis, the operating characteristic curve is symmetric; that is, β(2400) = β(2600). Furthermore, when μ = 2500 the probability of type II error β = 1 − α.
The probability of type II error is also a function of sample size, as illustrated in Fig. 11-2. From this figure, we see that for a given value of the type I error probability α and a given value of mean compressive strength, the type II error probability decreases as the sample size n increases. That is, a specified deviation of the true mean from the value specified in the null hypothesis is easier to detect for larger sample sizes than for smaller ones. The effect of the type I error probability α on the type II error probability β for a given sample size n is illustrated in Fig. 11-3. Decreasing α causes β to increase, and increasing α causes β to decrease.

Because the type II error probability β is a function of both the sample size and the extent to which the null hypothesis H_0 is false, it is customary to think of the decision to accept H_0 as a weak conclusion, unless we know that β is acceptably small. Therefore, rather than saying we "accept H_0," we prefer the terminology "fail to reject H_0." Failing to reject H_0 implies that we have not found sufficient evidence to reject H_0, that is, to make a strong statement.

[Figure 11-1: Operating characteristic curve for the concrete-testing example.]

[Figure 11-2: The effect of sample size on the operating characteristic curve.]

[Figure 11-3: The effect of type I error on the operating characteristic curve.]

Thus, failing to reject H_0 does not necessarily mean that there is a high probability that H_0 is true. It may imply that more data are required to reach a strong conclusion. This can have important implications for the formulation of hypotheses.

11-1.3 One-Sided and Two-Sided Hypotheses


Because rejecting H_0 is always a strong conclusion while failing to reject H_0 can be a weak conclusion unless β is known to be small, we usually prefer to construct hypotheses such that the statement about which a strong conclusion is desired is in the alternative hypothesis, H_1. Problems for which a two-sided alternative hypothesis is appropriate do not really present the analyst with a choice of formulation. That is, if we wish to test the hypothesis that the mean of a distribution μ equals some arbitrary value, say μ_0, and if it is important to detect values of the true mean μ that could be either greater than μ_0 or less than μ_0, then one must use the two-sided alternative in

H_0: μ = μ_0,
H_1: μ ≠ μ_0.

Many hypothesis-testing problems naturally involve a one-sided alternative hypothesis. For example, suppose that we want to reject H_0 only when the true value of the mean exceeds μ_0. The hypotheses would be

H_0: μ = μ_0,
H_1: μ > μ_0.   (11-6)


This would imply that the critical region is located in the upper tail of the distribution of the test statistic. That is, if the decision is to be based on the value of the sample mean X̄, then we would reject H_0 in equation 11-6 if x̄ is too large. The operating characteristic curve for the test for this hypothesis is shown in Fig. 11-4, along with the operating characteristic curve for a two-sided test. We observe that when the true mean μ exceeds μ_0 (i.e., when the alternative hypothesis H_1: μ > μ_0 is true), the one-sided test is superior to the two-sided test in the sense that it has a steeper operating characteristic curve. When the true mean μ = μ_0, both the one-sided and two-sided tests are equivalent. However, when the true mean μ is less than μ_0, the two operating characteristic curves differ. If μ < μ_0, the two-sided test has a higher probability of detecting this departure from μ_0 than the one-sided test. This is intuitively appealing, as the one-sided test is designed assuming either that μ cannot be less than μ_0 or, if μ is less than μ_0, that it is desirable to accept the null hypothesis.
In effect, there are two different models that can be used for the one-sided alternative hypothesis. For the case where the alternative hypothesis is H_1: μ > μ_0, these two models are

H_0: μ = μ_0,
H_1: μ > μ_0,   (11-7)

and

H_0: μ ≤ μ_0,
H_1: μ > μ_0.   (11-8)

In equation 11-7, we are assuming that μ cannot be less than μ_0, and the operating characteristic curve is undefined for values of μ < μ_0. In equation 11-8, we are assuming that μ can be less than μ_0 and that in such a situation it would be desirable to accept H_0. Thus for equation 11-8 the operating characteristic curve is defined for all values of μ ≤ μ_0. Specifically, if μ ≤ μ_0, we have β(μ) = 1 − α(μ), where α(μ) is the significance level as a function of μ. For situations in which the model of equation 11-8 is appropriate, we define the significance level of the test as the maximum value of the type I error probability α; that is, the value of α at μ = μ_0. In situations where one-sided alternative hypotheses are appropriate, we will usually write the null hypothesis with an equality; for example, H_0: μ = μ_0. This will be interpreted as including the cases H_0: μ ≤ μ_0 or H_0: μ ≥ μ_0, as appropriate.
In problems where one-sided test procedures are indicated, analysts occasionally experience difficulty in choosing an appropriate formulation of the alternative hypothesis. For example, suppose that a soft drink beverage bottler purchases 10-ounce nonreturnable bottles from a glass company. The bottler wants to be sure that the bottles exceed the specification on mean internal pressure or bursting strength, which for 10-ounce bottles is 200 psi.

[Figure 11-4: Operating characteristic curves for two-sided and one-sided tests.]


The bottler has decided to formulate the decision procedure for a specific lot of bottles as a hypothesis problem. There are two possible formulations for this problem, either

H_0: μ ≤ 200 psi,
H_1: μ > 200 psi,   (11-9)

or

H_0: μ ≥ 200 psi,
H_1: μ < 200 psi.   (11-10)

Consider the formulation in equation 11-9. If the null hypothesis is rejected, the bottles will be judged satisfactory, while if H_0 is not rejected, the implication is that the bottles do not conform to specifications and should not be used. Because rejecting H_0 is a strong conclusion, this formulation forces the bottle manufacturer to "demonstrate" that the mean bursting strength of the bottles exceeds the specification. Now consider the formulation in equation 11-10. In this situation, the bottles will be judged satisfactory unless H_0 is rejected. That is, we would conclude that the bottles are satisfactory unless there is strong evidence to the contrary.

Which formulation is correct, equation 11-9 or equation 11-10? The answer is "it depends." For equation 11-9, there is some probability that H_0 will be accepted (i.e., we would decide that the bottles are not satisfactory) even though the true mean is slightly greater than 200 psi. This formulation implies that we want the bottle manufacturer to demonstrate that the product meets or exceeds our specifications. Such a formulation could be appropriate if the manufacturer has experienced difficulty in meeting specifications in the past, or if product safety considerations force us to hold tightly to the 200 psi specification. On the other hand, for the formulation of equation 11-10 there is some probability that H_0 will be accepted and the bottles judged satisfactory even though the true mean is slightly less than 200 psi. We would conclude that the bottles are unsatisfactory only when there is strong evidence that the mean does not exceed 200 psi; that is, when H_0: μ ≥ 200 psi is rejected. This formulation assumes that we are relatively happy with the bottle manufacturer's past performance and that small deviations from the specification of μ ≥ 200 psi are not harmful.

In formulating one-sided alternative hypotheses, we should remember that rejecting H_0 is always a strong conclusion, and consequently, we should put the statement about which it is important to make a strong conclusion in the alternative hypothesis. Often this will depend on our point of view and experience with the situation.

11-2 TESTS OF HYPOTHESES ON A SINGLE SAMPLE

11-2.1 Tests of Hypotheses on the Mean of a Normal Distribution, Variance Known

Statistical Analysis

Suppose that the random variable X represents some process or population of interest. We assume that the distribution of X is either normal or that, if it is nonnormal, the conditions of the Central Limit Theorem hold. In addition, we assume that the mean μ of X is unknown but that the variance σ² is known. We are interested in testing the hypothesis

H_0: μ = μ_0,
H_1: μ ≠ μ_0,   (11-11)

where μ_0 is a specified constant.



A random sample of size n. Xl' X2

. , Xll~

is available. Each observation in this sam~

pre bas unknown mean J1 and known variance d'. The test procedure for He: J1 = !10 uses the
test statistic
(11-12)

If the null hypothesis H_0: μ = μ_0 is true, then E(X̄) = μ_0, and it follows that the distribution of Z_0 is N(0, 1). Consequently, if H_0: μ = μ_0 is true, the probability is 1 − α that a value of the test statistic Z_0 falls between −Z_{α/2} and Z_{α/2}, where Z_{α/2} is the percentage point of the standard normal distribution such that P{Z ≥ Z_{α/2}} = α/2 (i.e., Z_{α/2} is the 100(1 − α/2) percentage point of the standard normal distribution). The situation is illustrated in Fig. 11-5. Note that the probability is α that a value of the test statistic Z_0 would fall in the region Z_0 > Z_{α/2} or Z_0 < −Z_{α/2} when H_0: μ = μ_0 is true. Clearly, a sample producing a value of the test statistic that falls in the tails of the distribution of Z_0 would be unusual if H_0: μ = μ_0 is true; it is also an indication that H_0 is false. Thus, we should reject H_0 if either

Z_0 > Z_{α/2}   (11-13a)

or

Z_0 < −Z_{α/2},   (11-13b)

and fail to reject H_0 if

−Z_{α/2} ≤ Z_0 ≤ Z_{α/2}.   (11-14)

Equation 11-14 defines the acceptance region for H_0 and equation 11-13 defines the critical region or rejection region. The type I error probability for this test procedure is α.

Example 11-1
The burning rate of a rocket propellant is being studied. Specifications require that the mean burning rate must be 40 cm/s. Furthermore, suppose that we know that the standard deviation of the burning rate is approximately 2 cm/s. The experimenter decides to specify a type I error probability α = 0.05, and he will base the test on a random sample of size n = 25. The hypotheses we wish to test are

H_0: μ = 40 cm/s,
H_1: μ ≠ 40 cm/s.

Twenty-five specimens are tested, and the sample mean burning rate obtained is x̄ = 41.25 cm/s. The value of the test statistic in equation 11-12 is

Z_0 = (x̄ − μ_0)/(σ/√n) = (41.25 − 40)/(2/√25) = 3.125.

[Figure 11-5: The distribution of Z_0 when H_0: μ = μ_0 is true, with critical regions of area α/2 in each tail.]


Since α = 0.05, the boundaries of the critical region are Z_{0.025} = 1.96 and −Z_{0.025} = −1.96, and we note that Z_0 falls in the critical region. Therefore, H_0 is rejected, and we conclude that the mean burning rate is not equal to 40 cm/s.

Now suppose that we wish to test the one-sided alternative, say

H_0: μ = μ_0,
H_1: μ > μ_0.   (11-15)

(Note that we could also write H_0: μ ≤ μ_0.) In defining the critical region for this test, we observe that a negative value of the test statistic Z_0 would never lead us to conclude that H_0: μ = μ_0 is false. Therefore, we would place the critical region in the upper tail of the N(0, 1) distribution and reject H_0 on values of Z_0 that are too large. That is, we would reject H_0 if

Z_0 > Z_α.   (11-16)

Similarly, to test

H_0: μ = μ_0,
H_1: μ < μ_0,   (11-17)

we would calculate the test statistic Z_0 and reject H_0 on values of Z_0 that are too small. That is, the critical region is in the lower tail of the N(0, 1) distribution, and we reject H_0 if

Z_0 < −Z_α.   (11-18)

Choice of Sample Size

In testing the hypotheses of equations 11-11, 11-15, and 11-17, the type I error probability α is directly selected by the analyst. However, the probability of type II error β depends on the choice of sample size. In this section, we will show how to select the sample size in order to arrive at a specified value of β.

Consider the two-sided hypothesis

H_0: μ = μ_0,
H_1: μ ≠ μ_0.

Suppose that the null hypothesis is false and that the true value of the mean is μ = μ_0 + δ, say, where δ > 0. Now since H_1 is true, the distribution of the test statistic Z_0 is

Z_0 ~ N(δ√n/σ, 1).   (11-19)

The distribution of the test statistic Z_0 under both the null hypothesis H_0 and the alternative hypothesis H_1 is shown in Fig. 11-6. From examining this figure, we note that if H_1 is true, a type II error will be made only if −Z_{α/2} ≤ Z_0 ≤ Z_{α/2}, where Z_0 ~ N(δ√n/σ, 1). That is, the probability of the type II error β is the probability that Z_0 falls between −Z_{α/2} and Z_{α/2} given that H_1 is true. This probability is shown as the shaded portion of Fig. 11-6. Expressed mathematically, this probability is
β = Φ(Z_{α/2} − δ√n/σ) − Φ(−Z_{α/2} − δ√n/σ),   (11-20)

where Φ(z) denotes the probability to the left of z on the standard normal distribution. Note that equation 11-20 was obtained by evaluating the probability that Z_0 falls in the interval



Under H 1 : ji.*'1kz;

-z~'2

Figure n-6 The distribution of z;, under Ho and H,.


[- Za/2' Zari] 0:1 the distribution of Zo when Hi. is true, These two points were standardized
to produce equation 11-20. Furthermore, note that equation 11-20 also holds if Ii< 0, due
to the symmetry of the normal distribution.
While equation 11-20 could be used to evaluate the type II error, it is more convenient to use the operating characteristic curves in Charts VIa and VIb of the Appendix. These curves plot β as calculated from equation 11-20 against a parameter d for various sample sizes n. Curves are provided for both α = 0.05 and α = 0.01. The parameter d is defined as

d = |μ − μ_0|/σ = |δ|/σ.   (11-21)

We have chosen d so that one set of operating characteristic curves can be used for all problems regardless of the values of μ_0 and σ. From examining the operating characteristic curves or equation 11-20 and Fig. 11-6 we note the following:

1. The further the true value of the mean μ is from μ_0, the smaller the probability of type II error β for a given n and α. That is, we see that for a specified sample size and α, large differences in the mean are easier to detect than small ones.

2. For a given δ and α, the probability of type II error β decreases as n increases. That is, to detect a specified difference δ in the mean, we may make the test more powerful by increasing the sample size.

Example 11-2
Consider the rocket propellant problem in Example 11-1. Suppose that the analyst is concerned about the probability of a type II error if the true mean burning rate is μ = 41 cm/s. We may use the operating characteristic curves to find β. Note that δ = 41 − 40 = 1, n = 25, σ = 2, and α = 0.05. Then d = |δ|/σ = 1/2 = 0.5, and from Chart VIa (Appendix), with n = 25, we find that β = 0.30. That is, if the true mean burning rate is μ = 41 cm/s, then there is approximately a 30% chance that this will not be detected by the test with n = 25.
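Equation 11-20 can also be evaluated directly rather than read from the chart. A minimal Python sketch for the values of this example:

```python
from statistics import NormalDist

# Type II error probability from equation 11-20 (Example 11-2)
delta, n, sigma, alpha = 1.0, 25, 2.0, 0.05

nd = NormalDist()
z_half = nd.inv_cdf(1 - alpha / 2)   # Z_{alpha/2} = 1.96
shift = delta * n**0.5 / sigma       # delta * sqrt(n) / sigma = 2.5
beta = nd.cdf(z_half - shift) - nd.cdf(-z_half - shift)

print(f"beta = {beta:.3f}")   # about 0.295, matching the chart reading of 0.30
```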

Example 11-3
Once again, consider the rocket propellant problem in Example 11-1. Suppose that the analyst would like to design the test so that if the true mean burning rate differs from 40 cm/s by as much as 1 cm/s,


the test will detect this (i.e., reject H_0: μ = 40) with a high probability, say 0.90. The operating characteristic curves can be used to find the sample size that will give such a test. Since d = |μ − μ_0|/σ = 1/2, α = 0.05, and β = 0.10, we find from Chart VIa (Appendix) that the required sample size is n = 40, approximately.

In general, the operating characteristic curves involve three parameters: β, δ, and n. Given any two of these parameters, the value of the third can be determined. There are two typical applications of these curves.

1. For a given n and δ, find β. This was illustrated in Example 11-2. This kind of problem is often encountered when the analyst is concerned about the sensitivity of an experiment already performed, or when sample size is restricted by economic or other factors.

2. For a given β and δ, find n. This was illustrated in Example 11-3. This kind of problem is usually encountered when the analyst has the opportunity to select the sample size at the outset of the experiment.
Operating characteristic curves are given in Charts VIc and VId (Appendix) for the one-sided alternatives. If the alternative hypothesis is H_1: μ > μ_0, then the abscissa scale on these charts is

d = (μ − μ_0)/σ.   (11-22)

When the alternative hypothesis is H_1: μ < μ_0, the corresponding abscissa scale is

d = (μ_0 − μ)/σ.   (11-23)

It is also possible to derive formulas to determine the appropriate sample size to use to obtain a particular value of β for a given δ and α. These formulas are alternatives to using the operating characteristic curves. For the two-sided alternative hypothesis, we know from equation 11-20 that, if δ > 0,

β ≈ Φ(Z_{α/2} − δ√n/σ),   (11-24)

since Φ(−Z_{α/2} − δ√n/σ) ≈ 0 when δ is positive. From equation 11-24, we take normal inverses to obtain

−Z_β ≈ Z_{α/2} − δ√n/σ,

or

n ≈ (Z_{α/2} + Z_β)²σ²/δ².   (11-25)


This approximation is good when Φ(−Z_{α/2} − δ√n/σ) is small compared to β. For either of the one-sided alternative hypotheses in equation 11-15 or equation 11-17, the sample size required to produce a specified type II error probability β given δ and α is

n ≈ (Z_α + Z_β)²σ²/δ².   (11-26)

Example 11-4
Returning to the rocket propellant problem of Example 11-3, we note that σ = 2, δ = 41 − 40 = 1, α = 0.05, and β = 0.10. Since Z_{α/2} = Z_{0.025} = 1.96 and Z_β = Z_{0.10} = 1.28, the sample size required to detect this departure from H_0: μ = 40 is found from equation 11-25 to be

n ≈ (Z_{α/2} + Z_β)²σ²/δ² = (1.96 + 1.28)²2²/1² ≈ 42,

which is in close agreement with the value determined from the operating characteristic curve. Note that the approximation is good, since Φ(−Z_{α/2} − δ√n/σ) = Φ(−1.96 − (1)√42/2) = Φ(−5.20) ≈ 0, which is small relative to β.
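A short Python sketch of equation 11-25 for this calculation:

```python
from statistics import NormalDist

# Sample size for the two-sided test, equation 11-25 (Example 11-4)
sigma, delta, alpha, beta = 2.0, 1.0, 0.05, 0.10

nd = NormalDist()
z_half = nd.inv_cdf(1 - alpha / 2)   # 1.96
z_beta = nd.inv_cdf(1 - beta)        # 1.28
n = (z_half + z_beta)**2 * sigma**2 / delta**2

print(f"n = {n:.1f}")   # about 42; round up to a whole specimen in practice
```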

The Relationship Between Tests of Hypotheses and Confidence Intervals

There is a close relationship between the test of a hypothesis about a parameter θ and the confidence interval for θ. If [L, U] is a 100(1 − α)% confidence interval for the parameter θ, then the test of size α of the hypothesis

H_0: θ = θ_0,
H_1: θ ≠ θ_0

will lead to rejection of H_0 if and only if θ_0 is not in the interval [L, U]. As an illustration, consider the rocket propellant problem in Example 11-1. The null hypothesis H_0: μ = 40 was rejected using α = 0.05. The 95% two-sided confidence interval on μ for these data may be computed from equation 10-25 as 40.47 ≤ μ ≤ 42.03. That is, the interval [L, U] is [40.47, 42.03], and since μ_0 = 40 is not included in this interval, the null hypothesis H_0: μ = 40 is rejected.
Large-Sample Test with Unknown Variance

Although we have developed the test procedure for the null hypothesis H_0: μ = μ_0 assuming that σ² is known, in many practical situations σ² will be unknown. In general, if n ≥ 30, then the sample variance S² can be substituted for σ² in the test procedures with little harmful effect. Thus, while we have given a test for known σ², it can be converted easily into a large-sample test procedure for unknown σ². Exact treatment of the case where σ² is unknown and n is small involves the use of the t distribution and is deferred until Section 11-2.2.

P-Values

Computer software packages are frequently used for statistical hypothesis testing. Most of these programs calculate and report the probability that the test statistic will take on a value


at least as extreme as the observed value of the statistic when H_0 is true. This probability is usually called a P-value. It represents the smallest level of significance that would lead to rejection of H_0. Thus, if P = 0.04 is reported in the computer output, the null hypothesis H_0 would be rejected at the level α = 0.05 but not at the level α = 0.01. Generally, if P is less than or equal to α, we would reject H_0, whereas if P exceeds α we would fail to reject H_0.

It is customary to call the test statistic (and the data) significant when the null hypothesis H_0 is rejected, so we may think of the P-value as the smallest level α at which the data are significant. Once the P-value is known, the decision maker can determine for himself or herself how significant the data are without the data analyst formally imposing a preselected level of significance.
It is not always easy to compute the exact P-value of a test. However, for the foregoing normal distribution tests it is relatively easy. If Z_0 is the computed value of the test statistic, then the P-value is

P = 2[1 − Φ(|Z_0|)]   for a two-tailed test,
P = 1 − Φ(Z_0)        for an upper-tail test,
P = Φ(Z_0)            for a lower-tail test.

To illustrate, consider the rocket propellant problem in Example 11-1. The computed value of the test statistic is Z_0 = 3.125, and since the alternative hypothesis is two-tailed, the P-value is

P = 2[1 − Φ(3.125)] = 0.0018.

Thus, H_0: μ = 40 would be rejected at any level of significance where α ≥ P = 0.0018. For example, H_0 would be rejected if α = 0.01, but it would not be rejected if α = 0.001.
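A one-line computational check of this P-value:

```python
from statistics import NormalDist

# Two-tailed P-value for the observed statistic of Example 11-1
z0 = 3.125
p_value = 2 * (1 - NormalDist().cdf(abs(z0)))
print(f"P = {p_value:.4f}")   # P = 0.0018
```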
Practical Versus Statistical Significance

In Chapters 10 and 11, we present confidence intervals and tests of hypotheses for both single-sample and two-sample problems. In hypothesis testing, we have discussed the statistical significance when the null hypothesis is rejected. What has not been discussed is the practical significance of rejecting the null hypothesis. In hypothesis testing, the goal is to make a decision about a claim or belief. The decision as to whether or not the null hypothesis is rejected in favor of the alternative is based on a sample taken from the population of interest. If the null hypothesis is rejected, we say there is statistically significant evidence against the null hypothesis in favor of the alternative. Results that are statistically significant (by rejection of the null hypothesis) do not necessarily imply practically significant results.
To illustrate, suppose that the average temperature on a single day throughout a particular state is hypothesized to be 63 degrees. Suppose that n = 50 locations within the state had an average temperature of x̄ = 62 degrees and a standard deviation of 0.5 degrees. If we were to test the hypothesis H_0: μ = 63 against H_1: μ ≠ 63, we would get a resulting P-value of approximately 0 and we would reject the null hypothesis. Our conclusion would be that the true average temperature is not 63 degrees. In other words, we have illustrated a statistically significant difference between the hypothesized value and the sample average obtained from the data. But is this a practical difference? That is, is 63 degrees different from 62 degrees? Very few investigators would actually conclude that this difference is practical. In other words, statistical significance does not imply practical significance.
The size of the sample under investigation has a direct influence on the power of the test and the practical significance. As the sample size increases, even the smallest differences between the hypothesized value and the sample value may be detected by the hypothesis test. Therefore, care must be taken when interpreting the results of a hypothesis test when the sample sizes are large.

11-2.2 Tests of Hypotheses on the Mean of a Normal Distribution, Variance Unknown
When testing hypotheses about the mean μ of a population when σ² is unknown, we can use the test procedures discussed in Section 11-2.1 provided that the sample size is large (n ≥ 30, say). These procedures are approximately valid regardless of whether or not the underlying population is normal. However, when the sample size is small and σ² is unknown, we must make an assumption about the form of the underlying distribution in order to obtain a test procedure. A reasonable assumption in many cases is that the underlying distribution is normal.

Many populations encountered in practice are quite well approximated by the normal distribution, so this assumption will lead to a test procedure of wide applicability. In fact, moderate departure from normality will have little effect on the test's validity. When the assumption is unreasonable, we can either specify another distribution (exponential, Weibull, etc.) and use some general method of test construction to obtain a valid procedure, or we could use one of the nonparametric tests that are valid for any underlying distribution (see Chapter 16).
Statistical Analysis

Suppose that X is a normally distributed random variable with unknown mean μ and variance σ². We wish to test the hypothesis that μ equals a constant μ_0. Note that this situation is similar to that treated in Section 11-2.1, except that now both μ and σ² are unknown. Assume that a random sample of size n, say X_1, X_2, ..., X_n, is available, and let X̄ and S² be the sample mean and variance, respectively.

Suppose that we wish to test the two-sided alternative

H_0: μ = μ_0,
H_1: μ ≠ μ_0.   (11-27)

The test procedure is based on the statistic

t_0 = (X̄ − μ_0)/(S/√n),   (11-28)

which follows the t distribution with n − 1 degrees of freedom if the null hypothesis H_0: μ = μ_0 is true. To test H_0: μ = μ_0 in equation 11-27, the test statistic t_0 in equation 11-28 is calculated, and H_0 is rejected if either

t_0 > t_{α/2, n−1}   (11-29a)

or

t_0 < −t_{α/2, n−1},   (11-29b)

where t_{α/2, n−1} and −t_{α/2, n−1} are the upper and lower α/2 percentage points of the t distribution with n − 1 degrees of freedom.
For the one-sided alternative hypothesis

H_0: μ = μ_0,
H_1: μ > μ_0,   (11-30)


we calculate the test statistic t_0 from equation 11-28 and reject H_0 if

t_0 > t_{α, n−1}.   (11-31)

For the other one-sided alternative,

H_0: μ = μ_0,
H_1: μ < μ_0,   (11-32)

we would reject H_0 if

t_0 < −t_{α, n−1}.   (11-33)

Example 11-5
The breaking strength of a textile fiber is a normally distributed random variable. Specifications require that the mean breaking strength should equal 150 psi. The manufacturer would like to detect any significant departure from this value. Thus, he wishes to test

H_0: μ = 150 psi,
H_1: μ ≠ 150 psi.

A random sample of 15 fiber specimens is selected and their breaking strengths determined. The sample mean and variance are computed from the sample data as x̄ = 152.18 and s² = 16.63. Therefore, the test statistic is

t_0 = (x̄ − μ_0)/(s/√n) = (152.18 − 150)/√(16.63/15) = 2.07.

The type I error is specified as α = 0.05. Therefore t_{0.025,14} = 2.145 and −t_{0.025,14} = −2.145, and we would conclude that there is not sufficient evidence to reject the hypothesis that μ = 150 psi.
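A Python sketch of this t-test, assuming the SciPy library is available for the t-distribution quantile; only the summary statistics quoted in the example are used:

```python
from math import sqrt
from scipy import stats  # assumed available for t quantiles

# One-sample t-test for H0: mu = 150 vs. H1: mu != 150 (Example 11-5)
xbar, mu0, s2, n, alpha = 152.18, 150.0, 16.63, 15, 0.05

t0 = (xbar - mu0) / sqrt(s2 / n)                 # equation 11-28
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)    # t_{0.025,14} = 2.145

print(f"t0 = {t0:.2f}, critical value = {t_crit:.3f}")
print("reject H0" if abs(t0) > t_crit else "fail to reject H0")
# |2.07| < 2.145, so H0 is not rejected
```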

Choice of Sample Size

The type II error probability for tests on the mean of a normal distribution with unknown variance depends on the distribution of the test statistic in equation 11-28 when the null hypothesis H_0: μ = μ_0 is false. When the true value of the mean is μ = μ_0 + δ, note that the test statistic can be written as

t_0 = (Z + δ√n/σ)/W.   (11-34)

The distributions of Z and W in equation 11-34 are N(0, 1) and √(χ²_{n−1}/(n − 1)), respectively, and Z and W are independent random variables. However, δ√n/σ is a nonzero constant, so that the numerator of equation 11-34 is a N(δ√n/σ, 1) random variable. The resulting distribution is called the noncentral t distribution with n − 1 degrees of freedom and


noncentrality parameter δ√n/σ. Note that if δ = 0, then the noncentral t distribution reduces to the usual or central t distribution. In any case, the type II error of the two-sided alternative (for example) would be

β = P(−t_{α/2, n−1} ≤ t_0 ≤ t_{α/2, n−1} | δ ≠ 0)
  = P(−t_{α/2, n−1} ≤ t′_0 ≤ t_{α/2, n−1}),

where t′_0 denotes the noncentral t random variable. Finding the type II error for the t-test involves finding the probability contained between two points on the noncentral t distribution.
The operating characteristic curves in Charts VIe, VIf, VIg, and VIh (Appendix) plot β against a parameter d for various sample sizes n. Curves are provided for both the two-sided and one-sided alternatives and for α = 0.05 or α = 0.01. For the two-sided alternative in equation 11-27, the abscissa scale factor d on Charts VIe and VIf is defined as

d = |δ|/σ.   (11-35)

For the one-sided alternatives, if rejection is desired when μ > μ_0, as in equation 11-30, we use Charts VIg and VIh with

d = (μ − μ_0)/σ = δ/σ,   (11-36)

while if rejection is desired when μ < μ_0, as in equation 11-32,

d = (μ_0 − μ)/σ = δ/σ.   (11-37)

We note that d depends on the unknown parameter σ². There are several ways to avoid this difficulty. In some cases, we may use the results of a previous experiment or prior information to make a rough initial estimate of σ². If we are interested in examining the operating characteristic curve after the data have been collected, we could use the sample variance s² to estimate σ². If analysts do not have any previous experience on which to draw in estimating σ², they can define the difference in the mean δ that they wish to detect relative to σ. For example, if one wishes to detect a small difference in the mean, one might use a value of d = |δ|/σ ≤ 1 (say), whereas if one is interested in detecting only moderately large differences in the mean, one might select d = |δ|/σ = 2 (say). That is, it is the value of the ratio |δ|/σ that is important in determining sample size, and if it is possible to specify the relative size of the difference in means that we are interested in detecting, then a proper value of d can usually be selected.

Example 11-6
Consider the fiber-testing problem in Example 11-5. If the breaking strength of this fiber differs from 150 psi by as much as 2.5 psi, the analyst would like to reject the null hypothesis H_0: μ = 150 psi with a probability of at least 0.90. Is the sample size n = 15 adequate to ensure that the test is this sensitive? If we use the sample standard deviation s = √16.63 = 4.08 to estimate σ, then d = |δ|/σ = 2.5/4.08 = 0.61. By referring to the operating characteristic curves in Chart VIe, with d = 0.61 and n = 15, we find β = 0.45. Thus, the probability of rejecting H_0: μ = 150 psi if the true mean differs from this value by 2.5 psi is 1 − β = 1 − 0.45 = 0.55, approximately, and we would conclude that a sample size of


n = 15 is not adequate. To find the sample size required to give the desired degree of protection, enter the operating characteristic curves in Chart VIe with d = 0.61 and β = 0.10, and read the corresponding sample size as n = 35, approximately.
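The chart value can be cross-checked against the noncentral t distribution itself. The sketch below assumes SciPy's noncentral t implementation (scipy.stats.nct) is available:

```python
from math import sqrt
from scipy import stats  # assumed available; nct is the noncentral t

# Type II error of the two-sided t-test via the noncentral t (Example 11-6)
delta, s, n, alpha = 2.5, 4.08, 15, 0.05

nc = (delta / s) * sqrt(n)                       # noncentrality parameter
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)    # t_{0.025,14}
beta = stats.nct.cdf(t_crit, n - 1, nc) - stats.nct.cdf(-t_crit, n - 1, nc)

print(f"beta = {beta:.2f}")
# roughly 0.4; the OC chart, read visually, gives about 0.45
```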

11-2.3 Tests of Hypotheses on the Variance of a Normal Distribution

There are occasions when tests assessing the variance or standard deviation of a population are needed. In this section we present two procedures, one based on the assumption of normality and the other a large-sample test.

Test Procedures for a Normal Population

Suppose that we wish to test the hypothesis that the variance σ² of a normal distribution equals a specified value, say σ_0². Let X ~ N(μ, σ²), where μ and σ² are unknown, and let X_1, X_2, ..., X_n be a random sample of n observations from this population. To test

H_0: σ² = σ_0²,
H_1: σ² ≠ σ_0²,   (11-38)

we use the test statistic

χ²_0 = (n − 1)S²/σ_0²,   (11-39)

where S² is the sample variance. Now if H_0: σ² = σ_0² is true, then the test statistic follows the chi-square distribution with n − 1 degrees of freedom. Therefore, H_0: σ² = σ_0² would be rejected if

χ²_0 > χ²_{α/2, n−1}   (11-40a)

or if

χ²_0 < χ²_{1−α/2, n−1},   (11-40b)

where χ²_{α/2, n−1} and χ²_{1−α/2, n−1} are the upper and lower α/2 percentage points of the chi-square distribution with n − 1 degrees of freedom.
The same test statistic is used for the one-sided alternatives. For the one-sided hypothesis

H_0: σ² = σ_0²,
H_1: σ² > σ_0²,   (11-41)

we would reject H_0 if

χ²_0 > χ²_{α, n−1}.   (11-42)

For the other one-sided hypothesis,

H_0: σ² = σ_0²,
H_1: σ² < σ_0²,   (11-43)

we would reject H_0 if

χ²_0 < χ²_{1−α, n−1}.   (11-44)


Example 11-7
Consider the machine described in Example 10-16, which is used to fill cans with a soft drink beverage. If the variance of the fill volume exceeds 0.02 (fluid ounces)², then an unacceptably large percentage of the cans will be underfilled. The bottler is interested in testing the hypothesis

H_0: σ² = 0.02,
H_1: σ² > 0.02.

A random sample of n = 20 cans yields a sample variance of s² = 0.0225. Thus, the test statistic is

χ²_0 = (19)(0.0225)/0.02 = 21.38.

If we choose α = 0.05, we find that χ²_{0.05,19} = 30.14, and we would conclude that there is no strong evidence that the variance of fill volume exceeds 0.02 (fluid ounces)².
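A computational sketch of this chi-square test, again assuming SciPy for the quantile:

```python
from scipy import stats  # assumed available for chi-square quantiles

# Upper-tail test on a normal variance (Example 11-7)
s2, sigma0_sq, n, alpha = 0.0225, 0.02, 20, 0.05

chi2_0 = (n - 1) * s2 / sigma0_sq                 # equation 11-39
chi2_crit = stats.chi2.ppf(1 - alpha, df=n - 1)   # chi-square_{0.05,19} = 30.14

print(f"chi2_0 = {chi2_0:.2f}, critical value = {chi2_crit:.2f}")
print("reject H0" if chi2_0 > chi2_crit else "fail to reject H0")
# 21.38 < 30.14, so H0 is not rejected
```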

Choice of Sample Size

Operating characteristic curves for the χ² tests are provided in Charts VIi through VIn (Appendix) for α = 0.05 and α = 0.01. For the two-sided alternative hypothesis of equation 11-38, Charts VIi and VIj plot β against an abscissa parameter

λ = σ/σ_0   (11-45)

for various sample sizes n, where σ denotes the true value of the standard deviation. Charts VIk and VIl are for the one-sided alternative H_1: σ² > σ_0², while Charts VIm and VIn are for the other one-sided alternative H_1: σ² < σ_0². In using these charts, we think of σ as the value of the standard deviation that we want to detect.

Example 11-8
In Example 11-7, find the probability of rejecting H_0: σ² = 0.02 if the true variance is as large as σ² = 0.03. Since σ = √0.03 = 0.1732 and σ_0 = √0.02 = 0.1414, the abscissa parameter is

λ = σ/σ_0 = 0.1732/0.1414 = 1.23.

From Chart VIk, with λ = 1.23 and n = 20, we find that β ≈ 0.60. That is, there is only about a 40% chance that H_0: σ² = 0.02 will be rejected if the variance is really as large as σ² = 0.03. To reduce β, a larger sample size must be used. From the operating characteristic curve, we note that to reduce β to 0.20 a sample size of 75 is necessary.

A Large-Sample Test Procedure

The chi-square test procedure prescribed above is rather sensitive to the normality assumption. Consequently, it would be desirable to develop a procedure that does not require this assumption. When the underlying population is not necessarily normal but n is large (say n ≥ 35 or 40), then we can use the following result: if X_1, X_2, ..., X_n is a random sample from a population with variance σ², the sample standard deviation S is approximately normal with mean E(S) ≈ σ and variance V(S) ≈ σ²/(2n) if n is large.


Then the distribution of

Z = (S − σ)/(σ/√(2n))   (11-46)

is approximately standard normal.

To test

H_0: σ² = σ_0²,
H_1: σ² ≠ σ_0²,   (11-47)

substitute σ_0 for σ in equation 11-46. Thus, the test statistic is

Z_0 = (S − σ_0)/(σ_0/√(2n)),   (11-48)

and we would reject H_0 if Z_0 > Z_{α/2} or if Z_0 < −Z_{α/2}. The same test statistic would be used for the one-sided alternatives. If we are testing

H_0: σ² = σ_0²,
H_1: σ² > σ_0²,   (11-49)

we would reject H_0 if Z_0 > Z_α, while if we are testing

H_0: σ² = σ_0²,
H_1: σ² < σ_0²,   (11-50)

we would reject H_0 if Z_0 < −Z_α.

Example 11-9
An injection-molded plastic part is used in a graphics printer. Before agreeing to a long-term contract, the printer manufacturer wants to be sure, using α = 0.01, that the supplier can produce parts with a standard deviation of length of at most 0.025 mm. The hypotheses to be tested are

H_0: σ² = 6.25 × 10⁻⁴,
H_1: σ² < 6.25 × 10⁻⁴,

since (0.025)² = 0.000625. A random sample of n = 50 parts is obtained, and the sample standard deviation is s = 0.021 mm. The test statistic is

Z_0 = (s − σ_0)/(σ_0/√(2n)) = (0.021 − 0.025)/(0.025/√100) = −1.60.

Since −Z_{0.01} = −2.33 and the observed value of Z_0 is not smaller than this critical value, H_0 is not rejected. That is, the evidence from the supplier's process is not strong enough to justify a long-term contract.
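A standard-library Python sketch of this large-sample test:

```python
from math import sqrt
from statistics import NormalDist

# Large-sample lower-tail test on a standard deviation (Example 11-9)
s, sigma0, n, alpha = 0.021, 0.025, 50, 0.01

z0 = (s - sigma0) / (sigma0 / sqrt(2 * n))   # equation 11-48
z_crit = -NormalDist().inv_cdf(1 - alpha)    # -Z_{0.01} = -2.33

print(f"Z0 = {z0:.2f}, critical value = {z_crit:.2f}")
print("reject H0" if z0 < z_crit else "fail to reject H0")
# -1.60 is not below -2.33, so H0 is not rejected
```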

11-2.4 Tests of Hypotheses on a Proportion

Statistical Analysis

In many engineering and management problems, we are concerned with a random variable that follows the binomial distribution. For example, consider a production process that manufactures items that are classified as either acceptable or defective. It is usually


reasonable to model the occurrence of defectives with the binomial distribution, where the binomial parameter p represents the proportion of defective items produced.
We will consider testing

H_0: p = p_0,
H_1: p ≠ p_0.   (11-51)

An approximate test based on the normal approximation to the binomial will be given. This approximate procedure will be valid as long as p is not extremely close to zero or 1, and if the sample size is relatively large. Let X be the number of observations in a random sample of size n that belongs to the class associated with p. Then, if the null hypothesis H_0: p = p_0 is true, we have X ~ N(np_0, np_0(1 − p_0)), approximately. To test H_0: p = p_0, calculate the test statistic

Z_0 = (X − np_0)/√(np_0(1 − p_0))   (11-52)

and reject H_0: p = p_0 if

Z_0 > Z_{α/2} or Z_0 < −Z_{α/2}.   (11-53)

Critical regions for the one-sided alternative hypotheses would be located in the usual manner.

Example 11-10
A semiconductor firm produces logic devices. The contract with their customer calls for a fraction defective of no more than 0.05. They wish to test

H_0: p = 0.05,
H_1: p > 0.05.

A random sample of 200 devices yields six defectives. The test statistic is

Z_0 = (6 − 200(0.05))/√(200(0.05)(0.95)) = −1.30.

Using α = 0.05, we find that Z_{0.05} = 1.645, and so we cannot reject the null hypothesis that p = 0.05.
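A Python sketch of this proportion test:

```python
from math import sqrt
from statistics import NormalDist

# Upper-tail test on a proportion (Example 11-10)
x, n, p0, alpha = 6, 200, 0.05, 0.05

z0 = (x - n * p0) / sqrt(n * p0 * (1 - p0))   # equation 11-52
z_crit = NormalDist().inv_cdf(1 - alpha)      # Z_{0.05} = 1.645

print(f"Z0 = {z0:.2f}, critical value = {z_crit:.3f}")
print("reject H0" if z0 > z_crit else "fail to reject H0")
# -1.30 < 1.645, so we cannot reject H0: p = 0.05
```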

Choice of Sample Size

It is possible to obtain closed-form equations for the β error for the tests in this section. The β error for the two-sided alternative H_1: p ≠ p_0 is approximately

β = Φ[(p_0 − p + Z_{α/2}√(p_0(1 − p_0)/n)) / √(p(1 − p)/n)] − Φ[(p_0 − p − Z_{α/2}√(p_0(1 − p_0)/n)) / √(p(1 − p)/n)].   (11-54)


If the alternative is H_1: p < p_0, then

β = 1 − Φ[(p_0 − p − Z_α√(p_0(1 − p_0)/n)) / √(p(1 − p)/n)],   (11-55)

whereas if the alternative is H_1: p > p_0, then

β = Φ[(p_0 − p + Z_α√(p_0(1 − p_0)/n)) / √(p(1 − p)/n)].   (11-56)

These equations can be solved to find the sample size n that gives a test of level α that has a specified β risk. The sample size equations are

n = [(Z_{α/2}√(p_0(1 − p_0)) + Z_β√(p(1 − p))) / (p − p_0)]²   (11-57)

for the two-sided alternative and

n = [(Z_α√(p_0(1 − p_0)) + Z_β√(p(1 − p))) / (p − p_0)]²   (11-58)

for the one-sided alternatives.

Example 11-11
For the situation described in Example 11-10, suppose that we wish to find the β error of the test if p = 0.07. Using equation 11-56, the β error is

β = Φ[(0.05 − 0.07 + 1.645√((0.05)(0.95)/200)) / √((0.07)(0.93)/200)]
  = Φ(0.30)
  = 0.6179.

This type II error probability is not as small as one might like, but n = 200 is not particularly large and 0.07 is not very far from the null value p_0 = 0.05. Suppose that we want the β error to be no larger than 0.10 if the true value of the fraction defective is as large as p = 0.07. The required sample size would be found from equation 11-58 as

n = [(1.645√((0.05)(0.95)) + 1.28√((0.07)(0.93))) / (0.07 − 0.05)]² = 1174,

which is a very large sample size. However, notice that we are trying to detect a very small deviation from the null value p_0 = 0.05.
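Both the β error and the required sample size can be scripted. A sketch of equations 11-56 and 11-58 for this example:

```python
from math import sqrt
from statistics import NormalDist

# Type II error and required sample size for the proportion test (Example 11-11)
p0, p, n, alpha, beta_target = 0.05, 0.07, 200, 0.05, 0.10

nd = NormalDist()
z_a = nd.inv_cdf(1 - alpha)          # Z_alpha = 1.645
z_b = nd.inv_cdf(1 - beta_target)    # Z_beta = 1.28

# beta from equation 11-56 (upper-tail alternative)
beta = nd.cdf((p0 - p + z_a * sqrt(p0 * (1 - p0) / n)) / sqrt(p * (1 - p) / n))
print(f"beta = {beta:.4f}")          # about 0.62

# required n from equation 11-58
n_req = ((z_a * sqrt(p0 * (1 - p0)) + z_b * sqrt(p * (1 - p))) / (p - p0)) ** 2
print(f"required n = {n_req:.0f}")   # about 1175 (the text's 1174 uses rounded quantiles)
```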


11-3 TESTS OF HYPOTHESES ON TWO SAMPLES

11-3.1 Tests of Hypotheses on the Means of Two Normal Distributions, Variances Known

Statistical Analysis

Suppose that there are two populations of interest, say X_1 and X_2. We assume that X_1 has unknown mean μ_1 and known variance σ_1², and that X_2 has unknown mean μ_2 and known variance σ_2². We will be concerned with testing the hypothesis that the means μ_1 and μ_2 are equal. It is assumed either that the random variables X_1 and X_2 are normally distributed or, if they are nonnormal, that the conditions of the Central Limit Theorem apply.

Consider first the two-sided alternative hypothesis

H_0: μ_1 = μ_2,
H_1: μ_1 ≠ μ_2.   (11-59)


Suppose that a random sample of size nl is drawn from Xl' say Xll' X: z,.,., XII'j!' and that a
second random sample of size n2 is drawn fromX2, say X Z1 ,X22,. .. ) Xlnt' It is assumed that
that the (X,) are
the (XI}) are lndepencently distributed with mean il, and variance
independently distributed with mean iLl and variance
and that the two samples (Xlj) and
(X,) are independ':!'t. ~e test procedure is based on the distribution of the difference in
sample means, say X: ~ Xl. In general, we know that

<ri,

ai,

aJ

_
_
It'
a~)
Xl-X,-N, il,-iLl,-+-' .
\

Thus, if the null hypothesis Ho: il,

nl

llz

=iLl is true, the test statistic


ZO = ~~X"-",

J7n:f + <1~"2

(11.60)

V
follows the N(O, I) distribution. Therefore, the procedure for testing Ho: ill = iLl is to calculate the test statistic in equation 11-60 and reject the null hypothesis if

z;,

(1I-6Ia)
or

Zo '" -Z"".

(1l-6Ib)

The one-sided alternative hypotheses are analyzed similarly. To test

H_0: μ_1 = μ_2,
H_1: μ_1 > μ_2,   (11-62)

the test statistic Z_0 in equation 11-60 is calculated, and H_0: μ_1 = μ_2 is rejected if

Z_0 > Z_α.   (11-63)

To test the other one-sided alternative hypothesis,

H_0: μ_1 = μ_2,
H_1: μ_1 < μ_2,   (11-64)

use the test statistic Z_0 in equation 11-60 and reject H_0: μ_1 = μ_2 if

Z_0 < −Z_α.   (11-65)


Example 11-12
The plant manager of an orange juice canning facility is interested in comparing the performance of two different production lines in her plant. As line number 1 is relatively new, she suspects that its output in number of cases per day is greater than the number of cases produced by the older line 2. Ten days of data are selected at random for each line, for which it is found that x̄_1 = 824.9 cases per day and x̄_2 = 818.6 cases per day. From experience with operating this type of equipment, it is known that σ_1² = 40 and σ_2² = 50. We wish to test

H_0: μ_1 = μ_2,
H_1: μ_1 > μ_2.

The value of the test statistic is

Z_0 = (x̄_1 − x̄_2)/√(σ_1²/n_1 + σ_2²/n_2) = (824.9 − 818.6)/√(40/10 + 50/10) = 2.10.

Using α = 0.05, we find that Z_{0.05} = 1.645, and since Z_0 > Z_{0.05}, we would reject H_0 and conclude that the mean number of cases per day produced by the new production line is greater than the mean number of cases per day produced by the old line.
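A Python sketch of this two-sample z-test:

```python
from math import sqrt
from statistics import NormalDist

# Two-sample z-test with known variances (Example 11-12)
x1, x2 = 824.9, 818.6
var1, var2, n1, n2, alpha = 40.0, 50.0, 10, 10, 0.05

z0 = (x1 - x2) / sqrt(var1 / n1 + var2 / n2)   # equation 11-60
z_crit = NormalDist().inv_cdf(1 - alpha)       # Z_{0.05} = 1.645

print(f"Z0 = {z0:.2f}, critical value = {z_crit:.3f}")
print("reject H0" if z0 > z_crit else "fail to reject H0")
# 2.10 > 1.645, so the new line's mean output is judged higher
```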

Choice of Sample Size

The operating characteristic curves in Charts VIa, VIb, VIc, and VId (Appendix) may be used to evaluate the type II error probability for the hypotheses in equations 11-59, 11-62, and 11-64. These curves are also useful in sample size determination. Curves are provided for α = 0.05 and α = 0.01. For the two-sided alternative hypothesis in equation 11-59, the abscissa scale of the operating characteristic curves in Charts VIa and VIb is d, where

d = |μ_1 − μ_2|/√(σ_1² + σ_2²) = |δ|/√(σ_1² + σ_2²),   (11-66)

and one must choose equal sample sizes, say n = n_1 = n_2. The one-sided alternative hypotheses require the use of Charts VIc and VId. For the one-sided alternative H_1: μ_1 > μ_2 in equation 11-62, the abscissa scale is

d = (μ_1 − μ_2)/√(σ_1² + σ_2²),   (11-67)

where n = n_1 = n_2. The other one-sided alternative hypothesis, H_1: μ_1 < μ_2, requires that d be defined as

d = (μ_2 − μ_1)/√(σ_1² + σ_2²),   (11-68)

and n = n_1 = n_2.
It is not unusual to encounter problems where the costs of collecting data differ substantially between the two populations, or where one population variance is much greater than the other. In those cases, one often uses unequal sample sizes. If n_1 ≠ n_2, the operating characteristic curves may be entered with an equivalent value of n computed from

n = (σ_1² + σ_2²)/(σ_1²/n_1 + σ_2²/n_2).   (11-69)


If n_1 ≠ n_2 and their values are fixed in advance, then equation 11-69 is used directly to calculate n, and the operating characteristic curves are entered with a specified d to obtain β. If we are given d and it is necessary to determine n_1 and n_2 to obtain a specified β, say β*, then one guesses at trial values of n_1 and n_2, calculates n from equation 11-69, enters the curves with the specified value of d, and finds β. If β ≤ β*, then the trial values of n_1 and n_2 are satisfactory. If β > β*, then adjustments to n_1 and n_2 are made and the process is repeated.

Example 11-13
Consider the orange juice production line problem in Example 11-12. If the true difference in mean production rates were 10 cases per day, find the sample sizes required to detect this difference with a probability of 0.90. The appropriate value of the abscissa parameter is

d = (μ_1 − μ_2)/√(σ_1² + σ_2²) = 10/√(40 + 50) = 1.05,

and since α = 0.05, we find from Chart VIc that n = n_1 = n_2 = 8, approximately.

It is also possible to derive formulas for the sample size required to obtain a specified β for a given δ and α. These formulas occasionally are useful supplements to the operating characteristic curves. For the two-sided alternative hypothesis, the sample size n_1 = n_2 = n is

n ≈ (Z_{α/2} + Z_β)²(σ_1² + σ_2²)/δ².   (11-70)

This approximation is valid when Φ(−Z_{α/2} − δ√n/√(σ_1² + σ_2²)) is small compared to β. For a one-sided alternative, we have n_1 = n_2 = n, where

n ≈ (Z_α + Z_β)²(σ_1² + σ_2²)/δ².   (11-71)

The derivations of equations 11-70 and 11-71 closely follow the single-sample case in Section 11-2. To illustrate the use of these equations, consider the situation in Example 11-13. We have a one-sided alternative with α = 0.05, δ = 10, σ_1² = 40, σ_2² = 50, and β = 0.10. Thus Z_α = Z_{0.05} = 1.645, Z_β = Z_{0.10} = 1.28, and the required sample size is found from equation 11-71 to be

n ≈ (1.645 + 1.28)²(40 + 50)/10² ≈ 8,

which agrees with the results obtained in Example 11-13.

11-3.2 Tests of Hypotheses on the Means of Two Normal Distributions, Variances Unknown

We now consider tests of hypotheses on the equality of the means μ_1 and μ_2 of two normal distributions where the variances σ_1² and σ_2² are unknown. A t statistic will be used to test these hypotheses. As noted in Section 11-2.2, the normality assumption is required to


develop the test procedure, but moderate departures from normality do not adversely affect the procedure. There are two different situations that must be treated. In the first case, we assume that the variances of the two normal distributions are unknown but equal; that is, σ_1² = σ_2² = σ². In the second, we assume that σ_1² and σ_2² are unknown and not necessarily equal.

Case 1: σ_1² = σ_2² = σ². Let X_1 and X_2 be two independent normal populations with unknown means μ_1 and μ_2, and unknown but equal variances σ_1² = σ_2² = σ². We wish to test

H_0: μ_1 = μ_2,
H_1: μ_1 ≠ μ_2.   (11-72)

Suppose that X_{11}, X_{12}, ..., X_{1n_1} is a random sample of n_1 observations from X_1, and X_{21}, X_{22}, ..., X_{2n_2} is a random sample of n_2 observations from X_2. Let X̄_1, X̄_2, S_1², and S_2² be the sample means and sample variances, respectively. Since both S_1² and S_2² estimate the common variance σ², we may combine them to yield a single estimate, say

S_p² = [(n_1 − 1)S_1² + (n_2 − 1)S_2²]/(n_1 + n_2 − 2).   (11-73)

This combined or "pooled" estimator was introduced in Section 10-3.2. To test H_0: μ_1 = μ_2 in equation 11-72, compute the test statistic

t_0 = (X̄_1 − X̄_2)/(S_p√(1/n_1 + 1/n_2)).   (11-74)

If H_0: μ_1 = μ_2 is true, t_0 is distributed as t_{n_1+n_2−2}. Therefore, if

t_0 > t_{α/2, n_1+n_2−2}   (11-75a)

or if

t_0 < −t_{α/2, n_1+n_2−2},   (11-75b)

we reject H_0: μ_1 = μ_2.


The one-sided alternatives are treated similarly. To test

H_0: μ_1 = μ_2,
H_1: μ_1 > μ_2,   (11-76)

compute the test statistic t_0 in equation 11-74 and reject H_0: μ_1 = μ_2 if

t_0 > t_{α, n_1+n_2−2}.   (11-77)

For the other one-sided alternative,

H_0: μ_1 = μ_2,
H_1: μ_1 < μ_2,   (11-78)

calculate the test statistic t_0 and reject H_0: μ_1 = μ_2 if

t_0 < −t_{α, n_1+n_2−2}.   (11-79)

The two-sample t-test given in this section is often called the pooled t-test, because the sample variances are combined or pooled to estimate the common variance. It is also known as the independent t-test, because the two normal populations are assumed to be independent.


Example 11-14
Two catalysts are being analyzed to determine how they affect the mean yield of a chemical process. Specifically, catalyst 1 is currently in use, but catalyst 2 is acceptable. Since catalyst 2 is cheaper, if it does not change the process yield, it should be adopted. Suppose we wish to test the hypotheses

H_0: μ_1 = μ_2,
H_1: μ_1 ≠ μ_2.

Pilot plant data yield n_1 = 8, x̄_1 = 91.73, s_1² = 3.89, n_2 = 8, x̄_2 = 93.75, and s_2² = 4.02. From equation 11-73, we find

s_p² = [(7)3.89 + (7)4.02]/(8 + 8 − 2) = 3.96.

The test statistic is

t_0 = (91.73 − 93.75)/(1.99√(1/8 + 1/8)) = −2.03.

Using α = 0.05, we find that t_{0.025,14} = 2.145 and −t_{0.025,14} = −2.145, and, consequently, H_0: μ_1 = μ_2 cannot be rejected. That is, we do not have strong evidence to conclude that catalyst 2 results in a mean yield that differs from the mean yield when catalyst 1 is used.
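A computational sketch of the pooled t-test, assuming SciPy for the t quantile:

```python
from math import sqrt
from scipy import stats  # assumed available for t quantiles

# Pooled t-test with equal but unknown variances (Example 11-14)
x1, s1_sq, n1 = 91.73, 3.89, 8
x2, s2_sq, n2 = 93.75, 4.02, 8
alpha = 0.05

sp = sqrt(((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2))  # eq. 11-73
t0 = (x1 - x2) / (sp * sqrt(1 / n1 + 1 / n2))                     # eq. 11-74
t_crit = stats.t.ppf(1 - alpha / 2, df=n1 + n2 - 2)               # 2.145

print(f"sp = {sp:.2f}, t0 = {t0:.2f}, critical value = {t_crit:.3f}")
print("reject H0" if abs(t0) > t_crit else "fail to reject H0")
# |-2.03| < 2.145, so H0 cannot be rejected
```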

Case 2: σ_1² ≠ σ_2². In some situations, we cannot reasonably assume that the unknown variances σ_1² and σ_2² are equal. There is not an exact t statistic available for testing H_0: μ_1 = μ_2 in this case. However, the statistic

t*_0 = (X̄_1 − X̄_2)/√(S_1²/n_1 + S_2²/n_2)   (11-80)

is distributed approximately as t with degrees of freedom given by

v = (S_1²/n_1 + S_2²/n_2)² / [ (S_1²/n_1)²/(n_1 + 1) + (S_2²/n_2)²/(n_2 + 1) ] − 2   (11-81)

if the null hypothesis H_0: μ_1 = μ_2 is true. Therefore, if σ_1² ≠ σ_2², the hypotheses of equations 11-72, 11-76, and 11-78 are tested as before, except that t*_0 is used as the test statistic and n_1 + n_2 − 2 is replaced by v in determining the degrees of freedom for the test. This general problem is often called the Behrens-Fisher problem.
problem is often called the Behren&-Fisher problem.

Example 11-15
A manufacturer of video display units is testing two microcircuit designs to determine whether they produce equivalent current flow. Development engineering has obtained the following data:

Design 1:  n_1 = 15,  x̄_1 = 24.2,  s_1² = 10,
Design 2:  n_2 = 10,  x̄_2 = 23.9,  s_2² = 20.


We wish to test

H_0: μ_1 = μ_2,
H_1: μ_1 ≠ μ_2,

where both populations are assumed to be normal, but we are unwilling to assume that the unknown variances σ_1² and σ_2² are equal. The test statistic is

t*_0 = (x̄_1 − x̄_2)/√(s_1²/n_1 + s_2²/n_2) = (24.2 − 23.9)/√(10/15 + 20/10) = 0.184.

The degrees of freedom on t*_0 are found from equation 11-81 to be

v = (s_1²/n_1 + s_2²/n_2)² / [ (s_1²/n_1)²/(n_1 + 1) + (s_2²/n_2)²/(n_2 + 1) ] − 2
  = (10/15 + 20/10)² / [ (10/15)²/16 + (20/10)²/11 ] − 2 ≈ 16.

Using α = 0.10, we find that t_{α/2, v} = t_{0.05,16} = 1.746. Since |t*_0| < t_{0.05,16}, we cannot reject H_0: μ_1 = μ_2.
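A sketch of the statistic and its approximate degrees of freedom, assuming SciPy for the t quantile:

```python
from math import sqrt
from scipy import stats  # assumed available for t quantiles

# Approximate t-test with unequal, unknown variances (Example 11-15)
x1, s1_sq, n1 = 24.2, 10.0, 15
x2, s2_sq, n2 = 23.9, 20.0, 10
alpha = 0.10

w1, w2 = s1_sq / n1, s2_sq / n2
t0 = (x1 - x2) / sqrt(w1 + w2)                                 # equation 11-80
v = (w1 + w2)**2 / (w1**2 / (n1 + 1) + w2**2 / (n2 + 1)) - 2   # equation 11-81
t_crit = stats.t.ppf(1 - alpha / 2, df=round(v))               # t_{0.05,16} = 1.746

print(f"t0* = {t0:.3f}, v = {v:.1f}, critical value = {t_crit:.3f}")
print("reject H0" if abs(t0) > t_crit else "fail to reject H0")
# |0.184| < 1.746, so H0 cannot be rejected
```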

Choice of Sample Size

The operating characteristic curves in Charts VIe, VIf, VIg, and VIh (Appendix) are used to evaluate the type II error for the case where σ_1² = σ_2² = σ². Unfortunately, when σ_1² ≠ σ_2², the distribution of t*_0 is unknown if the null hypothesis is false, and no operating characteristic curves are available for this case.


For the two-sided alternative in equation 11-72, when
Charts VIe and VI! are used with
d = '---'-_:.1
2a

a; = c?, = if and n, = "2 = n,

2a

(11-82)

To use these curves, they must be entered with the sample size n* ::: 2.n - L For the onesided alternative hypothesis of equation 11-76, we use Charts VIg and VIh and define
d

f.11-J.Lz=J..,
2a
2a

(U-83)

whereas for the other one-sided alternative hypothesis of equation 11-78/ we use
d=f.12-f.1, =J..,
2a
2a

(11-84)

It is noted that the parameter d is a function of σ, which is unknown. As in the single-sample t-test (Section 11-2.2), we may have to rely on a prior estimate of σ, or use a subjective estimate. Alternatively, we could define the differences in the mean that we wish to detect relative to σ.

Example 11-16
Consider the catalyst experiment in Example 11-14. Suppose that if catalyst 2 produces a yield that differs from the yield of catalyst 1 by 3.0%, we would like to reject the null hypothesis with a probability of at least 0.85. What sample size is required? Using s_p = 1.99 as a rough estimate of the


common standard deviation σ, we have d = |δ|/(2σ) = |3.00|/[(2)(1.99)] = 0.75. From Chart VIe (Appendix) with d = 0.75 and β = 0.15, we find n* = 20, approximately. Therefore, since n* = 2n − 1,

n = (n* + 1)/2 = (20 + 1)/2 = 10.5 ≈ 11 (say),

and we would use sample sizes of n_1 = n_2 = n = 11.

11-3.3 The Paired t-Test

A special case of the two-sample t-tests occurs when the observations on the two populations of interest are collected in pairs. Each pair of observations, say (X_{1j}, X_{2j}), is taken under homogeneous conditions, but these conditions may change from one pair to another. For example, suppose that we are interested in comparing two different types of tips for a hardness-testing machine. This machine presses the tip into a metal specimen with a known force. By measuring the depth of the depression caused by the tip, the hardness of the specimen can be determined. If several specimens were selected at random, half tested with tip 1, half tested with tip 2, and the pooled or independent t-test in Section 11-3.2 applied, the results of the test could be invalid. That is, the metal specimens could have been cut from bar stock that was produced in different heats, or they may not be homogeneous in some other way that might affect hardness; then the observed differences between mean hardness readings for the two tip types also include hardness differences between specimens.
The correct: experimental procedure is to collect the dara in pairs; that is, to take two
hardness readings of each specimen, one with each tip. The test procedure would then con~
sist of anal)l'zmg the differences between hardness readings of each specimen. If there is no
difference between tips. then lhe mean of lhe differences should be zero. This test procedure
is called the paired t-test.
Let (X₁₁, X₂₁), (X₁₂, X₂₂), ..., (X₁ₙ, X₂ₙ) be a set of n paired observations, where we assume that X₁ ~ N(μ₁, σ₁²) and X₂ ~ N(μ₂, σ₂²). Define the differences between each pair of observations as Dⱼ = X₁ⱼ − X₂ⱼ, j = 1, 2, ..., n. The Dⱼ are normally distributed with mean μ_D = μ₁ − μ₂, so testing hypotheses about the equality of μ₁ and μ₂ can be accomplished by performing a one-sample t-test on μ_D. Specifically, testing H₀: μ₁ = μ₂ against H₁: μ₁ ≠ μ₂ is equivalent to testing

H₀: μ_D = 0,
H₁: μ_D ≠ 0.    (11-85)

The appropriate test statistic for equation 11-85 is

t₀ = D̄ / (S_D/√n),    (11-86)

where

D̄ = (1/n) Σⱼ₌₁ⁿ Dⱼ    (11-87)


and

S_D² = Σⱼ₌₁ⁿ (Dⱼ − D̄)² / (n − 1)    (11-88)

are the sample mean and variance of the differences. We would reject H₀: μ_D = 0 (implying that μ₁ ≠ μ₂) if t₀ > t_{α/2,n−1} or if t₀ < −t_{α/2,n−1}. One-sided alternatives would be treated similarly.

Example 11-17
An article in the Journal of Strain Analysis (Vol. 18, No. 2, 1983) compares several methods for predicting the shear strength of steel plate girders. Data for two of these methods, the Karlsruhe and Lehigh procedures, when applied to nine specific girders, are shown in Table 11-2. We wish to determine if there is any difference (on the average) between the two methods.
The sample average and standard deviation of the differences dⱼ are d̄ = 0.2739 and s_d = 0.1351, so the test statistic is

t₀ = d̄ / (s_d/√n) = 0.2739 / (0.1351/√9) = 6.08.

For the two-sided alternative H₁: μ_D ≠ 0 and α = 0.1, we would fail to reject only if |t₀| < t_{0.05,8} = 1.86. Since t₀ > t_{0.05,8}, we conclude that the two strength prediction methods yield different results. Specifically, the Karlsruhe method produces, on average, higher strength predictions than does the Lehigh method.
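The paired analysis is easy to reproduce in software. The following short sketch is in Python, assuming the NumPy and SciPy libraries are available (an assumption; neither is used in the text itself). It recomputes d̄, s_d, and t₀ from the nine pairs of Table 11-2.

import numpy as np
from scipy import stats

karlsruhe = np.array([1.186, 1.151, 1.322, 1.339, 1.200, 1.402, 1.365, 1.537, 1.559])
lehigh = np.array([1.061, 0.992, 1.063, 1.062, 1.065, 1.178, 1.037, 1.086, 1.052])

d = karlsruhe - lehigh                         # paired differences d_j
n = len(d)
t0 = d.mean() / (d.std(ddof=1) / np.sqrt(n))   # reproduces t0 = 6.08
print(d.mean(), d.std(ddof=1), t0)

# Equivalently, one library call gives the statistic and a two-sided P-value:
t_stat, p_value = stats.ttest_rel(karlsruhe, lehigh)
print(t_stat, p_value)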

Paired Versus Unpaired Comparisons Sometimes in performing a comparative experiment, the investigator can choose between the paired analysis and the two-sample (or unpaired) t-test. If n measurements are to be made on each population, the two-sample t statistic is

Table 11-2 Strength Predictions for Nine Steel Plate Girders (Predicted Load/Observed Load)

Girder    Karlsruhe Method    Lehigh Method    Difference
S1/1      1.186               1.061            0.125
S2/1      1.151               0.992            0.159
S3/1      1.322               1.063            0.259
S4/1      1.339               1.062            0.277
S5/1      1.200               1.065            0.135
S2/1      1.402               1.178            0.224
S2/2      1.365               1.037            0.328
S2/3      1.537               1.086            0.451
S2/4      1.559               1.052            0.507


t₀ = (X̄₁ − X̄₂) / [S_p √(2/n)],

which is compared to t_{α/2,2n−2}, and, of course, the paired t statistic is

t₀ = D̄ / (S_D/√n),

which is compared to t_{α/2,n−1}. Notice that since

D̄ = (1/n) Σⱼ Dⱼ = (1/n) Σⱼ X₁ⱼ − (1/n) Σⱼ X₂ⱼ = X̄₁ − X̄₂,

the numerators of both statistics are identical. However, the denominator of the two-sample t-test is based on the assumption that X₁ and X₂ are independent. In many paired experiments, there is a strong positive correlation between X₁ and X₂. That is,

V(D̄) = V(X̄₁ − X̄₂) = V(X̄₁) + V(X̄₂) − 2Cov(X̄₁, X̄₂) = 2σ²(1 − ρ)/n,

assuming that both populations X₁ and X₂ have identical variances σ². Furthermore, S_D²/n estimates the variance of D̄. Now, whenever there is positive correlation within the pairs, the denominator for the paired t-test will be smaller than the denominator of the two-sample t-test. This can cause the two-sample t-test to considerably understate the significance of the data if it is incorrectly applied to paired samples.
Although pairing will often lead to a smaller value of the variance of X̄₁ − X̄₂, it does have a disadvantage. Namely, the paired t-test leads to a loss of n − 1 degrees of freedom in comparison to the two-sample t-test. Generally, we know that increasing the degrees of freedom of a test increases the power against any fixed alternative value of the parameter.
So how do we decide to conduct the experiment: should we pair the observations or not? Although there is no general answer to this question, we can give some guidelines based on the above discussion:
1. If the experimental units are relatively homogeneous (small σ) and the correlation between pairs is small, the gain in precision due to pairing will be offset by the loss of degrees of freedom, so an independent-samples experiment should be used.
2. If the experimental units are relatively heterogeneous (large σ) and there is large positive correlation between pairs, the paired experiment should be used.
These rules still require judgment in their implementation, because σ and ρ are usually not known precisely. Furthermore, if the number of degrees of freedom is large (say 40 or 50), then the loss of n − 1 of them for pairing may not be serious. However, if the number of degrees of freedom is small (say 10 or 20), then losing half of them is potentially serious if not compensated for by an increased precision from pairing.

11-3.4 Tests for the Equality of Two Variances


We now present tests for comparing two variances. Following the approach of Section 11-2.3, we present tests for normal populations and large-sample tests that may be applied to nonnormal populations.


Test Procedure for Normal Populations

Suppose that two independent populations are of interest, say X₁ ~ N(μ₁, σ₁²) and X₂ ~ N(μ₂, σ₂²), where μ₁, μ₂, σ₁², and σ₂² are unknown. We wish to test hypotheses about the equality of the two variances, say H₀: σ₁² = σ₂². Assume that two random samples of size n₁ from population 1 and of size n₂ from population 2 are available, and let S₁² and S₂² be the sample variances. To test the two-sided alternative

H₀: σ₁² = σ₂²,
H₁: σ₁² ≠ σ₂²,    (11-89)
we use the fact that the statistic

F₀ = S₁² / S₂²    (11-90)

is distributed as F with n₁ − 1 and n₂ − 1 degrees of freedom if the null hypothesis H₀: σ₁² = σ₂² is true. Therefore, we would reject H₀ if

F₀ > F_{α/2,n₁−1,n₂−1}    (11-91a)

or if

F₀ < F_{1−α/2,n₁−1,n₂−1},    (11-91b)

where F_{α/2,n₁−1,n₂−1} and F_{1−α/2,n₁−1,n₂−1} are the upper and lower α/2 percentage points of the F distribution with n₁ − 1 and n₂ − 1 degrees of freedom. Table V (Appendix) gives only the upper tail points of F, so to find F_{1−α/2,n₁−1,n₂−1} we must use

F_{1−α/2,n₁−1,n₂−1} = 1 / F_{α/2,n₂−1,n₁−1}.    (11-92)

The same test statistic can be used to test one-sided alternative hypotheses. Since the notation X₁ and X₂ is arbitrary, let X₁ denote the population that may have the larger variance. Therefore, the one-sided alternative hypothesis is

H₀: σ₁² = σ₂²,
H₁: σ₁² > σ₂².    (11-93)

If

F₀ > F_{α,n₁−1,n₂−1},    (11-94)

we would reject H₀: σ₁² = σ₂².

Example 11-18
Chemical etching is used to remove copper from printed circuit boards. X₁ and X₂ represent the process yields when two different concentrations are used. Suppose that we wish to test

H₀: σ₁² = σ₂²,
H₁: σ₁² ≠ σ₂².

Two samples of sizes n₁ = n₂ = 8 yield s₁² = 3.89 and s₂², from which the test statistic F₀ = s₁²/s₂² is computed.


If α = 0.05, we find that F_{0.025,7,7} = 4.99 and F_{0.975,7,7} = (F_{0.025,7,7})⁻¹ = (4.99)⁻¹ = 0.20. Since F₀ falls between these two critical values, we cannot reject H₀: σ₁² = σ₂², and we can conclude that there is no strong evidence that the variance of the yield is affected by the concentration.
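The critical values used above are easy to verify in software. The following sketch assumes Python with SciPy (an assumption; the text itself uses only Table V of the Appendix):

from scipy import stats

alpha, df1, df2 = 0.05, 7, 7
upper = stats.f.ppf(1 - alpha / 2, df1, df2)   # F_{0.025,7,7}, about 4.99
lower = stats.f.ppf(alpha / 2, df1, df2)       # F_{0.975,7,7} = 1/4.99, about 0.20
print(upper, lower)
# H0 is rejected only if F0 = s1**2/s2**2 falls outside (lower, upper).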

Choice of Sample Size

Charts VIo, VIp, VIq, and VIr (Appendix) provide operating characteristic curves for the F-test for α = 0.05 and α = 0.01, assuming that n₁ = n₂ = n. Charts VIo and VIp are used with the two-sided alternative of equation 11-89. They plot β against the abscissa parameter

λ = σ₁ / σ₂    (11-95)

for various n₁ = n₂ = n. Charts VIq and VIr are used for the one-sided alternative of equation 11-93.

Example 11-19
For the chemical process yield analysis problem in Example 11-18, suppose that one of the concentrations affected the variance of the yield, so that one of the variances was four times the other, and we wished to detect this with probability at least 0.80. What sample size should be used? Note that if one variance is four times the other, then

λ = σ₁/σ₂ = 2.

By referring to Chart VIo, with β = 0.20 and λ = 2, we find that a sample size of n₁ = n₂ = 20, approximately, is necessary.
A Large-Sample Test Procedure

When both sample sizes n₁ and n₂ are large, a test procedure that does not require the normality assumption can be developed. The test is based on the result that the sample standard deviations S₁ and S₂ have approximate normal distributions with means σ₁ and σ₂, respectively, and variances σ₁²/2n₁ and σ₂²/2n₂, respectively. To test

H₀: σ₁ = σ₂,
H₁: σ₁ ≠ σ₂,    (11-96)

we would use the test statistic

Z₀ = (S₁ − S₂) / [S_p √(1/(2n₁) + 1/(2n₂))],    (11-97)

where S_p is the pooled estimator of the common standard deviation σ. This statistic has an approximate standard normal distribution when σ₁ = σ₂. We would reject H₀ if Z₀ > Z_{α/2} or if Z₀ < −Z_{α/2}. Rejection regions for the one-sided alternatives have the same form as in other two-sample normal tests.

11-3.5 Tests of Hypotheses on Two Proportions

The tests of Section 11-2.4 can be extended to the case where there are two binomial parameters of interest, say p₁ and p₂, and we wish to test that they are equal. That is, we wish to test

H₀: p₁ = p₂,
H₁: p₁ ≠ p₂.    (11-98)


We will present a large-sample procedure based on the normal approximation to the binomial and then outline one possible approach for small sample sizes.

Large-Sample Test for H₀: p₁ = p₂


Suppose that two random samples of sizes n₁ and n₂ are taken from two populations, and let X₁ and X₂ represent the number of observations that belong to the class of interest in samples 1 and 2, respectively. Furthermore, suppose that the normal approximation to the binomial applies to each population, so that the estimators of the population proportions p̂₁ = X₁/n₁ and p̂₂ = X₂/n₂ have approximate normal distributions. Now, if the null hypothesis H₀: p₁ = p₂ is true, then using the fact that p₁ = p₂ = p, the random variable

Z = (p̂₁ − p̂₂) / √[p(1 − p)(1/n₁ + 1/n₂)]

is distributed approximately N(0, 1). An estimate of the common parameter p is

p̂ = (X₁ + X₂) / (n₁ + n₂).

The test statistic for H₀: p₁ = p₂ is then

Z₀ = (p̂₁ − p̂₂) / √[p̂(1 − p̂)(1/n₁ + 1/n₂)].    (11-99)

If

|Z₀| > Z_{α/2},    (11-100)

the null hypothesis is rejected.

Example 11-20
Two different types of fire control computers are being considered for use by the U.S. Army in six-gun 105-mm batteries. The two computer systems are subjected to an operational test in which the total number of hits of the target is counted. Computer system 1 gave 250 hits out of 300 rounds, while computer system 2 gave 178 hits out of 260 rounds. Is there reason to believe that the two computer systems differ? To answer this question, we test

H₀: p₁ = p₂,
H₁: p₁ ≠ p₂.

Note that p̂₁ = 250/300 = 0.8333, p̂₂ = 178/260 = 0.6846, and

p̂ = (250 + 178) / (300 + 260) = 0.7643.

The value of the test statistic is

Z₀ = (0.8333 − 0.6846) / √[0.7643(0.2357)(1/300 + 1/260)] = 4.13.

If we use α = 0.05, then Z_{0.025} = 1.96 and −Z_{0.025} = −1.96, and we would reject H₀, concluding that there is a significant difference in the two computer systems.
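A minimal sketch of the same computation in Python, assuming NumPy is available (an assumption, not part of the example), follows:

import numpy as np

x1, n1, x2, n2 = 250, 300, 178, 260
p1_hat, p2_hat = x1 / n1, x2 / n2
p_hat = (x1 + x2) / (n1 + n2)                 # pooled estimate, 0.7643

# Equation 11-99:
z0 = (p1_hat - p2_hat) / np.sqrt(p_hat * (1 - p_hat) * (1 / n1 + 1 / n2))
print(z0)                                     # about 4.13; |z0| > 1.96, so reject H0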


Choice of Sample Size

The computation of the β error for the foregoing test is somewhat more involved than in the single-sample case. The problem is that the denominator of Z₀ is an estimate of the standard deviation of p̂₁ − p̂₂ under the assumption that p₁ = p₂ = p. When H₀: p₁ = p₂ is false, the standard deviation of p̂₁ − p̂₂ is

σ_{p̂₁−p̂₂} = √(p₁q₁/n₁ + p₂q₂/n₂).    (11-101)

If the alternative hypothesis is two-sided, the β risk turns out to be approximately

β = Φ{[Z_{α/2} √(p̄q̄(1/n₁ + 1/n₂)) − (p₁ − p₂)] / σ_{p̂₁−p̂₂}}
  − Φ{[−Z_{α/2} √(p̄q̄(1/n₁ + 1/n₂)) − (p₁ − p₂)] / σ_{p̂₁−p̂₂}},    (11-102)

where

p̄ = (n₁p₁ + n₂p₂)/(n₁ + n₂),  q̄ = 1 − p̄,

and σ_{p̂₁−p̂₂} is given by equation 11-101. If the alternative hypothesis is H₁: p₁ > p₂, then

β = Φ{[Z_α √(p̄q̄(1/n₁ + 1/n₂)) − (p₁ − p₂)] / σ_{p̂₁−p̂₂}},    (11-103)

and if the alternative hypothesis is H₁: p₁ < p₂, then

β = 1 − Φ{[−Z_α √(p̄q̄(1/n₁ + 1/n₂)) − (p₁ − p₂)] / σ_{p̂₁−p̂₂}}.    (11-104)

For a specified pair of values p₁ and p₂, we can find the sample sizes n₁ = n₂ = n required to give the test of size α a specified type II error β. For the two-sided alternative the common sample size is approximately

n = [Z_{α/2} √((p₁ + p₂)(q₁ + q₂)/2) + Z_β √(p₁q₁ + p₂q₂)]² / (p₁ − p₂)²,    (11-105)

where q₁ = 1 − p₁ and q₂ = 1 − p₂. For the one-sided alternatives, replace Z_{α/2} in equation 11-105 with Z_α.


Small-Sample Test for H₀: p₁ = p₂

Most problems involving the comparison of proportions p₁ and p₂ have relatively large sample sizes, so the procedure based on the normal approximation to the binomial is widely used in practice. However, occasionally a small-sample-size problem is encountered. In such cases, the Z-tests are inappropriate and an alternative procedure is required. In this section we describe a procedure based on the hypergeometric distribution.

Suppose that X₁ and X₂ are the numbers of successes in two random samples of sizes n₁ and n₂, respectively. The test procedure requires that we view the total number of successes as fixed at the value X₁ + X₂ = Y. Now consider the hypotheses

H₀: p₁ = p₂,
H₁: p₁ > p₂.

Given that X₁ + X₂ = Y, large values of X₁ support H₁, whereas small or moderate values of X₁ support H₀. Therefore, we will reject H₀ whenever X₁ is sufficiently large.
Since the combined sample of n₁ + n₂ observations contains X₁ + X₂ = Y total successes, if H₀: p₁ = p₂ is true, the successes are no more likely to be concentrated in the first sample than in the second. That is, all the ways in which the n₁ + n₂ responses can be divided into one sample of n₁ responses and a second sample of n₂ responses are equally likely. The number of ways of selecting x₁ successes for the first sample, leaving Y − x₁ successes for the second, is

C(Y, x₁) C(n₁ + n₂ − Y, n₁ − x₁).

Because outcomes are equally likely, the probability of there being exactly x₁ successes in sample 1 is determined by the ratio of the number of sample 1 outcomes having x₁ successes to the total number of outcomes, or

P(X₁ = x₁ | Y successes in n₁ + n₂ responses) = C(Y, x₁) C(n₁ + n₂ − Y, n₁ − x₁) / C(n₁ + n₂, n₁),    (11-106)

given that H₀: p₁ = p₂ is true. We recognize equation 11-106 as a hypergeometric distribution.
To use equation 11-106 for hypothesis testing, we would compute the probability of finding a value of X₁ at least as extreme as the observed value of x₁. Note that this probability is a P-value. If this P-value is sufficiently small, then the null hypothesis is rejected. This approach could also be applied to lower-tailed and two-tailed alternatives.

Example 11-21
Insulating cloth used in printed circuit boards is manufactured in large rolls. The manufacturer is trying to improve the process yield, that is, the number of defect-free rolls produced. A sample of 10 rolls contains exactly 4 defect-free rolls. From analysis of the defect types, manufacturing engineering suggests several changes in the process. Following implementation of these changes, another sample of 10 rolls yields 8 defect-free rolls. Do the data support the claim that the new process is better than the old one, using α = 0.10?


To answer this question, we compute the P-value. In our example, n₁ = n₂ = 10, Y = 8 + 4 = 12, and the observed value of x₁ = 8. The values of x₁ that are more extreme than 8 are 9 and 10. Therefore

P(X₁ = 8 | 12 successes) = C(12,8) C(8,2) / C(20,10) = 0.0750,
P(X₁ = 9 | 12 successes) = C(12,9) C(8,1) / C(20,10) = 0.0095,
P(X₁ = 10 | 12 successes) = C(12,10) C(8,0) / C(20,10) = 0.0003.

The P-value is P = 0.0750 + 0.0095 + 0.0003 = 0.0848. Thus, at the level α = 0.10, the null hypothesis is rejected, and we conclude that the engineering changes have improved the process yield.

This test procedure is sometimes called the Fisher-Irwin test. Because the test depends on the assumption that X₁ + X₂ is fixed at some value, some statisticians have argued against its use when X₁ + X₂ is not actually fixed. Clearly X₁ + X₂ is not fixed by the sampling procedure in our example. However, because there are no better competing procedures, the Fisher-Irwin test is often used whether or not X₁ + X₂ is actually fixed in advance.
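Because equation 11-106 is a hypergeometric distribution, the P-value of Example 11-21 can be obtained directly from a hypergeometric routine. A sketch in Python, assuming SciPy is available, follows; the parameters are the 20 total responses, the 12 total successes, and the 10 responses in sample 1.

from scipy import stats

N, Y, n1 = 20, 12, 10
# P(X1 >= 8) = P(X1 = 8) + P(X1 = 9) + P(X1 = 10):
p_value = stats.hypergeom.sf(7, N, Y, n1)    # survival function evaluated at 7
print(p_value)                               # about 0.0848, as in the example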

11-4 TESTING FOR GOODNESS OF FIT


The hypothesis-testing procedures that we have discussed in previous sections are for problems in which the form of the density function of the random variable is known, and the hypotheses involve the parameters of the distribution. Another kind of hypothesis is often encountered: we do not know the probability distribution of the random variable under study, say X, and we wish to test the hypothesis that X follows a particular probability distribution. For example, we might wish to test the hypothesis that X follows the normal distribution.
In this section, we describe a formal goodness-of-fit test procedure based on the chi-square distribution. We also describe a very useful graphical technique called probability plotting. Finally, we give some guidelines useful in selecting the form of the population distribution.
The Chi-Square Goodness-of-Fit Test

The test procedure requires a random sample of size n of the random variable X, whose probability density function is unknown. These n observations are arrayed in a frequency histogram having k class intervals. Let Oᵢ be the observed frequency in the ith class interval. From the hypothesized probability distribution we compute the expected frequency in the ith class interval, denoted Eᵢ. The test statistic is

χ₀² = Σᵢ₌₁ᵏ (Oᵢ − Eᵢ)² / Eᵢ.    (11-107)

It can be shown that χ₀² approximately follows the chi-square distribution with k − p − 1 degrees of freedom, where p represents the number of parameters of the hypothesized


distribution estimated by sample statistics. This approximation improves as n increases. We would reject the hypothesis that X conforms to the hypothesized distribution if χ₀² > χ²_{α,k−p−1}.

One point to be noted in the application of this test procedure concerns the magnitude of the expected frequencies. If these expected frequencies are too small, then χ₀² will not reflect the departure of observed from expected, but only the smallness of the expected frequencies. There is no general agreement regarding the minimum value of expected frequencies, but values of 3, 4, and 5 are widely used as minimal. Should an expected frequency be too small, it can be combined with the expected frequency in an adjacent class interval. The corresponding observed frequencies would then be combined also, and k would be reduced by 1. Class intervals are not required to be of equal width.
We now give three examples of the test procedure.

Example 11-22
A Completely Specified Distribution A computer scientist has developed an algorithm for generating pseudorandom integers over the interval 0-9. He codes the algorithm and generates 1000 pseudorandom digits. The data are shown in Table 11-3. Is there evidence that the random number generator is working correctly?
If the random number generator is working correctly, then the values 0-9 should follow the discrete uniform distribution, which implies that each of the integers should occur about 100 times. That is, the expected frequencies are Eᵢ = 100, for i = 0, 1, ..., 9. Since these expected frequencies can be determined without estimating any parameters from the sample data, the resulting chi-square goodness-of-fit test will have k − p − 1 = 10 − 0 − 1 = 9 degrees of freedom.
The observed value of the test statistic is

χ₀² = Σᵢ₌₁ᵏ (Oᵢ − Eᵢ)²/Eᵢ = (94 − 100)²/100 + (93 − 100)²/100 + ··· + (94 − 100)²/100 = 3.72.

Since χ²_{0.05,9} = 16.92, we are unable to reject the hypothesis that the data come from a discrete uniform distribution. Therefore, the random number generator seems to be working satisfactorily.
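A sketch of the same goodness-of-fit computation in Python, assuming SciPy is available, uses the observed counts of Table 11-3 with an expected count of 100 for each digit:

from scipy import stats

observed = [94, 93, 112, 101, 104, 95, 100, 99, 108, 94]
chi2, p = stats.chisquare(observed, f_exp=[100] * 10)
print(chi2, p)     # chi2 = 3.72 on k - p - 1 = 9 degrees of freedom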

Example 11-23
A Discrete Distribution The number of defects in printed circuit boards is hypothesized to follow a Poisson distribution. A random sample of n = 60 printed boards has been collected, and the number of defects observed. The following data result:

Table 11-3 Data for Example 11-22

Digit                        0    1    2    3    4    5    6    7    8    9   Total
Observed frequencies, Oᵢ    94   93  112  101  104   95  100   99  108   94    1000
Expected frequencies, Eᵢ   100  100  100  100  100  100  100  100  100  100    1000

Observed

Number of Defects    Observed Frequency
0                    32
1                    15
2                    9
3                    4

The mean of the assumed Poisson distribution in this example is unknown and must be estimated from the sample data. The estimate of the mean number of defects per board is the sample average; that is, [32(0) + 15(1) + 9(2) + 4(3)]/60 = 0.75. From the cumulative Poisson distribution with parameter 0.75 we may compute the expected frequencies as Eᵢ = npᵢ, where pᵢ is the theoretical, hypothesized probability associated with the ith class interval and n is the total number of observations. The appropriate hypotheses are

H₀: p(x) = e^{−0.75}(0.75)ˣ / x!,  x = 0, 1, 2, ...,
H₁: p(x) is not Poisson with λ = 0.75.

We may compute the expected frequencies as follows:

Number of Failures    Probability    Expected Frequency
0                     0.472          28.32
1                     0.354          21.24
2                     0.133          7.98
≥3                    0.041          2.46

The expected frequencies are obtained by multiplying the sample size by the respective probabilities. Since the expected frequency in the last cell is less than 3, we combine the last two cells:

Number of Failures    Observed Frequency    Expected Frequency
0                     32                    28.32
1                     15                    21.24
≥2                    13                    10.44

The test statistic (which will have k − p − 1 = 3 − 1 − 1 = 1 degree of freedom) becomes

χ₀² = (32 − 28.32)²/28.32 + (15 − 21.24)²/21.24 + (13 − 10.44)²/10.44 = 2.94,

and since χ²_{0.05,1} = 3.84, we cannot reject the hypothesis that the occurrence of defects follows a Poisson distribution with mean 0.75 defects per board.
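The following Python sketch, assuming SciPy is available, repeats the Poisson calculation; small differences from the 2.94 above come only from rounding the expected frequencies in the hand computation.

from scipy import stats

observed = [32, 15, 13]                    # cells 0, 1, and >= 2 defects
n, lam = 60, 0.75
p = [stats.poisson.pmf(0, lam), stats.poisson.pmf(1, lam)]
p.append(1.0 - sum(p))                     # P(X >= 2)
expected = [n * pi for pi in p]            # about 28.34, 21.26, 10.40

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(chi2)                                # about 2.9, compared with chi2_{0.05,1} = 3.84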

Example 11-24
A Continuous Distribution A manufacturing engineer is testing a power supply used in a word-processing workstation. He wishes to determine whether output voltage is adequately described by a normal distribution. From a random sample of n = 100 units he obtains sample estimates of the mean and standard deviation, x̄ = 12.04 V and s = 0.08 V.
A common practice in constructing the class intervals for the frequency distribution used in the chi-square goodness-of-fit test is to choose the cell boundaries so that the expected frequencies Eᵢ = npᵢ are equal for all cells. To use this method, we want to choose the cell boundaries a₀, a₁, ..., aₖ for the k cells so that all the probabilities

pᵢ = P(aᵢ₋₁ ≤ X ≤ aᵢ)


are equal. Suppose we decide to use k = 8 cells. For the standard normal distribution, the intervals that divide the scale into eight equally likely segments are [0, 0.32), [0.32, 0.675), [0.675, 1.15), [1.15, ∞), and their four "mirror image" intervals on the other side of zero. Denoting these standard normal endpoints by a₀, a₁, ..., a₈, it is a simple matter to calculate the endpoints needed for the general normal problem at hand; namely, we define the new class interval endpoints by the transformation aᵢ′ = x̄ + saᵢ, i = 0, 1, ..., 8. For example, the sixth interval's right endpoint is

12.04 + (0.08)(0.675) = 12.094.

For each interval, pᵢ = 1/8 = 0.125, so the expected cell frequencies are Eᵢ = npᵢ = 100(0.125) = 12.5. The complete table of observed and expected frequencies is given in Table 11-4.
The computed value of the chi-square statistic is

χ₀² = Σᵢ₌₁⁸ (Oᵢ − 12.5)²/12.5 = 1.28.

Since two parameters in the normal distribution have been estimated, we would compare χ₀² = 1.28 to a chi-square distribution with k − p − 1 = 8 − 2 − 1 = 5 degrees of freedom. Using α = 0.10, we see that χ²_{0.10,5} = 9.24, and so we conclude that there is no reason to believe that output voltage is not normally distributed.
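The equal-probability cell boundaries are conveniently generated from the inverse standard normal distribution. A sketch in Python, assuming SciPy and NumPy are available, follows:

import numpy as np
from scipy import stats

xbar, s, k = 12.04, 0.08, 8
a = stats.norm.ppf(np.arange(1, k) / k)    # standard normal endpoints a1, ..., a7
bounds = xbar + s * a
print(bounds)   # 11.948, 11.986, 12.014, 12.040, 12.066, 12.094, 12.132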

Probability Plotting

Graphical methods are also useful when selecting a probability distribution to describe data. Probability plotting is a graphical method for determining whether the data conform to a hypothesized distribution, based on a subjective visual examination of the data. The general procedure is very simple and can be performed quickly. Probability plotting requires special graph paper, known as probability paper, that has been designed for the hypothesized distribution. Probability paper is widely available for the normal, lognormal, Weibull, and various chi-square and gamma distributions. To construct a probability plot, the observations in the sample are first ranked from smallest to largest. That is, the sample

Table 11-4 Observed and Expected Frequencies

Class Interval           Observed Frequency, Oᵢ    Expected Frequency, Eᵢ
x < 11.948               10                        12.5
11.948 ≤ x < 11.986      14                        12.5
11.986 ≤ x < 12.014      12                        12.5
12.014 ≤ x < 12.040      13                        12.5
12.040 ≤ x < 12.066      11                        12.5
12.066 ≤ x < 12.094      12                        12.5
12.094 ≤ x < 12.132      14                        12.5
12.132 ≤ x               14                        12.5
Totals                   100                       100


X₁, X₂, ..., Xₙ is arranged as X₍₁₎, X₍₂₎, ..., X₍ₙ₎, where X₍ⱼ₎ ≤ X₍ⱼ₊₁₎. The ordered observations X₍ⱼ₎ are then plotted against their observed cumulative frequency (j − 0.5)/n on the appropriate probability paper. If the hypothesized distribution adequately describes the data, the plotted points will fall approximately along a straight line; if the plotted points deviate significantly from a straight line, then the hypothesized model is not appropriate. Usually, the determination of whether or not the data plot as a straight line is subjective.

Example 11-25
To illustrate probability plotting, consider the following data:

−0.314, 1.080, 0.863, −0.179, −1.390, −0.563, 1.436, 1.153, 0.504, −0.801.


We hypothesize that these data are adequately modeled by a normal distribution. The observations are arranged in ascending order and their cumulative frequencies (j − 0.5)/n calculated as follows:

j     X₍ⱼ₎       (j − 0.5)/n
1     −1.390     0.05
2     −0.801     0.15
3     −0.563     0.25
4     −0.314     0.35
5     −0.179     0.45
6     0.504      0.55
7     0.863      0.65
8     1.080      0.75
9     1.153      0.85
10    1.436      0.95

The pairs of values X₍ⱼ₎ and (j − 0.5)/n are now plotted on normal probability paper; this plot is shown in Fig. 11-7. Most normal probability paper plots 100(j − 0.5)/n on the right vertical scale and 100[1 − (j − 0.5)/n] on the left vertical scale, with the variable value plotted on the horizontal scale. We have chosen to plot X₍ⱼ₎ versus 100(j − 0.5)/n on the right vertical scale in Fig. 11-7. A straight line,

[Figure 11-7 shows the ten points plotted on normal probability paper, with X₍ⱼ₎ on the horizontal axis and cumulative percent 100(j − 0.5)/n on the right vertical scale.]

Figure 11-7 Normal probability plot.


chosen subjectively, has been drawn through the plotted points. In drawing the straight line, one
should be influenced more by the points near the middle than the extreme points. Since the points fall
generally near the line, we conclude that a normal distribution describes the data.

We can obtain an estimate of the mean and standard deviation directly from the normal probability plot. We see from the straight line in Fig. 11-7 that the mean is estimated as the 50th percentile of the sample, or μ̂ = 0.10, approximately, and the standard deviation is estimated as the difference between the 84th and 50th percentiles, or σ̂ = 0.95 − 0.10 = 0.85, approximately.
A normal probability plot can also be constructed on ordinary graph paper by plotting the standardized normal scores Zⱼ against X₍ⱼ₎, where the standardized normal scores satisfy

(j − 0.5)/n = P(Z ≤ Zⱼ) = Φ(Zⱼ).

For example, if (j − 0.5)/n = 0.05, then Φ(Zⱼ) = 0.05 implies that Zⱼ = −1.64. To illustrate, consider the data from Example 11-25. In the table below we have shown the standardized normal scores in the last column:
j     X₍ⱼ₎       (j − 0.5)/n    Zⱼ
1     −1.390     0.05           −1.64
2     −0.801     0.15           −1.04
3     −0.563     0.25           −0.67
4     −0.314     0.35           −0.39
5     −0.179     0.45           −0.13
6     0.504      0.55           0.13
7     0.863      0.65           0.39
8     1.080      0.75           0.67
9     1.153      0.85           1.04
10    1.436      0.95           1.64

Figure 11-8 presents the plot of Zⱼ versus X₍ⱼ₎. This normal probability plot is equivalent to the one in Fig. 11-7. Many software packages will construct probability plots for various distributions. For a Minitab example, see Section 11-6.
[Figure 11-8 shows the equivalent plot of Zⱼ (vertical axis) against X₍ⱼ₎ (horizontal axis) on ordinary graph paper, both axes running from −2.0 to 2.0.]

Figure 11-8 Normal probability plot.
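The standardized normal scores are straightforward to compute in software. A sketch in Python, assuming SciPy and NumPy are available, reproduces the last column of the table above; plotting z against the ordered data (with any plotting package) then gives Fig. 11-8.

import numpy as np
from scipy import stats

x = np.sort([-0.314, 1.080, 0.863, -0.179, -1.390,
             -0.563, 1.436, 1.153, 0.504, -0.801])
n = len(x)
z = stats.norm.ppf((np.arange(1, n + 1) - 0.5) / n)   # -1.64, -1.04, ..., 1.64
print(np.column_stack((x, z)))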


Selecting the Form of a Distribution

The choice of the distribution hypothesized to fit the data is important. Sometimes analysts can use their knowledge of the physical phenomena to choose a distribution to model the data. For example, in studying the circuit board defect data in Example 11-23, a Poisson distribution was hypothesized to describe the data because failures are an "event per unit" phenomenon, and such phenomena are often well modeled by a Poisson distribution. Sometimes previous experience can suggest the choice of distribution.
In situations where there is no previous experience or theory to suggest a distribution that describes the data, analysts must rely on other methods. Inspection of a frequency histogram can often suggest an appropriate distribution. One may also use the display in Fig. 11-9 to assist in selecting a distribution that describes the data. When using Fig. 11-9, note that the β₂ axis increases downward. This figure shows the regions in the β₁, β₂ plane for several standard probability distributions, where

β₁ = E(X − μ)³ / [E(X − μ)²]^{3/2}

[Figure 11-9 is a chart of the β₁, β₂ plane (β₂ increasing downward) showing the points, lines, and regions occupied by several standard distributions, including the normal, uniform, exponential, gamma, and lognormal.]

Figure 11-9 Regions in the β₁, β₂ plane for various standard distributions. (Adapted from G. J. Hahn and S. S. Shapiro, Statistical Models in Engineering, John Wiley & Sons, New York, 1967; used with permission of the publisher and Professor E. S. Pearson, University of London.)


is a standardized measure of skewness and

β₂ = E(X − μ)⁴ / [E(X − μ)²]²

is a standardized measure of kurtosis (or peakedness). To use Fig. 11-9, calculate the sample estimates of β₁ and β₂, say

β̂₁ = M₃ / M₂^{3/2}  and  β̂₂ = M₄ / M₂²,

where

Mⱼ = Σₕ₌₁ⁿ (Xₕ − X̄)ʲ / n,  j = 1, 2, 3, 4,
and plot the point (β̂₁, β̂₂). If this plotted point falls reasonably close to a point, line, or area that corresponds to one of the distributions given in the figure, then this distribution is a logical candidate to model the data.
From inspecting Fig. 11-9 we note that all normal distributions are represented by the point β₁ = 0 and β₂ = 3. This is reasonable, since all normal distributions have the same shape. Similarly, the exponential and uniform distributions are each represented by a single point in the β₁, β₂ plane. The gamma and lognormal distributions are represented by lines, because their shapes depend on their parameter values. Note that these lines are close together, which may explain why some data sets are modeled equally well by either distribution. We also observe that there are regions of the β₁, β₂ plane for which none of the distributions in Fig. 11-9 is appropriate. Other, more general distributions, such as the Johnson or Pearson families of distributions, may be required in these cases. Procedures for fitting these families of distributions and figures similar to Fig. 11-9 are given in Hahn and Shapiro (1967).
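A sketch of the sample computation in Python, assuming NumPy is available, follows; for a large normal sample the plotted point should fall near (0, 3).

import numpy as np

def beta_estimates(x):
    # Sample moment ratios from the central moments M2, M3, M4.
    x = np.asarray(x, dtype=float)
    m2, m3, m4 = (np.mean((x - x.mean()) ** j) for j in (2, 3, 4))
    return m3 / m2 ** 1.5, m4 / m2 ** 2      # (beta1_hat, beta2_hat)

rng = np.random.default_rng(1)
print(beta_estimates(rng.normal(size=100000)))   # close to (0, 3)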

11-5 CONTINGENCY TABLE TESTS

Many times, the n elements of a sample from a population may be classified according to two different criteria. It is then of interest to know whether the two methods of classification are statistically independent; for example, we may consider the population of graduating engineers and wish to determine whether starting salary is independent of academic discipline. Assume that the first method of classification has r levels and that the second method of classification has c levels. We will let Oᵢⱼ be the observed frequency for level i of the first classification method and level j of the second classification method. The data would, in general, appear as in Table 11-5. Such a table is commonly called an r × c contingency table.
We are interested in testing the hypothesis that the row and column methods of classification are independent. If we reject this hypothesis, we conclude that there is some interaction between the two criteria of classification. Exact test procedures are difficult to obtain, but an approximate test statistic is valid for large n. Assume the Oᵢⱼ to be multinomial random variables and pᵢⱼ to be the probability that a randomly selected element falls in the ijth


Table 11-5 An r × c Contingency Table

                 Column
Row    1      2      ···    c
1      O₁₁    O₁₂    ···    O₁c
2      O₂₁    O₂₂    ···    O₂c
⋮      ⋮      ⋮             ⋮
r      Oᵣ₁    Oᵣ₂    ···    Oᵣc

cell, given that the two classifications are independent. Then pᵢⱼ = uᵢvⱼ, where uᵢ is the probability that a randomly selected element falls in row class i and vⱼ is the probability that a randomly selected element falls in column class j. Now, assuming independence, the maximum likelihood estimators of uᵢ and vⱼ are

ûᵢ = (1/n) Σⱼ₌₁ᶜ Oᵢⱼ,
v̂ⱼ = (1/n) Σᵢ₌₁ʳ Oᵢⱼ.    (11-108)

Therefore, assuming independence, the expected frequency of each cell is

Eᵢⱼ = nûᵢv̂ⱼ.    (11-109)

Then, for large n, the statistic

χ₀² = Σᵢ₌₁ʳ Σⱼ₌₁ᶜ (Oᵢⱼ − Eᵢⱼ)² / Eᵢⱼ    (11-110)

has an approximate chi-square distribution with (r − 1)(c − 1) degrees of freedom, and we would reject the hypothesis of independence if χ₀² > χ²_{α,(r−1)(c−1)}.

Example 11-26
A company has to choose among three pension plans. Management wishes to know whether the preference for plans is independent of job classification. The opinions of a random sample of 500 employees are shown in Table 11-6. We may compute û₁ = 340/500 = 0.68, û₂ = 160/500 = 0.32, v̂₁ = 200/500 = 0.40, v̂₂ = 200/500 = 0.40, and v̂₃ = 100/500 = 0.20. The expected frequencies may be computed from equation 11-109. For example, the expected number of salaried workers favoring pension plan 1 is E₁₁ = nû₁v̂₁ = 500(0.68)(0.40) = 136.

Table 11-7 Expected Frequencies for Example 11-26

                    Pension Plan
                    1      2      3      Total
Salaried workers    136    136    68     340
Hourly workers      64     64     32     160
Totals              200    200    100    500

The expected frequencies are shown in Table 11-7. The test statistic is computed from equation 11-110 as

χ₀² = Σᵢ₌₁² Σⱼ₌₁³ (Oᵢⱼ − Eᵢⱼ)² / Eᵢⱼ.

Since the computed value exceeds χ²_{0.05,2} = 5.99, we reject the hypothesis of independence and conclude that the preference for pension plans is not independent of job classification.

Using the two-way contingency table to test independence between two variables of classification in a sample from a single population of interest is only one application of contingency table methods. Another common situation occurs when there are r populations of interest and each population is divided into the same c categories. A sample is then taken from the ith population, and the counts are entered in the appropriate columns of the ith row. In this situation we want to investigate whether or not the proportions in the c categories are the same for all populations. The null hypothesis in this problem states that the populations are homogeneous with respect to the categories. For example, when there are only two categories, such as success and failure, defective and nondefective, and so on, then the test for homogeneity is really a test of the equality of r binomial parameters. Calculation of expected frequencies, determination of degrees of freedom, and computation of the chi-square statistic for the test for homogeneity are identical to the test for independence.
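A sketch of the contingency-table computation in Python, assuming SciPy is available, follows. The 2 × 3 table of observed counts below is hypothetical (Table 11-6 itself is not reproduced here); it is constructed only so that its margins match the totals used in Example 11-26.

from scipy import stats

observed = [[160, 140, 40],    # hypothetical salaried-worker counts
            [40, 60, 60]]      # hypothetical hourly-worker counts
chi2, p, df, expected = stats.chi2_contingency(observed)
print(chi2, p, df)             # df = (2 - 1)(3 - 1) = 2
print(expected)                # matches Table 11-7: 136, 136, 68 / 64, 64, 32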

11-6 SAMPLE COMPUTER OUTPUT

There are many statistical packages available that can be used to construct confidence intervals, carry out tests of hypotheses, and determine sample size. In this section we present results for several problems using Minitab.

Example 11-27
A study was conducted on the tensile strength of a particular fiber under various temperatures. The results of the study (given in MPa) are

226, 237, 272, 245, 428, 298, 345, 201, 327, 301, 317, 395, 332, 238, 367.


Suppose it is of interest to determine whether the mean tensile strength is greater than 250 MPa, that is, to test

H₀: μ = 250,
H₁: μ > 250.

A normal probability plot was constructed for the tensile strength data and is given in Fig. 11-10. The normality assumption appears to be satisfied. The population variance for tensile strength is assumed to be unknown, and as a result a single-sample t-test will be used for this problem.
The results from Minitab for hypothesis testing and the confidence interval on the mean are

Test of mu = 250 vs mu > 250

Variable     N     Mean    StDev   SE Mean
Strength    15    301.9     65.9      17.0

Variable    95.0% Lower Bound       T       P
Strength               272.0     3.05   0.004

The P-value is reported as 0.004, leading us to reject the null hypothesis and conclude that the mean tensile strength is greater than 250 MPa. The lower one-sided 95% confidence interval is given as 272.0 < μ.

Example 11-28
Reconsider Example 11-17, comparing two methods for predicting the shear strength of steel plate girders. The Minitab output for the paired t-test using α = 0.10 is
Paired T for Karlsruhe - Lehigh

               N      Mean     StDev   SE Mean
Karlsruhe      9    1.3401    0.1460    0.0487
Lehigh         9    1.0662    0.0494    0.0165
Difference     9    0.2739    0.1351    0.0450

90% CI for mean difference: (0.1901, 0.3576)
T-Test of mean difference = 0 (vs not = 0): T-Value = 6.08  P-Value = 0.000

[Figure 11-10 shows a normal probability plot of the tensile strength data, with tensile strength (200 to 400 MPa) on the horizontal axis and cumulative probability (0.001 to 0.999) on the vertical axis.]

Figure 11-10 Normal probability plot for Example 11-27.


The results of the Minitab output are in agreement with the results found in Example 11-17. Minitab also provides the appropriate confidence interval for the problem. Using α = 0.10, the level of confidence is 0.90; the 90% confidence interval on the difference between the two methods is (0.1901, 0.3576). Since the confidence interval does not contain zero, we again conclude that there is a significant difference between the two methods.

Example 11-29
The number of airline flights canceled is recorded for all airlines for each day of service. The number of flights recorded and the number of these flights that were canceled on a single day in March 2001 are provided below for two major airlines.

Airline                  # of Flights    # of Canceled Flights    Proportion
American Airlines        2128            115                      0.0540
America West Airlines    635             49                       0.0772

Is there a significant difference in the proportion of canceled flights for the two airlines? The hypotheses of interest are H₀: p₁ = p₂ versus H₁: p₁ ≠ p₂. The Minitab results for a two-sample test on proportions and a two-sided confidence interval on the difference in proportions are

Sample      X       N    Sample p
1         115    2128    0.054041
2          49     635    0.077165

Estimate for p(1) - p(2): -0.0231240
95% CI for p(1) - p(2): (-0.0459949, -0.000253139)
Test for p(1) - p(2) = 0 (vs not = 0): Z = -1.98  P-Value = 0.048

The P-value is given as 0.048, indicating that there is a significant difference between the proportions of flights canceled for American and America West airlines at a 5% level of significance. The 95% confidence interval of (−0.0460, −0.0003) indicates that America West Airlines had a statistically significantly higher proportion of canceled flights than American Airlines for the single day.

Example 11-30
The mean compressive strength for a particular high-strength concrete is hypothesized to be μ = 20 MPa. It is known that the standard deviation of compressive strength is σ = 1.3 MPa. A group of engineers wants to determine the number of concrete specimens that will be needed in the study to detect a decrease in the mean compressive strength of two standard deviations. If the average compressive strength is actually less than μ − 2σ, they want to be confident of correctly detecting this significant difference. In other words, the test of interest would be H₀: μ = 20 versus H₁: μ < 20. For this study, the significance level is set at α = 0.05 and the power of the test is 1 − β = 0.99. What is the minimum number of concrete specimens that should be used in this study? For a difference of 2σ, or 2.6 MPa, α = 0.05, and 1 − β = 0.99, the minimum sample size can be found using Minitab. The resulting output is
Testing mean = null (versus < null)
Calculating power for mean = null + difference
Alpha = 0.05  Sigma = 1.3

              Sample    Target    Actual
Difference      Size     Power     Power
      -2.6         6    0.9900    0.9936


The minimum number of specimens to be used in the study should be n = 6 in order to attain the desired power and level of significance.
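A rough way to check this result without Minitab is to evaluate the power of the one-sided t-test directly from the noncentral t distribution. The sketch below is in Python and assumes SciPy and NumPy are available; it searches for the smallest n whose power reaches 0.99.

import numpy as np
from scipy import stats

alpha, sigma, diff = 0.05, 1.3, 2.6
for n in range(2, 20):
    ncp = (diff / sigma) * np.sqrt(n)            # noncentrality, 2*sqrt(n)
    tcrit = stats.t.ppf(1 - alpha, n - 1)
    power = stats.nct.sf(tcrit, n - 1, ncp)      # P(reject H0 | 2-sigma decrease)
    if power >= 0.99:
        print(n, power)                          # n = 6, power about 0.9936
        break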

Example 11-31
A manufacturer of rubber belts wishes to inspect and control the number of nonconforming belts produced on line. The proportion of nonconforming belts that is acceptable is p = 0.01. For practical purposes, if the proportion increases to p = 0.035 or greater, the manufacturer wants to detect this change. That is, the test of interest would be H₀: p = 0.01 versus H₁: p > 0.01. If the acceptable level of significance is α = 0.05 and the power is 1 − β = 0.95, how many rubber belts should be selected for inspection? For α = 0.05 and 1 − β = 0.95, the appropriate sample size can be determined using Minitab. The output is

Testing proportion = 0.01 (versus > 0.01)
Alpha = 0.05

Alternative    Sample    Target    Actual
Proportion       Size     Power     Power
   3.50E-02       348    0.9500    0.9502

Therefore, to adequately detect a significant change in the proportion of nonconforming rubber belts, random samples of at least n = 348 would be needed.
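The normal-approximation sample-size formula for a one-sided test on a proportion gives essentially the same answer. A sketch in Python, assuming SciPy and NumPy are available:

import numpy as np
from scipy import stats

alpha, beta = 0.05, 0.05
p0, p1 = 0.01, 0.035
za, zb = stats.norm.ppf(1 - alpha), stats.norm.ppf(1 - beta)

n = ((za * np.sqrt(p0 * (1 - p0)) + zb * np.sqrt(p1 * (1 - p1))) / (p1 - p0)) ** 2
print(int(np.ceil(n)))     # 348, matching the Minitab output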

11-7 SUMMARY

This chapter has introduced hypothesis testing. Procedures for testing hypotheses on means and variances are summarized in Table 11-8. The chi-square goodness-of-fit test was introduced to test the hypothesis that an empirical distribution follows a particular probability law. Graphical methods are also useful in goodness-of-fit testing, particularly when sample sizes are small. Two-way contingency tables for testing the hypothesis that two methods of classification of a sample are independent were also introduced. Several computer examples were also presented.

11-8 EXERCISES
11-1. The breaking strength of a fiber used in manufacturing cloth is required to be at least 160 psi. Past experience has indicated that the standard deviation of breaking strength is 3 psi. A random sample of four specimens is tested, and the average breaking strength is found to be 153 psi.
(a) Should the fiber be judged acceptable with α = 0.05?
(b) What is the probability of accepting H₀: μ ≥ 160 if the fiber has a true breaking strength of 165 psi?
11-2. The yield of a chemical process is being studied. The variance of yield is known from previous experience with this process to be 5 (units of σ² = percentage²). The past five days of plant operation have resulted in the following yields (in percentages): 91.6, 88.75, 90.8, 89.95, 91.3.
(a) Is there reason to believe the yield is less than 90%?
(b) What sample size would be required to detect a true mean yield of 85% with probability 0.95?

11-3. The diameters of bolts are known to have a standard deviation of 0.0001 inch. A random sample of 10 bolts yields an average diameter of 0.2546 inch.
(a) Test the hypothesis that the true mean diameter of bolts equals 0.255 inch, using α = 0.05.
(b) What size sample would be necessary to detect a true mean bolt diameter of 0.2551 inch with a probability of at least 0.90?

11-4. Consider the data in Exercise 10-39.
(a) Test the hypothesis that the mean piston ring diameter is 74.035 mm. Use α = 0.01.
(b) What sample size is required to detect a true mean diameter of 74.030 with a probability of at least 0.95?


Table 11-8 Summary of Hypothesis Testing Procedures on Means and Variances

Null hypothesis: H₀: μ = μ₀, σ² known
Test statistic: Z₀ = (X̄ − μ₀)/(σ/√n)
  H₁: μ ≠ μ₀    reject if |Z₀| > Z_{α/2};    OC curve parameter d = |μ − μ₀|/σ
  H₁: μ > μ₀    reject if Z₀ > Z_α;          d = (μ − μ₀)/σ
  H₁: μ < μ₀    reject if Z₀ < −Z_α;         d = (μ₀ − μ)/σ

Null hypothesis: H₀: μ = μ₀, σ² unknown
Test statistic: t₀ = (X̄ − μ₀)/(S/√n)
  H₁: μ ≠ μ₀    reject if |t₀| > t_{α/2,n−1};    d = |μ − μ₀|/σ
  H₁: μ > μ₀    reject if t₀ > t_{α,n−1};        d = (μ − μ₀)/σ
  H₁: μ < μ₀    reject if t₀ < −t_{α,n−1};       d = (μ₀ − μ)/σ

Null hypothesis: H₀: μ₁ = μ₂, σ₁² and σ₂² known
Test statistic: Z₀ = (X̄₁ − X̄₂)/√(σ₁²/n₁ + σ₂²/n₂)
  H₁: μ₁ ≠ μ₂    reject if |Z₀| > Z_{α/2};    d = |μ₁ − μ₂|/√(σ₁² + σ₂²)
  H₁: μ₁ > μ₂    reject if Z₀ > Z_α;          d = (μ₁ − μ₂)/√(σ₁² + σ₂²)
  H₁: μ₁ < μ₂    reject if Z₀ < −Z_α;         d = (μ₂ − μ₁)/√(σ₁² + σ₂²)

Null hypothesis: H₀: μ₁ = μ₂, σ₁² = σ₂² unknown
Test statistic: t₀ = (X̄₁ − X̄₂)/[S_p √(1/n₁ + 1/n₂)]
  H₁: μ₁ ≠ μ₂    reject if |t₀| > t_{α/2,n₁+n₂−2};    d = |μ₁ − μ₂|/2σ
  H₁: μ₁ > μ₂    reject if t₀ > t_{α,n₁+n₂−2};        d = (μ₁ − μ₂)/2σ
  H₁: μ₁ < μ₂    reject if t₀ < −t_{α,n₁+n₂−2};       d = (μ₂ − μ₁)/2σ

Null hypothesis: H₀: μ₁ = μ₂, σ₁² ≠ σ₂² unknown
Test statistic: t₀ = (X̄₁ − X̄₂)/√(S₁²/n₁ + S₂²/n₂), with
  ν = (S₁²/n₁ + S₂²/n₂)² / [(S₁²/n₁)²/(n₁ + 1) + (S₂²/n₂)²/(n₂ + 1)] − 2
  H₁: μ₁ ≠ μ₂    reject if |t₀| > t_{α/2,ν}
  H₁: μ₁ > μ₂    reject if t₀ > t_{α,ν}
  H₁: μ₁ < μ₂    reject if t₀ < −t_{α,ν}

Null hypothesis: H₀: σ² = σ₀²
Test statistic: χ₀² = (n − 1)S²/σ₀²
  H₁: σ² ≠ σ₀²    reject if χ₀² > χ²_{α/2,n−1} or χ₀² < χ²_{1−α/2,n−1};    λ = σ/σ₀
  H₁: σ² > σ₀²    reject if χ₀² > χ²_{α,n−1};        λ = σ/σ₀
  H₁: σ² < σ₀²    reject if χ₀² < χ²_{1−α,n−1};      λ = σ/σ₀

Null hypothesis: H₀: σ₁² = σ₂²
Test statistic: F₀ = S₁²/S₂²
  H₁: σ₁² ≠ σ₂²    reject if F₀ > F_{α/2,n₁−1,n₂−1} or F₀ < F_{1−α/2,n₁−1,n₂−1};    λ = σ₁/σ₂
  H₁: σ₁² > σ₂²    reject if F₀ > F_{α,n₁−1,n₂−1};    λ = σ₁/σ₂

11-5. Consider the data in Exercise 10-40. Test the hypothesis that the mean life of the light bulbs is 1000 hours. Use α = 0.05.

11-6. Consider the data in Exercise 10-41. Test the hypothesis that the mean compressive strength equals 3500 psi. Use α = 0.01.

11-7. Two machines are used for filling plastic bottles with a net volume of 16.0 ounces. The filling processes can be assumed normal, with standard deviations σ₁ = 0.015 and σ₂ = 0.018. Quality engineering suspects that both machines fill to the same net volume, whether or not this volume is 16.0 ounces. A random sample is taken from the output of each machine.

Machine 1                  Machine 2
16.03    16.01             16.02    16.03
16.04    15.96             15.97    16.04
16.05    15.98             15.96    16.02
16.05    16.02             16.01    16.01
16.02    15.99             15.99    16.00


(a) Do you think that quality engineering is correct? Use α = 0.05.
(b) Assuming equal sample sizes, what sample size should be used to assure that β = 0.05 if the true difference in means is 0.075? Assume that α = 0.05.
(c) What is the power of the test in (a) for a true difference in means of 0.075?
11-8. The film development department of a local store is considering the replacement of its current film-processing machine. The time it takes the machine to completely process a roll of film is important. A random sample of 12 rolls of 24-exposure color film is selected for processing by the current machine. The average processing time is 8.1 minutes, with a sample standard deviation of 1.4 minutes. A random sample of 10 rolls of the same type of film is selected for testing in the new machine. The average processing time is 7.3 minutes, with a sample standard deviation of 0.9 minutes. The local store will not purchase the new machine unless the processing time is more than 2 minutes shorter than that of the current machine. Based on this information, should they purchase the new machine?

11-9. Consider the data in Exercise 10-45. Test the hypothesis that both machines fill to the same volume. Use α = 0.10.

11-10. Consider the data in Exercise 10-46. Test H₀: μ₁ = μ₂ against H₁: μ₁ > μ₂, using α = 0.05.

11-11. Consider the gasoline road octane number data in Exercise 10-47. If formulation 2 produces a higher road octane number than formulation 1, the manufacturer would like to detect this. Formulate and test an appropriate hypothesis, using α = 0.05.

11-12. The lateral deviation in yards of a certain type of mortar shell is being investigated by the propellant manufacturer. The following data have been observed.

Round    Deviation    Round    Deviation
1        11.28        6        6.25
2        −9.48        7        10.11
3        −10.42       8        −0.65
4        −0.51        9        −0.68
5        1.95         10       6.47

Test the hypothesis that the mean lateral deviation of these mortar shells is zero. Assume that lateral deviation is normally distributed.

11-13. The shelf life of a photographic film is of interest to the manufacturer. The manufacturer observes the following shelf life for eight units chosen at random from the current production. Assume that shelf life is normally distributed.

108 days    128 days
134         163
124         159
116         134

(a) Is there any evidence that the mean shelf life is greater than or equal to 125 days?
(b) If it is important to detect a ratio of δ/σ of 1.0 with a probability 0.90, is the sample size sufficient?

11-14. The titanium content of an alloy is being studied in the hope of ultimately increasing the tensile strength. An analysis of six recent heats chosen at random produces the following titanium contents.

8.0%    7.7%
9.9     11.6
9.9     14.6

Is there any evidence that the mean titanium content is greater than 9.5%?

11-15. An article in the Journal of Construction Engineering and Management (1999, p. 39) presents some data on the number of work hours lost per day on a construction project due to weather-related incidents. Over 11 workdays, the following lost work hours were recorded.

8.8     12.2
8.8     13.3
12.5    6.9
5.4     9.1
12.8    2.2
14.7

Assuming work hours are normally distributed, is there any evidence to conclude that the mean number of work hours lost per day is greater than 8 hours?
11-16. The percentage of scrap produced in a new finishing operation is hypothesized to be less than 7.5%. Several days were chosen at random and the percentages of scrap were calculated.

5.51%    7.32%
6.49     8.81
6.46     8.56
5.37     7.46

(a) In your opinion, is the true scrap rate less than 7.5%?
(b) If it is important to detect a ratio of δ/σ of 1.5 with a probability of at least 0.90, what is the minimum sample size that can be used?
(c) For δ/σ = 2.0, what is the power of the above test?

11-17. Suppose that we must test the hypotheses

H₀: μ ≥ 15,
H₁: μ < 15,

where it is known that σ² = 2.5. If α = 0.05 and the true mean is 12, what sample size is necessary to assure a type II error of 5%?

11-18. An engineer desires to test the hypothesis that the melting point of an alloy is 1000°C. If the true melting point differs from this by more than 20°C, he must change the alloy's composition. If we assume that the melting point is a normally distributed random variable, α = 0.05, β = 0.10, and σ = 10°C, how many observations should be taken?
11-19. Two methods for producing gasoline from crude oil are being investigated. The yields of both processes are assumed to be normally distributed. The following yield data have been obtained from the pilot plant.

Process    Yields (%)
1          24.2    26.6    25.7
2          21.0    22.1    21.8    20.9    22.4    22.0

(a) Is there reason to believe that process 1 has a greater mean yield? Use α = 0.01. Assume that both variances are equal.
(b) Assuming that in order to adopt process 1 it must produce a mean yield that is at least 5% greater than that of process 2, what are your recommendations?
(c) Find the power of the test in part (a) if the mean yield of process 1 is 5% greater than that of process 2.
(d) What sample size is required for the test in part (a) to ensure that the null hypothesis will be rejected with probability 0.90 if the mean yield of process 1 exceeds the mean yield of process 2 by 5%?

11-20. An article that appeared in the Proceedings of the 1998 Winter Simulation Conference (1998, p. 1079) discusses the concept of validation for traffic simulation models. The stated purpose of the study is to design and modify the facilities (roadways and control devices) to optimize efficiency and safety of traffic flow. Part of the study compares speeds observed at various intersections with speeds simulated by a model being tested. The goal is to determine whether the simulation model is representative of the actual observed speed. Field data are collected at a particular location, and then the simulation model is implemented. Fourteen speeds (ft/sec) are measured at a particular location, and fourteen observations are simulated using the proposed model. The data are:

Field    53.33    53.33    53.33    55.17    55.17    55.17    57.14
         57.14    57.14    61.54    61.54    61.54    65.80    69.57
Model    47.40    49.80    51.90    52.20    54.50    55.70    56.70
         58.20    59.00    60.10    63.40    69.57    71.30    75.40

Assuming the variances are equal, conduct a hypothesis test to determine whether there is a significant difference between the field data and the model-simulated data. Use α = 0.05.

11-21. The following are the burning times (in minutes) of flares of two different types.

Type 1              Type 2
63      81          82      64
68      72          63      57
66      59          75      73
83      74          82      82
59      65          56      82

(a) Test the hypothesis that the two variances are equal. Use α = 0.05.
(b) Using the results of (a), test the hypothesis that the mean burning times are equal.

11-22. A new filtering device is installed in a chemical unit. Before its installation, a random sample yielded the following information about the percentage of impurity: x̄₁ = 12.5, s₁² = 101.17, and n₁ = 8. After installation, a random sample yielded x̄₂ = 10.2, s₂² = 94.73, and n₂ = 9.
(a) Can you conclude that the two variances are equal?
(b) Has the filtering device reduced the percentage of impurity significantly?

11-23. Suppose that two random samples were drawn from normal populations with equal variances. The sample data yield x̄₁ = 20.0, n₁ = 10, Σ(x₁ⱼ − x̄₁)² = 1480, x̄₂ = 15.8, n₂ = 10, and Σ(x₂ⱼ − x̄₂)² = 1425.
(a) Test the hypothesis that the two means are equal. Use α = 0.01.
(b) Find the probability that the null hypothesis in (a) will be rejected if the true difference in means is 10.
(c) What sample size is required to detect a true difference in means of 5 with probability at least 0.80 if it is known at the start of the experiment that a rough estimate of the common variance is 150?

11-24. Consider the data in Exercise 10-56.

(a) Test the hypothesis that the means of the two normal distributions are equal. Use α = 0.05 and assume that σ₁² = σ₂².
(b) What sample size is required to detect a difference in means of 2.0 with a probability of at least 0.85?
(c) Test the hypothesis that the variances of the two distributions are equal. Use α = 0.05.
(d) Find the power of the test in (c) if the variance of one population is four times the other.
11-25. Consider the data in Exercise 10-57. Assuming that σ₁² = σ₂², test the hypothesis that the mean rod diameters do not differ. Use α = 0.05.
11-26. A chemical company produces a certain drug whose weight has a standard deviation of 4 mg. A new method of producing this drug has been proposed, although some additional cost is involved. Management will authorize a change in production technique only if the standard deviation of the weight in the new process is less than 4 mg. If the standard deviation of weight in the new process is as small as 3 mg, the company would like to switch production methods with a probability of at least 0.90. Assuming weight to be normally distributed and α = 0.05, how many observations should be taken? Suppose the researchers choose n = 10 and obtain the data below. Is this a good choice for n? What should be their decision?

16.630 grams    16.628 grams
16.631          16.622
16.624          16.627
16.622          16.623
16.626          16.618
11-27. A manufacturer of precision measuring instruments claims that the standard deviation in the use of the instrument is 0.00002 inch. An analyst, who is unaware of the claim, uses the instrument eight times and obtains a sample standard deviation of 0.00005 inch.
(a) Using α = 0.01, is the claim justified?
(b) Compute a 99% confidence interval for the true variance.
(c) What is the power of the test if the true standard deviation equals 0.00004?
(d) What is the smallest sample size that can be used to detect a true standard deviation of 0.00004 with a probability of at least 0.95? Use α = 0.01.

11-28. The standard deviation of measurements made by a special thermocouple is supposed to be 0.005 degree. If the standard deviation is as great as 0.010, we wish to detect it with a probability of at least 0.90. Use α = 0.01. What sample size should be used? If this sample size is used and the sample standard deviation is s = 0.007, what is your conclusion, using α = 0.01? Construct a 95% upper-confidence interval for the true variance.
11-29. The manufacturer of a power supply is interested in the variability of output voltage. He has tested 12 units, chosen at random, with the following results:

5.34    5.54    4.76
5.00    5.44    5.65
5.07    4.61    5.55
5.25    5.35    5.35

(a) Test the hypothesis that σ² = 0.5. Use α = 0.05.
(b) If the true value of σ² = 1.0, what is the probability that the hypothesis in (a) will be rejected?

11-30. For the data in Exercise 11-7, test the hypothesis that the two variances are equal, using α = 0.01. Does the result of this test influence the manner in which a test on means would be conducted? What sample size is necessary to detect σ₁²/σ₂² = 2.5 with a probability of at least 0.90?
11-31. Consider the following two samples, drawn from two normal populations.

Sample 1    Sample 2
4.34        1.87
5.00        2.00
4.97        2.00
4.25        1.85
5.55        2.11
6.55        2.31
6.37        2.28
5.55        2.07
3.76        1.76
            1.91
            2.00

Is there evidence to conclude that the variance of population 1 is greater than the variance of population 2? Use α = 0.01. Find the probability of detecting σ₁²/σ₂² = 4.0.

11-32. Two machines produce metal parts. The variance of the weight of these parts is of interest. The following data have been collected.

Machine 1        Machine 2
n₁ = 25          n₂ = 30
x̄₁ = 0.984      x̄₂ = 0.907
s₁² = 13.46     s₂² = 9.65

(a) Test the hypothesis that the variances of the two machines are equal. Use α = 0.05.
(b) Test the hypothesis that the two machines produce parts having the same mean weight. Use α = 0.05.

In a hardness test. a steel ball is press.ed into


the material being tested at a standard load, The diam~
eter of the indentation is measured, which is related to
the hardness, Two types of steel balls are available,
and thei:' pe:iormance is compa.red on 10 specimens.
Each speci::nen is tested twice. once with each ball.
The results are given below:
11~33.

75 46 57 43 58 32 61 56 34 65
52 41 43 47 32 49 52 44 57 60

Ball x
Bally

Test the hypothesis that the two: steel balls give the
same expected hardness measurement. Use a= 0.05,
1l~34.

1\.'1o:.ypes of exercise equ;p:nent.A and B. for


handicapped individuals are often !..!Sed to determine
the effect of the particul:u exercise aD heart rate (in
beats per minute), Seven subjectS participated In a
study to determine whether the two types of equipment have the same effect on hem rate, The results
are given in the table below,
Subject

1
2
3
4

162
163
140
191
160
158
155

161
187
199
206
161
160
162

5
6
7

an appropriate tCSt of hypo:hesis to deter~


mine whether there is a significant difference in heart
rate due to the type of equipmcnt used.
CQnd~ct

1135. An aircraft designer has theoretical e..1dence


t.b.at painting lhe airplane reduces its speed at a speci~
fied power and flap setting. He tests six consecutive
airplanes from the assembly line before and after
painting. The results are shown below.
Top Speed (mph)

Airplane

Painted

1
2
3

286
285
279
283
281
286

5
6

Not Pairted

289
286
283
288
183
289

317

Do the d&.ta support the designer's theory? Use

a=O.OS.
1136. An article in the Internation.al Journal of
Fatigue (1998. p. 537) discusses the bending fat~gue
resistance of ge::\!" teeth when using a pa.."ticular prestressing or presetting process. Presetting of a gear
tooth is obtained by applying and then removing a
single overload to the machine element. To detcrmine
significant differences in fatigue resistance due to presetti:1g. fatigue data were
A "preset" tooth and
a "nonpreset" tooth were paired if t.'1.ey were present
on the same gear. Eleven pairs were formed a:ld :he
fatigue life measured for e:lCrL (The final response of
inrerest is In[(fatigue life) X 10-'n
Pair

3
4
5

6
7

8
9

10
11

Preset Tooth

Nonpreset Tooth

3.813
4.025
3.042
3.831
3.320
3.080
2.498
2.417
2.462
2236
3,932

2.706
2.364
2,773
2558
2.430
2.616
2.765
2.486
2.688
2.700
2.8:0

Conduct a test of hypothesis to de:ermine whether


presetfulg significantly i::creascs the fatigue life of
gear teeth. Use a=O.10.
11-37. Consider the deta in Exercise 1066. Test the
hypothesis that the uninsa.w. rate is 10%. Use a:;;; 0.05.
11~3S.. Consider the data in Exercise 10-68, Test the
hypothesis t.13t the fraction of defective clLculators
produced is 2.5%.

11~39.

Suppose that we . . . -ish to

~est

the hypothesis

Ho: PI = fJ.:. against the alternative HI: iJl ;{; 112, where
botb variances ~ and

a; are kno'n'Il. A total of fl.[ + nz

= N observations can be taken. How should these

observations be allocated to the two populations to


mv:imize the probability thatHowill be rejected if H j
is true. and JlI - P1. : : : b;{; O'?
11-40. Consider the ~:r:ion membership study
described in Exercise 10-70, Test the hypothesis that
the proportion of men who belong to a union does not
differ from the proportion of women who belo:tg to a
union, Use a=0.05.
1l~41. Using the data in :a~se 10-71. determine
Whether it is reasonable to conclude that production
line 2 proeuced a higher fraction of defective prcd:lct
t.ian ili:.e 1. Use a = 0.01,

318

Chapter 11

Tests of Hypothe:>es

11-42. Two different types of injectio::-molding


machines are used to fonn plastic parts. A part is considered defective if it has excessive shrinkage or is
discolored. Two random samples, each of size 500,
are selected, and 32 defective parr.3 are found lj. the
sample from machine 1. while 21 defectl.',le part<; are
found in the sample from machine 2. Is it reasonable
to co::c1ude that both machines produce tie same medon of defective parts?

1143. Suppose that we wish to test Ho: PI = fi2


agains~ H!: f.lj ;i: il2., where ~ and 0; are known. The
total sample size N is fLxed, bu',; the allocation of
observations to tl'lC two populations such that r.( + "''2
= N is to be made on ilie basis of cost. If the costs of
sampling for populations 1 and 2 are C t and
rcspectively, find the minimum cost sample sizes that
provide a specified va."iance for the difference in sam~
pie means.
11 ~44. A manufacturer of a new pain relief tab1et
would like to demonstrate tnat her product works
twice as fast as hetcompetltor's product Specifically,
she would like to test

Ho: 1'-: = 2/11.


H,: 1'-, > 2/11.

Derive an expression similar to equation 1120 for :he /3 error for the test of the equfuity of the
variances of two normal clistributions. Assume that the
l.VtO-sided alternative is specified

1147. The number of defective -:.nits found each day


by an in-circci: functional tester in a printed circuit
board assembly process is shov.'U below.

31-35

--

36-40
41-45

Number of
Defeetsi

Times Observed
6
11
16

Number of Wafers
with i Defects
4

13
2

34

56
70
70
58

5
6
7
8
9
10

12

11~46.

1}-10
11-15
16-20
21-25
26-30

11-48. Defects on wafer su.'1'aces in integrated cirelli:


fabrication are unavoidable. In a particular process the
follol1ting data were collected.

11

where !J.l is the mean absorption time of ::he competitive product and f11. is t.ie mean absorption time of the
nevi product. Assuming that ::he variances ~ and 0;are known. suggest a procedure for testing this
hypot.iesis.
11-45. Derive an expression si.'TIi1ar to cquation 1120 for the /3 error for the test on the variance of a nor~
mal distribution. Assume that tie two-sided
alternative is speCified.

Number of
Defectives

(a) It is reaso:::able to conclude that these data come


from a normal dist."ibution? Use a chi-square
goodncss-of-fit test
(b) Plot the data on nonnal probability paper. Docs a.1.
assumption of norrn.ality seem justified?

42

25
15

9
3
1

Doos the assumption of a Poisson distribution seem


appropriate as a probability model for:his process?

1149. A pseudora.."1dom number generator is


designed so that integers 0 thro~gh 9 have an equal
proba:rility of occurrence. The.first 10.000 numbers
are as follows:
0123456789
967 1008975 1)22 1003 989 1001 981 1043 1011

Does this generator seem to be working properly?


11-50. The cycle time of an automatic machine !las
been observed and recorded.

(a) Does the nor:tXU11 distribution seem to be a reason~


able probability' model for the cycle time? Use the
chi-square goodness-of~fit test.
(b) Plot the data on IlQrmal probability ;>aper. DOO1l

the assumption of normality seem reasonable?

28
22
19
lL
4

11",51. A soft drink bottler is s:udying the interru!l


pressure strength of l-liter glass nor.returnable bottIes. A random sample of 16 borGes is tested and the
pressure strengL~s obtained, The data are shown
below. Plot these data on normal probabiliry paper.

11-8 Exercises

Does it seem reasonable to conclude that presst:..'"e


strength is normally distributed?
226,16 psi
202,20
219.54
193,73
208.15
;95.45
;93,71
200,81
11~2.

211.14 psi
203,62
188,12
224.39
221.31
204.55
202.21
201.63

tions and ranges. Would you conclude that deflection


and range are independent?

RaDge (yards)

0- 1.999
2,000 - 5,999
6,000 - 1l.999

A company operates four machines for three

shifts each day. From production records. the following data on the number of breakdowns ate collected.

319

Latera: Deflection
Left
Normal
Rigt.t
14

6
9
8

II

8
4

17

11-56. A stc:.dy is being made of the failures of an


electronic componer.t There ate four t}"pcs of fail'JIes
possible arid two mounting positions for the deviee.
The folloVling data have been taken.
Fellure Type

Mounting Position
2
3

31
15

II
17

14
10

9
16

Test the hypothesis that breakdowns are independent


of the shift.
11-53. Patients in a hospital are classified as surgical
or r:.tedicaL A record is ~ept of the number of tmes
patients require nu;:sing service during t.~e night and
whether these patients are or. Medicare or not. The
data are as follows:

22

B
46

17

Surgical

Medical

Yes

46

No

36

52
43

Inactivc

Active

216
226

245

Low

Operations Research Grade

Grade

Other

25

B
C
Other

17
18
10

16

17
15
18

13
6
10
20

11

9
12

11-57. An article in Research in Nursing and Health


(1999, p. 263) summarizes data collected from a previous study (Research in Nursing and Healih, 1998, p.
285) on the rela.tionship between physical activity and
socio-economic status of 1507 Caucasian women. The
data artl given in the table below.

Socia-economic Stat'.!s

11~54. Grades L'1 a statistics course and aD operations


research course taken simultaneously were as follows
for a group of students.

4
8

18
6

Physical Activtty

Test the hypothesis that caJ.s by surgical-medical


patients are independen: of whether the patier.rs are
receiving Medicare.

Statistics

Would you conclude that the t:,rpe of failure is independent of the moor-ting position'?

Patient Category

Medicare

f:..:e the grades in statistics and operations research


related?
1155. All experiment with artillery shells yields the
following data on the characteristics of laterdl deflec-

Medium
H:gh

409
297

114

Test the hypothesis that physical activity is bdepend~


ent of socia-economic status.

11-58. Fabric is graded into three classifications: A.


E, and C. The results below were obtained from five
looms. Is fabric classification independent of th~
loom?
::J~ber of Pieces

of Fabric
in Fabric Classi5catio:::

Loom

2
3
4
5

185
190
170
,58
185

16

12

24
35

21
16
7

22
22

15

320

ChaptCI 11 Tests of Hypotheses

1l~59. An article in the Joumal of Marketing


Research (1970, p. 36) reports a study of the relation~
ship between facility conditions at gasoline stations
and the aggressiveness of their gasoline marketing pol~
icy, A sample of 441 gasoline stations was investigated
with the results shown below obtained. Is there e-.idence that gasoline pricing strategy and facility CODdl~
tions a..re independent?

Condition
Subst2.ndard

Standard

Modem

Aggressive

24

52

58

Neutral

15

73

86

Nonaggrcs..<;ive

17

80

36

l1~()O. Consider the injection molding process


described in li'Cercise 11-42.
(3) Set up this problem as a 2 X 2 contingency table
and perform the indicated statistical a:'.alysis.

(b) State deady the hypothesis being tested. Are you


testing homogeneity or independence?
(c) Is tb.i.s procedure equivalent to the test procedure
used if. Exercise 1:-42?

Chapter 12

Design and Analysis of


Single-Factor Experiments:
The Analysis of Variance
Experiments are a natural part of the engineering and management decision-making
process. For e.xample. suppose that a civil engineer is investigating the effect of curing
methods on the mean compressive strength of concrete, The experiment would consist of
making up several test specimens of concrete using each of the proposed curing methods
and then testing the compressive strength of each specimen. The data from this experiment
could be used to determine which curing method should be used to provide maximum
compressive strength,
If the:e are only two curing methods of interest, the experiment could be desigl'.ed and
analyzed using the methods discussed In Chapter 11. That is, the experimenter has a single
facwr of interest--curing methods-and there are only two levels of the factor. If the experimenter is intere..<;ted in deten::rining which curing method produces the maximum compressive strength, then the number of specimens to test can be determined using the
operating characteristic curves in Chart VI (Appendix), and the t-test can be used to deter~
mine whether the two means differ,
Many single-factor experimentS require more than two levels of the factor to be considered. For example, fPe civil engineer may have five different curing methods to investigate. In this chapter we introduce the analysis of ",mance for dealing with more than Wo
levels of a single factor. In Chapter 13, we show how to design and analyze experiments
with se...-eral factors.

12-1 THE COIVIPLETELY RA.."IDOMIZED SINGLE-FACTOR


EXl'ERIM:&",<T
12-1.1 An Example
A manufacturer of paper used for making grocery bags is interested:in improving the ten~
sHe strength of the product. Product engineering thinks that tensile strength is a function of
the hardwood concentration in the pulp, and that the range of hardwood concentrations of
practical interest is between 5% and 20%. One of the engineers responsible for the study
decides to investigate four levels of hardwood concentration: 5%,10%,15%, and 20%. She
also decides to make up six test specimens at each concentration level, using a pilot plant,
PJl 24 specimens are tested on a laboratory tensile tester in random order. The data from
this experi::nent are shown in Table 12-1.
Thi'i is an example of a completely randomized single-factor experi::nent with four levels of the factor. The 1e'vels of the factor are sometimes called treatments. Each treatmem

321

322

Chapter 12 The .Analysis of Va.'ance


Table 12-1 TellSLe Strength of Paper (psi)
Hardwood
Concentration (%)

5
10
15
20

~---

...

12
14
19

17
18
25

Observations
3
4

15
13
19

Totals

Averages

11

18

19
16
18

10
15
18
20

60
94
102
127

10,00
15,67

17
23

22

383

17,00

2Ll7
15.96

has six observ'ations. or replicates. The role of randomization in this experiment is


extremely important. By randomizing the order of the 24 runs, the effect of any nuisance
variable that may affect the observed tensile strength is approximately balanced Ollt. For
example, suppose that there is a warm-up effect on the tensile tester; that is, the longer the
machine is on, the greater the observed tensile strength. If the 24 runs are made in order of
increasing hardwood concentration (i.e., all six 5% concentration specimens are tested first.
followed by all six 10% concentration specimens, etc.), then any observed differences due
to hardwood concentration could also be due to the warm-up effect.
It is importar:.t to gT2pwcally analyze the data from a designed experiment. Figure
12-1 presents box plots of tensile strength at the four hardwood concentration levels. This
plot indicates that changing the hardwood concentration has an effect on tensile strength;
specifically. higher hardwood concentrations produce bigher observed (ensile streng"ill. Fur~
thennore, the distribution of tensile strength at a particular hardwood level is reasonably
symmetric, and the variability in tensile strength does not change dramatically as the hard. .
wood concentration changes.

30

25

0;

20

.c
rn
c

i" 15

0;

.'!!
'0;
c

{!!. 10

Hardwood conce!'1tration (%)


Figure 12-1 Box. plots of hardwood concentration data,

12-1

The Completely Randomized Single-Factor Experiment

323

Graphical interpretation of the data is always a good idea. Box plots show the variability of the observations within a treatment (factor level) and the variability between treatments. We now show how the data from a single-factor randomized experiment can be
analyzed statistically.

121.2 The Analysis of Variance


Suppose we have a different levels of a single factor (treatments) that we wish to compare.
The observed response for each of the a treatments is a random variable. The data would
appear as in Table 12-2. An entry in Table 12-2, say Yip represents the jth observation taken
under treatment i. We initially consider the case where there is an equal number of observations n on each treatment.
We may describe the observations in Table 12-2 by the linear statistical model,
i ~ 1,2, ... ,a,
Yij:::: J1.+'f j +Eij ._
{

]-1,2, ... ,n,

(121)

where Yij is the Ui)th observation, J1. is a parameter common to all treatments, called the
overall mean, 't',. is a parameter associated with the ith treatment, called the ith treatment
effect, and ij is a random error component. Note that Yij represents both the random vari-

able and its realization. We would like to test certain hypotheses about the treatment effects
and to estimate them. For hypothesis testing, the model errors are assumed to be normally
and independently distributed random variables with mean zero and variance 0"2 [abbreviated NID(O, 0"2)]. The variance 0"2 is assumed constant for all levels of the factor.
The model of equation 12-1 is called the one-way-classification analysis of variance,
because only one factor is investigated. Furthermore, we will require that the observations
be taken in random order so that the environment in which the treatments are used (often
called the experimental units) is as uniform as possible. This is called a completely randomized experimental design. There are two different ways that the a factor levels in the
experiment could have been chosen. First, the a treatments could have been specifically
chosen by the experimenter. In this situation we wish to test hypotheses about the 'fl , and
conclusions will apply only to the factor levels considered in the analysis. The conclusions
cannot be extended to s:imilar treatments that were not considered. Also, we may wish to
estimate the 7:'1' This is called the fixed effects model. Alternatively, the a treatments could
be a random sample from a larger population of treatments. In this situation we would like
to be able to extend the conclusions (whiCh are based on the sample of treatments) to all
treatments in the population, whether they were explicitly considered in the analysis or not.
Here the 'f; are random variables, and knowledge about the particular ones investigated is
relatively useless. Instead, we test hypotheses about the variability of the 7:'; and try to estimate this variability. This is called the random effects, or components of variance, model.

Table 12-2 Typical Data for One-Way-Classification Analysis of Variance


Treatment

Observation

Totals

Averages

y,.
y,.

Ya'

y"
y"

y"
y"

Y2l1

y,.
y.,

Yo,

Yo'

YO"

Yo'

y,"

324

Chaptcr 12 The Analysis of Variance


In this section we will develop the analysis of variance for the fixed-effects model. one-.
way classification. In the fixed-effects model, the treatment effects ~ are usually dnfined as
deviations from the overall mean, so that
(12-2)

Let y;, represent the total of the observations under the ith treatment and Y" represent the
average of the observations under the ith treatment Similarly. let y .. represent the gnnd
total of all observations and y.. represent the grand mean of all observations. Expressed
mathematically,

i=1,2, .. "a.
,

(12-3)

Y=y/N,

y .. = 2,2:>ii'
1=1 i"'i

whereN =: an. is the total number of observations. Thus the "dot" sUbscript notation implies
sum.m.ation over the subscript that it replaces,
We are interested l!J. testing the equality of the a treatment effects. t;sing equation 12-2,
the appropriate hypotheses are

Ho: 't': 1J :;; ... :;; '4<) 0,


H!: ~ 0 for at least one l.

(12-4)

That is, if the null hypothesis: is true. then each observation is made up of the overall mean
f.L plus a realization of the random error Ell'
The test procedure for the hypotheses in equation 12-4 is called the analysis of vari~
aIlee. The name "analysis of variance" results from partitioning total variability in the data
into its component pa.'1S, The total corrected sum of squares, which is a measure of total
-variability in the data, may be written as
(12-5)

or

(12-6)

+2 2,2,(Yi.- 'y.)(Yi; - Yi.)


~l

j=l

Note that the cross-product term in equation 12-6 is zero, since

i(Yij -Yi.) = Yi-nYi. =Yi.-n(Yi.!n) =0.


j::::.l

Therefore we have
j

(12-7)

12~ 1

The Completely Randomized Single-Factor Experiment

325

Equation 12-7 shows !hat !he total variability in the data, measured by !he total corrected
sum of squares, can be partitioned into a sum of squares of differences betv,,'een treatment
means and the grand mean and a sum of squares of differences of observations within treatments and the treatment mean. Differences bet\Veen obsen-ed treatment meanS and the
grand mean measure the differences between treatments. while differences of observations
within a treatment from the treatment mean can be due only to random error. Therefore. we
write equation 12-7 symbolically as

SST = SStr"'..aIme~1l< + SSe'


where SST is the total sum of squa...--es. SS:reolm~~~ is the sum of squares due to treatments (Le.,

between treatments), and SSE is the sum of squares due to error (i.e., within t:eatments).
There are an =Ntotal obseJ."'lations; thus SST has N -1 degrees of freedom. There are a levels of the factor, so SS~=ts has a - 1 degrees of freedom, Finally, widrin any treatment
there are n replicates providing n - 1 degrees of freedom Vlith which to esthnate the experimental error. Since there are a treatmen~ we have a(n 1) = an - a = N - a degrees of
freedom for error,
Now consider the distributional properties of these sums of squares. Since we have
asst:med that !he errors ;, are NID(Q, a"), !he observations Yij are NID(,u + "i, 0'), Thus
SSr/a" is distributed as cbi-square with N - 1 degrees of freedom, since SST is a sum of
squares in normal random variables, Vie may also show that SSl1o.:.:mi:,1)~/d is chi-square with
a 1 degrees of freedom, if Hois tru_e, and SSfid is chi-square witt N - a degrees of freedom. However, all three sums of squares ate not independent, since SStr~~m~nu and SSE add
up to SST' The followmg theorem, which is a special form of one due to Cochran, is useful
in developing the test procedure,

Theorem 121 (Cochran)


Let Z, be NID(O, I) for i = 1, 2, ,," v and let

wheres <v a!1d Q, is chi-square with Vi degrees offree-dom (i= 1, 2 . ,.,3), Then QH Q;tt
Q: are independent chi-sGuare random variables with Vl , v2 > " v, degrees of freedom,
respectively~ if and only if

Using this the-orem, we note that the degrees of freedom for SSt=.tm~n~ and SS:,' add up
to N - 1, so that SS""',m"Ja" and SSe/a" are independently distributed cbi-square random
variables, Therefore, under the null h)l'othesis, the statistic

(12-8)
follows the Fa _ I N _ a distribution. The quantities MS<=.:rw:nu and ~y'S are rr.ean squares.
The expected values of the mean squares are used to show that Fo in equation 12-8 is
an appropriate test statistic for Ho: 1) =: 0 and to determine the criterion for rejecting this null
hypothesis. Consider

:i
326

Chapter 12 The Analysis of Variance

Substituting the model, equation 12-1, into this equation we obtain

E(MSE) = N

:0

E[

t.~(!i-C" Hut - t.(~(!iHI ~elj

In

~ow on squaring and taking the expectation of the quantities within brackets, we see that
terms involving e7; and L;",,;f.7j are replaced by if and n<f!, respectively, because E( ( 1) :::: O.
Furthermore, all cross products involving '1] have zero expectation. Therefore, after squar~
ing! taking expectation~ and noting that I.~l 'rl """ 0, we have

Using a similar approach. we may show that


a

n2:-rf
f"S trear:me.'llti ) :;;::;; (J 2 + -1.. 1- .
E \.IYl,
a-I

cr.

From the expected mean squares we see that MS E is an unbiased estim.torof


Also,
uncer the null hypothesis. A[SueI.ltmeot!, is an ur:.biased estimator of cr. However, if the null
hypothesis if false, then the expected value of ~o/fS=tmeJl[; is greater than vl. Therefore, under
the alternative hypothesis the expected value of the numerator of the test statistic (equation
12-8) is greater than the expected value of the denominator, Consequently; we should reject
H'J if the test statistic is large, This implies an upper~tail, one-tail critical region. Thus, we
would reject H, if

where Fe is computed from equation l2~8.


Efficient computational formulas for the sums of squares may be obtained by expanding and simpJifying the definitions of SS~a and SST in equation 12~7. This yields

, "

SST = I.2>~
i=1

I::'l

(12-9)

and
(12-10)

'r;
"

,I

12~ 1

The Completely Randomized S:.ngle-Factor Exp"'...riment

327

The error sum of squares 1s obtained by subtraction:


SS

(12-Jl)

:;;; SST - SSrreo.:menr:r,'

The teSt procedure is summarized in Table 12-3. This is called an analysis-of-variance table,

Consider the hardwood concentration experiment described in Section 12-1.1 We can use the analysis of wrian<:e to test '.he hypothesis that different hardwood concentrations do not affect the mean
tensile strength of the paper. The sums of squares for analysis of variance are computed from equadOO$ 129, 12-10, and 12-11 as follows:

"

_ (383)'

=(7) +(8J-+ ,,+(20t---=512,96,


24

(60)' 7(94)' -(102)' +(127)'

(383)' = 382.79
24

'

SSE = SST - SStl:'elItnl~nt:.

=512,96-382,79 130,17,

Th.e analysis of variance is sumrr;.ar.zed in Table 124. Since Fo.c:.:l,:tO = 4.94, we reject Ho and conclude that hardwood concentral'io:::, b. the pulp significantly affects the stre:::lgth of the pilper.

12-1.3 Estimation of the Model Parameters


It is possible to derive estimators for the parameters in the one-way
model

analysis-of~variance

Table 12-3 Analysis ofVa..";ance for the OI'.e~Way~aassification Fixed-Effects Model


Variation

Sum of
Squ.a:res

Source of

Degrees of
Freedom

Betweca treatments

SStm.l:mOOIt<

&rOT (within treatments)

SSE

a-I
N-a

Total

SST

N-I

Mean

Square

Fo

MS~r.1$

MS,

Table 124 Analysis of Variance for the Tensile Strength Data


Source of

Sum of

Variatio=

Hardwood concentration
Error

382.79
130,17

Total

512,96

Degrees of
Freedom

Mean

127,60

20
23

6,51

19,61

328

Chapter 12 The Analysis of Variance


An appropriate estimation criterion is to estimate J1 and 1'; such that the sum of the squares
of the errors or deviations Ei I is a minimum. This n:ethod of parameter estimation is called
the method of least squares: In e.ti.,nating }.l and ~i by least squares, the nonnality assump_
tion on the errors Elf is not needed. To find the least-squares estimators of J1 and 1'!~ we form
the sum of squares of the errors

(12-12)

j!.

and find valnes of }.l and T" say and ii' that minimize L The values p. and ii are the solutions to the a + 1 simultaneous equations

~a}.lL -;

=0

jl,.{

i=1,2, ... ,a.

Differentiating equation 12-12 with respect to JJ.. and 1'1 and equating to zero, we obtain
a

-iLI,(Yirj!.-r,)= 0
i.=:l J=l

and

-2I,(Yi rj!.-i',)=O,

i::::: 1,2.".,a.

1';;1

After simplification these equations become

NjJ.+nf1 +rii1+ ...

+nf~=y."
=)'t' ,

+m,

=Y2"

(12-13)

Equations 12-13 are called the least-squares normal equations. Notice that if we add the
last a normal equations we obtain the first normal equation. Therefore, the normal equations
are not linearly independent, and there are no unique estimates for f.4 't'p1:.z,.,.,'t'". One way
to overcome this difficulty is to impose a constraint on the solution to the normal equations.
There are many ways to c1loose this constraint Since we have defined the treatment effects
as deviations from the OY"erall mean, it seems reasonable to apply the constraint
(12-14)
Using this constraint, we obtain as the solution to the normal equations
i:::; 1, 2, ..., a.

(12-15)

12-1

The Completely Randomized Single-Factor Experiment

329

This solution has considerable intuitive appeal since the overall mean is estimated by the
grand average of the observations and the estimate of any treatment effect is just the difference bct\lleen the treatment average and the grand average,
This solution is obviously not unique because it depends On the constraint (equation
12-14) that we have chosen. At :first this may seem unfortunate. because two different
experimenters could analyze the same data and obtain different results if they apply different constraints. However, certainfimctions of the model parameter are estimated uniquely.
regardless of the constraint. Some examples are 'i-"r which would be estimated by ~,-~ ~
)1. -:ip and f.1 + r i which would be estimated by it +7:; = <~};.' Since we are usually interested
in differences in the treatment effects rather than their actual values, it causes no concern
that the 7:1 cannot be estimated uniquely. In general, any function of the model parameters
that is a linear combination of the left-hand side of the normal equations can be estimated
lI1liquely, Functions that are lI1liquely esrimated, regardless of which constnlint is used, are
called estimable functions,
Frequently, we would like to construct a confidence interval for the ith treatment mean,
The mean of the ith treatment is
l= 1. 2, .... a.
A point estimator of f.1l would beil} =it + 1i
Now t if we assume that the errors are nor~
mally distributed, each 1" is !\'1D(;1,! <i'ln), Thus, if <i' were known, we could use the nQrmal distribution to construct a confidence interval for fli' V sing MSE as an estimator of >
we can base the confidence interval on the t distribution, Therefore, a 100(1 - a)% confidence interval on the ith treatment mean Pi is

cr

(12-16)
A 100(1- a)% confidence interval on the difference between any two treatment means, say
J11 - /lj. is

~~#l!1~)~-2 .'
We can use the results given previously to estimate the mean tensile s.trengths at different levels of
hardwood concentration for the experiment in Section 12~ 1. L The mean tensile strength estimateS are

Yl' : : :. AI(,:::::' 10.00 psi,


~.

=:

jJ.ll.Yfc ~ 15.67 psi,

J,. ~ ilm, = 2Ll7 psi

A 95% codidence interval on the mean tensile strength at 20% hardwood is found from equation
12-16 as follews:

[2L17 (2,086)-I65lj6 ].
[2Ll7 2.17],

330

Chapter 12 The Analysis of Variance


1he desired confidence interval is
19.00 psi ~ I'm. ~ 23.34 psi.
\'b-ual. examinarion of the data suggests that mean tensile strength at 10% and 15% hardwood is si1l1~
ilar. A confidence interVal on the difference in means J1.lS'h - JllM. is

[Yi'- Yj tc:.{l.N-Q~2MS5/n ].
[17.oo-1j.67(2.086)~2(6.51)/6l
[L333.07].

Thus. the confidence intm'3.l on J1.!5% -Il;(jl, is


- 1.74 51l1YJ. -

Ji.!M. ~

4.40.

Since the confidence interval includes zero, we would conclude that there is no difference in mean

tensile strength at these two particula:: !laniwooC :levels.

12-1.4 Residual Analysis and Model Checking


The one-way model analysis of variance assumes that the observations are normally and
independently distributed. with the same variance in each treatment or factor level. These
assumptions should be checked by examining the residuals, We define a residual as: e'i =::
Yl} - y~, that is, the difference between an observ"3tion and the corresponding treatment
mean. The residuals for the hardwood percentage experiment are shown in Table 12-5.
The normality assumption can be checked by plotting the residuals on normal probability paper. To check the assumption of equal varia..'1ces at each factor level, plot the residuals
against the factor levels and compare the spread in the residuals. It is also useful to plot the
residuals agamst )'1. (someti,nes called thefitted value); the variability in the residuals should
not depend in any way on the value ofy,: Vv'fien a pattern appears in these plots, it usually
suggests the need for transfonnation, that is, analyzing the data in a different metric, For
example, if the variability in the residuals increases with Yi" then a transformation such as
log y or .[Y should be considered. In some problems the dependency of residual scatter in:;1"
is very important information. It may be desirable to select the factor level that results in
maximum y; however, this level may also cause more variation in y from run to run.
The independence assumption can be checked by plotting the residuals against the time
or run order in which the experiment was performed. A pattern in this plot. such as
sequences of positive and negative residuals, may indicate that the observations are not
independent. This suggests that time or run order is important, or that variables that change
over ti..rne are important and have, not been included in the experimental design.
A normal probability plot of the residuals from the hardwood concentration experiment
is shown in Fig, 12-2, Figures 123 and 124 present the residuals plotted against the treatTabl.12-5 Residuals for the Tensile Strength Experiment
Hardwood
Concentration
5%
10%
15%
20%

R::siduals
-3,00 -2.00
-3.67
1.33
-3.00
1.00
-2.17
3.83

5.00
-2.67
2.00
0.83

1.00 -l.OO
2.33
3.33
0.00 -1.00
1.83 -3.17

0.00
..1J.67

LOa
-Ll7

12-1 The Completely Randomized

Single~Factor

Experiment

331

ment number and the fitted value YI' ' These plots do not reveal a:ly model inadequacy or
unusual problem with the assumptions,

12-1.5 An "Cnbalanced Design


In some single~factor experiments the number of observations taken under each treatment
may be different. We then say that the design is unbalanced. The analysis of variance
described earlier is still valid. but slight modifications must be made in :he sums of squares
formulas. Let n, observations be taken under treatment i(i = 1, 2,. ". a); and let the total

'

Residual value

Figure 12--2

l\~onnal probability plot

of residuals from the hardwood co:r.centration experiment.

t
4

1
2

-2

Figure 12~3 Plot of residuals YS. treatment.

332

Chapter 12 T.::le A..'1alysis ofVatiance

2~
!

Figure 12*4 Plot of residuals vs. )it'

number of observations N:::: k;"'ln l The computational fOIDlulas for SST and SSfftltilY:rJs
become
+

SST

"

2::2>3
r""l
j:<

and

r: .

In solving the normal equations, the constraint


In,tl 0 is used. No other changes are
required in the analysis of variance.
111ere are two important advantages in choosing a balanced design. First, the test sta~
tistic is relatively insensitive to small departures from the assumption of equality of vari~
ances if the sample sizes are equal This is not the case for unequal sample sizes. Second,
the power of the test is maximized if the samples are of equal size.

122 TESTS ON INDIVIDUAL TREATMENT MEANS


122.1

Orthogonal Contt-dSts
Rejecting the null bypothesis in the fixed-effects-modeI analysis of variance implies that
there are differences between the a treatment means1 but the exact nature of the differences
is not specified. In this siruation, further comparisons between groups of treatment means
may be usefuL The ith treatment mean is defined as Ili I' ~ T,. and 1'; is esti.."",ted by 'ii;.
Comparisons between treatment means are usually made in tenr.s of the treatment totals
{y, }.
Consider the hardwood concentration experiment presented in Section 12-1,1. Since
the hy'pothesis Eo: !j ~ 0 was rejected, we know that some hardwood concentrations produce tensile strengths different from others, but which ones actually cause this difference?

12~2

Tests on Individual Treattnent Means

333

We might suspect at the outset of the experiment that hardwood concentrations 3 and 4 pro~
duce the same tensile stren~ implying that we would like to test the hypothesis

H,: p, = ,,",,
H,: p, *11"
This hypothesis could be tested by using a linear combination of treatment toWs, say

Y:r-Y,=O.

If we had suspected that the average of hardwood concentrations 1 and 3 did not differ from
the average of hardwood concentrations. 2 and 4. then the hypothesis would have been

Ho: III + P, = P, + 11"


p,+ 1'.,

HI: 1', + p,

which implies that the linear combination of treatment totals


YI' +y,. -y,.- y" =0.
In general, the comparison of treatment means of Interest will imply a linear oornbination of treatment totals such as
a

C=l..Ci}'i,'
i<=1

L;

with the restriction that ""tC,


of squares for any contrast is

O. These linear combinations are called cOntrasts. The sum

(12-18)

and has a single degree of freedom. If the design is unbalanced, then the comparison of
treatment means requires that 1~ ",1niGo:::: 0, and equation 12~ 18 becomes

(12-19)

A contrast is tested by comparing its sum of squares to the mean square error. The resu1t~
ing statistic would be distributed as F, with 1 and N - a degrees of freedom.
A very important special case of the above procedure is that of orthogonal contrasts.
Two contras:s with coefficients (cJ and (dJ are otthogonal if

or, for an unbalanced design, if

334

Chapter 12

The Analysis of Variance

For a treatments a set of a-I orthogonal contrasts will partition the sum of squares due to
treatments into a-I independent single-degree-of~freedom components. Thus. tests per~
formed on orthogonal contrasts are independent.
There are many ways to choose the orthogonal contrast coefficients for a set of treatments. Usually, something in the nature of the experiment should suggest which comparisons will be of interest. For example, if there are a=::3 treatments, with treatment 1 a
"control" and treatments 2 and 3 actua11evels of the facto! of interest to the experimenter.
then appropriate orthogonal contrasts might be as follows:

Treatment

Otthogor.al Contrasts

1 (control)
2 (levell)
3 (level 2)

-2
1
I

0
-1
1

Note that contrast 1 with Ci =-2, 1, 1 compares the average effect of the factor \Vith the control while contrast 2 with d i = 0, - 1, 1 compares the two levels of the factor of interest.
Contrast coefficients must be chosen prior to running the experiment, for if these com

parisons are selected after examining the data. mOSt experimenters would construct tests
that compare large observed differences in means. These large differences could be due to
the presence of real effects or they could be due to random error, If experimenters always
pick the largest differences to compare, they will inflate the type I error of the test, since it
is likely that in an unusually high percentage of the comparisons selected the observed differences will be due to error.

Consider the hardwood concentration experiment. There an:: four levels of hardwood concentration.
and the poss:.Ole sets of comparisons between these means and the associated orthogonal comparisons
are
Ha: f,i; ..... f,i;, =P-t -1-'1,

C1 t;;;,YI' -Y2' -Yy + Y4"

H,,: 3f,i( + ,LL:.,:;;; P-t + 3fJ. 4t

C2 = - 3y!' -)'2' + YJ- - 3Y4'

H,: 1', + 3", ~ 3", + ",.

C3=-YI+3Y2'

3Y3'+Y4'

Notice that the contrast constants are orthogonaL Using the data from Table 12-1. we find the numerical values of the contrasts and the sums of squares as follows:
C1 ~60-94-102+127~-9.

(-9)'

SSe. ~ 6(4) ~ 3.,8,

~ (209)' ~ 364.00

C, ~-3(60)-94+ 102+3(127) =209.

SS

C, =-<50 +3(94)-3(102) + 127 = 43.

(43)'
SSe, ~ 6(20) =15.41.

C,

6(20)

These contrast sums of squares C-Ontpletely partition the treatment sum of squares; that is. SS_G::=
SSe l + SSC~ - SSe:; == 382.79. These tests on the contrasts are usually incorporated into the analysis of
variance, such as shown in Table 12-6. From this analysis, We conclude that there are sig:n.ificant dif~
ferences between bardwood concentrations 1,2 vs. 3, 4. but that the average of 1 and 4 does not dif~
fer from the average of 2 and 3. nor does the average of 1 and 3 differ from the average of 2 and 4.

12-2 Tests on Individual Treatment Means

335

Table 12-6 Analysis of Variance for the Tensile Strength Data


Sum of
Squares

Hardwood cO!lCentrarioD
C, (1.4 s. 2. 3)
C, (1,2 vs. 3,4)
C, (1.3 vs.2.4)

382.79
(3.3S)
(364.00)
(15.41)
130.17
512.%

Error
Total

Mean

Degrees of
Freedom

Source of
Variation

3
(1)
(1)

(1)
20
23

Square

F,

127.60
3.38
364.00
15.4.1
6.51

19.61
0.52
55.91
2.37

12-2.2 Tukey's Test


Frequently~

analysts do not know in advance how to construct appropriate orthogonal contrasts, or they may wish to test more than a-I comparisons using the same data. For example, analysts may want to test all possible pairs of means. The null bypotheses would then
be Ho: III = Ilj for all i j. If we test all possible pairs of means using I-tests, the probability
of committing a type I error for the entire set of comparisons can be greatly increased.

There are several procedures available that avoid this problem. Among the more popular of
these procedures are the Newman-Keuls test ",ewman (1939); Keuls (1952)), Duncan's
multiple range test [Duncan (1955). and Tukey's test [Tukey (1953)]. Here we describe
Tukey's test.
Tukey's procedure makes use of another distribution, called the Studentized range distribution. The Studentized range statistic is

_ Y1ThlX - Y~n
r=-;-
~MSE!n

wbere y~ is the largest sample mean and y""" is the smallest sample mean out of p sample
means. Let qa (a,f) repre.<:;ent the upper a percentage point of q, where a is the number of
treatments andfis the number of degrees of freedom for error. Two means) y;. and Yj. (i
j), are considered significantly different if

*'

lYi . -)\.1 > T.


where

/MSE
Ta = qa (a,f )'I-n-'

(12-20)

Thble XlI (Appendix) contains values of q. (a,fJ for a= 0.05 and 0.01 and a selection of
values for a and! Thkey's procedure has the property that the overall significance level is
exactly a for equal sample sizes and at most a for unequal sample sizes.

~~,!,!~'g4:
We will apply Tukey's test to the hardwood concentration experiment, Rec::ill that there are a = 4
rnea.1S, n:::: 6) and MSc :::: 6,51. The treatment means axe

Y" = 10.00 psi.

y,. = 15.67 psi.

y, = 17.00 psi,

Y..,

=:

21.17 psi.

Fro", Table XII (Appendix). with a= 0,05. a =4. and!= 20. we find qOM(4, 20) = 3.96.

336

Chapter 12

The Analysis of Variance

Using Equation 12-20,

Therefore. we would conclude:hat Mo means are significantly different if

> 4.12.

[y,.
The differences in treatment averages are

15',.

= 110.00 -15.671 = 5.67,

If,. -y,.1 =10.00-17.001 = 7.00.


11,. - 3',.1 = 110.00 - 21.171 = ILl7,

I)',. -3',1 = :15.67 -17.001 = 1,33,


[f,. -)',1 = 115.67 - 21.171 =5.50,

L', - )',1 = 117.00-21.17: =4.17.


From this analysis. we see significant differences between all pairs of means except 2 and 3. It may
be of use to draw a graph of the treatment means, such as Fig, 12-5, wi.lo'1 the means that are not different underlined.

Simultaneous confidence intervals can also be constructed on the differences in pairs


of means using the Tukey approach. It can be shown that

when sample sizes are equal. This expression represents a 100(1 a)% simultaneous confidence interval on aU pairs of means Jl; - j..l;.
If the sample sizes are unequal, the 100(1 - 0:)% simultaneous confidence interval On
all pairs of means JJ.; - ~ is given by

Interpretation of the confidence intervals is straightfonvard. If zero is contained in an inter~


val. then there is no significant difference beween the two means at the a significance level
It should be noted that the significance level, IX, in Tukey's multiple comparison procedure represents an experimental error rate. "With respect to confidence intervals, (1. represents the probability that one or more of the confidence intervals on the pairwise
differences will not contain the true difference for equal sample sizes (when sample sizes
are unequal, this probability becomes at most 0:).

91.

12,'

~.

10.0

12.0

14.0

Figure 12-5 Results ofTukey's :est.

16,0

fa.

Y4-

I
1M

20.0

12-3

The Random-Effects Model

337

123 THE RAC'mOMEFFECTS MODEL


In many situations, the facwr of interest has a large number of possible levels. The analyst
is interested i:1 drawing conclusions about the entire population of factor levels. If t.l-te
experimenter randomly selects a of these levels from the population of factor levels, then
we say that the factor is a random factor. Because the levels of the factor actually used in
the experiment were chosen randomly, the conclusions reached wil1 be valid about the
entire population of factor levels. We will assume that t.~e population of factor levels is

either of infinite size Or is large enough to be considered inficite.


The linear statistical model is
i 1,2"",,a,
Yij=/1+'r i 'ij { ' J -1,2,"",n,

(1221)

where 1; and ;, are independent random variables. Note that the model is identical in structure to the fixed-effects case, but the parameters have a different interpretation. If t.'1e variance of 'T, is ~, then the variance of any observa:ion is
V(y)

= rr; + <:f,

The variances ~ and t:T are called variance componems and the model, equation 12-21.
is called the components-ofvariance or the raru1om~effectsmodeL To test hypotheses using
this model, we require that the (E,) areNID(O, <:f), tha, the ('ti} are ,NID(O,
and that "i
and /j are independent.The assumption that the {1j} are independent random variables
implies that the usual assumption on:;,} ~, = 0 from the fixed-effects model does not apply
to the random-effects modeL
The sum of squares identity
j

rr;),

(12-22)

still holds. That is~ we partition the total 'Vmiability in the observations into a component
that measures variation between treatments (SSI.."tlI;mI:1lJ and a component that measures variation within treatments (S5E)' However, instead of testing hypotheses about individual treatment effects) we test the hypotheses

If
0, all treatments are identical. blo1 if U; > 0, t.'len there is variability between treatments, 'The quantity SSE/a' is distributed as chi-square with N - a degrees of freedom, and
under the null hypothesis, SS~enl$/cf is distributed as chi-square with a 1 degrees of
freedom, Further, the random variables are independent of each other. Thus, under the null
hypothesis, the ratio
SS"""""",/( a - I)
SSE/eN -a)

(12-23)

is distributed as F with a - I and N - a degrees of freedom, By examining the expected


mean squares we can deten.nme the critical region for this statistic,
Consider

338

Chapter 12 The Analysis of Variance

If we square and take the expectation of the quantities in brackets; we see that terms involving -r: are replaced by
as E(r;) O. Also, terms involving I7", 1~' L~.. lL;", I~' and
L;.IL;~ I ~ are replaced by nO". 000". and an 0;, respectively. Finally, all cross-product
tenns involving ~i and ijhave zero expectation. This leads to

d!.

or
E(MS=Il,:) =

(J1

+ nd;.

(12-24)

A similar approach will show that


E(MS,) = (Jr,

(12-25)

From the expected mean squares. we see that if Ho is true, both the numerator and the
denominator of the test statistic, equation 12-23, are unbiased estimators of 0-2; whereas if
HI is true, the expected value of the numerator is greater than the expected value of the
denominator. Therefore, we should rejectHo for values of Fo that are too large. This implies
an upper~tail, one-tail critical region, so we reject Ho if Fo > Fa. a_ I,N_,:'
The computational procedure and analysis-of-variance table for the random-effects
model are identical to the fixed~effects case. The conclusions., however, are quite different
because they apply to the entire population of treatments.
We usually need to estimate the \'ariance components (0-2 and ~) in the model. The
procedure used to estimate 0-2 and a! is called the "analysis..of-variance method," because
it uses the lines in the analysis-of-variance table. It does not require the normality assumption on the observations. The procedure consists of equating the expected mean squares to
their observed values in the analysis-of-variance table and solving for the variance components. When equating observed and expected mean squares in the one-way-classification
random-effects model, we obtain

MSu<;wnctJ.I1 = 0-2 + nO-;


and

Therefore, the estimators of the variance components are


&?:::!ldS

(12-26)

and
(12-27)

For unequal sample sizes, replace n in equation 12-27 with

,-

12-3

The Random-Effects Model

339

~n21 ,
Ln,-.c;-1
L
~ i

1
a
no=-

a-I 1'-1
;

11.

kcl

Sometimes the analysis-of-variance method produces a negative estimate of a variance


component Sinee variance components are by defmition nonnegative, a negative estimate
of a variance component is unsettling. One course of action is to accept the estimate and use
it as evidence that the true value of the variance component is zero, assuming that sampling
variation led to the negative estimate. \\'!rile this has intuitive appeal, it will disturb the sta~
tistical properties of other estimates, Another alternative is to reestimate the negative vari~
3..1.ce component with a method that always yields nonnegative estimates. Still anodier

possibility is to consider the negative eS"Jmate as evidence that the assumed linear model is
incorrect, requiring that a study of the model and its assumptions be made to find a more
appropriate model.

;J,l~~ii;)l#:~
In his book Design aruiAnalysis afExperiments (2001). D. C. Montgo:nery describes a single-factor
experiment involving the random-effects model. A textile manufacturing company weaves a fabric on
a la.,'"gc number of loows. The company is interested in loom-to-loom. variability in tensile stre::1gth,
To investigate this, a IruUlufacturing engineer selects four looms at FaJldom and makes fOtlf st."'\'::ngth
determinations on fabric samples chosen at random for each loom. The data are shown in Table 12-7,
and the analysis of variance is summarized in Table 12-8.
From the analysis of variance, we conclude :hat the looms :.n the p:ar.t differ significantly in their
ability to produce fabric of unifonn strength. The variance components are estimated by &::;:::: 1.90 and
., _ 29.73-1.90
4

vr

6%
. .

Therefore, the varia;}.ce of strength in the man.li!actlqing process is estimated by

= 6.96 + 1.90
8.86.

Most of this variability is attributable to dill'erences between looms.

Table 127 Streugth Data for Example 12-5

Observations

Loom

Totals

Averages

1
2
3

98
91

97
90

96

95

95

96

390
366
383
388
1527

97.5

96

99
93
97
99

92

95
98

91.5
95.8
97.0
9SA

34(l

Chapter 12

The Analysis of Variance

Table 12-8 Analysis of Variance for the Strength Data


Source of
Variation

Sum of
Squares.

Degrees. of
Freedom

Mean
Square

P,

Looms
Error
Total

89.19
22.75

3
12
15

29.73

15.68

111.94

1.90

This example illustrates an important application of analysis of variance-the isolation

of different sources of variability in a manufacturing process. Problems of excessive


variability in critical functional parameters or properties frequently arise in qualityim.provement program.~. For example, in the previous fabric-strength example, the process
mean is estimated by y.. = 95.45 psi and the process standard deviation is estimated by
&, =
= '18.86 2.98 psi. If strength is approximately normally distributed, this

'wl;;J

would imply a distribution of strength in we Outgoing product that looks like the normal
distribution shown in Fig. 12-6a. If the lower specification limit (LSL) on strength is at 90
psi, then a substantial proportion of the process defective is fallout; that is, scrap or defec~
dve material that must be sold as second quality, and so aD.. This fallout is directly related
to the excess variability resulting from differences between looms. Variability in 1001P- performance could be caused by faulty setup. poor maintenance) inadequate supervision,
poorly trained operators, and so forth. The engineer or manager responsible for qUality
improvement must identify and remove these sources of variability from the process, If he
can do this, then strength variability will be grea~y reduced, perhaps as low as = =
-.11.90 = 1.38 psi, as shown in
12-6b, In this improVed process, reducing the variability in strength has greatly reduced the fallout. 'This: will result in lower cost, higher quality,
a more satisfied customer, and enhanced competitive position for the company.

a, a

Process
fallout

(a)

110

'.

pSI

(b)

Figure 12.-6 The distribution of fabric strengdl, (a) Current process. (0) Improved process.

12-4 The Randomized Block Design

341

124 THE RANDOMIZED BLOCK DESIGN


12-4.1 Design and Statistical Analysis
In many experimental problems it is necessary to design the experiment so that variability
arising from nuisance variables can be controlled. As an example, recall the situation in
Example 11~17j where two different procedures were used to predict the shear strength. of
steel plate girders. Because each girder has potentially different strength, and because this
variability in strength was not of direct interest, we designed the experiment using the two
methods on each girder and compared the difference in average strength readings to zero
using the paired t-test. The paired t-test is a procedure for comparing two means when aU
experimental runs cannot be made under homogeneous conditions, Thus, the paired t-test
reduces the noise in the experiment by blocking out a nuisance variable effect. The randomized block design is an extension of the paired t-test that is used in situations where the
factor of interest has more than two levels.
As an example. suppose that we wish to compare the effect of four different chemicals
on the strength of a particular fabric. It is knOYtll that the effect of these chemicals varies
considerablY from one fabric specimen to another. In this example, we have only one factor: chemical type. Therefore. we could select several pieces offabric and compare all four
chemicals within the relatively homogeneous conditions provided by each piece of fabric.
This would remove any variation due to the fabric,
The general procedure for a randomized complete block design consists of selecting b
blocks and running a complete replicate of the experiment in each block. A randomized
complete block design for investigating a single factor with a levels would appear as in Fig.
12-7. There will be a observations (one per factor level) in each block, and the order in
whicb these observations arc run is randomly assigned within the block.
We will now describe the statistical analysis for a randomized block design. Suppose
that a single factor with a levels 1s of interest. and the experiment is run in b blocks. as
shown in Fig. 12-7. The observations may be represented by the linear statistical model.

fi= 1.2, ...,a.

Yii=:!!f.1+'ti+fJj+ij~.
~J

1.2..... b,

(12-28)

where f.1 is an overall mean, 'ti is the effect of the ith treatment, PI js the effect of the jth
block, and e" is the usual NID(O, a') random error term. Treatments' and blocks will be C(lJ1sidored initially as fixed factors. Furthermore, the "eattnent and block effects are defined
as deviations from the overall mean, so that:L~", 1'ti:::: 0 and :L;", 1/3;= O. We are interested in
testing the equality of the treatment effects. That is,
BIJ! '1:1 =:!!

H}:

Block 1

B:cck 2

't',;;t

12::::

.,.:= 1'a=O~

0 for at least one i,

Brock b

IY1D

y"
he

Fi,.,oure 12-7 The randomized complete block design,

342

Chapter 12 The Analy'sis of Variance

Let Yi. be the total of all observations taken under treatment i,let Y'} be the total of all
observations in blockj, let y . be the grand total of all observations, and let N = ab be the
total number of observations. Similarly, y;. is the average of the observations taken under
treatment i, yO) is the average of the observations in blockj, and y. is the grand average of
all observations. The total corrected sum of squares is

Expanding the rigbt-hand side of equation 12-29 and applying algebraic elbow grease
yields
(J

.,

(1

.,

2,2,(Yij - y.,f =b2, (y,.-; ..)' +a2, (;.j - y.)"


i=1 J"'-l

i=1

/,=1
b

+ 2,2,(Y,j-Y"-Y'j+)'.')
i=1

(12-30)

j~l

or. symbolically,

SSr= SSumrr,cn~ +

(12-31)

SSblocu'l- SS,

The degrees-of-freedolI! breakdown corresponding to equation 12-31 is

no -

1 = (0

1) + (b

1) T (0 - 1)(b - 1).

(12-32)

The null hypothesis of no treatment effects (H,: "" = 0) is tested by the F ratio,
The analysis of variance is summarized in Table 12-9. Computing formu,
las for the sums of squares are also shown in this table. The same test procedure is used in
cases where treatments and/or blocks are random.
MS~,,,IMS.

Ex..nJ~~eX2~6
Au experiment was performed ::0 determine the effcct of four different chemica:s on the stre';lgth of a
fabric. These chemicals a.--e used as part of the penna.nent-press finishing process. Five fabric samples
were selected, and a randomized block design was run by testing each chemica! type once in random
order on each fabric s:unple. The data are shown in Table 12-10.
The S~ of squares for the analysis of variance are corcputed as follows:

Table 12~9 Analysis of Variance for Randomized Complete Block Design


Source of
Variation

SUIDof

Squares

.Degreesof
Freedom

Treatments

a-I

Blocks

h-I

Error

Tota!

SSE (oy subtraction)

(a -1)(h -1)

ab-I

Mean
Sql.1.3J:e

a-I
S~l"'.'~

0-1

ss

"

(a-l)(o-l)

124 The Randomized Block Design

Table J2.10

Fabnc Strength Data -

Chemica!

Randomized Block Design

Fabric Sawe1e
2
3
4

Type

1.8
3.9

4.4

0.6
2.0

9.2 10.1

3.5

2.2

3
4

0.5

1.6
2,4
1.7

1.3

1
2

OA

343

Row

Row

Tota~s,

Averages,

y,.

Y.

1.2
2.0
1.5
4.1

1.1

5.7

1.14

1.8
1.3
3.4

8.8
6.9
17.8

1.76
1.38
3.56

8.S

7,6

39.2

1.96

1.90

(y .. )

(]i.. )

CollUli.n

totals, Y'J
Colu;r.n

averages. Y'j

2.30 2.53 0.88 2.20

SS

b,od:s

(5.7)' +(8.8)' ,.(6.9)' 7(17.8)'


5

no ",
~;18.04.
20

-.&
ZL_ iab
,..., a
/",1

; (9.2)' +(10.1)2 +(3.5)' +(8.8)' +(7.6)'

(39.2)' ; 6.69,

20

SSE;;; SST - SSblOCiks - SSlte:J.tmllntl';


= 25.69 -6.69 -18.04= 0.96.

The analysis of variance is summar'.z.ed in Table 12-11. We would conclude that there is a signiiicant
difference in 11e chemical types as far as their effect on fabric strength is oo:l.cernoo,

Table 12-11 Analysis ofVarimcc for the RaI:.domized Block Experiment

Source of

Sumo!

Variation

Squares

Degrees of
Freedom

Squa...-e

18.04

6.01
1.67
O.OS

Mean

Chemical type
(treatments)

Fab::ic saople
(blocks)

6.69

Error

0.96

12

Total

25.69

19

75.13

'J
344

Chapter:2 The A.'1alys:s of Variance

Suppose an experiment is conducted as a randomized block design, and blocking was


not really necessary. There are ab observations and (a - I)(b - 1) degrees of freedom for
error. If the experiment had been run as a completely randomized single~factor design with
b replicates, we would have hada(b -1) degrees offreedom for error. So, blocking has cost
alb I) - (a-I) (b - 1) = b -1 degrees of freedom for error. Thus, since the loss in error
degrees of freedom is usually small. if there is a reasonable chance that block effects may
be important1 the experimenter should use the randomized block design.
For example, consider the ex.periment described in Example 12-6 as a one-wayclassification analysis of va..riance. We would have 16 degrees of freedom for error. In the
randomized block design there are 12 degrees of freedom for error. Therefore, blocking has
cost only 4 degrees of freedom, a "tty small loss considering the possible gain in information that would be achieved if block effects are really important. As a gene.rn.l rule, when in
doubt as to the inlporta:Jce of block effects, the experimenter should block and gamble that
the block effect does exist. If the experimenter is "''Tong, the slight loss in the degrees of free~
dom for error will have a negligible effect, unless the number of degrees of freedom is very
small. The reader shollid compare this discussion to the one at the end of Section 11 ~3 .3.

12-4.2 Tests on Individual TreatnIent Means


When the analysis of variance indicates that a difference exists between treannent means,
we usually need to perform some follow~up tests to isolate the specific differences. Any
multiple comparison method, such as Tukey's test, could be used to do this.
Tukeis test presented in Section 12-2.2 can be used to determine differences between
treatment means when blocking is involved Simply by replacing n with the number of
blocks b in equation 12~20. Keep in mind that the dl?grees of freedom for error have now
Changed. Forth. rancomized block design,j= (a - I)(b - I).
To illustrate this procedure, recall that the four chemical type means from Example
12-6 are

1,

= 1.14,

y,. = 1.76,

y.

= 3.56,

Therefore, we would conclude that two means are significantly different if


1)';' -5\.1 > 0.53.
The absolute values of the differences in treatment averages are

IY.. - y,.1 = 11.14 - 1.761 = 0.62,

1Y. -y,.I=11.l4 1.381=0.24,

IY. - y,.1 = 11.14- 3.561 ~ 2.42,

IY,. - y,.1 ~ 11.76 -

1.381 = 0.38,

=11.76 - 3.561 =1.80,


ty,. - Y4' =11.38 - 3.561 =2.18.
ty,. -Y4.1

Too results indicate chemical types I and 3 do not differ, and types 2 and 3 do not differ. ;
fignre 12-8 represents the results graphically, where the underlined pairs do not diffeLI

12-4 The Rar.domized Block Design

13>

}'2-

y,.

I I

y,.

345

2.QO

1.50

1.00

2.50

$,00

3.50

Figure 12-8 Results of Tukey's !:esC

124.3 Residual Analysis and Model Checking


In any designed experiment it is always important to examine the residuals and check for
violations of basic assumptions that could invalidate the results. The residuals for the randomized block design are just the differences between the observed and fitted . . 'alues

where the fitted . . '3lues are


(12-33)

The fitted value represents the estimate of the mean response wben the itb treatment is run in
thejth block The residuals from the experiment from Example 12-6 are shown in Table 12-12.
Figures 12-9, 12-10, l2-ll, and 12-12 present the important residual plots for the
experiment. There is some indication iliat fabric sample (block) 3 has greater variability in
strength when treated with the four cbemicals than the other samples. Also, cheILical type
Table1212

Residuals from ihe Ra:1dontized Block Design


Fabric

Chemical

Type
-0.18
0.10
0.08
0.00

3
4

-0.11
0.07
-0.24

027

0.44
-0.27
030
-0.48

-0.18
0.00
-0.12
0.30

0.02

0.10
-0.02

-0.10

-T-___

",-

1il

E
0

-,

-2

0.50

0.25

0.25

0.50

Res;d:tal value

Figure 12-9 Normal probability plot of residuals from the randomi2ed block design.

Chapter 12

The Analysis of Variance

4, which provides the greatest strength, also has somewhat more mability in strength, Fol,
low-up experiments may be necessary to confirm these findings if they are potentially
important,

-0.5
Figure 12-10 Residuals by trea!:D:.ent.

el/+
, I

0.5 ;-

!
01-

t5
1

.~

-O.5~
I

Figure 12-11 Residuals by block.

eat

l
1

-O'5~

.
..
'

Figure 12-12 Residuals versus YI}"

l2-5 Determining Sample Size in Single-Factor Experiments

341

12S DETEIL'1JNING SAMPLE SIZE r.T SINGLEFACTOR

EXPERIMENTS
In any experimental design problem the choice of the sample size or number of replicates
to use is important. Operating characteristic curves can be used to provide guidance in makthis selection. Recall that the operating characteristic curve is a plot of !.he type II (fJ)
error for various sample sizes against a measure of the difference in means that it is important to detect. Thus, if the experimenter knows how large a difference in means is of poten~
rial importance, the operating characteristic curves can be used to determine how many
replicates are required to give adequate sensitivity.
We first consider sample size determination in a fixed-effects model for the case of
equal sample size in each treatment. The pnwer (1- (3) of the test is
1- {3= P(Reject HoIH, is false}

(1234)

=P{Fo > F", ,.,."iH, is false I


To e\'aluate this probabili~ statement, we need to know the distribution of the test statistic
Fo if the null hypothesis is false. It can be shown tllat if He is false, the statistic Fo =
MS=_.IMSE is distributed as a noncentral F random variable, with a I and N - a

degrees of freedom and a noncentra1ity parameter o. If Ii ~ O. then the noncentral F distri


bution becomes the usual centroi F distribution.
The operating characteristic curves in Chart vn of the Appendix are used to calculate
the pnwer of the test for the fixed-effects model. These curves plot the probability of type
II error ({3) against <1>. where

nL'!f
{=1
'--,-.

(12-35)

aIr

o.

The parameter tt;2- is .related to the noncentrali::y parameter Curves arc available for a::::
I::: 0.01 and for several values of degrees of freedom for the numerator and
denominator. In a completely randomized design, the symbol n in equation 12-35 is the
number of replicates. In a randomized block design, replace n by the number of blocks.
In using the operating characteristic curves, we must define the difference in means
that we wish to detect in tenns of 2::", j 1;. Also, the error variance (52 is usually unknown_

0.05 and a

In such cases. we must choose ratios of 'i~"'l~j<:f that we wish to detect. Alternatively, if

an estimate of d- is available. one may replace

cr with this estimate. For example, if we

were interested in the sensitivity of an experiment that has aJrea.dy been performed, we
might use MS E as the esti:nate of

cr,

~E~i!~j~~1
Suppose that five means are being compared in a completely random.ized experimen:: with a::;:; 0.01.
The e.'l.perimenter would like to know how IIlalJ,y replicates to run if it is l.."UpOrta."l.l to reject HIJ with:l
probability of at least 0.90 ifL~",
jJdl::;:; 5.0. The parameter 2 is, in this case,

<I>

:1

nt~?
n(5j'=n..
,
=--,-=QIJ
5
;"'1

and the openting chal;a.cteristic curve for a -1 = 5 -1 =4 and N - a = a(n-l) = 5(11 -1) euor degrees
offreedomis shown in Chart VII (Appendix). As a fut guess, try n =4 replicates. This yields <Ilz 4,
<1)=2, and5(3)= 15 crrordegrees offrcedom. Consequently, from ChartVlI. we find that i3~ 0.38.

348

Chapter 12 TbeAnalysis of Variance

P 1 - 0.38 ;;;: 0.62. whicn is less than the


required 0.90, and so we conclude that It:::::: 4 replicates are not sufficient. Proceeding in a similar manner, we can constru:ct the following display.

Therefore, the PO?'ef of the test is approxllnately 1 -

1)

<1>'

2.00

15

5
6

5
6

2.24
245

20

0.38
0.18

0.62
0.82

25

0.06

0.94

<I>

a(n

Power (1 - jJ)

Thus, at least 11 = 6 replicates must be run in order to obtain a test with the required power.

The power of the test for the random-effects model is

1- J3=P{RejectHclHo is false}
(12-36)

,ja; > O}.

=P{F,> F.o_, ..

Once again the distribution of the test statistic Fe under the a1ternative hypothesis is needed.
It can be shown that if HI is lIUe (a: > 0), the distribution of F, is centrnl F, with a-I and
N - a degrees of freedom.
Since the power of thc random-effects model is based on the central F distribution, we
could use the tables of the F distribution in the Appendix to evaluate equation 12-36. However; it is much easier to evaluate the power of the test by llSing the operating characteris~
tic curves in Chart
of the Appendix. These curves plot the probability of the type II
error against}., whcre

vm

(12-37)

In the randomized block design, replace n with b, the number ofblocks. Since u' is usually
unknown, we may either use a prior estimate or define the value of

in detecting in terms of the ratio d!/~.

cr! that we are interested

Consider a completely randomized design with five treatments selected at random, with six observa
tions per treatment and a=O.05. We \ish to determine the power of the testlf is equal to cT. Since
a 5,11 = 6, and = 02, we oay compute
w

a-;

a;

A~~1+(6)1 ~2.646.
From the operating characteristic curve with a-l :::::4, tV -a = 25 degrees of freedom and ct= 0,05,
we find that
{J = 0.20.
Therefore. the power is approrimately 0.80.
~~~

..

.....- -

126 SAMPLE CO:\1PUTER OUTPUT


Many computer packages can be implemented to carry out the analysis of variance for the
situations presented in this chapter. In this section, computer output from 1-linitab is
presented.

12-6 Sa'JOple Computer Output

349

computer Output for Hardwood Concentration Example


Reconsider Example 12-1, which investigates the effect of hardwood concentration on tensile strength. Using ANOVA in Minitab provides the following output.

Analysis of Variance
Source
DF
SS

Concen
; Error
: Total

382.79
130.17
S12.96

3
20
23

~or

TS

>is

127.60
6.51

19.61

0.000

Indi '\ridual 95% C:s :For Mean!


Based 0:1. Pooled 'StDev
Level

Mean

StDev

--+-----r"----+-~

5
10
15
20

6
6
6
6

10,QOO
15,667
17,000
21.167

2.828

(--*--)

...~-+-

(--*--)
(--*--)

2.805
1. 789
2.639

(--* .. _)
---+----+-----+----~-

Pooled StDev

2,551

10.0

15.0

20,0

25.0

The analysis of variance results are identical to those presented in Section 12-1.2. 11initab
also provides 95% confidence intervals for the means of each level of hardwood concentration using a pOOled estimate of the standard deviation, Interpretation of the confidence
intervals is straightforward. Factor levels with confidence lnterYals that do not overlap are
said to be significantly different.
A hetter indicator of significant differences is provided by confidence intervals based
on Tukey's test on pairwise differences, an option in MinitabV , The output provided is

Tukeyl s pairwise comparisons


error rate = 0.0500
Individual error rate
0.0111

Fa~ily

Critical

~~lue

IIntervals for
5

3.96

(cQla~

level mean) - (row level mean)

10

10

-9.791
-1. 542

15

-11.124
-2.876

-5,458

-15.291
-7.042

-9,624
-1.376

20

15

2,791
-8.291

-0.042

The (simultaneous) confidence intervals are easily interpreted, For example, the 95% confidence intenra} for the difference in mean tensile strength between 5% hardwood concentration and 10% hardwood concentration is (-9.791, -1.542), Since this confidence interval
does not contain the value 0, we conclude there is a significant difference between 5% and
10% hardwood concentrations. The remaining confidence intervals are interpreted simi~
larly, The results provided by M.in.itab' are identicil to those found in Section 12-2.2,

350

Chapter 12

The Analysis afYanancc

12-7 SGMMARY
This chapter has introduced design and analysis methods for experiments with a single factor. The importance of randomization in sing1e~factor experiments was emphasized. In a
completely randomized experiment, all runs are made in random order to balance out the
effects of unknown nuisance varia.bles. If a known nuisance variable can be contrOlled,
blocking can be used as a design alternative. The fixed-effects and random~effects models

of analysis of variance were presented. The primary difference bet\\'een the two models is
the inference space.1n the fixed-effects model inferences are valid only about thefactor levels specifically considered in the analysis) while in the random~effects model the conclu~
sions may be extended to the population of factor levels. Orthogonal contrasts and Tukey's
test were suggested for making comparisons between factor level means in the .:fixed~effects
experiment. A procedure was also given for estimating the variance components in a random-effects model. Residual analysis was introduced for checking the underlying assumptions of the analysis of variance.

12-8 EXERCISES
12--1. A study is conducted to determine the effect of

Observations

cuttir.g speed on the life (in hO":lrs) of a particular

?".acbine tooL Four leveis of cutting speed are selected


for the study with the following results:
Cuttir.g

41
42

43
36

Tool Life
33
39
45
34

34

38

34

34

36

37

36

38

36

40

40

39

36
35

33

35

(a) Does cutting speed affect tool life? Draw comparative box piots a.id perform an analysis of
variance.

(b) Plot average tool life agai.."lst cutting speed and


interpret the results.

125

2.7

160

4.9

200

4.6

(d) Find the residuals and examine them for model

4.6
3.4

2.6

3.0
4.2
3.5

3.2

3.8

3,6
4.1

4.2
5.1

5,0

2.9

(a) Does C~6 flow rate affect e~ch unifon:.rity1 Con-

struct box. plots to compare the fac~or levels and


perform the analysis of variance,
(b) Do the residuals indicate any problems v.i.th the
underlying assun:ptions?
tz..3. The compressive strength of concrete is being
studied, Fom: different mi.:x.ing techniques are being
investigated. The following data have been collected:

Mixing
TechrJque

(c) Use Tukey's tese to investigate Cifferences

between the individual levels of cutting speed.


Interp:et the results.

2
4,6

Compressive Stren..,c-th (psi)


3129

3000

2865

3200

3
4

2800

3300
2900
2700

2975
2985

2600

2600

2890
3150
3050
2765

inadequacy.

J2...2. I!l "Orthogonal Design for Process Optimization and Its Application to Plasma Etching'; (Solid
Sfate Technology, May 1987), G. Z. Ym:md D, W. lillie describe (1,'1 experiment to determine the effeet of
q,) flow rate on the ur..iformity of the etch on a silicon wafer used in integrated circuit manufacturing,
Three flow rates are used in the experiment., and the
resulting unifomrity (in percent) for six :eplicmes is as
follows:

Ca) Test the hypothesis that mixing techniques affect


the strength of the concrete. 1;se a=' 0.05.

Cb) Use Thkey's test to make comparisoos bem'eetl


pairs of means. Estimate tl::.e treatment effects.
12-4. A textile mill has a la..rge nurr:.ber of looms. Each
loom is supposed to provide the sam.e output of cloth
per minute. To investigat!! this assumption. fiv!! looms
are chosen at random and the!! output measured at dif~
ferent times. The following data.are obtained:

.....:...J

351

12-8 Exercises

Output (lb i miJ.<)

LoOIC
4.0
3,9
4,1
3,6
3,8

2
3
4
5

4.1

4.2

3.8
4,2
3,8
3.6

3.9
4,1

4,0

3.9

4,0
4,0
4,0
3,9
3,8

4,1
4,0
3,9
3,7
4,0

(a) Is this a:5xed- or randoIll~effects experiment? Ate


the looms similar in output?

{c) Compute a 95% interval estimate of the mean for


coating type 1. Compute a 99% interval estimate
of the mean difference bern'een coating types 1
and 4,
(d) Test all pairs of means using Tukey's test, with

a= 0,05,
(e Assu:ni.."1g t.l)ar. coating type 4 is currently in use,
what are your recommendations to the manufac~"'Cf? We wish to minimize conductivity.

(c) Estimate the experimental error variance.

12-7. The response time in milliseconds was determined fot three different types of clrccirs t:.sed in an
electronic calculator. The results a..-e recorded here:

(d) What is the probability of accepting Hr; if ~ is


four times the experimental error variance?

Circuit Type

('J) Estimate the .wability between looms.

(e) Analyze the residuals from this experiment and


check for model itmdquacy,

12-5. 1m experiment was run to determine whether


four specific :5ring tempetatu:res affect the density of a
cer.ain type of brick. The experiment led to the fol~
lowiDgda~:

21.8 21.9 21.7 21.6 21.7 21.5 21.8


21.7 21.4 21.5 21.5
21.9 21.8 21.8 21.6 21.5 21.7 21.8 21.7 21.6 21.8 -

(a) Does dle fixing temperature affect the density of


the bricks?
(b) Estirnate the components in the model,

(c) Analyze the residuals from the expe~ent.


12-6.An electro::tics eogineer is interested in the effect
on tube conductivity of .five different types of coating
for cathode ray tubes used in a telecommunications
system display device, The follov.i.ng conductivity
data are obtained:

3
4
5

2
3

22
21
15

20
33
18

18

25

27

4{)

26

17

(a) Test the hypothesis that the three circuit types


have the same response time.

means.

_-"C-=F)-'-_ _ _ _ _ _D=en=,=ityL...................... _ _

Response Time
19
20
16

(b) Lse Tuley's test to compare pairs of treatment

Temperatu...c
100
125
150

:43
152
134
129
147

141
149
133
127
148

150
137
132
132
1'4

146
143
127
129
142

(a) Is there any difference in conductivity due to coat~


ing type? Use <X= 0,05,
(b) Estimate the overall mean and the treatment
effects.

(c) Construct a set of Ort.1og0r..al contrasts. assuming


that at the outset of the experiment you suspected
the response time of circuit type 2 to be different

from the other two.


(d) What is the power of this test for detecting

-r;

L~., ler = 3.0?


{e) Analyze the residuals from this experiment.

12S.In "The Effect of Nozzle Design on the Stability


and Performance of Turbulent Water Jets" (Fire Safety
JOUT1'.aI. VoL 4, Angus! 1981), C. Theobald describes
an experiment in which a. shape fac:or was determined
for several different nozzle designs at different lev>!ls
of jet efflux velocity. Interest in this experimer.(
focuses primarily on nwil.e design, and velocit} is a
nuisance factor. The data are shown below:

1
2
3

0.78 0,80 0,81


0,85 0,85 0,92
0,93 0,92 0,95
1.14 0,97 0,98
0,97 0.86 0,78

0.75 0.77 0,78

0,86
0,89
0,88
0,76

0,81
0,89
0,86
0.76

0,83
0,33
0,83
0,75

(a) Does noz:ile type affect shape factor? Compare the


noZ<.1es using box plots and the analysis of variance.
{b) Use Tukey's test to determine specific differences
between the nozzles. Does a graph of average (or

352

Chapter 12 The Analysis of Variance

standard deviation) of shape factor versus nozzle


type assist -.vith the conclusions?
{c) Analyze:he residuals from this experiment.
12-9. In his bookDesigrt andAnalysis ajExperimen.ts
(2001), D. C. Montgomery describes an experiment to
determine the effect of four chemical agents on the
strength of a particular type of clo:h. Due to possible
variability from cloth to cloth. bolts of cloth are considered blocks. Five bolts are sclec:ed and all four
chemicals in random order are applied to each bolt.
The resulting tensile streng"..hs are
Bolt
4

68
67

74

71

67

75

72

70

73
73
75

68

78

73

73

71

15

75

68
69

Chemical
1

(a) Is there any dIfference 1."1 tensile sjre:tgth between


the chemicals?
(b) Use Tukcy's test to investigate sped..ic differ~

ences between the chemicals.


(c) Analyze the residuals itom L.":ris experin1ent.
12~10. Suppose that four normal populations have
COUll!J.On variance <f =25 a.."l.d means ,Il! = 50, ~ ::60.
iLl;;;:: 50, and Ji.4 """ 60. How ma.1Y observations should
be taken on each population so that the probability of
rejectiug the hypothesis of equality of means is at least
0,901 Use" = 0.05.

12~11. Suppose :hat five uonnal populations have


coUll!J.On variance IT =100 and means Ji.: = 175~ J.i: =:
190,)1., 160,)1., = 200, and)1., 215. How = y
observatiollS per population must be taken so :hat the
probability of rejecting the hypothesis of equ.allry of
means is at least 0.951 Use a= 0.01.

1,2..12. Consider testing the equality of ilie means of


two normal populations where t.'1e variances are
uokuo\\'n but assumed equal. The appropriate test
procedure is the two-sample Hest Show that ::he
two-sample t-test is equivalent to tl:e one-way~
classification analysis of variance.
12-13. Show that the variance of the linear combioa~
tion Z:~"[c;y,, is 02Z:;d!ni
12J4. In a fixed-effects moru;l. suppose that there ue
n observations for each of four treatments, Let~, ~,
and
be single-degree~of-freedom components for
r.h~ ortl;ogO;.1al cootrasts. Prove :hat SSl(tt!!l!Ienu = Q~ +

c7,

a;

Q;+ 12;.
12-15. Consider the data shown in Exercise 12-7.
(a) Write out the :east squares normal equations for
this problem., and solve them forfl and1;., making
the usual coostraint (Z:~'" Ji,-;; 0). Estimate 'C) - ; .
(b) Solve the equations in (a) using the constraint~:::;;
O. Are the estimators f, and fi the same as you
found in (a)? Why? Now estim.ate 1'[ ~ 't; and compare your answer with (a). '\\t'hat statement can
you make about estimating contrasts in the !)
(c) Estimate jl+ 1';. 21'] -11- 1";: and jl + 'C( + 12 using
the two solutions to the normal equations. Compa..-e L1e results obtained !n each case.

Chapter

13

Design of Experiments
with Several Factors
An experiment is just a test or a series of tests. Experiments are performed in all scientific
and engineering disciplines and are a major part of the discovery and learning process. The
conclusions that can be drawn from an experiment will depend. in part, on how the experiment was conducted and so the design of the experiment plays a major role in problem
solution, This chapter introduces experimental design techniques useful when several fa:~

tors are involved,

131 EXAMPLES OF EXPERIMENTAL DESIGN APPLICATIONS

.~.:t.ailii(i~l:fr
A Characterization Experiment

A development engineer is working on a new process for sol-

dering electronic components to printed circuit hoards. Specifically. he is working with a r.ew type of
flow solder machine that be hopes will reduce the number of defective solder joints. (A flow solder
machine preheats printed circuit boards and then rr..oves them into contact with a wave of liquid solder. This machi:1e makes all the electrical and most of me mechanical COllt.ections of the components
to the pr. ."1ted circuit board, Solder defects require touehup or rework. which adds cost and often dam~
.ages the boards.) The flow solder machine has several variables that the engineer can controI. They
are as follows:
1. Solder temperature
2.. Prebeat temperature
3. Conveyor speed
4. Flux type
5. Flu., specific gravity
6. Solder wave depth
7. Conveyor angle
In addition to t.iese controllable factors, there are several factors that cannot be easily controlled
once the machine enters routine manufacturing, ineluding the following:
1. Thickness of the printed circuit board
2. Types of components used on the board
3. Layout of the components of the board
4. Operator
5. En"irorunental factors
6. Production rate

Sometimes we call tl:Ie uncontrollable factors noise factors, A schematic representation of the
process is shOVf"U in Fig. lJ., L

353

354

Chapter 13 Design of Experiments with Several Factors


Controllable factors

11 r
Input

Process

Outpl..1:

(printed circuit boards)

(flow solder machine}

(defects, y)

ll-~
Z1

2'2

Zq

Uncontrollable {noise) factors

Figure 13~1 The flow solder experiment

In this sit'.lscion the engineer is interested in characterizing the flow solder m:lClrine; that is, he is interested in detemrining which factors (borb controllable and uncontrollable) affect the occurrence of
defects on the p.r:inted circuit boards, To accomplish this he can design an experiment that v;:ill enable
him to estimate the magnitude and direction of the factor effects, Sometimes we call an experiment
such as this a screening experiment The information from this characterization study Or screening
experiment can be used to identify the critical factors, to determine the direction of adjustr.::Jent for
these factors to reduce the n:nnber of defects, and to assist in determining which factors should be ca.re.
fully controlled dcring manufacturing to prevent high defect levels and erratic process perfo.rmance.

;Ekli~~:i3:~.J
An Optimiza.tion Experiment In a characterization experiment, we are interested in determining

which factors affect the response. A logical next step is to determine the region in the important factors that leads to an optimum response, For example. if the response i.,. yield, we would look for a
region of maxim1.ll:l yield, and if the response is cost, we would look for a region of minimum cost.
As an illustration. suppose that the yield of a chemical process is influenced by the operating
temperature and the reaction time. We are currently operating the process at 155F and 1.7 hours of
reaction time and experiencing yields around 75%. Figure 13-2 shows a view of the time-temperature space from above. In this graph we have connected points of constant yield with lines. These lines
are called contours. and we have shown the contours at 60%, 70%, 80%, 90%, and 95% yield. To
locate the optimU1l1, it is necessary to design an experiment that varies reaction time and temperature
together. This design is illu:strated in Fig. 13-2. The responses observed at the four points:in the experiment (145'F, 1.2 br), 045F, 2.2 br), (16S'F. 1.2 br), MId (165'1', 2.2 br) indicate that we should

move in the general direction of increased temperature and lower reaction time to increase yield. A
few additional IUIlS could be performed in this direction to locate the region of maximum yield.

These examples illustrate only two potential applications of experimental design methods. In the engineering environment~ experimental design applications are numerous. Some
potential areas of use are as follows:
1. Process troubleshootiI:g

2. Process development and optimization

3. Evaluation of :material alternatives


4. Reliability and life testing
5. Performance testing

13-2 Factorial Experiments

355

200

Path leading to region


of higher yield

CUrrent operating
conditions

160

150

140

0.5

1.0

1.5

2.0

2.5

TIme (hr)

Figure 13M2 Contour plot of yield as a function of reaction time and reaction temperature, illustrating an optimization experiment.

6. Product design configuration


7. Component tolerance detennination
Experimental design methods allow these problems to be solved efficiently during the
early stages of the pr6duct cycle. This has the potential to dramatically lower overall product cost and reduce development lead time.

13-2 FACTORIAL EXPERIMENTS


Vlben there are several factors of interest in an experiment, afactorial design should be
used. These are designs in which factors are varied together. Specifically, by a factorial
experiment we mean that in each complete trial or replicate of the experiment all possible
combinations of the levels of the factors are investigated. Thus, if there are two factors, A
and E, with a levels of factor A and b levels of factor E, then each replicate contains all ab
treatment combinations.
The effect of a factor is defined as the change in response produced by a change in the
level of the factor. This is called a main effect because it refers to the primary factors in the
study. For example, consider the data in Table 13-1. The main effect of factor A is the difference between the average response at the first level of A and the average response at the
second level of A, or

A_ 30 + 40 10+20_20
--2---2-- .

356

Chapter 13 Desig."l of Experirne;;.ts v,1.th Several Factors


Table 13-1

A Factorial Expcrimc:1.t with Two Factors


Factor B

B,

Factor .4

10

20

30

4()

That is. changing factor A from levell to level 2 causes an average response increase of 20
units. Similarly, the main effect of B is
B= 20+40 _10+30 =10.
2
2
In some experiments. the difference in response between the levels of one factor is not
the same at all levels of the other factors. V/hen this occurs. there is an interaction bet\Veen
the factors. For example, consider the data in Table 13-2. At the first level of factor B, the
A effect is
A =30-10 = 20,

and at the second level of factor B. the A effect is


A=O-20=-20.

Since the effect of A depends On the leVel chosen for factor B, there is interaction between
A andB.
Vt'hen an interaction is large, the corresponding main effects have little meaning. For
example, by using the data in Table B-2, we find the main effect of A to be
A

30+0_10+20 =0

22'
and we would be tempted to conclude that there is no A effect. However. when we exam~
ined the effects of A at different levels offactor B. we saw that this was not the case. The
effect of factor A depends on the levels of factor B. Thus, knowledge of the AB interaction
is more useful than knowledge of the main effects. A significant interaction can mask the
significance of main effects.
The concept of interaction can be illustrated grnphically. Figure 13-3 plats De data in
Table 13-1 against the levels of A for both levels of B. Note that the B, and B,lines are
roughly parallel, indicating that factors A and B do not interact significantly. Figure 13-4
plots the data in Table 13-2.ln this grnpb, the B, and B, lines are not parallel, indicating the
interaction between factors A and B. Sucb graphical displays are often useful in presenting
the results of experiments.
An alternative to the factorial design that is (unfortunately) used in practice is to change
the factors one at a time rntberthan to vary them simultaneously. To illustrate this one~factor~
at~a-time procedure, consider the optimization experiment described in Example 13-2, The
Table 13-2

A Factorial ExpCrir::.cllt with Inte::action

Factor B

B,

Faztor A
A,

10

A,

30

10

o
,
-~

:3-2 Factorial Experiments

357

:~

i 30~
\D

20 i-

~ 10~
O~

i ................

L ..

A,

~ ___ _

Factor A

Figure 13-3 Factorial experiment, no interaction.

50 -

4O~
~ 30~

~ 20~

8 10~
o~

8,
8,

<

81

8,
Figure 13..4 Factoria: experiment, with
interaction,

A,

A,

engineer is interested in finding the values of temperature and reaction time that maximize
yield, Suppose that we fix temperature at 155'F (the current operating level) and perform
five runs at different levels of time, say 0.5 hour, La hour, 1.5 hours, 2.0 hours, and 2.5
hours. The results of this series of runs are shown in Fig. 13-5. This figure indicates that
maximum yield is achieved at about 1.7 hours of reaction time. To optimize temperature, the
engineer fixes time ar 1.7 hours (rhe apparent optimum) and perfonns five runs at different
temperatures, say l40'F, 150'F, 160'F, 170'F, and 180'F. The results of this ser of runs are
plotted in Fig. 13-6. Maximum yield occurs at about 155F. Therefore, we would conclude
that running the process at 155'F and 1.7 hours is the bestset of operating conditions, resultingin yields around 75%,

BO

70

~
.",
:.;
>-

60

50

0.5

1.0

1.5

2.0

2.5

TIme (hr)

Jrrgw;e

13~5

Y"lCld versus reaction time with temperature constant at 155"P'

358

Design of Expe:imcnrs with Several Factors

Chaptc; 13

~
:Q

:;:

70

eo

Temperature (!JA

Figure 13~6 Yleld versus temperature with reaction time constant at 1.7 hr"

Figure 13-7 displays the contour plot of yield as a function of temperature and time
with the one-factor-at-a~time experiment shown on the contours. Clearly the one-factor-ata-time design has failed dramatically here, as the true optimum is at least 20 yield points
higher and occurs at much lower reaction times andbigher temperatures. The failure to discover the shorter reaction times is particularly important as this could have significant
impact on production volume or capacity, production planning, manufacturing cost. and
total productivity.
The one-factor-at-a-time method has failed here because it fails to detect the interaction between temperature and time, Factorial experiments are the only way to detect inter-

18D!

~m
0.

t-

150l

14G~
'-_;;,',;:" .--~~.:--.~.---.-.-L.----L......-.-

Oll

1.0

1.5

2.0

2.5

Time (hr)

Figure 1J..7 Optimization experiment using the

onc-:factor-at~a~time method.

13-3 Two-Factor Factorial Experiments

359

actions, Furthermore, the one-factor-at-a-time method is inefficientj it will require more


experimentation than a factorial, and as we have just seen, there is no assurance that it will
produce the correct results. The experiment shown in Fig. 13-2 that produced the informa~
tion pointing to the regiOD of the optimum is a simple example of a factorial experiment.

13-3 TWO-FACTOR FACTORIAL EXPERIMENTS


The simplest type of factorial experiment involves only two factorS. say A and B. There are
"levels offac1<lr A and b levels offactor B. The two-factor factorial is shown in Table 13-3.
Note that there are n replicates of the experiment and that each replicate contains all ab
Ireatment combinations. The observation in the ijth cell in the kth replicate is denoted 11Ft.'
In collecting the dar.a, the abn observations would be run in random order. Thus! like the
single-factor experiment srudiedln Chapter 12, the two-factor factorial is a completely randomized design.
The obse::vations =y be described by the linear statistical model

(13-1)

where J.1 is the overall mean effect.. -t: is the effect of the ith level of factor A,~, is the effect
of the jth level of factor S, (1:!llij is the effect of the interaction between A and S, and
is
a NID(O, a') (norma! and independently distributed) random error component. We are
interested in testing the hypotheses of no significant factor A effect, no significant factor B
effect, and no significant AB interaction. As with the single-factor experiments of Chapter
12, the analysis of variance will be llsed to test these hypotheses, Since there are two factors under study, the procedure used is called the two-way analysis of variance.

'il'

13-3.1 Statistical Analysis of the Fixed-Effects Model


Suppose that factors A and S are fixed. That is, the " levels of factor A and the b levels of
factor S are specifically chosen by the experimenter, and :nferences are co",';ned to these
levels only. In this model, it is customary to define the effects 'fi,!l" and (1:{3),. as deviations
from the mean, so that ::~, T, 0,
,!ll 0, ::~, (1:{3)ij = 0, anl:E:",('l'{3)i) '= 0.
Let y.:. denote the total of the observations under the itb level of factor A, let yo)' denote
the total of the observations under the jth level of factor B, let Iii' denote the total of the
obse::vations in the ijth cell of Table 13-3, and let y ... denote the grand total of all the

:;"

Table 13-3 Dat:! Arrangement for a Two-Factor Factorial Design


Factor B

Factor A

)'1 (1' )'\11' '''')'1:"

Yilt> )'12:1' , .. , )'l1ft

)'IW)';O:;- '''' )'11'"

)flU> Ym.,.

Yl'll>

Y2n' .. , Y:.t.!n

Y1.W)':ll>2' ,)l7m.

'11111

""J!:l:!

360

Chapter 13

Design of Ex-periments with Several Factors

observations. Define Yf ... YT'


grand averages. That is,

YIj" and y... as the correspondlng row. colllIIlD.. cell, and


.1
i::::: 1.2... ~a,

i:= 1.2..... a.

Yij. ::::;: LY(fk'

j=l.2, ... ,b,

k::::.l

(13-2)

Y...

Y... = LLLYij"

ahn

f=-\ i:zlk=l

The total corrected sum of squares may be written


a

"

~~L(Yil,-YJ

f::O.1"""1.:",,1
a

= LLLfCVI.-Y .. )+(Yj.
r"'l 1::::]k=-1

+ (YiJ.
a

-yJ

-Yio. -Y.I +Y... )+(Y,jk-Yi}.jf


b

=bnL(Yi.. -Y.) +anL(Y:! -Y:)


1",1

ab

(13-3)

1::::1

"aon,

+nLL(YJ}. -y, -Y.j+Y.,J + LLLlY,!k -Yij) .


1""1 j=-l

i=1 1",1 k><ol

Thus. the total sum of squares is partitioned into a sum of squares due to "rows," or factor
A (SSA)' a sum of squares due to "columns," or factor B (SS.), a sum of squares due to the
interaction between A and B (SSM). and a sum of squares due to error (SSE)' Notice that
there must be at least tv.'o replicates to obtain a nonzero error sum of squares.
The sum of squares identity in equation 13-3 may be written symbolically as
(13-4)

There are abn - 1 total degrees of freedom. The main effect<; A and B have a - I and
&-1 degrees of freedom, while the interaction effectAB has (a - 1)(b-1) degrees cffreedom. Within each of the ah cells in Table 13-3, there are n -I degrees of freedom between
the n replicates, and observations .in the same cell can differ only due to random error.
Therefore, there are ahen -1) degrees of freedom for error. The ratio of each sum of squares
on the right~hand side of equation 13-4 to its degrees of freedom is a mean square.
Assuming that factors A and B are fixed, the expected values of the mean squares are

13-3 Two-Factor F2.ctorial Experimcnt:5


a

E(MS
All

361

,
nL2JrJ3)~j
) ~ E(
SS,s
J
~ ,,2 +
(a-l)(b-I)
(a-l)(b-I) .
i=[ j=1

and

Therefore} to test He: 't;::;:; 0 (no row factor effects), Ho: /3,- -;::; 0 (no column factor effects), and

H,: (~fi),! ~ 0 (no interaction effects), we would divide'the corresponding mean square by
the mean square error, Each of these ratios will follow an F distribution with numerator
degrees of freedom equal to the number of degrees of freedom for the numerator mean
square and ab(n-I) denominator degrees of freedom, and the critical region will be located
in the upper tail, The test procedure is arranged in an analysis-of-variance table. such as is
shown in Table 13-4.
Computational fonnulas for the sums of squares in equation 13-4 are obtained easily,
The total sum of squares is computed from

SSy~

r.,.,

y2

LLLYi;, -~"'"
aim
1=1 j=l

(13-5;

.(:",1'

The sums of squares for main effects are


(13-6)

and
2

'\' Y,j.
y",
SSB -~.L,;
--.
}=1 an
abn

(13-7)

We usually calculate the SS" in two steps. First, we compute the sum of squares between
the ab cell totals, called the sum of squares due to Hsubtotals":
IJ

SSS\lbtO".,;:Us;;

b y~.

LL..JL:..b
nan
1",1 J""1.

Table 134 The .Analysis~of- Variance Table for the Two- Way~aassification Fixed~Effects Mode:

Mean

Sum of
Squares

Degrees of
Freedom

A treatments

SS,

a-I

MSA

B treatments

SS.

0-1

MS,

Interaction

SS"'S

Source of
Variation

Total

(a

1)(0 - I)

SS,

ab(n-l)

SST

abn-l

MSAfj

a-I
MS,

0-1

MS,

SSAS

(o-I)(b-I)

MS,:

MS,

i
362

Chapter 13

Design of Experi;;nentS with Several Factors

'This sum -of squares also contains 5Sft and SSs' Therefore, the second step is to compute
SSAS as
(13-8)

SS" = SS"""", - SSA - 5S8

The error sum of squares is found by subtraction as either


(13-9.)

or
(13-9b)

. E~pl.13,3.
Aircraft primer paints are applied to alUI1li.nilI!l. surfaces by two n:.ethods: dipping and spraying, The
purpose of the primer is to improve paint adhesion. Some par...s can be primed using either applicadon method and engineering is interested i'11eamiug whether th...--ee different primers differ in their
adhesio::::. properties. A factorial experiment is performed to investigate the effect of paint primer type
and application method on paint adhesion. 'Three speci.-.ner.s are painted with each primer using each
application method. a ilnish palnt applied, and the adhesion force measured. The data from the
experiment are shown in Table 13-5. The circIed nUDlbers in the cells are the cell totals Y,r . The sums
of squares required to perfow the analysis of variance are computed as follows:

= (4.01,2 +(4.5)' +.. + (5


0\' - (89.8)'
, . !
18 --1072
, ,

Ss.
,)'?'!5

"' ,

= ~YL_~
~

or.

abn.

= (28.7)' +(34,1)' +(27.0)'


6

(898)'=4.58

18

(89.8)'
3

18

4.58- 4.91 = 0.24.

and

10.72-4.58 -4.91-0.24 = 0.99.


The analysis of variance is summarized in Table 13~6, Sinee F'J,(r.;,i,l'2 =- 3,S9 and FC.O,..,11 = 4.75. we
..:onclude that the main effects of primer type and application method affect adhesion force. Fu..'tber
M

more, since 1.5 <: F O.05 ,2,12' there is no indication of interaction between these factors.

13-3 Two-Factor Fac:or'.J1l Experiments

TabJe 13--5

363

Adhesion Force Data for Example 13-3


Application Method

Primer 'JYpe

Dipping

4.0.4.5,4.3

5.6,4.9, 5.4

3,8,3.7,4,0

Y,.

40.2

Spraying

@
@
@

5.4,4.9,5,6
5.8,6,1,6.3
5.5,5.0,5.0

'I:..

@
@
@

49,6

28,7
34,]
27.0

89,8=y.,

T.b1.1;>.6 Analysis of Variance for Example 13--3

SoutCe of
Variation

Sum of
Squares

Degrees of

Mean

FreOOom

Square

Primer types
Application !Ilethods
Interaction
Error
Total

4581
4,909
0241
0,987
10,718

2
1
2
12

2,291
4,909
0,121
0,082

27.86
59.70
1.47

17

A graph of t.':1e cell adhesion force averages Yir versus the levels of primer type for each application
method is shown in Fig. 13--8. The absence ofintcraction is evident by the parallelism of the two lines.
Furthermore. since a large response indicates greater adhesion force, we conclude that spraying is a
superior application method and that primer type 2 is most effeetive.

Tests 011 Individual Mean. When borh factors are fixed, comparisons between rhe individual means of either factor may be made using Tukey's test. When there is no interaction;
these comparisons may be made using either the rOw averages Yi" or the column averages
Y.).. However. when interaction is significant, comparisons between the means of one factor
(say A) may be obscured by rhe AS interaction. In rhls case, we may apply Tukey's test to
the means of factor A, 'With factor B set at a particular level.

Yi}t
7,06.0

rf

5,Or
4.0~

~ng

~Ping

a,oc
2
Prlmertypa

Figure 13-& Graph of average adhesion force versus


primer types for Example 13-3,

364

Chapter 13

13-3.2

Design of Experiments with Several Facton:

Model Adequacy Cbecking


Just as in the single-factor experiments discussed in Chapter 12, the residuals from a facto"
rial experiment play an important role in assessing model adequacy, The residuals from a
two~factor factorial experiment are

That is~ the residuals are JUSt the difference between the observations and the corresponding
cell averages.
Table 13-7 presents the residuals for the aircraft primer paint data in Example 13-3.
The normal probability plot of these residuals is shown in Fig. 13-9. This plot has tails tbat
do not fall exacdy along a straight line passing through the center of the plot, indicating
some potential problems with the normality assumption, but the deviation from normality
does not appear severe. Figures 13-10 and 13-11 plot the residuals versus the levels of
primer types and application methods, respectively, There is some indication that primer
type 3 results in slightly lower variability in adhesion force than the other two primers. The
graph of residuals versus fitted values Yl/k;;;;; YU' in Fig. 13-12 reveals no unusual or diag~
nostic pattern.

13-3.3

One Observation per Cell


In some cases involving a two-factor factorial experiment, we may have only One replicate,
that is, only one observation per ceIL In this situation there are exactly as many parameters
Residuals for the Aircraft Primer Paint Experiment in Example 13-3

Table 13M7

Application Met.'-lod

Dippi.'g

Spraying

0.23, 0.03
0.30, -0.4{), 0.10
-0.03, -0.13, 0.17

0.10, -OAO. 0.30


-0.27, 0.03, 0.23
0.33, -iJ.17, -0.17

Primer Type
I
2
3

-iJ.27,

2.0.

1J
ZI

oJ

-1.0

-2.0

-0.5

-0.3

-0.1
SI}X'

+0.1

..J

+0.3

residual

Figure 13-9 Nonnal probability plot of the residuals from Example B-3.

13-3

Two~Factor

Factorial Experiments

365

Figure 13~10 Plot ofresiduals Yers"JS primer type.

+0.5

Or-------~~------~~~~-

Application method

-0.5

Figure 13~11 Plot of residuals versus application method.

.4

-0.5

Yijk

in the analysis-of-variance model as there are observations, and the error degrees offree~
dam is zero. Thus, it is not possible to test a hypothesis about the main effects and interac~
tians unless some additional assumptions are made. The usual assumption is to ignore the
interaction effect and use the interaction mean square as an error mean square. Thus the

analysis is equivalent to the analysis used in the randomized block design. This

366

Chapter 13 Design of Ex.periments with Several Factors


no~interaction assumption can be dangerous, and the experimenter should carefully exam~
ine the data and the residuals for indications that there really is interaction present. For more
details, see Montgomery (200 1),

133.4 The RandomEffects Model

So far we have considered the case where A and B are fixed factors. We now consider the
situation in which the levels of both factors are selected at random from larger populations
of factor levels, and we wish to extend our conclusions to the sampled population of factor
levels, The observations are represented by the model
~ i:::: 112, .. ,a.

~ Pj + ("Pl.j H ijk j

j:

1,2" ., b,
(1310)
,.k -1,2,. .. n,
where the parameters 1:i' pj (~f5)ii' and ti" are random variables. Specifically, we assume
that~i is NIO(O, a;), p) is NIO(O, ~), (o/!I} is NIO(O, 0".,), and e,jk is NIO(O, 0"), The vari
ance of any observation is
Yij' : !1 Hi

V(Yij,) ~

a; + ~ + a;p + 0",

and 0;.. ~ o;.~~ and (52 are called variance components. The hypothcses that we are inter~
ested in testing are H,: ~ ~ G, Ho: ~ ~ G, and Ho: 0"" ~ O. Notice the similarity to the one
way classification random-effects model,
The basic analysis of variance remains unchanged; that is, 8S". SSe. SSw SSI~ and SSE
are all calculated as in the fixed-effects case, To construct the test statistics. we mUIit examine the expected mean squares. They are
E(MS,) ~ 0" + no"., + bna;,
E(MS,) ~ 0" +

na:." + ana;,

E(MSA,) ~ 0" + na:p,

(1311)

and
E(MS,J = <1'.
~ote

from the expected mean squares that the appropriate statistic for testing H;;:

0"",: 0 is
(1312)

since under Ho both the numerator and denominator of Fo have expectation <f', and only if
He is false is E(MSAIl ) greater than E(MS,), The ratio Fo is distributed as F,,_ ""_ !),,~".IJ'
Similarly. for testing Ho: 0-; ::: 0, we would use

1'0 = MS,

(1313)

MSAB

which is distributed as F4j -l,(s-l;(b _ l) and for testing Ho:


E _ MS.
0- MS '

<if, =- 0, the Statistic is


(13.14)

An

which is distributed as F >""1,(0-:)(1;-1)' These are all upper-tail, one. .tail tests, Notice that these
test Statistics are not the sarne as those used if both factors A and B are fixed, The expected
mean squares are always used as a guide to test statistic construction.

13-3

Two-Factor Factorial E}."Periments

367

The variance components may be estimated by equating the observed mean squares to
the::r expected values and solving for the variance components, This yields
-2
(j

= MS"
(13-15)

Suppose that in Example 13-3, a large number of primers and several application methods could \:Ie
used. 'Three pri..'"Uers, say 1. 2. anc 3, were se:ected at rancom, as were :he two application methods,
Tne analysis of va.--iance assuming the rando:n effects :;node! is sho'W'D in Table 13-8.
Nerice that the first four colurr.ns.in the analysis ofvanance table are exactly as in Exa.'1lple 133, Xow. however. the F ratios are computed according to equations 13~12 through 13~14. Since
F&,~,1.l2::::: 3.89, we conclude that interaction is not sigrlficant. Also, since FI)IJ!,;';;::::: 19_0 and FM).I.? =
18,5, we conc!ude that both types and application. methods significantly affect adhesion force,
although primer type is just ba:ely significant at (j. = 0,05. The '1-ariance components may be o!Stimated
using equation 13-15 as follows:
,,' =0,08,

0,12-0,08 00133
= 3
"
_, 2,29-0,12
,~
6
",0,36,

v"
-2

"r

,,~

4,9: - 0, 12 = 0,53,

Clearly, :he tw'Q largest variance components are for prirr.er typeS (6';::::: 0.36) and application methods
0.53),

13-3.5 The Mixed Model


Now suppose that one of the factors, A, is fixed and the otherl B, is random, This is called
the mixed model analysis of variance. The linear model is
[i=I,2"",a,

f1 ~ ti + P; + (,Plil -;- Eil'lj = 1,2"",b,


k = l,2, ... ,n,

(13-16)

Table 1J..8 Analysis of Variance for EXar:lple 13-4


Source of
Variation

Sumo!
Squares

Primer typeS

4,58
4,91
0.24
0.99
10,72

Applicatioo methods
Interaction
Error

Total

Degrees of

Mean

Freedom

Sq:.:.are

F,

2.29
4.91
0,12
0,08

19,08
40,92
1.5

2
12

17

368

Chapter 13

Design of Ex.periments with Several Factors

In this model, 'li is a fixed effect defined such that 'L~ '" I 't, = 0, ~ Is a random effect, the inter~
action term ('f/3)ij is a random effect, and el;~ is a NID(O, a2) random error. Ie is also cusand that the interaction elements (rfJ!ij are nonnaJ
tomary to assurne that Ilj is NlD(O,
random variables "ith mean zero and variance l(a - 1)la]d'fJ' The interaction elements are
not all independent,
The expected mean squares in this case are

<i,,)

bnI ~?
E(.M:SA)=a2+/ra;jJ+

h:

a-I

2
E( MS. ) ~(f 2 +an(fp,

(13-17)

E(MSAB ) = 0'2 +na;',


and

Therefore, the appropriate test statistic for testing Ho: off = 0 is


(13-18)

whicb is distributed as F.. _1,(c-I)(I1- n' For testing He:

cr1 = 0, the test statistic is

MS
MSe

B
Fo="--,

(13-19)

which is distributed as Fb_l,ab(n_ 1)' Finally. for testing Ho:cr;p = 0, we would use
(13-20)

which is distributed as F(C_IXb_1J,;<'b(,,_I)'


The variance components ~ , dtjl. and rr may be estimated by eliminating the first
equation from eqcation 13-17t leaving three equations in three unknowns, the solutions of
which are

and
(13-21)

This general approach can be used to estimate the variance components in any mixed
model. After eliminating the mean squares containing :fixed factors, there will always be a
set of equations remaining that can be solved for the variance components, Table 13-9
summarizes the analysis of variance for the two~factor mixed model.

369

13-4 General Factorial Experb.en.ts


Table 139 Analysis ofVa.."iance for the }\No-Factor Mixed Model
Sou....-ce of
Variation

Squares

Rows {A.)

55,

Columns (B)

Interaction

Mean
Square

Expected
Mean Square

a-I

M5,

aZ+ncr2 +bnI>r"2 l{a~l)

55,

b-l

MSg

cr + ar.0'~

55"

(a -l)(b -1)

MSAB

r;;1 + nO'~

Error

SSE

ab(n -I)

MS,

a'

Tota1

SSr

abn-l

Sum of

Degrees of
Freedom

'"

'

F,

MSA
MSA,B

M5 p
M5,
MSA~

MS E

134 GENERAL FACTORL'l.L EXPERIMENTS


:Many experiments involve more than two factors. In this section we introduce the case
where there are a lev-els of faetor A, b levels of factor B, c levels of factor C, and so on,
arranged in a factorial experiment. In general. there will be abc ... n total observations, if
there are n replieates of the complete experiment,
For example, consider the thtee~:fuctor experiment with underlying model

YiJ'/

= !1 Hi + /3j +r, + ('1"/3)'1 + r)" +(J3r)J'

rj = 1,2, ... ,a,


j
+ ('I"/3r)'lk + E iikl 1 _l,!, ... ,b,

(1322)

k-l'~1"'jCj

.1 = 1.2

,n.

Assuming that A, B, and C are fixed, the analysis of varianee is shown in Table 13-10, Note
that there must be at least !\vo replicates (n ~ 2) to compute an error sum of squares. The
F -tests on main effects and interaetions follow directly from the expeeted mean squares.
Computing formulas for the sums of squares in Table 1310 are easily obtained, The

total sum of ,squares is, using the obvious "dot notation,


c

If

Ss,. = I,
I,I, :~::'5k/ i=1 i=l ko;.l1=:
aDen

(13-23)

The sum of squares for the main effects are eomputed from the totals for factors A (Yf"')'
BIy,),,), and ely.. ,.) as follows;

(13-24)

(13-25)
c

Y..k,
Y....
SSC _- ""
. . , - - - - -.
.1:=1 abn.
abcn

(13-26)

310

Chapter 13 Design of Experiments with Several Factors

To compute the two-factor interaction sums of squares, the totals for theA x B,A x C, and
B x C cells are needed. It may be helpful to collapse the original data table into three twoway tables in order to compute these totals. The sums of squares are

rr
11.

;""1 j=l

"2

,2
Yij.

c_L',-SSA -SSo
en
abcn

(13-27)

= SSw.hroroh(AlT; - SSA - SSB'

rr
11.

SSAC =

Yd. -~-SSA -SSe


i=l k""l bn
abcn

;; SSwblotalS{AC) -

(13-28)

SSA - SSe>

and

rr
b

SSnc =

J=l K""t

y ,j>' -~-SSB -SSe

an

abcn

(13-29)

= SS'.'''WCBC) - SSe -SSc


The three-factor interaction sum of squares is computed from the three"way cell totals Yijtas

Table 13-10 The Analysls~of-Variance Table for the Three-Factor Fixed-Effects Model
SOwceof
Variation

Sum of
Squares

--~-

Degrees of
Freedom

Mean
Square

Expected
Mean Squares

..

(1z

+ bcrir."t:

MS;.

a-t

MS,

acnI.j3;

MSn
MS,

SS,

a-I

MS,

SSe

b-I

MS.

CI

SSe

e-I

MS c

(12+atJ

AS

SSAB

(a - tj(b-i)

MS",

+~-

b-I

'nl; ,

AC

SSAC

(a-I)(e-I)

MS;.c

BC

SSnc

(b -I)(c - I)

MSBC

,4BC

SSABC

(a - I)(b -1)(c - 1)

MS;JJC

Error

SSE

cbc(n - i)

Total

Ss.,

aCch-i

MS r

It
c-I

c' +
01 +

CI

'

F,

0" +

rnlIP);j

MSc;_
MS,

MS",!>

(a-l)(b-I)

MS,

brIE{'ry)~~

MS AC

(a-l)(c-l)

MSc

arl:(Jl'f);.
'

MSec

(b-I)(,-I)
til.::IT.(-rj}y):jk

(a-l)(b-l)(c-1)

MSE
MSAlIC
MSe

a'

.J

13-4 Gen::ral Factorial Experiments

371

(13-30b)

The error sum of squares may be found by subtracting the sum of squares for each main
effect and interaction from the tOtal sum of squares, or by
(13-31)

A mechanical engineer is stUdying::he sw::face roughness of a part produced in a metal-cutting Qper~


arion. Three factors. feed rare (."1), depth of cut (B), and tool angle (C), are of interest. All three f<lctorS have been assigned two levels. and two replicates of a factorial design are ron. The coded data are
shown in Table 13-11. 'The t..'rree~way celJ totals Yijk' are circled in this ta.lJle,
The su.."IlS of sqW1.-'"Cs are ca:!culated as follows, using equatio1:.s 13~23 to 13-31:

y'1

a ,,2

SSA=I:!i.:.:..:...-~
[ .. I

ben

abcn

~ (75)' +(;02)'

(177)' ~ 45.5625
16
'

..

_ ~ )~ .. ~l
SSs- - , - - - ) .. 1 am
cben

(82)' -(95)'
g

SSAii :::::

b;(

(177)'
16

10.5625,

L: 2: Yuen" - Men
y:". -SSA - SSs
;"'! j .. j

(37)' +(38)' +(45)' .,(57)'


=

(177)'
16

45.5625 -10.5625

~7.5625,

=0.0625,
b

Y.... SSe- 5Sc


SSlJC -_""Y.jk.
..t......t....-----j ..

:.to:

an

abcn

+148\' (J77')'
='138"I +(44)' _ (4")'
"
I
.
10.5625-3.0625

16

372

Chapter 13 Design of Experiments \'lith Several Factors

SSE

:=

SST -SS:n.blotlh(ABC}

= 92.9375-73.4375 = 19.5000.
1M analysis of ..'afiance is summarized in Table 13-12. Feed rate has a significant effect on su.rface
finish (a: < 0,01). as does the depth of cut (0.05 < a < 0.10). There is some indication of a mild inter-

action betweet. these factors, as the Ftest for the AJ3 interaction is just less than the 10% critical value.

Obviously factorial experiments wit'" three or more factOTS are complicated and require
many runs, particularly if some of the factors have several (more than two) levels. This
leads us to consider a class of factorial designs with all factOrs at two levels. These designs
are extremely easy to set up and analyze, and as we will see, it is possible to greatly reduce
the number of expe~ental runs through the tecbnique of fractional replication.

Table lJ..ll

Coded Surface Roughness Data for Example 13-5

Depth of Cut (B)

Feed Rate (A)

0.Q25 Loch

0.040 inch

Toel Angle l c)

Tool Angle (C)

15'

25"

25"

11

10

10

II

10

10
13

12

12

16
14

38

44

47

48

20inJmln

30 in.lmln

B x Ctotals

15

A X B Totals )iF'

Yr"
75

102

177

A X CTotals }r'k'

0.025

0.040

MC

15

25

20

37

45

38
57

20
30

36

30

49

39
53

;,;./<.

82

95

y,,).

85

92

AlB

=, ....

'

13-5 The 2t: Factorial Design

373

Table 13-12 AnalYStS of Variance for Example 13~5


Sourc~of

Variation

Feed rate (A)


Depth of cut (B)
Tool angle (C)

AB
AC
BC
ABC
Error
Tow

Sum of
Squares

Degrees of

Mean

Freedom

Squa.-re

455625
105625
3.0625
75625
0.0625
1.5625

5.0625
19.5000
92.9315

1
S
15

45.5625
10.5625
3.0625
75625
0.0625
15625
5.0625
2.4375

F,

18.69"
4.33'
1.26
3,10

om
0.64

2.08

"Significanr at 1%.
!>S:;gni5cant at 10%.

13-5 THE 2^k FACTORIAL DESIGN

There are certain special types of factorial designs that are very useful. One of these is a factorial design with k factors, each at two levels. Because each complete replicate of the design has 2^k runs or treatment combinations, the arrangement is called a 2^k factorial design. These designs have a greatly simplified statistical analysis, and they also form the basis of many other useful designs.

13-5.1 The 2² Design

The simplest type of 2^k design is the 2²; that is, two factors, A and B, each at two levels. We usually think of these levels as the "low" and "high" levels of the factor. The 2² design is shown in Fig. 13-13. Note that the design can be represented geometrically as a square, with the 2² = 4 runs forming the corners of the square. A special notation is used to represent the treatment combinations. In general a treatment combination is represented by a series of lowercase letters. If a letter is present, then the corresponding factor is run at the high level

Figure 13-13  The 2² factorial design.



in that treatment combination; if it is absent, the factor is run at its low level. For example, treatment combination a indicates that factor A is at the high level and factor B is at the low level. The treatment combination with both factors at the low level is denoted (1). This notation is used throughout the 2^k design series. For example, the treatment combination in a 2^4 design with A and C at the high level and B and D at the low level is denoted ac.
The effects of interest in the 2² design are the main effects A and B and the two-factor interaction AB. Let (1), a, b, and ab also represent the totals of all n observations taken at these design points. It is easy to estimate the effects of these factors. To estimate the main effect of A we would average the observations on the right side of the square, where A is at the high level, and subtract from this the average of the observations on the left side of the square, where A is at the low level, or
A = (a + ab)/2n - (b + (1))/2n = (1/2n)[a + ab - b - (1)].    (13-32)

Similarly, the main effect of B is found by averaging the observations on the top of the square, where B is at the high level, and subtracting the average of the observations on the bottom of the square, where B is at the low level:

B = (b + ab)/2n - (a + (1))/2n = (1/2n)[b + ab - a - (1)].    (13-33)

Finally, the AB interaction is estimated by taking the difference in the diagonal averages in Fig. 13-13, or

AB = (ab + (1))/2n - (a + b)/2n = (1/2n)[ab + (1) - a - b].    (13-34)

The quantities in brackets in equations 13-32, 13-33, and 13-34 are called contrasts. For example, the A contrast is

Contrast_A = a + ab - b - (1).

In these equations, the contrast coefficients are always either +1 or -1. A table of plus and minus signs, such as Table 13-13, can be used to determine the sign of each treatment combination for a particular contrast. The column headings for Table 13-13 are the main effects A and B, the AB interaction, and I, which represents the total. The row headings are the treatment combinations. Note that the signs in the AB column are the products of signs from

Table 13-13  Signs for Effects in the 2² Design

Treatment          Factorial Effect
Combinations    I     A     B     AB
(1)             +     -     -     +
a               +     +     -     -
b               +     -     +     -
ab              +     +     +     +


colu."llOS A and B. To generate a contrast from this table, multiply the signs in the apprQpri~
ate column of Table 13-13 by the treatment combinations listed in the rows and add.
1b obtain the sums of squares for A, B, and AB, we can use equation 12-18; which
expresses the relationship between a sing1e-degree-of~freedom contrast and its sum of
squares:
(ContrdS t

SS =
o

l"

".

L (contrast coefficientst

Therefore, the sums of squares for A, B, and AB are

SS_A = [a + ab - b - (1)]²/4n,
SS_B = [b + ab - a - (1)]²/4n,
SS_AB = [ab + (1) - a - b]²/4n.

The analysis of variance is completed by computing the total sum of squares SS_T (with 4n - 1 degrees of freedom) as usual, and obtaining the error sum of squares SS_E [with 4(n - 1) degrees of freedom] by subtraction.

Example 13-6
An article in the AT&T Technical Journal (March/April 1986, Vol. 65, p. 39) describes the application of two-level experimental designs to integrated circuit manufacturing. A basic processing step in this industry is to grow an epitaxial layer on polished silicon wafers. The wafers are mounted on a susceptor and positioned inside a bell jar. Chemical vapors are introduced through nozzles near the top of the jar. The susceptor is rotated and heat is applied. These conditions are maintained until the epitaxial layer is thick enough.
Table 13-14 presents the results of a 2² factorial design with n = 4 replicates using the factors A = deposition time and B = arsenic flow rate. The two levels of deposition time are - = short and + = long, and the two levels of arsenic flow rate are - = 55% and + = 59%. The response variable is epitaxial layer thickness (μm). We may find the estimates of the effects using equations 13-32, 13-33, and 13-34 as follows:

A = (1/2n)[a + ab - b - (1)]
  = (1/(2·4))[59.299 + 59.156 - 55.686 - 56.081] = 0.836,

Table 13-14  The 2² Design for the Epitaxial Process Experiment

Treatment        Factorial Effect
Combinations     A    B    AB    Thickness (μm)                          Total     Average
(1)              -    -    +     14.037, 14.165, 13.972, 13.907          56.081    14.021
a                +    -    -     14.821, 14.757, 14.843, 14.878          59.299    14.825
b                -    +    -     13.880, 13.860, 14.032, 13.914          55.686    13.922
ab               +    +    +     14.888, 14.921, 14.415, 14.932          59.156    14.789


B = (1/2n)[b + ab - a - (1)]
  = (1/(2·4))[55.686 + 59.156 - 59.299 - 56.081] = -0.067,

AB = (1/2n)[ab + (1) - a - b]
   = (1/(2·4))[59.156 + 56.081 - 59.299 - 55.686] = 0.032.


The numerical estimates of the effects indicate that the effect of deposition time is large and has a positiYe direction (increasing deposition time increases thickness), since changing deposition time from
low to high changes the mean epitaxial layer thickness by 0,836 pm. The effects of arsenic flow rate
(B) and the AB interaction appear small.
The magnitude of these effects may be eonfi'llled with the analysis of variance, The sums of

squares for A, E, andA,B are computed using equation 13-35:


(Contrast)'
n4

SS=-- -,
SS =

ra+ab-b-(I)l'
16

<

= [6.688]2
16
~2.7956,

,[b_+_ab_-_"_-l-'(llCL.l'

SS.=-

<

16

[-{).5381'
16
= 0.0181,

[ab+(I)-a-bj'
SSAB

16

_ [O.256f
-

16

=0.0040.

The analysis of variance is summarized in Table 13-15. This confirms our conclusions obtained by examining the magnitude and direction of the effects; deposition time affects epitaxial layer thickness, and from the direction of the effect estimates we know that longer deposition times lead to thicker epitaxial layers.

Table 13-15  Analysis of Variance for the Epitaxial Process Experiment

Source of              Sum of     Degrees of    Mean
Variation              Squares    Freedom       Square     F_0
A (deposition time)    2.7956     1             2.7956     134.50
B (arsenic flow)       0.0181     1             0.0181     0.87
AB                     0.0040     1             0.0040     0.19
Error                  0.2495     12            0.0208
Total                  3.0672     15


Residual Analysis  It is easy to obtain the residuals from a 2^k design by fitting a regression model to the data. For the epitaxial process experiment, the regression model is

y = β_0 + β_1 x_1 + ε,

since the only active variable is deposition time, which is represented by x_1. The low and high levels of deposition time are assigned the values x_1 = -1 and x_1 = +1, respectively. The fitted model is

ŷ = 14.389 + (0.836/2) x_1,

where the intercept β̂_0 is the grand average of all 16 observations (ȳ) and the slope β̂_1 is one-half the effect estimate for deposition time. The reason the regression coefficient is one-half the effect estimate is that regression coefficients measure the effect of a unit change in x_1 on the mean of y, and the effect estimate is based on a two-unit change (from -1 to +1).

This model can be used to obtain the predicted values at the four points in the design. For example, consider the point with low deposition time (x_1 = -1) and low arsenic flow rate. The predicted value is

ŷ = 14.389 + (0.836/2)(-1) = 13.971 μm,

and the residuals would be

e_1 = 14.037 - 13.971 = 0.066,
e_2 = 14.165 - 13.971 = 0.194,
e_3 = 13.972 - 13.971 = 0.001,
e_4 = 13.907 - 13.971 = -0.064.

It is easy to verify that for low deposition time (x_1 = -1) and high arsenic flow rate, ŷ = 14.389 + (0.836/2)(-1) = 13.971 μm, and the remaining residuals are

e_5 = 13.880 - 13.971 = -0.091,
e_6 = 13.860 - 13.971 = -0.111,
e_7 = 14.032 - 13.971 = 0.061,
e_8 = 13.914 - 13.971 = -0.057,

that for high deposition time (x_1 = +1) and low arsenic flow rate, ŷ = 14.389 + (0.836/2)(+1) = 14.807 μm, they are

e_9 = 14.821 - 14.807 = 0.014,
e_10 = 14.757 - 14.807 = -0.050,
e_11 = 14.843 - 14.807 = 0.036,
e_12 = 14.878 - 14.807 = 0.071,

and that for high deposition time (x_1 = +1) and high arsenic flow rate, ŷ = 14.389 + (0.836/2)(+1) = 14.807 μm, they are

e_13 = 14.888 - 14.807 = 0.081,
e_14 = 14.921 - 14.807 = 0.114,
e_15 = 14.415 - 14.807 = -0.392,
e_16 = 14.932 - 14.807 = 0.125.


A normal proba1:>ility plot of these residuals is shown in Fig. 13-14. This plot indicates
that one residUal, e;s =:: 0.392, is an outlier. Examining the four runs with high deposition
time and high arsenic flow rate reveals that observ-ation Y15 =:: 14.415 is considerably smaller
than the other three observations at that treatment combination_ This adds some additional
evidence ro the tentative conclusion that observari on 15 is an outlier. Another possibility is
that there are some process variables that affect the variability in epitaxial layer thickness,
and if we could discover which variables produce this effect, then it might be possible to
adjust these variables to levels that would minimize the variability in epitaXial layer thick~
ness. This would have important implications in subsequent manufacturing stages. Figures
13-15 and 13-16 are plots ofresiduals versus deposition time and arsenic flow rate, respec
tively. Apart from the unusually large residual associated with YU' there is no strong evidence that either deposition time or arsenic flow rate int1uences the variability in epitaxial.
layer thickness.
Figure 1317 shows the estimated St:llldard deviation of epitaxial layer thickness at all
four runs in the 21 design. These standard deviations were calculated using the data in
Table 13-14. Notice that the standard deviation of tilefour observations with A and B at the
high level is considen:bly larger than the standard deviations at any of the other three design

Figure 13-14  Normal probability plot of residuals for the epitaxial process experiment.

Figure 13-15  Plot of residuals versus deposition time.

Figure 13-16  Plot of residuals versus arsenic flow rate.

Figure 13-17  The estimated standard deviations of epitaxial layer thickness at the four runs in the 2² design: s = 0.110 at (1), s = 0.051 at a, s = 0.077 at b, and s = 0.250 at ab.

points. Most of this difference is attributable to the unusually low thickness measurement associated with y_15. The standard deviation of the four observations with A and B at the low levels is also somewhat larger than the standard deviations at the remaining two runs. This could be an indication that there are other process variables not included in this experiment that affect the variability in epitaxial layer thickness. Another experiment to study this possibility, involving other process variables, could be designed and conducted (indeed, the original paper shows that there are two additional factors, unconsidered in this example, that affect process variability).

13-5.2 The 2^k Design for k ≥ 3 Factors

The methods presented in the previous section for factorial designs with k = 2 factors each at two levels can be easily extended to more than two factors. For example, consider k = 3 factors, each at two levels. This design is a 2³ factorial design, and it has eight treatment combinations. Geometrically, the design is a cube as shown in Fig. 13-18, with the eight runs forming the corners of the cube. This design allows three main effects to be estimated (A, B, and C) along with three two-factor interactions (AB, AC, and BC) and a three-factor interaction (ABC).
The main effects can be estimated easily. Remember that (1), a, b, ab, c, ac, bc, and abc represent the total of all n replicates at each of the eight treatment combinations in the design. Referring to the cube in Fig. 13-18, we would estimate the main effect of A by averaging the four treatment combinations on the right side of the cube, where A is at the high


be

~----------~~abc

Be

,--:err-

e r''---T-,
I
I
I
I

bi __

~~_

------

..,- ... ..,-'


-1 :...

( 1 ) " " ' - - - - -. .

ab

+1
/

-/a

-1

Figure 1318 The 2' desigu.

level, and subtracting from that quantity the average of the four treatment combinations on the left side of the cube, where A is at the low level. This gives

A = (1/4n)[a + ab + ac + abc - b - c - bc - (1)].    (13-36)

In a similar manner the effect of B is the average difference of the four treatment combinations in the back face of the cube and the four in the front, or

B = (1/4n)[b + ab + bc + abc - a - c - ac - (1)],    (13-37)

and the effect of C is the average difference between the four treatment combinations in the top face of the cube and the four in the bottom, or

C = (1/4n)[c + ac + bc + abc - a - b - ab - (1)].    (13-38)

Now consider the two-factor interaction AB. When C is at the low level, AB is just the average difference in the A effect at the two levels of B, or

AB(C low) = (1/2n)[ab - b] - (1/2n)[a - (1)].

Similarly, when C is at the high level, the AB interaction is

AB(C high) = (1/2n)[abc - bc] - (1/2n)[ac - c].

The AB interaction is just the average of these two components, or

AB = (1/4n)[ab + (1) + abc + c - b - a - bc - ac].    (13-39)
Using a similar approach, we can show that the AC and BC interaction effect estimates are as follows:

AC = (1/4n)[ac + (1) + abc + b - a - c - ab - bc],    (13-40)

BC = (1/4n)[bc + (1) + abc + a - b - c - ab - ac].    (13-41)


The ABC interaction effect is the average difference between the AB interaction at the two levels of C. Thus

ABC = (1/4n){[abc - bc] - [ac - c] - [ab - b] + [a - (1)]}
    = (1/4n)[abc - bc - ac + c - ab + b + a - (1)].    (13-42)

The quantities in brackets in equations 13-36 through 13-42 are contrasts in the eight treatment combinations. These contrasts can be obtained from a table of plus and minus signs for the 2³ design, shown in Table 13-16. Signs for the main effects (columns A, B, and C) are obtained by associating a plus with the high level of the factor and a minus with the low level. Once the signs for the main effects have been established, the signs for the remaining columns are found by multiplying the appropriate preceding columns row by row. For example, the signs in column AB are the product of the signs in columns A and B.
Table 13-16 has several interesting properties.

1. Except for the identity column I, each column has an equal number of plus and minus signs.
2. The sum of products of signs in any two columns is zero; that is, the columns in the table are orthogonal.
3. Multiplying any column by column I leaves the column unchanged; that is, I is an identity element.
4. The product of any two columns yields a column in the table; for example, A x B = AB and AB x ABC = A²B²C = C, since any column multiplied by itself is the identity column.

The estimate of any main effect or interaction is determined by multiplying the treatment combinations in the first column of the table by the signs in the corresponding main effect or interaction column, adding the result to produce a contrast, and then dividing the contrast by one-half the total number of runs in the experiment. Expressed mathematically,

Effect = Contrast / (n2^(k-1)).    (13-43)

The sum of squares for any effect is

SS = (Contrast)² / (n2^k).    (13-44)
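The sign table and equations 13-43 and 13-44 are easy to mechanize. The following Python sketch (our own illustration, not from the text; the helper name is an assumption) builds the columns of Table 13-16, shown below, by multiplying factor columns, and checks the properties listed above.

import numpy as np
from itertools import product

labels = ["A", "B", "C"]
# Runs in standard order (1), a, b, ab, c, ac, bc, abc: A varies fastest
runs = np.array([tuple(reversed(r)) for r in product([-1, 1], repeat=3)])
effects = ["A", "B", "AB", "C", "AC", "BC", "ABC"]
cols = {e: np.prod(runs[:, [labels.index(f) for f in e]], axis=1) for e in effects}

# Properties 1 and 2: equal numbers of + and - signs, and mutual orthogonality
assert all(cols[e].sum() == 0 for e in effects)
assert all(cols[u] @ cols[v] == 0 for u in effects for v in effects if u != v)

def effect_and_ss(totals, column, n, k=3):
    """Equations 13-43 and 13-44 applied to one column of the sign table."""
    contrast = float(column @ np.asarray(totals))
    return contrast / (n * 2 ** (k - 1)), contrast ** 2 / (n * 2 ** k)

For instance, effect_and_ss(totals, cols["AB"], n) returns the AB effect and its sum of squares for any vector of treatment totals listed in standard order.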

Table 13-16  Signs for Effects in the 2³ Design

Treatment          Factorial Effect
Combinations    I    A    B    AB    C    AC    BC    ABC
(1)             +    -    -    +     -    +     +     -
a               +    +    -    -     -    -     +     +
b               +    -    +    -     -    +     -     +
ab              +    +    +    +     -    -     -     -
c               +    -    -    +     +    -     -     +
ac              +    +    -    -     +    +     -     -
bc              +    -    +    -     +    -     +     -
abc             +    +    +    +     +    +     +     +


Example 13-7
Consider the surface-roughness experiment described originally in Example 13-5. This is a 2³ factorial design in the factors feed rate (A), depth of cut (B), and tool angle (C), with n = 2 replicates. Table 13-17 presents the observed surface-roughness data.

The main effects may be estimated using equations 13-36 through 13-42. The effect of A is, for example,

A = (1/4n)[a + ab + ac + abc - b - c - bc - (1)]
  = (1/(4·2))[22 + 27 + 23 + 30 - 20 - 21 - 18 - 16]
  = (1/8)[27] = 3.375,
and the sum of squares for A is found using equation 13-44:

SS_A = (Contrast_A)²/(n2³) = (27)²/(2·8) = 45.5625.
It is easy to verify that the other effects are

B = 1.625,
C = 0.875,
AB = 1.375,
AC = 0.125,
BC = -0.625,
ABC = 1.125.

From examining the magnitude of the effects, clearly feed rate (factor A) is dominant, followed by depth of cut (B) and the AB interaction, although the interaction effect is relatively small. The analysis of variance is summarized in Table 13-18, and it confirms our interpretation of the effect estimates.

Table 13-17  Surface Roughness Data for Example 13-7

Treatment         Design Factors      Surface
Combinations      A     B     C       Roughness    Totals
(1)               -1    -1    -1      9, 7         16
a                 1     -1    -1      10, 12       22
b                 -1    1     -1      9, 11        20
ab                1     1     -1      12, 15       27
c                 -1    -1    1       11, 10       21
ac                1     -1    1       10, 13       23
bc                -1    1     1       10, 8        18
abc               1     1     1       16, 14       30


Table 13-18  Analysis of Variance for the Surface-Finish Experiment

Source of      Sum of     Degrees of    Mean
Variation      Squares    Freedom       Square     F_0
A              45.5625    1             45.5625    18.69
B              10.5625    1             10.5625    4.33
C              3.0625     1             3.0625     1.26
AB             7.5625     1             7.5625     3.10
AC             0.0625     1             0.0625     0.03
BC             1.5625     1             1.5625     0.64
ABC            5.0625     1             5.0625     2.08
Error          19.5000    8             2.4375
Total          92.9375    15

Other Methods for Judging Significance of Effects  The analysis of variance is a formal way to determine which effects are nonzero. There are two other methods that are useful. In the first method, we can calculate the standard errors of the effects and compare the magnitudes of the effects to their standard errors. The second method uses normal probability plots to assess the importance of the effects.
The standard error of an effect is easy to find. If we assume that there are n replicates at each of the 2^k runs in the design, and if y_i1, y_i2, ..., y_in are the observations at the ith run (design point), then
S_i² = (1/(n-1)) Σ_{j=1}^{n} (y_ij - ȳ_i)²,    i = 1, 2, ..., 2^k,

is an estimate of the variance at the ith run, where ȳ_i = Σ_{j=1}^{n} y_ij / n is the sample mean of the n observations. The 2^k variance estimates can be pooled to give an overall variance estimate,

S² = (1/(2^k(n-1))) Σ_{i=1}^{2^k} Σ_{j=1}^{n} (y_ij - ȳ_i)²,    (13-45)

where we have obviously assumed equal variances for each design point. This is also the variance estimate given by the mean square error from the analysis of variance procedure. Each effect estimate has variance given by

V(Effect) = V[Contrast / (n2^(k-1))] = [1/(n2^(k-1))²] V(Contrast).

Each contrast is a linear combination of 2^k treatment totals, and each total consists of n observations. Therefore,

V(Contrast) = n2^k σ²,

and the variance of an effect is

V(Effect) = n2^k σ² / (n2^(k-1))² = σ² / (n2^(k-2)).    (13-46)



The estimated standard error of an effect would be found by replacing

s' and taking the square root of equation 13-46,

rr with its estimate

To illustrate for the surface-roughness experiment, we find that S' = 2.4375 and the
standard error of each estimated effect is
s,e,(Effect) =

~;;

= ,} 2, ~3-2 (2.4375)
=0,78,
Therefore two standard deviation limits on the effect estimates are

A: 3.375 ± 1.56,
B: 1.625 ± 1.56,
C: 0.875 ± 1.56,
AB: 1.375 ± 1.56,
AC: 0.125 ± 1.56,
BC: -0.625 ± 1.56,
ABC: 1.125 ± 1.56.

These intervals are approximate 95% confidence intervals. They indicate that the two main effects, A and B, are important, but that the other effects are not, since the intervals for all effects except A and B include zero.
Normal probability plots can also be used to judge the significance of effects. We will illustrate that method in the next section.
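A short sketch of this computation (our own, not from the text) pools the eight within-run variances for the surface-roughness data (equation 13-45) and forms the two-standard-error limits:

import numpy as np

# Replicates at the 2^3 runs in standard order (Table 13-17)
y = np.array([[9, 7], [10, 12], [9, 11], [12, 15],
              [11, 10], [10, 13], [10, 8], [16, 14]], dtype=float)
k, n = 3, y.shape[1]
s2 = ((y - y.mean(axis=1, keepdims=True)) ** 2).sum() / (2 ** k * (n - 1))  # eq. 13-45
se = np.sqrt(s2 / (n * 2 ** (k - 2)))                                       # eq. 13-46
print(s2, round(se, 2))   # 2.4375 0.78

effects = {"A": 3.375, "B": 1.625, "C": 0.875, "AB": 1.375,
           "AC": 0.125, "BC": -0.625, "ABC": 1.125}
for name, est in effects.items():
    print(name, est, "+/-", round(2 * se, 2), "excludes zero:", abs(est) > 2 * se)
# Only A (3.375) and B (1.625) exceed the 1.56 half-width.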

Projection of 2^k Designs  Any 2^k design will collapse or project into another 2^k design in fewer variables if one or more of the original factors are dropped. Sometimes this can provide additional insight into the remaining factors. For example, consider the surface-roughness experiment. Since factor C and all its interactions are negligible, we could eliminate factor C from the design. The result is to collapse the cube in Fig. 13-18 into a square in the A-B plane; however, each of the four runs in the new design has four replicates. In general, if we delete h factors so that r = k - h factors remain, the original 2^k design with n replicates will project into a 2^r design with n2^h replicates.
Residual Analysis  We may obtain the residuals from a 2³ design by using the method demonstrated earlier for the 2² design. As an example, consider the surface-roughness experiment. The three largest effects are A, B, and the AB interaction. The regression model used to obtain the predicted values is

y = β_0 + β_1 x_1 + β_2 x_2 + β_12 x_1 x_2,

where x_1 represents factor A, x_2 represents factor B, and x_1 x_2 represents the AB interaction. The regression coefficients β_1, β_2, and β_12 are estimated by one-half the corresponding effect estimates, and β_0 is the grand average. Thus

ŷ = 11.0625 + (3.375/2) x_1 + (1.625/2) x_2 + (1.375/2) x_1 x_2,

and the predicted values would be obtained by substituting the low and high levels of A and B into this equation. To illustrate, at the treatment combination where A, B, and C are all at the low level, the predicted value is

ŷ = 11.0625 + (3.375/2)(-1) + (1.625/2)(-1) + (1.375/2)(-1)(-1) = 9.25.

The observed values at this run are 9 and 7, so the residuals are 9 - 9.25 = -0.25 and 7 - 9.25 = -2.25. Residuals for the other seven runs are obtained similarly.
A normal probability plot of the residuals is shown in Fig. 13-19. Since the residuals lie approximately along a straight line, we do not suspect any severe nonnormality in the data. There are no indications of severe outliers. It would also be helpful to plot the residuals versus the predicted values and against each of the factors A, B, and C.
Yates' Algorithm for the 2^k  Instead of using the table of plus and minus signs to obtain the contrasts for the effect estimates and the sums of squares, a simple tabular algorithm devised by Yates can be employed. To use Yates' algorithm, construct a table with the treatment combinations and the corresponding treatment totals recorded in standard order. By standard order, we mean that each factor is introduced one at a time by combining it with all factor levels above it. Thus for a 2², the standard order is (1), a, b, ab, while for a 2³ it is (1), a, b, ab, c, ac, bc, abc, and for a 2⁴ it is (1), a, b, ab, c, ac, bc, abc, d, ad, bd, abd, cd, acd, bcd, abcd. Then follow this four-step procedure:

1. Label the adjacent column [1]. Compute the entries in the top half of this column by adding the observations in adjacent pairs. Compute the entries in the bottom half of this column by changing the sign of the first entry in each pair of the original observations and adding the adjacent pairs.
2. Label the adjacent column [2]. Construct column [2] using the entries in column [1]. Follow the same procedure employed to generate column [1]. Continue this process until k columns have been constructed. Column [k] contains the contrasts designated in the rows.
3. Calculate the sums of squares for the effects by squaring the entries in column [k] and dividing by n2^k.
4. Calculate the effect estimates by dividing the entries in column [k] by n2^(k-1).

Figure 13-19  Normal probability plot of residuals from the surface-roughness experiment.



Table 13-19  Yates' Algorithm for the Surface-Roughness Experiment

Treatment                                           Sum of Squares    Effect Estimate
Combinations    Response    [1]    [2]    [3]    Effect    [3]²/(n2³)    [3]/(n2²)
(1)             16          38     85     177    Total
a               22          47     92     27     A         45.5625       3.375
b               20          44     13     13     B         10.5625       1.625
ab              27          48     14     11     AB        7.5625        1.375
c               21          6      9      7      C         3.0625        0.875
ac              23          7      4      1      AC        0.0625        0.125
bc              18          2      1      -5     BC        1.5625        -0.625
abc             30          12     10     9      ABC       5.0625        1.125

Example 13-8
Consider the surface-roughness experiment in Example 13-7. This is a 2³ design with n = 2 replicates. The analysis of this data using Yates' algorithm is illustrated in Table 13-19. Note that the sums of squares computed from Yates' algorithm agree with the results obtained in Example 13-7.
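Yates' four-step procedure translates directly into code. The sketch below (our own illustration, not from the text) reproduces the contrasts, effects, and sums of squares of Table 13-19.

def yates(totals, n):
    """Yates' algorithm for a 2^k factorial given totals in standard order."""
    k = (len(totals) - 1).bit_length()           # len(totals) = 2^k
    col = list(totals)
    for _ in range(k):
        half = len(col) // 2
        col = ([col[2 * i] + col[2 * i + 1] for i in range(half)] +    # sums
               [col[2 * i + 1] - col[2 * i] for i in range(half)])     # differences
    effects = [c / (n * 2 ** (k - 1)) for c in col]   # entry 0 is the grand-total row
    ss = [c ** 2 / (n * 2 ** k) for c in col]
    return col, effects, ss

totals = [16, 22, 20, 27, 21, 23, 18, 30]   # (1), a, b, ab, c, ac, bc, abc
contrasts, effects, ss = yates(totals, n=2)
print(contrasts)    # [177, 27, 13, 11, 7, 1, -5, 9]
print(effects[1:])  # [3.375, 1.625, 1.375, 0.875, 0.125, -0.625, 1.125]
print(ss[1:])       # [45.5625, 10.5625, 7.5625, 3.0625, 0.0625, 1.5625, 5.0625]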

13-5.3 A Single Replicate of the 2^k Design

As the number of factors in a factorial experiment grows, the number of effects that can be estimated grows also. For example, a 2⁴ experiment has 4 main effects, 6 two-factor interactions, 4 three-factor interactions, and 1 four-factor interaction, while a 2⁶ experiment has 6 main effects, 15 two-factor interactions, 20 three-factor interactions, 15 four-factor interactions, 6 five-factor interactions, and 1 six-factor interaction. In most situations the sparsity of effects principle applies; that is, the system is usually dominated by the main effects and low-order interactions. Three-factor and higher interactions are usually negligible. Therefore, when the number of factors is moderately large, say k ≥ 4 or 5, a common practice is to run only a single replicate of the 2^k design and then pool or combine the higher-order interactions as an estimate of error.

.Exam;,l" 1:>:9
An article in Solid State Technology ("Orthogonal Design for Process Optim.i:z.ation and its Application
in Plasma Etchir.g," May 1987, p. 127) describes the application of factorial designs in developing a
nitride etc]:: process ou a single-wafer plasma etcher. The process uses y,., as the reactant gas. Iris possible to 'vary the gaz flow, the power applied to the cathode, the pressure in the reactor chamber, and the
spacing berwee!l the ::mode azd the cathode (gap). Several response variables would usually be of i:.tercst in this process, but in this exa.:mple we will concentrate on etch rate for silicon ni::ride.
We will use a single replicate of a r design to i.;:.vestigate this process. Since it is un!ikcly that
the three-factor and four-factor intera.ctions are significant, we will tentatively plan to combine diem
as an estim.ate of error. The factor levels nsed in the design are shown here:
            Design Factor
            A           B            C            D
Level       Gap         Pressure     C2F6 Flow    Power
            (cm)        (mTorr)      (SCCM)       (W)
Low (-)     0.80        450          125          275
High (+)    1.20        550          200          325


Table 13-20 presents the data from the 16 runs of the 2⁴ design. Table 13-21 is the table of plus and minus signs for the 2⁴ design. The signs in the columns of this table can be used to estimate the factor effects. To illustrate, the estimate of factor A is

A = (1/8)[a + ab + ac + abc + ad + abd + acd + abcd - (1) - b - c - d - bc - bd - cd - bcd]
  = (1/8)[669 + 650 + 642 + 635 + 749 + 868 + 860 + 729 - 550 - 604 - 633 - 601 - 1037 - 1052 - 1075 - 1063]
  = -101.625.

Thus the effect of increasing the gap between the anode and the cathode from 0.80 cm to 1.20 cm is to decrease the etch rate by 101.625 Å/min.
It is easy to verify that the complete set of effect estimates is

A = -101.625      AB = -7.875       ABC = -15.625
B = -1.625        AC = -24.875      ABD = 4.125
C = 7.375         AD = -153.625     ACD = 5.625
D = 306.125       BC = -43.875      BCD = -25.375
                  BD = -0.625       ABCD = -40.125
                  CD = -2.125
A very helpful method in judging the significance of factors in a 2^k experiment is to construct a normal probability plot of the effect estimates.

The 24 Design for the Plasc:;;a Ecch Experim::nt

(Gap)

(Pressure)

(C,F, Flow)

(power)

-1

-1

-I

-1

-1
-I

550

-1

1
1

-I

-1

604

-1

-1
-1

633

-I
1
-1

-1

-J

-1

-1
1

-1
-1

-1

Etch Rate

-1
-1

-1
1

-1

-1

-1

-1

1
-1
-1
-1
-1

1
1

669
650
642
601
635

1037
749

1052
868
1075
860
1
1

1063
729



Tablel",21 Contra:>, Constants for the 2' Design
A

AS

C AC

(1)

BC

. .;. .

ab
c
ac
be

+ . .;. .
+

+
+ +

+
+
+

aha

+
+

+ +

+
+

+
+
+

+
+

+ +

cd
acd
bed
abed

.,.

+
+

+
+
abc++++++

.,.

+
+

-r-

d
ad
bd

AD BD ABD CD ACD BCD ABCD

ABC D

.,.

+
+

+
+
+

+
+

If none of the effects is significant, then the estimates will behave like a random sample drawn from a normal distribution with zero mean, and the plotted effects will lie approximately along a straight line. Those effects that do not plot on the line are significant factors.
The normal probability plot of effect estimates from the plasma etch experiment is shown in Fig. 13-20. Clearly the main effects of A and D and the AD interaction are significant, as they fall far from the line passing through the other points. The analysis of variance summarized in Table 13-22 confirms these findings. Notice that in the analysis of variance we have pooled the three- and four-factor interactions to form the error mean square. If the normal probability plot had indicated that any of these interactions were important, they should not then be included in the error term.
Since A = -101.625, the effect of increasing the gap between the cathode and anode is to decrease the etch rate. However, D = 306.125, so applying higher power levels will increase the etch rate. Figure 13-21 is a plot of the AD interaction. This plot indicates that the effect of changing the gap width at low power settings is small, but that increasing the
Figure 13-20  Normal probability plot of effects from the plasma etch experiment.

Table 13-22  Analysis of Variance for the Plasma Etch Experiment

Source of    Sum of         Degrees of    Mean
Variation    Squares        Freedom       Square         F_0
A            41,310.563     1             41,310.563     20.28
B            10.563         1             10.563         <1
C            217.563        1             217.563        <1
D            374,850.063    1             374,850.063    183.99
AB           248.063        1             248.063        <1
AC           2,475.063      1             2,475.063      1.21
AD           94,402.563     1             94,402.563     46.34
BC           7,700.063      1             7,700.063      3.78
BD           1.563          1             1.563          <1
CD           18.063         1             18.063         <1
Error        10,186.813     5             2,037.363
Total        531,420.938    15

gap at high power settings dramatically reduces the etch rate. High etch rates are obtained
at high power settings and narrow gap widths.

The residuals from the experiment can be obtained from the regression model

ŷ = 776.0625 - (101.625/2) x_1 + (306.125/2) x_4 - (153.625/2) x_1 x_4.

For example, when A and D are both at the low level the predicted value is

ŷ = 776.0625 - (101.625/2)(-1) + (306.125/2)(-1) - (153.625/2)(-1)(-1) = 597,

and the four residuals at this treatment combination are

e_1 = 550 - 597 = -47,
e_2 = 604 - 597 = 7,

Figure 13-21  AD interaction from the plasma etch experiment.


..


~;:-_-;;;!'c;o; __~.L _ _--::-;:!J

-3.00

Residua!,

20.16

~.33

66.50

9;

Figure 13-22 Normal probability plot of residuals from the plasma etch experiment.

e, = 638 -597 =41,


e,=601-597=4_
The residuals at the other three treatment combinations (A high, D low), (A low, D high), and (A high, D high) are obtained similarly. A normal probability plot of the residuals is shown in Fig. 13-22. The plot is satisfactory.

Figure 13-22  Normal probability plot of residuals from the plasma etch experiment.

13-6 CONFOUNDING IN THE 2^k DESIGN

It is often impossible to run a complete replicate of a factorial design under homogeneous experimental conditions. Confounding is a design technique for running a factorial experiment in blocks, where the block size is smaller than the number of treatment combinations in one complete replicate. The technique causes certain interaction effects to be indistinguishable from, or confounded with, blocks. We will illustrate confounding in the 2^k factorial design in 2^p blocks, where p < k.
Consider a 2² design. Suppose that each of the 2² = 4 treatment combinations requires four hours of laboratory analysis. Thus, two days are required to perform the experiment. If days are considered as blocks, then we must assign two of the four treatment combinations to each day.
Consider the design shown in Fig. 13-23. Notice that block 1 contains the treatment combinations (1) and ab, and that block 2 contains a and b. The contrasts for estimating the main effects A and B are

Contrast_A = ab + a - b - (1),
Contrast_B = ab + b - a - (1).

Note that these contrasts are unaffected by blocking since in each contrast there is one plus and one minus treatment combination from each block. That is, any difference between block 1 and block 2 will cancel out. The contrast for the AB interaction is

Contrast_AB = ab + (1) - a - b.

Since the two treatment combinations with the plus sign, ab and (1), are in block 1 and the two with the minus sign, a and b, are in block 2, the block effect and the AB interaction are identical. That is, AB is confounded with blocks.

Figure 13-23  The 2² design in two blocks, AB confounded:
  Block 1: (1), ab        Block 2: a, b

Figure 13-24  The 2³ design in two blocks, ABC confounded:
  Block 1: (1), ab, ac, bc        Block 2: a, b, c, abc

The reason for this is apparent from the table of plus and minus signs for the 2² design (Table 13-13). From this table, we see that all treatment combinations that have a plus sign on AB are assigned to block 1, while all treatment combinations that have a minus sign on AB are assigned to block 2.
This scheme can be used to confound any 2^k design in two blocks. As a second example, consider a 2³ design run in two blocks. Suppose we wish to confound the three-factor interaction ABC with blocks. From the table of plus and minus signs for the 2³ design (Table 13-16), we assign the treatment combinations that are minus on ABC to block 1 and those that are plus on ABC to block 2. The resulting design is shown in Fig. 13-24.
There is a more general method of constructing the blocks. The method employs a defining contrast, say

L = α_1 x_1 + α_2 x_2 + ... + α_k x_k,    (13-47)

where x_i is the level of the ith factor appearing in a treatment combination and α_i is the exponent appearing on the ith factor in the effect to be confounded. For the 2^k system, we have either α_i = 0 or 1, and either x_i = 0 (low level) or x_i = 1 (high level). Treatment combinations that produce the same value of L (modulus 2) will be placed in the same block. Since the only possible values of L (mod 2) are 0 and 1, this will assign the 2^k treatment combinations to exactly two blocks.
As an example consider a 2³ design with ABC confounded with blocks. Here x_1 corresponds to A, x_2 to B, x_3 to C, and α_1 = α_2 = α_3 = 1. Thus, the defining contrast for ABC is

L = x_1 + x_2 + x_3.

To assign the treatment combinations to the two blocks, we substitute the treatment combinations into the defining contrast as follows:

(1):  L = 1(0) + 1(0) + 1(0) = 0 = 0 (mod 2),
a:    L = 1(1) + 1(0) + 1(0) = 1 = 1 (mod 2),
b:    L = 1(0) + 1(1) + 1(0) = 1 = 1 (mod 2),
ab:   L = 1(1) + 1(1) + 1(0) = 2 = 0 (mod 2),
c:    L = 1(0) + 1(0) + 1(1) = 1 = 1 (mod 2),
ac:   L = 1(1) + 1(0) + 1(1) = 2 = 0 (mod 2),
bc:   L = 1(0) + 1(1) + 1(1) = 2 = 0 (mod 2),
abc:  L = 1(1) + 1(1) + 1(1) = 3 = 1 (mod 2).

Therefore, (1), ab, ac, and bc are run in block 1, and a, b, c, and abc are run in block 2. This is the same design shown in Fig. 13-24.
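The defining-contrast arithmetic is easy to script. This sketch (ours, not from the text) assigns the 2³ runs to the two blocks via L = x1 + x2 + x3 (mod 2); the same loop with two contrasts L1 and L2 yields the four-block designs discussed later in this section.

from itertools import product

alphas = (1, 1, 1)                        # exponents of A, B, C in ABC
blocks = {0: [], 1: []}
for x in product((0, 1), repeat=3):       # x = (x1, x2, x3); 0 = low, 1 = high
    name = "".join(f for f, xi in zip("abc", x) if xi) or "(1)"
    L = sum(a * xi for a, xi in zip(alphas, x)) % 2
    blocks[L].append(name)

print(sorted(blocks[0]))   # ['(1)', 'ab', 'ac', 'bc']  -- the principal block
print(sorted(blocks[1]))   # ['a', 'abc', 'b', 'c']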


A shortcut method is useful in constructing these designs. The block containing the treatment combination (1) is called the principal block. Any element [except (1)] in the principal block may be generated by multiplying two other elements in the principal block modulus 2. For example, consider the principal block of the 2³ design with ABC confounded, shown in Fig. 13-24. Note that

ab · ac = a²bc = bc,
ab · bc = ab²c = ac,
ac · bc = abc² = ab.

Treatment combinations in the other block (or blocks) may be generated by multiplying one element in the new block by each element in the principal block modulus 2. For the 2³ with ABC confounded, since the principal block is (1), ab, ac, and bc, we know that b is in the other block. Thus, the elements of this second block are

b · (1) = b,
b · ab = ab² = a,
b · ac = abc,
b · bc = b²c = c.

Example 13-10
An experiment is performed to investigate the effects of four factors on the terminal miss distance of a shoulder-fired ground-to-air missile.

The four factors are target type (A), seeker type (B), target altitude (C), and target range (D). Each factor may be conveniently run at two levels, and the optical tracking system will allow terminal miss distance to be measured to the nearest foot. Two different gunners are used in the flight test, and since there may be differences between individuals, it was decided to conduct the 2⁴ design in two blocks with ABCD confounded. Thus, the defining contrast is

L = x_1 + x_2 + x_3 + x_4.

The experimental design and the resulting data are

Block 1:  (1) = 3,  ab = 7,  ac = 6,  bc = 8,  ad = 10,  bd = 4,  cd = 8,  abcd = 9
Block 2:  a = 7,  b = 5,  c = 6,  d = 4,  abc = 6,  bcd = 7,  acd = 9,  abd = 12

The analysis of the design by Yates' algorithm is shown in Table 13-23. A normal probability plot of the effects would reveal A (target type), D (target range), and AD to have large effects. A confirming analysis of variance, using three-factor interactions as error, is shown in Table 13-24.

It is possible to confound the 2^k design in four blocks of 2^(k-2) observations each. To construct the design, two effects are chosen to confound with blocks and their defining contrasts obtained. A third effect, the generalized interaction of the two initially chosen, is also
Table 13-23  Yates' Algorithm for the 2⁴ Design in Example 13-10

Treatment                                                      Sum of     Effect
Combinations    Response    [1]    [2]    [3]    [4]    Effect    Squares    Estimate
(1)             3           10     22     48     111    Total
a               7           12     26     63     21     A         27.5625    2.625
b               5           12     30     4      5      B         1.5625     0.625
ab              7           14     33     17     -1     AB        0.0625     -0.125
c               6           14     6      4      7      C         3.0625     0.875
ac              6           16     -2     1      -19    AC        22.5625    -2.375
bc              8           17     14     -4     -3     BC        0.5625     -0.375
abc             6           16     3      3      -1     ABC       0.0625     -0.125
d               4           4      2      4      15     D         14.0625    1.875
ad              10          2      2      3      13     AD        10.5625    1.625
bd              4           0      2      -8     -3     BD        0.5625     -0.375
abd             12          -2     -1     -11    7      ABD       3.0625     0.875
cd              8           6      -2     0      -1     CD        0.0625     -0.125
acd             9           8      -2     -3     -3     ACD       0.5625     -0.375
bcd             7           1      2      0      -3     BCD       0.5625     -0.375
abcd            9           2      1      -1     -1     ABCD      0.0625     -0.125
Table 13-24  Analysis of Variance for Example 13-10

Source of                    Sum of     Degrees of    Mean
Variation                    Squares    Freedom       Square     F_0
Blocks (ABCD)                0.0625     1             0.0625     0.06
A                            27.5625    1             27.5625    25.94
B                            1.5625     1             1.5625     1.47
C                            3.0625     1             3.0625     2.88
D                            14.0625    1             14.0625    13.24
AB                           0.0625     1             0.0625     0.06
AC                           22.5625    1             22.5625    21.24
AD                           10.5625    1             10.5625    9.94
BC                           0.5625     1             0.5625     0.53
BD                           0.5625     1             0.5625     0.53
CD                           0.0625     1             0.0625     0.06
Error (ABC+ABD+ACD+BCD)      4.2500     4             1.0625
Total                        84.9375    15
confounded with blocks. The generalized interaction of two effects is found by multiplying their respective columns.
For example, consider the 2⁴ design in four blocks. If AC and BD are confounded with blocks, their generalized interaction is (AC)(BD) = ABCD. The design is constructed by using the defining contrasts for AC and BD:

L_1 = x_1 + x_3,
L_2 = x_2 + x_4.


It is easy to verify that the four blocks are

Block 1 (L_1 = 0, L_2 = 0):  (1), ac, bd, abcd
Block 2 (L_1 = 1, L_2 = 0):  a, c, abd, bcd
Block 3 (L_1 = 0, L_2 = 1):  b, d, abc, acd
Block 4 (L_1 = 1, L_2 = 1):  ab, ad, bc, cd
This general procedure can be extended to confounding the 2^k design in 2^p blocks, where p < k. Select p effects to be confounded, such that no effect chosen is a generalized interaction of the others. The blocks can be constructed from the p defining contrasts L_1, L_2, ..., L_p associated with these effects. In addition, exactly 2^p - p - 1 other effects are confounded with blocks, these being the generalized interactions of the original p effects chosen. Care should be taken so as not to confound effects of potential interest.
For more information on confounding refer to Montgomery (2001, Chapter 7). That book contains guidelines for selecting factors to confound with blocks so that main effects and low-order interactions are not confounded. In particular, the book contains a table of suggested confounding schemes for designs with up to seven factors and a range of block sizes, some as small as two runs.

13-7 FRACTIONAL REPLICATION OF THE 2^k DESIGN

As the number of factors in a 2^k increases, the number of runs required increases rapidly. For example, a 2⁵ requires 32 runs. In this design, only 5 degrees of freedom correspond to main effects and 10 degrees of freedom correspond to two-factor interactions. If we can assume that certain high-order interactions are negligible, then a fractional factorial design involving fewer than the complete set of 2^k runs can be used to obtain information on the main effects and low-order interactions. In this section, we will introduce fractional replication of the 2^k design. For a more complete treatment, see Montgomery (2001, Chapter 8).

13-7.1 The One-Half Fraction of the 2^k Design

A one-half fraction of the 2^k design contains 2^(k-1) runs and is often called a 2^(k-1) fractional factorial design. As an example, consider the 2^(3-1) design; that is, a one-half fraction of the 2³. The table of plus and minus signs for the 2³ design is shown in Table 13-25. Suppose we select the four treatment combinations a, b, c, and abc as our one-half fraction. These
Table 13-25  Plus and Minus Signs for the 2³ Factorial Design

Treatment          Factorial Effect
Combinations    I    A    B    C    AB    AC    BC    ABC
a               +    +    -    -    -     -     +     +
b               +    -    +    -    -     +     -     +
c               +    -    -    +    +     -     -     +
abc             +    +    +    +    +     +     +     +
ab              +    +    +    -    +     -     -     -
ac              +    +    -    +    -     +     -     -
bc              +    -    +    +    -     -     +     -
(1)             +    -    -    -    +     +     +     -


treatment combinations are shown in the top half of Table 13-25. We will use both the conventional notation (a, b, c, ...) and the plus and minus notation for the treatment combinations. The equivalence between the two notations is as follows:

Notation 1    Notation 2
a             + - -
b             - + -
c             - - +
abc           + + +

Notice that the 2^(3-1) design is formed by selecting only those treatment combinations that yield a plus on the ABC effect. Thus ABC is called the generator of this particular fraction. Furthermore, the identity element I is also plus for the four runs, so we call

I = ABC

the defining relation for the design.
The treatment combinations in the 2^(3-1) design yield three degrees of freedom associated with the main effects. From Table 13-25, we obtain the estimates of the main effects as

A = (1/2)[a - b - c + abc],
B = (1/2)[-a + b - c + abc],
C = (1/2)[-a - b + c + abc].

It is also easy to verify that the estimates of the two-factor interactions are

BC = (1/2)[a - b - c + abc],
AC = (1/2)[-a + b - c + abc],
AB = (1/2)[-a - b + c + abc].

Thus, the linear combination of observations in column A, say ℓ_A, estimates A + BC. Similarly, ℓ_B estimates B + AC, and ℓ_C estimates C + AB. Two or more effects that have this property are called aliases. In our 2^(3-1) design, A and BC are aliases, B and AC are aliases, and C and AB are aliases. Aliasing is the direct result of fractional replication. In many practical situations, it will be possible to select the fraction so that the main effects and low-order interactions of interest will be aliased with high-order interactions (which are probably negligible).
The alias structure for this design is found by using the defining relation I = ABC. Multiplying any effect by the defining relation yields the aliases for that effect. In our example, the alias of A is

A = A · ABC = A²BC = BC,

since A · I = A and A² = I. The aliases of B and C are

B = B · ABC = AB²C = AC


and

C = C · ABC = ABC² = AB.

Now suppose that we had chosen the other one-half fraction, that is, the treatment combinations in Table 13-25 associated with minus on ABC. The defining relation for this design is I = -ABC. The aliases are A = -BC, B = -AC, and C = -AB. Thus the estimates of A, B, and C with this fraction really estimate A - BC, B - AC, and C - AB. In practice, it usually does not matter which one-half fraction we select. The fraction with the plus sign in the defining relation is usually called the principal fraction, and the other fraction is usually called the alternate fraction.
Sometimes we use sequences of fractional factorial designs to estimate effects. For example, suppose we had run the principal fraction of the 2^(3-1) design. From this design we have the following effect estimates:

ℓ_A = A + BC,
ℓ_B = B + AC,
ℓ_C = C + AB.

Suppose that we are willing to assume at this point that the two-factor interactions are negligible. If they are, then the 2^(3-1) design has produced estimates of the three main effects, A, B, and C. However, if after running the principal fraction we are uncertain about the interactions, it is possible to estimate them by running the alternate fraction. The alternate fraction produces the following effect estimates:

ℓ'_A = A - BC,
ℓ'_B = B - AC,
ℓ'_C = C - AB.
If we combine the estimates from the two fractions, we obtain the following:

Effect i    From (1/2)(ℓ_i + ℓ'_i)            From (1/2)(ℓ_i - ℓ'_i)
i = A       (1/2)(A + BC + A - BC) = A        (1/2)[A + BC - (A - BC)] = BC
i = B       (1/2)(B + AC + B - AC) = B        (1/2)[B + AC - (B - AC)] = AC
i = C       (1/2)(C + AB + C - AB) = C        (1/2)[C + AB - (C - AB)] = AB

Thus by combining a sequence of two fractional factorial designs we can isolate both the main effects and the two-factor interactions. This property makes the fractional factorial design highly useful in experimental problems, as we can run sequences of small, efficient experiments, combine information across several experiments, and take advantage of learning about the process we are experimenting with as we go along.
A 2^(k-1) design may be constructed by writing down the treatment combinations for a full factorial with k - 1 factors and then adding the kth factor by identifying its plus and minus levels with the plus and minus signs of the highest-order interaction ABC···(K-1). Therefore, a 2^(3-1) fractional factorial is obtained by writing down the full 2² factorial and then equating factor C to the AB interaction. Thus, to obtain the principal fraction, we would use C = +AB as follows:


Full 2²           C = AB
A      B
-      -          +
+      -          -
-      +          -
+      +          +
To obtain the alternate fraction we would equate the last column to C = -AB.
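This construction is a one-liner per run. The sketch below (our own illustration, not from the text) writes down the full 2² in A and B and appends C = AB; flipping the sign of c gives the alternate fraction.

from itertools import product

fraction = []
for a_, b_ in [tuple(reversed(r)) for r in product((-1, 1), repeat=2)]:
    c_ = a_ * b_                          # C = +AB; use c_ = -a_ * b_ for I = -ABC
    name = "".join(f for f, x in zip("abc", (a_, b_, c_)) if x == 1) or "(1)"
    fraction.append(name)

print(fraction)   # ['c', 'a', 'b', 'abc'] -- the runs with plus sign on ABC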

Example 13-11
To illustrate the use of a one-half fraction, consider the plasma etch experiment described in Example 13-9. Suppose that we decide to use a 2^(4-1) design with I = ABCD to investigate the four factors gap (A), pressure (B), C2F6 flow rate (C), and power setting (D). This design would be constructed by writing down a 2³ in the factors A, B, and C and then setting D = ABC. The design and the resulting etch rates are shown in Table 13-26.
In this design, the main effects are aliased with the three-factor interactions; note that the alias of A is

A · I = A · ABCD,
A = A²BCD = BCD,

and similarly

B = ACD,
C = ABD,
D = ABC.
The two-factor interactions are aliased with each other. For example, the alias of AB is CD:

AB · I = AB · ABCD = A²B²CD = CD.

The other aliases are

Table 13-26  The 2^(4-1) Design with Defining Relation I = ABCD

Treatment                                       Etch
Combinations    A    B    C    D = ABC          Rate
(1)             -    -    -    -                550
ad              +    -    -    +                749
bd              -    +    -    +                1052
ab              +    +    -    -                650
cd              -    -    +    +                1075
ac              +    -    +    -                642
bc              -    +    +    -                601
abcd            +    +    +    +                729



AC = BD,
AD = BC.

The estimates of the main effects and their aliases are found using the four columns of signs in Table 13-26. For example, from column A we obtain

ℓ_A = A + BCD = (1/4)(-550 + 749 - 1052 + 650 - 1075 + 642 - 601 + 729) = -127.00.

The other columns produce

ℓ_B = B + ACD = 4.00,
ℓ_C = C + ABD = 11.50,

and

ℓ_D = D + ABC = 290.50.

Clearly .e" a....d b are large, and if we believe that me tlL."'ee-factor interactions are negligible. then. the
main effects A (gap) and D (power setting) significantly affect etch rate.
The interactions are estimated by forming the AB, AC, and AD columns and adding them to the table. The signs in the AB column are +, -, -, +, +, -, -, +, and this column produces the estimate

ℓ_AB = AB + CD = (1/4)(550 - 749 - 1052 + 650 + 1075 - 642 - 601 + 729) = -10.00.

From the AC and AD columns we find

ℓ_AC = AC + BD = -25.50,
ℓ_AD = AD + BC = -197.50.
The ℓ_AD estimate is large; the most straightforward interpretation of the results is that this is the AD interaction. Thus, the results obtained from the 2^(4-1) design agree with the full factorial results in Example 13-9.

Normal Probability Plots and Residuals  The normal probability plot is very useful in assessing the significance of effects from a fractional factorial. This is particularly true when there are many effects to be estimated. Residuals can be obtained from a fractional factorial by the regression model method shown previously. These residuals should be plotted against the predicted values, against the levels of the factors, and on normal probability paper, as we have discussed before, both to assess the validity of the underlying model assumptions and to gain additional insight into the experimental situation.

Projection of the 2^(k-1) Design  If one or more factors from a one-half fraction of a 2^k can be dropped, the design will project into a full factorial design. For example, Fig. 13-25 presents a 2^(3-1) design. Notice that this design will project into a full factorial in any two of the three original factors. Thus, if we think that at most two of the three factors are important, the 2^(3-1) design is an excellent design for identifying the significant factors. Sometimes we call experiments to identify a relatively few significant factors from a larger number of factors screening experiments. This projection property is highly useful in factor screening, as it allows negligible factors to be eliminated, resulting in a stronger experiment in the active factors that remain.


Figure 13-25  Projection of a 2^(3-1) design into three 2² designs.

In the 2^(4-1) design used in the plasma etch experiment in Example 13-11, we found that two of the four factors (B and C) could be dropped. If we eliminate these two factors, the remaining columns in Table 13-26 form a 2² design in the factors A and D, with two replicates. This design is shown in Fig. 13-26.
Design Resolution  The concept of design resolution is a useful way to catalog fractional factorial designs according to the alias patterns they produce. Designs of resolution III, IV, and V are particularly important. The definitions of these terms and an example of each follow:

1. Resolution III designs. These are designs in which no main effect is aliased with any other main effect, but main effects are aliased with two-factor interactions, and two-factor interactions may be aliased with each other. The 2^(3-1) design with I = ABC is of resolution III. We usually employ a subscript Roman numeral to indicate design resolution; thus this one-half fraction is a 2^(3-1)_III design.

Resolution IV Designs. These are designs in which no main effect:is aliased with
any other main effect or two-factor interaction. but two-factor interactions are
Figure 13-26  The 2² design obtained by dropping factors B and C from the plasma etch experiment.


aliased "ith each other, The 2'-1 design Mth I =ABCD used io Example 13-11 is
of resolution N (21; 1),
3. Resolution V Designs. These are designs in which no main effect Or two-factor
interaction is aliased with any other main effect or two-factor interaction, but two~
factor interactions are aliased with three~factor interactions. A 25 -! design with I ~
ABCDEis ofresolution V (2',,1).

Resolution ill and N designs are particularlY useful io factor screening experiments,
A resolution IV design provides very good information about main effects and will provide
some information about two-factor interactions.

13-7.2 Smaller Fractions: The 2^(k-p) Fractional Factorial

Although the 2^(k-1) design is valuable in reducing the number of runs required for an experiment, we frequently find that smaller fractions will provide almost as much useful information at even greater economy. In general, a 2^k design may be run in a 1/2^p fraction called a 2^(k-p) fractional factorial design. Thus, a 1/4 fraction is called a 2^(k-2) fractional factorial design, a 1/8 fraction is called a 2^(k-3) design, and so on.
To illustrate a 1/4 fraction, consider an experiment with six factors and suppose that the engineer is interested primarily in main effects but would also like to get some information about the two-factor interactions. A 2^(6-1) design would require 32 runs and would have 31 degrees of freedom for estimation of effects. Since there are only six main effects and 15 two-factor interactions, the one-half fraction is inefficient; it requires too many runs. Suppose we consider a 1/4 fraction, or a 2^(6-2) design. This design contains 16 runs and, with 15 degrees of freedom, will allow estimation of all six main effects with some capability for examination of the two-factor interactions. To generate this design we would write down a 2⁴ design in the factors A, B, C, and D, and then add two columns for E and F. To find the new columns we would select the two design generators I = ABCE and I = ACDF. Thus column E would be found from E = ABC and column F would be F = ACD, and columns ABCE and ACDF are equal to the identity column. However, we know that the product of any two columns in the table of plus and minus signs for a 2^k is just another column in the table; therefore, the product of ABCE and ACDF, or ABCE(ACDF) = A²BC²DEF = BDEF, is also an identity column. Consequently, the complete defining relation for the 2^(6-2) design is

I = ABCE = ACDF = BDEF.

To find the alias of any effect, simply multiply the effect by each word in the foregoing defining relation. The complete alias structure is

A = BCE = CDF = ABDEF,
B = ACE = DEF = ABCDF,
C = ABE = ADF = BCDEF,
D = ACF = BEF = ABCDE,
E = ABC = BDF = ACDEF,
F = ACD = BDE = ABCEF,
AB = CE = BCDF = ADEF,
AC = BE = DF = ABCDEF,
AD = CF = BCDE = ABEF,


AE = BC = CDEF = ABDF,
AF = CD = BCEF = ABDE,
BD = EF = ACDE = ABCF,
BF = DE = ABCD = ACEF,
ABD = CDE = AEF = BCF,
ABF = CEF = BCD = ADE.
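Multiplying words in a defining relation amounts to a symmetric difference of their letters, since any squared letter is the identity. A two-line helper (our own sketch, not from the text) makes these products easy to check:

def multiply(word1: str, word2: str) -> str:
    """Product of two effect words: keep the letters appearing an odd number of times."""
    return "".join(sorted(set(word1) ^ set(word2))) or "I"

print(multiply("ABCE", "ACDF"))     # BDEF, the generalized interaction found above
print(multiply("AB", "ABCE"), multiply("AB", "ACDF"), multiply("AB", "BDEF"))
# CE BCDF ADEF -- the aliases of AB listed above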
Notice that this is a resolution IV design; main effects are aliased with three-factor and higher interactions, and two-factor interactions are aliased with each other. This design would provide very good information on the main effects and give some idea about the strength of the two-factor interactions. For example, if the AD interaction appears significant, either AD and/or CF are significant. If A and/or D are significant main effects, but C and F are not, the experimenter may reasonably and tentatively attribute the significance to the AD interaction. The construction of the design is shown in Table 13-27.
The same principles can be applied to obtain even smaller fractions. Suppose we wish to investigate seven factors in 16 runs. This is a 2^(7-3) design (a 1/8 fraction). This design is constructed by writing down a 2⁴ design in the factors A, B, C, and D and then adding three new columns. Reasonable choices for the three generators required are I = ABCE, I = BCDF, and I = ACDG. Therefore, the new columns are formed by setting E = ABC, F = BCD, and G = ACD. The complete defining relation is found by multiplying the generators together two at a time and then three at a time, resulting in

I = ABCE = BCDF = ACDG = ADEF = BDEG = ABFG = CEFG.

Notice that every main effect in this design will be aliased with three-factor and higher interactions and that two-factor interactions will be aliased with each other. Thus this is a resolution IV design.
For seven factors, we can reduce the number of runs even further. The 2^(7-4) design is an eight-run experiment accommodating seven variables. This is a 1/16 fraction and is
Table 13-27  Construction of the 2^(6-2) Design with Generators I = ABCE and I = ACDF

A    B    C    D    E = ABC    F = ACD    Treatment Combination
-    -    -    -    -          -          (1)
+    -    -    -    +          +          aef
-    +    -    -    +          -          be
+    +    -    -    -          +          abf
-    -    +    -    +          +          cef
+    -    +    -    -          -          ac
-    +    +    -    -          +          bcf
+    +    +    -    +          -          abce
-    -    -    +    -          +          df
+    -    -    +    +          -          ade
-    +    -    +    +          +          bdef
+    +    -    +    -          -          abd
-    -    +    +    +          -          cde
+    -    +    +    -          +          acdf
-    +    +    +    -          -          bcd
+    +    +    +    +          +          abcdef


obtained by first writing down a 2' design in the factors A, B, and C, and then forming the
four new columns, from 1= ABD, I = ACE, 1= BCF, and 1= ABCG. The design is shown
in Table 13-28.
The complete defining relation is found by IDuetiplying the generators together two,
three, and :finally four at a time, producing

I =ABD =ACE =BCF =ABCG = BCDE =ACDF = CDG =ABEF


=BEG=AFG=DEF=ADEG=CEFG=BDFG=ABCDEFG.
The alias of any main effect is found by multiplying that effect througb each term in the
defining relation. For example, the afuu; of A is

A=BD

CE=ABCF=BCG=ABCDE=CDF=ACDG

=BEF =ABEG = FG =ADEF = DEG = ACEFG =ABDFG


= BCDEFG.
This design is of resolution III, since the main effect is aliased with two-factor interactions.
If we assume that all three-factor and higher interactions are negligible, the aliases of the
seven main effects are

ℓ_A = A + BD + CE + FG,
ℓ_B = B + AD + CF + EG,
ℓ_C = C + AE + BF + DG,
ℓ_D = D + AB + CG + EF,
ℓ_E = E + AC + BG + DF,
ℓ_F = F + BC + AG + DE,
ℓ_G = G + CD + BE + AF.

This 2^(7-4)_III design is called a saturated fractional factorial, because all of the available
degrees of freedom are used to estimate main effects. It is possible to combine sequences
of these resolution III fractional factorials to separate the main effects from the two-factor
interactions. The procedure is illustrated in Montgomery (2001, Chapter 8).
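Alias chains like those above can be produced mechanically: multiply the effect into every word of the defining relation and cancel any squared letters. The following Python sketch is ours (the alias helper is a hypothetical illustration, not a library routine):

# Defining relation of the saturated 2^(7-4) design (identity word omitted).
words = ["ABD", "ACE", "BCF", "ABCG", "BCDE", "ACDF", "CDG", "ABEF",
         "BEG", "AFG", "DEF", "ADEG", "CEFG", "BDFG", "ABCDEFG"]

def alias(effect, word):
    # Generalized interaction: letters appearing twice cancel.
    return "".join(sorted(set(effect) ^ set(word)))

for main in "ABCDEFG":
    chains = [alias(main, w) for w in words]
    pairs = [c for c in chains if len(c) == 2]   # two-factor aliases only
    print(main, "= " + " = ".join(pairs), "+ higher-order terms")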
In constructing a fractional factorial design it is important to select the best set of
design generators. Montgomery (2001) presents a table of optimum design generators for
designs with up to 10 factors. The generators in this table will produce designs of maximum resolution for any specified combination of k and p. For more than 10 factors, a
resolution III design is recommended. These designs may be constructed by using the same
method illustrated earlier for the 2^(7-4)_III design. For example, to investigate up to 15 factors
in 16 runs, write down a 2^4 design in the factors A, B, C, and D, and then generate 11 new
columns by taking the products of the original four columns two at a time, three at a time,
and four at a time. The resulting design is a 2^(15-11)_III fractional factorial. These designs, along
with other useful fractional factorials, are discussed by Montgomery (2001, Chapter 8).

13-8 SAMPLE COMPUTER OUTPUT

We provide Minitab output for some of the examples presented in this chapter.

Sample Computer Output for Example 13-3

Reconsider Example 13-3, dealing with aircraft primer paints. The Minitab results of the
3 × 2 factorial design with three replicates are

Analysis of Variance for Force

Source           DF        SS        MS        F        P
Type              2    4.5811    2.2906    27.86    0.000
Applicat          1    4.9089    4.9089    59.70    0.000
Type*Applicat     2    0.2411    0.1206     1.47    0.269
Error            12    0.9867    0.0822
Total            17   10.7178

The Minitab results are in agreement with the results given in Table 13-6.
Sample Output for Example 13-7

Reconsider Example 13-7, dealing with surface roughness. The Minitab results for the 2^3
design with two replicates are
Term        Effect      Coef    SE Coef        T        P
Constant            11.0625     0.3903     28.34    0.000
A           3.3750    1.6875    0.3903      4.32    0.003
B           1.6250    0.8125    0.3903      2.08    0.071
C           0.8750    0.4375    0.3903      1.12    0.295
A*B         1.3750    0.6875    0.3903      1.76    0.116
A*C         0.1250    0.0625    0.3903      0.16    0.877
B*C        -0.6250   -0.3125    0.3903     -0.80    0.446
A*B*C       1.1250    0.5625    0.3903      1.44    0.188

Analysis of Variance

Source               DF    Seq SS    Adj SS    Adj MS       F        P
Main Effects          3    59.187    59.187    19.729    8.09    0.008
2-Way Interactions    3     9.187     9.187     3.062    1.26    0.352
3-Way Interactions    1     5.062     5.062     5.062    2.08    0.188
Residual Error        8    19.500    19.500     2.437
  Pure Error          8    19.500    19.500     2.438
Total                15    92.937

The output from Minitab is slightly different from the results given in Example 13-7. t-tests
on the main effects and interactions are provided in addition to the analysis of variance on
the significance of main effects, two-factor interactions, and three-factor interactions. The
ANOVA results indicate that at least one of the main effects is significant, whereas no two-factor or three-factor interaction is significant.

13-9 SUMMARY

This chapter has introduced the design and analysis of experiments with several factors,
concentrating on factorial designs and fractional factorials. Fixed, random, and mixed models were considered. The F-tests for main effects and interactions in these designs depend
on whether the factors are fixed or random.

The 2^k factorial designs were also introduced. These are very useful designs in which
all k factors appear at two levels. They have a greatly simplified method of statistical analysis. In situations where the design cannot be run under homogeneous conditions, the 2^k
design can be confounded easily in 2^p blocks. This requires that certain interactions be confounded with blocks. The 2^k design also lends itself to fractional replication, in which only
a particular subset of the 2^k treatment combinations are run. In fractional replication, each
effect is aliased with one or more other effects. The general idea is to alias main effects and
low-order interactions with higher-order interactions. This chapter discussed methods for
construction of the 2^(k-p) fractional factorial designs, that is, a 1/2^p fraction of the 2^k design.
These designs are particularly useful in industrial experimentation.

13-10 EXERCISES

13-1. An article in the Journal of Materials Processing Technology (2000, p. 113) presents results from an experiment involving tool wear estimation in milling. The objective is to minimize tool wear. Two factors of interest in the study were cutting speed (m/min) and depth of cut (mm). One response of interest is tool flank wear (mm). Three levels of each factor were selected and a factorial experiment with three replicates is run. Analyze the data and draw conclusions.

                          Depth of Cut
Cutting Speed        12        15        18.75
Level 1           0.170     0.198     0.217
                  0.185     0.210     0.241
                  0.110     0.232     0.223
Level 2           0.178     0.215     0.260
                  0.210     0.243     0.289
                  0.250     0.292     0.320
Level 3           0.212     0.250     0.285
                  0.238     0.282     0.325
                  0.267     0.321     0.354

13-2. An engineer suspects that the surface finish of a metal part is influenced by the type of paint used and the drying time. He selects three drying times (20, 25, and 30 minutes) and randomly chooses two types of paint from several that are available. He conducts an experiment and obtains the data shown here. Analyze the data and draw conclusions. Estimate the variance components.

           Drying Time (min)
Paint      20      25      30
1          74      73      78
           64      61      85
           50      44      92
2          92      98      66
           86      73      45
           68      88      85

13-3. Suppose that in Exercise 13-2 paint types were fixed effects. Compute a 95% interval estimate of the mean difference between the responses for paint type 1 and paint type 2.

13-4. The factors that influence the breaking strength of cloth are being studied. Four machines and three operators are chosen at random and an experiment is run using cloth from the same one-yard segment. The results are as follows:

                    Machine
Operator      1      2      3      4
A           109    110    108    110
            110     …     109    116
B           111    112    110    111
            111    114    112    109
C           111    112    114    115
            109    109    111    112

Test for interaction and main effects at the 5% level. Estimate the components of variance.

13-5. Suppose that in Exercise 13-4 the operators were chosen at random, but only four machines were available for the test. Does this influence the analysis or your conclusions?

13-6. A company employs two time-study engineers. Their supervisor wishes to determine whether the standards set by them are influenced by any interaction between engineers and operators. She selects three operators at random and conducts an experiment in which the engineers set standard times for the same job. She obtains the data shown here. Analyze the data and draw conclusions.

                       Operator
Engineer        1             2             3
1          2.59, 2.38    2.78, 2.49    2.40, 2.72
2          2.15, 2.85    2.86, 2.72    2.66, 2.87
13-7. An article in Industrial Quality Control (1956, p. 5) describes an experiment to investigate the effect of two factors (glass type and phosphor type) on the brightness of a television tube. The response variable measured is the current necessary (in microamps) to obtain a specified brightness level. The data are shown here. Analyze the data and draw conclusions, assuming that both factors are fixed.

                        Phosphor Type
Glass Type        1              2              3
1           280, 290, 285   300, 310, 295   290, 285, 290
2           230, 235, 240   260, 240, 235   220, 225, 230

13-8. Consider the tool wear data in Exercise 13-1. Plot the residuals from this experiment against the levels of cutting speed and against the depth of cut. Comment on the graphs obtained. What are the possible consequences of the information conveyed by the residual plots?

13-9. The percentage of hardwood concentration in raw pulp and the freeness and cooking time of pulp are being investigated for their effects on the strength of paper. Analyze the data shown in the following table, assuming that all three factors are fixed.

Percentage of        Cooking Time 1.5 hours      Cooking Time 2.0 hours
Hardwood                    Freeness                    Freeness
Concentration        350     500     650         350     500     650
10                   96.6    97.7    99.4        98.4    99.6    100.6
                     96.0    96.0    99.8        98.6    100.4   100.9
15                   98.5    96.0    98.4        97.5    98.7    99.6
                     97.2    96.9    97.6        98.1    98.0    99.0
20                   97.5    95.6    97.4        97.6    97.0    98.5
                     96.6    96.2    98.1        98.4    97.8    99.8

13-10. An article in Quality Engineering (1999, p. 357) presents the results of an experiment conducted to determine the effects of three factors on warpage in an injection-molding process. Warpage is defined as the nonflatness property in the product manufactured. This particular company manufactures plastic molded components for use in television sets, washing machines, and automobiles. The three factors of interest (each at two levels) are A = melt temperature, B = injection speed, and C = injection process. A complete 2^3 factorial design was carried out with replication. Two replicates are provided in the table below. Analyze the data from this experiment.

 A     B     C    Replicate I    Replicate II
-1    -1    -1       1.35           1.40
 1    -1    -1       2.15           2.20
-1     1    -1       1.50           1.50
 1     1    -1       0.70           1.20
-1    -1     1       1.10           0.70
 1    -1     1       1.40           1.35
-1     1     1       1.20           1.35
 1     1     1       1.10           1.00

13-11. For the warpage experiment in Exercise 13-10, obtain the residuals and plot them on normal probability paper. Also plot the residuals versus the predicted values. Comment on these plots.


13-12. Four factors are thought to possibly influence the taste of a soft-drink beverage: type of sweetener (A), ratio of syrup to water (B), carbonation level (C), and temperature (D). Each factor can be run at two levels, producing a 2^4 design. At each run in the design, samples of the beverage are given to a test panel consisting of 20 people. Each tester assigns a point score from 1 to 10 to the beverage. Total score is the response variable, and the objective is to find a formulation that maximizes total score. Two replicates of this design are run, and the results are shown here. Analyze the data and draw conclusions.

Treatment         Replicate
Combination      I       II
(1)            190      193
a              174      178
b              181      185
ab             183      180
c              177      178
ac             181      180
bc             188      182
abc            173      170
d              198      195
ad             172      176
bd             187      183
abd            185      186
cd             199      190
acd            179      175
bcd            187      184
abcd           185      180

13-13. Consider the experiment in Exercise 13-12. Plot the residuals against the levels of factors A, B, C, and D. Also construct a normal probability plot of the residuals. Comment on these plots.

13-14. Find the standard error of the effects for the experiment in Exercise 13-12. Using the standard errors as a guide, what factors appear significant?

13-15. The data shown here represent a single replicate of a 2^5 design that is used in an experiment to study the compressive strength of concrete. The factors are mix (A), time (B), laboratory (C), temperature (D), and drying time (E). Analyze the data, assuming that three-factor and higher interactions are negligible. Use a normal probability plot to assess the effects.

(1) = 700     d = 1000      e = 800       de = 1900
a = 900       ad = 1100     ae = 1200     ade = 1500
b = 3400      bd = 3000     be = 3500     bde = 4000
ab = 5500     abd = 6100    abe = 6200    abde = 6500
c = 600       cd = 800      ce = 600      cde = 1500
ac = 1000     acd = 1100    ace = 1200    acde = 2000
bc = 3000     bcd = 3300    bce = 3000    bcde = 3400
abc = 5300    abcd = 6000   abce = 5500   abcde = 6300

13-16. An experiment described by M. G. Natrella in the National Bureau of Standards Handbook of Experimental Statistics (No. 91, 1963) involves flame-testing fabrics after applying fire-retardant treatments. There are four factors: type of fabric (A), type of fire-retardant treatment (B), laundering condition (C; the low level is no laundering, the high level is after one laundering), and the method of conducting the flame test (D). All factors are run at two levels, and the response variable is the inches of fabric burned on a standard-size test sample. The data are

(1) = 42     d = 40
a = 31       ad = 30
b = 45       bd = 50
ab = 29      abd = 25
c = 39       cd = 40
ac = 28      acd = 25
bc = 46      bcd = 50
abc = 32     abcd = 23

(a) Estimate the effects and prepare a normal probability plot of the effects.
(b) Construct a normal probability plot of the residuals and comment on the results.
(c) Construct an analysis of variance table assuming that three- and four-factor interactions are negligible.

13-17. Consider the data from the first replicate of Exercise 13-10. Suppose that these observations could not all be run under the same conditions. Set up a design to run these observations in two blocks of four observations, each with ABC confounded. Analyze the data.

13-18. Consider the data from the first replicate of Exercise 13-12. Construct a design with two blocks of eight observations each, with ABCD confounded. Analyze the data.

13-19. Repeat Exercise 13-18 assuming that four blocks are required. Confound ABD and ABC (and consequently CD) with blocks.

13-20. Construct a 2^5 design in four blocks. Select the effects to be confounded so that we confound the highest possible interactions with blocks.

13-21. An article in Industrial and Engineering Chemistry ("Factorial Experiments in Pilot Plant Studies," 1951, p. 1300) reports on an experiment to investigate the effects of temperature (A), gas throughput (B), and concentration (C) on the strength of product solution in a recirculation unit. Two blocks were used with ABC confounded, and the experiment was replicated twice. The data are as follows:

        Replicate 1                Replicate 2
Block 1        Block 2        Block 1        Block 2
(1) = 46       a = 51         (1) = …        a = …
ab = 47        b = …          ab = …         b = …
ac = 22        c = …          ac = …         c = …
bc = 67        abc = 35       bc = …         abc = …

(a) Analyze the data from this experiment.
(b) Plot the residuals on normal probability paper and against the predicted values. Comment on the plots obtained.
(c) Comment on the efficiency of this design. Note that we have replicated the experiment twice, yet we have no information on the ABC interaction.
(d) Suggest a better design; specifically, one that would provide some information on all interactions.

13-22. R. D. Snee ("Experimenting with a Large Number of Variables," in Experiments in Industry: Design, Analysis and Interpretation of Results, by R. D. Snee, L. B. Hare, and J. B. Trout, Editors, ASQC, 1985) describes an experiment in which a 2^(5-1) design with I = ABCDE was used to investigate the effects of five factors on the color of a chemical product. The factors are A = solvent/reactant, B = catalyst/reactant, C = temperature, D = reactant purity, and E = reactant pH. The results obtained are as follows:

e = -0.63      d = 6.79
a = 2.51       ade = 6.47
b = -2.68      bde = 3.45
abe = 1.66     abd = 5.68
c = 2.06       cde = 5.22
ace = 1.22     acd = 4.38
bce = -2.09    bcd = 4.30
abc = 1.93     abcde = 4.05

(a) Prepare a normal probability plot of the effects. Which factors are active?
(b) Calculate the residuals. Construct a normal probability plot of the residuals and plot the residuals versus the fitted values. Comment on the plots.
(c) If any factors are negligible, collapse the 2^(5-1) design into a full factorial in the active factors. Comment on the resulting design, and interpret the results.

13-23. An article in the Journal of Quality Technology (Vol. 17, 1985, p. 198) describes the use of a replicated fractional factorial to investigate the effects of five factors on the free height of leaf springs used in an automotive application. The factors are A = furnace temperature, B = heating time, C = transfer time, D = hold down time, and E = quench oil temperature. The data are shown below.

 A    B    C    D    E    Free Height
 -    -    -    -    -    7.78   7.78   7.81
 +    -    -    +    -    8.15   8.18   7.88
 -    +    -    +    -    7.50   7.56   7.50
 +    +    -    -    -    7.59   7.56   7.75
 -    -    +    +    -    7.54   8.00   7.88
 +    -    +    -    -    7.69   8.09   8.06
 -    +    +    -    -    7.56   7.52   7.44
 +    +    +    +    -    7.56   7.81   7.69
 -    -    -    -    +    7.50   7.25   7.12
 +    -    -    +    +    7.88   7.88   7.44
 -    +    -    +    +    7.50   7.56   7.50
 +    +    -    -    +    7.63   7.75   7.56
 -    -    +    +    +    7.32   7.44   7.44
 +    -    +    -    +    7.56   7.69   7.62
 -    +    +    -    +    7.18   7.18   7.25
 +    +    +    +    +    7.81   7.50   7.59

(a) What is the generator for this fraction? Write out the alias structure.
(b) Analyze the data. What factors influence mean free height?
(c) Calculate the range of free height for each run. Is there any indication that any of these factors affects variability in free height?
(d) Analyze the residuals from this experiment and comment on your findings.

13-24. An article in Industrial and Engineering Chemistry ("More on Planning Experiments to Increase Research Efficiency," 1970, p. 60) uses a 2^(5-2) design to investigate the effects of A = condensation temperature, B = amount of material 1, C = solvent volume, D = condensation time, and E = amount of material 2 on yield. The results obtained are as follows:

e = 23.2,     ad = 16.9,     cd = 23.8,     bde = 16.8,
ab = 15.5,    bc = 16.2,     ace = 23.4,    abcde = 18.1.

(a) Verify that the design generators used were I = ACE and I = BDE.
(b) Write down the complete defining relation and the aliases from this design.
(c) Estimate the main effects.
(d) Prepare an analysis of variance table. Verify that the AB and AD interactions are available to use as error.
(e) Plot the residuals versus the fitted values. Also construct a normal probability plot of the residuals. Comment on the results.


13-25. An article in Cement and Concrete Research (2001, p. 1213) describes an experiment to investigate the effects of four metal oxides on several cement properties. The four factors are all run at two levels and one response of interest is the mean bulk density (g/cm³). The four factors and their levels are

Factor           Low Level (-1)    High Level (+1)
A: % Fe2O3            0                 30
B: % ZnO              0                 15
C: % PbO              0                 2.5
D: % Cr2O3            0                 2.5

Typical results from this type of experiment are given in the following table:

Run     A     B     C    Density
1      -1    -1    -1     2.001
2       1    -1    -1     2.062
3      -1     1    -1     2.019
4       1     1    -1     2.059
5      -1    -1     1     1.990
6       1    -1     1     2.076
7      -1     1     1     2.038
8       1     1     1     2.118

(a) What is the generator for this fraction?
(b) Analyze the data. What factors influence mean bulk density?
(c) Analyze the residuals from this experiment and comment on your findings.

13-26. Consider the 2^(6-2) design in Table 13-27. Suppose that after analyzing the original data, we find that factors C and E can be dropped. What type of 2^k design is left in the remaining variables?

13-27. Consider the 2^(6-2) design in Table 13-27. Suppose that after the original data analysis, we find that factors D and F can be dropped. What type of 2^k design is left in the remaining variables? Compare the results with Exercise 13-26. Can you explain why the answers are different?

13-28. Suppose that in Exercise 13-12 it was possible to run only a one-half fraction of the 2^4 design. Construct the design and perform the statistical analysis, using the data from replicate I.

13-29. Suppose that in Exercise 13-15 only a one-half fraction of the 2^5 design could be run. Construct the design and perform the analysis.

13-30. Consider the data in Exercise 13-15. Suppose that only a one-quarter fraction of the 2^5 design could be run. Construct the design and analyze the data.

13-31. Construct a 2^(7-2) fractional factorial design. Write down the aliases, assuming that only main effects and two-factor interactions are of interest.

Chapter 14

Simple Linear Regression and Correlation
In many problems there are two or more variables that are inherently related, and it is necessary to explore the nature of this relationship. Regression analysis is a statistical technique
for modeling and investigating the relationship between two or more variables. For example, in a chemical process, suppose that the yield of product is related to the process operating temperature. Regression analysis can be used to build a model that expresses yield as
a function of temperature. This model can then be used to predict yield at a given temperature level. It could also be used for process optimization or process control purposes.

In general, suppose that there is a single dependent variable, or response, y, that is
related to k independent, or regressor, variables, say x1, x2, ..., xk. The response variable y
is a random variable, while the regressor variables x1, x2, ..., xk are measured with negligible error. The xi are called mathematical variables and are frequently controlled by the
experimenter. Regression analysis can also be used in situations where y, x1, x2, ..., xk are
jointly distributed random variables, such as when the data are collected as different measurements on a common experimental unit. The relationship between these variables is characterized by a mathematical model called a regression equation. More precisely, we speak
of the regression of y on x1, x2, ..., xk. This regression model is fitted to a set of data. In some
instances, the experimenter will know the exact form of the true functional relationship
between y and x1, x2, ..., xk, say y = φ(x1, x2, ..., xk). However, in most cases, the true functional relationship is unknown, and the experimenter will choose an appropriate function to
approximate φ. A polynomial model is usually employed as the approximating function.

In this chapter, we discuss the case where only a single regressor variable, x, is of interest. Chapter 15 will present the case involving more than one regressor variable.

14-1 SIMPLE LINEAR REGRESSION

We wish to determine the relationship between a single regressor variable x and a response
variable y. The regressor variable x is assumed to be a continuous mathematical variable,
controllable by the experimenter. Suppose that the true relationship between y and x is a
straight line, and that the observation y at each level of x is a random variable. Now, the
expected value of y for each value of x is

E(y|x) = β0 + β1x,    (14-1)

where the intercept β0 and the slope β1 are unknown constants. We assume that each observation, y, can be described by the model

y = β0 + β1x + ε,    (14-2)


where ε is a random error with mean zero and variance σ². The {εi} are also assumed to be
uncorrelated random variables. The regression model of equation 14-2 involving only a single regressor variable x is often called the simple linear regression model.

Suppose that we have n pairs of observations, say (y1, x1), (y2, x2), ..., (yn, xn). These
data may be used to estimate the unknown parameters β0 and β1 in equation 14-2. Our estimation procedure will be the method of least squares. That is, we will estimate β0 and β1 so
that the sum of squares of the deviations between the observations and the regression line
is a minimum. Now using equation 14-2, we may write

yi = β0 + β1xi + εi,    i = 1, 2, ..., n,    (14-3)

and the sum of squares of the deviations of the observations from the true regression line is

L = Σ_{i=1}^{n} εi² = Σ_{i=1}^{n} (yi − β0 − β1xi)².    (14-4)

The least-squares estimators of β0 and β1, say β̂0 and β̂1, must satisfy

∂L/∂β0 = −2 Σ_{i=1}^{n} (yi − β̂0 − β̂1xi) = 0,
∂L/∂β1 = −2 Σ_{i=1}^{n} (yi − β̂0 − β̂1xi)xi = 0.    (14-5)

Simplifying these two equations yields

nβ̂0 + β̂1 Σ_{i=1}^{n} xi = Σ_{i=1}^{n} yi,
β̂0 Σ_{i=1}^{n} xi + β̂1 Σ_{i=1}^{n} xi² = Σ_{i=1}^{n} xiyi.    (14-6)

Equations 14-6 are called the least-squares normal equations. The solution to the normal
equations is

β̂0 = ȳ − β̂1x̄,    (14-7)

β̂1 = [Σ_{i=1}^{n} xiyi − (1/n)(Σ_{i=1}^{n} xi)(Σ_{i=1}^{n} yi)] / [Σ_{i=1}^{n} xi² − (1/n)(Σ_{i=1}^{n} xi)²],    (14-8)

where ȳ = (1/n) Σ_{i=1}^{n} yi and x̄ = (1/n) Σ_{i=1}^{n} xi. Therefore, equations 14-7 and 14-8 are the least-squares estimators of the intercept and slope, respectively. The fitted simple linear regression model is

ŷ = β̂0 + β̂1x.    (14-9)


Notationally, it is convenient to give special symbols to the numerator and denominator of equation 14-8. That is, let

Sxx = Σ_{i=1}^{n} (xi − x̄)² = Σ_{i=1}^{n} xi² − (1/n)(Σ_{i=1}^{n} xi)²    (14-10)

and

Sxy = Σ_{i=1}^{n} yi(xi − x̄) = Σ_{i=1}^{n} xiyi − (1/n)(Σ_{i=1}^{n} xi)(Σ_{i=1}^{n} yi).    (14-11)

We call Sxx the corrected sum of squares of x and Sxy the corrected sum of cross products of
x and y. The extreme right-hand sides of equations 14-10 and 14-11 are the usual computational formulas. Using this new notation, the least-squares estimator of the slope is

β̂1 = Sxy/Sxx.    (14-12)

Example 14-1
A chemical engineer is investigating the effect of process operating temperature on product yield. The study results in the following data:

Temperature, °C (x)   100   110   120   130   140   150   160   170   180   190
Yield, % (y)           45    51    54    61    66    70    74    78    85    89

These pairs of points are plotted in Fig. 14-1. Such a display is called a scatter diagram. Examination of this scatter diagram indicates that there is a strong relationship between yield and temperature, and the tentative assumption of the straight-line model y = β0 + β1x + ε appears to be reasonable. The following quantities may be computed:

n = 10,   Σxi = 1450,   Σyi = 673,   x̄ = 145,   ȳ = 67.3,
Σxi² = 218,500,   Σyi² = 47,225,   Σxiyi = 101,570.

Figure 14-1 Scatter diagram of yield versus temperature.

From equations 14-10 and 14-11, we find

Sxx = Σxi² − (1/10)(Σxi)² = 218,500 − (1450)²/10 = 8250

and

Sxy = Σxiyi − (1/10)(Σxi)(Σyi) = 101,570 − (1450)(673)/10 = 3985.

Therefore, the least-squares estimates of the slope and intercept are

β̂1 = Sxy/Sxx = 3985/8250 = 0.483

and

β̂0 = ȳ − β̂1x̄ = 67.3 − (0.483)145 = −2.739.

The fitted simple linear regression model is

ŷ = −2.739 + 0.483x.
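Readers who wish to verify these hand calculations can do so in a few lines of code. The following Python sketch is ours (it is not part of the text's Minitab output) and reproduces the least-squares estimates for the data of Example 14-1:

import numpy as np

x = np.array([100, 110, 120, 130, 140, 150, 160, 170, 180, 190], dtype=float)
y = np.array([45, 51, 54, 61, 66, 70, 74, 78, 85, 89], dtype=float)

Sxx = np.sum(x * x) - x.sum() ** 2 / len(x)       # 8250.0
Sxy = np.sum(x * y) - x.sum() * y.sum() / len(x)  # 3985.0
b1 = Sxy / Sxx                                    # about 0.483
b0 = y.mean() - b1 * x.mean()                     # about -2.739
print(f"yhat = {b0:.3f} + {b1:.3f}x")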

Since we have only tentatively assumed the straight-line model to be appropriate, we
will want to investigate the adequacy of the model. The statistical properties of the least-squares estimators β̂0 and β̂1 are useful in assessing model adequacy. The estimators β̂0 and
β̂1 are random variables, since they are just linear combinations of the yi, and the yi are random variables. We will investigate the bias and variance properties of these estimators. Consider first β̂1. The expected value of β̂1 is

E(β̂1) = β1,

since Σ_{i=1}^{n}(xi − x̄) = 0, Σ_{i=1}^{n} xi(xi − x̄) = Sxx, and by assumption E(εi) = 0. Thus, β̂1 is an unbiased estimator of the true slope β1. Now consider the variance of β̂1. Since we have
assumed that V(εi) = σ², it follows that V(yi) = σ², and

V(β̂1) = (1/Sxx²) V[Σ_{i=1}^{n} yi(xi − x̄)].    (14-13)

The random variables {yi} are uncorrelated because the {εi} are uncorrelated. Therefore, the
variance of the sum in equation 14-13 is just the sum of the variances, and the variance of
each term in the sum, say V[yi(xi − x̄)], is σ²(xi − x̄)². Thus,

V(β̂1) = (1/Sxx²) σ² Σ_{i=1}^{n} (xi − x̄)² = σ²/Sxx.    (14-14)

Using a similar approach, we can show that

E(β̂0) = β0   and   V(β̂0) = σ²(1/n + x̄²/Sxx).    (14-15)

Note that β̂0 is an unbiased estimator of β0. The covariance of β̂0 and β̂1 is not zero; in fact,
Cov(β̂0, β̂1) = −σ²x̄/Sxx.
It is usually necessary to obtain an estimate of σ². The difference between the observation yi and the corresponding predicted value ŷi, say ei = yi − ŷi, is called a residual. The
sum of the squares of the residuals, or the error sum of squares, would be

SSE = Σ_{i=1}^{n} ei² = Σ_{i=1}^{n} (yi − ŷi)².    (14-16)

A more convenient computing formula for SSE may be found by substituting the fitted
model ŷi = β̂0 + β̂1xi into equation 14-16 and simplifying. The result is

SSE = Σ_{i=1}^{n} yi² − nȳ² − β̂1Sxy,

and if we let Σ_{i=1}^{n} yi² − nȳ² = Σ_{i=1}^{n} (yi − ȳ)² ≡ Syy, then we may write SSE as

SSE = Syy − β̂1Sxy.    (14-17)



The expected value of the error sum of squares SSE is E(SSE) = (n − 2)σ². Therefore,

σ̂² = SSE/(n − 2)    (14-18)

is an unbiased estimator of σ².


Regression analysis is widely used and frequently misused. There are several common
abuses of regression that should be briefly mentioned. Care should be taken in selecting variables with which to construct regression models and in determining the form of the approximating function. It is quite possible to develop statistical relationships among variables that
are completely unrelated in a practical sense. For example, one might attempt to relate the
shear strength of spot welds with the number of boxes of computer paper used by the data
processing department. A straight line may even appear to provide a good fit to the data, but
the relationship is an unreasonable one on which to rely. A strong observed association
between variables does not necessarily imply that a causal relationship exists between those
variables. Designed experiments are the only way to determine causal relationships.

Regression relationships are valid only for values of the independent variable within
the range of the original data. The linear relationship that we have tentatively assumed may
be valid over the original range of x, but it may be unlikely to remain so as we encounter x
values beyond that range. In other words, as we move beyond the range of values of x for
which data were collected, we become less certain about the validity of the assumed model.
Regression models are not necessarily valid for extrapolation purposes.

Finally, one occasionally feels that the model y = βx + ε is appropriate. The omission of
the intercept from this model implies, of course, that y = 0 when x = 0. This is a very strong
assumption that often is unjustified. Even when two variables, such as the height and weight
of men, would seem to qualify for the use of this model, we would usually obtain a better fit
by including the intercept, because of the limited range of data on the independent variable.
14-2 HYPOTHESIS TESTING IN SIMPLE LINEAR REGRESSION

An important part of assessing the adequacy of the simple linear regression model is testing statistical hypotheses about the model parameters and constructing certain confidence
intervals. Hypothesis testing is discussed in this section, and Section 14-3 presents methods
for constructing confidence intervals. To test hypotheses about the slope and intercept of the
regression model, we must make the additional assumption that the error component ε is
normally distributed. Thus, the complete assumptions are that the errors are NID(0, σ²)
(normal and independently distributed). Later we will discuss how these assumptions can
be checked through residual analysis.

Suppose we wish to test the hypothesis that the slope equals a constant, say β1,0. The
appropriate hypotheses are

H0: β1 = β1,0,
H1: β1 ≠ β1,0,    (14-19)

where we have assumed a two-sided alternative. Now since the εi are NID(0, σ²), it follows
directly that the observations yi are NID(β0 + β1xi, σ²). From equation 14-8 we observe that
β̂1 is a linear combination of the observations yi. Thus, β̂1 is a linear combination of independent normal random variables and, consequently, is N(β1, σ²/Sxx), using the bias and
variance properties of β̂1 from Section 14-1. Furthermore, β̂1 is independent of MSE. Then,
as a result of the normality assumption, the statistic

t0 = (β̂1 − β1,0)/√(MSE/Sxx)    (14-20)

follows the t distribution with n − 2 degrees of freedom under H0: β1 = β1,0. We would reject
H0: β1 = β1,0 if

|t0| > t_{α/2, n−2},    (14-21)

where t0 is computed from equation 14-20.


A similar procedure can be used to test hypotheses about the intercept. To test

H0: β0 = β0,0,
H1: β0 ≠ β0,0,    (14-22)

we would use the statistic

t0 = (β̂0 − β0,0)/√(MSE[1/n + x̄²/Sxx])    (14-23)

and reject the null hypothesis if |t0| > t_{α/2, n−2}.


A very important special case of the hypothesis of equation 14-19 is

H0: β1 = 0,
H1: β1 ≠ 0.    (14-24)

This hypothesis relates to the significance of regression. Failing to reject H0: β1 = 0 is
equivalent to concluding that there is no linear relationship between x and y. This situation
is illustrated in Fig. 14-2. Note that this may imply either that x is of little value in explaining the variation in y and that the best estimator of y for any x is ŷ = ȳ (Fig. 14-2a) or that
the true relationship between x and y is not linear (Fig. 14-2b). Alternatively, if H0: β1 = 0
is rejected, this implies that x is of value in explaining the variability in y. This is illustrated
in Fig. 14-3. However, rejecting H0: β1 = 0 could mean either that the straight-line model is
adequate (Fig. 14-3a) or that even though there is a linear effect of x, better results could be
obtained with the addition of higher-order polynomial terms in x (Fig. 14-3b).

The test procedure for H0: β1 = 0 may be developed from two approaches. The first
approach starts with the following partitioning of the total corrected sum of squares for y:

Syy = Σ_{i=1}^{n} (yi − ȳ)² = Σ_{i=1}^{n} (ŷi − ȳ)² + Σ_{i=1}^{n} (yi − ŷi)².    (14-25)

Figure 14-2 The hypothesis H0: β1 = 0 is not rejected. (a) Little relationship between x and y; (b) nonlinear relationship.

Figure 14-3 The hypothesis H0: β1 = 0 is rejected. (a) Straight-line model adequate; (b) higher-order terms needed.

The two components of Syy measure, respectively, the amount of variability in the yi
accounted for by the regression line, and the residual variation left unexplained by the
regression line. We usually call SSE = Σ(yi − ŷi)² the error sum of squares and SSR =
Σ(ŷi − ȳ)² the regression sum of squares. Thus, equation 14-25 may be written

Syy = SSR + SSE.    (14-26)

Comparing equation 14-26 with equation 14-17, we note that the regression sum of squares
SSR is

SSR = β̂1Sxy.    (14-27)

Syy has n − 1 degrees of freedom, and SSR and SSE have 1 and n − 2 degrees of freedom,
respectively. We may show that E[SSE/(n − 2)] = σ² and E(SSR) = σ² + β1²Sxx, and that SSE and SSR
are independent. Thus, if H0: β1 = 0 is true, the statistic

F0 = (SSR/1)/(SSE/(n − 2)) = MSR/MSE    (14-28)

follows the F_{1,n−2} distribution, and we would reject H0 if F0 > F_{α,1,n−2}. The test procedure
is usually arranged in an analysis of variance table (or ANOVA), such as Table 14-1.
The test for significance of regression may also be developed from equation 14-20 with
β1,0 = 0, say

t0 = β̂1/√(MSE/Sxx).    (14-29)

Squaring both sides of equation 14-29, we obtain

t0² = β̂1²Sxx/MSE = β̂1Sxy/MSE = SSR/MSE = MSR/MSE.    (14-30)

Table 14-1 Analysis of Variance for Testing Significance of Regression

Source of            Sum of                 Degrees of    Mean
Variation            Squares                Freedom       Square     F0
Regression           SSR = β̂1Sxy            1             MSR        MSR/MSE
Error or residual    SSE = Syy − β̂1Sxy      n − 2         MSE
Total                Syy                    n − 1

Table 14-2 Testing for Significance of Regression, Example 14-2

Source of     Sum of      Degrees of    Mean
Variation     Squares     Freedom       Square      F0
Regression    1924.87     1             1924.87     2138.74
Error         7.23        8             0.90
Total         1932.10     9

Note that t0² in equation 14-30 is identical to F0 in equation 14-28. It is true, in general, that
the square of a t random variable with f degrees of freedom is an F random variable with
one and f degrees of freedom in the numerator and denominator, respectively. Thus, the test
using t0 is equivalent to the test based on F0.

Example 14-2
We will test the model developed in Example 14-1 for significance of regression. The fitted model is
ŷ = −2.739 + 0.483x, and Syy is computed as

Syy = Σyi² − (Σyi)²/n = 47,225 − (673)²/10 = 1932.10.

The regression sum of squares is

SSR = β̂1Sxy = (0.483)(3985) = 1924.87,

and the error sum of squares is

SSE = Syy − SSR = 1932.10 − 1924.87 = 7.23.

The analysis of variance for testing H0: β1 = 0 is summarized in Table 14-2. Noting that F0 = 2138.74
> F_{0.01,1,8} = 11.26, we reject H0 and conclude that β1 ≠ 0.

14-3 INTERVAL ESTIMATION IN SIMPLE LINEAR REGRESSION

In addition to point estimates of the slope and intercept, it is possible to obtain confidence
interval estimates of these parameters. The width of these confidence intervals is a measure
of the overall quality of the regression line. If the εi are normally and independently distributed, then

(β̂1 − β1)/√(MSE/Sxx)   and   (β̂0 − β0)/√(MSE[1/n + x̄²/Sxx])

are both distributed as t with n − 2 degrees of freedom. Therefore, a 100(1 − α)% confidence
interval on the slope β1 is given by

β̂1 − t_{α/2,n−2}√(MSE/Sxx) ≤ β1 ≤ β̂1 + t_{α/2,n−2}√(MSE/Sxx).    (14-31)

Similarly, a 100(1 − α)% confidence interval on the intercept β0 is

β̂0 − t_{α/2,n−2}√(MSE[1/n + x̄²/Sxx]) ≤ β0 ≤ β̂0 + t_{α/2,n−2}√(MSE[1/n + x̄²/Sxx]).    (14-32)

Example 14-3
We will find a 95% confidence interval on the slope of the regression line using the data in Example
14-1. Recall that β̂1 = 0.483, Sxx = 8250, and MSE = 0.90 (see Table 14-2). Then, from equation 14-31
we find

0.483 − 2.306√(0.90/8250) ≤ β1 ≤ 0.483 + 2.306√(0.90/8250).

This simplifies to

0.459 ≤ β1 ≤ 0.507.
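Equation 14-31 is equally easy to evaluate directly; the following is a minimal sketch of ours, with the example's quantities hard-coded:

import math

b1, Sxx, MSE = 0.483, 8250.0, 0.90   # from Examples 14-1 and 14-2
t = 2.306                            # t_{0.025,8} from a t table
half = t * math.sqrt(MSE / Sxx)
print(f"{b1 - half:.3f} <= beta1 <= {b1 + half:.3f}")   # 0.459 to 0.507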

A confidence interval may be constructed for the mean response at a specified x, say
x0. This is a confidence interval about E(y|x0) and is often called a confidence interval about
the regression line. Since E(y|x0) = β0 + β1x0, we may obtain a point estimate of E(y|x0) from
the fitted model as

Ê(y|x0) ≡ ŷ0 = β̂0 + β̂1x0.

Now ŷ0 is an unbiased point estimator of E(y|x0), since β̂0 and β̂1 are unbiased estimators of
β0 and β1. The variance of ŷ0 is

V(ŷ0) = σ²[1/n + (x0 − x̄)²/Sxx],

and ŷ0 is normally distributed, as β̂0 and β̂1 are normally distributed. Therefore, a 100(1 −
α)% confidence interval about the true regression line at x = x0 may be computed from

ŷ0 − t_{α/2,n−2}√(MSE[1/n + (x0 − x̄)²/Sxx]) ≤ E(y|x0)
    ≤ ŷ0 + t_{α/2,n−2}√(MSE[1/n + (x0 − x̄)²/Sxx]).    (14-33)


The width of the confidence interval for E(y|x0) is a function of x0. The interval width is a
minimum for x0 = x̄ and widens as |x0 − x̄| increases. This widening is one reason why using
regression to extrapolate is ill-advised.

Example 14-4
We will construct a 95% confidence interval about the regression line for the data in Example 14-1.
The fitted model is ŷ0 = −2.739 + 0.483x0, and the 95% confidence interval on E(y|x0) is found from
equation 14-33 as

ŷ0 ± 2.306√(0.90[1/10 + (x0 − 145)²/8250]).

The fitted values ŷ0 and the corresponding 95% confidence limits for the points x0 = xi, i = 1, 2, ...,
10, are displayed in Table 14-3. To illustrate the use of this table, we may find the 95% confidence
interval on the true mean process yield at x0 = 140°C (say) as

64.88 − 0.71 ≤ E(y|x0 = 140) ≤ 64.88 + 0.71

or

64.17 ≤ E(y|x0 = 140) ≤ 65.59.

The fitted model and the 95% confidence interval about the regression line are shown in Fig. 14-4.

Table 14-3 Confidence Interval about the Regression Line, Example 14-4

x0                       100     110     120     130     140     150     160     170     180     190
ŷ0                     45.56   50.39   55.22   60.05   64.88   69.72   74.55   79.38   84.21   89.04
95% confidence limits  ±1.30   ±1.10   ±0.93   ±0.79   ±0.71   ±0.71   ±0.79   ±0.93   ±1.10   ±1.30

Figure 14-4 A 95% confidence interval about the regression line for Example 14-4.
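The limits in Table 14-3 come from evaluating equation 14-33 at each x0. The following sketch of ours, hard-coding the example's summary statistics, checks the entry at x0 = 140:

import math

b0, b1 = -2.739, 0.483
n, xbar, Sxx, MSE, t = 10, 145.0, 8250.0, 0.90, 2.306
x0 = 140.0
y0 = b0 + b1 * x0                                           # about 64.88
half = t * math.sqrt(MSE * (1 / n + (x0 - xbar) ** 2 / Sxx))
print(f"{y0 - half:.2f} <= E(y|x0) <= {y0 + half:.2f}")     # close to 64.17 and 65.59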

14-4 PREDICTION OF NEW OBSERVATIONS

An important application of regression analysis is predicting new or future observations y
corresponding to a specified level of the regressor variable x. If x0 is the value of the regressor variable of interest, then

ŷ0 = β̂0 + β̂1x0    (14-34)

is the point estimate of the new or future value of the response y0.

Now consider obtaining an interval estimate of this future observation y0. This new
observation is independent of the observations used to develop the regression model. Therefore, the confidence interval about the regression line, equation 14-33, is inappropriate,
since it is based only on the data used to fit the regression model. The confidence interval
about the regression line refers to the true mean response at x = x0 (that is, a population
parameter), not to future observations.

Let y0 be the future observation at x = x0, and let ŷ0 given by equation 14-34 be the estimator of y0. Note that the random variable

ψ = y0 − ŷ0

is normally distributed with mean zero and variance

V(ψ) = σ²[1 + 1/n + (x0 − x̄)²/Sxx],

because y0 is independent of ŷ0. Thus, the 100(1 − α)% prediction interval on a future observation at x0 is

ŷ0 − t_{α/2,n−2}√(MSE[1 + 1/n + (x0 − x̄)²/Sxx]) ≤ y0
    ≤ ŷ0 + t_{α/2,n−2}√(MSE[1 + 1/n + (x0 − x̄)²/Sxx]).    (14-35)

Notice that the prediction interval is of minimum width at x0 = x̄ and widens as |x0 − x̄|
increases. By comparing equation 14-35 with equation 14-33, we observe that the prediction interval at x0 is always wider than the confidence interval at x0. This results because the
prediction interval depends on both the error from the estimated model and the error associated with future observations (σ²).

We may also find a 100(1 − α)% prediction interval on the mean of k future observations on the response at x = x0. Let ȳ0 be the mean of k future observations at x = x0. The
100(1 − α)% prediction interval on ȳ0 is

ŷ0 − t_{α/2,n−2}√(MSE[1/k + 1/n + (x0 − x̄)²/Sxx]) ≤ ȳ0
    ≤ ŷ0 + t_{α/2,n−2}√(MSE[1/k + 1/n + (x0 − x̄)²/Sxx]).    (14-36)


To illustrate the construction of a prediction interval, suppose we use the data in Example 14-1 and find a 95% prediction interval on the next observation on the process yield at
x0 = 160°C. Using equation 14-35, we find that the prediction interval is

74.55 − 2.306√(0.90[1 + 1/10 + (160 − 145)²/8250]) ≤ y0
    ≤ 74.55 + 2.306√(0.90[1 + 1/10 + (160 − 145)²/8250]),

which simplifies to

72.23 ≤ y0 ≤ 76.87.
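A sketch of the same computation (ours, with the quantities of Example 14-1 hard-coded) makes the extra "1 +" term of equation 14-35, and hence the wider interval, explicit:

import math

b0, b1 = -2.739, 0.483
n, xbar, Sxx, MSE, t = 10, 145.0, 8250.0, 0.90, 2.306
x0 = 160.0
y0 = b0 + b1 * x0                                               # about 74.54
half = t * math.sqrt(MSE * (1 + 1 / n + (x0 - xbar) ** 2 / Sxx))
print(f"{y0 - half:.2f} <= y0 <= {y0 + half:.2f}")              # about 72.2 to 76.9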

14-5 MEASURING THE ADEQUACY OF THE REGRESSION MODEL

Fitting a regression model requires several assumptions. Estimation of the model parameters requires the assumption that the errors are uncorrelated random variables with mean
zero and constant variance. Tests of hypotheses and interval estimation require that the
errors be normally distributed. In addition, we assume that the order of the model is correct;
that is, if we fit a first-order polynomial, then we are assuming that the phenomenon actually behaves in a first-order manner.

The analyst should always consider the validity of these assumptions to be doubtful
and conduct analyses to examine the adequacy of the model that has been tentatively entertained. In this section we discuss methods useful in this respect.

14-5.1 Residual Analysis

We define the residuals as ei = yi − ŷi, i = 1, 2, ..., n, where yi is an observation and ŷi is the
corresponding estimated value from the regression model. Analysis of the residuals is frequently helpful in checking the assumption that the errors are NID(0, σ²) and in determining whether additional terms in the model would be useful.

As an approximate check of normality, the experimenter can construct a frequency histogram of the residuals or plot them on normal probability paper. It requires judgment to
assess the nonnormality of such plots. One may also standardize the residuals by computing di = ei/√MSE, i = 1, 2, ..., n. If the errors are NID(0, σ²), then approximately 95% of
the standardized residuals should fall in the interval (−2, +2). Residuals far outside this
interval may indicate the presence of an outlier, that is, an observation that is atypical of the
rest of the data. Various rules have been proposed for discarding outliers. However, sometimes outliers provide important information about unusual circumstances of interest to the
experimenter and should not be discarded. Therefore a detected outlier should be investigated first, then discarded if warranted. For further discussion of outliers, see Montgomery,
Peck, and Vining (2001).

It is frequently helpful to plot the residuals (1) in time sequence (if known), (2) against
the ŷi, and (3) against the independent variable x. These graphs will usually look like one of
the four general patterns shown in Fig. 14-5. The pattern in Fig. 14-5a represents normality, while those in Figs. 14-5b, c, and d represent anomalies. If the residuals appear as in
Fig. 14-5b, then the variance of the observations may be increasing with time or with the
magnitude of the yi or xi. If a plot of the residuals against time has the appearance of
Fig. 14-5b, then the variance of the observations is increasing with time. Plots against ŷi and
xi that look like Fig. 14-5c also indicate inequality of variance. Residual plots that look like
Fig. 14-5d indicate model inadequacy; that is, higher-order terms should be added to the
model.

Figure 14-5 Patterns for residual plots. (a) Satisfactory, (b) funnel, (c) double bow, (d) nonlinear.
[Adapted from Montgomery, Peck, and Vining (2001).]

l{xam"i,,14~5'
The residuals for the regression model in Example 14-1 are computed as follows:

e, =45.00 - 45.56=- 0,56,


e, ~ 51.00 e~

50.39 = 0,61,

54.00 - 55.22 =-1.22,

e, = 61.00-60,05 = 0,95,
e, = 66,00 - 64.88 = Ll2,

e,

=70,00 - 69.72 = 0.28,


.,=74,00 -74.55 =-0.55,
<, = 78,00 -79.38
1.38,
e, = 85,00 - 84.21 = 0.79,
e" = 89,00 -89.04=- 0,04.

These residuals Me plotted on normal probability paper in Fig. 14--6. Since the residuals fall

approx~

imately along a straight line in Fig. 14-6, we conclude that there is no severe departure from nonnality. The residuals are also plotted agai.'1st y! in Fig. 14-7a and against Xi in Fig, lA-7b. These plots
indicate no serious model inadequacies.
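The residuals and the standardized residuals di of Section 14-5.1 can be obtained directly; a short Python sketch of ours:

import numpy as np

x = np.array([100, 110, 120, 130, 140, 150, 160, 170, 180, 190], dtype=float)
y = np.array([45, 51, 54, 61, 66, 70, 74, 78, 85, 89], dtype=float)

yhat = -2.739 + 0.483 * x
e = y - yhat                    # residuals, e.g. e1 is about -0.56
d = e / np.sqrt(0.90)           # standardized residuals, d_i = e_i / sqrt(MSE)
print(np.round(e, 2))
print(bool(np.all(np.abs(d) < 2)))   # True: none falls outside (-2, +2)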

14-5.2 The Lack-of-Fit Test

Regression models are often fit to data when the true functional relationship is unknown.
Naturally, we would like to know whether the order of the model tentatively assumed is correct. This section will describe a test for the validity of this assumption.

Figure 14-6 Normal probability plot of residuals.

Figure 14-7 Residual plots for Example 14-5. (a) Plot against ŷi, (b) plot against xi.

The danger of using a regression model that is a poor approximation of the true functional relationship is illustrated in Fig. 14-8. Obviously, a polynomial of degree two or
greater should have been used in this situation.

Figure 14-8 A regression model displaying lack of fit.
We present a test for the "goodness of fit" of the regression model. Specifically, the
hypotheses we wish to test are

H0: The model adequately fits the data,
H1: The model does not fit the data.

The test involves partitioning the error or residual sum of squares into the two components

SSE = SSPE + SSLOF,

where SSPE is the sum of squares attributable to "pure" error and SSLOF is the sum of squares
attributable to the lack of fit of the model. To compute SSPE we must have repeated observations on y for at least one level of x. Suppose that we have n total observations such that

y11, y12, ..., y1n1    repeated observations at x1,
y21, y22, ..., y2n2    repeated observations at x2,
...
ym1, ym2, ..., ymnm    repeated observations at xm.

Note that there are m distinct levels of x. The contribution to the pure-error sum of squares
at xi (say) would be

Σ_{u=1}^{ni} (yiu − ȳi)².    (14-37)

The total sum of squares for pure error would be obtained by summing equation 14-37 over
all levels of x as

SSPE = Σ_{i=1}^{m} Σ_{u=1}^{ni} (yiu − ȳi)².    (14-38)

There are ne = Σ_{i=1}^{m} (ni − 1) = n − m degrees of freedom associated with the pure-error
sum of squares. The sum of squares for lack of fit is simply

SSLOF = SSE − SSPE,    (14-39)

with n − 2 − ne = m − 2 degrees of freedom. The test statistic for lack of fit would then be

F0 = [SSLOF/(m − 2)] / [SSPE/(n − m)],    (14-40)

and we would reject it if F0 > F_{α,m−2,n−m}.


This test procedure may be easily introduced into the analysis of variance conducted
for the significance of regression. If the null hypothesis of model adequacy is rejected, then
the model must be abandoned and attempts made to find a more appropriate model. If H0 is
not rejected, then there is no apparent reason to doubt the adequacy of the model, and MSPE
and MSLOF are often combined to estimate σ².

Example 14-6
Suppose we have the following data:

x   1.0   1.0   2.0   3.3   3.3   4.0   4.0   4.0   4.7   5.0   5.6   5.6   5.6   6.0   6.0   6.5   6.9
y   2.3   1.8   2.8   1.8   3.7   2.6   2.6   2.2   3.2   2.0   3.5   2.8   2.1   3.4   3.2   3.4   5.0

We may compute Syy = 10.97, Sxy = 13.62, Sxx = 52.53, ȳ = 2.847, and x̄ = 4.382. The regression model
is ŷ = 1.703 + 0.260x, and the regression sum of squares is SSR = β̂1Sxy = (0.260)(13.62) = 3.541. The
pure-error sum of squares is computed as follows:

Level of x    Σ(yiu − ȳi)²    Degrees of Freedom
1.0           0.1250          1
3.3           1.8050          1
4.0           0.1066          2
5.6           0.9800          2
6.0           0.0200          1
Total         3.0366          7

The analysis of variance is summarized in Table 14-4. Since the lack-of-fit statistic F0 = 1.27 <
F_{0.05,8,7} = 3.73, we cannot reject the hypothesis that the tentative model adequately describes the data.
We will pool the lack-of-fit and pure-error mean squares to form the denominator mean square in the
test for significance of regression. Also, since F0 = 7.15 > F_{0.05,1,15} = 4.54, we conclude that β1 ≠ 0.

In fitting a regression model to experimental data, a good practice is to use the lowest-degree model that adequately describes the data. The lack-of-fit test may be useful in this
respect.

Table 14-4 Analysis of Variance for Example 14-6

Source of         Sum of      Degrees of    Mean
Variation         Squares     Freedom       Square     F0
Regression        3.541       1             3.541      7.15
Residual          7.429       15            0.495
(Lack of fit)     4.392       8             0.549      1.27
(Pure error)      3.037       7             0.434
Total             10.970      16
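The pure-error decomposition in Table 14-4 only requires grouping the responses by repeated x levels; a Python sketch of ours:

from collections import defaultdict

x = [1.0, 1.0, 2.0, 3.3, 3.3, 4.0, 4.0, 4.0, 4.7, 5.0,
     5.6, 5.6, 5.6, 6.0, 6.0, 6.5, 6.9]
y = [2.3, 1.8, 2.8, 1.8, 3.7, 2.6, 2.6, 2.2, 3.2, 2.0,
     3.5, 2.8, 2.1, 3.4, 3.2, 3.4, 5.0]

groups = defaultdict(list)
for xi, yi in zip(x, y):
    groups[xi].append(yi)

SS_PE, df_PE = 0.0, 0
for vals in groups.values():
    ybar = sum(vals) / len(vals)
    SS_PE += sum((v - ybar) ** 2 for v in vals)   # zero for unreplicated levels
    df_PE += len(vals) - 1
print(round(SS_PE, 4), df_PE)   # about 3.037 and 7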


However, it is always possible to fit a polynomial of degree n − 1 to n data points,
and the experimenter should not consider using a model that is "saturated," that is, that has
very nearly as many independent variables as observations on y.

14-5.3 The Coefficient of Determination

The quantity

R² = SSR/Syy = 1 − SSE/Syy    (14-41)

is called the coefficient of determination and is often used to judge the adequacy of a
regression model. (We will see subsequently that in the case where x and y are jointly distributed random variables, R² is the square of the correlation coefficient between x and y.)
Clearly 0 ≤ R² ≤ 1. We often refer loosely to R² as the amount of variability in the data
explained or accounted for by the regression model. For the data in Example 14-1, we have
R² = SSR/Syy = 1924.87/1932.10 = 0.9963; that is, 99.63% of the variability in the data is
accounted for by the model.

The statistic R² should be used with caution, since it is always possible to make R² unity
simply by adding enough terms to the model. For example, we can obtain a "perfect" fit to
n data points with a polynomial of degree n − 1. Also, R² will always increase if we add a
variable to the model, but this does not necessarily mean the new model is superior to the
old one. Unless the error sum of squares in the new model is reduced by an amount equal
to the original error mean square, the new model will have a larger error mean square than
the old one, because of the loss of one degree of freedom. Thus the new model will actually be worse than the old one.

There are several misconceptions about R². In general, R² does not measure the magnitude of the slope of the regression line. A large value of R² does not imply a steep slope.
Furthermore, R² does not measure the appropriateness of the model, since it can be artificially inflated by adding higher-order polynomial terms. Even if y and x are related in a nonlinear fashion, R² will often be large. For example, R² for the regression equation in
Fig. 14-3b will be relatively large, even though the linear approximation is poor. Finally,
even though R² is large, this does not necessarily imply that the regression model will provide accurate predictions of future observations.

14-6 TRANSFORMATIONS TO A STRAIGHT LINE

We occasionally find that the straight-line regression model y = β0 + β1x + ε is inappropriate because the true regression function is nonlinear. Sometimes this is visually determined
from the scatter diagram, and sometimes we know in advance that the model is nonlinear
because of prior experience or underlying theory. In some situations a nonlinear function
can be expressed as a straight line by using a suitable transformation. Such nonlinear models are called intrinsically linear.

As an example of a nonlinear model that is intrinsically linear, consider the exponential function

y = β0 e^{β1x} ε.

This function is intrinsically linear, since it can be transformed to a straight line by a logarithmic transformation

ln y = ln β0 + β1x + ln ε.

This transformation requires that the transformed error terms ln ε be normally and independently distributed with mean 0 and variance σ².

Another intrinsically linear function is

y = β0 + β1(1/x) + ε.

By using the reciprocal transformation z = 1/x, the model is linearized to

y = β0 + β1z + ε.

Sometimes several transformations can be employed jointly to linearize a function. For
example, consider the function

y = 1/exp(β0 + β1x + ε).

Letting y* = 1/y, we have the linearized form

ln y* = β0 + β1x + ε.

Several other examples of nonlinear models that are intrinsically linear are given by Daniel
and Wood (1980).
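As a concrete illustration of fitting an intrinsically linear model, consider the exponential function above. The following Python sketch is ours, and the data in it are invented purely for illustration:

import numpy as np

# Invented data, assumed to follow y = b0 * exp(b1 * x) * error.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.7, 7.4, 20.1, 54.6, 148.4])

z = np.log(y)                       # ln y = ln b0 + b1 * x + ln error
b1, ln_b0 = np.polyfit(x, z, 1)     # ordinary least squares on (x, ln y)
print(np.exp(ln_b0), b1)            # estimates of b0 and b1 (about 1 and 1 here)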

14-7 CORRELATION

Our development of regression analysis thus far has assumed that x is a mathematical variable, measured with negligible error, and that y is a random variable. Many applications of
regression analysis involve situations where both x and y are random variables. In these situations, it is usually assumed that the observations (yi, xi), i = 1, 2, ..., n, are jointly distributed random variables obtained from the distribution f(y, x). For example, suppose we
wish to develop a regression model relating the shear strength of spot welds to the weld
diameter. In this example, weld diameter cannot be controlled. We would randomly select
n spot welds and observe a diameter (xi) and a shear strength (yi) for each. Therefore, (yi, xi)
are jointly distributed random variables.

We usually assume that the joint distribution of yi and xi is the bivariate normal distribution. That is,

f(y, x) = [1/(2πσ1σ2√(1 − ρ²))] exp{−[1/(2(1 − ρ²))][((y − μ1)/σ1)²
          − 2ρ((y − μ1)/σ1)((x − μ2)/σ2) + ((x − μ2)/σ2)²]},    (14-42)

where μ1 and σ1² are the mean and variance of y, μ2 and σ2² are the mean and variance of x,
and ρ is the correlation coefficient between y and x. Recall from Chapter 4 that the correlation coefficient is defined as

ρ = σ12/(σ1σ2),

where σ12 is the covariance between y and x.

The conditional distribution of y for a given value of x is (see Chapter 7)

f(y|x) = [1/(√(2π)σ1.2)] exp{−(1/2)[(y − β0 − β1x)/σ1.2]²},    (14-43)

where

β0 = μ1 − μ2ρ(σ1/σ2),    (14-44a)

β1 = (σ1/σ2)ρ,    (14-44b)

and

σ1.2² = σ1²(1 − ρ²).    (14-44c)

That is, the conditional distribution of y given x is normal with mean

E(y|x) = β0 + β1x    (14-45)

and variance σ1.2². Note that the mean of the conditional distribution of y given x is a
straight-line regression model. Furthermore, there is a relationship between the correlation
coefficient ρ and the slope β1. From equation 14-44b we see that if ρ = 0, then β1 = 0, which
implies that there is no regression of y on x. That is, knowledge of x does not assist us in predicting y.

The method of maximum likelihood may be used to estimate the parameters β0 and β1.
It may be shown that the maximum likelihood estimators of these parameters are

β̂0 = ȳ − β̂1x̄    (14-46a)

and

β̂1 = Σ_{i=1}^{n} yi(xi − x̄) / Σ_{i=1}^{n} (xi − x̄)² = Sxy/Sxx.    (14-46b)

We note that the estimators of the intercept and slope in equation 14-46 are identical to
those given by the method of least squares in the case where x was assumed to be a mathematical variable. That is, the regression model with y and x jointly normally distributed is
equivalent to the model with x considered as a mathematical variable. This follows because
the random variables y given x are independently and normally distributed with mean β0 +
β1x and constant variance σ1.2². These results will also hold for any joint distribution of y and
x such that the conditional distribution of y given x is normal.

It is possible to draw inferences about the correlation coefficient ρ in this model. The
estimator of ρ is the sample correlation coefficient

r = Σ_{i=1}^{n} yi(xi − x̄) / [Σ_{i=1}^{n} (xi − x̄)² Σ_{i=1}^{n} (yi − ȳ)²]^{1/2} = Sxy/(SxxSyy)^{1/2}.    (14-47)


Note that

β̂1 = (Syy/Sxx)^{1/2} r,    (14-48)

so the slope β̂1 is just the sample correlation coefficient r multiplied by a scale factor that
is the square root of the "spread" of the y values divided by the "spread" of the x values.
Thus β̂1 and r are closely related, although they provide somewhat different information.
The sample correlation coefficient r measures the linear association between y and x, while
β̂1 measures the predicted change in the mean of y for a unit change in x. In the case of a
mathematical variable x, r has no meaning because the magnitude of r depends on the
choice of spacing for x. We may also write, from equation 14-48,

r² = β̂1²(Sxx/Syy) = β̂1Sxy/Syy = SSR/Syy,

which we recognize from equation 14-41 as the coefficient of determination. That is, the
coefficient of determination R² is just the square of the sample correlation coefficient
between y and x.
It is often useful to test the hypothesis

H,:p=O,
(14-49)

H,:p,eO,
The appropriate test statistic for this hypothesis is

(14-50)

which follows the t distribution with n - 2 degrees of freedom if Ho: P = 0 is true, Therefore, we would reject the null hypothesis if It,l > 1"',"_1' This test is equivalentto the test of
the hypothesis He: Il, = 0 given in Section 14-2, This equivalence follows directly from
equation 1448,
The test procedure for the hypothesis
H,: P=Pa'

(14-51)

H,:p",po,

where P,,e 0, is somewhat more complicated, For moderately large samples (say n;' 25) the
statistic

l~r

Z = arctanh r = -l:n-'2 l-r


is approximately normally distributed with mean
Ilz

1 I~p
=arctanh P =-l:n-2 I-p

(14-52)

430

Chapter 14 Simple Linear Regression and Correlation

and variance

<r. =(n-3t:.
Therefore, to test the hypothesis Ho: P = Po' we may compute the statistic

Zo =(arctanh T -

arctanh pOl(n - 3)ln

(14-53)

and reject H,: P = Po if IZo! > Z.~.


It is also possible to construct a 100(1- a)% confidence interval for P using the transformation in equation 14-52. The 100(1 - a)% eonfidence intelval is
Zai2
r
' 'J ~ p
"In-3

tanh( arctanh r -

~ tanh

(arctanh r +

Z.2
~}
..In-3

(14-54)

where tanh u = (e" - .-")I(e" + e-").

~~~@1Efi
Montgomery, Peck, and VUJing (2001) descn.be an application of regression analysis in which an
engineer at a soft-drlnk bottler is investigating the product distribution and route service operations
for vending machines. She suspects that the time required to load and service a machine is related to
the number of cases of produ..-t delivered. A random sample of 25 (etail outlets having vending
machines is selected. and tOe in-outlet delivery time (in minutes) and volume of product delivered (in
cases) is observed for each outlet. The data-are shown in Table ~4-5. We assume that delivery time and
volume of product delivered ate jointly normally distributed.
Using the data in Table 14-5. we may calculate
s~ =

S" = 6105.9447,

698.5600,

S.,= 2027.7132.

The regression model is

y= 5.1145+ 2.9027<.
The sample correlation coefficient between x and y is computed from equation 14-47 as
T

S;ry

S:n;SYJ

2027.7132

0.9818.

[(698.5600)(6105.9447)1'"

Table 14-5 Data for Example 14-7


Delivery
Observation

2
3
4
5
6
7
8
9
10
11
12
13

Tune (y)

}iumber of

c.ses (x)

9.95
24.45
31.75
35.00

2
8
11
10

25.02

16.86
14.38

4
2
2

9.60
24.35
27.50
17.08
37.00
41.95

8
4

11
12

Delivery
Observation
14
15
16
17
18
19
20
21
22
23
24
25

TUlle (y)

11.66
21.65
17,89
69.00

10.30
34.93
46.59

44.88
54.12
56.63
22.13

21.15

Number of
C=(x)

2
4
4
20
1

!O
15
15
16
17
6
5

r
I

14-9 Summary

431

Note that R'l ~ (0.9818)2 0,9640, or that approximately 96.400/0 of the variability in delivery time is
explained by the linear relationship with delivery volume, To test the hypothesis

Iio: p~ 0,
Ii,: p" 0,
14~50

we can compute the test statistic of equation

as follows:

_ '~n-2
,......-.,jl_,2

0,9818,,23

to -

.,jl-0,9640

2480
,

Since to.Ol.'i,1:! = 2.069, we reject Ho and conclude fuat the correlation coefficient p;t O. Fi:::ally, we may
construct an approximate 95% confidence .interval on p from equation 14-54, Since arctanh r ;:; ;
arctmh 0,9818 = 2.3452, equation 14-54 becot:1<S

tmh' 2.3452-~ ,;p,; tmh IZ,3452+ 1.96.,


\
~22
\
-fii)
(

which reduces to
0.9585 s

P s 0,9921.

14-8 SAMPLE COMPUTER OUTPUT


Many of the procedures presented in this chapter can be implemented usiug statistical soft~
ware. In this seetion, we present the Minitab" output for the data in Example 14-1.
Recall that Example 14-1 provides data On the effect of process operating temperature
on product yield. The Minitab output is

The regression equation is


Yield

=-

2.74 + 0.483 Temp

Predictor
Constant

Coef

T~p

= 0.9503

A.~alysis

~2.739

SE Coe
1.546

-~,77

0.48303

O.0~046

46.~7

99.6%

R-Sq

P
0.114
0.000

R-Sq(adj) = 99.6%

of Variance

Source
Regression
Residual Error
Total

55

MS

~924.9

~924.9

2131. 57

0,000

7.2

0.9

~932.~

DF
1

The regression equation is pro'tided along with the results from the Hests on the individual
coefficients. The P-values indicate that the intercept does not appear to be significant
(P-value = 0.114) while the regressor variable, temperarure, is statistically significant
(P-va!ue = 0), The analysis of variance is also testing the hypothesis that Ii,: /31 = 0 and can
be rejected (P-value = 0)_ Note also that r = 46.17 for temperature, and f = (46,17)'
2131.67 = F. Aside from rounding, the computer results are in agreement with those found
earlier in the chapter,

14-9 SUMMARY
This chapter has introduced the simple linear regression model and shown how leastsquares estimates of the model parameters may be obtained. Hypothesis ..testing procedures

432

Chapter 14 Simple Linear Regression and Correlation

and confidence interval estimates of the model parameters have also been developed. Tests
of hypotheses and confidence intervals requrre the assumption that the observations y are
nonnally and independently distnouted random ...'3Iiables. Procedures for testing model
adequacy, including a lack-of-fit test and residual analysis, were presented. The correlation
model was also introduced to deal with the case where x and yare jointly normally distributed. The equivalence of the regression model parameter estimation problem for the case
where x and y are jointly nonnal to the case where x is a mathematical variable was also discussed, Procedures for obtaining point and interval estimates of the correlation coefficient
and for testing hypotheses about the correlation coefficient were developed.

14-10 EXERCISES
14-1. Montgomery, Peck. and Vwing (2001) present
data concerning the performance of the 28 National
Football League teams in 1976. It is suspected that
the number of games won (y) is related to the number
of yards gained:.'U.Sh.ing by an opponer:.t (x). The data
are shown below,
Yards Rushing by

Teams
Washington
Minnesota
New Englaod
Oakland
Pittsbcr:gh

Baltimore
Los Angeles
Dallas
Atlanta
Buffalo
Chicago

Cinc!"1.Ilati
Cleveland
Denver
Detroit
Green Bay
Houston
Kansas Cit)'
Miami
~ew Orleans
New York Gia:ats

Games Won (y)

Opponent (x)

10

2205
2096
1847
1903

11
11

13
10

1848

10
1!

1564

2577

2
7
10
9
9
6
5
5
5

2476
1984
1917
1761
1709
1901

6
4
3

'S'ew York Jets

Philadelphia
St Louis
San Diego

San. Francisco
Seatt::c

Tampa Bay

4151

11

10
6
8
2
0

1321

2288
2072
2861
2411

2289
2203
2592
2053
1979
2048
1786
2876
2560

(a) Fit a linear regression model relating games won


to yards gained by an opponent
(b) Test for significance of regression.
(e) Find a 95% confidence interval for me slope.
(d) '\Vh.a: percentage of total variability is explained
by the lIlodel?
(e) Find::he residuals and prepare appropriate residual plots.
14-2. Suppose We would like to use the model devel~
oped in Exercise 14-1 to prediet ::he number of games
a team will ..........n if it can limit the opponents to 1800
yards rushing. Find a poin~ estimate of the number of
games won if the opponents gain only 1800 yards
rushing. Find a 95% prediction interval on the number
of games WOn.

14-3.. Motor Trend magazine frequently presents performance data for automobiles. The table below presents data from the 1975 volume of Motor Trend
concern.iog the gasoline milage perfonnance and the
engine displacement for 15 automobiles.

Automobile
Apollo

Omega
t\ova
Monarch
Duster
Jensen ConY.
Skybawk

Monza
Corolla SR-5
Carnaro
Eldorado
Trans Am
Charger SE
Cougar

Co,rvette

:'files I
Gallon (y)
18.90
17.00
20.00
18.25
20.01

11.20
22.12
21.47
30.40
16.50
14.39
16.59
19.73
13.90
16.50

Displacement
(Cubic

(x)

350
350
250
351
225
"40
231
262
96.9
350
500
400

318
351
350

Exexoises

,4-10

433

(a) Fit a regression model relating mileage performance to engine displacemenL

(a) Fit aregression model relating sales price to taxes

(b) Test far significance of regression.

(b) Test for significance of regression,

(c) 'What percentage of total ,,"arizbility in mileage is


explained by the model?

(c) W"hat percentage of the variability in selling price

(d) Fi:r:.d a 90% confidence inten'a! On the mean


mileage if the engine displacement is 275 cubic
inches,
144. Suppose that we wish to predict the gasoli:r:.c
mileage from l! car v.ith a 275 cubic inch displacement

engine, Find a point estimate. using the model developed in Exercise 14-3. ar.d an appropriate 90% interval
estimate. Compare this interval to the Onc obra:ned in
Exercise l4-3d, 'Which one is wider. and why?
Fmd the rcsidl;als from the moder in Exercise
14.-3. Prepare appropriate residua: plors anC commer.t
on model adequacy.

paid.

is explained by the taxes paid?


Cd) Find the residuals for this model. Construe: 11 normal probability plot for the residuals. Plot the
residuals versus y and versus x. Does the model
seem satisfactory?

14-7. The stre:r:gth of paper used ill the manufacture of


cardboard boxes (y) is related to the percentage of
hardwood concentration in the original. pulp (x). UnCer
controlled cor.ditions, a pilot plant manufaetu.'>'CS 16
samples. each from a different batch of pu~p, and
measures the tensi::e strength. The data are shown here.

14~5.

14~6. An article in TechnomeIrics hy S. C. Narula and


J. E Wellington ("Prediction, Linear Reg:ession, and
a :Mi.nin::n.ntJ. SUDl of Relative Errors:' VoL 19, 1977)
presents data On the selling price and annual taxes for
27 houses. The data are shown below.

:~~4 (:7;411:~/ 11~~211~~911:60911~8 1~:9

y ! 11.3 123.0 12':),1 145.2 1343 ,,<.:.4.5 J43.'1 146.9

zlZSl28 I

2$ 3.0

3.0 3.2 I B

(a) Fit a simple linear regression model to the data,


(b) Test for lack of fit and significa.'1ce of regression.

(c) Construct a 90% confidence interval on the slope


Taxes (Local, School.
Sale Price f 1000

11000

25.9
29.5
27.9
25.9
29.9
29.9
30.9
28.9
35.9

4.9176
5.0208

(e) Construct a 95% confidence interval on the true

4.5429

14~8.

31.5
3LO
30.0

5.3003
6.2712
5.9592
5.0500

36.9

8.2464

Jan.
Feb.

41.9

6.6969

Mar.

4().5

7.7841

Apr.

30.9

43.9
37.5
37.9

44.5
37.9
38.9

36.9

County)

fl,
Cd) Construct a 90% confidence interval 00 the inter~
cept flo.

45.8

4.5573
5.0597
3.8910
5.898Q5.6039
5.8282

9.0384

5.9894
7.5422
8.7951
6.0831
8.3607
8.1400
9.1416

regression line at :t =: 2.5.


Compute the residuals for the regression model
in Exercise 14-7. Prepare appropriate residual plots
and comment on model adequacy,
14..9. The number of pounds of steam used per month
hy a chemical plant is thought to he related to the aver~
age ambient temperature for that month, The past year's
usage a...'1d temperatures are shown in the following
table.
Month

May
June

July
Aug,

Sept.
Oct.

No\"
Dec.

Temp.

Usage 1 1000

21
24
32
47
50
59
68
74

185.79
21".47

62
50
41
30

288.03
42".8'
45".58
539.03
621.55

675.06
562.03
452.93
369.95
273.98

434

Chapter 14 Simple Linear Regression and Correlation

(a) Pit a simple linear rcgression model to the data.


(b) Test for significance of regression,

Average Size

Level

8530
8544
7964
7440
6432
6032
5125
4418
4327
4133
3765

18.20
18.05
16.81
15.56
13.98
14.51
10.99
12.83
11.85
11.33
10.25

(e) Test the hypothesi! that the slope 131 = 10.


(d) Construct a 99% confidence interval about ilie
true regression line atX'''''' 58,
(e) Construct a 99% prediction interval on the steam
usage in tlie ne:;.:.t month having a mean ambient
temperat.'Jl'c of 58<>,
14-10, Compute the residuals for the reg::ession model
i::. Exercise 14~9. Prepare appropriate residual ploto;;
and comment on model adequacy.
14-11. The percentage of impurity in oxygen gas pro-.
eueen by a distilling process is iliought to be related to

the percentage of Ilydroca:bon in the main condenser


of the processor. One mon1.'s operating data are avail~
able. as shown in the table at the bottom of this page.
(a) Pit a simple linear regress~o;). model to the data.
(b) Test fot lack of fit and significance of regresston.
(c) Calculate R2 for this model
(d) Calculate a 95% confidence interval for the slope

13,
14~12.

Compute the residuals for the data in Exercise

14-11.
(a) Plot the residua:s on norma: probability paper and
draw appropriate coucbsions.
(b) Plot the residuals againsty and x. Interpret these
displays.
14~13. NJ article in Transportation. Research (1999.
p. 183) presents a study On world maritime employ~
ment, The pwpose of the St.:.dy was to determine a
relationship between average manning level and the
average size of the fleet. Mar..rung level refers to the
ratio of number of posts that must be manned by a
seaman per ship (posts/ship). Data collected for ships
of the lIn.:.:ed Kingdom over a 16~yeax period are

Average Size

Level

9154
9277
9221
9)98
8705

20.27
19.98
20.28
19.65
18.81

continues
Purity (%)

HYGr'ocarbon (%)
Purity (%)

Hydrocarbon (%)

(a) Fit a l.i:near regression model relating average


:manning level to avemge ship size.
(b) Test for significance of regression.
(c) Find a 95% confidence interval 0:: the slope.
{d) '\\-'hat percentage of total variability is explained
by ilie model?
(e) Find the residuals and construct appropriate residual plots.
14-14. The final averages for 20 randomly selected
stv.dents taking a course in engineering statistics and a
course in operations researcb at Georgia Tech a:e
shown here, Assume that the final averages are jointly
normally distributed.
Statistics

OR
Statistics

OR
(a) Find the regression line relating the statistics final
average to the OR final average.
(b) Estimate the correlation coefficient.
(c) Test the hypothesis that p: O.
(d) Test the bypothesi.~ that p:::: 0.5.
(e) COIlSQUct a 95% confidence interval esti:nate of
the correlation coefficient.
14-15. The weight and systolic blood pressure of 26
randomly selected males in the age gl'O'Jp 25-30 are

86.91

87.33

86 .29

89.86

1.02

0.95

1.11

1.02

%.73

95.00

1.46

1.01

I 96.85

85.20

90.56

0.99

0.95

0.98

14-10 Exercises
shown in the following table. Assume that weight and
blood pressure are jointly normally distributed.
Subject

Weight

Systolic BP

165
167
180

130

2
3

155
212

128

5
6
7

133
150
lSI

13

175
190
210
200
149
158
169
170

14
15

172
159

153
128

16
17

168
174
183

149

8
9

10
11
12

18
19
20
21
22
23
24

25
26

215
195
180
143

:240
235
192
187

146

150
140
148
125

435

Is the es-:ir.1ator of t.'1e slope in the simple linear


regression model unbiased?
14~18. Suppose that we are fitting a straight line and
we wish to make the variance of the s10pe Pl as small
as possible. Vt'bere should the observations x" i:::. 1,
2"", n, be taken so as to minimize V(fi l )? Dis~ss the
practical implications of this allocation of the X"

14M19. Weighted Least Squares. Suppose thar we are


fitting the straight line y;;;::: flo "l' iJjx + but that the
vzriance of the y values now depends on the level of x;
that is,
i:::.1,2 .... ,n,

133

135
150

where the Wi are unknown constants, often called


weights, Show that the resulting !easHquares normal
equations are

132

50

158

150

l",!

WI -

PI:t w;x; = .t
1=1

W;Y,',

;""

/loLwixl-P:Lwixr= Lw/x;'Y;.

163
156

t=l

;"'!

i",:

124
170
165
160
159

(a) Find a regression tine relating.systolic blood pres~


sure to weight,
(b) Estimate the correlation coefficient.
(e) Test the hypothesis that p = O.
(d) Test the hypothesis that p = 0.6.
(e) Construct a 95% confidence interval estito.a.."e of
the correlation coefficient
14-16. Consider the siJ:r.ple linear regression model
Y= 130 ..... f3;x + c, Snow that E(MS~) ::,,;12 ..... fl~S.tt'
14-17. Suppose that we have assumed the straight~line

regression model

but that the response is affected by a second variable,

.tz, such t..~t the true regression function is

14 20. Consider the data shOWll below, Suppose that


the relationship between y and x is hypothesized to be
y = (iJ':! + PIX + Erl. Fit an appropriate model to the
data, Does the assumed model form seem
appropriate?
M

0.24

14-21. Consider the weighta.n.d blood pressure data b


Exercise 14-15. Fit a nowbtercept model to the data,
and compare it to the model obtained in Exercise
14~15. Which model is superior?

14-22. The following data, adapted from


Montgomery, Peck, and Vming (2001), present the
number of certified mental defectives per 10,000 of
estimated population in the United Kingdom (y) and
the number of radio receiver licenses issued (x) by the
BBC (in millions) for the years 1924-1937. Fit a
regression model relating y to x. COIll.m.ent on the
model. Specifically, does the existence of a strong cor~
relation imply a cause~and-effect relationship?

436

Year

Chapter 14 Simple Linear Regression and Correlation

Number of Certified
Mental Defectives per

Number of Radio
Receiver Licenses

10,000 of Estimated
U.K Population (y)

Issued (Millions)
in the U,K (xl

8
8

1924
1925
1926
1927
1928
1929
1930
1931
1932
1933

12
16
18
19

1934

20

1935
1936
1937

21

1.350
1.960
2,270
2.483
2.730
3.091
3.674
4,620
5,497
6,260
7,012
7,618

22

3.m

23

8,593

10
;1

11

.J

Chapter

15

Multiple Regression
Many regression problems involve more than one regressor variable. Such models are
called multiple regression models. Multiple regression is one of the most widely used statistical techniques. This chapter presents the basic techniques of pardII1eter estimation, confidence interval estimation, and model adequacy checking for multiple regression. We also
introduce some of the special problems often encountered in the practical use of multiple
regression including model building and variable selection. autocorrelation in the errors t
and multicollinearity or near-linear dependence among the regressors.
j

15-1 MULTIPLE REGRESSION MODELS


A regression model that involves more than one regressor variable is called a multiple
regression. model As an example, suppose that the effective life of a cutting tool depends
on the cutting speed and the tool angle. A mnltiple regression model that might describe this
relationship is
(15-1)
where y represents the tool life, XI represents the cutting speed, and x 2 represents the tool
angle, This is a multiple linear regression model with two regressors. The term "linear' is
used because equation 15-1 is a linear function of the unknown parameters f3o~ f31~ and~,
Note that the model describes a plane in the two-dimensionalx1 .:.s space. The parameter f30
defines the intercept of the plane. We sometimes call /3, and A partial regression coefficients, because {3J measures the expected change in y per unit change in x: when A2 is held
constant, and A measures the expected change in y per unit change in X, when Xl is held
constant.
In general; the dependent variable or response y may be related to k independent vari~
abies. The model
(15-2)

is called a mnltiple linear regression model with k independent variables. The parameters
~,j = 0, I, ... , k, are called the regression coefficients. This model describes a hyperplane
in the k-dimensional space of the regressor variables {x). The parameter f3; represents the
expected change in response y per unit cbange in Xj when all the remaining independent
variables Xi (i :;t)} are held constant. The parameters ~.,j;;;:;:; 1, 2, .,., k, are often called par~
rial regression coefficients, because they describe the partial effect of oue independent 'tari~
able when the other independent variables in the model are held constant.
Multiple linear regression models are often used as approximating functions. That is;
the true functional relationship between y and .x" X" . ,.x, is unl::no"'D, but over certain
ranges of the independent variables the linear regression model is an adequate
approximation.

437

438

Chapter 15 Multiple Regression

Models that are more complex in appearance than equation 15-2 may often still be ana~
lyzed by multiple linear regression techniques. For example, consider the cubic polynomial
model in one independent variable,

Po + P,x+ pzX' + f3i' + e.

.I
i

(15-3)

If we let x, =x, A1 ::;;r, andxJ =;:3, then equation 15-3 can be written
y = Po + PIXI + {3.,x,+ {3.,x) - e,

(15-4)

which is a multiple linear regression model wil.h three regressor variables. Models that
include interaction effects may also be analyzed by multiple linear regressiOTIlllethods, For

example, suppose that the model is


y=

If we let x) = XIx" and

p,

Po + PIX, + f3,x, + Pi',xlx" + s.

(15-5)

Pll' then equation 15-5 can be written


y = Po + P,x l + f3,x, + {3.,x, + e,

( 15-6)

which is a linear regression model In general, any regression model that is linear in the
paramJJrers (111e {3's) is. linear regression model, regardless of the shape of the surface 111.t
it generates.

152 ESTIMATION OF THE P.4.RAMETERS


The method of least squares may be used to estimate the regression coefficients in equation
15-2. Suppose that n;> kobservations are available, and let Xlj denote the ith observation Or
level of variable x,. The data will appear as in Table 15-1. We .ssume that the error term 8
in the model bas E(E) 0, V(s) 0", and 111at the if,} are uncorrel.ted random variables.
We may write the model, equation 15-2, in terms of the observations,
Yi = [10 + f31xil ~ fllxf2 +.,.+ flkxik + ej
k

'=

J30 + LJ3j xJj + ei

i::::l.2 ... ,lL.

(15-7)

r=:l

The least-squares function is

(15-8)

Table 15..1 Data for Multiple Linear Regression


y
Xu

_-,V-"___X::;,,, .~_ _x..:",-::....._ _ _ _ _x::,:"''-_

15-2 Estimation of the Parameters


The function L is to be minimized with respect to

A"

439

/3" ... , /31- The least-squares estiroa-

tors of /30' /3" _", /31 must satisfy


(\5-9.)

and
j

~ 1,2, .... k.

(15-9b)

Simplifying equation 15-9, we obtain the least-squares nonnal equations


...

"r:

n/3o + /3.:I,Xil
i",l

ffi L,x.
v

,.

r""l

+h:2>i2

+ ... + fl.L,Xik

i~:

+ /31 L, Xi!

+ ffi2 L, X il X
', +.,

.,

/."'i

i:1

+;3k 2. xilxi/;; ;::: LXilYi'

1::.1

i""l

j",l

l"'l

=L,y,.
(15-10)

Note that there are p;::: k + 1 normal equations, one for each of the unknown regression coef~
ficients, The solution to the no.rrnaI equations will be the least-squares estimators of the
regression coefficients /10,
r
1ti5 simpler to solve the normal equations if they are expresse<.iin matrix notation, We
now give a matrix development of the normal equations that parallels the development of
equation 15-10. The model in terms of the observations, equation 15-7. may be written in
matrix notation,

/Jll ... , /J

y=XIHe,
where

:1

x 11

[";:J x~l: X.,


X 21

y=

'0[1]

and

x"

.,

>": 1'

-"22

... Xu

-".2

.. '

x""

,-[::1
1';_

In general, y is an (n X 1) vector of the observations, X is an (n xp) matrix of the levels of


the independent variahles, II is a (p x I) vector of the regression coefficients, and s is ail
(n

I) vector of random errors.


We wish to fud the vector of least~squares estimators. ~ that :c:litlimizes

L= L,SY
i=l

e'e=(y-X[3)'(y-XII).

440

Chaprer 15 Mcltiple Regression


Note that L may be expressed as

L = y'y -Il'X'y - y'X~+ Il'X'X~


(15-11)

y'y - 21l'X'y + Il'X'Xf3,

since IfX'y is a (1 x 1) matrix, hence a scalar, and its transpose (IfX'y), = y'X~ is the same
scalar, The least-squares estimators must satisfy

aLI

'

=-2X'y+2X'X~=O,
a~~
which simpliftes to

X'X~ X'y.

(15-12)

Equations 15-12 are the least-squares normal equations. They are identical to equations
15-10. To solve the normal equations, multiply both sides of equation 15-12 by the inverse
of X'X. Thus, the least-squares estimator of ~ is

p= (X'Xt'X'y.

(15-13)

It is easy to see that the matrix form of the normal equations is identical to the scalar
form. Writing out equation 15-12 in detail we obtain

,
n

l>i!
ie::

LXi.<
cpo
,:;;;;1

LX'2

L,xa LxA

LXi:Xn

LXnXi,k

j""l

1"'1

l=1

1=1

t=l
n

LV"

tti.l
n

A1=

LXllY'
:=1

-I

LXikYi

LX"
L,=!

LXu:xil
i"'~

LX~

X ik X l"2

1::::1

i=l

PkJ

,1""1

If the indicated matrix multiplication is performed, the scalar form of the normal equations
(that is, equation 15-10) will result. In this fonn it is easy to see that X'X is a (p X p) symmetric matrix and X'y is a (p x 1) column vector. Note the special structure of the X'X
matrix. The diagonal elements of X'X are the sums of squares of the elements in the
columns QfX, and the off-diagonal elements are the sums of cross products of the elements
in the columns ofX. Furthermore, note that the elements ofXfy are the sums of cross products afthe columns of X and the observations {yJ.
The fitted regression model is

Y=Xp.

(15-14)

In scalar notation, the fitted model is


k

y,

Po + LP

j );.,

j:::1

The difference between the observationy, and the fitted valuey; is a residual, say e;:::: Yt
The (n xl) vector of residuals is denoted

o=y-y.

-Yi'

(IS-IS)

15~2

Estimation of the Parameters

441

iu.l article in the Journal of Agricultural Engineering and Research (2001, p. 275) describes the use
of a regression model to relate the damage susceptibility of peaches to the height at which they are
dropped (drop height. measured in rom) <C!d the density of the peach (measured in glcmJ ). One gocl

of the ana]ysis is to provide a predictive model for peach damage to serve as a guideline for harvesting and postharvesting operations. Data typical of this type of experiment is given in 1'able 15-2.
We will fit the multiple linear reg:::-e.....sion model
y=

i>o+ pjx 1 + f3J.~ +"

to these data. The X matr:..x and y vector for this model are

303.7
366.7
336.8
304.5
346.8
600.0
369.0
418.0
269.0
323.0
562.2
284.2
558.6
415.0
349,5

I:!1
;1

il

1.01
0.95

1.53
4.91

0.98

10.36

1.04
0.96
1.00
1.01
0.94
1.01 '
0.97
1.03
1.01
1.04
462.8 1.02
333.1 1.05
502.1 1.10
311.4 0.91
351.4 0.96J

:1
1
1
1

x=

3.62
7.27
2.66

O90i
U14

.1
I

1
1

5.26
6.09
6.57
4.24

y~

s.o4
3,46
8.50
9.34
5.55

8.11
7.32
12.58.
0.15
5,23

The X'X matrix is

X'X=[30~.7

366.7

,,:, l'

...

1.04

0.90

303.7
366.7 1.04

oro]

0.96 c ;

351.4 0.96

7767.8
3201646

7791.878 ,

19.93

7791.878

19.9077

[ 20
= 7767.8

19.93

and the X' y vector is

X'y= 303.7 366.7


0.90

1.04

3 62
1 'JC . ]
351.4
0.96

7.~7 ~

r51129.17J.
120.79

5'

122.70

.25

1
442

Chaplel15

Multiple Regression

Table 152 Peach Darr.age Data for Example 15-1

Observation
Number
2
3
4
5
6
7
8
9

10
11
12
13
14
15
16
17
18
19

20

Damage (rom),

Drop Height (mm),

Fruit Density (gIcm,),

3.62
727
2.66
1.53
4.91
10.36
5.26
6.09
6.57
4.24
8.04
3.46
8.50
9.34
5.55
8.11
7.32
12.58
0.15
S.23

303.7
366.7
336.8
304.5
346.8
600.0
369.0
418.0
269.0
323.0
5622
284.2
558.6
415.0
349.5
462.8
333.1
502.1
311.4
351.4

0.90
1.04

1.01
0.95
0.98

UJ4
0.96
1.00
1.01

0.94
1.01

0.97
1.03
l.01
1.04
1.02
1.05
1.10
0.91
0.96

The lea.')t-squares estimators are found from equation 15~13 to be

~ = (X'Xt'X'y,

or

[ ~lj=
~Ol

7767.8

[ 20

3201646

7767.8
19.93 7791.878

~,

r 24.63666

19.93 T'C 120.79]

7791.878J' l'S1l29.17
19.9077

122.70

0.005321 -26.74679'C 120.79 '


0.0000077 -().008353 rl'51129.17J'

=l 0.005321
-26.74679 -{).008353

30.096389 j

122.70

-33,831]

= 0.01314 .
[

34.890

Therefore, the fitted regression model is

y=-33.831 + O.01314x, + 34.890x,.


Table 15~3 shows the fitted values of)' and the residuals. The fitted values andresidua1s are calculated
to the same accuracy as the original data.

15-2 Estimation of the Parameters


Table 15-3

Observations, Fitted Values, and Resid'J:a1s for Example 15-1

Observation Number

y,

1
2

3.62
7:17
2.66

4
5
6

1.53
4.91
10.36

7
8
9

2.06

1.56
7.27
5.83
3.31

-l.7S

4.92

-0.01

0.02

5.26

10.34
451

6.09

6.55

10
11

651
4.24
8.04

12

3.46

13
14
15
16
17

8.50
9.34
555
8.11
1.32
12.58
0.15
5.23

4.94
3.21
8.79
3.75
9.44
6.86
7.05
7.84
7.18
11.14

18
19
20

443

0.00
-3.11

2.01

0.75
-0.46
1.63
1.03
-0.75
-0.29
-0.94
2.48
-1.50
0.27
0.14
1.44
-1.86

4.28

0.95

'The statistical properties of the leastRsquares estimator ~ may be easily demonstrated.


Consider first bias:
E(ii) = E [(x'X.t'X'y]
E [(x'X)'X'(xP + Ell

E [(X'xr'x'Xl3 + (X'XY'X' EJ

= I),
since E(e) = 0 and (X'XY'X'X = 1. Thus ~ is an unbiased estimator of fl. The variance
property of ~ is expressed by the covariance matrix

Cov(~)

E ([~ - E{~)][~ -E(~)]').

The covariance matrix of ~ is a (p xp) symmetric matrix whosejjth element is the variance
of ~ and ",hose (i,j)th element is the covariance between ~i and
The covariance matrix
of I) is

Cov(~)
It is usually necessary to estimate

cr'(X'X)'.

f.il. To develop this estimator, consider the sum of

squares: of the residuals, say

:2A
/.1

: :;: e'e.

444

Chapter 15

Multiple Regression

Substituting e=y-y =y -X~, we bave


SS" = (y - X~)'(y

X~)

= y'y - P-X'y - y'X~ + ~'X'X~

= y'y-2~'X'y + ~'X'Xp.
Since X'XP = X'y, this last equation becomes
SS= y'y

(1516)

P'X"y.

Equation 15-16 is called the error or residual sum of squares. and it bas n - p degrees of
freedom associated with it. The mean square for errOr is
MS,=

(15-17)

n-p

It can be shown that the expected value of MS, is 0-2; thus an unbiased estimator of (f'l is
given by
(15-18)

&'=MS,

t"".,:pi?f5~
We \Vill estim..a~ t.'le error va:ciance <:f for the multiple regression problem in Exan:.p1e 15-1, Using the
dara in Table

15~2,

we find
20

y'y=

I,0 =904.60
1",1

and

~'X"Y=[-33.831

120.79 1
0.01314 34.890J 51129.17J
[
122,70

= 866.39.
Iberefore, the e:ctor sum of squares

is
SSe = y'y - ~'X'Y

904.60 - 866.39

=38.2[.
The estimate of IT is
'2 _ SSe _ 38.21 _ 2 04" .. - - - - - - ._ i_

n- p

20-3

153 CONFIDENCE INTERVALS IN M1;"LTIPLE LTh"EAR


REGRESSION
It is often necessary to construct confidence intervcl estimates for the regression coefficients
{fl,}. The development of a procedure for obtaining these confidence intervals requires that
w~ assume the errors (o,) to be normally and independently distributed with mean zero and
variance cr'. Therefore, the observations {Y,) are normally and independently distributed

Confidence Intervals in Multiple Linear Regresskn

15-3

r;.

445

witb mean fio +


1 f1:Xi1 and variance r.r. Since the least-squares estimator /lis a linear com~
bination of the observations, it follows that is nonnally distributed with mean vector II and
covariance matrix u'(X'Xt'. Then each of the qUlllltities

Ii

P P
j -

1~2

ejj '

'lCf

O,l... k,

(15-19)

is distributed as twith n - p degrees oifreedom, where CJJ is tbei/th element oitbe (X'Xt!
matrix and if' is the estimate of the error variance, obtained from equation 15-13, Therefore,
a 100(1 a)% confidence interval for the regression coefficient Py j = 0, 1, ... , k, is
(15-20)

We will construct a 95% confidence interval on the parameter f31 in Example 15-1. Note that:he point
est:i:n.ate of PL is fil "'" 0.01314, and me diagonal element of (X'X)-! corresponding to fJl is en =
0,0000077, The estimate of <f was obtained in Example 15-2 as :.247. and til.o:.~.11 = 2.110. Therefo~
the 95% confu:1ence interval on [3, is computed from equation 15-20 as

O.01314-(2.1l0)~(2.247)(o.oOOO077) ." /3, i 0.01314+(2, ;)0).)(2.247)(0.0000077).


which reduces to

0,00436"; /3, ,,; 0,0219,

We may also obtain a confidence interval on the mean response at a particular point, say
(X 01 ' Xw "., xoJ To estimate the mean response at this point define the vector
1

The estimated mean response at this pom.t is

Yo=x;,~.

(15-21)

This estimatoris unbiased, since EGo) = E(x;,~) = x;,11 = EVe)' and the variance afY, is
(15-22)

Therefore, a 100(1- a)% confidence interval on the mean response at the point (x,., x"' ""
..tJt)is

Equation 15~23 is a confidence interval about the regreSSion hyperplane. It is tbe multiple
regression generalization of equation 14~33.

446

Chapt:r 15 M",tiple Regression

The scientists conducting the experiment on damaged peact.es in Example 151 would like to

OOn~

struct a 95% confidence interval on the mean damage for a peach dropped from:a height of Xl'" 325
~ if its density is Xi = 0.98 glcm? Therefore,

The estimated mean response at this point is found from equation

15~21

to be

-33,831]
325 0.98] 0.01314 = 4,63.
[ 34.890
The variance of Yo is estimated by

: 24,63666

O"x,(X'Xr'xc =2,247[1 325 0.98]1

000;>321

0,005321

-26.74679]~

0.0000077

-(l.008353

1 ]
325

c-26.14679 -(l,008353 30,096389lo,98

=2.247(0,0718) =0,1613,
Therefore, a 95% confidence interval On the me3l1 damage at this point is found from equation 15-23
to be

which reGuces to
3.78 S E(y,) :;; 5,48,

15-4 PREDICTION OF ],;'EW OBSERVATIONS


The regression model can be used to predict future observations on y corresponding to particular ..'alues of the independent variables, say Acl' xC<,; , XQI:' If ~ :::: [1, X:)I' XC;!, , X ct];
then a point estimate of the future observation Yo at the point (xm Xoo. ~ Xl)k) is
(15-24)
A 100(1 - a)% prediction interval for this future observation is

Cl(I'
+:<0 (X",,)-l Xc )

YC-tctj2,I1-p1./cJ

"'

.A.

(15-25)

/',c-,(;---,-,-X--X-)-:-t---C'

sYo -Yo +t"12,._p~d'- I+xc\

"0i'

TIlls prediction interval is a generalization of the prediction inteI'\la! for a future observation
in simple linearregressio~ equation 14--35.
In predicting new observations and in estimating the mean response at a given point
(xc;. Xw , X Ok )' one must be careful about extrapolating beyond the region containing the
original observations. It is very possible that a model that fits well in the region of the orig
inal data will no longer fit well outside that region. In multiple regres>ion it is oiren easy to
inadvertently extrapolate, since the levels of the variables (XI1 xa.> ... Xi*>' i;;;; 1, 2, , .. , n)
jointly define the region comaining the data. As an example, consider Fig. 15-1, which illus-

15~5

Hyporb.esis Testing in Multiple Linear Regmision

447

: XOj

L
,
Figure

15~1

Onginal

rangs for x1

An example 0: extrapolation in multiple regression.

trates the region containing the observations for a two-variable regression model. Note that
the point (.:tol' xO") lies within the ranges of both independent variables Xl and ~~ but it is outside the region of the original observations, Thus, either predicting the value of a new
observation or estimating the mean response at this point is an extrapolation of the original
regression modeL

Suppose mat the scientists in Example lSffl wish to construct a 95% prediction interval on the damage on a peach that is dropped from a heignt of-XI = 325 IIlIl1 and has a density of.;s =0.98 glcmJ Note
that x:: = [1325 0.98], and the point estimate of the damageisYn =x~~ =4.63 rom. Also. inE.~ample
15-4 we calculated x~(X'Xrlx.:: = 0.0718, Therefore, from equation 15-25 we have

4.63 -2.1l0~2.247{1 + 0.0718) 5; Y, ~ 4.63+ 2.110,/2.247(1 +0.0718),


and the 95% prediction interval is

15-5 HYPOTHESIS TESTING IN MULTIPLE LINEAR REGRESSION


In multiple linear regression problems, certain tests of hypotheses abeut the model parameters are useful in measuring model adequacy. In this section, we describe several importan! hypothesis-testing procedures. We continue to require the nonnality assumption on the
errors, which was introduced in the prevlous section.

15-5,1 Test for Significance of Regression


The test for significance of regression is a test to determine whether there is a linear relationship between the dependent variable y and a subset of the independent variables XI' hzl
x,t:. The appropriate hypotheses are
f

448

Chapter 15 Multiple Regression

H,:/l ~ /3,~ ... ~ /3,=0,

(1526)

H,: f3;" for at least onej.

Rejection of He: /3) == 0 implies that at least one of the independent variables XI' x?, .... xk
contributes significantly to the model. The test procedure is a generalization of the proce~
dure used in simple linear regression. The total sum of squares SfY is partitioned into a sum
of squares due to regression and a sum of squares due to error, say

Syy=SS,+SS,.
and if H,:

/3, = 0 is !rue, then S5.10" -

xi, where the number of degrees of freedom for the

x' is equal to the number of regressor variables in the model. Also, we can show that S5,10"
- X;,'," and SSe and SS, are independent. The test procedure for H,: J3, = a is to compute
_. SS,lk
_ MS R
EeSSE!(n-k-l) MSE

(15-27)

and to reject Ho if FI) > F a.k,lI-i.'-I' The procedure is usually summarized in an analysis of vari~
ance table such as Table 15-4.
A computational formula for SS, may be found easily. We have derived a computa
tion~ formula for SSE in equation 15-16, that is,

SSE=y'y- ~'X'y.

Now since SyJ :;; l,~",y~ - O:';"'ly'-y1/n

= y'y - CZ;"'ly,lln.

we may rev.rite the foregoing

equation

or

Therefore, the regression sum of squares is

(1528)
the error sum of squares is
(1529)
and the total surn of sq!WeS is

(1530)

15~5

Hypothesis Testing in Multiple Lir.ear Reg;r<>...ssion

449

Table 154 Analysis of Variance for Significance of RegreSSIon in MultipJe Regression

Sou..rce of
Variation
Regression
Er.ror or residual
Total

Sillll of
Sc;u"m

Degrees of
Freedom

Square

SS,
SSE

k
n-k-l

MS,
MS,

S;),

Mean

n-[

'~~j~~~~
We v.:ill test for significance of regression using the damaged peaches data from Example 15~ L Some
of the numerical quantities required are calculated in Example J5~2. Note that
I

',2

fi

Syy=y'y-- .L,y,.
n \i=!

= 904.60 (120.79)'

20
=175.089,

SSR~~'X'Y-.!.( :~.>,r
n\l:!

=866.39
~

SS,

(,20.79)'
20

[36.88.

=Syy -

5S,

~y'Y-~'X'Y
=38.21.
The analysis of variance is shown in Table 15-5, To testR(J: f31 = A =0, we calcula[e the statistic

Po = MS,

MS,

= 68.44 =30.46.

2.247

Since Fo > Fo,IJ~;)',.1 """ 3.59, peach damage is related to drop height, froitdensity, or both. However, we
note that this does not necessarily imply that the relationship found is an apprOpriate oae for predicting damage as a function of drop height or fruit density. Further tests of model adequacy are required.

Table 15-5 Test for Significance of Regression for Example

Source of
Variation

Squares

Sumo:

Regression
EI:ror

136.88
38,21

17

Total

175.09

19

15~6

Degrees of
Freedom

Square

F,

68."4

30.46

Mean

2.247

450

Chapter 15 Multiple Regression

15-5.2 Tests on lndividual Regression Coefficients


We are frequently interested in testing hypotheses on the individual regression coefficients.
Such tests would be useful 10 determining the value of each of the independent variables in
the regression model For example, the model might be more effective with the inclusion of
additional variables, or perhaps with the deletion of one or lUore of the variables already in
the modeL

Adding a variable to a regression model always causes the sum of squares for regression to increase and the error sum of squares to decrease. We must decide whether the
increase in the regression sum of squares is sufficient to warrant using the additional variable in the model. Furthermore, adding an unimPOrtant variable to the model can actually
increase the mean square error) thereby decreasing the usefulness of the model.
The hypotheses for testing the significance of any individual regression coefficient, say

f3i , are
Ho: f3j~ 0,
(15-31)

Hl:~''''O.
If H,: f3j

= 0 i, not rejected, then this indicates that xJ can possibly be deleted from the

model. The test statistic for this hypothesis is

p,

to ;

~"a:!c;'

(15-32)

A.

ejj is the diagonal element of (X'xyt corresponding to The null hypothesis


= 0 is rejected if Itol > 1"",.,_ ,. Note that this is really a partial or marginal test,
because the regression coefficient depends on all the other regressor variables xli ;) that
are in the modeL To illustrate the ;se of this test. consider the data in Example 15-1, and
suppose that we want to test
where

He:

p;

He: f3, = 0,
H,: {3," O.
The main diagonal element of (X'Xt' corresponding to jj, is

en ~ 30.096, so the r statistic

in equation 15-32 is
.. 34.89
_ 4.24 .
.)(2.247)(30.096)
Since to"" 22 = 2.110, we reject Ho: /3, =' 0 and conclude that the variable x, (density) contributes sigiuficantly to the modeL Note that this test measures the marginal or partial con~
tribution of X1 given that Xl is in the model.
We may also examine the contribution to the regression sum of squares of a variable,
say x" given that other variables x, (i J) are included in the model. The procedure used to
do this is called the general regression significance test, or the "extra sum of squares"
method. This procedure can also be used to investigate the contribution of a subset of the
regressor variables to the model. Consider the regression model with k regressor variables
y~Xll+,

whereyis (nX I), Xis (nxp), II is (px I),. is (nx I), andp = k+ 1. We would like to
determine whether the subset of regressor variables Xt~ X:l~ .. "' xr (r < k) contributes

15-5

Hypo~esis

Testing in Multip:e Lineru; Regression

451

significantly to the regression model. Let the vector of regression coefficients be pa.'1itioc.ed
as follows:

where

P: is (rx I) and ~2 is [(P -

r) X IJ. We wisb to test the hypotheses

Ho: Pl =0,

(l5-33)

H,: p, ;cO.
The model may be \.VIitten

(15-34)
where Xl represents the columns of X ass.ociated with PI and ~ represents the columns of
X associated ",itb ~"
Fortbeft,II1110del (including both 13, and ~,), we know that ~ = (X'Xr'X'y. Also, the
regression sum of squares for all var..ables including the intercept is
(p degrees of freedom)

and
y'y -~'X'y
MS" = --'-""""""'.
~
n- p

SS,(I3) is called the regression sum of squares due to 13, To find the contribution of the terms
in 13, to the regression, fit the model assuming the null bypothesis Ho: 13, =0 to be true, Tbe
reduced model is found from equation 15-34 to be
y = X'~2 +e,

(15-35)

The least-squares estimator of ~, is ~, = (X,x,r'~, and

S5,(J~,) =~Xy
The :egression sum of squares due to

(p

rdegrees of freedom).

(15-36)

13, given that a,z is already in the model is

SS,(I3,:~,)

SSR(~) -SS,l~'),

(l5-37)

This sum of s.q1.lZlIes has r degrees of freedom. It is sometimes called the "extra sum of
squares" due to ~ ,. Note tblIt S5,(~,1~,) is the increase in the regression sum of squares due
to including the variables x" x" '"., x, in the model. 1'ow S5'(13 ,113,) is independent of MS E'
and the null hypothesis ~, = 0 may be tested by the statistic

sSR(i3d~2)lr
(15-33)
If Fo > Fa. f, "_p we reject HCr< concluding that at least one of the parameters in 1">1 is not zero
and, consequently, at least one of the variables XI' Xz... " x!' in Xj contributes significantly
to the regression model Some authors call the test in equation 15-33 a partial F-test
The partial F-test is very useful. We can use it to measure the contribution of xJ as if it
were the last var.able added to the model by computing

SSiflip", p" ... , Pi-I' /5;."

... , pJ,

452

Chapter 15 Multiple Regression

This is the inerease in the regression sum of squares due to adding X; to a model that already
includes XI' , .. Xj_I' ;:,"H' .. xk' Note that the partial F-test on a single variable Xj is equivalent to the t-test in equation 15-32. However, the partial F-tes! is a more general procedure
in that we can measure the effect of sets of variables. In Section 15-11 we will show how
the partial F-test plays a major role in model building, that is, in searebing for the best set
of regressor variables to use in the model.

Consider the damaged peaches data in Example 15-1. We will investigate the contribution of the variable x,. (density) to tte model. That is, we wish to test

He: fl, = 0.
H,: /3," 0,
To test this hypothesis. we need the extra sum of squares due to /3,.. or

SS,(/3,I/3,. flc) = SS,cf3" fl" f3;J - SS,(/3,. I30l


= SS/J31' f3,I/3;J - sS,(/3,lf3,).
In Example 15-6 we calc'Jlated

(2 degrees of freedom).

and if the model y =: flo + f3 ,XI + e is fit, we have

SS,(fl,lfJol =fj,s", = 96,2,

(1 deg;ee of freedom).

The:[efore. we have

sS'/'/3,I/3"

flu) = 136,88 -96.21


=40.67

(1 degree of freedom),

This is the increase in the regression sum of squares attributable to adtfng x,. to a model aJJ:eady containing XI' To testHo: fl.1 ;: 0, [onn the test statistic
SSR(fl,lfl,.flc)/1
MS E

~ 40,67 =18.10,
2,247

Note that theMS; from thefull model. usi::g both x: ar.d x,.. is used in the denominator of the test sta~
t:.suc. Since F). Cd,I,;7::::: 4.45, we reject Hi; 132 =: 0 and conclude :.bat density (x2) contributes significantly
to the model.
Since this partial F~tes[ involves a single variable, it is equivalent to the Mest To see this, recall
that the t-test on Ho: .B: ;;; resulted in the tes: statistic tc :;:;; 4.24. Furt:b.crmore, recall that :he square
of a trmdom variable with v degrees of freedom is an Frandom variable with one and v degrees of
freedom. and we note that
(4.24)' =: 17,98::.: FfJ ,

15-6 MEASURES OF MODEL ADEQUACY


A number of techniques can be used to measure the adequacy of a multiple regression
modeL TItis section will present several of these techniques . Model validation is an impor-

15~6

Measures of Model Adequacy

453

tant part of the multiple regression model building process. A good paper on this sUbject is
Snee (1977) (see also Montgomery, Peck, and VIning, 2001).

15-6.1 The Coefficient of Multiple Detennination


The coefficient of multiple determination R2 is defined as
(15-39)

R2 is a measure of the amount of reduction in the variability of y obtained by using the


regressor variables xi'.:s, ... , x k As in the simple linear regression case, we must have a :s;:
R2:s;: 1. However, as before, a large value of Rl does not necessarily imply that the regression model is a good one. Adding a variable to the model will always increase R2, regardless of whether the additional variable is statistically significant or not. Thus it is possible
for models that have large values of If to yield poor predictions of new observations or estimates of the mean response.
The positive square root of R2 is the multiple correlation coefficient between y and the
set of regressor variables xl' .:s, ... , xk That is, R is a measure of the linear association
between y andx I'.:s, ... , x k When k;:::: 1, this becomes the simple correlation between y and x.

The coefficient of multiple determination for the regression model estimated in Example

15~ 1 is

R' ~ SSR ~ 136.88 ~ 0.782.


S,., 175.09

That is, about 78.2% of the variability in damage y is explained when the two regressor variables, drop
height (XL) and fruit density (.:s) are used. The model relating density to Xl ouly was developed. The
value of R2 for this model turns out to be Rl == 0.549. Therefore, adding the variable X? to the model
has increased R2 from 0.549 to 0.782.
-

AdjustedR'
Some practitioners prefer to use the adjusted coefficient a/multiple determination, adjusted

R2, defined as
(15-40)
The value Sy-/(n - 1) will be constant regardless of the number of variables in the model.
SSE1(n -p) is the mean square for error, which will change with the addition or removal of
tenns (new regressor variables, interaction terms, higher~order terms) from the model.
Therefore, R ~dj will increase only if the addition of a new term significantly reduces the
mean square for error. In other words, the R!dj will penalize adding terms to the model that
are not significant in modeling the response. Interpretation of the adjusted coefficient of
multiple detemrination is identical to that of R2.

454

Chapter 15 Multiple Regression

We can. calculate K~j for the model fit in Example 15-1. Prom Example 15-6, we found that SSE;;:;:
38.21 and Srr::= 175.09. The estimate R~: is then
38.21/(20-3) _1_ 2.247
175.09/(20 I)
9.215
=0.756.

~)=I

The adjusted R' will playa significant role in variable selection and model building later in

this chapter.

15-6.2 Residual Analysis


The residuals from the estimated multiple regression model. defined by", = y,-Y,. play an
important role in judging mode! adequacy. JUSt as they do in simple linear regression .'\.5
Doted in Section 14-5.1, there are several residual plots that are often useful. These are illustrated in Example 15 -9. It is also helpful to plot the residuals against variables nIlt presently
in the model that are possible candidates for mc1usion. Patterns in these plots. similar to
those in Fig. 14-5, indicate that the model may be improved by adding the candidate
variable.

The residt:.aL.... for the n:odel estimated in Example 15-1 are shown.in Table 15-3. These residuals are
plotted 00:1 a normal probability plot in Fig. 15-2. No severe deviatiollS from r.ocrnaliry are obvious,
although the smallest residual (e) = -3.11) does not fall near the remaining residuals. The standardized residual. -3.l7/~j2,247 :::: -2,11, appears to be large and could indicate an unUSt:a1 observation.
The residuals are plotted against y in Fig. 15~3, and againstx j and.A1 in Figs. 15-4 and ;5-5, respective;y_ In Fig. 15-4, there is some i.'1dication that the assumption of constant variance may not be satisfied. Removal ofth.e unusual observation may improve the model fit, but there is no indication of
error in data colleetion. Therefore, the point 'kill be retained, We will see subsequently (Example 1516) iliat!:\Vo other regressor variables are required to adequately model these data,

. . .

~
0

0
m

ro 0
E

~ ~
-1

I ,

-2 ....

1
-;

-1
-2
0
2
-3
3
Figure 15-2 Normal probability plot of residuals for E""ample 15-10.

r
15-6

3~

"

2- " .

~ ~ -----~-"-~"-~------

o:~:~"

~!easures

"

... -"-------+---- .

'.

~~l~----------~----------~
2

Figure

15~3

7
Fitted value

12

Plot of residuals againsty for Exan:.ple

15~10.

3~----------------------------------,

r
i

-2

'.

~'

--~~--~~----~;~----~
300
400
500
600

x,

,Figure 154 Plot of residuals against x ( :or Example 15-10.

Figure J5.5 Plot of residuals against", for Example 15 I 0,

of Model Adequacy

455

456

Chapter 15

Multiple Regression

15-7 POLYNOMIAL REGRESSION


The linear model y = X~ + e is a general model that ean be used to fit any relationship that
is linear in the unknown parameters ~. This includes the important elass of polynomial
regression models. For example. the second-degree polynomial in one variable.
y=

/3, + /3.x ,. /3,,:? + S,

(1541)

and the second-degree polynomial in two variables,


y = /3, ,. /3,x.

+ /3,x, + /3. ,x; + f3...:l X ; + /3.::>"x, + s,

(1542)

are linear regression models,


Polynomial regression models are widely used in cases where the response is curvilinear, because the general principles of multiple regression can be applied. The following
example illustrates some of the types of analyses that can be performed.

E~pl~;~s~ii
Sidewall panels for the interior of an airplane are formed i::.l a 1500~ton press. The unit manufactur~
ing cost varies \\1:th the production lot size. The data shown below give :he average cost per unit (in
hundreds of dollars) for this product (y) and :he production lot size (x), The scatter diagram. shown
in F1g. 15-6, indicates that a second-order polynomiul may be appropriate,

LSI

1,70

[.65

[,55

lAS

lAO

1.30

1.26

1.24

1.21

1.20

1.18

20

25

30

35

40

50

60

65

70

75

80

90

We will fit fue model


y = f30 + /3.x + /3"r + "

"90~
1.80

1.70

" i.50

"

"- 1.50

'"
~

1.40

1,30

1.20

1.10

1,00
20

30

, ,
40

.. !~.~

50 60 70
Lot size, x

Figure 15-6 D.ta for Example 15-11.

8()

90

f
1

15-7

The y vector, X matrix, and

Polyr.omial Regression

457

vector are as follows:

j:.811

1 20

400

25

625

30
I 35

1225

; 1.70

1.65

1.55
L48
1.40
y= 1.30 '

x=

900

1 40 1600
1 50 2500
60 3600'

1.26

65

L24
1,21

70 4900

4225

1 75 5625

1.20 ;

80

cLl8j

6400

1 90 8100

Solving the normal equations X"'Xtl = X'y gives the fitted model

, ; 2.1983 - 0.0225, + 0.0001251:<".


The test fo~sign.ificance of:egrcssioIl is shoVt"O in Table :5-6. Since Fo =2171.07 is significant at 1%,
we conclude iliat at least O!:.e of the parameters fJ! and /311 is not zero. Furth.ermore, the sta.1.da!d tests
for model adequacy reveal no unusucl behavior,

In fitting polynomials, we generally like to use the lowest-degree model consistent with
the data. In this example, it would seem logical to investigate dropping the quadratic term
from the modeL That is, we would like to test

13,,: 0,
H,: 13" "' O.
H,:

The general regression significance test can be used to test this hypothesis. We need to
determine the "extra sum of squares" due to {111' or

SS,(f3ulf3;,f3J = SSR(fi"

f3 li lf3,) - SS,(f3,If3J

The sum of squares 5S"(f3,, f3::1f3;J = 0.5254, from Table 15-6. To find SSR(fi,lf30), we fit a
simple linear regression model to the original data, yielding

1.9004 0.0091".

It can be easUy verified that the regression sum of squares for this model is

58,(13,113,)

0.4942.

Table 15-6 Test for Significance of Regression for the Second~~er Model in Example 15-11
Source of
Variation

Sum of
Squares

Regression
Error
Total

0.5254

0_0011
0.5265

Degrees of
Freedom

2
9
11

Mean
Square
02627

0.000121

F,

217L07

458

Chapter 15

Multip!e Regression

Table 15-7 Analysis ofVariaz;ce of Example

Reg:-ession

SS,([3" {J"i[3,) ~ 0.5:254

Lin_
Quadratic
Error

SS,([3,I[3,) ~ 0.4942
SS,(,8"1[3,, .8,) = 0.0312

2
1
I

0.0011
0.5265

Thtal

Showing the Test for He: [31,

Degree of
Freedom

$\l."U. of
Squaces

Source of
Variation

15~ 11

9
11

Mean
Square
0.2627
0.4942
0.0312
0.000121

2171.07
4084.30
257.85

Therefore, the extra sum of squares due to Pll' given that /31 and Pc are in the model, is

SS/fi"lA" /3)

SSk(/3"
~

/3,,113,) - SSifiJ/3~

0.5254 - 0.4942

~0.0312,

The analysis of variance, with the test of H,: /311 ~ 0 incorporated into the procedure, is displayed in Table 15-7, Note that the quedratic term contributes significantly to the model.

15-8 INDICATOR VARIABLES


The regression models presented in previous sections have been based on quantitative vari~
abIes) that is, variables that are measured on a numerical scale. For example. variables such
as temperature, pressure, distance, and age are quantitative variables. Occasionally. we
need to incorporate qualitative variables in a regression model. For example, suppose that
one of the variables in a regression model is the operator who is associated with each obser'''arion y;. Assume that only two operators are involved. We may wish to assign different levels to the two operators to account for the possibility that each operator may have a different
effect on the response.
The usual method of accounting for the different levels of a qualitative variable is by
using indicator variables. For instance, to introduce the effect of two different operators into
a regression model, we could define an indicator variable as follows:

0 if the observation is from operator 1,

x = 1 if the observation is from operator 2.

In general, a qualitative variable with t levels is represented by t - 1 indicator variables.


which are ,",signed values of either 0 or 1. Thus, if there were three operators, the different
levels would be accounted for by two indicator variables defined as follows:

x,

X,

0
1

0
0

if the observation is fmm operator 1,


if the observation is from operator 2,
if the obsertation is from operator 3.

Indicator variables are also referred to as dummy variables. The following example illus~
trates some of the uses of indicator variables. For other applications, see Montgomery.
PeCk, and Vining (200 1).

15-8 Indicator Variables

459

E~pl~(?::i2',
(Adapted from Montgomery. Peck. and Vmiog, 2001), A mechanlcal engineer is investigating the sur~
face finish of metal parts ptoduced on a lathe and its relationship to the speed (in RPM) of the lathe,
The data are shown in Table 15-8. Note that the data have been collected usiGg two different typCs of
cutti.:lg tools. Since it is likely that the type of cutting tool affects tb.e surl'ace fiIrish, we will fit the
model

y ~ /3, + fil~' + Ph + E,
where y is the surface finL'>h, XI is the lathe speed in RPM, and X: is a.'1 indicator va.--!ahle denoting the
type of ct;.tting tool used; 'ilia: is,

ofor tool type 302,


x -{
, - 1 for tool type 416,
The parameters if. ttis model may be easily interpreted. If x,! ;;;; 0, then the model becomes

Y""

which is :a straight-line model with slope


becomes

Pc + ftjX t + e,

Pi and intercept f3y However, if X. ;;;; 1. then the model

y ~ 13, + fi,x, + 13,(l) + E; p, -

/3, + f3,x, + E,
which is a st:raight~Jine model v.ith slope PI and intercept f3ij + /32' Taus, the model y;; Po + f31X+ f3-zXz
+ e implies that surface finish is linearly related to lathe speed and that the slope f31 does Dot depend
on the type of cutting tool used, However, the type of cutting rool does affect rbe intereept,. and /32 indi~
cates the ehange in the intercept associated with a change in too) type from 302 to 416,

Table 15-8 Surface Finish Data for Example 15-12

Observation    Surface Finish,             Type of
Number, i      y_i               RPM       Cutting Tool
1              45.44             225       302
2              42.03             200       302
3              50.10             250       302
4              48.75             245       302
5              47.92             235       302
6              47.79             237       302
7              52.26             265       302
8              50.52             259       302
9              45.58             221       302
10             44.78             218       302
11             33.50             224       416
12             31.23             212       416
13             37.52             248       416
14             37.13             260       416
15             34.70             243       416
16             33.92             238       416
17             32.13             224       416
18             35.47             251       416
19             33.49             232       416
20             32.29             216       416



The X matrix and y vector for this problem are as follows:

1
1
1
1
1

1
X=
1
1
I

1
1
1
1

225
200
250
245
235
237
265
259
221
218
224
212
248
260
243
238
224
251
232
216

0
0
0
0
0
0
0
0
0
0
1

45.44
42.03
50.10
48.75
47.92

47.79
52.26
50.52
45.53

44.78
y=
33.50 .

31.23
37.52
37.13
3<'.70
33.92

1
1:

,32.13
: 35.47

1i

! 33.49:
L32.29 J

The fitted model is

$$\hat y = 14.2762 + 0.1411 x_1 - 13.2802 x_2.$$

The analysis of variance for this model is shown in Table 15-9. Note that the hypothesis H0: β1 = β2 = 0 (significance of regression) is rejected. This table also contains the sum of squares

$$SS_R = SS_R(\beta_1, \beta_2 \mid \beta_0) = SS_R(\beta_1 \mid \beta_0) + SS_R(\beta_2 \mid \beta_1, \beta_0),$$

so a test of the hypothesis H0: β2 = 0 can be made. This hypothesis is also rejected, so we conclude that tool type has an effect on surface finish.

It is also possible to use indicator variables to investigate whether tool type affects both slope and intercept. Let the model be

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \varepsilon,$$
Table 15-9 Analysis of Variance of Example 15-12

Source of               Sum of        Degrees of    Mean
Variation               Squares       Freedom       Square       F0
Regression              1012.0595      2            506.0297     1103.69*
  SSR(β1 | β0)          (130.6091)    (1)           130.6091      284.87*
  SSR(β2 | β1, β0)      (881.4504)    (1)           881.4504     1922.52*
Error                      7.7943     17              0.4585
Total                   1019.8538     19
*Significant at 1%.


where x2 is the indicator variable. Now if tool type 302 is used, x2 = 0, and the model is

$$y = \beta_0 + \beta_1 x_1 + \varepsilon.$$

If tool type 416 is used, x2 = 1, and the model becomes

$$y = \beta_0 + \beta_1 x_1 + \beta_2 + \beta_3 x_1 + \varepsilon = (\beta_0 + \beta_2) + (\beta_1 + \beta_3)x_1 + \varepsilon.$$

Note that β2 is the change in the intercept, and β3 is the change in slope produced by a change in tool type.
Another method of analyzing this data set is to fit separate regression models to the data for each tool type. However, the indicator variable approach has several advantages. First, only one regression model must be estimated. Second, by pooling the data on both tool types, more degrees of freedom for error are obtained. Third, tests of both hypotheses on the parameters β2 and β3 are just special cases of the general regression significance test.
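
The indicator-variable fit is easy to reproduce numerically. The following minimal sketch (Python with NumPy; the variable names are ours) estimates the model y = β0 + β1x1 + β2x2 of Example 15-12 by least squares from the Table 15-8 data, and shows how a cross-product column for the slope-change model would be appended:

```python
import numpy as np

# Surface finish data transcribed from Table 15-8 (tool type coded 0 = 302, 1 = 416).
rpm  = np.array([225, 200, 250, 245, 235, 237, 265, 259, 221, 218,
                 224, 212, 248, 260, 243, 238, 224, 251, 232, 216])
tool = np.array([0]*10 + [1]*10)
y    = np.array([45.44, 42.03, 50.10, 48.75, 47.92, 47.79, 52.26, 50.52, 45.58, 44.78,
                 33.50, 31.23, 37.52, 37.13, 34.70, 33.92, 32.13, 35.47, 33.49, 32.29])

# Model y = b0 + b1*rpm + b2*tool: least-squares fit via the design matrix X.
X = np.column_stack([np.ones(len(y)), rpm, tool])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta)   # should agree with (14.2762, 0.1411, -13.2802) up to rounding

# Appending the rpm*tool cross product allows the slope to differ by tool type,
# as in the model y = b0 + b1*x1 + b2*x2 + b3*x1*x2 discussed above.
X2 = np.column_stack([X, rpm * tool])
beta2, *_ = np.linalg.lstsq(X2, y, rcond=None)
```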

15-9 THE CORRELATION MATRIX

Suppose we wish to estimate the parameters in the model

$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \varepsilon_i, \qquad i = 1, 2, \ldots, n.$$  (15-43)

We may rewrite this model with a transformed intercept β0' as

$$y_i = \beta_0' + \beta_1(x_{i1} - \bar x_1) + \beta_2(x_{i2} - \bar x_2) + \varepsilon_i,$$  (15-44)

or, since β̂0' = ȳ,

$$y_i - \bar y = \beta_1(x_{i1} - \bar x_1) + \beta_2(x_{i2} - \bar x_2) + \varepsilon_i.$$  (15-45)

The X'X matrix for this model is

$$\mathbf{X}'\mathbf{X} = \begin{bmatrix} S_{11} & S_{12} \\ S_{12} & S_{22} \end{bmatrix},$$  (15-46)

where

$$S_{kj} = \sum_{i=1}^{n}(x_{ik} - \bar x_k)(x_{ij} - \bar x_j), \qquad k, j = 1, 2.$$  (15-47)

It is possible to express this X'X matrix in correlation form. Let

$$r_{kj} = \frac{S_{kj}}{(S_{kk}S_{jj})^{1/2}}, \qquad k, j = 1, 2,$$  (15-48)

and note that r11 = r22 = 1. Then the correlation form of the X'X matrix, equation 15-46, is

$$\mathbf{R} = \begin{bmatrix} 1 & r_{12} \\ r_{12} & 1 \end{bmatrix}.$$  (15-49)

The quantity r12 is the sample correlation between x1 and x2. We may also define the sample correlation between x_j and y as

$$r_{jy} = \frac{S_{jy}}{(S_{jj}S_{yy})^{1/2}}, \qquad j = 1, 2,$$  (15-50)


where

$$S_{jy} = \sum_{i=1}^{n}(x_{ij} - \bar x_j)(y_i - \bar y), \qquad j = 1, 2,$$  (15-51)

is the corrected sum of cross products between x_j and y, and S_yy is the usual total corrected sum of squares of y.

These transformations result in a new regression model,

$$y_i^{*} = b_1 z_{i1} + b_2 z_{i2} + \varepsilon_i',$$  (15-52)

where

$$y_i^{*} = \frac{y_i - \bar y}{S_{yy}^{1/2}}, \qquad z_{ij} = \frac{x_{ij} - \bar x_j}{S_{jj}^{1/2}}, \qquad j = 1, 2.$$

The relationship between the parameters b1 and b2 in the new model, equation 15-52, and the parameters β0, β1, and β2 in the original model, equation 15-43, is as follows:

$$\beta_1 = b_1\left(\frac{S_{yy}}{S_{11}}\right)^{1/2},$$  (15-53)

$$\beta_2 = b_2\left(\frac{S_{yy}}{S_{22}}\right)^{1/2},$$  (15-54)

$$\beta_0 = \bar y - \beta_1\bar x_1 - \beta_2\bar x_2.$$  (15-55)

The least-squares normal equations for the transformed model, equation 15-52, are

$$\begin{bmatrix} 1 & r_{12} \\ r_{12} & 1 \end{bmatrix}\begin{bmatrix} \hat b_1 \\ \hat b_2 \end{bmatrix} = \begin{bmatrix} r_{1y} \\ r_{2y} \end{bmatrix}.$$  (15-56)

The solution to equation 15-56 is

$$\hat b_1 = \frac{r_{1y} - r_{12}r_{2y}}{1 - r_{12}^2},$$  (15-57a)

$$\hat b_2 = \frac{r_{2y} - r_{12}r_{1y}}{1 - r_{12}^2}.$$  (15-57b)

The regression coefficients, equations 15-57, are usually called standardized regression coefficients. Many multiple regression computer programs use this transformation to reduce round-off errors in the (X'X)^{-1} matrix. These round-off errors may be very serious if the original variables differ considerably in magnitude. Some of these computer programs also display both the original regression coefficients and the standardized coefficients. The standardized regression coefficients are dimensionless, and this may make it easier to compare regression coefficients in situations where the original variables x_j differ considerably in their units of measurement. In interpreting these standardized regression coefficients, however, we must remember that they are still partial regression coefficients (i.e., b_j shows the effect of z_j given that the other z_i, i ≠ j, are in the model). Furthermore, the b_j are affected by the spacing of the levels of the x_j. Consequently, we should not use the magnitude of the b_j as a measure of the importance of the regressor variables.
While we have explicitly treated only the case of two regressor variables, the results generalize. If there are k regressor variables x1, x2, ..., xk, one may write the X'X matrix in correlation form,

$$\mathbf{R} = \begin{bmatrix}
1 & r_{12} & r_{13} & \cdots & r_{1k} \\
r_{12} & 1 & r_{23} & \cdots & r_{2k} \\
r_{13} & r_{23} & 1 & \cdots & r_{3k} \\
\vdots & \vdots & \vdots & & \vdots \\
r_{1k} & r_{2k} & r_{3k} & \cdots & 1
\end{bmatrix},$$  (15-58)

where r_ij = S_ij/(S_ii S_jj)^{1/2} is the sample correlation between x_i and x_j, and S_ij = Σ_{u=1}^{n}(x_ui − x̄_i)(x_uj − x̄_j).

The correlations between x_j and y are

$$\mathbf{g} = \begin{bmatrix} r_{1y} \\ r_{2y} \\ \vdots \\ r_{ky} \end{bmatrix},$$  (15-59)

where r_jy = S_jy/(S_jj S_yy)^{1/2} and S_jy = Σ_{u=1}^{n}(x_uj − x̄_j)(y_u − ȳ). The vector of standardized regression coefficients b̂' = [b̂1, b̂2, ..., b̂k] is

$$\hat{\mathbf{b}} = \mathbf{R}^{-1}\mathbf{g}.$$  (15-60)

The relationship between the standardized regression coefficients and the original regression coefficients is

$$\hat\beta_j = \hat b_j\left(\frac{S_{yy}}{S_{jj}}\right)^{1/2}, \qquad j = 1, 2, \ldots, k.$$  (15-61)

Example 15-13
For the data in Example 15-1, we find

S_yy = 175.089,    S_11 = 184,710.16,    S_1y = 4215.372,
S_22 = 0.047755,   S_2y = 2.33,          S_12 = 51.2873.

Therefore,

$$r_{12} = \frac{S_{12}}{(S_{11}S_{22})^{1/2}} = \frac{51.2873}{\sqrt{(184{,}710.16)(0.047755)}} = 0.5460,$$

$$r_{1y} = \frac{S_{1y}}{(S_{11}S_{yy})^{1/2}} = \frac{4215.372}{\sqrt{(184{,}710.16)(175.089)}} = 0.7412,$$

$$r_{2y} = \frac{S_{2y}}{(S_{22}S_{yy})^{1/2}} = \frac{2.33}{\sqrt{(0.047755)(175.089)}} = 0.8060,$$

and the correlation matrix for this problem is

$$\mathbf{R} = \begin{bmatrix} 1 & 0.5460 \\ 0.5460 & 1 \end{bmatrix}.$$

From equation 15-56, the normal equations in terms of the standardized regression coefficients are

$$\begin{bmatrix} 1 & 0.5460 \\ 0.5460 & 1 \end{bmatrix}\begin{bmatrix} \hat b_1 \\ \hat b_2 \end{bmatrix} = \begin{bmatrix} 0.7412 \\ 0.8060 \end{bmatrix}.$$

Consequently, the standardized regression coefficients are

$$\begin{bmatrix} \hat b_1 \\ \hat b_2 \end{bmatrix}
= \begin{bmatrix} 1 & 0.5460 \\ 0.5460 & 1 \end{bmatrix}^{-1}\begin{bmatrix} 0.7412 \\ 0.8060 \end{bmatrix}
= \begin{bmatrix} 1.424737 & -0.77791 \\ -0.77791 & 1.424737 \end{bmatrix}\begin{bmatrix} 0.7412 \\ 0.8060 \end{bmatrix}
= \begin{bmatrix} 0.429022 \\ 0.571754 \end{bmatrix}.$$

These standardized regression coefficients could also have been computed directly from either equation 15-57 or equation 15-61. Note that although b̂2 > b̂1, we should be cautious about concluding that the fruit density (x2) is more important than drop height (x1), since b̂1 and b̂2 are still partial regression coefficients.
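
The computation in Example 15-13 amounts to solving the correlation-form normal equations and back-transforming. A minimal sketch (Python with NumPy, using the S quantities quoted above) is:

```python
import numpy as np

# Correlation-form normal equations (equation 15-56) for Example 15-13.
R = np.array([[1.0, 0.5460],
              [0.5460, 1.0]])
g = np.array([0.7412, 0.8060])          # correlations r_1y and r_2y
b = np.linalg.solve(R, g)               # standardized coefficients, approx. (0.4290, 0.5718)

# Back-transform to the original units via equation 15-61: beta_j = b_j * sqrt(Syy / Sjj).
Syy, S11, S22 = 175.089, 184710.16, 0.047755
beta1 = b[0] * np.sqrt(Syy / S11)
beta2 = b[1] * np.sqrt(Syy / S22)
```

Because the r's above are rounded to four places, the back-transformed coefficients will agree with the directly fitted ones only up to rounding.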

15-10 PROBLEMS IN MULTIPLE REGRESSION


There are a number of problems often encountered in the use of multiple regression. In this section, we briefly discuss three of these problem areas: the effect of multicollinearity on the regression model, the effect of outlying points in the x-space on the regression coefficients, and autocorrelation in the errors.

15-10.1 Multicollinearity
In most multiple regression problems, the independent or regressor variables x_j are intercorrelated. In situations in which this intercorrelation is very large, we say that multicollinearity exists. Multicollinearity can have serious effects on the estimates of the regression coefficients and on the general applicability of the estimated model.
The effects of multicollinearity may be easily demonstrated. Consider a regression model with two regressor variables x1 and x2, and suppose that x1 and x2 have been "standardized," as in Section 15-9, so that the X'X matrix is in correlation form, as in equation 15-49. The model is

$$y = \beta_1 x_1 + \beta_2 x_2 + \varepsilon.$$


The (X'X)^{-1} matrix for this model is

$$\mathbf{C} = (\mathbf{X}'\mathbf{X})^{-1} = \begin{bmatrix} \dfrac{1}{1-r_{12}^2} & \dfrac{-r_{12}}{1-r_{12}^2} \\[4pt] \dfrac{-r_{12}}{1-r_{12}^2} & \dfrac{1}{1-r_{12}^2} \end{bmatrix},$$

and the estimators of the parameters are

$$\hat\beta_1 = \frac{\mathbf{x}_1'\mathbf{y} - r_{12}\,\mathbf{x}_2'\mathbf{y}}{1 - r_{12}^2}, \qquad \hat\beta_2 = \frac{\mathbf{x}_2'\mathbf{y} - r_{12}\,\mathbf{x}_1'\mathbf{y}}{1 - r_{12}^2},$$

where r12 is the sample correlation between x1 and x2, and x1'y and x2'y are the elements of the X'y vector.
Now, if multicollinearity is present, x1 and x2 are highly correlated, and |r12| → 1. In such a situation, the variances and covariances of the regression coefficients become very large, since V(β̂_j) = C_jj σ² → ∞ as |r12| → 1, and Cov(β̂1, β̂2) = C_12 σ² → ±∞ depending on whether r12 → ±1. The large variances for β̂_j imply that the regression coefficients are very poorly estimated. Note that the effect of multicollinearity is to introduce a "near" linear dependency in the columns of the X matrix. As |r12| → 1, this linear dependency becomes exact. Furthermore, if we assume that x1'y ≈ x2'y as |r12| → 1, then the estimates of the regression coefficients become equal in magnitude but opposite in sign; that is, β̂1 = −β̂2, regardless of the true values of β1 and β2.
Similar problems occur when multicollinearity is present and there are more than two regressor variables. In general, the diagonal elements of the matrix C = (X'X)^{-1} can be written

$$C_{jj} = \frac{1}{1 - R_j^2}, \qquad j = 1, 2, \ldots, k,$$  (15-62)

where R_j² is the coefficient of multiple determination resulting from regressing x_j on the other k − 1 regressor variables. Clearly, the stronger the linear dependency of x_j on the remaining regressor variables (and hence the stronger the multicollinearity), the larger the value of R_j² will be. We say that the variance of β̂_j is "inflated" by the quantity (1 − R_j²)^{-1}. Consequently, we usually call

$$\mathrm{VIF}_j = \frac{1}{1 - R_j^2}, \qquad j = 1, 2, \ldots, k,$$  (15-63)

the variance inflation factor for β̂_j. Note that these factors are the main diagonal elements of the inverse of the correlation matrix. They are an important measure of the extent to which multicollinearity is present.
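
As an illustration of this diagnostic, the following sketch (Python with NumPy) computes the variance inflation factors as the diagonal of the inverse correlation matrix, together with the eigenvalue ratio discussed later in this section; the matrix used here is the one that appears in Example 15-14 below:

```python
import numpy as np

# X'X in correlation form (the 4 x 4 matrix of Example 15-14).
R = np.array([[1.00000, 0.84894, 0.91412, 0.93367],
              [0.84894, 1.00000, 0.76899, 0.97567],
              [0.91412, 0.76899, 1.00000, 0.86784],
              [0.93367, 0.97567, 0.86784, 1.00000]])

vif = np.diag(np.linalg.inv(R))     # variance inflation factors (equation 15-63)
lam = np.linalg.eigvalsh(R)         # eigenvalues of the correlation matrix
ratio = lam.max() / lam.min()       # a large ratio signals multicollinearity
print(vif, ratio)
```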
Although the estimates of the regression coefficients are very imprecise when multicollinearity is present, the estimated equation may still be useful. For example, suppose we wish to predict new observations. If these predictions are required in the region of the x-space where the multicollinearity is in effect, then often satisfactory results will be obtained, because while the individual β_j may be poorly estimated, the function Σ_{j=1}^{k} β_j x_j may be estimated quite well. On the other hand, if the prediction of new observations requires extrapolation, then generally we would expect to obtain poor results. Extrapolation usually requires good estimates of the individual model parameters.



Multicollinearity arises for several reasons. It will occur when the analyst collects the data such that a constraint of the form Σ_{j=1}^{k} a_j x_j = 0 holds among the columns of the X matrix (the a_j are constants, not all zero). For example, if four regressor variables are the components of a mixture, then such a constraint will always exist because the sum of the components is always constant. Usually, these constraints do not hold exactly, and the analyst does not know that they exist.
There are several ways to detect the presence of multicollinearity. Some of the more important of these are briefly discussed.
1. The variance inflation factors, defined in equation 15-63, are very useful measures of multicollinearity. The larger the variance inflation factor, the more severe the multicollinearity. Some authors have suggested that if any variance inflation factor exceeds 10, then multicollinearity is a problem. Other authors consider this value too liberal and suggest that the variance inflation factors should not exceed 4 or 5.
2. The determinant of the correlation matrix may also be used as a measure of multicollinearity. The value of this determinant can range between 0 and 1. When the value of the determinant is 1, the columns of the X matrix are orthogonal (i.e., there is no intercorrelation between the regression variables), and when the value is 0, there is an exact linear dependency among the columns of X. The smaller the value of the determinant, the greater the degree of multicollinearity.
3. The eigenvalues or characteristic roots of the correlation matrix provide a measure of multicollinearity. If X'X is in correlation form, then the eigenvalues of X'X are the roots of the equation

$$|\mathbf{X}'\mathbf{X} - \lambda\mathbf{I}| = 0.$$

One or more eigenvalues near zero implies that multicollinearity is present. If λ_max and λ_min denote the largest and smallest eigenvalues of X'X, then the ratio λ_max/λ_min can also be used as a measure of multicollinearity. The larger the value of this ratio, the greater the degree of multicollinearity. Generally, if the ratio λ_max/λ_min is less than 10, there is little problem with multicollinearity.
4. Sometimes inspection of the individual elements of the correlation matrix can be helpful in detecting multicollinearity. If an element |r_ij| is close to 1, then x_i and x_j may be strongly multicollinear. However, when more than two regressor variables are involved in a multicollinear fashion, the individual r_ij are not necessarily large. Thus, this method will not always enable us to detect the presence of multicollinearity.
5. If the F-test for significance of regression is significant but tests on the individual regression coefficients are not significant, then multicollinearity may be present.
Several remedial measures have been proposed for resolving the problem of multicollinearity. Augmenting the data with new observations specifically designed to break up the approximate linear dependencies that currently exist is often suggested. However, sometimes this is impossible for economic reasons, or because of the physical constraints that relate the x_j. Another possibility is to delete certain variables from the model. This suffers from the disadvantage that one must discard the information contained in the deleted variables.
Since multicollinearity primarily affects the stability of the regression coefficients, it would seem that estimating these parameters by some method that is less sensitive to multicollinearity than ordinary least squares would be helpful. Several methods have been suggested for this. Hoerl and Kennard (1970a, b) have proposed ridge regression as an


alternative to ordinary least squares. In ridge regression, the parameter estimates are obtained by solving

$$\hat{\boldsymbol\beta}^{*}(\ell) = (\mathbf{X}'\mathbf{X} + \ell\,\mathbf{I})^{-1}\mathbf{X}'\mathbf{y},$$  (15-64)

where ℓ > 0 is a constant. Generally, values of ℓ in the interval 0 ≤ ℓ ≤ 1 are appropriate. The ridge estimator β̂*(ℓ) is not an unbiased estimator of β, as is the ordinary least-squares estimator β̂, but the mean square error of β̂*(ℓ) will be smaller than the mean square error of β̂. Thus ridge regression seeks to find a set of regression coefficients that is more "stable," in the sense of having a small mean square error. Since multicollinearity usually results in ordinary least-squares estimators that may have extremely large variances, ridge regression is suitable for situations where the multicollinearity problem exists.
To obtain the ridge regression estimator from equation 15-64, one must specify a value for the constant ℓ. Of course, there is an "optimum" ℓ for any problem, but the simplest approach is to solve equation 15-64 for several values of ℓ in the interval 0 ≤ ℓ ≤ 1. Then a plot of the values of β̂*(ℓ) against ℓ is constructed. This display is called the ridge trace. The appropriate value of ℓ is chosen subjectively by inspection of the ridge trace. Typically, a value for ℓ is chosen such that relatively stable parameter estimates are obtained. In general, the variance of β̂*(ℓ) is a decreasing function of ℓ, while the squared bias [β − β̂*(ℓ)]² is an increasing function of ℓ. Choosing the value of ℓ involves trading off these two properties of β̂*(ℓ).
A good discussion of the practical use of ridge regression is in Marquardt and Snee (1975). Also, there are several other biased estimation techniques that have been proposed for dealing with multicollinearity. Several of these are discussed in Montgomery, Peck, and Vining (2001).
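
A ridge trace like the one used in Example 15-14 below can be generated with a few lines of code. The sketch that follows (Python with NumPy; ridge_trace is our name) solves equation 15-64 over a grid of ℓ values, assuming X and y have already been standardized as in Section 15-9:

```python
import numpy as np

def ridge_trace(X, y, ells):
    """Ridge estimates (equation 15-64) for each shrinkage constant in ells.

    X is assumed to be in correlation (unit-length) form and y centered and
    scaled, so the coefficients are comparable across ells.
    """
    p = X.shape[1]
    XtX, Xty = X.T @ X, X.T @ y
    return np.array([np.linalg.solve(XtX + ell * np.eye(p), Xty) for ell in ells])

# Plotting each column of the result against ells gives the ridge trace;
# one then picks the smallest ell beyond which the estimates stabilize.
```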

Example 15-14
(Based on an example in Hald, 1952.) The heat generated in calories per gram for a particular type of cement as a function of the quantities of four additives (z1, z2, z3, and z4) is shown in Table 15-10. We wish to fit a multiple linear regression model to these data.

Table 15-10 Data for Example 15-14

Observation
Number          y        z1     z2     z3     z4
1              28.25     10     31      5     45
2              24.80     12     35      5     52
3              11.86      5     15      3     24
4              36.60     17     42      9     65
5              15.80      8      6      5     19
6              16.23      6     17      3     25
7              29.50     12     36      6     55
8              28.75     10     34      5     50
9              43.20     18     40     10     70
10             38.47     23     50     10     80
11             10.14     16     37      5     61
12             38.92     20     40     11     70
13             36.70     15     45      8     68
14             15.31      7     22      2     30
15              8.40      9     12      3     24


The data will be coded by defining a new set of regressor variables as

$$x_{ij} = \frac{z_{ij} - \bar z_j}{S_{jj}^{1/2}}, \qquad j = 1, 2, 3, 4, \quad i = 1, 2, \ldots, 15,$$

where $S_{jj} = \sum_{i=1}^{15}(z_{ij} - \bar z_j)^2$ is the corrected sum of squares of the levels of z_j. The coded data are shown in Table 15-11. This transformation makes the intercept orthogonal to the other regression coefficients, since the first column of the X matrix consists of ones. Therefore, the intercept in this model will always be estimated by ȳ. The (4 × 4) X'X matrix for the four coded variables is the correlation matrix

$$\mathbf{X}'\mathbf{X} = \begin{bmatrix}
1.00000 & 0.84894 & 0.91412 & 0.93367 \\
0.84894 & 1.00000 & 0.76899 & 0.97567 \\
0.91412 & 0.76899 & 1.00000 & 0.86784 \\
0.93367 & 0.97567 & 0.86784 & 1.00000
\end{bmatrix}.$$
This matrix contains several large correlation coefficients, and this may indicate significant multicollinearity. The inverse of X'X is

$$(\mathbf{X}'\mathbf{X})^{-1} = \begin{bmatrix}
20.769 & 25.813 & -0.608 & -44.042 \\
25.813 & 74.486 & 12.597 & -107.710 \\
-0.608 & 12.597 & 8.274 & -18.903 \\
-44.042 & -107.710 & -18.903 & 163.620
\end{bmatrix}.$$

The variance inflation factors are the main diagonal elements of this matrix. Note that three of the variance inflation factors exceed 10, a good indication that multicollinearity is present. The eigenvalues of X'X are λ1 = 3.657, λ2 = 0.2679, λ3 = 0.07127, and λ4 = 0.004014. Two of the eigenvalues, λ3 and λ4, are relatively close to zero. Also, the ratio of the largest to the smallest eigenvalue is

$$\frac{\lambda_{\max}}{\lambda_{\min}} = \frac{3.657}{0.004014} = 911.06,$$

which is considerably larger than 10. Therefore, since examination of the variance inflation factors and the eigenvalues indicates potential problems with multicollinearity, we will use ridge regression to estimate the model parameters.

Table 15-11 Coded Data for Example 15-14

Observation
Number         y          x1          x2          x3          x4
1             28.25    -0.12515     0.00405    -0.09206    -0.05538
2             24.80    -0.02635     0.08495    -0.09206     0.03692
3             11.86    -0.37217    -0.31957    -0.27617    -0.33226
4             36.60     0.22066     0.22653     0.27617     0.20832
5             15.80    -0.22396    -0.50161    -0.09206    -0.39819
6             16.23    -0.32276    -0.27912    -0.27617    -0.31907
7             29.50    -0.02635     0.10518     0.00000     0.07641
8             28.75    -0.12515     0.06472    -0.09206     0.01055
9             43.20     0.27007     0.18608     0.36823     0.27425
10            38.47     0.51709     0.38834     0.36823     0.40609
11            10.14     0.17126     0.12540    -0.09206     0.15558
12            38.92     0.36887     0.18608     0.46029     0.27425
13            36.70     0.12186     0.28721     0.18411     0.24788
14            15.31    -0.27336    -0.17799    -0.36823    -0.25315
15             8.40    -0.17456    -0.38025    -0.27617    -0.33226


We solved equation 15-64 for various values of ℓ, and the results are summarized in Table 15-12. The ridge trace is shown in Fig. 15-7. The instability of the least-squares estimates β̂*(ℓ = 0) is evident from inspection of the ridge trace. It is often difficult to choose a value of ℓ from the ridge trace that simultaneously stabilizes the estimates of all regression coefficients. We will choose ℓ = 0.064, which implies that the regression model is

$$\hat y = 25.53 - 18.0566 x_1 + 17.2202 x_2 + 36.0743 x_3 + 4.7242 x_4,$$

using β̂0* = ȳ = 25.53. Converting the model to the original variables z_j, we have

$$\hat y = 2.9913 - 0.8920 z_1 + 0.3483 z_2 + 3.3209 z_3 + 0.0623 z_4.$$

Table 15-12 Ridge Regression Estimates for Example 15-14

   ℓ        β1*(ℓ)       β2*(ℓ)      β3*(ℓ)      β4*(ℓ)
0.000     -28.3318      65.9996     64.0479     -57.2491
0.001     -31.0360      57.0244     61.9645     -44.0901
0.002     -32.6441      50.9649     60.3899     -35.3088
0.004     -34.1071      43.2358     58.0266     -24.3241
0.008     -34.3195      35.1426     54.7018     -13.3348
0.016     -31.9710      27.9534     50.0949      -4.5489
0.032     -26.3451      22.0347     43.8309       1.2950
0.064     -18.0566      17.2202     36.0743       4.7242
0.128      -9.1786      13.4944     27.9363       6.5914
0.256      -1.9896      10.9160     20.8028       7.5076
0.512       2.4922       9.2014     15.3197       7.7224

Figure 15-7 Ridge trace for Example 15-14.


15-10.2 Influential Observations in Regression

When using multiple regression we occasionally find that some small subset of the observations is unusually influential. Sometimes these influential observations are relatively far away from the vicinity where the rest of the data were collected. A hypothetical situation for two variables is depicted in Fig. 15-8, where one observation in x-space is remote from the rest of the data. The disposition of points in the x-space is important in determining the properties of the model. For example, the point (x_i1, x_i2) in Fig. 15-8 may be very influential in determining the estimates of the regression coefficients, the value of R², and the value of MS_E.
We would like to examine the data points used to build a regression model to determine if they control many model properties. If these influential points are "bad" points, or are erroneous in any way, then they should be eliminated. On the other hand, there may be nothing wrong with these points, but at least we would like to determine whether or not they produce results consistent with the rest of the data. In any event, even if an influential point is a valid one, if it controls important model properties, we would like to know this, since it could have an impact on the use of the model.
Montgomery, Peck, and Vining (2001) describe several methods for detecting influential observations. An excellent diagnostic is the Cook (1977, 1979) distance measure. This is a measure of the squared distance between the least squares estimate of β, based on all n observations, and the estimate β̂_(i), based on removal of the ith point. The Cook distance measure is

$$D_i = \frac{(\hat{\boldsymbol\beta}_{(i)} - \hat{\boldsymbol\beta})'\,\mathbf{X}'\mathbf{X}\,(\hat{\boldsymbol\beta}_{(i)} - \hat{\boldsymbol\beta})}{p\,MS_E}, \qquad i = 1, 2, \ldots, n.$$

Clearly, if the ith point is influential, its removal will result in β̂_(i) changing considerably from the value β̂. Thus a large value of D_i implies that the ith point is influential. The statistic D_i is actually computed using

$$D_i = \frac{r_i^2}{p}\,\frac{h_{ii}}{1 - h_{ii}}, \qquad i = 1, 2, \ldots, n,$$  (15-65)

where $r_i = e_i/\sqrt{MS_E(1 - h_{ii})}$ and h_ii is the ith diagonal element of the matrix

$$\mathbf{H} = \mathbf{X}(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'.$$
Figure 15-8 A point that is remote in x-space.


The H matrix is sometimes called the "hat" matrix, since

$$\hat{\mathbf{y}} = \mathbf{X}\hat{\boldsymbol\beta} = \mathbf{X}(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y} = \mathbf{H}\mathbf{y}.$$

Thus H is a projection matrix that transforms the observed values of y into a set of fitted values ŷ.
From equation 15-65 we note that D_i is made up of a component that reflects how well the model fits the ith observation y_i [the quantity $e_i/\sqrt{MS_E(1-h_{ii})}$ is called a Studentized residual, and it is a method of scaling residuals so that they have unit variance] and a component that measures how far that point is from the rest of the data [h_ii/(1 − h_ii) is the distance of the ith point from the centroid of the remaining n − 1 points]. A value of D_i > 1 would indicate that the point is influential. Either component of D_i (or both) may contribute to a large value.
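
A minimal sketch of this computation (Python with NumPy; cooks_distances is our name) builds the hat matrix, forms the Studentized residuals, and applies equation 15-65:

```python
import numpy as np

def cooks_distances(X, y):
    """Cook's distance D_i for each observation via equation 15-65.

    X is the n x p design matrix (including the column of ones).
    """
    n, p = X.shape
    H = X @ np.linalg.solve(X.T @ X, X.T)     # hat matrix H = X (X'X)^{-1} X'
    h = np.diag(H)
    e = y - H @ y                             # ordinary residuals
    mse = e @ e / (n - p)
    r = e / np.sqrt(mse * (1.0 - h))          # Studentized residuals
    return (r**2 / p) * h / (1.0 - h)         # flag any point with D_i > 1
```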

Example 15-15
Table 15-13 lists the values of h_ii and D_i for the damaged peaches data in Example 15-1. To illustrate the calculations, consider the first observation:

Table 15-13 Influence Diagnostics for the Damaged Peaches Data in Example 15-15

Observation               Cook's Distance
(i)            h_ii       Measure D_i
1              0.249      0.277
2              0.126      0.000
3              0.088      0.156
4              0.104      0.061
5              0.060      0.000
6              0.299      0.000
7              0.081      0.008
8              0.055      0.002
9              0.193      0.116
10             0.117      0.024
11             0.250      0.037
12             0.109      0.002
13             0.213      0.045
14             0.055      0.056
15             0.147      0.067
16             0.080      0.001
17             0.209      0.001
18             0.276      0.160
19             0.210      0.171
20             0.078      0.012


$$D_1 = \frac{r_1^2}{p}\cdot\frac{h_{11}}{1 - h_{11}} = \frac{\bigl[2.06\big/\sqrt{2.247(1-0.249)}\bigr]^2}{3}\cdot\frac{0.249}{1-0.249} = 0.277.$$

The values in Table 15-13 were calculated using Minitab. The Cook distance measure D_i does not identify any potentially influential observations in the data, as no value of D_i exceeds unity.

15-10.3 Autocorrelation

The regression models developed thus far have assumed that the model error components ε_i are uncorrelated random variables. Many applications of regression analysis involve data for which this assumption may be inappropriate. In regression problems where the dependent and independent variables are time oriented or are time-series data, the assumption of uncorrelated errors is often untenable. For example, suppose we regressed the quarterly sales of a product against the quarterly point-of-sale advertising expenditures. Both variables are time series, and if they are positively correlated with other factors such as disposable income and population, which are not included in the model, then it is likely that the error terms in the regression model are positively correlated over time. Variables that exhibit correlation over time are referred to as autocorrelated variables. Many regression problems in economics, business, and agriculture involve autocorrelated errors.
The occurrence of positively autocorrelated errors has several potentially serious consequences. The ordinary least-squares estimators of the parameters are affected in that they are no longer minimum variance estimators, although they are still unbiased. Furthermore, the mean square error MS_E may underestimate the error variance σ². Also, confidence intervals and tests of hypotheses, which are developed assuming uncorrelated errors, are not valid if autocorrelation is present.
There are several statistical procedures that can be used to determine whether the error terms in the model are uncorrelated. We will describe one of these, the Durbin-Watson test. This test assumes that the data are generated by the first-order autoregressive model

$$y_t = \beta_0 + \beta_1 x_t + \varepsilon_t, \qquad t = 1, 2, \ldots, n,$$  (15-66)

where t is the index of time and the error terms are generated according to the process

$$\varepsilon_t = \rho\,\varepsilon_{t-1} + a_t,$$  (15-67)

where |ρ| < 1 is an unknown parameter and a_t is a NID(0, σ_a²) random variable. Equation 15-66 is a simple linear regression model, except for the errors, which are generated from equation 15-67. The parameter ρ in equation 15-67 is the autocorrelation coefficient. The
Durbin-Watson test can be applied to the hypotheses

$$H_0\colon \rho = 0, \qquad H_1\colon \rho > 0.$$  (15-68)

Note that if H0: ρ = 0 is not rejected, we are implying that there is no autocorrelation in the errors, and the ordinary linear regression model is appropriate.
To test H0: ρ = 0, first fit the regression model by ordinary least squares. Then calculate the Durbin-Watson test statistic

$$D = \frac{\sum_{t=2}^{n}(e_t - e_{t-1})^2}{\sum_{t=1}^{n} e_t^2},$$  (15-69)

where e_t is the tth residual. For a suitable value of α, obtain the critical values D_{α,U} and D_{α,L} from Table 15-14. If D > D_{α,U}, do not reject H0: ρ = 0; but if D < D_{α,L}, reject H0: ρ = 0 and conclude that the errors are positively autocorrelated. If D_{α,L} ≤ D ≤ D_{α,U}, the test is

Table 15-14 Critical Values of the Durbin-Watson Statistic

                              k = Number of Regressors (Excluding the Intercept)
         Probability in
Sample   Lower Tail          k = 1         k = 2         k = 3         k = 4         k = 5
Size     (Significance
         Level = α)         D_L   D_U     D_L   D_U     D_L   D_U     D_L   D_U     D_L   D_U

15       0.01               0.81  1.07    0.70  1.25    0.59  1.46    0.49  1.70    0.39  1.96
         0.025              0.95  1.23    0.83  1.40    0.71  1.61    0.59  1.81    0.48  2.09
         0.05               1.08  1.36    0.95  1.54    0.82  1.75    0.69  1.97    0.56  2.21
20       0.01               0.95  1.15    0.86  1.27    0.77  1.41    0.63  1.57    0.60  1.74
         0.025              1.08  1.28    0.99  1.41    0.89  1.55    0.79  1.70    0.70  1.87
         0.05               1.20  1.41    1.10  1.54    1.00  1.68    0.90  1.83    0.79  1.99
25       0.01               1.05  1.21    0.98  1.30    0.90  1.41    0.83  1.52    0.75  1.65
         0.025              1.13  1.34    1.10  1.43    1.02  1.54    0.94  1.65    0.86  1.77
         0.05               1.20  1.45    1.21  1.55    1.12  1.66    1.04  1.77    0.95  1.89
30       0.01               1.13  1.26    1.07  1.34    1.01  1.42    0.94  1.51    0.88  1.61
         0.025              1.25  1.38    1.18  1.46    1.12  1.54    1.05  1.63    0.98  1.73
         0.05               1.35  1.49    1.28  1.57    1.21  1.65    1.14  1.74    1.07  1.83
40       0.01               1.25  1.34    1.20  1.40    1.15  1.46    1.10  1.52    1.05  1.58
         0.025              1.35  1.45    1.30  1.51    1.25  1.57    1.20  1.63    1.15  1.69
         0.05               1.44  1.54    1.39  1.60    1.34  1.66    1.29  1.72    1.23  1.79
50       0.01               1.32  1.40    1.28  1.45    1.24  1.49    1.20  1.54    1.16  1.59
         0.025              1.42  1.50    1.38  1.54    1.34  1.59    1.30  1.64    1.26  1.69
         0.05               1.50  1.59    1.46  1.63    1.42  1.67    1.38  1.72    1.34  1.77
60       0.01               1.38  1.45    1.35  1.48    1.32  1.52    1.28  1.56    1.25  1.60
         0.025              1.47  1.54    1.44  1.57    1.40  1.61    1.37  1.65    1.33  1.69
         0.05               1.55  1.62    1.51  1.65    1.48  1.69    1.44  1.73    1.41  1.77
80       0.01               1.47  1.52    1.44  1.54    1.42  1.57    1.39  1.60    1.36  1.62
         0.025              1.54  1.59    1.52  1.62    1.49  1.65    1.47  1.67    1.44  1.70
         0.05               1.61  1.66    1.59  1.69    1.56  1.72    1.53  1.74    1.51  1.77
100      0.01               1.54  1.56    1.50  1.58    1.48  1.60    1.45  1.63    1.44  1.65
         0.025              1.59  1.63    1.57  1.65    1.55  1.67    1.53  1.70    1.51  1.72
         0.05               1.65  1.69    1.63  1.72    1.61  1.74    1.59  1.76    1.57  1.78

Source: Adapted from Econometrics, by R. J. Wonnacott and T. H. Wonnacott, John Wiley & Sons, New York, 1970, with permission of the publisher.


inconclusive. When the test is inconclusive, the implication is that more data must be collected. In many problems this is difficult to do.
To test for negative autocorrelation, that is, if the alternative hypothesis in equation 15-68 is H1: ρ < 0, then use D' = 4 − D as the test statistic, where D is defined in equation 15-69. If a two-sided alternative is specified, then use both of the one-sided procedures, noting that the type I error for the two-sided test is 2α, where α is the type I error for the one-sided tests.
The only effective remedial measure when autocorrelation is present is to build a model that accounts explicitly for the autocorrelative structure of the errors. For an introductory treatment of these methods, refer to Montgomery, Peck, and Vining (2001).
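
Computing the Durbin-Watson statistic itself is straightforward. A minimal sketch (Python with NumPy), applied to the ordinary least-squares residuals, is:

```python
import numpy as np

def durbin_watson(e):
    """Durbin-Watson statistic (equation 15-69) from the OLS residuals e_1..e_n."""
    e = np.asarray(e, dtype=float)
    return np.sum(np.diff(e)**2) / np.sum(e**2)

# Compare D with the critical values in Table 15-14:
#   D > D_U: do not reject H0: rho = 0;   D < D_L: conclude positive autocorrelation;
#   otherwise the test is inconclusive.  Use D' = 4 - D for negative autocorrelation.
```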

15-11 SELECTION OF VARIABLES IN MULTIPLE REGRESSION

15-11.1 The Model-Building Problem

An important problem in many applications of regression analysis is the selection of the set of independent or regressor variables to be used in the model. Sometimes previous experience or underlying theoretical considerations can help the analyst specify the set of independent variables. Usually, however, the problem consists of selecting an appropriate set of regressors from a set that quite likely includes all the important variables, but we are sure that not all these candidate variables are necessary to adequately model the response y.
In such a situation, we are interested in screening the candidate variables to obtain a regression model that contains the "best" subset of regressor variables. We would like the final model to contain enough regressor variables so that in the intended use of the model (prediction, for example) it will perform satisfactorily. On the other hand, to keep model maintenance costs to a minimum, we would like the model to use as few regressor variables as possible. The compromise between these conflicting objectives is often called finding the "best" regression equation. However, in most problems, there is no single regression model that is "best" in terms of the various evaluation criteria that have been proposed. A great deal of judgment and experience with the system being modeled is usually necessary to select an appropriate set of independent variables for a regression equation.
No algorithm will always produce a good solution to the variable selection problem. Most currently available procedures are search techniques. To perform satisfactorily, they require interaction with and judgment by the analyst. We now briefly discuss some of the more popular variable selection techniques.

15-11.2 Computational Procedures for Variable Selection

We assume that there are k candidate variables, x1, x2, ..., xk, and a single dependent variable y. All models will include an intercept term β0, so that the model with all variables included would have k + 1 terms. Furthermore, the functional form of each candidate variable (for example, x1 = 1/x, x2 = ln x, etc.) is assumed to be correct.

All Possible Regressions  This approach requires that the analyst fit all the regression equations involving one candidate variable, all regression equations involving two candidate variables, and so on. Then these equations are evaluated according to some suitable criteria to select the "best" regression model. If there are k candidate variables, there are 2^k total equations to be examined. For example, if k = 4, there are 2^4 = 16 possible regression equations, while if k = 10, there are 2^10 = 1024 possible regression equations. Hence, the number of equations to be examined increases rapidly as the number of candidate variables increases.


There are a number of criteria that may be used for evaluating and comparing the different regression models obtained. Perhaps the most commonly used criterion is based on the coefficient of multiple determination. Let R_p² denote the coefficient of determination for a regression model with p terms, that is, p − 1 candidate variables and an intercept term (note that p ≤ k + 1). Computationally, we have

$$R_p^2 = \frac{SS_R(p)}{S_{yy}} = 1 - \frac{SS_E(p)}{S_{yy}},$$  (15-70)

where SS_R(p) and SS_E(p) denote the regression sum of squares and the error sum of squares, respectively, for the p-term equation. Now R_p² increases as p increases and is a maximum when p = k + 1. Therefore, the analyst uses this criterion by adding variables to the model up to the point where an additional variable is not useful in that it gives only a small increase in R_p². The general approach is illustrated in Fig. 15-9, which gives a hypothetical plot of R_p² against p. Typically, one examines a display such as this and chooses the number of variables in the model as the point at which the "knee" in the curve becomes apparent. Clearly, this requires judgment on the part of the analyst.
A second criterion is to consider the mean square error for the p-variable equation, say MS_E(p) = SS_E(p)/(n − p). Generally, MS_E(p) decreases as p increases, but this is not necessarily so. If the addition of a variable to the model with p − 1 terms does not reduce the error sum of squares in the new p-term model by an amount equal to the error mean square in the old p − 1 term model, MS_E(p) will increase, because of the loss of one degree of freedom for error. Therefore, a logical criterion is to select p as the value that minimizes MS_E(p); or, since MS_E(p) is usually relatively flat in the vicinity of the minimum, we could choose p such that adding more variables to the model produces only very small reductions in MS_E(p). The general procedure is illustrated in Fig. 15-10.
A third criterion is the C_p statistic, which is a measure of the total mean square error for the regression model. We define the total standardized mean square error as

$$\Gamma_p = \frac{1}{\sigma^2}\sum_{i=1}^{n} E\bigl\{[\hat y_i - E(y_i)]^2\bigr\}
= \frac{1}{\sigma^2}\left[\sum_{i=1}^{n}\bigl\{E(y_i) - E(\hat y_i)\bigr\}^2 + \sum_{i=1}^{n} V(\hat y_i)\right]
= \frac{1}{\sigma^2}\bigl[(\text{bias})^2 + \text{variance}\bigr].$$

Figure 15-9 Plot of R_p² against p.


Figure 15-10 Plot of MS_E(p) against p.

We use the mean square error from the full k + 1 term model as an estimate of σ²; that is, σ̂² = MS_E(k + 1). An estimator of Γ_p is

$$C_p = \frac{SS_E(p)}{\hat\sigma^2} - n + 2p.$$  (15-71)

If the p-term model has negligible bias, then it can be shown that

$$E(C_p \mid \text{zero bias}) = p.$$

Therefore, the values of C_p for each regression model under consideration should be plotted against p. The regression equations that have negligible bias will have values of C_p that fall near the line C_p = p, while those with significant bias will have values of C_p that plot above this line. One then chooses as the "best" regression equation either a model with minimum C_p or a model with a slightly larger C_p that contains less bias than the minimum.
Another criterion is based on a modification of R_p² that accounts for the number of variables in the model. We presented this statistic in Section 15-6.1, the adjusted R² for the model fit in Example 15-1. This statistic is the adjusted R_p², defined as

$$R^2_{\text{adj}}(p) = 1 - \frac{n-1}{n-p}\bigl(1 - R_p^2\bigr).$$  (15-72)

Note that R²_adj(p) may decrease as p increases if the decrease in (n − 1)(1 − R_p²) is not compensated for by the loss of one degree of freedom in n − p. The experimenter would usually select the regression model that has the maximum value of R²_adj(p). However, note that this is equivalent to the model that minimizes MS_E(p), since

$$R^2_{\text{adj}}(p) = 1 - \left(\frac{n-1}{S_{yy}}\right)MS_E(p).$$
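
For small k, all of these criteria can be evaluated by brute force. The following sketch (Python with NumPy; the function name is ours) enumerates every subset of candidate regressors and reports R_p², R²_adj(p), MS_E(p), and C_p, with σ̂² taken from the full model as in equation 15-71:

```python
import numpy as np
from itertools import combinations

def all_possible_regressions(X, y):
    """Evaluate R_p^2, adjusted R^2, MS_E(p), and C_p (equations 15-70 to 15-72)
    for every subset of the k candidate regressors in the columns of X.
    An intercept is always included in each model."""
    n, k = X.shape
    Syy = np.sum((y - y.mean())**2)

    def sse(cols):
        A = np.column_stack([np.ones(n), X[:, list(cols)]])
        r = y - A @ np.linalg.lstsq(A, y, rcond=None)[0]
        return r @ r

    sigma2 = sse(range(k)) / (n - k - 1)       # MS_E of the full (k + 1)-term model
    results = []
    for m in range(1, k + 1):
        for cols in combinations(range(k), m):
            e, p = sse(cols), m + 1            # p = number of terms, incl. intercept
            r2 = 1.0 - e / Syy
            r2_adj = 1.0 - (n - 1) / (n - p) * (1.0 - r2)
            cp = e / sigma2 - n + 2 * p
            results.append((cols, r2, r2_adj, e / (n - p), cp))
    return results
```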


Example 15-16
The data in Table 15-15 are an expanded set of data for the damaged peach data in Example 15-1. There are now five candidate variables: drop height (x1), fruit density (x2), fruit height at impact point (x3), fruit pulp thickness (x4), and potential energy of the fruit before the impact (x5).
Table 15-16 presents the results of running all possible regressions (except the trivial model with only an intercept) on these data. The values of R_p², R²_adj(p), MS_E(p), and C_p are given in the table. A plot of the maximum R_p² for each subset of size p is shown in Fig. 15-11. Based on this plot there does not appear to be much gain in adding the fifth variable; the value of R_p² does not seem to increase significantly with the addition of x5 over the four-variable model with the highest R_p² value. A plot of the minimum MS_E(p) for each subset of size p is shown in Fig. 15-12. The best two-variable model is either (x1, x2) or (x2, x3); the best three-variable model is (x1, x2, x3); the best four-variable model is either (x1, x2, x3, x4) or (x1, x2, x3, x5). There are several models with relatively small values of MS_E(p), but either the three-variable model (x1, x2, x3) or the four-variable model (x1, x2, x3, x4) would be superior to the other models based on the MS_E(p) criterion. Further investigation will be necessary.
A C_p plot is shown in Fig. 15-13. Only the five-variable model has C_p ≤ p (specifically C_p = 6.0), but the C_p value for the four-variable model (x1, x2, x3, x4) is C_p = 6.1732. There appears to be insufficient gain in the C_p value to justify including x5. To illustrate the calculations, for this equation [for the model including (x1, x2, x3, x4)] we would find

$$C_p = \frac{SS_E(p)}{\hat\sigma^2} - n + 2p = \frac{18.29715}{1.13132} - 20 + 2(5) = 6.1732,$$
Table 15-15 Damaged Peach Data for Example 15-16

Observation            Drop       Fruit      Fruit      Fruit Pulp   Potential
Number         y       Height,    Density,   Height,    Thickness,   Energy,
                       x1         x2         x3         x4           x5
1              2.60    303.7      0.90       26.1       22.3         184.5
2              3.62    366.7      1.04       18.0       21.5         185.2
3              7.27    336.8      1.01       39.0       27.3         128.4
4              1.53    304.5      0.95       48.5       19.5         173.0
5              4.91    346.8      0.98       43.1       20.2         139.6
6             10.36    600.0      1.04       21.0       22.6         146.5
7              5.26    369.0      0.96       12.7       26.1         155.5
8              6.09    418.0      1.00       46.0       48.0         129.2
9              6.57    269.0      1.01        2.6       32.8         154.6
10             4.24    323.0      0.94        6.9       29.2         152.8
11             8.04    562.2      1.01       30.6       17.1         199.6
12             3.46    284.2      0.97       22.9        …           177.5
13             8.50    558.6      1.03       20.4       39.2         210.0
14             9.34    415.0      1.01       18.7       36.3         165.1
15             5.55    349.5      1.04       17.0       21.5         195.3
16             8.11    462.8      1.02       20.1       23.7         171.0
17             7.32    333.1      1.05       18.1       21.9         163.9
18            12.58    502.1      1.10       21.5       16.9         140.8
19             0.15    311.4      0.91       24.4       26.0         154.1
20             5.23    351.4      0.96       37.7       23.5         194.6


Figure 15-11 The maximum R_p² plot for Example 15-16.

Figure 15-12 The MS_E(p) plot for Example 15-16.

Figure 15-13 The C_p plot for Example 15-16.

noting that σ̂² = 1.13132 is obtained from the full equation (x1, x2, x3, x4, x5). Since all other models (with the exclusion of the five-variable model) contain substantial bias, we would conclude on the basis of the C_p criterion that the best subset of the regressor variables is (x1, x2, x3, x4). Since this model also results in a relatively small MS_E(p) and a relatively high R_p², we would select it as the "best" regression equation. The final model is

$$\hat y = -19.9 + 0.0123 x_1 + 27.3 x_2 - 0.0655 x_3 - 0.196 x_4.$$

Keep in mind, though, that further analysis should be conducted on this model as well as other possible candidate models. With additional investigation, it is possible to discover an even better-fitting model. We will discuss this in more detail later in this chapter.


The all-possible-regressions approach requires considerable computational effort, even when k is moderately small. However, if the analyst is willing to look at something less than the estimated model and all its associated statistics, it is possible to devise algorithms for all possible regressions that produce less information about each model but which are more efficient computationally. For example, suppose that we could efficiently calculate only the MS_E for each model. Since models with large MS_E are not likely to be selected as the best regression equations, we would then have only to examine in detail the models with small values of MS_E. There are several approaches to developing a computationally efficient algorithm for all possible regressions (for example, see Furnival and Wilson, 1974). Both the Minitab and SAS computer packages provide the Furnival and Wilson (1974) algorithm as an option. The SAS output is provided in Table 15-16.

Stepwise Regression  This is probably the most widely used variable selection technique. The procedure iteratively constructs a sequence of regression models by adding or
Table 15-16 All Possible Regressions for the Data in Example 15-16

Number in
Model, p − 1    R_p²      R²_adj(p)     C_p         MS_E(p)    Variables in Model
1              0.6530     0.6337      37.7047      3.37540    x2
1              0.5495     0.5245      53.7211      4.38205    x1
1              0.3553     0.3194      83.7824      6.27144    x3
1              0.1980     0.1535     108.1144      7.80074    x4
1              0.0021    -0.0534     138.4424      9.70689    x5
2              0.7805     0.7547      19.9697      2.26063    x1, x2
2              0.7393     0.7086      26.3536      2.68584    x2, x3
2              0.7086     0.6743      31.0914      3.00076    x2, x4
2              0.7030     0.6681      31.9641      3.05883    x1, x3
2              0.6601     0.6201      38.6031      3.50064    x2, x5
2              0.6412     0.5990      41.5311      3.69550    x1, x4
2              0.5528     0.5002      55.2140      4.60607    x1, x5
2              0.4940     0.4345      64.3102      5.21141    x3, x4
2              0.4020     0.3316      78.5531      6.15925    x3, x5
2              0.2125     0.1199     107.8731      8.11045    x4, x5
3              0.8756     0.8523       7.2532      1.36135    x1, x2, x3
3              0.8049     0.7683      18.3949      2.13501    x1, x2, x4
3              0.7898     0.7503      20.5368      2.30060    x1, x2, x5
3              0.7807     0.7396      21.9371      2.39961    x2, x3, x4
3              0.7721     0.7294      23.2681      2.49372    x2, x3, x5
3              0.7568     0.7112      25.6336      2.66098    x1, x3, x4
3              0.7337     0.6837      29.2199      2.91456    x1, x3, x5
3              0.7032     0.6475      33.9410      3.24838    x2, x4, x5
3              0.6448     0.5782      42.9705      3.88683    x1, x4, x5
3              0.5666     0.4853      55.0797      4.74304    x3, x4, x5
4              0.8955     0.8676       6.1732      1.21981    x1, x2, x3, x4
4              0.8795     0.8474       8.6459      1.40630    x1, x2, x3, x5
4              0.8316     0.7866      16.0687      1.96614    x2, x3, x4, x5
4              0.8103     0.7597      19.3611      2.21446    x1, x2, x4, x5
4              0.7854     0.7282      23.2090      2.50467    x1, x3, x4, x5
5              0.9095     0.8772       6.0000      1.13132    x1, x2, x3, x4, x5

removing variables at each step. The criterion for adding or removing a variable at any step is usually expressed in terms of a partial F-test. Let F_in be the value of the F statistic for adding a variable to the model, and let F_out be the value of the F statistic for removing a variable from the model. We must have F_in ≥ F_out, and usually F_in = F_out.
Stepwise regression begins by forming a one-variable model using the regressor variable that has the highest correlation with the response variable y. This will also be the variable producing the largest F statistic. If no F statistic exceeds F_in, the procedure terminates. For example, suppose that at this step x1 is selected. At the second step the remaining k − 1 candidate variables are examined, and the variable for which the statistic

$$F_j = \frac{SS_R(\beta_j \mid \beta_1, \beta_0)}{MS_E(x_j, x_1)}$$  (15-73)

is a maximum is added to the equation, provided that F_j > F_in. In equation 15-73, MS_E(x_j, x_1) denotes the mean square for error for the model containing both x_1 and x_j. Suppose that this procedure now indicates that x_2 should be added to the model. Now the stepwise regression algorithm determines whether the variable x_1 added at the first step should be removed. This is done by calculating the F statistic

$$F_1 = \frac{SS_R(\beta_1 \mid \beta_2, \beta_0)}{MS_E(x_1, x_2)}.$$  (15-74)

If F_1 < F_out, the variable x_1 is removed.
In general, at each step the set of remaining candidate variables is examined, and the variable with the largest partial F statistic is entered, provided that the observed value of F exceeds F_in. Then the partial F statistic for each variable in the model is calculated, and the variable with the smallest observed value of F is deleted if the observed F < F_out. The procedure continues until no other variables can be added to or removed from the model.
Stepwise regression is usually performed using a computer program. The analyst exercises control over the procedure by the choice of F_in and F_out. Some stepwise regression computer programs require that numerical values be specified for F_in and F_out. Since the number of degrees of freedom on MS_E depends on the number of variables in the model, which changes from step to step, a fixed value of F_in and F_out causes the type I and type II error rates to vary. Some computer programs allow the analyst to specify the type I error levels for F_in and F_out. However, the "advertised" significance level is not the true level, because the variable selected is the one that maximizes the partial F statistic at that stage. Sometimes it is useful to experiment with different values of F_in and F_out (or different advertised type I error rates) in several runs to see if this substantially affects the choice of the final model.
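
A bare-bones version of this algorithm is easy to write down. The sketch below (Python with NumPy; function names are ours) uses fixed F_in and F_out thresholds supplied by the caller rather than advertised type I error rates:

```python
import numpy as np

def _sse(X, y, cols):
    # Error sum of squares for the model with an intercept and the given columns.
    A = np.column_stack([np.ones(len(y))] + [X[:, j] for j in cols])
    r = y - A @ np.linalg.lstsq(A, y, rcond=None)[0]
    return r @ r

def partial_f(X, y, cols, j):
    """Partial F statistic for x_j given the variables in cols (cf. equation 15-73)."""
    full = list(cols) + [j]
    mse_full = _sse(X, y, full) / (len(y) - len(full) - 1)
    return (_sse(X, y, cols) - _sse(X, y, full)) / mse_full

def stepwise(X, y, f_in=3.0, f_out=3.0):
    """Bare-bones stepwise selection; returns the indices of the chosen columns."""
    chosen, remaining = [], list(range(X.shape[1]))
    while remaining:
        f_entry = {j: partial_f(X, y, chosen, j) for j in remaining}
        best = max(f_entry, key=f_entry.get)
        if f_entry[best] <= f_in:
            break
        chosen.append(best)
        remaining.remove(best)
        for i in list(chosen[:-1]):              # deletion sweep over retained terms
            if partial_f(X, y, [c for c in chosen if c != i], i) < f_out:
                chosen.remove(i)
                remaining.append(i)
    return chosen
```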

Example 15-17
We will apply stepwise regression to the damaged peaches data in Table 15-15. Minitab output is provided in Fig. 15-14. From this figure, we see that variables x1, x2, and x3 are significant; this is because the last column contains entries for only x1, x2, and x3. Figure 15-15 provides the SAS computer output that supports the computations carried out next. Instead of specifying numerical values of F_in and F_out, we use an advertised type I error of α = 0.10. The first step consists of building a simple linear regression model using the variable that gives the largest F statistic. This is x2, and since

$$F_2 = \frac{SS_R(\beta_2 \mid \beta_0)}{MS_E(x_2)} = \frac{114.32885}{3.37540} = 33.87 > F_{\text{in}} = F_{0.10,1,18} = 3.01,$$
Alpha-to-Enter: 0.1    Alpha-to-Remove: 0.1

Response is y on 5 predictors, with N = 20

Step            1        2        3
Constant   -42.87   -33.83   -27.89

x2           49.1     34.9     30.7
T-Value      5.82     4.23     4.71
P-Value     0.000    0.001    0.000

x1                  0.0131   0.0136
T-Value               3.14     4.19
P-Value              0.006    0.001

x3                           -0.067
T-Value                       -3.50
P-Value                       0.003

S            1.84     1.50     1.17
R-Sq        65.30    78.05    87.56
R-Sq(adj)   63.37    75.47    85.23
C-p          37.7     20.0      7.3

Figure 15-14 Minitab output for stepwise regression in Example 15-17.

x2 is entered into the model.
The second step begins by finding the variable with the largest partial F statistic, given that x2 is in the model. This is x1, and since

$$F_1 = \frac{SS_R(\beta_1 \mid \beta_2, \beta_0)}{MS_E(x_1, x_2)} = \frac{22.32656}{2.26063} = 9.88 > F_{\text{in}} = F_{0.10,1,17} = 3.03,$$

x1 is added to the model. Now the procedure evaluates whether or not x2 should be retained, given that x1 is in the model. This involves calculating

$$F_2 = \frac{SS_R(\beta_2 \mid \beta_1, \beta_0)}{MS_E(x_1, x_2)} = \frac{40.44627}{2.26063} = 17.89 > F_{\text{out}} = F_{0.10,1,17} = 3.03.$$

Therefore, x2 should be retained. Step 2 terminates with both x1 and x2 in the model.
The third step finds the next variable for entry as x3. Since

$$F_3 = \frac{SS_R(\beta_3 \mid \beta_1, \beta_2, \beta_0)}{MS_E(x_1, x_2, x_3)} = \frac{16.64910}{1.36135} = 12.23 > F_{\text{in}} = F_{0.10,1,16} = 3.05,$$

x3 is added to the model. Partial F-tests on x2 (given x1 and x3) and x1 (given x2 and x3) indicate that these variables should be retained. Therefore, the third step concludes with the variables x1, x2, and x3 in the model.
At the fourth step, neither of the remaining terms, x4 or x5, is significant enough to be included in the model. Therefore, the stepwise procedure terminates.
The stepwise regression procedure would conclude that the best model includes x1, x2, and x3. The usual checks of model adequacy, such as residual analysis and C_p plots, should be applied to the equation. These results are similar to those found by all possible regressions, with the exception that x4 was also considered a possible significant variable with all possible regressions.


Forward Selection  This variable selection procedure is based on the principle that variables should be added to the model one at a time until no remaining candidate variables produce a significant increase in the regression sum of squares. That is, variables are added one at a time as long as F > F_in. Forward selection is a simplification of stepwise regression that omits the partial F-test for deleting variables from the model that have been added at previous steps. This is a potential weakness of forward selection; the procedure does not explore the effect that adding a variable at the current step has on variables added at earlier steps.

The REG Procedure
Dependent Variable: y

Forward Selection: Step 1  Variable x2 Entered: R-Square = 0.6530 and C(p) = 37.7047

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              1        114.32885     114.32885     33.87   <.0001
Error             18         60.75725       3.37540
Corrected Total   19        175.08609

Variable    Parameter Estimate   Standard Error   Type II SS   F Value   Pr > F
Intercept            -42.87237          8.41429     87.62858     25.96   <.0001
x2                    49.08366          8.43377    114.32885     33.87   <.0001

Bounds on condition number: 1, 1

Forward Selection: Step 2  Variable x1 Entered: R-Square = 0.7805 and C(p) = 19.9697

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              2        136.65541      68.32771     30.23   <.0001
Error             17         38.43068       2.26063
Corrected Total   19        175.08609

Variable    Parameter Estimate   Standard Error   Type II SS   F Value   Pr > F
Intercept            -33.83110          7.46286     46.45691     20.55   0.0003
x1                     0.01314          0.00418     22.32656      9.88   0.0059
x2                    34.88963          8.24844     40.44627     17.89   0.0006

Bounds on condition number: 1.4282, 5.7129

Forward Selection: Step 3  Variable x3 Entered: R-Square = 0.8756 and C(p) = 7.2532

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              3        153.30451      51.10150     37.54   <.0001
Error             16         21.78159       1.36135
Corrected Total   19        175.08609

Variable    Parameter Estimate   Standard Error   Type II SS   F Value   Pr > F
Intercept            -27.89190          6.03518     29.07675     21.36   0.0003
x1                     0.01360          0.00325     23.87130     17.54   0.0007
x2                    30.68486          6.51286     30.21859     22.20   0.0002
x3                    -0.06701          0.01916     16.64910     12.23   0.0030

Bounds on condition number: 1.4786, 11.85

No other variable met the 0.1000 significance level for entry into the model.

Summary of Forward Selection

        Variable   Number    Partial     Model
Step    Entered    Vars In   R-Square    R-Square       C(p)   F Value   Pr > F
1       x2         1         0.6530      0.6530      37.7047     33.87   <.0001
2       x1         2         0.1275      0.7805      19.9697      9.88   0.0059
3       x3         3         0.0951      0.8756       7.2532     12.23   0.0030

Figure 15-16 SAS output for forward selection in Example 15-18.


Example 15-18
Application of the forward selection algorithm to the damaged peach data in Table 15-15 would begin by adding x2 to the model. Then the variable that induces the largest partial F-test, given that x2 is in the model, is added; this is variable x1. The third step enters x3, which produces the largest partial F statistic given that x1 and x2 are in the model. Since the partial F statistics for x4 and x5 are not significant, the procedure terminates. The SAS output for forward selection is given in Fig. 15-16. Note that forward selection leads to the same final model as stepwise regression. This is not always the case.

Backward Elimination  This algorithm begins with all k candidate variables in the model. Then the variable with the smallest partial F statistic is deleted if this F statistic is insignificant, that is, if F < F_out. Next, the model with k − 1 variables is estimated, and the next variable for potential elimination is found. The algorithm terminates when no further variables can be deleted.
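
Backward elimination differs from the stepwise sketch above only in its starting point and stopping rule. A minimal version, reusing the partial_f helper defined in that earlier sketch, is:

```python
def backward_elimination(X, y, f_out=3.0):
    """Backward elimination sketch: start with all k candidates and repeatedly
    drop the variable with the smallest partial F while that F is below f_out.
    Relies on partial_f from the stepwise sketch above."""
    chosen = list(range(X.shape[1]))
    while chosen:
        f_stats = {i: partial_f(X, y, [c for c in chosen if c != i], i)
                   for i in chosen}
        weakest = min(f_stats, key=f_stats.get)
        if f_stats[weakest] >= f_out:
            break
        chosen.remove(weakest)
    return chosen
```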

Example 15-19
To apply backward elimination to the data in Table 15-15, we begin by estimating the full model in all five variables. This model is

$$\hat y = -20.89732 + 0.01102 x_1 + 27.37046 x_2 - 0.06929 x_3 - 0.25695 x_4 + 0.01688 x_5.$$

The SAS computer output is given in Fig. 15-17. The partial F-tests for each variable are as follows:
follows:
$$F_1 = \frac{SS_R(\beta_1 \mid \beta_2, \beta_3, \beta_4, \beta_5, \beta_0)}{MS_E} = \frac{13.65360}{1.13132} = 12.07,$$

$$F_2 = \frac{SS_R(\beta_2 \mid \beta_1, \beta_3, \beta_4, \beta_5, \beta_0)}{MS_E} = \frac{21.73153}{1.13132} = 19.21,$$

$$F_3 = \frac{SS_R(\beta_3 \mid \beta_1, \beta_2, \beta_4, \beta_5, \beta_0)}{MS_E} = \frac{17.37834}{1.13132} = 15.36,$$

$$F_4 = \frac{SS_R(\beta_4 \mid \beta_1, \beta_2, \beta_3, \beta_5, \beta_0)}{MS_E} = \frac{5.25602}{1.13132} = 4.65,$$

$$F_5 = \frac{SS_R(\beta_5 \mid \beta_1, \beta_2, \beta_3, \beta_4, \beta_0)}{MS_E} = \frac{2.45862}{1.13132} = 2.17.$$

The variable x5 has the smallest F statistic, F5 = 2.17 < F_out = F_{0.10,1,14} = 3.10; therefore, x5 is removed from the model at step 1. The model is now fit with only the four remaining variables. At step 2, the F statistic for x4 (F4 = 2.86) is less than F_out = F_{0.10,1,15} = 3.07; therefore, x4 is removed from the model. No remaining variables have F statistics less than the appropriate F_out values, and the procedure is terminated. The three-variable model (x1, x2, x3) has all variables significant according to the partial F-test criterion. Note that backward elimination has resulted in the same model that was found by forward selection and stepwise regression. This may not always happen.

Some Comments on Final Model Selection  We have illustrated several different approaches to the selection of variables in multiple linear regression. The final model obtained from any model-building procedure should be subjected to the usual adequacy checks, such as residual analysis and examination of the effects of outlying points. The analyst may also consider augmenting the original set of candidate variables with cross products, polynomial terms, or other transformations of the original variables that might improve the model.


The REG Procedure
Dependent Variable: y

Backward Elimination: Step 0  All Variables Entered: R-Square = 0.9095 and C(p) = 6.0000

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              5        159.24760      31.84952     28.15   <.0001
Error             14         15.83850       1.13132
Corrected Total   19        175.08609

Variable    Parameter Estimate   Standard Error   Type II SS   F Value   Pr > F
Intercept            -20.89732          7.16035      9.63604      8.52   0.0112
x1                     0.01102          0.00317     13.65360     12.07   0.0037
x2                    27.37046          6.24496     21.73153     19.21   0.0006
x3                    -0.06929          0.01768     17.37834     15.36   0.0015
x4                    -0.25695          0.11921      5.25602      4.65   0.0490
x5                     0.01688          0.01132      2.45862      2.17   0.1626

Bounds on condition number: 1.6438, 35.628

Backward Elimination: Step 1  Variable x5 Removed: R-Square = 0.8955 and C(p) = 6.1732

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              4        156.78898      39.19724     32.13   <.0001
Error             15         18.29712       1.21981
Corrected Total   19        175.08609

Variable    Parameter Estimate   Standard Error   Type II SS   F Value   Pr > F
Intercept            -19.93175          7.40393      8.84010      7.25   0.0167
x1                     0.01233          0.00316     18.51245     15.18   0.0014
x2                    27.28797          6.48433     21.60246     17.71   0.0008
x3                    -0.06549          0.01816     15.86299     13.00   0.0026
x4                    -0.19641          0.11621      3.48447      2.86   0.1117

Bounds on condition number: 1.6358, 22.31

Backward Elimination: Step 2  Variable x4 Removed: R-Square = 0.8756 and C(p) = 7.2532

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              3        153.30451      51.10150     37.54   <.0001
Error             16         21.78159       1.36135
Corrected Total   19        175.08609

Variable    Parameter Estimate   Standard Error   Type II SS   F Value   Pr > F
Intercept            -27.89190          6.03518     29.07675     21.36   0.0003
x1                     0.01360          0.00325     23.87130     17.54   0.0007
x2                    30.68486          6.51286     30.21859     22.20   0.0002
x3                    -0.06701          0.01916     16.64910     12.23   0.0030

Bounds on condition number: 1.4786, 11.85

All variables left in the model are significant at the 0.1000 level.

Summary of Backward Elimination

        Variable   Number    Partial     Model
Step    Removed    Vars In   R-Square    R-Square      C(p)    F Value   Pr > F
1       x5         4         0.0140      0.8955      6.1732       2.17   0.1626
2       x4         3         0.0199      0.8756      7.2532       2.86   0.1117

Figure 15-17 SAS output for backward elimination in Example 15-19.


~he REG P~Qced~re


Dependent Variable: y
S,::epwise Selectior.: S"':ep 1 Variable x2. Ente~ed: R-Square

485

0.6530 a..'1d Cep)

37.7047

Source

DF

Made:

1
18

Error
Corrected

Tcta~

19
?ara.meter
Est:imat:e

Variable
In"tercept

~42.87237

x2

49.C8366

S,::ep~~se

Sum of
Squares

Mea.'l
Square

114.32885

114.32885
3.37540

60.75725

:? Val.ue

PI.'

33.87

> F

<.0001

175.08609

St:andard
Error

Type II SS

F Value

Pr > F
<.0001
<.0001

8.41429
87.62858
25.96
8.43377
114.32885
33.87
Bounds On condition ncnber: 1, 1

Selection: Step 2

Variab~e

xl Ent:ered: R-Squaxe

0.7805

a.~d

C(p) =

19.9697

Sou!."ce
DE
Model
2
Error
17
Corrected Total
19
Parart',et:er
Variab~e
Esti.ma'::e
Intercept
-33.83110
0.0":"3l4
34.88963

xl

x2

S\lln of
Squares

Mean
Square

136.65541
38.43068
":"75.08609

63.32771
2.26063

Sta.."'1dard
Error

F Value
30.2.3

Pr > F
<.OCOl

.:;:.::: 55
F Value
46.45691
20.55
22.32656
9.Sa
40.44627
17.89
condi tioD number: 1.4282, 5.7129
~ype

7.46286
0.00418
8.24844

Bounds on

-----

Pr > F
C.0003
0.0059
0.00C6

---------

St:epwise SElEction! St:ep 3 Variable x3 En"tered: R-Square


7.2532
SUl1I

of

<=

C. 8756 and C(p) ,,-,

Mean

SCJ.!.'ce
Squares
Sqt:are
F Value
D:?
153.30451
37,54
Model
3
51.10150
21.78159
1. 36135
16
Error
175.08609
Ccrrected 70tal
19
Pararnet:er
S-::~'1dard
F Va1'.le
Variab:e
Es-:::imate
Error
TJ."? II 55
6.035:'8
29.07675
21.36
':::ntercept
-27.89190
0.00326
17,54
xl
0.01360
23.87130
6.51286
x2
30.68486
30.2.1859
22.20
-0.C6701
0.01916
16.64910
12.23x3
Bounds on coruli ~ion nurnbe:!:' : 1. 4786, 11.85

AI: variables lef"t in "the model are

signific~~t

a~

~he

Pr> F
<.0001

P:r > F
0.0003
0.00C7
0.0002
0.C030

O.lCOO leveL.

Ke other variable met the 0.1000 significance level for entry into the model.

Summary of St:epwise Select~on


Va:r:"able Variable Number
Partial
Mode:
Step Entered Removed Vars In .a-Squar R-Square
Cep)
x2

2
3

xl

,,3

1
2
3

0.6530
0.1275
0.0951

0.6530
0.7805
0.8756

Figure 15~17 SAS output for bacbvard elimination ill Example

F Value
33.87
9.88
7.253-2 12.23

37.7047
19.9697

Pr > F
<.0001
C,0059
0,C030

15~19.

A major criticism of variable selection methods, such as stepwise regression, is that the
analyst may conclude that there is one "best" regression equation. This generally is not the
case, because there are often several equally good regression models that can be used. One



way to avoid this problem is to use several different model-building techniques and see whether
different models result. For example, we have found the same model for the damaged peach
data by using stepwise regression, forward selection, and backward elimination. This is a
good indication that the three-variable model is the best regression equation. Furthermore,
there are variable selection techniques that are designed to find the best one-variable model,
the best two-variable model, and so forth. For a discussion of these methods, and the variable
selection problem in general, see Montgomery, Peck, and Vining (2001).

If the number of candidate regressors is not too large, the all-possible-regressions
method is recommended. It is not distorted by multicollinearity among the regressors, as
stepwise-type methods are.
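A minimal sketch of the all-possible-regressions idea follows (not from the text; statsmodels and pandas are assumed, and adjusted R-square is used as the ranking criterion, though C(p), MSE, or an information criterion would serve equally well):

```python
# All-possible-regressions sketch. Practical only for a modest number of
# candidate regressors, since the number of subsets grows as 2^k.
from itertools import combinations
import statsmodels.api as sm

def all_possible_regressions(X, y):
    fits = []
    for k in range(1, X.shape[1] + 1):
        for subset in combinations(X.columns, k):
            model = sm.OLS(y, sm.add_constant(X[list(subset)])).fit()
            fits.append((subset, model.rsquared_adj))
    # best subsets first, ranked by adjusted R-square
    return sorted(fits, key=lambda t: t[1], reverse=True)
```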

15-12 SUMMARY

This chapter has introduced multiple linear regression, including least-squares estimation of
the parameters, interval estimation, prediction of new observations, and methods for
hypothesis testing. Various tests of model adequacy, including residual plots, have been discussed. It was shown that polynomial regression models can be handled by the usual multiple linear regression methods. Indicator variables were introduced for dealing with
qualitative variables. It also was observed that the problem of multicollinearity, or intercorrelation between the regressor variables, can greatly complicate the regression problem
and often leads to a regression model that may not predict new observations well. Several
causes and remedial measures of this problem, including biased estimation techniques,
were discussed. Finally, the variable selection problem in multiple regression was introduced. A number of model-building procedures, including all possible regressions, stepwise
regression, forward selection, and backward elimination, were illustrated.

15-13 EXERCISES
15-1. Consider the damaged peach data in Table 15-15.
(a) Fit a regression model using x1 (drop height) and x4 (fruit pulp thickness) to these data.
(b) Test for significance of regression.
(c) Compute the residuals from this model. Analyze these residuals using the methods discussed in this chapter.
(d) How does this two-variable model compare with the two-variable model using x1 and x2 from Example 15-1?

15-2. Consider the damaged peach data in Table 15-15.
(a) Fit a regression model using x1 (drop height), x2 (fruit density), and x3 (fruit height at impact point) to these data.
(b) Test for significance of regression.
(c) Compute the residuals from this model. Analyze these residuals using the methods discussed in this chapter.

15-3. Using the results of Exercise 15-1, find a 95% confidence interval on β1.

15-4. Using the results of Exercise 15-2, find a 95% confidence interval on β3.


15-5. The data in the table at the top of page 487 are the 1976 team performance statistics for the teams in the National Football League (Source: The Sporting News).
(a) Fit a multiple regression model relating the number of games won to the teams' passing yardage (x2), the percentage of rushing plays (x7), and the opponents' yards rushing (x8).
(b) Construct the appropriate residual plots and comment on model adequacy.
(c) Test the significance of each variable to the model, using either the t-test or the partial F-test.

15-6. The table at the top of page 488 presents gasoline mileage performance for 25 automobiles (Source: Motor Trend, 1975).
(a) Fit a multiple regression model relating gasoline mileage to engine displacement (x1) and number of carburetor barrels (x6).
(b) Analyze the residuals and comment on model adequacy.


National Football League 1976 Team Performance

[Table: team-by-team data (y and x1-x9) for the 28 NFL teams: Washington, Minnesota, New England, Oakland, Pittsburgh, Baltimore, Los Angeles, Dallas, Atlanta, Buffalo, Chicago, Cincinnati, Cleveland, Denver, Detroit, Green Bay, Houston, Kansas City, Miami, New Orleans, New York Giants, New York Jets, Philadelphia, St. Louis, San Diego, San Francisco, Seattle, and Tampa Bay.]

y: Games won (per 14-game season).
x1: Rushing yards (season).
x2: Passing yards (season).
x3: Punting average (yds/punt).
x4: Field goal percentage (fgs made/fgs attempted).
x5: Turnover differential (turnovers acquired - turnovers lost).
x6: Penalty yards (season).
x7: Percent rushing (rushing plays/total plays).
x8: Opponents' rushing yards (season).
x9: Opponents' passing yards (season).

(c) What is the value of adding x6 to a model that already contains x1?

15-7. The electric power consumed each month by a chemical plant is thought to be related to the average ambient temperature (x1), the number of days in the month (x2), the average product purity (x3), and the tons of product produced (x4). The past year's historical data are available and are presented in the table at the bottom of page 488.

Gasoline Mileage Performance for 25 Automobiles

[Table: data (y and x1-x11) for the 25 automobiles: Apollo, Nova, Monarch, Duster, Jenson Conv., Skyhawk, Scirocco, Corolla SR-5, Camaro, Datsun B210, Capri II, Pacer, Granada, Eldorado, Imperial, Nova LN, Starfire, Cordoba, Trans Am, Corolla E-5, Mark IV, Celica GT, Charger SE, Cougar, and Corvette.]

y: Miles/gallon.
x1: Displacement (cubic in.).
x2: Horsepower (ft-lb).
x3: Torque (ft-lb).
x4: Compression ratio.
x5: Rear axle ratio.
x6: Carburetor (barrels).
x7: No. of transmission speeds.
x8: Overall length (in.).
x9: Width (in.).
x10: Weight (lbs).
x11: Type of transmission (A = automatic, M = manual).

Electric Power Consumption Data for Exercise 15-7

Month    y     x1   x2   x3   x4
1        240   25   24   91   100
2        236   31   21   90   95
3        290   45   24   88   110
4        274   60   25   87   88
5        301   65   25   91   94
6        316   72   26   94   99
7        300   80   25   87   97
8        296   84   25   86   96
9        267   75   24   88   110
10       276   60   25   91   105
11       288   50   25   90   100
12       261   38   23   89   98

(a) Fit a multiple regression model to these data.
(b) Test for significance of regression.
(c) Use partial F statistics to test H0: β3 = 0 and H0: β4 = 0.
(d) Compute the residuals from this model. Analyze the residuals using the methods discussed in this chapter.

15-8. Hald (1952) reports data on the heat evolved in calories per gram of cement (y) for various amounts of four ingredients (x1, x2, x3, x4):

Observation Number    y       x1   x2   x3   x4
1                     78.5    7    26   6    60
2                     74.3    1    29   15   52
3                     104.3   11   56   8    20
4                     87.6    11   31   8    47
5                     95.9    7    52   6    33
6                     109.2   11   55   9    22
7                     102.7   3    71   17   6
8                     72.5    1    31   22   44
9                     93.1    2    54   18   22
10                    115.9   21   47   4    26
11                    83.8    1    40   23   34
12                    113.3   11   66   9    12
13                    109.4   10   68   8    12

(a) Fit a multiple regression model to these data.
(b) Test for significance of regression.
(c) Test the hypothesis β4 = 0 using the partial F-test.
(d) Compute the t statistics for each independent variable. What conclusions can you draw?
(e) Test the hypothesis β2 = β3 = β4 = 0 using the partial F-test.
(f) Construct a 95% confidence interval estimate for β1.

15-9. An article entitled "A Method for Improving the Accuracy of Polynomial Regression Analysis" in the Journal of Quality Technology (1971, p. 149) reported the following data on y = ultimate shear strength of a rubber compound (psi) and x = cure temperature (°F):

y   770   800   840   810   735   640   590   560
x   280   284   292   295   298   305   308   315

(a) Fit a second-order polynomial to these data.
(b) Test for significance of regression.
(c) Test the hypothesis that β11 = 0.
(d) Compute the residuals and test for model adequacy.

15-10. Consider the following data, which result from an experiment to determine the effect of x = test time in hours at a particular temperature on y = change in oil viscosity:

y   -1.42   -1.39   -1.55   -1.89   -2.43   -3.15   -4.05   -5.15   -6.43   -7.89
x    0.25    0.50    0.75    1.00    1.25    1.50    1.75    2.00    2.25    2.50

(a) Fit a second-order polynomial to the data.
(b) Test for significance of regression.
(c) Test the hypothesis that β11 = 0.
(d) Compute the residuals and check for model adequacy.

15-11. For many polynomial regression models we subtract x̄ from each x value to produce a "centered" regressor x' = x - x̄. Using the data from Exercise 15-9, fit the model y = β'0 + β'1x' + β'11(x')² + ε. Use the results to estimate the coefficients in the uncentered model y = β0 + β1x + β11x² + ε.

15-12. Suppose that we use a standardized variable x' = (x - x̄)/sx, where sx is the standard deviation of x, in constructing a polynomial regression model. Using the data in Exercise 15-9 and the standardized variable approach, fit the model y = β'0 + β'1x' + β'11(x')² + ε.
(a) What value of y do you predict when x = 285°F?
(b) Estimate the regression coefficients in the unstandardized model y = β0 + β1x + β11x² + ε.
(c) What can you say about the relationship between SSE and R² for the standardized and unstandardized models?
(d) Suppose that y' = (y - ȳ)/sy is used in the model along with x'. Fit the model and comment on the relationship between SSE and R² in the standardized model and the unstandardized model.

15-13. The data shown at the bottom of this page were collected during an experiment to determine the change in thrust efficiency (%) (y) as the divergence angle of a rocket nozzle (x) changes.
(a) Fit a second-order model to the data.
(b) Test for significance of regression and lack of fit.
(c) Test the hypothesis that β11 = 0.

[Table: thrust efficiency (y) at nozzle divergence angles (x) including 4.0, 5.0, 5.0, 6.5, 6.5, 6.75, 7.0, and 7.3 degrees.]

15-14. Discuss the hazards inherent in fitting polynomial models.

15-15. Consider the data in Example 15-12. Test the hypothesis that two different regression models (with different slopes and intercepts) are required to adequately model the data.


15-16. Piecewise Linear Regression (I). Suppose that y is piecewise linearly related to x. That is, different linear relationships are appropriate over the intervals -∞ < x ≤ x* and x* < x < ∞. Show how indicator variables can be used to fit such a piecewise linear regression model, assuming that the point x* is known.

15-17. Piecewise Linear Regression (II). Consider the piecewise linear regression model described in Exercise 15-16. Suppose that at the point x* a discontinuity occurs in the regression function. Show how indicator variables can be used to incorporate the discontinuity into the model.

15-18. Piecewise Linear Regression (III). Consider the piecewise linear regression model described in Exercise 15-16. Suppose that the point x* is not known with certainty and must be estimated. Develop an approach that could be used to fit the piecewise linear regression model.

15-19. Calculate the standardized regression coefficients for the regression model developed in Exercise 15-1.

15-20. Calculate the standardized regression coefficients for the regression model developed in Exercise 15-2.

15-21. Find the variance inflation factors for the regression model developed in Example 15-1. Do they indicate that multicollinearity is a problem in this model?

15-22. Use the National Football League Team Performance data in Exercise 15-5 to build regression models using the following techniques:
(a) All possible regressions.
(b) Stepwise regression.
(c) Forward selection.
(d) Backward elimination.
(e) Comment on the various models obtained.

15-23. Use the gasoline mileage data in Exercise 15-6 to build regression models using the following techniques:
(a) All possible regressions.
(b) Stepwise regression.
(c) Forward selection.
(d) Backward elimination.
(e) Comment on the various models obtained.

15-24. Consider the Hald cement data in Exercise 15-8. Build regression models for the data using the following techniques:
(a) All possible regressions.
(b) Stepwise regression.
(c) Forward selection.
(d) Backward elimination.

15-25. Consider the Hald cement data in Exercise 15-8. Fit a regression model involving all four regressors and find the variance inflation factors. Is multicollinearity a problem in this model? Use ridge regression to estimate the coefficients in this model. Compare the ridge model to the models obtained in Exercise 15-24 using variable selection methods.

Chapter 16

Nonparametric Statistics
16-1 INTRODUCTION

Most of the hypothesis testing and confidence interval procedures in previous chapters are based on the assumption that we are working with random samples from normal populations. Fortunately, most of these procedures are relatively insensitive to slight departures from normality. In general, the t- and F-tests and the t confidence intervals will have actual levels of significance or confidence levels that differ from the nominal or advertised levels chosen by the experimenter, although the difference between the actual and advertised levels is usually fairly small when the underlying population is not too different from the normal distribution. Traditionally, we have called these procedures parametric methods because they are based on a particular parametric family of distributions (in this case, the normal). Alternatively, sometimes we say that these procedures are not distribution free because they depend on the assumption of normality.

In this chapter we describe procedures called nonparametric or distribution-free methods, which usually make no assumptions about the distribution of the underlying population other than that it is continuous. These procedures have actual levels of significance α or confidence levels 100(1 - α)% for many different types of distributions. These procedures also have considerable appeal. One of their advantages is that the data need not be quantitative; they could be categorical (such as yes or no, defective or nondefective, etc.) or rank data. Another advantage is that nonparametric procedures are usually very quick and easy to perform.

The procedures described in this chapter are competitors of the parametric t- and F-procedures described earlier. Consequently, it is important to compare the performance of both parametric and nonparametric methods under the assumptions of both normal and nonnormal populations. In general, nonparametric procedures do not utilize all the information provided by the sample, and as a result a nonparametric procedure will be less efficient than the corresponding parametric procedure when the underlying population is normal. This loss of efficiency usually is reflected by a requirement for a larger sample size for the nonparametric procedure than would be required by the parametric procedure in order to achieve the same probability of type II error. On the other hand, this loss of efficiency is usually not large, and often the difference in sample size is very small. When the underlying distributions are not normal, nonparametric methods have much to offer. They often provide considerable improvement over the normal-theory parametric methods.

16-2 THE SIGN TEST

16-2.1 A Description of the Sign Test

The sign test is used to test hypotheses about the median μ̃ of a continuous distribution. Recall that the median of a distribution is a value of the random variable X such that the probability is 0.5 that an observed value of X is less than or equal to the median, and the probability is 0.5 that an observed value of X is greater than or equal to the median. That is, P(X ≤ μ̃) = P(X ≥ μ̃) = 0.5.

Since the normal distribution is symmetric, the mean of a normal distribution equals the median. Therefore the sign test can be used to test hypotheses about the mean of a normal distribution. This is the same problem for which we used the t-test in Chapter 11. We will discuss the relative merits of the two procedures in Section 16-2.4. Note that while the t-test was designed for samples from a normal distribution, the sign test is appropriate for samples from any continuous distribution. Thus, the sign test is a nonparametric procedure.
Suppose that the hypotheses are

H0: μ̃ = μ̃0,
H1: μ̃ ≠ μ̃0.    (16-1)

The test procedure is as follows. Suppose that X1, X2, ..., Xn is a random sample of n observations from the population of interest. Form the differences (Xi - μ̃0), i = 1, 2, ..., n. Now if H0: μ̃ = μ̃0 is true, any difference Xi - μ̃0 is equally likely to be positive or negative. Therefore let R+ denote the number of these differences (Xi - μ̃0) that are positive and let R- denote the number of these differences that are negative, where R = min(R+, R-).

When the null hypothesis is true, R has a binomial distribution with parameters n and p = 0.5. Therefore, we would find a critical value, say R*_α, from the binomial distribution that ensures that P(type I error) = P(reject H0 when H0 is true) = α. A table of these critical values R*_α is given in Appendix Table X. If the test statistic R ≤ R*_α, then the null hypothesis H0: μ̃ = μ̃0 should be rejected.

Example 16-1
Montgomery, Peck, and Vining (2001) report on a study in which a rocket motor is formed by binding an igniter propellant and a sustainer propellant together inside a metal housing. The shear strength of the bond between the two propellant types is an important characteristic. Results of testing 20 randomly selected motors are shown in Table 16-1. We would like to test the hypothesis that the median shear strength is 2000 psi.

The formal statement of the hypotheses of interest is

H0: μ̃ = 2000,
H1: μ̃ ≠ 2000.

The last two columns of Table 16-1 show the differences (Xi - 2000) for i = 1, 2, ..., 20 and the corresponding signs. Note that R+ = 14 and R- = 6. Therefore R = min(R+, R-) = min(14, 6) = 6. From Appendix Table X, with n = 20, we find that the critical value for α = 0.05 is R*_0.05 = 5. Therefore, since R = 6 is not less than or equal to the critical value R*_0.05 = 5, we cannot reject the null hypothesis that the median shear strength is 2000 psi.

We note that since R is a binomial random variable, we could test the hypothesis of interest by directly calculating a P-value from the binomial distribution. When H0: μ̃ = 2000 is true, R has a binomial distribution with parameters n = 20 and p = 0.5. Thus the probability of observing six or fewer negative signs in a sample of 20 observations is

P(R ≤ 6) = Σ_{r=0}^{6} C(20, r)(0.5)^r (0.5)^{20-r} = 0.058.

Since the P-value is not less than the desired level of significance, we cannot reject the null hypothesis of μ̃ = 2000 psi.
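This binomial tail calculation is easy to reproduce with software. A minimal sketch using scipy (which is not referenced in the text):

```python
# P-value for the sign test of Example 16-1: n = 20 signs, R = 6 negatives.
from scipy.stats import binom

n, r = 20, 6
p_value = binom.cdf(r, n, 0.5)   # P(R <= 6) when p = 0.5
print(round(p_value, 3))         # 0.058, as computed in the text
```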


Table 16-1 Propellant Shear Strength Data

Observation (i)   Shear Strength (Xi)   Difference (Xi - 2000)   Sign
1                 2158.70               +158.70                  +
2                 1678.15               -321.85                  -
3                 2316.00               +316.00                  +
4                 2061.30               +61.30                   +
5                 2207.50               +207.50                  +
6                 1708.30               -291.70                  -
7                 1784.70               -215.30                  -
8                 2575.10               +575.10                  +
9                 2357.90               +357.90                  +
10                2256.70               +256.70                  +
11                2165.20               +165.20                  +
12                2399.55               +399.55                  +
13                1779.80               -220.20                  -
14                2336.75               +336.75                  +
15                1765.30               -234.70                  -
16                2053.50               +53.50                   +
17                2414.40               +414.40                  +
18                2200.50               +200.50                  +
19                2654.20               +654.20                  +
20                1753.70               -246.30                  -

Exact Significance Levels  When a test statistic has a discrete distribution, such as R does in the sign test, it may be impossible to choose a critical value R*_α that has a level of significance exactly equal to α. The usual approach is to choose R*_α to yield an α as close to the advertised level α as possible.

Ties in the Sign Test  Since the underlying population is assumed to be continuous, it is theoretically impossible to find a "tie," that is, a value of Xi exactly equal to μ̃0. However, this may sometimes happen in practice because of the way the data are collected. When ties occur, they should be set aside and the sign test applied to the remaining data.

One-Sided Alternative Hypotheses  We can also use the sign test when a one-sided alternative hypothesis is appropriate. If the alternative is H1: μ̃ > μ̃0, then reject H0: μ̃ = μ̃0 if R- < R*_α; if the alternative is H1: μ̃ < μ̃0, then reject H0: μ̃ = μ̃0 if R+ < R*_α. The level of significance of a one-sided test is one-half the value shown in Appendix Table X. It is also possible to calculate a P-value from the binomial distribution for the one-sided case.

The Normal Approximation  When p = 0.5, the binomial distribution is well approximated by a normal distribution when n is at least 10. Thus, since the mean of the binomial is np and the variance is np(1 - p), the distribution of R is approximately normal, with mean 0.5n and variance 0.25n, whenever n is moderately large. Therefore, in these cases the null hypothesis can be tested with the statistic

Z0 = (R - 0.5n) / (0.5√n).    (16-2)

The two-sided alternative would be rejected if |Z0| > Z_{α/2}, and the critical regions of the one-sided alternatives would be chosen to reflect the sense of the alternative (if the alternative is H1: μ̃ > μ̃0, reject H0 if Z0 > Z_α, for example).
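A small sketch of this normal approximation in code (the values n = 40 and R = 13 are illustrative only, not from the text):

```python
# Normal approximation (equation 16-2) to the sign test.
from math import sqrt
from scipy.stats import norm

n, r = 40, 13
z0 = (r - 0.5 * n) / (0.5 * sqrt(n))
p_value = 2 * norm.cdf(-abs(z0))          # two-sided alternative
print(round(z0, 2), round(p_value, 4))    # Z0 = -2.21, P-value about 0.027
```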

16-2.2 The Sign Test for Paired Samples

The sign test can also be applied to paired observations drawn from continuous populations. Let (X1j, X2j), j = 1, 2, ..., n, be a collection of paired observations from two continuous populations, and let

Dj = X1j - X2j,  j = 1, 2, ..., n,

be the paired differences. We wish to test the hypothesis that the two populations have a common median, that is, that μ̃1 = μ̃2. This is equivalent to testing that the median of the differences μ̃D = 0. This can be done by applying the sign test to the n differences Dj, as illustrated in the following example.

Example 16-2
An automotive engineer is investigating two different types of metering devices for an electronic fuel injection system to determine whether they differ in their fuel-mileage performance. The system is installed on 12 different cars, and a test is run with each metering system on each car. The observed fuel mileage performance data, corresponding differences, and their signs are shown in Table 16-2. Note that R+ = 8 and R- = 4. Therefore R = min(R+, R-) = min(8, 4) = 4. From Appendix Table X, with n = 12, we find the critical value for α = 0.05 is R*_0.05 = 2. Since R is not less than the critical value R*_0.05, we cannot reject the null hypothesis that the two metering devices produce the same fuel mileage performance.

16-2.3 Type II Error (β) for the Sign Test

The sign test will control the probability of type I error at an advertised level α for testing the null hypothesis H0: μ̃ = μ̃0 for any continuous distribution. As with any hypothesis-testing procedure, it is important to investigate the type II error, β. The test should be able to effectively detect departures from the null hypothesis, and a good measure of this effectiveness is the value of β for departures that are important. A small value of β implies an effective test procedure.

Table 16-2 Performance of Flow Metering Devices

Car   Device 1   Device 2   Difference, Dj   Sign
1     17.6       16.8       0.8              +
2     19.4       20.0       -0.6             -
3     19.5       18.2       1.3              +
4     17.1       16.4       0.7              +
5     15.3       16.0       -0.7             -
6     15.9       15.4       0.5              +
7     16.3       16.5       -0.2             -
8     18.4       18.0       0.4              +
9     17.3       16.4       0.9              +
10    19.1       20.1       -1.0             -
11    17.8       16.7       1.1              +
12    18.2       17.9       0.3              +


In determining β, it is important to realize that not only must a particular value of μ̃, say μ̃0 + Δ, be used, but also the form of the underlying distribution will affect the calculations. To illustrate, suppose that the underlying distribution is normal with σ = 1 and we are testing the hypothesis that μ̃ = 2 (since μ̃ = μ in the normal distribution, this is equivalent to testing that the mean equals 2). Suppose it is important to detect a departure from μ̃ = 2 to μ̃ = 3. The situation is illustrated graphically in Fig. 16-1a. When the alternative hypothesis is true (H1: μ̃ = 3), the probability that the random variable X exceeds the value 2 is

p = P(X > 2) = P(Z > -1) = 1 - Φ(-1) = 0.8413.

Suppose we have taken a sample of size 12. At the α = 0.05 level, Appendix Table X indicates that we would reject H0: μ̃ = 2 if R ≤ R*_0.05 = 2. Therefore, the β error is the probability that we do not reject H0: μ̃ = 2 when in fact μ̃ = 3, or

β = 1 - Σ_{x=0}^{2} C(12, x)(0.1587)^x (0.8413)^{12-x} = 0.2944.

If the distribution of X had been exponential rather than normal, then the situation would be as shown in Fig. 16-1b, and the probability that the random variable X exceeds the value x = 2 when μ̃ = 3 (note that when the median of an exponential distribution is 3, the mean is 4.33) is

p = P(X > 2) = ∫_{2}^{∞} (1/4.33) e^{-x/4.33} dx = e^{-2/4.33} = 0.6301.
[Figure 16-1 Calculation of β for the sign test: (a) normal distributions, (b) exponential distributions.]


The β error in this case is

β = 1 - Σ_{x=0}^{2} C(12, x)(0.3699)^x (0.6301)^{12-x} = 0.8794.

Thus, the β error for the sign test depends not only on the alternative value of μ̃ but also on the area to the right of the value specified in the null hypothesis under the population probability distribution. This area is highly dependent on the shape of that particular probability distribution.
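Both β calculations above reduce to binomial tail sums and can be checked directly; a sketch with scipy (not part of the text):

```python
# Beta for the sign test with n = 12 and rejection region R <= 2. The number
# of negative signs is binomial with parameter 1 - P(X > 2) under H1.
from scipy.stats import binom

for label, p_exceed in [("normal", 0.8413), ("exponential", 0.6301)]:
    beta = 1 - binom.cdf(2, 12, 1 - p_exceed)
    print(label, round(beta, 4))   # approximately 0.294 and 0.879
```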

16-2.4 Comparison of the Sign Test and the t-Test

If the underlying population is normal, then either the sign test or the t-test could be used to test H0: μ̃ = μ̃0. The t-test is known to have the smallest value of β possible among all tests that have significance level α, so it is superior to the sign test in the normal distribution case. When the population distribution is symmetric and nonnormal (but with finite mean μ = μ̃), the t-test will have a β error that is smaller than β for the sign test, unless the distribution has very heavy tails compared with the normal. Thus, the sign test is usually considered a test procedure for the median rather than a serious competitor of the t-test. The Wilcoxon signed rank test in the next section is preferable to the sign test and compares well with the t-test for symmetric distributions.

16-3 THE WILCOXON SIGNED RANK TEST

Suppose that we are willing to assume that the population of interest is continuous and symmetric. As in the previous section, our interest focuses on the median μ̃ (or, equivalently, the mean μ, since μ̃ = μ for symmetric distributions). A disadvantage of the sign test in this situation is that it considers only the signs of the deviations Xi - μ̃0 and not their magnitudes. The Wilcoxon signed rank test is designed to overcome that disadvantage.

16-3.1 A Description of the Test

We are interested in testing H0: μ = μ0 against the usual alternatives. Assume that X1, X2, ..., Xn is a random sample from a continuous and symmetric distribution with mean (and median) μ. Compute the differences Xi - μ0, i = 1, 2, ..., n. Rank the absolute differences |Xi - μ0|, i = 1, 2, ..., n, in ascending order, and then give the ranks the signs of their corresponding differences. Let R+ be the sum of the positive ranks and R- be the absolute value of the sum of the negative ranks, and let R = min(R+, R-). Appendix Table XI contains critical values of R, say R*_α. If the alternative hypothesis is H1: μ ≠ μ0, then if R ≤ R*_α the null hypothesis H0: μ = μ0 is rejected.

For one-sided tests, if the alternative is H1: μ > μ0, reject H0: μ = μ0 if R- ≤ R*_α; and if the alternative is H1: μ < μ0, reject H0: μ = μ0 if R+ ≤ R*_α. The significance level for one-sided tests is one-half the advertised level in Appendix Table XI.

Example 16-3
To illustrate the Wilcoxon signed rank test, consider the propellant shear strength data presented in Table 16-1. The signed ranks are shown below:

Observation   Difference, Xi - 2000   Signed Rank
16            +53.50                  +1
4             +61.30                  +2
1             +158.70                 +3
11            +165.20                 +4
18            +200.50                 +5
5             +207.50                 +6
7             -215.30                 -7
13            -220.20                 -8
15            -234.70                 -9
20            -246.30                 -10
10            +256.70                 +11
6             -291.70                 -12
3             +316.00                 +13
2             -321.85                 -14
14            +336.75                 +15
9             +357.90                 +16
12            +399.55                 +17
17            +414.40                 +18
8             +575.10                 +19
19            +654.20                 +20

The sum of the positive ranks is R+ = (1 + 2 + 3 + 4 + 5 + 6 + 11 + 13 + 15 + 16 + 17 + 18 + 19 + 20) = 150, and the sum of the negative ranks is R- = (7 + 8 + 9 + 10 + 12 + 14) = 60. Therefore R = min(R+, R-) = min(150, 60) = 60. From Appendix Table XI, with n = 20 and α = 0.05, we find the critical value R*_0.05 = 52. Since R exceeds R*_0.05, we cannot reject the null hypothesis that the mean (or median, since the population is assumed to be symmetric) shear strength is 2000 psi.
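A sketch of this test with scipy (the data are re-entered from Table 16-1; scipy is not referenced in the text):

```python
# Wilcoxon signed rank test for Example 16-3; the statistic returned for the
# two-sided alternative is min(R+, R-), which should equal 60.
from scipy.stats import wilcoxon

strength = [2158.70, 1678.15, 2316.00, 2061.30, 2207.50, 1708.30, 1784.70,
            2575.10, 2357.90, 2256.70, 2165.20, 2399.55, 1779.80, 2336.75,
            1765.30, 2053.50, 2414.40, 2200.50, 2654.20, 1753.70]
stat, p = wilcoxon([x - 2000 for x in strength])
print(stat, p)   # statistic 60.0; P-value > 0.05, so H0 is not rejected
```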

Ties in the Wilcoxon Signed Rank Test  Because the underlying population is continuous, ties are theoretically impossible, although they will sometimes occur in practice. If several observations have the same absolute magnitude, they are assigned the average of the ranks that they would receive if they differed slightly from one another.

16-3.2 A Large-Sample Approximation

If the sample size is moderately large, say n > 20, then it can be shown that R has approximately a normal distribution with mean

n(n + 1)/4

and variance

n(n + 1)(2n + 1)/24.

Therefore, a test of H0: μ = μ0 can be based on the statistic

Z0 = [R - n(n + 1)/4] / √[n(n + 1)(2n + 1)/24].    (16-3)

An appropriate critical region can be chosen from a table of the standard normal distribution.
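A sketch of equation 16-3 in code (the values of n and R are illustrative only, not from the text):

```python
# Large-sample normal approximation for the Wilcoxon signed rank statistic.
from math import sqrt
from scipy.stats import norm

n, r = 30, 120
z0 = (r - n * (n + 1) / 4) / sqrt(n * (n + 1) * (2 * n + 1) / 24)
p_value = 2 * norm.cdf(-abs(z0))          # two-sided alternative
print(round(z0, 2), round(p_value, 4))
```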

16-3.3 Paired Observations

The Wilcoxon signed rank test can be applied to paired data. Let (X1j, X2j), j = 1, 2, ..., n, be a collection of paired observations from continuous distributions that differ only with respect to their means (it is not necessary that the distributions of X1 and X2 be symmetric). This assures that the distribution of the differences Dj = X1j - X2j is continuous and symmetric.

To use the Wilcoxon signed rank test, the differences are first ranked in ascending order of their absolute values, and then the ranks are given the signs of the differences. Ties are assigned average ranks. Let R+ be the sum of the positive ranks and R- be the absolute value of the sum of the negative ranks, and let R = min(R+, R-). We reject the hypothesis of equality of means if R ≤ R*_α, where R*_α is chosen from Appendix Table XI.

For one-sided tests, if the alternative is H1: μ1 > μ2 (or H1: μD > 0), reject H0 if R- ≤ R*_α; and if H1: μ1 < μ2 (or H1: μD < 0), reject H0 if R+ ≤ R*_α. Note that the significance level for one-sided tests is one-half the value given in Table XI.

Example 16-4
Consider the fuel metering device data examined in Example 16-2. The signed ranks are shown below:

Car   Difference   Signed Rank
7     -0.2         -1
12    0.3          +2
8     0.4          +3
6     0.5          +4
2     -0.6         -5
4     0.7          +6.5
5     -0.7         -6.5
1     0.8          +8
9     0.9          +9
10    -1.0         -10
11    1.1          +11
3     1.3          +12

Note that R+ = 55.5 and R- = 22.5; therefore, R = min(R+, R-) = min(55.5, 22.5) = 22.5. From Appendix Table XI, with n = 12 and α = 0.05, we find the critical value R*_0.05 = 13. Since R exceeds R*_0.05, we cannot reject the null hypothesis that the two metering devices produce the same mileage performance.
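A sketch of Example 16-4 with scipy (differences re-entered from the table; with tied absolute values, scipy falls back to a normal approximation for the P-value):

```python
# Paired Wilcoxon signed rank test applied to the 12 fuel-mileage differences.
from scipy.stats import wilcoxon

d = [0.8, -0.6, 1.3, 0.7, -0.7, 0.5, -0.2, 0.4, 0.9, -1.0, 1.1, 0.3]
stat, p = wilcoxon(d)
print(stat, p)   # statistic 22.5 = min(R+, R-); P-value > 0.05
```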


16-3.4 Comparison with the t-Test

When the underlying population is normal, either the t-test or the Wilcoxon signed rank test can be used to test hypotheses about μ. The t-test is the best test in such situations in the sense that it produces a minimum value of β for all tests with significance level α. However, since it is not always clear that the normal distribution is appropriate, and since there are many situations in which we know it to be inappropriate, it is of interest to compare the two procedures for both normal and nonnormal populations.

Unfortunately, such a comparison is not easy. The problem is that β for the Wilcoxon signed rank test is very difficult to obtain, and β for the t-test is difficult to obtain for nonnormal distributions. Because type II error comparisons are difficult, other measures of comparison have been developed. One widely used measure is asymptotic relative efficiency (ARE). The ARE of one test relative to another is the limiting ratio of the sample sizes necessary to obtain identical error probabilities for the two procedures. For example, if the ARE of one test relative to a competitor is 0.5, then when sample sizes are large, the first test will require a sample twice as large as the second one to obtain similar error performance. While this does not tell us anything for small sample sizes, we can say the following:

1. For normal populations, the ARE of the Wilcoxon signed rank test relative to the t-test is approximately 0.95.
2. For nonnormal populations, the ARE is at least 0.86, and in many cases it will exceed unity. When it exceeds unity, the Wilcoxon signed rank test requires a smaller sample size than does the t-test.

Although these are large-sample results, we generally conclude that the Wilcoxon signed rank test will never be much worse than the t-test, and in many cases where the population is nonnormal it may be superior. Thus the Wilcoxon signed rank test is a useful alternative to the t-test.

16-4 THE WILCOXON RANK-SUM TEST

Suppose that we have two independent continuous populations X1 and X2 with means μ1 and μ2. The distributions of X1 and X2 have the same shape and spread and differ only (possibly) in their means. The Wilcoxon rank-sum test can be used to test the hypothesis H0: μ1 = μ2. Sometimes this procedure is called the Mann-Whitney test, although the Mann-Whitney test statistic is usually expressed in a different form.

16-4.1 A Description of the Test

Let X11, X12, ..., X1n1 and X21, X22, ..., X2n2 be two independent random samples from the continuous populations X1 and X2 described earlier. We assume that n1 ≤ n2. Arrange all n1 + n2 observations in ascending order of magnitude and assign ranks to them. If two or more observations are tied (identical), then use the mean of the ranks that would have been assigned if the observations differed. Let R1 be the sum of the ranks in the smaller X1 sample, and define

R2 = (n1 + n2)(n1 + n2 + 1)/2 - R1.    (16-4)

Now if the two means do not differ, we would expect the sums of the ranks to be nearly equal for both samples. Consequently, if the sums of the ranks differ greatly, we would conclude that the means are not equal.

Appendix Table IX contains the critical values R*_α of the rank sums for α = 0.05 and α = 0.01. Refer to Appendix Table IX with the appropriate sample sizes n1 and n2. The null hypothesis H0: μ1 = μ2 is rejected in favor of H1: μ1 ≠ μ2 if either R1 or R2 is less than or equal to the tabulated critical value R*_α.

The procedure can also be used for one-sided alternatives. If the alternative is H1: μ1 < μ2, then reject H0 if R1 ≤ R*_α; while for H1: μ1 > μ2, reject H0 if R2 ≤ R*_α. For these one-sided tests the tabulated critical values R*_α correspond to levels of significance of α = 0.025 and α = 0.005.

Example 16-5
The mean axial stress in tensile members used in an aircraft structure is being studied. Two alloys are being investigated. Alloy 1 is a traditional material and alloy 2 is a new aluminum-lithium alloy that is much lighter than the standard material. Ten specimens of each alloy type are tested, and the axial stress is measured. The sample data are assembled in the following table:

Alloy 1 (psi): 3238, 3195, 3246, 3190, 3204, 3254, 3229, 3225, 3217, 3241
Alloy 2 (psi): 3261, 3187, 3209, 3212, 3258, 3248, 3215, 3226, 3240, 3234

The data are arranged in ascending order and ranked as follows:

Alloy Number   Axial Stress (psi)   Rank
2              3187                 1
1              3190                 2
1              3195                 3
1              3204                 4
2              3209                 5
2              3212                 6
2              3215                 7
1              3217                 8
1              3225                 9
2              3226                 10
1              3229                 11
2              3234                 12
1              3238                 13
2              3240                 14
1              3241                 15
1              3246                 16
2              3248                 17
1              3254                 18
2              3258                 19
2              3261                 20


The sum of the ranks for alloy 1 is

R1 = 2 + 3 + 4 + 8 + 9 + 11 + 13 + 15 + 16 + 18 = 99,

and for alloy 2,

R2 = (n1 + n2)(n1 + n2 + 1)/2 - R1 = (20)(21)/2 - 99 = 111.

From Appendix Table IX, with n1 = n2 = 10 and α = 0.05, we find that R*_0.05 = 78. Since neither R1 nor R2 is less than R*_0.05, we cannot reject the hypothesis that both alloys exhibit the same mean axial stress.
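A sketch of Example 16-5 using scipy's Mann-Whitney form of the rank-sum test, in which U1 = R1 - n1(n1 + 1)/2, so U1 = 99 - 55 = 44 here (data re-entered from the example):

```python
# Mann-Whitney / Wilcoxon rank-sum test for the two alloys.
from scipy.stats import mannwhitneyu

alloy1 = [3238, 3195, 3246, 3190, 3204, 3254, 3229, 3225, 3217, 3241]
alloy2 = [3261, 3187, 3209, 3212, 3258, 3248, 3215, 3226, 3240, 3234]
u, p = mannwhitneyu(alloy1, alloy2, alternative="two-sided")
print(u, p)   # U = 44.0, equivalent to R1 = 99; P-value well above 0.05
```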

16-4.2 A Large-Sample Approximation

When both n1 and n2 are moderately large, say greater than 8, the distribution of R1 can be well approximated by the normal distribution with mean

μ_R1 = n1(n1 + n2 + 1)/2

and variance

σ²_R1 = n1 n2 (n1 + n2 + 1)/12.

Therefore, for n1 and n2 > 8 we could use

Z0 = (R1 - μ_R1) / σ_R1    (16-5)

as a test statistic, with the appropriate critical region |Z0| > Z_{α/2}, Z0 > Z_α, or Z0 < -Z_α, depending on whether the test is a two-tailed, upper-tail, or lower-tail test.

16-4.3 Comparison with the t-Test

In Section 16-3.4 we discussed the comparison of the t-test with the Wilcoxon signed rank test. The results for the two-sample problem are identical to the one-sample case; that is, when the normality assumption is correct, the Wilcoxon rank-sum test is approximately 95% as efficient as the t-test in large samples. On the other hand, regardless of the form of the distributions, the Wilcoxon rank-sum test will always be at least 86% as efficient, even if the underlying distributions are very nonnormal. The efficiency of the Wilcoxon test relative to the t-test is usually high if the underlying distribution has heavier tails than the normal, because the behavior of the t-test is very dependent on the sample mean, which is quite unstable in heavy-tailed distributions.

16-5 NONPARAMETRIC METHODS IN THE ANALYSIS OF VARIANCE

16-5.1 The Kruskal-Wallis Test

The single-factor analysis of variance model developed in Chapter 12 for comparing a population means is

y_ij = μ + τ_i + ε_ij,  i = 1, 2, ..., a,  j = 1, 2, ..., n_i.    (16-6)

In this model the error terms ε_ij are assumed to be normally and independently distributed with mean zero and variance σ². The assumption of normality led directly to the F-test described in Chapter 12. The Kruskal-Wallis test is a nonparametric alternative to the F-test; it requires only that the ε_ij have the same continuous distribution for all treatments i = 1, 2, ..., a.

Suppose that N = Σ_{i=1}^{a} n_i is the total number of observations. Rank all N observations from smallest to largest and assign the smallest observation rank 1, the next smallest rank 2, ..., and the largest observation rank N. If the null hypothesis

H0: μ1 = μ2 = ... = μa

is true, the N observations come from the same distribution, and all possible assignments of the N ranks to the a samples are equally likely; we would then expect the ranks 1, 2, ..., N to be mixed throughout the a samples. If, however, the null hypothesis H0 is false, then some samples will consist of observations having predominantly small ranks while other samples will consist of observations having predominantly large ranks. Let R_ij be the rank of observation Y_ij, and let R_i· and R̄_i· denote the total and average of the n_i ranks in the ith treatment. When the null hypothesis is true,

E(R_ij) = (N + 1)/2

and

E(R̄_i·) = (N + 1)/2.

The Kruskal-Wallis test statistic measures the degree to which the actual observed average ranks R̄_i· differ from their expected value (N + 1)/2. If this difference is large, then the null hypothesis H0 is rejected. The test statistic is

K = [12/(N(N + 1))] Σ_{i=1}^{a} n_i (R̄_i· - (N + 1)/2)².    (16-7)

An alternative computing formula is

K = [12/(N(N + 1))] Σ_{i=1}^{a} R²_i·/n_i - 3(N + 1).    (16-8)

We would usually prefer equation 16-8 to equation 16-7, as it involves the rank totals rather than the averages.

The null hypothesis H0 should be rejected if the sample data generate a large value for K. The null distribution for K has been obtained by using the fact that under H0 each possible assignment of ranks to the a treatments is equally likely. Thus we could enumerate all possible assignments and count the number of times each value of K occurs. This has led to tables of the critical values of K, although most tables are restricted to small sample sizes n_i. In practice, we usually employ the following large-sample approximation: Whenever H0 is true and either

a = 3 and n_i ≥ 6 for i = 1, 2, 3

or

a > 3 and n_i ≥ 5 for i = 1, 2, ..., a,


then K has approximately a chi-square distribution with a - 1 degrees of freedom. Since large values of K imply that H0 is false, we would reject H0 if

K ≥ χ²_{α,a-1}.

The test has approximate significance level α.

Ties in the Kruskal-Wallis Test  When observations are tied, assign an average rank to each of the tied observations. When there are ties, we should replace the test statistic in equation 16-8 with

K = (1/S²) [Σ_{i=1}^{a} R²_i·/n_i - N(N + 1)²/4],    (16-9)

where n_i is the number of observations in the ith treatment, N is the total number of observations, and

S² = [1/(N - 1)] [Σ_{i=1}^{a} Σ_{j=1}^{n_i} R²_ij - N(N + 1)²/4].    (16-10)

Note that S² is just the variance of the ranks. When the number of ties is moderate, there will be little difference between equations 16-8 and 16-9, and the simpler form (equation 16-8) may be used.

Example 16-6
In Design and Analysis of Experiments, 5th Edition (John Wiley & Sons, 2001), D. C. Montgomery presents data from an experiment in which five different levels of cotton content in a synthetic fiber were tested to determine whether cotton content has any effect on fiber tensile strength. The sample data and ranks from this experiment are shown in Table 16-3. Since there is a fairly large number of ties, we use equation 16-9 as the test statistic. From equation 16-10 we find

S² = [1/(N - 1)] [Σ_i Σ_j R²_ij - N(N + 1)²/4]
   = (1/24) [5497.79 - 25(26)²/4]
   = 53.03.

Table 16-3 Data and Ranks for the Tensile Testing Experiment

Percentage of Cotton:   15           20           25           30           35
                        7 (2.0)      12 (9.5)     14 (11.0)    19 (20.5)    7 (2.0)
                        7 (2.0)      17 (14.0)    18 (16.5)    25 (25.0)    10 (5.0)
                        15 (12.5)    12 (9.5)     18 (16.5)    22 (23.0)    11 (7.0)
                        11 (7.0)     18 (16.5)    19 (20.5)    19 (20.5)    15 (12.5)
                        9 (4.0)      18 (16.5)    19 (20.5)    23 (24.0)    11 (7.0)
R_i·                    27.5         66.0         85.0         113.0        33.5

and the test statistic is

K = (1/S²) [Σ_{i=1}^{a} R²_i·/n_i - N(N + 1)²/4]
  = (1/53.03) [5245.0 - 25(26)²/4]
  = 19.25.

Since K > χ²_{0.01,4} = 13.28, we would reject the null hypothesis and conclude that treatments differ. This is the same conclusion given by the usual analysis of variance F-test.
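A sketch checking this example with scipy, whose kruskal function applies the same tie correction (data re-entered from Table 16-3; small differences from the hand-computed 19.25 reflect rounding in the hand calculation):

```python
# Kruskal-Wallis test for the tensile-strength data at five cotton levels.
from scipy.stats import kruskal

y15 = [7, 7, 15, 11, 9]
y20 = [12, 17, 12, 18, 18]
y25 = [14, 18, 18, 19, 19]
y30 = [19, 25, 22, 19, 23]
y35 = [7, 10, 11, 15, 11]
k, p = kruskal(y15, y20, y25, y30, y35)
print(round(k, 2), p)   # K close to 19.25; P-value far below 0.01
```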

16-5.2 The Rank Transformation

The procedure used in the previous section of replacing the observations by their ranks is called the rank transformation. It is a very powerful and widely useful technique. If we were to apply the ordinary F-test to the ranks rather than to the original data, we would obtain

F0 = [K/(a - 1)] / [(N - 1 - K)/(N - a)]

as the test statistic. Note that as the Kruskal-Wallis statistic K increases or decreases, F0 also increases or decreases, so the Kruskal-Wallis test is nearly equivalent to applying the usual analysis of variance to the ranks.

The rank transformation has wide applicability in experimental design problems for which no nonparametric alternative to the analysis of variance exists. If the data are ranked and the ordinary F-test applied, an approximate procedure results, but one that has good statistical properties. When we are concerned about the normality assumption or the effect of outliers or "wild" values, we recommend that the usual analysis of variance be performed on both the original data and the ranks. When both procedures give similar results, the analysis of variance assumptions are probably satisfied reasonably well, and the standard analysis is satisfactory. When the two procedures differ, the rank transformation should be preferred since it is less likely to be distorted by nonnormality and unusual observations. In such cases, the experimenter may want to investigate the use of transformations for nonnormality and examine the data and the experimental procedure to determine whether outliers are present and, if so, why they have occurred.

16-6 SUMMARY

This chapter has introduced nonparametric or distribution-free statistical methods. These procedures are alternatives to the usual parametric t- and F-tests when the normality assumption for the underlying population is not satisfied. The sign test can be used to test hypotheses about the median of a continuous distribution. It can also be applied to paired observations. The Wilcoxon signed rank test can be used to test hypotheses about the mean of a symmetric continuous distribution. It can also be applied to paired observations. The Wilcoxon signed rank test is a good alternative to the t-test. The two-sample hypothesis-testing problem on means of continuous symmetric distributions is approached using the Wilcoxon rank-sum test. This procedure compares very favorably with the two-sample t-test. The Kruskal-Wallis test is a useful alternative to the F-test in the analysis of variance.


16-7 EXERCISES

16-1. Ten samples were taken from a plating bath used in an electronics manufacturing process, and the bath pH determined. The sample pH values are given below:

7.91, 7.85, 6.82, 8.01, 7.46, 6.95, 7.05, 7.35, 7.25, 7.42.

Manufacturing engineering believes that pH has a median value of 7.0. Do the sample data indicate that this statement is correct? Use the sign test to investigate this hypothesis.

16-2. The titanium content in an aircraft-grade alloy is an important determinant of strength. A sample of 20 test coupons reveals the following titanium contents (in percent):

8.32, 8.05, 8.93, 8.65, 8.25, 8.46, 8.52, 8.35, 8.36, 8.41, 8.42, 8.30, 8.71, 8.75, 8.60, 8.83, 8.50, 8.38, 8.29, 8.46.

The median titanium content should be 8.5%. Use the sign test to investigate this hypothesis.

16-3. The distribution of the time between arrivals in a telecommunication system is exponential, and the system manager wishes to test the hypothesis H0: μ̃ = 3.5 min versus H1: μ̃ > 3.5 min.
(a) What is the value of the mean of the exponential distribution under H0: μ̃ = 3.5?
(b) Suppose that we have taken a sample of n = 10 observations and we observe R- = 3. Would the sign test reject H0 at α = 0.05?
(c) What is the type II error of this test if μ̃ = 4.5?

16-4. Suppose that we take a sample of n = 10 measurements from a normal distribution with σ = 1. We wish to test H0: μ = 0 against H1: μ > 0. The normal test statistic is Z0 = (X̄ - μ0)/(σ/√n), and we decide to use a critical region of 1.96 (that is, reject H0 if Z0 ≥ 1.96).
(a) What is α for this test?
(b) What is β for this test if μ = 1?
(c) If a sign test is used, specify the critical region that gives an α value consistent with α for the normal test.
(d) What is the β value for the sign test if μ = 1? Compare this with the result obtained in part (b).
16-5. Two different types of tips can be used in a Rockwell hardness tester. Eight coupons from test ingots of a nickel-based alloy are selected, and each coupon is tested twice, once with each tip. The Rockwell C-scale hardness readings are shown next. Use the sign test to determine whether or not the two tips produce equivalent hardness readings.

Coupon   Tip 1   Tip 2
1        63      60
2        52      51
3        58      56
4        60      59
5        55      58
6        57      54
7        53      52
8        59      61

16-6. Testing for Trends. A turbocharger wheel is manufactured using an investment casting process. The shaft fits into the wheel opening, and this wheel opening is a critical dimension. As wheel wax patterns are formed, the hard tool producing the wax patterns wears. This may cause growth in the wheel-opening dimension. Ten wheel-opening measurements, in time order of production, are shown below:

4.00 (mm), 4.02, 4.03, 4.01, 4.00, 4.03, 4.04, 4.02, 4.03, 4.03.

(a) Suppose that p is the probability that observation X_{i+5} exceeds observation X_i. If there is no upward or downward trend, then X_{i+5} is no more or less likely to exceed X_i than to lie below X_i. What is the value of p?
(b) Let V be the number of values of i for which X_{i+5} > X_i. If there is no upward or downward trend in the measurements, what is the probability distribution of V?
(c) Use the data above and the results of parts (a) and (b) to test H0: there is no trend, versus H1: there is upward trend. Use α = 0.05.

Note that this test is a modification of the sign test. It was developed by Cox and Stuart.

16-7. Consider the Wilcoxon signed rank test, and suppose that n = 5. Assume that H0: μ = μ0 is true.
(a) How many different sequences of signed ranks are possible? Enumerate these sequences.
(b) How many different values of R+ are there? Find the probability associated with each value of R+.
(c) Suppose that we define the critical region of the test to be R*_α such that we would reject if R+ > R*_α, and R*_α = 13. What is the approximate α level of this test?
(d) Can you see from this exercise how the critical values for the Wilcoxon signed rank test were developed? Explain.


16..&. Consider:he data in Exercise 16-1. and asstllXte


that the distribution of pH is symmetric and continu~
ous, Use the Wilcoxon signed rank test to test the
hyp<:1thesis Ho: ,u == 7 against H!; /1 '# 7.
16~9. Consider the data

in Exercise 16-2, Suppose

that the distribution of titanium content is symmetric


and cont:ir.uous. Use the Wilcoxon signed rank test to
test the hypotheses Ho: J.l;:;::! 8.5 versus H;: J.l #; &.5.
16~10" Consider

t.1e data in Ex.ercise 16-2. Use the

large-sample approximation for the Wilcoxon signed


rank: test to test :he hypotheses Ho: J1::::: 8.5 Versus H,;
J1 ~ 8,5. Assume that :he distribution of titanium content is continuo'.)s and symmetric,

Unit I 25, 27, 29, 31, 30, 26, 24, 32, 33, 38.
Unit 2 31, 33, 32, 35, 34, 29, 38, 35, 37, 30,
16-16. In Design and h..alysis 0/ Experiments, 5th
Edition (John Wiley & Sons, 2001), D. C. Montgomery presents the results of an experiment to com-

pare four different :n;.,iting tech.rUques: on the tensile


strength of portland cement, The results are sho\Vn
below. Is t.~ere any indication that rnixi:ng !echnique
affects t..lJ.e st:ength?

Hi-ll. For the large-sample approximation to the


Wilcoxon signed rank test, derive the mean and standard deviation of the test statistic used in t!1e
procedtlte.
16~12. Consider the Rockwell hardness test data in
Exercise 16-5. Assume Li.a: bot."1 distributions are
continuous and use ll-)e \Vilcoxon signed rank test to
rest that the mear:. difference iL hardr,ess readings
between the two tips is zero,

16-13. An electrical engineer must design a circuit to


deliver the r.:l:aximum amount of current to a display
rube to achieve sufficient image brightness. Within
rJs allowable design constraints. he has developed
t'NO candidate circuits and tests prototypes of each.
The resulting data (in microamperes) is shown below:
Circuit 1: 25:, 255, 258, 257. 250, 251, 254, 2S0, 248
Circuit 2: 250, 253, 249, 256, 259, 252, 250, 251
, Use ;he Wilcoxon r.m.\<;-sum test 1.0 test Ho: J1; =
against the alternative HI; J11 > ,u1>

,u1

16-14. A consultant frequently travels from Phoenix,


Arizona, to Los Angeles. California, He will use one
of two airlir:es, Uni:ed or Southwest. The number
of minutes that his flight arrived late for the last
six trips on each airlbe is sbo\Vll belOW. Is there evidence that either airli,'le has superior on-time arrival
perfon:l3!lce'1

United

Southwest 20

19,
4 8

-2

0 (minutes late)

8 -3

5 (minu:es late)

The manufacrurer of a bot tub is interested in


twO different heating elements fOf his product.
The element that produces the maximum heat gain after
15 rcinutes would be preferable. He obtains 10 samples
of each heating unit and tests each one. The heat gain
after 15 minutes (in "F) is shown below, Is there any reason to suspect that one unit is superior to the other?
l<i~15.

testing

Tensile St:'ength (lb i In.')

:Mixing Technique

2
3
4
16~17.

3129
3200
2800
2600

3000
3000
2900
2700

2865
2975
2985
2600

2890
3150
3050
2765

16-17. An article in the Quality Control Handbook, 3rd Edition (McGraw-Hill, 1962), presents the results of an experiment performed to investigate the effect of three different conditioning methods on the breaking strength of cement briquettes. The data are shown below. Is there any indication that conditioning method affects breaking strength?

Conditioning Method    Breaking Strength (lb/in.²)
1                      553  550  568  541  537
2                      553  599  579  545  540
3                      492  530  528  510  571

16-18. In Statistics for Research (John Wiley & Sons, 1983), S. Dowdy and S. Wearden present the results of an experiment to measure stress resulting from operating hand-held chain saws. The experimenters measured the kickback angle through which the saw is deflected when it begins to cut a 3-inch stock synthetic board. Shown below are deflection angles for five saws chosen at random from each of four different manufacturers. Is there any evidence that the manufacturers' products differ with respect to kickback angle?

Manufacturer    Kickback Angle
A               42  17  40  39  43
B               28  24  44  32  61
C               57  50  48  41  54
D               29  45  22  34  30

Chapter 17

Statistical Quality Control and Reliability Engineering
The quality of the products and services used by our society has become a major consumer decision factor in many, if not most, businesses today. Regardless of whether the consumer is an individual, a corporation, a military defense program, or a retail store, the consumer is likely to consider quality of equal importance to cost and schedule. Consequently, quality improvement has become a major concern of many U.S. corporations. This chapter is about statistical quality control and reliability engineering methods, two sets of tools that are essential in quality-improvement activities.

17-1 QUALITY IMPROVEMENT AND STATISTICS


Quality means fitness for use. For example, we may purchase automobiles that we expect to be free of manufacturing defects and that should provide reliable and economical transportation, a retailer buys finished goods with the expectation that they are properly packaged and arranged for easy storage and display, or a manufacturer buys raw material and expects to process it with minimal rework or scrap. In other words, all consumers expect that the products and services they buy will meet their requirements, and those requirements define fitness for use.
Quality, or fitness for use, is determined through the interaction of quality of design and quality of conformance. By quality of design we mean the different grades or levels of performance, reliability, serviceability, and function that are the result of deliberate engineering and management decisions. By quality of conformance, we mean the systematic reduction of variability and elimination of defects until every unit produced is identical and defect free.
There is some confusion in our society about quality improvement; some people still think that it means gold plating a product or spending more money to develop a product or process. This thinking is wrong. Quality improvement means the systematic elimination of waste. Examples of waste include scrap and rework in manufacturing, inspection and test, errors on documents (such as engineering drawings, checks, purchase orders, and plans), customer complaint hotlines, warranty costs, and the time required to do things over again that could have been done right the first time. A successful quality-improvement effort can eliminate much of this waste and lead to lower costs, higher productivity, increased customer satisfaction, increased business reputation, higher market share, and ultimately higher profits for the company.


Statistical methods play a vital role in quality improvement. Some applications include the following:
1. In product design and development, statistical methods, including designed experiments, can be used to compare different materials and different components or ingredients, and to help in both system and component tolerance determination. This can significantly lower development costs and reduce development time.
2. Statistical methods can be used to determine the capability of a manufacturing process. Statistical process control can be used to systematically improve a process by reduction of variability.
3. Experiment design methods can be used to investigate improvements in the process. These improvements can lead to higher yields and lower manufacturing costs.
4. Life testing provides reliability and other performance data about the product. This can lead to new and improved designs and products that have longer useful lives and lower operating and maintenance costs.
Some of these applications have been illustrated in earlier chapters of this book. It is essential that engineers and managers have an in-depth understanding of these statistical tools in any industry or business that wants to be the high-quality, low-cost producer. In this chapter we give an introduction to the basic methods of statistical quality control and reliability engineering that, along with experimental design, form the basis of a successful quality-improvement effort.

17-2 STATISTICAL QUALITY CONTROL


The field of statistical quality control can be broadly defined as consisting of those statistical and engineering methods useful in the measurement, monitoring, control, and improvement of quality. In this chapter, a somewhat more narrow definition is employed. We will define statistical quality control as the statistical and engineering methods for process control.
Statistical quality control is a relatively new field, dating back to the 1920s. Dr. Walter A. Shewhart of the Bell Telephone Laboratories was one of the early pioneers of the field. In 1924, he wrote a memorandum showing a modern control chart, one of the basic tools of statistical process control. Harold F. Dodge and Harry G. Romig, two other Bell System employees, provided much of the leadership in the development of statistically based sampling and inspection methods. The work of these three men forms the basis of the modern field of statistical quality control. World War II saw the widespread introduction of these methods to U.S. industry. Dr. W. Edwards Deming and Dr. Joseph M. Juran have been instrumental in spreading statistical quality-control methods since World War II.
The Japanese have been particularly successful in deploying statistical quality-control methods and have used statistical methods to gain significant advantage relative to their competitors. In the 1970s, American industry suffered extensively from Japanese (and other foreign) competition, and that has led, in turn, to renewed interest in statistical quality-control methods in the United States. Much of this interest focuses on statistical process control and experimental design. Many U.S. companies have begun extensive programs to implement these methods into their manufacturing, engineering, and other business organizations.

17-3 STATISTICAL PROCESS CONTROL


It is impossible to inspect quality into a product; the product must be built right the first time. This implies that the manufacturing process must be stable or repeatable and capable


of operating with little variability around the target or nominal dimension. Online statistical process controls are powerful tools useful in achieving process stability and improving capability through the reduction of variability.
It is customary to think of statistical process control (SPC) as a set of problem-solving tools that may be applied to any process. The major tools of SPC are the following:
1. Histogram
2. Pareto chart
3. Cause-and-effect diagram
4. Defect-concentration diagram
5. Control chart
6. Scatter diagram
7. Check sheet
While these tools are an important part of SPC, they really constitute only the technical aspect of the subject. SPC is an attitude, a desire of all individuals in the organization for continuous improvement in quality and productivity by the systematic reduction of variability. The control chart is the most powerful of the SPC tools. We now give an introduction to several basic types of control charts.

17-3.1 Introduction to Control Charts


The basic theory of the control chart was developed by Walter Shewhart in the 1920s. To understand how a control chart works, we must first understand Shewhart's theory of variation. Shewhart theorized that all processes, however good, are characterized by a certain amount of variation if we measure with an instrument of sufficient resolution. When this variability is confined to random or chance variation only, the process is said to be in a state of statistical control. However, another situation may exist in which the process variability is also affected by some assignable cause, such as a faulty machine setting, operator error, unsatisfactory raw material, worn machine components, and so on.¹ These assignable causes of variation usually have an adverse effect on product quality, so it is important to have some systematic technique for detecting serious departures from a state of statistical control as soon after they occur as possible. Control charts are principally used for this purpose.
The power of the control chart lies in its ability to distinguish assignable causes from random variation. It is the job of the individual using the control chart to identify the underlying root cause responsible for the out-of-control condition, develop and implement an appropriate corrective action, and then follow up to ensure that the assignable cause has been eliminated from the process. There are three points to remember.
1. A state of statistical control is not a natural state for most processes.
2. The attentive use of control charts will result in the elimination of assignable causes, yielding an in-control process and reduced process variability.
3. The control chart is ineffective without the system to develop and implement corrective actions that attack the root causes of problems. Management and engineering involvement is usually necessary to accomplish this.

¹Sometimes common cause is used instead of "random" or "chance cause," and special cause is used instead of "assignable cause."


We distinguish between control charts for measurements and control charts for attributes, depending on whether the observations on the quality characteristic are measurements or enumeration data. For example, we may choose to measure the diameter of a shaft, say with a micrometer, and utilize these data in conjunction with a control chart for measurements. On the other hand, we may judge each unit of product as either defective or nondefective and use the fraction of defective units found or the total number of defects in conjunction with a control chart for attributes. Obviously, certain products and quality characteristics lend themselves to analysis by either method, and a clear-cut choice between the two methods may be difficult.
A control chart, whether for measurements or attributes, consists of a centerline, corresponding to the average quality at which the process should perform when statistical control is exhibited, and two control limits, called the upper and lower control limits (UCL and LCL). A typical control chart is shown in Fig. 17-1. The control limits are chosen so that values falling between them can be attributed to chance variation, while values falling beyond them can be taken to indicate a lack of statistical control. The general approach consists of periodically taking a random sample from the process, computing some appropriate quantity, and plotting that quantity on the control chart. When a sample value falls outside the control limits, we search for some assignable cause of variation. However, even if a sample value falls between the control limits, a trend or some other systematic pattern may indicate that some action is necessary, usually to avoid more serious trouble. The samples should be selected in such a way that each sample is as homogeneous as possible and at the same time maximizes the opportunity for variation due to an assignable cause to be present. This is usually called the rational subgroup concept. Order of production and source (if more than one source exists) are commonly used bases for obtaining rational subgroups.
The ability to interpret control charts accurately is usually acquired with experience. It is necessary that the user be thoroughly familiar with both the statistical foundation of control charts and the nature of the production process itself.

17-3.2 Control Charts for Measurements


When dealing with a quality characteristic that can be expressed as a measurement, it is customary to exercise control over both the average value of the quality characteristic and its variability. Control over the average quality is exercised by the control chart for means, usually called the $\bar{X}$ chart. Process variability can be controlled by either a range (R) chart or a

Figure 17-1 A typical control chart, showing the centerline, the upper control limit (UCL), and the lower control limit (LCL) plotted against sample number.


standard deviation chart, depending on how the population standard deviation is estimated. We will discuss only the R chart.
Suppose that the process mean and standard deviation, say μ and σ, are known, and, furthermore, that we can assume that the quality characteristic follows the normal distribution. Let $\bar{X}$ be the sample mean based on a random sample of size n from this process. Then the probability is 1 − α that the mean of such random samples will fall between $\mu + Z_{\alpha/2}(\sigma/\sqrt{n})$ and $\mu - Z_{\alpha/2}(\sigma/\sqrt{n})$. Therefore, we could use these two values as the upper and lower control limits, respectively. However, we usually do not know μ and σ and they must be estimated. In addition, we may not be able to make the normality assumption. For these reasons, the probability limit 1 − α is seldom used in practice. Usually $Z_{\alpha/2}$ is replaced by 3, and "3-sigma" control limits are used.
When μ and σ are unknown, we often estimate them on the basis of preliminary samples, taken when the process is thought to be in control. We recommend the use of at least 20-25 preliminary samples. Suppose k preliminary samples are available, each of size n. Typically, n will be 4, 5, or 6; these relatively small sample sizes are widely used and often arise from the construction of rational subgroups. Let the sample mean for the ith sample be $\bar{X}_i$. Then we estimate the mean of the population, μ, by the grand mean

$$\bar{\bar{X}} = \frac{1}{k}\sum_{i=1}^{k}\bar{X}_i. \qquad (17\text{-}1)$$
Thus, we may take $\bar{\bar{X}}$ as the centerline on the $\bar{X}$ control chart.


We may estimate σ from either the standard deviations or the ranges of the k samples. Since it is more frequently used in practice, we confine our discussion to the range method. The sample size is relatively small, so there is little loss in efficiency in estimating σ from the sample ranges. The relationship between the range, R, of a sample from a normal population with known parameters and the standard deviation of that population is needed. Since R is a random variable, the quantity W = R/σ, called the relative range, is also a random variable. The parameters of the distribution of W have been determined for any sample size n. The mean of the distribution of W is called $d_2$, and a table of $d_2$ for various n is given in Table XIII of the Appendix. Let $R_i$ be the range of the ith sample, and let

$$\bar{R} = \frac{1}{k}\sum_{i=1}^{k}R_i \qquad (17\text{-}2)$$

be the average range. Then an estimate of σ would be
$$\hat{\sigma} = \frac{\bar{R}}{d_2}. \qquad (17\text{-}3)$$

Therefore, we may use as our upper and lower control limits for the $\bar{X}$ chart
$$\mathrm{UCL} = \bar{\bar{X}} + \frac{3}{d_2\sqrt{n}}\bar{R}, \qquad \mathrm{LCL} = \bar{\bar{X}} - \frac{3}{d_2\sqrt{n}}\bar{R}. \qquad (17\text{-}4)$$
We note that the quantity
$$A_2 = \frac{3}{d_2\sqrt{n}}$$


is a constant depending on the sample size, so it is possible to rewrite equations 17-4 as
$$\mathrm{UCL} = \bar{\bar{X}} + A_2\bar{R}, \qquad \mathrm{LCL} = \bar{\bar{X}} - A_2\bar{R}. \qquad (17\text{-}5)$$
The constant $A_2$ is tabulated for various sample sizes in Table XIII of the Appendix.
The parameters of the R chart may also be easily determined. The centerline will obviously be $\bar{R}$. To determine the control limits, we need an estimate of $\sigma_R$, the standard deviation of R. Once again, assuming the process is in control, the distribution of the relative range, W, will be useful. The standard deviation of W, say $\sigma_W$, is a function of n, which has been determined. Thus, since
$$R = W\sigma,$$
we may obtain the standard deviation of R as
$$\sigma_R = \sigma_W\sigma.$$
As σ is unknown, we may estimate $\sigma_R$ as
$$\hat{\sigma}_R = \sigma_W\frac{\bar{R}}{d_2},$$

and we would use as the upper and lower control limits on the R chart
$$\mathrm{UCL} = \bar{R} + \frac{3\sigma_W}{d_2}\bar{R}, \qquad \mathrm{LCL} = \bar{R} - \frac{3\sigma_W}{d_2}\bar{R}. \qquad (17\text{-}6)$$
Setting $D_3 = 1 - 3\sigma_W/d_2$ and $D_4 = 1 + 3\sigma_W/d_2$, we may rewrite equation 17-6 as
$$\mathrm{UCL} = D_4\bar{R}, \qquad \mathrm{LCL} = D_3\bar{R}, \qquad (17\text{-}7)$$
where $D_3$ and $D_4$ are tabulated in Table XIII of the Appendix.


When preliminary samples are used to construct limits for control charts, it is customary to treat these limits as trial values. Therefore, the k sample means and ranges should be plotted on the appropriate charts, and any points that exceed the control limits should be investigated. If assignable causes for these points are discovered, they should be eliminated and new limits for the control charts determined. In this way, the process may eventually be brought into statistical control and its inherent capabilities assessed. Other changes in process centering and dispersion may then be contemplated.
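As an illustration of equations 17-1 through 17-7, the short Python sketch below computes trial $\bar{X}$ and R chart limits from preliminary subgroup data. It is a hypothetical helper, not from the text; the subgroup data are made up, and the constants $A_2 = 0.577$, $D_3 = 0$, and $D_4 = 2.115$ are the Table XIII values for subgroups of size n = 5.

```python
import numpy as np

# Shewhart chart constants for subgroups of size n = 5 (Table XIII values)
A2, D3, D4 = 0.577, 0.0, 2.115

def xbar_r_limits(subgroups):
    """Trial X-bar and R chart limits from k preliminary subgroups."""
    data = np.asarray(subgroups, dtype=float)
    xbar = data.mean(axis=1)                   # sample mean of each subgroup
    r = data.max(axis=1) - data.min(axis=1)    # sample range of each subgroup
    xbarbar, rbar = xbar.mean(), r.mean()      # grand mean and average range
    return {"xbar": (xbarbar - A2 * rbar, xbarbar, xbarbar + A2 * rbar),
            "r": (D3 * rbar, rbar, D4 * rbar)}  # (LCL, centerline, UCL)

# Three made-up subgroups of five coded measurements
print(xbar_r_limits([[33, 29, 31, 32, 33],
                     [33, 31, 35, 37, 31],
                     [35, 37, 33, 34, 36]]))
```

Any subgroup mean or range falling outside the computed limits would then be investigated for an assignable cause, exactly as described above.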

Example 17-1
A component part for a jet aircraft engine is manufactured by an investment casting process. The vane opening on this casting is an important functional parameter of the part. We will illustrate the use of $\bar{X}$ and R control charts to assess the statistical stability of this process. Table 17-1 presents 20 samples of five parts each. The values given in the table have been coded by using the last three digits of the dimension; that is, 31.6 should be 0.50316 inch.
The quantities $\bar{\bar{X}} = 33.33$ and $\bar{R} = 5.85$ are shown at the foot of Table 17-1. Notice that even though $\bar{X}$, $\bar{\bar{X}}$, R, and $\bar{R}$ are now realizations of random variables, we have still written them as uppercase letters.

Table 17-1 Vane-Opening Measurements
[Twenty samples of five coded measurements each, x1 through x5, together with the sample mean and range of each sample; the column totals give $\bar{\bar{X}} = 33.33$ and $\bar{R} = 5.85$.]

This is the usual convention in quality control, and it will always be clear from the context what the notation implies. The trial control limits are, for the $\bar{X}$ chart,
$$\bar{\bar{X}} \pm A_2\bar{R} = 33.33 \pm (0.577)(5.85) = 33.33 \pm 3.37,$$
or
$$\mathrm{UCL} = 36.70, \qquad \mathrm{LCL} = 29.96.$$
For the R chart, the trial control limits are
$$\mathrm{UCL} = D_4\bar{R} = (2.115)(5.85) = 12.37, \qquad \mathrm{LCL} = D_3\bar{R} = (0)(5.85) = 0.$$

The $\bar{X}$ and R control charts with these trial control limits are shown in Fig. 17-2. Notice that samples 6, 8, 11, and 19 are out of control on the $\bar{X}$ chart, and that sample 9 is out of control on the R chart. Suppose that all of these assignable causes can be traced to a defective tool in the wax-molding area. We should discard these five samples and recompute the limits for the $\bar{X}$ and R charts. These new revised limits are, for the $\bar{X}$ chart,
$$\mathrm{UCL} = \bar{\bar{X}} + A_2\bar{R} = 32.90 + (0.577)(5.313) = 35.96,$$
$$\mathrm{LCL} = \bar{\bar{X}} - A_2\bar{R} = 32.90 - (0.577)(5.313) = 29.84,$$
and, for the R chart, they are
$$\mathrm{UCL} = D_4\bar{R} = (2.115)(5.067) = 10.71, \qquad \mathrm{LCL} = D_3\bar{R} = (0)(5.067) = 0.$$


Figure 17-2 The $\bar{X}$ and R control charts for vane opening, trial limits ($\bar{X}$ chart: UCL = 36.70, mean = 33.33, LCL = 29.96; R chart: UCL = 12.37, LCL = 0).

The revised control charts are shown in Fig. 17-3. Notice that we have treated the first 20 preliminary samples as estimation data with which to establish control limits. These limits can now be used to judge the statistical control of future production. As each new sample becomes available, the values of $\bar{X}$ and R should be computed and plotted on the control charts. It may be desirable to revise the limits periodically, even if the process remains stable. The limits should always be revised when process improvements are made.

Estimating Process Capability


It is usually necessary to obtain some information about the capability of the process, that is, about the performance of the process when it is operating in control. Two graphical tools, the tolerance chart (or tier chart) and the histogram, are helpful in assessing process capability. The tolerance chart for all 20 samples from the vane manufacturing process is shown in Fig. 17-4. The specifications on vane opening, 0.5030 ± 0.001 inch, are also shown on the chart. In terms of the coded data, the upper specification limit is USL = 40 and the lower specification limit is LSL = 20. The tolerance chart is useful in revealing patterns over time in the individual measurements, or it may show that a particular value of $\bar{X}$ or R was produced by one or two unusual observations in the sample. For example, note the two unusual observations in sample 9 and the single unusual observation in sample 8.

Figure 17-3 The $\bar{X}$ and R control charts for vane opening, revised limits ($\bar{X}$ chart: UCL = 35.96, LCL = 29.84; R chart: UCL = 10.71, $\bar{R}$ = 5.067, LCL = 0; points not used in computing the control limits are marked).

Figure 17-4 Tolerance diagram of vane openings (USL = 40, nominal dimension = 30, LSL = 20).


Note also that it is appropriate to plot the specification limits on the tolerance chart, since it is a chart of individual measurements. It is never appropriate to plot specification limits on a control chart, or to use the specifications in determining the control limits. Specification limits and control limits are unrelated. Finally, note from Fig. 17-4 that the process is running off center from the nominal dimension of 0.5030 inch.
The histogram for the vane opening measurements is shown in Fig. 17-5. The observations from samples 6, 8, 9, 11, and 19 have been deleted from this histogram. The general impression from examining this histogram is that the process is capable of meeting the specifications, but that it is running off center.
Another way to express process capability is in terms of the process capability ratio (PCR), defined as
$$\mathrm{PCR} = \frac{\mathrm{USL} - \mathrm{LSL}}{6\sigma}. \qquad (17\text{-}8)$$

Notice that the 6σ spread (3σ on either side of the mean) is sometimes called the basic capability of the process. The limits 3σ on either side of the process mean are sometimes called natural tolerance limits, as these represent limits that an in-control process should meet with most of the units produced. For the vane opening, we could estimate σ as
$$\hat{\sigma} = \frac{\bar{R}}{d_2} = \frac{\bar{R}}{2.326} = 2.06.$$
Therefore, an estimator for the PCR is
$$\widehat{\mathrm{PCR}} = \frac{\mathrm{USL} - \mathrm{LSL}}{6\hat{\sigma}} = \frac{40 - 20}{6(2.06)} = 1.62.$$
Figure 17-5 Histogram of the vane-opening measurements (LSL = 20, nominal dimension = 30, USL = 40).


The PCR has a natural interpretation: (1/PCR)100 is just the percentage of the tolerance band used by the process. Thus, the vane opening process uses approximately (1/1.62)100 = 61.7% of the tolerance band.
Figure 17-6a shows a process for which the PCR exceeds unity. Since the process natural tolerance limits lie inside the specifications, very few defective or nonconforming units will be produced. If PCR = 1, as shown in Fig. 17-6b, more nonconforming units result. In fact, for a normally distributed process, if PCR = 1, the fraction nonconforming is 0.27%, or 2700 parts per million. Finally, when the PCR is less than unity, as in Fig. 17-6c, the process is very yield sensitive and a large number of nonconforming units will be produced.
The definition of the PCR given in equation 17-8 implicitly assumes that the process is centered at the nominal dimension. If the process is running off center, its actual capability will be less than that indicated by the PCR. It is convenient to think of PCR as a measure of potential capability, that is, capability with a centered process. If the process is not centered, then a measure of actual capability is given by
$$\mathrm{PCR}_k = \min\!\left[\frac{\mathrm{USL} - \bar{\bar{X}}}{3\sigma},\; \frac{\bar{\bar{X}} - \mathrm{LSL}}{3\sigma}\right]. \qquad (17\text{-}9)$$

Figure 17-6 Process fallout and the process capability ratio (PCR): (a) PCR > 1, with the natural tolerance limits inside the specifications; (b) PCR = 1; (c) PCR < 1, with nonconforming units produced at both specification limits.


In effect, $\mathrm{PCR}_k$ is a one-sided process capability ratio that is calculated relative to the specification limit nearest to the process mean. For the vane opening process, we find that
$$\mathrm{PCR}_k = \min\!\left[\frac{\mathrm{USL} - \bar{\bar{X}}}{3\sigma},\; \frac{\bar{\bar{X}} - \mathrm{LSL}}{3\sigma}\right] = \min\!\left[\frac{40 - 33.19}{3(2.06)} = 1.10,\; \frac{33.19 - 20}{3(2.06)} = 2.13\right] = 1.10.$$
Note that if $\mathrm{PCR} = \mathrm{PCR}_k$, the process is centered at the nominal dimension. Since $\mathrm{PCR}_k = 1.10$ for the vane opening process, and PCR = 1.62, the process is obviously running off center, as was first noted in Figs. 17-4 and 17-5. This off-center operation was ultimately traced to an oversized wax tool. Changing the tooling resulted in a substantial improvement in the process.
Montgomery (2001) provides guidelines on appropriate values of the PCR and a table relating fallout for a normally distributed process in statistical control as a function of PCR. Many U.S. companies use PCR = 1.33 as a minimum acceptable target and PCR = 1.66 as a minimum target for strength, safety, or critical characteristics. Also, some U.S. companies, particularly in the automobile industry, have adopted the Japanese terminology $C_p = \mathrm{PCR}$ and $C_{pk} = \mathrm{PCR}_k$. As $C_p$ has another meaning in statistics (in multiple regression; see Chapter 15), we prefer the traditional notation PCR and $\mathrm{PCR}_k$.
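The two capability ratios are simple enough to check numerically; the sketch below (a hypothetical helper, not from the text) reproduces the vane-opening values computed above:

```python
def pcr(usl, lsl, sigma):
    # Potential capability: assumes the process is centered (equation 17-8)
    return (usl - lsl) / (6.0 * sigma)

def pcr_k(usl, lsl, mean, sigma):
    # Actual capability relative to the nearer specification limit (equation 17-9)
    return min((usl - mean) / (3.0 * sigma), (mean - lsl) / (3.0 * sigma))

# Vane-opening process: USL = 40, LSL = 20, estimated mean 33.19, sigma-hat 2.06
print(round(pcr(40, 20, 2.06), 2))            # 1.62
print(round(pcr_k(40, 20, 33.19, 2.06), 2))   # 1.1
```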

17-3.3 Control Charts for Individual Measurements


Many situations exist in which the sample consists of a single observation; that is, n = 1. These situations occur when production is very slow or costly and it is impractical to allow the sample size to be greater than one. Other cases include processes where every observation can be measured due to automated inspection, for example. The Shewhart control chart for individual measurements is appropriate for this type of situation. We will see later in this chapter that the exponentially weighted moving average control chart and the cumulative sum control chart may be more informative than the individual chart.
The Shewhart control chart uses the moving range, MR, of two successive observations for estimating the process variability. The moving range is defined as
$$MR_i = |x_i - x_{i-1}|.$$
For example, for m observations, m − 1 moving ranges are calculated as $MR_2 = |x_2 - x_1|$, $MR_3 = |x_3 - x_2|$, ..., $MR_m = |x_m - x_{m-1}|$. Simultaneous control charts can be established on the individual observations and on the moving range.
The control limits for the individuals control chart are calculated as
$$\mathrm{UCL} = \bar{x} + 3\frac{\overline{MR}}{d_2}, \qquad \text{Centerline} = \bar{x}, \qquad \mathrm{LCL} = \bar{x} - 3\frac{\overline{MR}}{d_2}, \qquad (17\text{-}10)$$
where $\overline{MR}$ is the sample mean of the $MR_i$.


If a moving range of size n = 2 is used, then $d_2 = 1.128$ from Table XIII of the Appendix. The control limits for the moving range control chart are
$$\mathrm{UCL} = D_4\overline{MR}, \qquad \text{Centerline} = \overline{MR}, \qquad \mathrm{LCL} = D_3\overline{MR}. \qquad (17\text{-}11)$$

Example 17-2
Batches of a particular chemical product are selected from a process and the purity measured on each. Data for 15 successive batches have been collected and are given in Table 17-2. The moving ranges of size n = 2 are also displayed in Table 17-2.
To set up the control chart for individuals, we first need the sample average of the 15 purity measurements. This average is found to be $\bar{x} = 0.757$. The average of the moving ranges of two observations is $\overline{MR} = 0.046$. The control limits for the individuals chart with moving ranges of size 2, using the limits in equation 17-10, are
$$\mathrm{UCL} = 0.757 + 3\frac{0.046}{1.128} = 0.879,$$
$$\text{Centerline} = 0.757,$$
$$\mathrm{LCL} = 0.757 - 3\frac{0.046}{1.128} = 0.635.$$
The control limits for the moving-range chart are found using the limits given in equation 17-11:
$$\mathrm{UCL} = 3.267(0.046) = 0.150,$$
$$\text{Centerline} = 0.046,$$
$$\mathrm{LCL} = 0(0.046) = 0.$$

Table 17-2 Purity of Chemical Product

Batch    x_i     Moving Range, MR
1        0.77
2        0.76    0.01
3        0.77    0.01
4        0.72    0.05
5        0.73    0.01
6        0.73    0.00
7        0.85    0.12
8        0.70    0.15
9        0.75    0.05
10       0.74    0.01
11       0.75    0.01
12       0.84    0.09
13       0.79    0.05
14       0.72    0.07
15       0.74    0.02
         $\bar{x} = 0.757$    $\overline{MR} = 0.046$


Figure 17-7 Control charts for (a) the individual observations (UCL = 0.879, mean = 0.757, LCL = 0.635) and (b) the moving range (UCL = 0.150, $\overline{MR}$ = 0.046, LCL = 0) of purity.

The control charts for individual observations and for the moving range are provided in Fig. 17-7. Since there are no points beyond the control limits, the process appears to be in statistical control.
The individuals chart can be interpreted much like the $\bar{X}$ control chart. An out-of-control situation would be indicated by either a point (or points) plotting beyond the control limits or a pattern such as a run on one side of the centerline.
The moving range chart cannot be interpreted in the same way. Although a point (or points) plotting beyond the control limits would likely indicate an out-of-control situation, a pattern or run on one side of the centerline is not necessarily an indication that the process is out of control. This is due to the fact that the moving ranges are correlated, and this correlation may naturally cause patterns or trends on the chart.
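The individuals and moving-range limits of Example 17-2 are easy to verify numerically. The sketch below (a hypothetical helper, not from the text) recomputes them from the Table 17-2 purity data; small differences from the text arise only because the text rounds $\bar{x}$ and $\overline{MR}$ before computing the limits:

```python
import numpy as np

d2, D4 = 1.128, 3.267   # moving-range constants for n = 2 (Table XIII values)

purity = [0.77, 0.76, 0.77, 0.72, 0.73, 0.73, 0.85, 0.70,
          0.75, 0.74, 0.75, 0.84, 0.79, 0.72, 0.74]

x = np.array(purity)
mr = np.abs(np.diff(x))          # moving ranges of two successive batches
xbar, mrbar = x.mean(), mr.mean()

ucl_x = xbar + 3 * mrbar / d2    # individuals chart (equation 17-10)
lcl_x = xbar - 3 * mrbar / d2
ucl_mr = D4 * mrbar              # moving-range chart (equation 17-11); D3 = 0

print(round(xbar, 3), round(mrbar, 3))    # 0.757 0.046
print(round(ucl_x, 3), round(lcl_x, 3))   # ~0.881 and ~0.634 (text: 0.879, 0.635)
print(round(ucl_mr, 3))                   # ~0.152 (text: 0.150)
```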

17-3.4 Control Charts for Attributes


The p Chart (Fraction Defective or Nonconforming)

Often it is desirable to classify a product as either defective or nondefective on the basis of comparison with a standard. This is usually done to achieve economy and simplicity in the inspection operation. For example, the diameter of a ball bearing may be checked by determining whether it will pass through a gauge consisting of circular holes cut in a template. This would be much simpler than measuring the diameter with a micrometer. Control charts for attributes are used in these situations. However, attribute control charts require a considerably larger sample size than do their measurements counterparts. We will discuss the fraction-defective chart, or p chart, and two charts for defects, the c and u charts. Note that it is possible for a unit to have many defects, and be either defective or nondefective. In some applications a unit can have several defects, yet be classified as nondefective.


Suppose D is the number of defective units in a random sample of size n. We assume that D is a binomial random variable with unknown parameter p. The sample fraction defective is an estimator of p, that is,
$$\hat{p} = \frac{D}{n}. \qquad (17\text{-}12)$$
Furthermore, the variance of the statistic $\hat{p}$ is
$$\sigma_{\hat{p}}^2 = \frac{p(1-p)}{n},$$
so we may estimate $\sigma_{\hat{p}}^2$ as
$$\hat{\sigma}_{\hat{p}}^2 = \frac{\hat{p}(1-\hat{p})}{n}. \qquad (17\text{-}13)$$

The centerline and control limits for the fraction-defective control chart may now be easily determined. Suppose k preliminary samples are available, each of size n, and $D_i$ is the number of defectives in the ith sample. Then we may take
$$\bar{p} = \frac{\sum_{i=1}^{k} D_i}{kn} \qquad (17\text{-}14)$$
as the centerline and
$$\mathrm{UCL} = \bar{p} + 3\sqrt{\frac{\bar{p}(1-\bar{p})}{n}}, \qquad \mathrm{LCL} = \bar{p} - 3\sqrt{\frac{\bar{p}(1-\bar{p})}{n}} \qquad (17\text{-}15)$$
as the upper and lower control limits, respectively. These control limits are based on the normal approximation to the binomial distribution. When p is small, the normal approximation may not always be adequate. In such cases, it is best to use control limits obtained directly from a table of binomial probabilities or, perhaps, from the Poisson approximation to the binomial distribution. If $\bar{p}$ is small, the lower control limit may be a negative number. If this should occur, it is customary to consider zero as the lower control limit.
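A direct computation of equations 17-14 and 17-15 is sketched below (a hypothetical helper, not from the text), using the ceramic-substrate defect counts of Example 17-3:

```python
import math

def p_chart_limits(defectives, n):
    """Trial p-chart parameters from k preliminary samples of common size n."""
    k = len(defectives)
    pbar = sum(defectives) / (k * n)                 # equation 17-14
    width = 3 * math.sqrt(pbar * (1 - pbar) / n)     # equation 17-15
    return max(0.0, pbar - width), pbar, pbar + width

# Numbers of defectives in 20 samples of n = 100 substrates (Table 17-3)
d = [44, 48, 32, 50, 29, 31, 46, 52, 42, 44,
     38, 46, 38, 26, 36, 52, 35, 41, 30, 30]
lcl, pbar, ucl = p_chart_limits(d, n=100)
print(round(lcl, 4), pbar, round(ucl, 4))   # 0.2483 0.395 0.5417
```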

Example 17-3
Suppose we wish to construct a fraction-defective control chart for a ceramic substrate production line. We have 20 preliminary samples, each of size 100; the numbers of defectives in each sample are shown in Table 17-3. Assume that the samples are numbered in the sequence of production. Note that $\bar{p} = 790/2000 = 0.395$, and therefore the trial parameters for the control chart are
$$\text{Centerline} = 0.395,$$
$$\mathrm{UCL} = 0.395 + 3\sqrt{\frac{(0.395)(0.605)}{100}} = 0.5417,$$
$$\mathrm{LCL} = 0.395 - 3\sqrt{\frac{(0.395)(0.605)}{100}} = 0.2483.$$

522

Chapter 17 Statistical QUality Control and Reliability Engineering


Table 17-3 Number of Defectives in Samples of 100 Ceramic Substrates
[Twenty samples of size n = 100; the numbers of defectives observed are 44, 48, 32, 50, 29, 31, 46, 52, 42, 44, 38, 46, 38, 26, 36, 52, 35, 41, 30, and 30, for a total of 790 defectives.]

The control chart is shown in Fig. 17-8. All samples are in control. If they were not, we would search for assignable causes of variation and revise the limits accordingly.
Although this process exhibits statistical control, its capability ($\bar{p} = 0.395$) is very poor. We should take appropriate steps to investigate the process to determine why such a large number of defective units are being produced. Defective units should be analyzed to determine the specific types of defects present. Once the defect types are known, process changes should be investigated to determine their impact on defect levels. Designed experiments may be useful in this regard.

Example 17-4 Attributes Versus Measurement Control Charts
The advantage of measurement control charts relative to the p chart with respect to size of sample may be easily illustrated. Suppose that a normally distributed quality characteristic has a standard deviation of 4 and specification limits of 52 and 68. The process is centered at 60, which results in a fraction defective of 0.0454. Let the process mean shift to 56. Now the fraction defective is 0.1601.

Figure 17-8 The p chart for a ceramic substrate (UCL = 0.5417, $\bar{p}$ = 0.395, LCL = 0.2483).

If the probability of detecting the shift on the first sample following the shift is to be 0.50, then the sample size must be such that the lower 3-sigma limit will be at 56. This implies
$$60 - \frac{3(4)}{\sqrt{n}} = 56,$$
whose solution is n = 9. For a p chart, using the normal approximation to the binomial, we must have
$$0.0454 + 3\sqrt{\frac{0.0454(0.9546)}{n}} = 0.1601,$$
whose solution is n = 30. Thus, unless the cost of measurement inspection is more than three times as costly as the attributes inspection, the measurement control chart is cheaper to operate.
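Both sample sizes follow from solving the two display equations above for n; a quick numerical check (hypothetical, not from the text) is:

```python
import math

sigma, mu0, mu1 = 4.0, 60.0, 56.0
p0, p1 = 0.0454, 0.1601

# X-bar chart: lower 3-sigma limit placed at the shifted mean mu1
#   mu0 - 3*sigma/sqrt(n) = mu1  =>  n = (3*sigma/(mu0 - mu1))**2
n_xbar = (3 * sigma / (mu0 - mu1)) ** 2
print(n_xbar)            # 9.0

# p chart: upper 3-sigma limit of p-hat placed at p1
#   p0 + 3*sqrt(p0*(1 - p0)/n) = p1  =>  n = 9*p0*(1 - p0)/(p1 - p0)**2
n_p = 9 * p0 * (1 - p0) / (p1 - p0) ** 2
print(math.ceil(n_p))    # 30
```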

The c Chart (Defects)

In some situations it may be necessary to control the number of defects in a unit of product rather than the fraction defective. In these situations we may use the control chart for defects, or the c chart. Suppose that in the production of cloth it is necessary to control the number of defects per yard, or that in assembling an aircraft wing the number of missing rivets must be controlled. Many defects-per-unit situations can be modeled by the Poisson distribution.
Let c be the number of defects in a unit, where c is a Poisson random variable with parameter α. Now the mean and variance of this distribution are both α. Therefore, if k units are available and $c_i$ is the number of defects in unit i, the centerline of the control chart is
$$\bar{c} = \frac{1}{k}\sum_{i=1}^{k} c_i, \qquad (17\text{-}16)$$
and
$$\mathrm{UCL} = \bar{c} + 3\sqrt{\bar{c}}, \qquad \mathrm{LCL} = \bar{c} - 3\sqrt{\bar{c}} \qquad (17\text{-}17)$$
are the upper and lower control limits, respectively.

Example 17-5
Printed circuit boards are assembled by a combination of manual assembly and automation. A flow solder machine is used to make the mechanical and electrical connections of the leaded components to the board. The boards are run through the flow solder process almost continuously, and every hour five boards are selected and inspected for process-control purposes. The number of defects in each sample of five boards is noted. Results for 20 samples are shown in Table 17-4. Now $\bar{c} = 160/20 = 8$, and therefore
$$\mathrm{UCL} = 8 + 3\sqrt{8} = 16.49,$$
$$\mathrm{LCL} = 8 - 3\sqrt{8} < 0, \text{ set to } 0.$$
From the control chart in Fig. 17-9, we see that the process is in control. However, eight defects per group of five printed circuit boards is too many (about 8/5 = 1.6 defects/board), and the process needs improvement. An investigation needs to be made of the specific types of defects found on the printed circuit boards. This will usually suggest potential avenues for process improvement.


Table 17-4 Number of Defects in Samples of Five Printed Circuit Boards

Sample    No. of Defects    Sample    No. of Defects
1         6                 11        9
2         4                 12        15
3         8                 13        8
4         10                14        10
5         9                 15        8
6         12                16        2
7         16                17        7
8         2                 18        1
9         3                 19        7
10        10                20        13
20
UCL~

16.484

r1
w
"'0
J

10

;;;

.0

E
~

0
Sample number

Figure 17-9 The c chart for defeCts in samples of five prir.red circuit boards.

The u Chart (Defects per Unit)

In some processes it may be preferable to work with the number of defects per unit rather than the total number of defects. Thus, if the sample consists of n units and there are c total defects in the sample, then
$$u = \frac{c}{n}$$
is the average number of defects per unit. A u chart may be constructed for such data. If there are k preliminary samples, each with $u_1, u_2, \ldots, u_k$ defects per unit, then the centerline on the u chart is
$$\bar{u} = \frac{1}{k}\sum_{i=1}^{k} u_i, \qquad (17\text{-}18)$$
and the control limits are given by
$$\mathrm{UCL} = \bar{u} + 3\sqrt{\frac{\bar{u}}{n}}, \qquad \mathrm{LCL} = \bar{u} - 3\sqrt{\frac{\bar{u}}{n}}. \qquad (17\text{-}19)$$


Example 17-6
A u chart may be constructed for the printed circuit board defect data in Example 17-5. Since each sample contains n = 5 printed circuit boards, the values of u for each sample may be calculated as shown in the following display:

Sample    Number of defects, c    Sample size, n    Defects per unit, u
1         6                       5                 1.2
2         4                       5                 0.8
3         8                       5                 1.6
4         10                      5                 2.0
5         9                       5                 1.8
6         12                      5                 2.4
7         16                      5                 3.2
8         2                       5                 0.4
9         3                       5                 0.6
10        10                      5                 2.0
11        9                       5                 1.8
12        15                      5                 3.0
13        8                       5                 1.6
14        10                      5                 2.0
15        8                       5                 1.6
16        2                       5                 0.4
17        7                       5                 1.4
18        1                       5                 0.2
19        7                       5                 1.4
20        13                      5                 2.6
The centerline for the u chart is
$$\bar{u} = \frac{1}{20}\sum_{i=1}^{20} u_i = \frac{32}{20} = 1.6,$$
and the upper and lower control limits are
$$\mathrm{UCL} = \bar{u} + 3\sqrt{\frac{\bar{u}}{n}} = 1.6 + 3\sqrt{\frac{1.6}{5}} = 3.3,$$
$$\mathrm{LCL} = \bar{u} - 3\sqrt{\frac{\bar{u}}{n}} = 1.6 - 3\sqrt{\frac{1.6}{5}} < 0, \text{ set to } 0.$$
The control chart is plotted in Fig. 17-10. Notice that the u chart in this example is equivalent to the c chart in Fig. 17-9. In some cases, particularly when the sample size is not constant, the u chart will be preferable to the c chart. For a discussion of variable sample sizes on control charts, see Montgomery (2001).

17-3.5 CUSUM and EWMA Control Charts


Up to this point in Chapter 17 we have presented the most basic of control charts, the Shewhart control charts. A major disadvantage of these control charts is their insensitivity to small shifts in the process (often shifts of less than 1.5σ). This disadvantage is due to the fact that the Shewhart charts use information only from the current observation.


Figure 17-10 The u chart of defects per unit on printed circuit boards, Example 17-6.
Alternatives to Shewhart control charts include the cumulative sum control chart and the exponentially weighted moving average control chart. These control charts are more sensitive to small shifts in the process because they incorporate information from current and recent past observations.

Tabular CUSUM Control Charts for the Process Mean


The cumulative sum (CUSUM) control chart was first introduced by Page (1954) and incorporates information from a sequence of sample observations. The chart plots the cumulative sums of deviations of the observations from a target value. To illustrate, let $\bar{x}_j$ represent the jth sample mean, let $\mu_0$ represent the target value for the process mean, and say the sample size is n ≥ 1. The CUSUM control chart plots the quantity
$$C_i = \sum_{j=1}^{i}(\bar{x}_j - \mu_0) \qquad (17\text{-}20)$$

against the sample i. The quantity $C_i$ is the cumulative sum up to and including the ith sample. As long as the process is in control at the target value $\mu_0$, then $C_i$ in equation 17-20 represents a random walk with a mean of zero. On the other hand, if the process shifts away from the target mean, then either an upward or downward drift in $C_i$ will be evident. By incorporating information from a sequence of observations, the CUSUM chart is able to detect a small shift in the process more quickly than a standard Shewhart chart. The CUSUM charts can be easily implemented for both subgroup data and individual observations. We will present the tabular CUSUM for individual observations.
The tabular CUSUM involves two statistics, $C_i^+$ and $C_i^-$, which are the accumulation of deviations above and below the target mean, respectively. $C_i^+$ is called the one-sided upper CUSUM and $C_i^-$ is called the one-sided lower CUSUM. The statistics are computed as follows:
$$C_i^+ = \max[0,\, x_i - (\mu_0 + K) + C_{i-1}^+], \qquad (17\text{-}21)$$
$$C_i^- = \max[0,\, (\mu_0 - K) - x_i + C_{i-1}^-], \qquad (17\text{-}22)$$
with initial values of $C_0^+ = C_0^- = 0$. The constant K is referred to as the reference value and is often chosen approximately halfway between the target mean, $\mu_0$, and the out-of-control


mean that we are interested in detecting, denoted $\mu_1$. In other words, K is half of the magnitude of the shift from $\mu_0$ to $\mu_1$, or
$$K = \frac{|\mu_1 - \mu_0|}{2}.$$
The statistics given in equations 17-21 and 17-22 accumulate the deviations from target that are larger than K and reset to zero when either quantity becomes negative. The CUSUM control chart plots the values of $C_i^+$ and $C_i^-$ for each sample. If either statistic plots beyond the decision interval, H, the process is considered out of control. We will discuss the choice of H later in this chapter, but a good rule of thumb is often H = 5σ.

Example 17-7
A study presented in Food Control (2001, p. 119) gives the results of measuring the dry-matter content in buttercream from a batch process. One goal of the study is to monitor the amount of dry matter from batch to batch. Table 17-5 displays some data that may be typical of this type of process. The reported values, $x_i$, are percentage of dry-matter content examined after mixing. The target amount of dry-matter content is 45%, and assume that σ = 0.84%. Let us also assume that we are interested in detecting a shift in the process mean of at least 1σ; that is, $\mu_1 = \mu_0 + 1\sigma = 45 + 1(0.84) = 45.84\%$. We will use the

Table 17-5 CUSUM Calculations for Example 17-7

Batch, i    x_i      x_i - 45.42    C_i^+    44.58 - x_i    C_i^-
1           46.21    0.79           0.79     -1.63          0
2           45.73    0.31           1.10     -1.15          0
3           44.37    -1.05          0.05     0.21           0.21
4           44.19    -1.23          0        0.39           0.60
5           43.73    -1.69          0        0.85           1.45
6           45.66    0.24           0.24     -1.08          0.37
7           44.24    -1.18          0        0.34           0.71
8           44.48    -0.94          0        0.10           0.81
9           46.04    0.62           0.62     -1.46          0
10          44.04    -1.38          0        0.54           0.54
11          42.96    -2.46          0        1.62           2.16
12          46.02    0.60           0.60     -1.44          0.72
13          44.82    -0.60          0        -0.24          0.48
14          45.02    -0.40          0        -0.44          0.04
15          45.77    0.35           0.35     -1.19          0
16          47.40    1.98           2.33     -2.82          0
17          47.55    2.13           4.46     -2.97          0
18          46.64    1.22           5.68     -2.06          0
19          46.31    0.89           6.57     -1.73          0
20          44.82    -0.60          5.97     -0.24          0
21          45.39    -0.03          5.94     -0.81          0
22          47.80    2.38           8.32     -3.22          0
23          46.69    1.27           9.59     -2.11          0
24          46.99    1.57           11.16    -2.41          0
25          44.53    -0.89          10.27    0.05           0.05



recommended decision i!'lterval of H = 5(j~ 5(0.84) = 4.2. The reference valu~ K. is found to be

The va!ues of C; and

c-; are given ir. Table 17-5. To illustrate the calculations, consider the first two

sample batches, Recall that


J.lo : : :; 45, we have

C;::::::. CO = 0, and using

c:

equations 17-21 and 17~22 ",itb. K:::::; 0.42 and

~=(O,,,,-(45

M2)+C;J

",,-maxlO,xj ~45,A.2't"C;]
and
C: =max[O, (45 -0.42) -xl

+ C;i

=max(O,4458-x, +CQJ,
Fcrbatth l,x\ =46,21,

=max[O, 4<5.21-45,42 +OJ


~ maz[O,

0,79]

=0,79,

and
C; =max[O, 4458 - 4<5.21 + OJ

= max[O, -L63J
=0,

For batch 2,-<" = 45.73,

c; = l.l3.X[O, 45.73 - 45.42 + 0.79]


= max[O, 1.10]
= 1.10,
and

C; = max[O, 44.58 - 45.73+ 0]


:: max[O, -1.15J
=0.

The CUSL"'M calcJlations give::. in Table 17~5 indicate rhat the upper-sided CUSUM for batch 17 is
C77 = 4.46, which CAceeds the dedsion value of H 4,2. Therefore, the process appears to have
shifted. out of concrol. The CUSUM status chart created using ~1initab withH =4.2 is given in Fig.
17-1 L The out-of-control control sh:uation is also evident On this cha.'t at batch 17.

The CUSUM control chart is a powerful quality tool for detecting a process that has shifted from the target process mean. The correct choices of H and K can greatly improve the sensitivity of the control chart while protecting against the occurrence of false alarms (the process is actually in control, but the control chart signals out of control). Design recommendations for the CUSUM will be provided later in this chapter when the concept of average run length is introduced.

Figure 17-11 CUSUM status chart for Example 17-7 (decision interval H = 4.2).
We have presented the upper and lower CUSUM control charts for situations in which a shift in either direction away from the process target is of interest. There are many instances when we may be interested in a shift in only one direction, either upward or downward. One-sided CUSUM charts can be constructed for these situations. For a thorough development of these charts and more details, see Montgomery (2001).
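The tabular CUSUM of equations 17-21 and 17-22 is straightforward to program; the sketch below (a hypothetical helper, not from the text) reproduces the Example 17-7 calculations and signals at batch 17:

```python
def tabular_cusum(xs, mu0, k, h):
    """Tabular CUSUM (equations 17-21 and 17-22).

    Returns the (C+, C-) path and the index of the first sample
    beyond the decision interval h, or None if there is no signal."""
    c_plus = c_minus = 0.0
    path, signal = [], None
    for i, x in enumerate(xs, start=1):
        c_plus = max(0.0, x - (mu0 + k) + c_plus)
        c_minus = max(0.0, (mu0 - k) - x + c_minus)
        path.append((round(c_plus, 2), round(c_minus, 2)))
        if signal is None and (c_plus > h or c_minus > h):
            signal = i
    return path, signal

# Dry-matter data of Table 17-5; mu0 = 45, K = 0.42, H = 4.2
x = [46.21, 45.73, 44.37, 44.19, 43.73, 45.66, 44.24, 44.48, 46.04, 44.04,
     42.96, 46.02, 44.82, 45.02, 45.77, 47.40, 47.55, 46.64, 46.31, 44.82,
     45.39, 47.80, 46.69, 46.99, 44.53]
path, signal = tabular_cusum(x, mu0=45.0, k=0.42, h=4.2)
print(path[0], path[1], signal)   # (0.79, 0.0) (1.1, 0.0) 17
```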

EWMA Control Charts


The exponentially weighted moving average (EWMA) control chart is also a good alternative to the Shewhart control chart when detecting a small shift in the process mean is of interest. We will present the EWMA for individual measurements, although the procedure can also be modified for subgroups of size n > 1.
The EWMA control chart was first introduced by Roberts (1959). The EWMA is defined as
$$z_i = \lambda x_i + (1 - \lambda)z_{i-1}, \qquad (17\text{-}23)$$
where λ is a weight, 0 < λ ≤ 1. The procedure to be presented is initialized with $z_0 = \mu_0$, the process target mean. If a target mean is unknown, then the average of preliminary data, $\bar{x}$, is used as the initial value of the EWMA. The definition given in equation 17-23 demonstrates that information from past observations is incorporated into the current value of $z_i$. The value $z_i$ is a weighted average of all previous sample means. To illustrate, we can replace $z_{i-1}$ on the right-hand side of equation 17-23 to obtain
$$z_i = \lambda x_i + (1 - \lambda)[\lambda x_{i-1} + (1 - \lambda)z_{i-2}] = \lambda x_i + \lambda(1 - \lambda)x_{i-1} + (1 - \lambda)^2 z_{i-2}.$$


By recursively replacing $z_{i-j}$, j = 1, 2, ..., i, we find
$$z_i = \lambda\sum_{j=0}^{i-1}(1 - \lambda)^j x_{i-j} + (1 - \lambda)^i z_0.$$


The EWMA can be thought of as a weighted average of all past and current observations. Note that the weights decrease geometrically with the age of the observation, giving less weight to observations that occurred early in the process. The EWMA is often used in forecasting, but the EWMA control chart has been used extensively for monitoring many types of processes.
If the observations are independent random variables with variance σ², the variance of the EWMA, $z_i$, is
$$\sigma_{z_i}^2 = \sigma^2\left(\frac{\lambda}{2-\lambda}\right)\left[1 - (1-\lambda)^{2i}\right].$$
Given a target mean, $\mu_0$, and the variance of the EWMA, the upper control limit, centerline, and lower control limit for the EWMA control chart are
$$\mathrm{UCL} = \mu_0 + L\sigma\sqrt{\left(\frac{\lambda}{2-\lambda}\right)\left[1 - (1-\lambda)^{2i}\right]},$$
$$\text{Centerline} = \mu_0,$$
$$\mathrm{LCL} = \mu_0 - L\sigma\sqrt{\left(\frac{\lambda}{2-\lambda}\right)\left[1 - (1-\lambda)^{2i}\right]},$$
where L is the width of the control limits. Note that the term $1 - (1-\lambda)^{2i}$ approaches 1 as i increases. Therefore, as the process continues running, the control limits for the EWMA approach the steady-state values
$$\mathrm{UCL} = \mu_0 + L\sigma\sqrt{\frac{\lambda}{2-\lambda}}, \qquad \mathrm{LCL} = \mu_0 - L\sigma\sqrt{\frac{\lambda}{2-\lambda}}. \qquad (17\text{-}24)$$
Although the control limits given in equation 17-24 provide good approximations, it is recommended that the exact limits be used for small values of i.

Example 17-8
We will now implement the EWMA control chart with λ = 0.2 and L = 2.7 for the dry-matter content data provided in Table 17-5. Recall that the target mean is $\mu_0 = 45\%$ and the process standard deviation is assumed to be σ = 0.84%. The EWMA calculations are provided in Table 17-6. To demonstrate some of the calculations, consider the first observation, $x_1 = 46.21$. We find
$$z_1 = \lambda x_1 + (1 - \lambda)z_0 = (0.2)(46.21) + (0.80)(45) = 45.24.$$
The second EWMA value is then
$$z_2 = \lambda x_2 + (1 - \lambda)z_1 = (0.2)(45.73) + (0.80)(45.24) = 45.34.$$


Table 17-6 EWMA Calculations for Example 17-8

Batch, i    x_i      z_i      UCL      LCL
1           46.21    45.24    45.45    44.55
2           45.73    45.34    45.58    44.42
3           44.37    45.15    45.65    44.35
4           44.19    44.95    45.69    44.31
5           43.73    44.71    45.71    44.29
6           45.66    44.90    45.73    44.27
7           44.24    44.77    45.74    44.26
8           44.48    44.71    45.75    44.25
9           46.04    44.98    45.75    44.25
10          44.04    44.79    45.75    44.25
11          42.96    44.42    45.75    44.25
12          46.02    44.74    45.75    44.25
13          44.83    44.76    45.75    44.25
14          45.02    44.81    45.76    44.24
15          45.77    45.00    45.76    44.24
16          47.40    45.48    45.76    44.24
17          47.55    45.90    45.76    44.24
18          46.64    46.04    45.76    44.24
19          46.31    46.10    45.76    44.24
20          44.82    45.84    45.76    44.24
21          45.39    45.75    45.76    44.24
22          47.80    46.16    45.76    44.24
23          46.69    46.27    45.76    44.24
24          46.99    46.41    45.76    44.24
25          44.53    46.04    45.76    44.24

The EWMA values are plotted on a control chart along with the upper and lower control limits given by
$$\mathrm{UCL} = \mu_0 + L\sigma\sqrt{\left(\frac{\lambda}{2-\lambda}\right)\left[1 - (1-\lambda)^{2i}\right]} = 45 + 2.7(0.84)\sqrt{\left(\frac{0.2}{2-0.2}\right)\left[1 - (1-0.2)^{2i}\right]},$$
$$\mathrm{LCL} = \mu_0 - L\sigma\sqrt{\left(\frac{\lambda}{2-\lambda}\right)\left[1 - (1-\lambda)^{2i}\right]} = 45 - 2.7(0.84)\sqrt{\left(\frac{0.2}{2-0.2}\right)\left[1 - (1-0.2)^{2i}\right]}.$$
Therefore, for i = 1,
$$\mathrm{UCL} = 45 + 2.7(0.84)\sqrt{\left(\frac{0.2}{1.8}\right)\left[1 - (0.8)^2\right]} = 45.45$$
and
$$\mathrm{LCL} = 45 - 2.7(0.84)\sqrt{\left(\frac{0.2}{1.8}\right)\left[1 - (0.8)^2\right]} = 44.55.$$


Figure 17-12 EWMA chart for Example 17-8 (UCL = 45.76, mean = 45, LCL = 44.24).

The remaining control limits are calculated similarly and plotted on the control chart given in Fig. 17-12. The control limits tend to increase as i increases, but then tend to the steady-state values given by the equations in 17-24:
$$\mathrm{UCL} = \mu_0 + L\sigma\sqrt{\frac{\lambda}{2-\lambda}} = 45 + 2.7(0.84)\sqrt{\frac{0.2}{2-0.2}} = 45.76,$$
$$\mathrm{LCL} = \mu_0 - L\sigma\sqrt{\frac{\lambda}{2-\lambda}} = 45 - 2.7(0.84)\sqrt{\frac{0.2}{2-0.2}} = 44.24.$$
The EWMA control chart signals at observation 17, indicating that the process is out of control.

The sensitivity of the EWMA control chart for a particular process will depend on the choices of L and λ. Various choices of these parameters will be presented later in this chapter, when the concept of the average run length is introduced. For more details and developments regarding the EWMA, see Crowder (1987), Lucas and Saccucci (1990), and Montgomery (2001).

17-3.6 Average Run Length


In this chapter we have presented control-charting techniques for a variety of situations and made some recommendations about the design of the control charts. In this section, we will present the average run length (ARL) of a control chart. The ARL can be used to assess the performance of the control chart or to determine the appropriate values of various parameters for the control charts presented in this chapter.


The ARL is the expected number of samples taken before a control chart signals out of control. In general, the ARL is
$$\mathrm{ARL} = \frac{1}{p},$$
where p is the probability of any point exceeding the control limits. If the process is in control and the control chart signals out of control, then we say that a false alarm has occurred. To illustrate, consider the $\bar{X}$ control chart with the standard 3σ limits. For this situation, p = 0.0027 is the probability that a single point falls outside the limits when the process is in control. The in-control ARL for the $\bar{X}$ control chart is
$$\mathrm{ARL} = \frac{1}{p} = \frac{1}{0.0027} = 370.$$
In other words, even if the process remains in control, we should expect, on the average, an out-of-control signal (or false alarm) every 370 samples. In general, if the process is actually in control, then we desire a large value of the ARL. More formally, we can define the in-control ARL as
$$\mathrm{ARL}_0 = \frac{1}{\alpha},$$
where α is the probability that a sample point plots beyond the control limits.
If, on the other hand, the process is out of control, then a small ARL value is desirable. A small value of the ARL indicates that the control chart will signal out of control soon after the process has shifted. The out-of-control ARL is
$$\mathrm{ARL}_1 = \frac{1}{1-\beta},$$
where β is the probability of not detecting a shift on the first sample after a shift has occurred. To illustrate, consider the $\bar{X}$ control chart with 3σ limits. Assume the target or in-control mean is $\mu_0$ and that the process has shifted to an out-of-control mean given by $\mu_1 = \mu_0 + k\sigma$. The probability of not detecting this shift is given by
$$\beta = P[\mathrm{LCL} \leq \bar{X} \leq \mathrm{UCL} \mid \mu = \mu_1].$$

That is, β is the probability that the next sample mean plots in control, when in fact the process has shifted out of control. Since $\bar{X} \sim N(\mu, \sigma^2/n)$, $\mathrm{LCL} = \mu_0 - L\sigma/\sqrt{n}$, and $\mathrm{UCL} = \mu_0 + L\sigma/\sqrt{n}$, we can rewrite β as
$$\beta = P\!\left[\mu_0 - L\sigma/\sqrt{n} \leq \bar{X} \leq \mu_0 + L\sigma/\sqrt{n} \mid \mu = \mu_1\right]$$
$$= P\!\left[\frac{(\mu_0 - L\sigma/\sqrt{n}) - \mu_1}{\sigma/\sqrt{n}} \leq Z \leq \frac{(\mu_0 + L\sigma/\sqrt{n}) - \mu_1}{\sigma/\sqrt{n}}\right]$$
$$= P\!\left[\frac{(\mu_0 - L\sigma/\sqrt{n}) - (\mu_0 + k\sigma)}{\sigma/\sqrt{n}} \leq Z \leq \frac{(\mu_0 + L\sigma/\sqrt{n}) - (\mu_0 + k\sigma)}{\sigma/\sqrt{n}}\right]$$
$$= P\!\left[-L - k\sqrt{n} \leq Z \leq L - k\sqrt{n}\right],$$
where Z is a standard normal random variable. If we let Φ denote the standard normal cumulative distribution function, then
$$\beta = \Phi(L - k\sqrt{n}) - \Phi(-L - k\sqrt{n}).$$


From this, 1 − β is the probability that a shift in the process is detected on the first sample after the shift has occurred. That is, the process has shifted and a point exceeds the control limits, signaling that the process is out of control. Therefore, $\mathrm{ARL}_1$ is the expected number of samples observed before a shift is detected.
The ARLs have been used to evaluate and design control charts for variables and for attributes. For more discussion on the use of ARLs for these charts, see Montgomery (2001).
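The β expression above makes ARL computations routine; the following sketch (a hypothetical helper, not from the text) evaluates the in-control ARL and the ARL for detecting a shift of kσ on an $\bar{X}$ chart:

```python
from math import sqrt
from statistics import NormalDist

def arl_xbar(k, n, L=3.0):
    """ARL of an X-bar chart with L-sigma limits for a shift of k sigma."""
    Phi = NormalDist().cdf
    beta = Phi(L - k * sqrt(n)) - Phi(-L - k * sqrt(n))  # probability of no signal
    return 1.0 / (1.0 - beta)

print(round(arl_xbar(k=0.0, n=5)))      # 370: the in-control ARL with 3-sigma limits
print(round(arl_xbar(k=1.0, n=5), 1))   # about 4.5 samples to detect a 1-sigma shift
```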
ARLs for the CUSUM and EWMA Control Charts
Earlier in this chapter, we presented the CUSUM and EWMA control charts. The ARL can be used to specify some of the parameter values needed to design these control charts.
To implement the tabular CUSUM control chart, values of the decision interval, H, and the reference value, K, must be chosen. Recall that H and K are multiples of the process standard deviation, specifically H = hσ and K = kσ, where k = 1/2 is often used as a standard. The proper selection of these values is important. The ARL is one criterion that can be used to determine the values of H and K. As stated previously, a large value of the ARL when the process is in control is desirable. Therefore, we can set $\mathrm{ARL}_0$ to an acceptable level and determine h and k accordingly. In addition, we would want the control chart to quickly detect a shift in the process mean. This would require values of h and k such that the values of $\mathrm{ARL}_1$ are quite small. To illustrate, Montgomery (2001) provides the ARL for a CUSUM control chart with h = 5 and k = 1/2. These values are given in Table 17-7. The in-control average run length, $\mathrm{ARL}_0$, is 465. If a small shift, say 0.50σ, is important to detect, then with h = 5 and k = 1/2, we would expect to detect this shift within 38 samples (on the average) after the shift has occurred. Hawkins (1993) presents a table of h and k values that will result in an in-control average run length of $\mathrm{ARL}_0 = 370$. The values are reproduced in Table 17-8.
Design of the EWMA control chart can also be based on the ARLs. Recall that the design parameters of the EWMA control chart are the multiple of the standard deviation, L, and the value of the weighting factor, λ. The values of these design parameters can be chosen so that the ARL performance of the control charts is satisfactory.
Several authors discuss the ARL performance of the EWMA control chart, including Crowder (1987) and Lucas and Saccucci (1990). Lucas and Saccucci (1990) provide the ARL performance for several combinations of L and λ. The results are reproduced in Table 17-9. Again, it is desirable to have a large value of the in-control ARL and small values of out-of-control ARLs. To illustrate, if L = 2.8 and λ = 0.10 are used, we would expect $\mathrm{ARL}_0 \approx 500$, while the ARL for detecting a shift of 0.5σ is $\mathrm{ARL}_1 \approx 31.3$.

Table 17-7 Tabular CUSUM Performance with h = 5 and k = 1/2

Shift in Mean (multiples of σ)   0     0.25   0.50   0.75   1.00   1.50   2.00   2.50   3.00   4.00
ARL                              465   139    38.0   17.0   10.4   5.75   4.01   3.11   2.57   2.01

Table 17-8 Values of h and k Resulting in ARL0 = 370 (Hawkins 1993)

k   0.25   0.50   0.75   1.0    1.25   1.5
h   8.01   4.77   3.34   2.52   1.99   1.61


Table 17-9 ARLs for Various EWMA Control Schemes (Lucas and Saccucci 1990)

Shift in Mean     L = 3.054   L = 2.998   L = 2.962   L = 2.814   L = 2.615
(multiple of σ)   λ = 0.40    λ = 0.25    λ = 0.20    λ = 0.10    λ = 0.05
0                 500         500         500         500         500
0.25              224         170         150         106         84.1
0.50              71.2        48.2        41.8        31.3        28.8
0.75              28.4        20.1        18.2        15.9        16.4
1.00              14.3        11.1        10.5        10.3        11.4
1.50              5.9         5.5         5.5         6.1         7.1
2.00              3.5         3.6         3.7         4.4         5.2
2.50              2.5         2.7         2.9         3.4         4.2
3.00              2.0         2.3         2.4         2.9         3.5
4.00              1.4         1.7         1.9         2.2         2.9

To detect smaller shifts in the process mean, it is found that small values of λ should be used. Note that for L = 3.0 and λ = 1.0, the EWMA reduces to the standard Shewhart control chart with 3-sigma limits.
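As a computational companion to Table 17-9, the following sketch implements the EWMA recursion z_i = λx_i + (1 − λ)z_{i−1} together with the usual time-varying control limits; the data stream, mu0, and sigma below are illustrative assumptions, not values from the text.

```python
# A minimal sketch of an EWMA chart built from the design parameters L and lambda.
import numpy as np

def ewma_chart(x, mu0, sigma, lam=0.10, L=2.814):
    z = mu0
    for i, xi in enumerate(x, start=1):
        z = lam * xi + (1 - lam) * z                  # EWMA recursion
        hw = L * sigma * np.sqrt(lam / (2 - lam)
                                 * (1 - (1 - lam) ** (2 * i)))
        yield z, mu0 - hw, mu0 + hw                   # (z_i, LCL_i, UCL_i)

x = np.random.default_rng(1).normal(10.0, 1.0, 20)    # simulated in-control data
for z, lcl, ucl in ewma_chart(x, mu0=10.0, sigma=1.0):
    print(f"z = {z:6.3f}   limits = ({lcl:6.3f}, {ucl:6.3f})")
```

Setting lam = 1.0 in this sketch collapses the limits to the fixed Shewhart limits, matching the remark above.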
Cautions in the Use of ARLs
Although the ARL provides valuable information for designing and evaluating control schemes, there are drawbacks to relying on the ARL as a design criterion. It should be noted that the run length follows a geometric distribution, since it represents the number of samples before a "success" occurs (a success being a point falling beyond the control limits). One drawback is that the standard deviation of the run length is quite large. Second, because the distribution of the run length is geometric, the mean of the distribution (the ARL) may not be a reliable estimate of the true run length.
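A two-line calculation makes the first drawback concrete: if each point signals independently with probability p, the run length is geometric with mean 1/p and standard deviation nearly as large as the mean. The value p = 1/370 below corresponds to the usual 3-sigma in-control case and is used only for illustration.

```python
# Run length N is geometric: E[N] = 1/p, sd(N) = sqrt(1 - p)/p.
p = 1.0 / 370.0            # in-control signal probability for 3-sigma limits
arl = 1.0 / p
sd = (1.0 - p) ** 0.5 / p
print(arl, sd)             # 370.0 vs. about 369.5
```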

17-3.7 Other SPC Problem-Solving Tools


While the control chart is a very powerful tool for investigating the causes of variation in a process, it is most effective when used with other SPC problem-solving tools. In this section we illustrate some of these tools, using the printed circuit board defect data in Example 17-5.
Figure 17-9 shows a c chart for the number of defects in samples of five printed circuit boards. The chart exhibits statistical control, but the number of defects must be reduced, as the average number of defects per board is 8/5 = 1.6, and this level of defects would require extensive rework.
The first step in solving this problem is to construct a Pareto diagram of the individual defect types. The Pareto diagram, shown in Fig. 17-13, indicates that insufficient solder and solder balls are the most frequently occurring defects, accounting for (109/160)100 = 68% of the observed defects. Furthermore, the first five defect categories on the Pareto chart are all solder-related defects. This points to the flow solder process as a potential opportunity for improvement.
To improve the flow solder process, a team consisting of the flow solder operator, the shop supervisor, the manufacturing engineer responsible for the process, and a quality engineer meets to study potential causes of solder defects. They conduct a brainstorming session and produce the cause-and-effect diagram shown in Fig. 17-14. The cause-and-effect


Figure 17-13 Pareto diagram for printed circuit board defects.

diagram is widely used to clearly display the various potential causes of defects in products and their interrelationships. It is useful in summarizing knowledge about the process.
As a result of the brainstorming session, the team tentatively identifies the following variables as potentially influential in creating solder defects:

1. Flux specific gravity
2. Solder temperature
3. Conveyor speed
4. Conveyor angle
5. Solder wave height
6. Preheat temperature
7. Pallet loading method

A statistically designed experiment could be used to investigate the effect of these seven variables on solder defects. Also, the team constructed a defect concentration diagram for the product. A defect concentration diagram is just a sketch or drawing of the product, with the most frequently occurring defects shown on the part. This diagram is used to determine whether defects occur in the same location on the part. The defect concentration diagram for the printed circuit board is shown in Fig. 17-15. This diagram indicates that most of the insufficient solder defects are near the front edge of the board, where it makes initial contact with the solder wave. Further investigation showed that one of the pallets used to carry the boards across the wave was bent, causing the front edge of the board to make poor contact with the solder wave.

Figure 17-14 Cause-and-effect diagram for the printed circuit board flow solder process.

Figure 17-15 Defect concentration diagram for a printed circuit board.

When the defective pallet was replaced, a designed experiment was used to investigate the seven variables discussed earlier. The results of this experiment indicated that several of these factors were influential and could be adjusted to reduce solder defects. After the results of the experiment were implemented, the percentage of solder joints requiring rework was reduced from 1% to under 100 parts per million (0.01%).

17-4 RELIABILITY ENGINEERING


One of "'Ie challenging endeavors of the past three decades has been the design and development of large-scale systems for space exploration, new generations of commercial and
military aircraft, and complex electromechanical products such as office copiers and computers. The perlormance of these systems. and the consequences of their failure, is of vital
concern, For example, the military community has historically placed strong emphasis on
equipment :reliability. This empbasis stems largely from increasing ratios of Illamtenance
cost to procurement costs and the strategic and tacticallinplications of system failure, In the


area of consumer product manufacture, high reliability has come to be expected as much as conformance to other important quality characteristics.
Reliability engineering encompasses several activities, one of which is reliability modeling. Essentially, the system survival probability is expressed as a function of subsystem or component reliabilities (survival probabilities). Usually, these models are time dependent, but there are some situations where this is not the case. A second important activity is that of life testing and reliability estimation.

17-4.1 Basic Reliability Definitions


Let us consider a component that has just been manufactured. It is to be operated at a stated stress level, or within some range of stress such as temperature, shock, and so on. The random variable T will be defined as time to failure, and the reliability of the component (or subsystem, or system) at time t is R(t) = P[T > t]. R is called the reliability function. The failure process is usually complex, consisting of at least three types of failures: initial failures, wear-out failures, and those that fail between these. A hypothetical composite distribution of time to failure is shown in Fig. 17-16. This is a mixed distribution, with probability p(0) of failure at time zero and failure density g(t) for t > 0:

F(t) = p(0) + ∫_0^t g(x)dx,   t ≥ 0.    (17-25)

Since for many components (or systems) the initial failures or time-zero failures are removed during testing, the random variable T is conditioned on the event that T > 0, so that the failure density is

f(t) = g(t)/[1 − p(0)],   t > 0,    (17-26)
     = 0   otherwise.

Thus, in terms of f, the reliability function, R, is

R(t) = 1 − F(t) = ∫_t^∞ f(x)dx.    (17-27)

Figure 17-16 A composite failure distribution.

The term interval failure rate denotes the rate of failure on a particular interval of time [t₁, t₂], and the terms failure rate, instantaneous failure rate, and hazard will be used synonymously as a limiting form of the interval failure rate as t₂ → t₁. The interval failure rate FR(t₁, t₂) is as follows:

FR(t₁, t₂) = [(R(t₁) − R(t₂))/R(t₁)] · [1/(t₂ − t₁)].    (17-28)


The first bracketed term is simply

P{failure during [t₁, t₂] | survival to time t₁}.    (17-29)

The second term supplies the dimensional characteristic, so that we may express the conditional probability of equation 17-29 on a per-unit-time basis.
We will now develop the instantaneous failure rate (as a function of t). Let h(t) be the hazard function. Then

h(t) = lim_{Δt→0} [R(t) − R(t + Δt)]/R(t) · (1/Δt)
     = −lim_{Δt→0} [R(t + Δt) − R(t)]/Δt · 1/R(t),

or

h(t) = −R′(t)/R(t) = f(t)/R(t),    (17-30)

since R(t) = 1 − F(t) and −R′(t) = f(t). A typical hazard function is shown in Fig. 17-17. Note that h(t)·dt might be thought of as the instantaneous probability of failure at t, given survival to t.
A useful result is that the reliability function R may be easily expressed in terms of h as

R(t) = e^{−∫_0^t h(x)dx} = e^{−H(t)},    (17-31)

where

H(t) = ∫_0^t h(x)dx.

Equation 17-31 results from the definition given in equation 17-30,

h(t) = −R′(t)/R(t),

Figure 17-17 A typical hazard function (early failures; a middle region of random failures; a final region of wear-out and random failures).


and the integration of both sides:

∫_0^t h(x)dx = −∫_0^t [R′(x)/R(x)]dx = −ln R(x) |_0^t,

so that

∫_0^t h(x)dx = −ln R(t) + ln R(0).

Since F(0) = 0, we see that ln R(0) = 0, and

R(t) = e^{−∫_0^t h(x)dx},

which establishes equation 17-31.

The mean time to failure (MTTF) is

MTTF = E[T] = ∫_0^∞ t f(t)dt.

A useful alternative form is

MTTF = ∫_0^∞ R(t)dt.    (17-32)
Most complex system modeling assumes that only random component failures need be considered. This is equivalent to stating that the time-to-failure distribution is exponential, that is,

f(t) = λe^{−λt},   t ≥ 0,
     = 0   otherwise,

so that

h(t) = f(t)/R(t) = λe^{−λt}/e^{−λt} = λ

is a constant. When all early-age failures have been removed by burn-in, and the time to occurrence of wear-out failures is very great (as with electronic parts), then this assumption is reasonable.
The normal distribution is most generally used to model wear-out failure or stress failure (where the random variable under study is stress level). In situations where most failures are due to wear, the normal distribution may very well be appropriate.
The lognormal distribution has been found to be applicable in describing time to failure for some types of components, and the literature seems to indicate an increased utilization of this density for this purpose.
The Weibull distribution has been extensively used to represent time to failure, and its nature is such that it may be made to approximate closely the observed phenomena. When a system is composed of a number of components and failure is due to the most serious of a large number of defects or possible defects, the Weibull distribution seems to do particularly well as a model.
The gamma distribution frequently results from modeling standby redundancy where components have an exponential time-to-failure distribution. We will investigate standby redundancy in Section 17-4.5.


17-4.2 The Exponential Time-to-Failure Model


In this section we assume that the time-to-failure distribution is exponential; that is, only "random failures" are considered. The density, reliability function, and hazard function are given in equations 17-33 through 17-35 and are shown in Fig. 17-18:

f(t) = λe^{−λt},   t ≥ 0,    (17-33)
     = 0   otherwise,

R(t) = e^{−λt},   t ≥ 0,    (17-34)
     = 0   otherwise,

h(t) = f(t)/R(t) = λ,   t ≥ 0,    (17-35)
     = 0   otherwise.

Figure 17-18 Density, reliability function, and hazard function for the exponential failure model.

The constant hazard function is interpreted to mean that the failure process has no memory; that is,

P[t < T ≤ t + Δt | T > t] = (e^{−λt} − e^{−λ(t+Δt)})/e^{−λt} = 1 − e^{−λΔt},    (17-36)

a quantity that is independent of t. Thus if a component is functioning at time t, it is as good as new. The remaining life has the same density as f.

Example 17-9
A diode used on a printed circuit board has a rated failure rate of 2.3 × 10⁻⁸ failures per hour. However, under an increased temperature stress, it is felt that the rate is about 1.5 × 10⁻⁵ failures per hour. The time to failure is exponentially distributed, so that we have

f(t) = (1.5 × 10⁻⁵)e^{−(1.5×10⁻⁵)t},   t ≥ 0,
     = 0   otherwise,

R(t) = e^{−(1.5×10⁻⁵)t},   t ≥ 0,
     = 0   otherwise,

and

h(t) = 1.5 × 10⁻⁵,   t ≥ 0,
     = 0   otherwise.

To determine the reliability at t = 10⁴ and t = 10⁵, we evaluate R(10⁴) = e^{−0.15} ≈ 0.861 and R(10⁵) = e^{−1.5} ≈ 0.223.
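A minimal check of Example 17-9, assuming nothing beyond the exponential model:

```python
# Reliability of an exponential unit with lambda = 1.5e-5 failures per hour.
from math import exp

lam = 1.5e-5
R = lambda t: exp(-lam * t)
print(R(1e4))   # ~0.861
print(R(1e5))   # ~0.223
```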

17-4.3 Simple Serial Systems


A simple serial system is shown in Fig. 17-19. In order for the system to function, all components must function, and it is assumed that the components function independently. We let T_j be the time to failure for component C_j, for j = 1, 2, ..., n, and let T be the system time to failure. The reliability model is thus

R(t) = P[T > t] = P(T₁ > t) · P(T₂ > t) ··· P(T_n > t),

or

R(t) = R₁(t) · R₂(t) ··· R_n(t),    (17-37)

where R_j(t) = P(T_j > t).
Figure 17-19 A simple serial system.

Example 17-10
Three components must all function for a simple system to function. The random variables T₁, T₂, and T₃, representing time to failure for the components, are independent with the following distributions:

T₁ ~ N(2 × 10³, 4 × 10⁴),
T₂ ~ Weibull(γ = 0, δ = 1, β = 1/7),
T₃ ~ lognormal(μ = 10, σ² = 4).

It follows that

R₁(t) = 1 − Φ((t − 2000)/200),   R₂(t) = e^{−t^{1/7}},   R₃(t) = 1 − Φ((ln t − 10)/2),

so that

R(t) = [1 − Φ((t − 2000)/200)] · e^{−t^{1/7}} · [1 − Φ((ln t − 10)/2)].

For example, if t = 2187 hours, then

R(2187) = [1 − Φ(0.935)][e^{−3}][1 − Φ(−1.154)]
        = [0.175][0.0498][0.876]
        = 0.0076.

For the simple serial system, system reliability may be calculated using the product of the component reliability functions, as demonstrated; however, when all components have an exponential distribution, the calculations are greatly simplified, since

R(t) = e^{−λ₁t} · e^{−λ₂t} ··· e^{−λ_nt} = e^{−(λ₁ + λ₂ + ··· + λ_n)t},

or

R(t) = e^{−λ_st},    (17-38)

where λ_s = Σ_{j=1}^{n} λ_j represents the system failure rate. We also note that the system reliability function is of the same form as the component reliability functions. The system failure rate is simply the sum of the component failure rates, and this makes application very easy.

Example 17-11
Consider an electronic circuit with three integrated circuit devices, 12 silicon diodes, 8 ceramic capacitors, and 15 composition resistors. Suppose that under given stress levels of temperature, shock, and so on, each component has the failure rate shown in the following table, and that the component failures are independent.

                      Failures per Hour
Integrated circuits   0.013 × 10⁻⁷
Diodes                1.7 × 10⁻⁷
Capacitors            1.2 × 10⁻⁷
Resistors             0.61 × 10⁻⁷

Therefore,

λ_s = 3(0.013 × 10⁻⁷) + 12(1.7 × 10⁻⁷) + 8(1.2 × 10⁻⁷) + 15(0.61 × 10⁻⁷)
    = 3.9189 × 10⁻⁶,

and

R(t) = e^{−(3.9189×10⁻⁶)t}.


The circuit mean time to failure is

MTTF = E[T] = 1/λ_s = (1/3.9189) × 10⁶ ≈ 2.55 × 10⁵ hours.

If we wish to determine, say, R(10⁴), we get R(10⁴) = e^{−0.039189} ≈ 0.96.
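The additivity of failure rates makes this a one-line computation. The sketch below reproduces the arithmetic of Example 17-11, using the component counts and rates from the table above.

```python
# A minimal sketch of the serial-system computation in Example 17-11.
from math import exp

counts_and_rates = [(3, 0.013e-7), (12, 1.7e-7), (8, 1.2e-7), (15, 0.61e-7)]
lam_s = sum(n * lam for n, lam in counts_and_rates)   # system failure rate

print(lam_s)                # 3.9189e-06 failures per hour
print(1.0 / lam_s)          # MTTF, about 2.55e5 hours
print(exp(-lam_s * 1e4))    # R(10^4), about 0.96
```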

17-4.4 Simple Active Redundancy


An active redundant configuration is shown in Fig. 17-20. The assembly functions if k or more of the components function (k ≤ n). All components begin operation at time zero; thus the term "active" is used to describe the redundancy. Again, independence is assumed.
A general formulation is not convenient to work with, and in most cases it is unnecessary. When all components have the same reliability function, as is the case when the components are the same type, we set R_j(t) = r(t) for j = 1, 2, ..., n, so that

R(t) = Σ_{j=k}^{n} C(n, j)[r(t)]^j [1 − r(t)]^{n−j}.    (17-39)

Equation 17-39 is derived from the definition of reliability.

Example 17-12
Three identical components are arranged in active redundancy, operating independently. In order for the assembly to function, at least two of the components must function (k = 2). The reliability function for the system is thus

R(t) = 3[r(t)]²[1 − r(t)] + [r(t)]³
     = [r(t)]²[3 − 2r(t)].

It is noted that R is a function of time, through r(t).

Figure 17-20 An active redundant configuration.


When only one of the n components is required, as is often the case, and the components are not identical, we obtain

R(t) = 1 − ∏_{j=1}^{n} [1 − R_j(t)].    (17-40)

The product is the probability that all components fail, and, obviously, if they do not all fail the system survives. When the components are identical and only one is required, equation 17-40 reduces to

R(t) = 1 − [1 − r(t)]^n,    (17-41)

where r(t) = R_j(t), j = 1, 2, ..., n.
When the components have exponential failure laws, we will consider two cases. First, when the components are identical with failure rate λ and at least k components are required for the assembly to operate, equation 17-39 becomes

R(t) = Σ_{j=k}^{n} C(n, j) e^{−jλt}[1 − e^{−λt}]^{n−j}.    (17-42)

The second case is considered for the situation where the components have identical exponential failure densities and where only one component must function for the assembly to function. Using equation 17-41, we get

R(t) = 1 − [1 − e^{−λt}]^n.    (17-43)

Example 17-13
In Example 17-12, where three identical components were arranged in an active redundancy and at least two were required for system operation, we found

R(t) = [r(t)]²[3 − 2r(t)].

If the component reliability functions are

r(t) = e^{−λt},

then

R(t) = e^{−2λt}[3 − 2e^{−λt}] = 3e^{−2λt} − 2e^{−3λt}.

If two components are arranged in an active redundancy as described, and only one must function for the assembly to function, and, furthermore, if the time-to-failure densities are exponential with failure rate λ, then from equation 17-43 we obtain

R(t) = 1 − [1 − e^{−λt}]² = 2e^{−λt} − e^{−2λt}.
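The following sketch evaluates equation 17-39 directly and checks it against the closed form of Example 17-12; the failure rate and mission time are illustrative assumptions.

```python
# A minimal sketch of k-of-n active redundancy for identical components.
from math import comb, exp

def r_k_of_n(r, k, n):
    """System reliability given component reliability r = r(t)."""
    return sum(comb(n, j) * r**j * (1 - r)**(n - j) for j in range(k, n + 1))

lam, t = 0.001, 100.0
r = exp(-lam * t)                  # exponential component reliability
print(r_k_of_n(r, k=2, n=3))       # general formula, equation 17-39
print(r**2 * (3 - 2 * r))          # closed form from Example 17-12
```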

17-4.5 Standby Redundancy


A common form of redundancy, called standby redundancy, is shown in Fig. 17-21. The unit labeled DS is a decision switch that we will assume has reliability 1 for all t. The operating rules are as follows. Component 1 is initially online, and when this component fails, the decision switch switches in component 2, which remains online until it fails.


Figure 17-21 Standby redundancy.

Standby units are not subject to failure until activated. The time to failure for the assembly is

T = T₁ + T₂ + ··· + T_n,

where T_i is the time to failure for the ith component and T₁, T₂, ..., T_n are independent random variables. The most common value for n in practice is two, so the Central Limit Theorem is of little value. However, we know from the properties of linear combinations that

E[T] = Σ_{i=1}^{n} E(T_i)

and

V[T] = Σ_{i=1}^{n} V(T_i).

We must know the distributions of the random variables T_i in order to find the distribution of T. The most common case occurs when the components are identical and the time-to-failure distributions are assumed to be exponential. In this case, T has a gamma distribution

f(t) = [λ/(n − 1)!](λt)^{n−1}e^{−λt},   t > 0,
     = 0   otherwise,

so that the reliability function is

R(t) = Σ_{k=0}^{n−1} e^{−λt}(λt)^k/k!,   t > 0.    (17-44)

The parameter λ is the component failure rate; that is, E(T_i) = 1/λ. The mean time to failure and variance are

MTTF = E[T] = n/λ    (17-45)

and

V[T] = n/λ²,    (17-46)

respectively.


Example 17-14
Two identical components are assembled in a standby redundant configuration with perfect switching. The component lives are identically distributed, independent random variables having an exponential distribution with failure rate 1/100. The mean time to failure is

MTTF = E[T] = 2(100) = 200 hours,

and the variance is

V[T] = 2(100)² = 20,000.

The reliability function R is

R(t) = Σ_{k=0}^{1} e^{−t/100}(t/100)^k/k!,

or

R(t) = e^{−t/100}[1 + t/100].
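The sketch below evaluates equation 17-44 and reproduces Example 17-14, using only the standard library:

```python
# A minimal sketch of standby redundancy with a perfect switch (Erlang/gamma model).
from math import exp, factorial

def r_standby(t, lam, n):
    """Reliability of n identical exponential units in standby, equation 17-44."""
    return sum(exp(-lam * t) * (lam * t)**k / factorial(k) for k in range(n))

# Example 17-14: two units, lambda = 1/100 per hour
print(r_standby(100.0, 0.01, 2))   # e^-1 * (1 + 1), about 0.736
```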

17-4.6 Life Testing


Life tests are conducted for different purposes. Sometimes, n units are placed on test and aged until all or most units have failed; the purpose is to test a hypothesis about the form of the time-to-failure density with certain parameters. Both formal statistical tests and probability plotting are widely used in life testing.
A second objective in life testing is to estimate reliability. Suppose, for example, that a manufacturer is interested in estimating R(1000) for a particular component or system. One approach to this problem would be to place n units on test and count the number of failures, r, occurring before 1000 hours of operation. Failed units are not to be replaced in this example. An estimate of unreliability is p̂ = r/n, and an estimate of reliability is

R̂(1000) = 1 − r/n.    (17-47)

A 100(1 − α)% lower-confidence limit on R(1000) is given by [1 − upper limit on p], where p is the unreliability. This upper limit on p may be determined using a table of the binomial distribution. In the case where n is large, an estimate of the upper limit on p is

p̂ + z_α √(p̂(1 − p̂)/n).    (17-48)

Example 17-15
One hundred units are placed on life test, and the test is run for 1000 hours. There are two failures during the test, so p̂ = 0.02 and R̂(1000) = 0.98. Using a table of the binomial distribution, a 95% upper-confidence limit on p is 0.06, so that a lower limit on R(1000) is given by 0.94.

In recent years, there has been much work on the analysis of failure-time data, including plotting methods for identification of appropriate failure-time models and parameter estimation. For a good summary of this work, refer to Elsayed (1996).
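The sketch below reproduces Example 17-15, substituting an exact (Clopper-Pearson) upper limit on p, computed from a beta quantile with SciPy, for the binomial table lookup; that beta-quantile form is a standard identity rather than something stated in the text.

```python
# A minimal sketch of the binomial reliability estimate of Example 17-15.
from scipy.stats import beta

n, r = 100, 2
p_hat = r / n
p_upper = beta.ppf(0.95, r + 1, n - r)   # exact 95% upper limit on p

print(1 - p_hat)     # R-hat(1000) = 0.98
print(1 - p_upper)   # lower 95% limit on R(1000), about 0.94
```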


17-4.7 Reliability Estimation with a Known Time-to-Failure Distribution


In the case where the form of the reliability function is assumed known and there is only one parameter, the maximum likelihood estimator for R(t) is R̂(t), which is formed by substituting θ̂ for the parameter θ in the expression for R(t), where θ̂ is the maximum likelihood estimator of θ. For more details and results for specific time-to-failure distributions, refer to Elsayed (1996).

17-4.8 Estimation with the Exponential Time-to-Failure Distribution


The most common case for the one-parameter situation is where the time-to-failure distribution is exponential, R(t) = e^{−t/θ}. The parameter θ = E[T] is called the mean time to failure, and the estimator for R is R̂(t), where

R̂(t) = e^{−t/θ̂}

and θ̂ is the maximum likelihood estimator of θ.
Epstein (1960) developed the maximum likelihood estimators for θ under a number of different conditions and, furthermore, showed that a 100(1 − α)% confidence interval on R(t) is given by

[e^{−t/θ_L}, e^{−t/θ_U}]    (17-49)

for the two-sided case, or

[e^{−t/θ_L}, 1]    (17-50)

for the lower, one-sided interval. In these cases, the values θ_L and θ_U are the lower- and upper-confidence limits on θ.
The following symbols will be used:

n = number of units placed on test at t = 0.
Q = total test time in unit hours.
t* = time at which the test is terminated.
r = number of failures accumulated at time t*.
r₀ = preassigned number of failures.
1 − α = confidence level.
χ²_{α,k} = the upper α percentage point of the chi-square distribution with k degrees of freedom.

There are four situations to consider, according to whether the test is stopped after a preassigned time or after a preassigned number of failures, and whether failed items are replaced or not replaced during test.
For the replacement test, the total test time in unit hours is Q = nt*, and for the nonreplacement test

Q = Σ_{i=1}^{r} t_i + (n − r)t*.    (17-51)

If items are censored (withdrawn items that have not failed), and if failures are replaced while censored items are not replaced, then

Q = Σ_{j=1}^{c} t_j + (n − c)t*,    (17-52)


where c represents the number of censored items and t_j is the time of the jth censorship. If neither censored items nor failed items are replaced, then

Q = Σ_{i=1}^{r} t_i + Σ_{j=1}^{c} t_j + (n − r − c)t*.    (17-53)

The development of the maximum likelihood estimator for θ is rather straightforward. In the case where the test is nonreplacement, and the test is discontinued after a fixed number of items have failed, the likelihood function is

L = ∏_{i=1}^{r} f(t_i) · [R(t*)]^{n−r}.    (17-54)

Then

l = ln L = −r ln θ − (1/θ)Σ_{i=1}^{r} t_i − (n − r)t*/θ,

and solving ∂l/∂θ = 0 yields the estimator

θ̂ = [Σ_{i=1}^{r} t_i + (n − r)t*]/r.    (17-55)

It turns out that

θ̂ = Q/r    (17-56)

is the maximum likelihood estimator of θ for all cases considered for the test design and operation.
The quantity 2rθ̂/θ has a chi-square distribution with 2r degrees of freedom in the case where the test is terminated after a fixed number of failures. For a fixed termination time t*, the degrees of freedom becomes 2r + 2.
Since 2rθ̂/θ = 2Q/θ, confidence limits on θ may be expressed as indicated in Table 17-10. The results presented in the table may be used directly with equations 17-49 and 17-50 to establish confidence limits on R(t). It should be noted that this testing procedure does not require that the test be run for the time at which a reliability estimate is required. For example, 100 units may be placed on a nonreplacement test for 200 hours, the parameter θ estimated, and R(1000) calculated. In the case of the binomial testing mentioned earlier, it would have been necessary to run the test for 1000 hours.

Table 17-10 Confidence Limits on θ

Nature of Limit          Fixed Number of Failures, r                 Fixed Termination Time, t*
Two-sided limits         [2Q/χ²_{α/2,2r}, 2Q/χ²_{1−α/2,2r}]          [2Q/χ²_{α/2,2r+2}, 2Q/χ²_{1−α/2,2r}]
Lower, one-sided limit   2Q/χ²_{α,2r}                                2Q/χ²_{α,2r+2}


The results are, however, dependent on the assumption that the distribution is exponential.
It is sometimes necessary to estimate the time t_R for which the reliability will be R. For the exponential model, this estimate is

t̂_R = θ̂ ln(1/R),    (17-57)

and confidence limits on t_R are given in Table 17-11.

Example 17-16
Twenty items are placed on a replacement test that is to be operated until 10 failures occur. The tenth failure occurs at 80 hours, and the reliability engineer wishes to estimate the mean time to failure, 95% two-sided limits on θ, R(100), and 95% two-sided limits on R(100). Finally, she wishes to estimate the time for which the reliability will be 0.8, with point and 95% two-sided confidence interval estimates.
According to equation 17-56 and the results presented in Tables 17-10 and 17-11,

Q = nt* = 20(80) = 1600 unit hours,

θ̂ = Q/r = 1600/10 = 160 hours,

[2Q/χ²_{0.025,20}, 2Q/χ²_{0.975,20}] = [3200/34.17, 3200/9.591] = [93.65, 333.65],

R̂(100) = e^{−100/θ̂} = e^{−100/160} ≈ 0.535.

According to equation 17-49, the confidence interval on R(100) is

[e^{−100/93.65}, e^{−100/333.65}] = [0.344, 0.741].
Also,

t̂₀.₈ = θ̂ ln(1/0.8) = 160(0.223) = 35.7 hours.

The two-sided 95% confidence limits are determined from Table 17-11 as

[2(1600)(0.223)/34.17, 2(1600)(0.223)/9.591] = [20.9, 74.45] hours.

Table 17-11 Confidence Limits on t_R

Nature of Limit          Fixed Number of Failures, r                              Fixed Termination Time, t*
Two-sided limits         [2Q ln(1/R)/χ²_{α/2,2r}, 2Q ln(1/R)/χ²_{1−α/2,2r}]       [2Q ln(1/R)/χ²_{α/2,2r+2}, 2Q ln(1/R)/χ²_{1−α/2,2r}]
Lower, one-sided limit   2Q ln(1/R)/χ²_{α,2r}                                     2Q ln(1/R)/χ²_{α,2r+2}
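The computations of Example 17-16 can be checked with SciPy's chi-square quantiles; note that the text's upper α percentage point χ²_{α,k} corresponds to chi2.ppf(1 − α, k). Everything else below is taken directly from the example.

```python
# A minimal sketch of the exponential life-test estimates of Example 17-16.
from math import exp, log
from scipy.stats import chi2

n, r, t_star = 20, 10, 80.0
Q = n * t_star                        # replacement test: Q = n t*
theta_hat = Q / r                     # 160 hours

lo = 2 * Q / chi2.ppf(0.975, 2 * r)   # 3200 / 34.17
hi = 2 * Q / chi2.ppf(0.025, 2 * r)   # 3200 / 9.591
print(theta_hat, (lo, hi))            # 160, (93.65, 333.65)

print(exp(-100 / theta_hat))              # R-hat(100), about 0.535
print((exp(-100 / lo), exp(-100 / hi)))   # CI on R(100), about (0.344, 0.741)
print(theta_hat * log(1 / 0.8))           # t-hat_0.8, about 35.7 hours
```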


17-4.9 Demonstration and Acceptance Testing


It is not uncommon for a purchaser to test incoming products to assure that the vendor is conforming to reliability specifications. These tests are destructive tests and, in the case of attribute measurement, the test design follows that of acceptance sampling discussed earlier in this chapter.
A special set of sampling plans that assumes an exponential time-to-failure distribution has been presented in a Department of Defense handbook (DOD H-108), and these plans are in wide use.

17-5 SUMMARY
This chapter has presented several widely used methods for statistical quality control. Control charts were introduced and their use as process surveillance devices discussed. The X̄ and R control charts are used for measurement data. When the quality characteristic is an attribute, either the p chart for fraction defective or the c or u chart for defects may be used.
The use of probability as a modeling technique in reliability analysis was also discussed. The exponential distribution is widely used as the distribution of time to failure, although other plausible models include the normal, lognormal, Weibull, and gamma distributions. System reliability analysis methods were presented for serial systems, as well as for systems having active or standby redundancy. Life testing and reliability estimation were also briefly introduced.

17-6 EXERCISES
17-1. An extrusion die is used to produce aluminum rods. The diameter of the rods is a critical quality characteristic. Below are shown x̄ and R values for 20 samples of five rods each. Specifications on the rods are 0.5035 ± 0.0010 inch. The values given are the last three digits of the measurements; that is, 34.2 is read as 0.50342.

Sample   x̄      R      Sample   x̄      R
1        34.2   3      11       35.4   8
2        31.6   4      12       34.0   6
3        31.8   4      13       36.0   4
4        33.4   5      14       37.2   7
5        35.0   2      15       35.2   3
6        32.1   7      16       33.4   10
7        32.6   9      17       35.0   8
8        33.8   10     18       34.4   4
9        34.8   4      19       33.9
10       38.6          20       34.0   4

(a) Set up the X̄ and R charts, revising the trial control limits if necessary, assuming assignable causes can be found.
(b) Calculate PCR and PCR_k. Interpret these ratios.
(c) What percentage of defectives is being produced by this process?
17-2. Suppose a process is in control, and 3-sigma control limits are in use on the X̄ chart. Let the mean shift by 1.5σ. What is the probability that this shift will remain undetected for three consecutive samples? What would this probability be if 2-sigma control limits were used? The sample size is 4.
17-3. Suppose that an X̄ chart is used to control a normally distributed process, and that samples of size n are taken every h hours and plotted on the chart, which has k-sigma limits.
(a) Find the expected number of samples that will be taken until a false action signal is generated. This is called the in-control average run length (ARL).
(b) Suppose that the process shifts to an out-of-control state. Find the expected number of samples that will be taken until an action signal is generated. This is the out-of-control ARL.
(c) Evaluate the in-control ARL for k = 3. How does this change if k = 2? What do you think about the use of 2-sigma limits in practice?
(d) Evaluate the out-of-control ARL for a shift of one sigma, given that n = 5.
17-4. Twenty-five samples of size 5 are drawn from a process at regular intervals, and the following data are obtained:

Σ_{i=1}^{25} x̄_i = 362.75.

(a) Compute the control limits for the X̄ and R charts.
(b) Assuming the process is in control and specification limits are 14.50 ± 0.50, what conclusions can you draw about the ability of the process to operate within these limits? Estimate the percentage of defective items that will be produced.
(c) Calculate PCR and PCR_k. Interpret these ratios.

17-5. Suppose an X̄ chart for a process is in control with 3-sigma limits. Samples of size 5 are drawn every 15 minutes, on the quarter hour. Now suppose the process mean shifts out of control by 1.5σ 10 minutes after the hour. If D is the expected number of defectives produced per quarter hour in this out-of-control state, find the expected loss (in terms of defective units) that results from this control procedure.

17-6. The overall length of a cigar lighter body used in an automobile application is controlled using X̄ and R charts. The following table gives the length for 20 samples of size 4 (measurements are coded from 5.00 mm; that is, 15 is 5.15 mm).

14  10  10   8
10  11  13  12
13  10  16   7
12  11  16   7
14  11  11  11
13  12  10  13
15  12  12  16
 9   8  14  14
 9  14   8   8
10  14  15  16
10  15  10   8
15  13   9  11
10

(a) Set up the X̄ and R charts. Is the process in statistical control?
(b) Specifications are 5.10 ± 0.05 mm. What can you say about process capability?

17-7. Montgomery (2001) presents 30 observations of oxide thickness of individual silicon wafers. The data are

Wafer   Oxide Thickness   Wafer   Oxide Thickness
1       45.4              16      58.4
2       48.6              17      51.0
3       49.5              18      41.2
4       44.0              19      47.1
5       50.9              20      45.7
6       55.2              21      60.6
7       45.5              22      51.0
8       52.8              23      53.0
9       45.3              24      56.0
10      46.3              25      47.2
11      53.9              26      48.0
12      49.8              27      55.9
13      46.9              28      50.0
14      49.8              29      47.9
15      45.1              30      53.4

(a) Construct a normal probability plot of the data. Does the normality assumption seem reasonable?
(b) Set up an individuals control chart for oxide thickness. Interpret the chart.

17-8. A machine is used to fill bottles with a particular brand of vegetable oil. A single bottle is randomly selected every half hour and the weight of the bottle recorded. Experience with the process indicates that the variability is quite stable, with σ = 0.07 oz. The process target is 32 oz. Twenty-four samples have been recorded in a 12-hour time period, with the results given below.

Sample   x       Sample   x
1        32.03   13       31.97
2        31.98   14       32.01
3        32.02   15       31.93
4        31.85   16       32.09
5        31.91   17       31.96
6        32.09   18       31.88
7        31.98   19       31.82
8        32.03   20       31.92
9        31.98   21       31.81
10       31.91   22       31.95
11       32.01   23       31.97
12       32.12   24       31.94

(a) Construct a normal probability plot of the data. Does the normality assumption appear to be satisfied?
(b) Set up an individuals control chart for the weights. Interpret the results.

17-9. The following are the numbers of defective solder joints found during successive samples of 500 solder joints.

Day   No. of Defectives   Day   No. of Defectives
1     106                 12    37
2     116                 13    25
3     164                 14    88
4     89                  15    101
5     99                  16    64
6     40                  17    51
7     112                 18    74
8     36                  19    71
9     69                  20    43
10    74                  21    80
11    42

Construct a fraction-defective control chart. Is the process in control?

17-10. A process is controlled by a p chart using samples of size 100. The centerline on the chart is 0.05. What is the probability that the control chart detects a shift to 0.08 on the first sample following the shift? What is the probability that the shift is detected by at least the third sample following the shift?

17-11. Suppose a p chart with centerline at p̄ and k-sigma limits is used to control a process. There is a critical fraction defective p_c that must be detected with probability 0.50 on the first sample following the shift to this state. Derive a general formula for the sample size that should be used on this chart.

17-12. A normally distributed process uses 66.7% of the specification band. It is centered at the nominal dimension, located halfway between the upper and lower specification limits.
(a) What is the process capability ratio PCR?
(b) What fallout level (fraction defective) is produced?
(c) Suppose the mean shifts to a distance exactly 3 standard deviations below the upper specification limit. What is the value of PCR_k? How has PCR changed?
(d) What is the actual fallout experienced after the shift in the mean?

17-13. Consider a process where specifications on a quality characteristic are 100 ± 15. We know that the standard deviation of this quality characteristic is 5. Where should we center the process to minimize the fraction defective produced? Now suppose the mean shifts to 105 and we are using a sample size of 4 on an X̄ chart. What is the probability that such a shift will be detected on the first sample following the shift? What sample size would be needed on a p chart to obtain a similar degree of protection?

17-14. Suppose the following fraction defectives had been found in successive samples of size 100 (read down):

0.09   0.15   0.12
0.03   0.10   0.14
0.10   0.09   0.06
0.13   0.13   0.05
0.05   0.13   0.14
0.13   0.08   0.07
0.08   0.11   0.11
0.14   0.12   0.06
0.09   0.06   0.09
0.10   0.14   0.09

Is the process in control with respect to its fraction defective?

17-15. The following represent the number of solder defects observed on 24 samples of five printed circuit boards: 7, 6, 8, 10, 24, 6, 5, 4, 8, 11, 15, 8, 4, 11, 16, 12, 8, 6, 5, 9, 7, 14, 8, 21. Can we conclude that the process is in control using a c chart? If not, assume assignable causes can be found and revise the control limits.

17-16. The following represent the number of defects per 1000 feet in rubber-covered wire: 1, 1, 3, 7, 8, 10, 5, 13, 0, 19, 24, 6, 9, 11, 15, 8, 3, 6, 7, 4, 9, 20, 11, 7, 18, 10, 6, 4, 0, 9, 7, 3, 1, 8, 12. Do the data come from a controlled process?

17-17. Suppose the number of defects in a unit is known to be 8. If the number of defects in a unit shifts to 16, what is the probability that it will be detected by the c chart on the first sample following the shift?

17-18. Suppose we are inspecting disk drives for defects per unit, and it is known that there is an average of two defects per unit. We decide to make our inspection unit for the c chart five disk drives, and we control the total number of defects per inspection unit. Describe the new control chart.

17-19. Consider the data in Exercise 17-15. Set up a u chart for this process. Compare it to the c chart in Exercise 17-15.


17-20. Consider the oxide thickness data given in Exercise 17-7. Set up an EWMA control chart with λ = 0.20 and L = 2.962. Interpret the chart.

17-21. Consider the oxide thickness data given in Exercise 17-7. Construct a CUSUM control chart with k = 0.75 and h = 3.34 if the target thickness is 50. Interpret the chart.

17-22. Consider the weights provided in Exercise 17-8. Set up an EWMA control chart with λ = 0.10 and L = 2.7. Interpret the chart.

17-23. Consider the weights provided in Exercise 17-8. Set up a CUSUM control chart with k = 0.50 and h = 4.0. Interpret the chart.
17-24. A time-to-failure distribution is given by a uniform distribution:

f(t) = 1/(β − α),   α ≤ t ≤ β,
     = 0   otherwise.

(a) Determine the reliability function.
(b) Show that

∫_0^∞ R(t)dt = ∫_0^∞ t f(t)dt.

(c) Determine the hazard function.
(d) Show that

R(t) = e^{−H(t)},

where H is defined as in equation 17-31.

17-25. Three units that operate and fail independently form a series configuration, as shown in the figure at the bottom of this page. The time-to-failure distribution for each unit is exponential, with the failure rates indicated.
(a) Find R(60) for the system.
(b) What is the mean time to failure (MTTF) for this system?

17-26. Five identical units are arranged in an active redundancy to form a subsystem. Unit failure is independent, and at least two of the units must survive 1000 hours for the subsystem to perform its mission.
(a) If the units have exponential time-to-failure distributions with failure rate 0.002, what is the subsystem reliability?
(b) What is the reliability if only one unit is required?

17-27. If the units described in the previous exercise are operated in a standby redundancy with a perfect decision switch and only one unit is required for subsystem survival, determine the subsystem reliability.

17-28. One hundred units are placed on test and aged until all units have failed. The following results are obtained, and a mean life of t̄ = 160 hours is calculated from the sample data.

Time Interval     Number of Failures
0-100             50
100-200           18
200-300           17
300-400           8
400-500           4
After 500 hours   3

Use the chi-square goodness-of-fit test to determine whether you consider the exponential distribution to represent a reasonable time-to-failure model for these data.

17-29. Fifty units are placed on a life test for 1000 hours. Eight units fail during the period. Estimate R(1000) for these units. Determine a lower 95% confidence interval on R(1000).

17-30. In Section 17-4.7 it was noted that for one-parameter reliability functions R(t; θ), we have R̂(t) = R(t; θ̂), where θ̂ and R̂ are the maximum likelihood estimators. Prove this statement for the case

R(t; θ) = e^{−t/θ},   t ≥ 0,
        = 0   otherwise.

Hint: Express the density function f in terms of R.

17-31. For a nonreplacement test that is terminated after 200 hours of operation, it is noted that failures occur at the following times: 9, 21, 40, 55, and 85 hours. The units are assumed to have an exponential time-to-failure distribution, and 100 units were on test initially.
(a) Estimate the mean time to failure.
(b) Construct a 95% lower-confidence limit on the mean time to failure.

17-32. Use the statement in Exercise 17-31.
(a) Estimate R(300) and construct a 95% lower-confidence limit on R(300).
(b) Estimate the time for which the reliability will be 0.9, and construct a 95% lower limit on t₀.₉.

Figure for Exercise 17-25: three units in series with failure rates λ₁ = 3 × 10⁻², λ₂ = 6 × 10⁻⁶, λ₃ = 4 × 10⁻².

Chapter 18
Stochastic Processes and Queueing


18-1 INTRODUCTION
The term stochastic process is frequently used in connection with observations from a time-oriented physical process that is controlled by a random mechanism. More precisely, a stochastic process is a sequence of random variables {X_t}, where t ∈ T is a time or sequence index. The range space for X_t may be discrete or continuous; however, in this chapter we will consider only the case where, at a particular time t, the process is in exactly one of m + 1 mutually exclusive and exhaustive states. The states are labeled 0, 1, 2, 3, ..., m.
The variables X₁, X₂, ... might represent the number of customers awaiting service at a ticket booth at times 1 minute, 2 minutes, and so on, after the booth opens. Another example would be daily demands for a certain product on successive days. X₀ represents the initial state of the process.
This chapter will introduce a special type of stochastic process called a Markov process. We will also discuss the Chapman-Kolmogorov equations, various special properties of Markov chains, the birth-death equations, and some applications to waiting-line, or queueing, and interference problems.
In the study of stochastic processes, certain assumptions are required about the joint probability distribution of the random variables X₁, X₂, .... In the case of Bernoulli trials, presented in Chapter 5, recall that these variables were defined to be independent and that the range space (state space) consisted of two values (0, 1). Here we will first consider discrete-time Markov chains, the case where time is discrete and the independence assumption is relaxed to allow for a one-stage dependence.
18-2 DISCRETE-TIME MARKOV CHAINS


A stochastic process exhibits the Markovian property if

P{X_{t+1} = j | X_t = i} = P{X_{t+1} = j | X_t = i, X_{t−1} = i₁, X_{t−2} = i₂, ..., X₀ = i_t}    (18-1)

for t = 0, 1, 2, ..., and every sequence j, i, i₁, ..., i_t. This is equivalent to stating that the probability of an event at time t + 1 given only the outcome at time t is equal to the probability of the event at time t + 1 given the entire state history of the system. In other words, the probability of the event at t + 1 is not dependent on the state history prior to time t.
The conditional probabilities

P{X_{t+1} = j | X_t = i} = p_ij    (18-2)

are called one-step transition probabilities, and they are said to be stationary if

P{X_{t+1} = j | X_t = i} = P{X₁ = j | X₀ = i}   for t = 0, 1, 2, ...,    (18-3)


so that the transition probabilities remain unchanged through time. These values may be displayed in a matrix P = [p_ij], called the one-step transition matrix. The matrix P has m + 1 rows and m + 1 columns, and

0 ≤ p_ij ≤ 1,   i, j = 0, 1, 2, ..., m,

while

Σ_{j=0}^{m} p_ij = 1,   i = 0, 1, 2, ..., m.

That is, each element of the P matrix is a probability, and each row of the matrix sums to one.
The existence of the one-step, stationary transition probabilities implies that

p_ij^{(n)} = P{X_{t+n} = j | X_t = i} = P{X_n = j | X₀ = i}    (18-4)

for all t = 0, 1, 2, .... The values p_ij^{(n)} are called n-step transition probabilities, and they may be displayed in an n-step transition matrix

P^{(n)} = [p_ij^{(n)}],   n = 0, 1, 2, ...,

where

0 ≤ p_ij^{(n)} ≤ 1,   i, j = 0, 1, 2, ..., m,   n = 0, 1, 2, ...,

and

Σ_{j=0}^{m} p_ij^{(n)} = 1,   i = 0, 1, 2, ..., m,   n = 0, 1, 2, ....
The 0-step transition matrix is the identity matrix.
A finite-state Markov chain is defined as a stochastic process having a finite number of states, the Markovian property, stationary transition probabilities, and an initial set of probabilities A = [a₀^{(0)}, a₁^{(0)}, ..., a_m^{(0)}], where a_i^{(0)} = P{X₀ = i}.
The Chapman-Kolmogorov equations are useful in computing n-step transition probabilities. These equations are

p_ij^{(n)} = Σ_{l=0}^{m} p_il^{(v)} p_lj^{(n−v)},   i = 0, 1, 2, ..., m,   j = 0, 1, 2, ..., m,   0 ≤ v ≤ n,    (18-5)

and they indicate that in passing from state i to state j in n steps the process will be in some state, say l, after exactly v steps (v ≤ n). Therefore p_il^{(v)} p_lj^{(n−v)} is the conditional probability that, given state i as the starting state, the process goes to state l in v steps and from l to j in (n − v) steps. When summed over l, the sum of the products yields p_ij^{(n)}.
By setting v = 1 or v = n − 1, we obtain

p_ij^{(n)} = Σ_{l=0}^{m} p_il p_lj^{(n−1)} = Σ_{l=0}^{m} p_il^{(n−1)} p_lj,   i, j = 0, 1, 2, ..., m,   n = 1, 2, ....

It follows that the n-step transition probabilities, p_ij^{(n)}, may be obtained from the one-step probabilities, and

P^{(n)} = Pⁿ.    (18-6)

The unconditional probability of being in state j at time t = n is

a_j^{(n)} = Σ_{i=0}^{m} a_i^{(0)} p_ij^{(n)},   j = 0, 1, 2, ..., m,   n = 1, 2, ...,    (18-7)

where A^{(n)} = [a₀^{(n)}, a₁^{(n)}, ..., a_m^{(n)}]. Thus, A^{(n)} = A·P^{(n)}. Further, we note that the rule for matrix multiplication applies the total probability law of Theorem 1-8, so that A^{(n)} = A^{(n−1)}·P.

Example 18-1
In a computing system, the probability of an error on each cycle depends on whether or not it was preceded by an error. We will define 0 as the error state and 1 as the nonerror state. Suppose the probability of an error if preceded by an error is 0.75, the probability of an error if preceded by a nonerror is 0.50, the probability of a nonerror if preceded by an error is 0.25, and the probability of a nonerror if preceded by a nonerror is 0.50. Thus,

P = [0.75  0.25]
    [0.50  0.50].

Two-step, three-step, ..., seven-step transition matrices are shown below:

P^{(2)} = [0.688  0.312]   P^{(3)} = [0.672  0.328]   P^{(4)} = [0.668  0.332]
          [0.625  0.375],            [0.656  0.344],            [0.664  0.336],

P^{(5)} = [0.667  0.333]   P^{(6)} = [0.667  0.333]   P^{(7)} = [0.667  0.333]
          [0.666  0.334],            [0.667  0.333],            [0.667  0.333].

If we know that initially the system is in the nonerror state, then a₁^{(0)} = 1, a₀^{(0)} = 0, and A^{(n)} = [a_j^{(n)}] = A·P^{(n)}. Thus, for example, A^{(7)} = [0.667, 0.333].
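The arithmetic of Example 18-1 is easy to reproduce with NumPy, since equation 18-6 reduces everything to matrix powers; the sketch below is illustrative only.

```python
# A minimal sketch reproducing Example 18-1: P^(n) and A(n) = A P^n.
import numpy as np

P = np.array([[0.75, 0.25],
              [0.50, 0.50]])
A = np.array([0.0, 1.0])           # start in the nonerror state 1

for n in range(2, 8):
    Pn = np.linalg.matrix_power(P, n)
    print(n, Pn.round(3), (A @ Pn).round(3))
```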

18-3 CLASSIFICATION OF STATES AND CHAINS

We will first consider the notion of first passage times. The length of time (number of steps in discrete-time systems) for the process to go from state i to state j for the first time is called the first passage time. If i = j, then this is the number of steps needed for the process to return to state i for the first time, and this is termed the first return time or recurrence time for state i.
First passage times under certain conditions are random variables with an associated probability distribution. We let f_ij^{(n)} denote the probability that the first passage time from state i to j is equal to n, where it can be shown directly from Theorem 1-5 that

f_ij^{(1)} = p_ij^{(1)} = p_ij,

f_ij^{(2)} = p_ij^{(2)} − f_ij^{(1)} p_jj,

   ⋮

f_ij^{(n)} = p_ij^{(n)} − f_ij^{(1)} p_jj^{(n−1)} − f_ij^{(2)} p_jj^{(n−2)} − ··· − f_ij^{(n−1)} p_jj.    (18-8)


Thus, recursive computation from the one-step transition probabilities yields the probability that the first passage time is n for given i, j.

Example 18-2
Using the one-step transition probabilities presented in Example 18-1, the distribution of the passage time index n for i = 0, j = 1 is determined as

f₀₁^{(1)} = p₀₁ = 0.250,
f₀₁^{(2)} = (0.312) − (0.25)(0.5) = 0.187,
f₀₁^{(3)} = (0.328) − (0.25)(0.375) − (0.187)(0.5) = 0.141,
f₀₁^{(4)} = (0.332) − (0.25)(0.344) − (0.187)(0.375) − (0.141)(0.5) = 0.105.

There are four such distributions, corresponding to the i, j values (0, 0), (0, 1), (1, 0), (1, 1).
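The recursion of equation 18-8 is equally direct in code. The sketch below recomputes the four probabilities of Example 18-2; small differences in the third decimal place reflect rounding in the text's intermediate values.

```python
# A minimal sketch of the first-passage recursion of equation 18-8.
import numpy as np

P = np.array([[0.75, 0.25],
              [0.50, 0.50]])

def first_passage(P, i, j, N):
    f = []
    for n in range(1, N + 1):
        Pn = np.linalg.matrix_power(P, n)
        fn = Pn[i, j] - sum(f[v - 1] * np.linalg.matrix_power(P, n - v)[j, j]
                            for v in range(1, n))
        f.append(fn)
    return f

print([round(x, 3) for x in first_passage(P, 0, 1, 4)])
# [0.25, 0.188, 0.141, 0.105]
```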

If i and j are fixed, then Σ_{n=1}^{∞} f_ij^{(n)} ≤ 1. When the sum is equal to one, the values f_ij^{(n)} for n = 1, 2, ... represent the probability distribution of the first passage time for the specific i, j. In the case where a process in state i may never reach state j, Σ_{n=1}^{∞} f_ij^{(n)} < 1.
Where i = j and Σ_{n=1}^{∞} f_ii^{(n)} = 1, the state i is termed a recurrent state, since given that the process is in state i, it will always eventually return to i.
If p_ii = 1 for some state i, then that state is called an absorbing state, and the process will never leave it after it is entered.
The state i is called a transient state if

Σ_{n=1}^{∞} f_ii^{(n)} < 1,

since there is a positive probability that given the process is in state i, it will never return to this state. It is not always easy to classify a state as transient or recurrent, since it is sometimes difficult to calculate the first passage time probabilities f_ij^{(n)} for all n ≥ 1, as was the case in Example 18-2. Nevertheless, the expected first passage time is

μ_ij = ∞                          if Σ_{n=1}^{∞} f_ij^{(n)} < 1,
     = Σ_{n=1}^{∞} n f_ij^{(n)}    if Σ_{n=1}^{∞} f_ij^{(n)} = 1.    (18-9)

When Σ_{n=1}^{∞} f_ij^{(n)} = 1, a simple conditioning argument shows that

μ_ij = 1 + Σ_{l≠j} p_il μ_lj.    (18-10)

If we take i = j, the expected first passage time is called the expected recurrence time. If μ_ii = ∞ for a recurrent state, it is called null; if μ_ii < ∞, it is called nonnull or positive recurrent. There are no null recurrent states in a finite-state Markov chain. All of the states in such chains are either positive recurrent or transient.


A state is called periodic with period τ > 1 if a return is possible only in τ, 2τ, 3τ, ... steps; so p_ii^{(n)} = 0 for all values of n that are not divisible by τ > 1, and τ is the smallest integer having this property.
A state j is termed accessible from state i if p_ij^{(n)} > 0 for some n = 1, 2, .... In our example of the computing system, each state, 0 and 1, is accessible from the other, since p_ij^{(n)} > 0 for all i, j and all n. If state j is accessible from i and state i is accessible from j, then the states are said to communicate. This is the case in Example 18-1. We note that any state communicates with itself. If state i communicates with j, j also communicates with i. Also, if i communicates with l and l communicates with j, then i also communicates with j.
If the state space is partitioned into disjoint sets (called equivalence classes) of states, where communicating states belong to the same class, then the Markov chain may consist of one or more classes. If there is only one class, so that all states communicate, the Markov chain is said to be irreducible. The chain represented by Example 18-1 is thus also irreducible. For finite-state Markov chains, the states of a class are either all positive recurrent or all transient. In many applications, the states will all communicate. This is the case if there is a value of n for which p_ij^{(n)} > 0 for all values of i and j.
If state i in a class is aperiodic (not periodic), and if the state is also positive recurrent, then the state is said to be ergodic. An irreducible Markov chain is ergodic if all of its states are ergodic. In the case of such Markov chains, the distribution

A^{(n)} = A·Pⁿ

converges as n → ∞, and the limiting distribution is independent of the initial probabilities, A. In Example 18-1, this was clearly observed to be the case, and after five steps (n > 5), P{X_n = 0} = 0.667 and P{X_n = 1} = 0.333 when three significant figures are used.
In general, for irreducible, ergodic Markov chains,

lim_{n→∞} p_ij^{(n)} = lim_{n→∞} a_j^{(n)} = p_j,

and, furthermore, these values p_j are independent of i. These "steady state" probabilities, p_j, satisfy the following state equations:

p_j > 0,   j = 0, 1, 2, ..., m,    (18-11a)

Σ_{j=0}^{m} p_j = 1,    (18-11b)

p_j = Σ_{i=0}^{m} p_i p_ij,   j = 0, 1, 2, ..., m.    (18-11c)

Since there are m + 2 equations in 18-11b and 18-11c, and since there are m + 1 unknowns, one of the equations is redundant. Therefore, we will use m of the m + 1 equations in equation 18-11c together with equation 18-11b.
Example 18-3
In the case of the computing system presented in Example 18-1, we have from equations 18-11b and 18-11c

1 = p₀ + p₁,
p₀ = p₀(0.75) + p₁(0.50),

or

p₁ = 1/3   and   p₀ = 2/3,

which agrees with the emerging result as n > 5 in Example 18-1.

The steady state probabilities and the mean recurrence times for irreducible, ergodic Markov chains have a reciprocal relationship,

μ_jj = 1/p_j,   j = 0, 1, 2, ..., m.    (18-12)

In Example 18-3, note that μ₀₀ = 1/p₀ = 1.5 and μ₁₁ = 1/p₁ = 3.

Example 18-4
The mood of a corporate president is observed over a period of time by a psychologist in the operations research department. Being inclined toward mathematical modeling, the psychologist classifies mood into three states as follows:

0: Good (cheerful)
1: Fair (so-so)
2: Poor (glum and depressed)

The psychologist observes that mood changes occur only overnight; thus, the data allow estimation of the transition probabilities

P = [0.6  0.2  0.2]
    [0.3  0.4  0.3]
    [0.0  0.3  0.7].

The equations

p₀ = 0.6p₀ + 0.3p₁ + 0p₂,
p₁ = 0.2p₀ + 0.4p₁ + 0.3p₂,
1 = p₀ + p₁ + p₂

are solved simultaneously for the steady state probabilities

p₀ = 3/13,   p₁ = 4/13,   p₂ = 6/13.

Given that the president is in a bad mood, that is, state 2, the mean time required to return to that state is μ₂₂, where

μ₂₂ = 1/p₂ = 13/6 days.
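The steady state equations 18-11 form a linear system, so Example 18-4 can be solved mechanically: drop one redundant balance equation and append the normalization constraint, as in the sketch below.

```python
# A minimal sketch solving the steady-state equations for the mood chain.
import numpy as np

P = np.array([[0.6, 0.2, 0.2],
              [0.3, 0.4, 0.3],
              [0.0, 0.3, 0.7]])

m = P.shape[0]
A = np.vstack([(P.T - np.eye(m))[:-1],   # m - 1 balance equations
               np.ones(m)])              # normalization: sum(p) = 1
b = np.append(np.zeros(m - 1), 1.0)
p = np.linalg.solve(A, b)

print(p)          # [3/13, 4/13, 6/13]
print(1 / p[2])   # mean recurrence time for state 2: 13/6 days
```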

As noted earlier, if p_kk = 1, state k is called an absorbing state, and the process remains in state k once that state is reached. In this case, b_ik is called the absorption probability, which is the conditional probability of absorption into state k given state i. Mathematically, we have

b_ik = Σ_{l=0}^{m} p_il b_lk,   i = 0, 1, 2, ..., m,    (18-13)

where

b_kk = 1

and

b_ik = 0   for i recurrent, i ≠ k.

18-4 CONTINUOUS-TIME MARKOV CHAINS

If the time parameter is continuous rather than a discrete index, as assumed in the previous sections, the Markov chain is called a continuous-parameter chain. It is customary to use a slightly different notation for continuous-parameter Markov chains, namely X(t) = X_t, where {X(t)}, t ≥ 0, will be considered to have states 0, 1, ..., m. The discrete nature of the state space [range space for X(t)] is thus maintained, and

p_ij(t) = P[X(t + s) = j | X(s) = i],   i, j = 0, 1, 2, ..., m,   s ≥ 0,   t ≥ 0,

is the stationary transition probability function. It is noted that these probabilities are not dependent on s but only on t for a specified i, j pair of states. Furthermore, at time t = 0 the function is continuous, with

lim_{t→0⁺} p_ij(t) = 1 if i = j, and 0 if i ≠ j.

There is a direct correspondence between the discrete-time and continuous-time models. The Chapman-Kolmogorov equations become

p_ij(t) = Σ_{l=0}^{m} p_il(v) p_lj(t − v)    (18-14)

for 0 ≤ v ≤ t, and for the specified state pair i, j and time t. If there are times t₁ and t₂ such that p_ij(t₁) > 0 and p_ji(t₂) > 0, then states i and j are said to communicate. Once again, states that communicate form an equivalence class, and where the chain is irreducible (all states form a single class),

p_ij(t) > 0

for each state pair i, j and t > 0. We also have the property that

lim_{t→∞} p_ij(t) = p_j,



where Pi exists and is independent of the initial state probability vector A, The values Pj a..--e
again called the steady state probabilities and they satisfy

Pj >0,

j = 0.1.2....,m,

PI

= '\'
v p .. (t).
..,.<1
1)

j = O,1.2,.,.,m,

t;;'

O.

f",C

The intensity of transition, given that the state is j, is defined as

u_j = lim_{Δt→0} [1 − p_jj(Δt)]/Δt,    (18-15)

where the limit exists and is finite. Likewise, the intensity of passage from state i to state j, given that the system is in state i, is

u_ij = lim_{Δt→0} p_ij(Δt)/Δt,    (18-16)

again where the limit exists and is finite. The interpretation of the intensities is that they represent an instantaneous rate of transition from state i to j. For a small Δt, p_ij(Δt) ≈ u_ij·Δt + o(Δt), where o(Δt)/Δt → 0 as Δt → 0, so that u_ij is a proportionality constant by which p_ij(Δt) is proportional to Δt as Δt → 0. The transition intensities also satisfy the balance equations

p_j u_j = Σ_{i≠j} p_i u_ij,   j = 0, 1, 2, ..., m.    (18-17)

These equations indicate that in steady state, the rate of transition out of state j is equal to the rate of transition into j.

An electronic control mechanism for a chemical process is constructed with two identical modules, operating as a parallel, active redundant pair. The function of at least one module is necessary for the mechanism to operate. The maintenance shop has two identical repair stations for these modules and, furthermore, when a module fails and enters the shop, other work is moved aside and repair work is immediately initiated. The "system" here consists of the mechanism and repair facility, and the states are as follows:

0: Both modules operating.
1: One unit operating and one unit in repair.
2: Two units in repair (mechanism down).

The random variable representing time to failure for a module has an exponential density, say

f(t) = λe^{−λt},    t ≥ 0,
     = 0,           t < 0,

and the random variable describing repair time at a repair station also has an exponential density, say

r(t) = μe^{−μt},    t ≥ 0,
     = 0,           t < 0.


Interfailure and interrepair times are independent, and {X(t)} can be shown to be a continuous-parameter, irreducible Markov chain with transitions only from a state to its neighbor states: 0 → 1, 1 → 0, 1 → 2, 2 → 1. Of course, there may be no state change. The transition intensities are

u_0 = 2λ,        u_01 = 2λ,    u_02 = 0,
u_1 = λ + μ,     u_10 = μ,     u_12 = λ,
u_2 = 2μ,        u_20 = 0,     u_21 = 2μ.

Using equation 18-17,

2λ·p_0 = μ·p_1,
(λ + μ)·p_1 = 2λ·p_0 + 2μ·p_2,

and since p_0 + p_1 + p_2 = 1, some algebra gives

p_0 = μ² / (λ + μ)²,    p_1 = 2λμ / (λ + μ)²,    p_2 = λ² / (λ + μ)².

The system availability (probability that the mechanism is up) in the steady state condition is thus

Availability = 1 − λ² / (λ + μ)².
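As a numerical check of these formulas, the following Python sketch evaluates the steady state probabilities and the availability. The rate values λ = 0.1 and μ = 0.5 per hour are illustrative assumptions, not values from the example:

lam, mu = 0.1, 0.5                 # assumed failure and repair rates (per hour)

p0 = mu**2 / (lam + mu)**2         # both modules operating
p1 = 2 * lam * mu / (lam + mu)**2  # one up, one in repair
p2 = lam**2 / (lam + mu)**2        # mechanism down

print(p0 + p1 + p2)                # 1.0: the probabilities sum to one
print(1 - p2)                      # availability = 1 - (lam/(lam + mu))**2 = 0.972...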

The matrix of transition probabilities for time increment Δt may be expressed as

P = [p_ij(Δt)] =

  | 1 − u_0·Δt    u_01·Δt       ···    u_0j·Δt    ···    u_0m·Δt    |
  | u_10·Δt       1 − u_1·Δt    ···    u_1j·Δt    ···    u_1m·Δt    |    (18-18)
  |    ⋮                                                            |
  | u_m0·Δt       u_m1·Δt       ···    u_mj·Δt    ···    1 − u_m·Δt |

and

p_j(t + Δt) = Σ_{i=0}^{m} p_i(t) · p_ij(Δt),    j = 0, 1, 2, ..., m,    (18-19)

where

p_j(t) = P[X(t) = j].

From the jth equation in the m + 1 equations of equation 18-19,

p_j(t + Δt) = p_0(t)·u_0j·Δt + ··· + p_i(t)·u_ij·Δt + ··· + p_j(t)·[1 − u_j·Δt] + ··· + p_m(t)·u_mj·Δt,


which may be rewritten as

(d/dt) p_j(t) = lim_{Δt→0} [p_j(t + Δt) − p_j(t)] / Δt = −u_j·p_j(t) + Σ_{i≠j} u_ij·p_i(t).    (18-20)

The resulting system of differential equations is

p′_j(t) = −u_j·p_j(t) + Σ_{i≠j} u_ij·p_i(t),    j = 0, 1, 2, ..., m,    (18-21)

which may be solved when m is finite, given initial conditions (probabilities) A and using the result that Σ_{j=0}^{m} p_j(t) = 1. The solution

p(t) = [p_0(t), p_1(t), ..., p_m(t)]    (18-22)

presents the state probabilities as a function of time, in the same manner that A·P^(n) presented state probabilities as a function of the number of transitions, n, given an initial condition vector A in the discrete-time model. The solution to equations 18-21 may be somewhat difficult to obtain, and in general practice, transformation techniques are employed.

18-5 THE BIRTH-DEATH PROCESS IN QUEUEING

The major application of the so-called birth-death process that we will study is in queueing or waiting-line theory. Here birth will refer to an arrival and death to a departure from a physical system, as shown in Fig. 18-1.

Queueing theory is the mathematical study of queues or waiting lines, which occur in a variety of problem environments. There is an input process or "calling population," from which arrivals are drawn, and a queueing system, which in Fig. 18-1 consists of the queue and service facility. The calling population may be finite or infinite. Arrivals occur in a probabilistic manner. A common assumption is that the interarrival times are exponentially distributed. The queue is generally classified according to whether its capacity is infinite or finite, and the service discipline refers to the order in which the customers in the queue are served. The service mechanism consists of one or more servers, and the elapsed service time is commonly called the holding time.
The following notation will be employed:

X(t) = Number of customers in the system at time t
States = 0, 1, 2, ..., j, j + 1, ...
s = Number of servers
p_j(t) = P{X(t) = j | A}
p_j = lim_{t→∞} p_j(t)
λ_n = Arrival rate given that n customers are in the system
μ_n = Service rate given that n customers are in the system

The birth-death process can be used to describe how X(t) changes through time. It will be assumed here that when X(t) = j, the probability distribution of the time to the next birth (arrival) is exponential with parameter λ_j, j = 0, 1, 2, .... Furthermore, given X(t) = j, the remaining time to the next service completion is taken to be exponential with parameter μ_j, j = 1, 2, .... Poisson-type postulates are assumed to hold, so that the probability of more than one birth or death at the same instant is zero.


Figure 18-1 A simple queueing system (input process → arrivals → queue → service facility → departures).

A transition diagram is shown in Fig. 18-2. The transition matrix corresponding to equation 18-18 is

  | 1 − λ_0·Δt    λ_0·Δt                                               |
  | μ_1·Δt        1 − (λ_1 + μ_1)·Δt    λ_1·Δt                         |
  | 0             μ_2·Δt                1 − (λ_2 + μ_2)·Δt    λ_2·Δt   |
  |    ⋮                                                               |
  |               μ_j·Δt                1 − (λ_j + μ_j)·Δt    λ_j·Δt   |
  |    ⋮                                                               |

We note that p_ij(Δt) ≈ 0 for j < i − 1 or j > i + 1. Furthermore, the transition intensities and intensities of passage shown in equation 18-17 are

u_0 = λ_0,
u_j = λ_j + μ_j    for j = 1, 2, ...,

and

u_ij = λ_i    for j = i + 1,
     = μ_i    for j = i − 1,
     = 0      for j < i − 1, j > i + 1.

The fact that the transition intensities and intensities of passage are constant with time is important in the development of this model. The nature of transition can be viewed as specified by assumption, or it may be considered a result of the prior assumption about the distribution of time between occurrences (births and deaths).

Figure 18-2 Transition diagram for the birth-death process.


The assumptions of independent, exponentially distributed service times and independent, exponentially distributed interarrival times yield transition intensities that are constant in time. This was also observed in the development of the Poisson and exponential distributions in Chapters 5 and 6.

The methods used in equations 18-19 through 18-21 may be used to formulate an infinite set of differential state equations from the transition matrix of equation 18-22. Thus, the time-dependent behavior is described in the following equations:

p′_0(t) = −λ_0·p_0(t) + μ_1·p_1(t),
p′_j(t) = λ_{j−1}·p_{j−1}(t) − (λ_j + μ_j)·p_j(t) + μ_{j+1}·p_{j+1}(t),    j = 1, 2, ...,    (18-23)

and

Σ_{j=0}^{∞} p_j(t) = 1,    A = [a_0(0), a_1(0), ..., a_j(0), ...].    (18-24)

In the steady state (t → ∞), we have p′_j(t) = 0, so the steady state equations are obtained from equations 18-23 and 18-24:

μ_1·p_1 = λ_0·p_0,
λ_0·p_0 + μ_2·p_2 = (λ_1 + μ_1)·p_1,
λ_1·p_1 + μ_3·p_3 = (λ_2 + μ_2)·p_2,
   ⋮
λ_{j−1}·p_{j−1} + μ_{j+1}·p_{j+1} = (λ_j + μ_j)·p_j,    (18-25)
   ⋮

and

Σ_{j=0}^{∞} p_j = 1.

Equations 18-25 could have also been determined by the direct application of equation 18-17, which provides a "rate balance" or "intensity balance." Solving equations 18-25, we obtain

p_1 = (λ_0/μ_1)·p_0,    p_2 = (λ_1λ_0)/(μ_2μ_1)·p_0,    p_3 = (λ_2λ_1λ_0)/(μ_3μ_2μ_1)·p_0,    ....

If we let

C_j = (λ_{j−1}·λ_{j−2} ··· λ_0) / (μ_j·μ_{j−1} ··· μ_1),    (18-26)

then p_j = C_j·p_0, and since

p_0 + Σ_{j=1}^{∞} p_j = 1,    or    p_0·[1 + Σ_{j=1}^{∞} C_j] = 1,    (18-27)

we obtain

p_0 = 1 / (1 + Σ_{j=1}^{∞} C_j).

These steady state results assume that the λ_j, μ_j values are such that a steady state can be reached. This will be true if λ_j = 0 for j > k, so that there are a finite number of states. It is also true if ρ = λ/(sμ) < 1, where λ and μ are constant and s denotes the number of servers. The steady state will not be reached if Σ_{j=1}^{∞} C_j = ∞.
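Equations 18-26 and 18-27 translate directly into a few lines of code when the chain has finitely many states. The following Python sketch is illustrative (the function name and sample rates are ours, not the text's); it reproduces the two-module repair example of Section 18-4 with λ = 0.1 and μ = 0.5:

def birth_death_steady_state(lam, mu):
    # lam[j] = birth rate in state j; mu[j] = death rate out of state j + 1.
    C = [1.0]                                # C_0 = 1 stands for p_0 itself
    for j in range(len(lam)):
        C.append(C[-1] * lam[j] / mu[j])     # equation 18-26, built recursively
    p0 = 1.0 / sum(C)                        # equation 18-27
    return [c * p0 for c in C]

# Two-module repair system: birth rates (2*0.1, 0.1), death rates (0.5, 2*0.5).
print(birth_death_steady_state([0.2, 0.1], [0.5, 1.0]))   # [0.694..., 0.277..., 0.027...]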

18-6 CONSIDERATIONS IN QUEUEING MODELS

When the arrival rate λ_j is constant for all j, the constant is denoted λ. Similarly, when the service rate per busy server is constant, it will be denoted μ, so that μ_j = sμ if j ≥ s and μ_j = jμ if j < s. The exponential distributions

a(t) = λe^{−λt},    t ≥ 0,
     = 0,           t < 0,

s(t) = μe^{−μt},    t ≥ 0,
     = 0,           t < 0,

for interarrival times and service times in a busy channel produce rates λ and μ, which are constant. The mean interarrival time is 1/λ, and the mean time for a busy channel to complete service is 1/μ.

A special set of notation has been widely employed in the steady state analysis of queueing systems. This notation is given in the following list:

L = Σ_{j=0}^{∞} j·p_j = Expected number of customers in the queueing system
L_q = Σ_{j=s}^{∞} (j − s)·p_j = Expected queue length
W = Expected time in the system (including service time)
W_q = Expected waiting time in the queue (excluding service time)


If λ_j is constant for all j, then it has been shown that

L = λW    and    L_q = λW_q.    (18-28)

(These results are special cases of what is known as Little's law.) If the λ_j are not equal, λ̄ replaces λ, where

λ̄ = Σ_{j=0}^{∞} λ_j·p_j.    (18-29)

The system utilization coefficient ρ = λ/(sμ) is the fraction of time that the servers are busy. In the case where the mean service time is 1/μ for all j ≥ 1,

W = W_q + 1/μ.    (18-30)

The birth-death process rates λ_0, λ_1, ..., λ_j, ... and μ_1, μ_2, ..., μ_j, ... may be assigned any positive values as long as the assignment leads to a steady state solution. This allows considerable flexibility in using the results given in equation 18-27. The specific models subsequently presented will differ in the manner in which λ_j and μ_j vary as a function of j.

18-7 BASIC SINGLE-SERVER MODEL WITH CONSTANT RATES

We will now consider the case where s = 1, that is, a single server. We will also assume an unlimited potential queue length, with exponential interarrivals having a constant parameter λ, so that λ_0 = λ_1 = ··· = λ. Furthermore, service times will be assumed to be independent and exponentially distributed with μ_1 = μ_2 = ··· = μ. We will assume λ < μ. As a result of equation 18-26, we have

C_j = (λ/μ)^j = ρ^j,    j = 1, 2, 3, ...,    (18-31)

and from equation 18-27,

p_j = ρ^j·p_0,    j = 1, 2, 3, ...,

p_0 = 1 / (1 + Σ_{j=1}^{∞} ρ^j) = 1 − ρ.    (18-32)

Thus, the steady state equations are

p_j = (1 − ρ)·ρ^j,    j = 0, 1, 2, ....    (18-33)

Note that the probability that there are j customers in the system, p_j, is given by a geometric distribution with parameter ρ. The mean number of customers in the system, L, is determined as

L = Σ_{j=0}^{∞} j·(1 − ρ)·ρ^j = ρ / (1 − ρ).    (18-34)


And the expected queue length is

L_q = Σ_{j=1}^{∞} (j − 1)·p_j = L − (1 − p_0) = ρ² / (1 − ρ).    (18-35)

Using equations 18-28 and 18-34, we find that the expected waiting time in the system is

W = L/λ = 1 / (μ − λ),    (18-36)

and the expected waiting time in the queue is

W_q = L_q/λ = λ / [μ(μ − λ)].    (18-37)

These results could have been developed directly from the distributions of time in the system and time in the queue, respectively. Since the exponential distribution reflects a memoryless process, an arrival finding j units in the system will wait through j + 1 services, including its own, and thus its waiting time T_{j+1} is the sum of j + 1 independent, exponentially distributed random variables. This random variable was shown in Chapter 6 to have a gamma distribution. This is a conditional density given that the arrival finds j units in the system. Thus, if S represents time in the system,

P(S > w) = Σ_{j=0}^{∞} p_j·P(T_{j+1} > w)

         = Σ_{j=0}^{∞} (1 − ρ)·ρ^j · ∫_w^∞ [μ^{j+1}·t^j·e^{−μt} / j!] dt

         = (1 − ρ) · ∫_w^∞ μ·e^{−μt} · Σ_{j=0}^{∞} (ρμt)^j / j! dt

         = (1 − ρ) · ∫_w^∞ μ·e^{−μ(1−ρ)t} dt    (18-38)

         = e^{−μ(1−ρ)w},    w ≥ 0,
         = 1,               w < 0,

which is seen to be the complement of the distribution function for an exponential random variable with parameter μ(1 − ρ). The mean value W = 1/[μ(1 − ρ)] = 1/(μ − λ) follows directly.

If we let S_q represent time in the queue, excluding service time, then

P(S_q = 0) = p_0 = 1 − ρ.


If we take T_j as the sum of j service times, then T_j, as in the previous manipulations, will again have a gamma distribution. Then,

P(S_q > w_q) = Σ_{j=1}^{∞} p_j·P(T_j > w_q)

             = Σ_{j=1}^{∞} (1 − ρ)·ρ^j·P(T_j > w_q)    (18-39)

             = ρ·e^{−μ(1−ρ)w_q},    w_q > 0,
             = 0,                   otherwise,

and we find the density of time in the queue, g(w_q), for w_q > 0, to be

g(w_q) = λ(1 − ρ)·e^{−μ(1−ρ)w_q},    w_q > 0.

Thus, the probability distribution is

g(w_q) = 1 − ρ,                       w_q = 0,
       = λ(1 − ρ)·e^{−μ(1−ρ)w_q},     w_q > 0,    (18-40)

which was noted in Section 2-2 as being for a mixed-type random variable (in equation 2-2, G ≠ 0 and H ≠ 0). The expected waiting time in the queue, W_q, could be determined directly from this distribution as

W_q = (1 − ρ)·0 + ∫_0^∞ w_q·λ(1 − ρ)·e^{−μ(1−ρ)w_q} dw_q = λ / [μ(μ − λ)].    (18-41)

When λ ≥ μ, the summation of the terms ρ^j in equation 18-32 diverges. In this case, there is no steady state solution, since the steady state is never reached. That is, the queue would grow without bound.
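The closed-form measures above are easy to evaluate. The following Python sketch computes them for the illustrative rates λ = 2 and μ = 3 (any λ < μ will do):

import math

lam, mu = 2.0, 3.0                    # assumed arrival and service rates
rho = lam / mu

p0 = 1 - rho                          # equation 18-32
L  = rho / (1 - rho)                  # equation 18-34
Lq = rho**2 / (1 - rho)               # equation 18-35
W  = 1 / (mu - lam)                   # equation 18-36
Wq = lam / (mu * (mu - lam))          # equation 18-37

# P(S > w): time in system is exponential with parameter mu*(1 - rho), eq. 18-38.
w = 2.0
print(L, Lq, W, Wq, math.exp(-mu * (1 - rho) * w))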

18-8 SINGLE SERVER WITH LIMITED QUEUE LENGTH

If the queue is limited so that at most N units can be in the system, and if the exponential service times and exponential interarrival times are retained from the prior model, we have

λ_0 = λ_1 = ··· = λ_{N−1} = λ,    λ_j = 0,    j ≥ N,

and

μ_1 = μ_2 = ··· = μ_N = μ.

It follows from equation 18-26 that

C_j = ρ^j,    j ≤ N,
    = 0,      j > N.    (18-42)


Then

p_j = ρ^j·p_0,    j = 0, 1, 2, ..., N,

so that

p_0·Σ_{j=0}^{N} ρ^j = 1

and

p_0 = (1 − ρ) / (1 − ρ^{N+1}).    (18-43)

As a result, the steady state equations are given by

p_j = [(1 − ρ) / (1 − ρ^{N+1})]·ρ^j,    j = 0, 1, 2, ..., N.    (18-44)

The mean number of customers in the system in this case is

L = ρ·[1 − (N + 1)·ρ^N + N·ρ^{N+1}] / [(1 − ρ)(1 − ρ^{N+1})].    (18-45)

The mean number of customers in the queue is

L_q = Σ_{j=1}^{N} (j − 1)·p_j = Σ_{j=1}^{N} j·p_j − Σ_{j=1}^{N} p_j = L − (1 − p_0).    (18-46)

The mean time in the system is found as

W = L / λ̄,    (18-47)

and the mean time in the queue is

W_q = L_q / λ̄,    (18-48)

where L is given by equation 18-45 and, from equation 18-29, λ̄ = λ(1 − p_N).
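A short Python sketch of equations 18-43 through 18-48, with illustrative values λ = 2, μ = 3, and N = 4:

lam, mu, N = 2.0, 3.0, 4              # assumed rates and system capacity
rho = lam / mu

p = [(1 - rho) / (1 - rho**(N + 1)) * rho**j for j in range(N + 1)]   # eq. 18-44
L  = sum(j * p[j] for j in range(N + 1))
Lq = L - (1 - p[0])                   # equation 18-46
lam_bar = lam * (1 - p[N])            # effective arrival rate, from eq. 18-29
W, Wq = L / lam_bar, Lq / lam_bar     # equations 18-47 and 18-48
print(sum(p), L, Lq, W, Wq)           # sum(p) = 1.0 as a sanity check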


18-9 MULTIPLE SERVERS WITH AN UNLIMITED QUEUE

We now consider the case where there are multiple servers. We also assume that the queue is unlimited and that exponential assumptions hold for interarrival times and service times. In this case, we have

λ_j = λ,    j = 0, 1, 2, ...,    (18-49)

and

μ_j = jμ    for j ≤ s,
    = sμ    for j > s.

Thus, defining ψ = λ/μ, we have

C_j = ψ^j / j!,               j ≤ s,
    = ψ^j / (s!·s^{j−s}),     j > s.    (18-50)

It follows from equation 18-27 that the state equations are developed as

p_j = (ψ^j / j!)·p_0,               j ≤ s,
    = [ψ^j / (s!·s^{j−s})]·p_0,     j > s,    (18-51)

where

p_0 = [Σ_{j=0}^{s−1} ψ^j / j! + ψ^s / (s!·(1 − ρ))]^{−1}

and ρ = λ/(sμ) = ψ/s is the utilization coefficient, assuming ρ < 1.

The value L_q, representing the mean number of units in the queue, is developed as follows:

L_q = Σ_{j=s}^{∞} (j − s)·p_j = [p_0·ψ^s·ρ] / [s!·(1 − ρ)²].    (18-52)

Then

W_q = L_q / λ,    (18-53)

and

W = W_q + 1/μ,    (18-54)

so that

L = λW = L_q + λ/μ.    (18-55)
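The M/M/s formulas are tedious by hand but short in code. A Python sketch with illustrative values λ = 4, μ = 3, and s = 2:

from math import factorial

lam, mu, s = 4.0, 3.0, 2              # assumed rates and number of servers
psi = lam / mu
rho = psi / s                         # must satisfy rho < 1

p0 = 1.0 / (sum(psi**j / factorial(j) for j in range(s))
            + psi**s / (factorial(s) * (1 - rho)))
Lq = p0 * psi**s * rho / (factorial(s) * (1 - rho)**2)   # equation 18-52
Wq = Lq / lam                                            # equation 18-53
W  = Wq + 1 / mu                                         # equation 18-54
L  = lam * W                                             # equation 18-55
print(p0, Lq, Wq, W, L)               # p0 = 0.2 for these rates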

18-10 OTHER QUEUEING MODELS

There are numerous other queueing models that can be developed from the birth-death process. In addition, it is also possible to develop queueing models for situations involving nonexponential distributions. One useful result, given without proof, is for a single-server system having exponential interarrivals and an arbitrary service time distribution with mean 1/μ and variance σ². If ρ = λ/μ < 1, then steady state measures are given by equations 18-56:

p_0 = 1 − ρ,

L_q = (λ²σ² + ρ²) / [2(1 − ρ)],

L = ρ + L_q,

W_q = L_q / λ,

W = W_q + 1/μ.    (18-56)

In the case where service times are constant at 1/μ, the foregoing relationships yield the measures of system performance by taking the variance σ² = 0.
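Because equations 18-56 need only λ, μ, and the service time variance σ², they are a one-screen computation. A Python sketch with illustrative rates; setting sigma2 = 0 gives constant service times, while sigma2 = 1/mu**2 recovers the M/M/1 results:

lam, mu = 2.0, 3.0                    # assumed rates
sigma2 = 0.0                          # service time variance (0 = constant service)
rho = lam / mu

p0 = 1 - rho
Lq = (lam**2 * sigma2 + rho**2) / (2 * (1 - rho))
L  = rho + Lq
Wq, W = Lq / lam, Lq / lam + 1 / mu
print(p0, Lq, L, Wq, W)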

18-11 SUMMARY

This chapter introduced the notion of discrete-state space stochastic processes for discrete-time and continuous-time orientations. The Markov process was developed, along with the presentation of state properties and characteristics. This was followed by a presentation of the birth-death process and several important applications to queueing models for the description of waiting-time phenomena.

18-12 EXERCISES

18-1. A shoe repair shop in a suburban mall has one shoesmith. Shoes are brought in for repair and arrive according to a Poisson process with a constant arrival rate of two pairs per hour. The repair time distribution is exponential with a mean of 20 minutes, and there is independence between the repair and arrival processes. Consider a pair of shoes to be the unit to be served, and do the following:
(a) In the steady state, find the probability that the number of pairs of shoes in the system exceeds 5.
(b) Find the mean number of pairs in the shop and the mean number of pairs waiting for service.
(c) Find the mean turnaround time for a pair of shoes (time in the shop waiting plus repair, but excluding time waiting to be picked up).


18-2. Weather data are analyzed for a particular locality, and a Markov chain is employed as a model for weather change as follows. The conditional probability of change from rain to clear weather in one day is 0.3. Likewise, the conditional probability of transition from clear to rain in one day is 0.1. The model is to be a discrete-time model, with transitions occurring only between days.
(a) Determine the matrix P of one-step transition probabilities.
(b) Find the steady state probabilities.
(c) If today is clear, find the probability that it will be clear exactly 3 days hence.
(d) Find the probability that the first passage from a clear day to a rainy day occurs in exactly 2 days, given a clear day is the initial state.
(e) What is the mean recurrence time for the rainy day state?
18-3. A communication link transmits binary characters, (0, 1). There is a probability p that a transmitted character will be received correctly by a receiver, which then transmits to another link, etc. If X_0 is the initial character and X_1 is the character received after the first transmission, X_2 after the second, etc., then with independence {X_n} is a Markov chain. Find the one-step and steady state transition matrices.
18-4. Consider a two-component active redundancy where the components are identical and the time-to-failure distributions are exponential. When both units are operating, each carries load L/2 and each has failure rate λ. However, when one unit fails, the load carried by the other component is L, and its failure rate under this load is (1.5)λ. There is only one repair facility available, and repair time is exponentially distributed with mean 1/μ. The system is considered failed when both components are in the failed state. Both components are initially operating. Assume that μ > (1.5)λ. Let the states be as follows:
0: No components are failed.
1: One component is failed and is in repair.
2: Two components are failed, one is in repair, one is waiting, and the system is in the failed condition.
(a) Determine the matrix P of transition probabilities associated with interval Δt.
(b) Determine the steady state probabilities.
(c) Write the system of differential equations that presents the transient or time-dependent relationships for transition.

18-5. A communication satellite is launched via a booster system that has a discrete-time guidance control system. Course correction signals form a sequence {X_n}, where the state space for X is as follows:
0: No correction required.
1: Minor correction required.
2: Major correction required.
3: Abort and system destruct.
If {X_n} can be modeled as a Markov chain with a given one-step transition matrix, do the following:
(a) Show that states 0 and 3 are absorbing states.
(b) If the initial state is state 1, compute the steady state probability that the system is in state 0.
(c) If the initial probabilities are (0, 1/2, 1/2, 0), compute the steady state probability p_0.
(d) Repeat (c) with A = (1/4, 1/4, 1/4, 1/4).

18-6. A gambler bets $1 on each hand of blackjack. The probability of winning on any hand is p, and the probability of losing is 1 − p = q. The gambler will continue to play until either $Y has been accumulated or he has no money left. Let X_t denote the accumulated winnings on hand t. Note that X_{t+1} = X_t + 1 with probability p, that X_{t+1} = X_t − 1 with probability q, and X_{t+1} = X_t if X_t = 0 or X_t = Y. The stochastic process X_t is a Markov chain.
(a) Find the one-step transition matrix P.
(b) For Y = 4 and p = 0.3, find the absorption probabilities: b_10, b_14, b_30, and b_34.
18-7. An object moves between four points on a circle, which are labeled 1, 2, 3, and 4. The probability of moving one unit to the right is p, and the probability of moving one unit to the left is 1 − p = q. Assume that the object starts at 1, and let X_n denote the location on the circle after n steps.
(a) Find the one-step transition matrix P.
(b) Find an expression for the steady state probabilities p_j.
(c) Evaluate the probabilities p_j for p = 0.5 and p = 0.8.
18-8. For the single-server queueing model presented in Section 18-7, sketch the graphs of the following quantities as a function of ρ = λ/μ, for 0 < ρ < 1.
(a) Probability of no units in the system.
(b) Mean time in the system.
(c) Mean time in the queue.
18-9. Interarrival times at a telephone booth are exponential, with an average time of 10 minutes. The length of a phone call is assumed to be exponentially distributed with a mean of 3 minutes.
(a) What is the probability that a person arriving at the booth will have to wait?
(b) What is the average queue length?
(c) The telephone company will install a second booth when an arrival would expect to have to wait 3 minutes or more for the phone. By how much must the rate of arrivals be increased in order to justify a second booth?
(d) What is the probability that an arrival will have to wait more than 10 minutes for the phone?
(e) What is the probability that it will take a person more than 10 minutes altogether, for the phone and to complete the call?
(f) Estimate the fraction of a day that the phone will be in use.
18-10. Automobiles arrive at a service station in a random manner at a mean rate of 15 per hour. This station has only one service position, with a mean servicing rate of 21 customers per hour. Service times are exponentially distributed. There is space for only the automobile being served and two waiting. If all three spaces are filled, an arriving automobile will go on to another station.
(a) What is the average number of units in the station?
(b) What fraction of customers will be lost?
(c) Why is L_q > L − 1?
18-11. An engineering school has three secretaries in its general office. Professors with jobs for the secretaries arrive at random, at an average rate of 20 per 8-hour day. The amount of time that a secretary spends on a job has an exponential distribution with a mean of 40 minutes.
(a) What fraction of the time are the secretaries busy?
(b) How much time does it take, on average, for a professor to get his or her jobs completed?
(c) If an economy drive reduced the secretarial force to two secretaries, what will be the new answers to (a) and (b)?

18-12. The mean frequency of arrivals at an airport is 18 planes per hour, and the mean time that a runway is tied up with an arrival is 2 minutes. How many runways will have to be provided so that the probability of a plane having to wait is 0.20? Ignore finite population effects and make the assumption of exponential interarrival and service times.
18-13. A hotel reservations facility uses inward WATS lines to service customer requests. The mean number of calls that arrive per hour is 50, and the mean service time for a call is 3 minutes. Assume that interarrival and service times are exponentially distributed. Calls that arrive when all lines are busy obtain a busy signal and are lost from the system.
(a) Find the steady state equations for this system.
(b) How many WATS lines must be provided to ensure that the probability of a customer obtaining a busy signal is 0.05?
(c) What fraction of the time are all WATS lines busy?
(d) Suppose that during the evening hours call arrivals occur at a mean rate of 10 per hour. How does this affect the WATS line utilization?
(e) Suppose the estimated mean service time (3 minutes) is in error, and the true mean service time is really 5 minutes. What effect will this have on the probability of a customer finding all lines busy if the number of lines in (b) is used?

Chapter 19

Computer Simulation
One of the most widespread applications of probability and statistics lies in the use of computer simulation methods. A simulation is simply an imitation of the operation of a real-world system for purposes of evaluating that system. Over the past 20 years, computer simulation has enjoyed a great deal of popularity in the manufacturing, production, logistics, service, and financial industries, to name just a few areas of application. Simulations are often used to analyze systems that are too complicated to attack via analytic methods such as queueing theory. We are primarily interested in simulations that are:

1. Dynamic, that is, the system state changes over time.

2. Discrete, that is, the system state changes as the result of discrete events such as customer arrivals or departures.

3. Stochastic (as opposed to deterministic).

The stochastic nature of simulation prompts the ensuing discussion in the text.

This chapter is organized as follows. It begins in Section 19-1 with some simple motivational examples designed to show how one can apply simulation to answer interesting questions about stochastic systems. These examples invariably involve the generation of random variables to drive the simulation, for example, customer interarrival times and service times. The subject of Section 19-2 is the development of techniques to generate random variables. Some of these techniques have already been alluded to in previous chapters, but we will give a more complete and self-contained presentation here. After a simulation run is completed, one must conduct a rigorous analysis of the resulting output, a task made difficult because simulation output, for example customer waiting times, is almost never independent or identically distributed. The problem of output analysis is studied in Section 19-3. A particularly attractive feature of computer simulation is its ability to allow the experimenter to analyze and compare certain scenarios quickly and efficiently. Section 19-4 discusses methods for reducing the variance of estimators arising from a single scenario, thus resulting in more-precise statements about system performance, at no additional cost in simulation run time. We also extend this work by mentioning methods for selecting the best of a number of competing scenarios. We point out here that excellent general references for the topic of stochastic simulation are Banks, Carson, Nelson, and Nicol (2001) and Law and Kelton (2000).

19-1 MOTIVATIONAL EXAMPLES

This section illustrates the use of simulation through a series of simple, motivational examples. The goal is to show how one uses random variables within a simulation to answer questions about the underlying stochastic system.

Example 19-1 Coin Flipping

We are interested in simulating independent flips of a fair coin. Of course, this is a trivial sequence of Bernoulli trials with success probability p = 1/2, but this example serves to show how one can use simulation to analyze such a system. First of all, we need to generate realizations of heads (H) and tails (T), each with probability 1/2. Assuming that the simulation can somehow produce a sequence of independent uniform (0,1) random numbers, U_1, U_2, ..., we will arbitrarily designate flip i as H if we observe U_i < 0.5, and as T if we observe U_i ≥ 0.5. How one generates independent uniforms is the subject of Section 19-2. In any case, suppose that the following uniforms are observed:

0.32  0.41  0.82  0.93  0.06  0.19  0.21  0.77  0.71  0.08.

This sequence of uniforms corresponds to the outcomes HHTTHHHTTH. The reader is asked to study this example in various ways in Exercise 19-1. This type of "static" simulation, in which we simply repeat the same type of trials over and over, has come to be known as Monte Carlo simulation, in honor of the European city-state where gambling is a popular recreational activity.

Example 19-2 Estimating π

In this example, we will estimate π using Monte Carlo simulation in conjunction with a simple geometric relation. Referring to Fig. 19-1, consider a unit square with an inscribed circle, both centered at (1/2, 1/2). If one were to throw darts randomly at the square, the probability that a particular dart will land in the circle is π/4, the ratio of the circle's area to that of the square. How can we use this simple fact to estimate π? We shall use Monte Carlo simulation to throw many darts at the square. Specifically, generate independent pairs of independent uniform (0,1) random variables, (U_11, U_12), (U_21, U_22), .... These pairs will fall randomly on the square. If, for pair i, it happens that

(U_i1 − 1/2)² + (U_i2 − 1/2)² ≤ 1/4,    (19-1)

then that pair will also fall within the circle. Suppose we run the experiment for n pairs (darts). Let X_i = 1 if pair i satisfies inequality 19-1, that is, if the ith dart falls in the circle; otherwise, let X_i = 0. Now count up the number of darts X = Σ_{i=1}^{n} X_i falling in the circle. Clearly, X has the binomial distribution with parameters n and p = π/4. Then the proportion p̂ = X/n is the maximum likelihood estimate for p = π/4, and so the maximum likelihood estimator for π is just π̂ = 4p̂.
Figure 19-1 Throwing darts to estimate π.


If, for instance, we conducted n = 1000 trials and observed X = 753 darts in the circle, our estimate would be π̂ = 3.012. We will encounter this estimation technique again in Exercise 19-2.
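The dart-throwing experiment is a few lines of Python; here is a sketch that uses Python's built-in generator (the seed value is an arbitrary choice for repeatability) in place of the text's PRNs:

import random

random.seed(1)                        # assumed seed, for repeatability
n = 1000
X = sum(1 for _ in range(n)
        if (random.random() - 0.5)**2 + (random.random() - 0.5)**2 <= 0.25)
print(4 * X / n)                      # the estimator 4*X/n, close to pi for large n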

Example 19-3 Monte Carlo Integration

Another interesting use of computer simulation involves Monte Carlo integration. Usually, the method becomes efficacious only for high-dimensional integrals, but we will fall back to the basic one-dimensional case for ease of exposition. To this end, consider the integral

I = ∫_a^b f(x) dx.    (19-2)

As described in Fig. 19-2, we shall estimate the value of this integral by summing up n rectangles, each of width 1/n centered randomly at point U_i on [0,1], and of height f(a + (b − a)U_i). Then an estimate for I is

Î_n = [(b − a)/n] · Σ_{i=1}^{n} f(a + (b − a)U_i).    (19-3)

One can show (see Exercise 19-3) that Î_n is an unbiased estimator for I, that is, E[Î_n] = I for all n. This makes Î_n an intuitive and attractive estimator.

To illustrate, suppose that we want to estimate the integral

I = ∫_0^1 [1 + cos(πx)] dx,

and the following n = 4 numbers are a uniform (0,1) sample:

0.419  0.109  0.732  0.893.

Figure 19-2 Monte Carlo integration.

Plugging into equation 19-3, we obtain

Î_4 = [(1 − 0)/4] · Σ_{i=1}^{4} [1 + cos(π(0 + (1 − 0)U_i))] = 0.896,

which is close to the actual answer of 1. See Exercise 19-4 for additional Monte Carlo integration examples.
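The same computation in Python, as a sketch that reproduces the n = 4 hand calculation above:

import math

a, b = 0.0, 1.0
f = lambda x: 1 + math.cos(math.pi * x)
U = [0.419, 0.109, 0.732, 0.893]                 # the uniform sample from the text

I_hat = (b - a) / len(U) * sum(f(a + (b - a) * u) for u in U)   # equation 19-3
print(round(I_hat, 3))                           # 0.896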

Example 19-4 A Single-Server Queue

Now the goal is to simulate the behavior of a single-server queueing system. Suppose that six customers arrive at a bank at the following times, which have been generated from some appropriate probability distribution:

3  4  6  10  15  20.

Upon arrival, customers queue up in front of a single teller and are processed sequentially, in a first-come-first-served manner. The service times corresponding to the arriving customers are

7  6  4  6  1  2.

For this example, we assume that the bank opens at time 0 and closes its doors at time 20 (just after customer 6 arrives), serving any remaining customers.

Table 19-1 and Fig. 19-3 trace the evolution of the system as time progresses. The table keeps track of the times at which customers arrive, begin service, and leave. Figure 19-3 graphs the status of the queue as a function of time; in particular, it graphs L(t), the number of customers in the system (queue + service) at time t.

Note that customer i can begin service only at time max(A_i, D_{i−1}), that is, the maximum of his arrival time and the previous customer's departure time. The table and figure are quite easy to interpret. For instance, the system is empty until time 3, when customer 1 arrives. At time 4, customer 2 arrives, but must wait in line until customer 1 finishes service at time 10. We see from the figure that between times 20 and 26, customer 4 is in service, while customers 5 and 6 wait in the queue. From the table, the average waiting time for the six customers is Σ_{i=1}^{6} W_i/6 = 44/6. Further, the average number of customers in the system is ∫_0^29 L(t) dt/29 = 70/29, where we have computed the integral by adding up the rectangles in Fig. 19-3. Exercise 19-5 looks at extensions of the single-server queue. Many simulation software packages provide simple ways to model and analyze more-complicated queueing networks.

Table 19-1 Bank Customers in Single-Server Queueing System

i, customer    A_i, arrival time    B_i, begin service    S_i, service time    D_i, depart time    W_i, wait
1              3                    3                     7                    10                  0
2              4                    10                    6                    16                  6
3              6                    16                    4                    20                  10
4              10                   20                    6                    26                  10
5              15                   26                    1                    27                  11
6              20                   27                    2                    29                  7

Figure 19-3 Number of customers L(t) in the single-server queueing system.
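The bookkeeping in Table 19-1 follows the recursion B_i = max(A_i, D_{i−1}); the Python sketch below reproduces the table's waiting times:

A = [3, 4, 6, 10, 15, 20]             # arrival times
S = [7, 6, 4, 6, 1, 2]                # service times

D_prev, waits = 0, []
for a_i, s_i in zip(A, S):
    b_i = max(a_i, D_prev)            # begin service when server free and customer present
    waits.append(b_i - a_i)           # time spent waiting in the queue
    D_prev = b_i + s_i                # departure time of this customer
print(waits, sum(waits) / len(waits)) # [0, 6, 10, 10, 11, 7], average 44/6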

Example 19-5 (s, S) Inventory Policy

Customer orders for a particular good arrive at a store every day. During a certain one-week period, the quantities ordered are

10  6  11  3  20  6  8.

The store starts the week off with an initial stock of 20. If the stock falls to 5 or below, the owner orders enough from a central warehouse to replenish the stock to 20. Such replenishment orders are placed only at the end of the day and are received before the store opens the next day. There are no customer back orders, so any customer orders that are not filled immediately are lost. This is called an (s, S) inventory system, where the inventory is replenished to S = 20 whenever it hits level s = 5. The following is a history for this system:

Initial Stock

Customer Order

End Stock

Reorder?

No
Yes

20

10

10

10

3
4

20

11

6
20
14

3
20

6
7

6
8

0
14
6

No
No
Yes

No
No

Lost Orders
0
0
0
0
14
0
0

We see that at the end of days 2 and 5, replenishment orders were made. In particular, on day 5, the store ran out of stock and lost 14 orders as a result. See Exercise 19-6.
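The week's history can be replayed with a few lines of Python; the sketch below applies the (s, S) = (5, 20) rule day by day:

s, S = 5, 20
stock, lost = S, 0

for order in [10, 6, 11, 3, 20, 6, 8]:       # the daily customer orders
    filled = min(order, stock)
    lost += order - filled                   # unfilled orders are lost, not back-ordered
    stock -= filled
    if stock <= s:                           # end-of-day replenishment decision
        stock = S
print(lost)                                  # 14, all lost on day 5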

19-2 GENERATION OF RANDOM VARIABLES

All the examples described in Section 19-1 required random variables to drive the simulation. In Examples 19-1 through 19-3, we needed uniform (0,1) random variables; Examples 19-4 and 19-5 used more-complicated random variables to model customer arrivals, service times, and order quantities. This section discusses methods to generate such random variables automatically. The generation of uniform (0,1) random variables is a good place to start, especially since it turns out that uniform (0,1) generation forms the basis for the generation of all other random variables.

19-2.1 Generating Uniform (0,1) Random Variables

There are a variety of methods for generating uniform (0,1) random variables, among them the following:

1. Sampling from certain physical devices, such as an atomic clock.
2. Looking up predetermined random numbers from a table.
3. Generating pseudorandom numbers (PRNs) from a deterministic algorithm.

The most widely used techniques in practice all employ the latter strategy of generating PRNs from a deterministic algorithm. Although, by definition, PRNs are not truly random, there are many algorithms available that produce PRNs that appear to be perfectly random. Further, these algorithms have the advantages of being computationally fast and repeatable. Speed is a good property to have for the obvious reasons, while repeatability is desirable for experimenters who want to be able to replicate their simulation results when the runs are conducted under identical conditions.

Perhaps the most popular method for obtaining PRNs is the linear congruential generator (LCG). Here, we start with a nonnegative seed integer, X_0, use the seed to generate a sequence of nonnegative integers, X_1, X_2, ..., and then convert the X_i to PRNs, U_1, U_2, .... The algorithm is simple.

1. Specify a nonnegative seed integer, X_0.
2. For i = 1, 2, ..., let X_i = (aX_{i−1} + c) mod (m), where a, c, and m are appropriately chosen integer constants, and where "mod" denotes the modulus function, for example, 17 mod (5) = 2 and −1 mod (5) = 4.
3. For i = 1, 2, ..., let U_i = X_i/m.

Example 19-6

Consider the "toy" generator X_i = (5X_{i−1} + 1) mod (8), with seed X_0 = 0. This produces the integer sequence X_1 = 1, X_2 = 6, X_3 = 7, X_4 = 4, X_5 = 5, X_6 = 2, X_7 = 3, X_8 = 0, whereupon things start repeating, or "cycling." The PRNs corresponding to the sequence starting with seed X_0 = 0 are therefore U_1 = 1/8, U_2 = 6/8, U_3 = 7/8, U_4 = 4/8, U_5 = 5/8, U_6 = 2/8, U_7 = 3/8, U_8 = 0. Since any seed eventually produces all integers 0, 1, ..., 7, we say that this is a full-cycle (or full-period) generator.
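A Python sketch of the LCG algorithm, run on the toy generator of Example 19-6 (the function name is ours):

def lcg(a, c, m, x0, n):
    xs, x = [], x0
    for _ in range(n):
        x = (a * x + c) % m                  # X_i = (a*X_{i-1} + c) mod m
        xs.append(x)
    return xs, [xi / m for xi in xs]         # the integers and the PRNs U_i = X_i/m

ints, prns = lcg(a=5, c=1, m=8, x0=0, n=8)
print(ints)                                  # [1, 6, 7, 4, 5, 2, 3, 0]: full period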

Example 19-7

Not all generators are full period. Consider another "toy" generator X_i = (3X_{i−1} + 1) mod (7), with seed X_0 = 0. This produces the integer sequence X_1 = 1, X_2 = 4, X_3 = 6, X_4 = 5, X_5 = 2, X_6 = 0, whereupon cycling ensues. Further, notice that for this generator, a seed of X_0 = 3 produces the sequence X_1 = 3 = X_2 = X_3 = ···, not very random looking!

The cycle length of the generator from Example 19-7 obviously depends on the seed chosen, which is a disadvantage. Full-period generators, such as that studied in Example 19-6, obviously avoid this problem. A full-period generator with a long cycle length is given in the following example.


Example 19-8

The generator X_i = 16807·X_{i−1} mod (2^31 − 1) is full period. Since c = 0, this generator is termed a multiplicative LCG and must be used with a seed X_0 ≠ 0. This generator is used in many real-world applications and passes most statistical tests for uniformity and randomness. In order to avoid integer overflow and real-arithmetic round-off problems, Bratley, Fox, and Schrage (1987) offer the following Fortran implementation scheme for this algorithm.

      FUNCTION UNIF(IX)
C     Schrage-style update of the seed IX, avoiding integer overflow
      K1 = IX/127773
      IX = 16807*(IX - K1*127773) - K1*2836
      IF (IX.LT.0) IX = IX + 2147483647
      UNIF = IX * 4.656612875E-10
      RETURN
      END

In the above program, we input an integer seed IX and receive a PRN UNIF. The seed IX is automatically updated for the next call. Note that in Fortran, integer division results in truncation, for example, 15/4 = 3; thus K1 is an integer.

19-2.2 Generating Nonuniform Random Variables

The goal now is to generate random variables from distributions other than the uniform. The methods we will use to do so always start with a PRN and then apply an appropriate transformation to the PRN that gives the desired nonuniform random variable. Such nonuniform random variables are important in simulation for a number of reasons: for example, customer arrivals to a service facility often follow a Poisson process; service times may be normal; and routing decisions are usually characterized by Bernoulli random variables.

Inverse Transform Method for Random Variate Generation

The most basic technique for generating random variables from a uniform PRN relies on the remarkable Inverse Transform Theorem.

Theorem 19-1

If X is a random variable with cumulative distribution function (CDF) F(x), then the random variable Y = F(X) has the uniform (0,1) distribution.

Proof

For ease of exposition, suppose that X is a continuous random variable. Then the CDF of Y is

G(y) = P(Y ≤ y)
     = P(F(X) ≤ y)
     = P(X ≤ F^{−1}(y))    (the inverse exists since F(x) is continuous)
     = F(F^{−1}(y))
     = y.

Since G(y) = y is the CDF of the uniform (0,1) distribution, we are done.
With Theorem 19-1 in hand. it is easy to generate certain random variables. All one has
to do is the following:

1_ Find the CDF of X, say F(x).

19-2 Generation of Random Variables

583

2. Set F(X) = U, where U is a uniform (0,1) PRN,


3. Solve for X = F- 1( U).

We illustrate this technique with a series of examples, for both continuous and discrete distributions.

Example 19-9

Here we generate an exponential random variable with rate λ, following the recipe outlined above.

1. The CDF is F(x) = 1 − e^{−λx}.
2. Set F(X) = 1 − e^{−λX} = U.
3. Solving for X, we obtain X = F^{−1}(U) = −[ln(1 − U)]/λ.

Thus, if one supplies a uniform (0,1) PRN U, we see that X = −[ln(1 − U)]/λ is an exponential random variable with parameter λ.

Example 19-10

Now we try to generate a standard normal random variable, call it Z. Using the special notation Φ(·) for the standard normal (0,1) CDF, we set Φ(Z) = U, so that Z = Φ^{−1}(U). Unfortunately, the inverse CDF does not exist in closed form, so one must resort to the use of standard normal tables (or other approximations). For instance, if we have U = 0.72, then Table II (Appendix) yields Z = Φ^{−1}(0.72) = 0.583.

Example 19-11

We can extend the previous example to generate any normal random variable, that is, one with arbitrary mean and variance. This follows easily, since if Z is standard normal, then X = μ + σZ is normal with mean μ and variance σ². For instance, suppose we are interested in generating a normal variate X with mean μ = 3 and variance σ² = 4. Then if, as in the previous example, U = 0.72, we obtain Z = 0.583 and, as a consequence, X = 3 + 2(0.583) = 4.166.

Example 19-12

We can also use the ideas from Theorem 19-1 to generate realizations from discrete random variables. Suppose that the discrete random variable X has probability function

p(x) = 0.3    if x = −1,
     = 0.6    if x = 2.3,
     = 0.1    if x = 7,
     = 0      otherwise.

To generate variates from this distribution, we set up the following table, where F(x) is the associated CDF and U denotes the set of uniform (0,1) PRNs corresponding to each x-value:

x      p(x)    F(x)    U
−1     0.3     0.3     [0, 0.3)
2.3    0.6     0.9     [0.3, 0.9)
7      0.1     1.0     [0.9, 1.0)

To generate a realization of X, we first generate a PRN U and then read the corresponding x-value from the table. For instance, if U = 0.43, then X = 2.3.
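A Python sketch of this table lookup (the function name is ours); it returns 2.3 for U = 0.43, as in the text:

def discrete_variate(U):
    for x, F in [(-1, 0.3), (2.3, 0.9), (7, 1.0)]:   # (value, CDF) pairs
        if U < F:
            return x

print(discrete_variate(0.43))                # 2.3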


Other Random Variate Generation Methods

Although the inverse transform method is intuitively pleasing to use, it may sometimes be difficult to apply in practice. For instance, closed-form expressions for the inverse CDF, F^{−1}(U), might not exist, as is the case for the normal distribution, or application of the method might be unnecessarily tedious. We now present a small potpourri of interesting methods to generate a variety of random variables.

Box-Muller Method

The Box-Muller (1958) method is an exact technique for generating independent and identically distributed (IID) standard normal (0,1) random variables. The appropriate theorem, stated without proof, is

Theorem 19-2

Suppose that U_1 and U_2 are IID uniform (0,1) random variables. Then

Z_1 = √(−2 ln(U_1)) · cos(2πU_2)

and

Z_2 = √(−2 ln(U_1)) · sin(2πU_2)

are IID standard normal random variates.

Note that the sine and cosine evaluations must be carried out in radians.

Example 19-13

Suppose that U_1 = 0.35 and U_2 = 0.65 are two IID PRNs. Using the Box-Muller method to generate two normal (0,1) random variates, we obtain

Z_1 = √(−2 ln(0.35)) · cos(2π(0.65)) = −0.8517,

Z_2 = √(−2 ln(0.35)) · sin(2π(0.65)) = −1.172.
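A Python sketch of the Box-Muller transform that reproduces these two values:

import math

def box_muller(U1, U2):
    r = math.sqrt(-2 * math.log(U1))
    return r * math.cos(2 * math.pi * U2), r * math.sin(2 * math.pi * U2)

Z1, Z2 = box_muller(0.35, 0.65)
print(round(Z1, 4), round(Z2, 3))            # -0.8517 -1.172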

Central Limit Theorem

One can also use the Central Limit Theorem (CLT) to generate "quick-and-dirty" random variables that are approximately normal. Suppose that U_1, U_2, ..., U_n are IID PRNs. Then for large enough n, the CLT says that

[Σ_{i=1}^{n} U_i − E(Σ_{i=1}^{n} U_i)] / √(Var(Σ_{i=1}^{n} U_i)) = [Σ_{i=1}^{n} U_i − n/2] / √(n/12) ≈ N(0,1).

In particular, the choice n = 12 (which turns out to be "large enough") yields the convenient approximation

Σ_{i=1}^{12} U_i − 6 ≈ N(0,1).

Example 19-14

Suppose we have the following PRNs:

0.28  0.87  0.44  0.49  0.10  0.76  0.65  0.98  0.24  0.29  0.77  0.90.

Then

Σ_{i=1}^{12} U_i − 6 = 0.77

is a realization from a distribution that is approximately standard normal.

Convolution

Another popular trick involves the generation of random variables via convolution, the name indicating that some sort of sum is involved.

Example 19-15

Suppose that X_1, X_2, ..., X_n are IID exponential random variables with rate λ. Then Y = Σ_{i=1}^{n} X_i is said to have an Erlang distribution with parameters n and λ. It turns out that this distribution has probability density function

f_Y(y) = λ^n·y^{n−1}·e^{−λy} / (n − 1)!,    y ≥ 0,    (19-4)

which readers may recognize as a special case of the gamma distribution (see Exercise 19-16).

This distribution's CDF is too difficult to invert directly. One way that comes to mind to generate a realization from the Erlang is simply to generate and then add up n IID exponential(λ) random variables. The following scheme is an efficient way to do precisely that. Suppose that U_1, U_2, ..., U_n are IID PRNs. From Example 19-9, we know that X_i = −(1/λ)·ln(1 − U_i), i = 1, 2, ..., n, are IID exponential(λ) random variables. Therefore, we can write

Y = Σ_{i=1}^{n} X_i = −(1/λ)·Σ_{i=1}^{n} ln(1 − U_i) = −(1/λ)·ln[∏_{i=1}^{n} (1 − U_i)].

This implementation is quite efficient, since it requires only one execution of a natural log operation. In fact, we can even do slightly better from an efficiency point of view: simply note that both U_i and (1 − U_i) are uniform (0,1). Then

Y = −(1/λ)·ln[∏_{i=1}^{n} U_i]

is also Erlang.


To illustrate, suppose that we have three IID PRNs at our disposal: U_1 = 0.23, U_2 = 0.97, and U_3 = 0.48. To generate an Erlang realization with parameters n = 3 and λ = 2, we simply take

Y = −(1/2)·ln(U_1·U_2·U_3) = −(1/2)·ln[(0.23)(0.97)(0.48)] = 1.117.
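A Python sketch of the one-logarithm scheme, reproducing the hand computation:

import math

def erlang(n, lam, Us):
    prod = 1.0
    for U in Us[:n]:
        prod *= U                            # multiply the n PRNs together
    return -math.log(prod) / lam             # a single natural log

print(round(erlang(3, 2.0, [0.23, 0.97, 0.48]), 3))   # 1.117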

Acceptance-Rejection

One of the most popular classes of random variate generation procedures proceeds by sampling PRNs until some appropriate "acceptance" criterion is met.

Example 19-16

An easy example of the acceptance-rejection technique involves the generation of a geometric random variable with success probability p. To this end, consider a sequence of PRNs U_1, U_2, .... Our aim is to generate a geometric realization X, that is, one that has probability function

p(x) = (1 − p)^{x−1}·p    if x = 1, 2, ...,
     = 0                  otherwise.

In words, X represents the number of Bernoulli trials until the first success is observed. This English characterization immediately suggests an elementary acceptance-rejection algorithm.

1. Initialize i ← 0.
2. Let i ← i + 1.
3. Take a Bernoulli(p) observation,

Y_i = 1    if U_i < p,
    = 0    otherwise.

4. If Y_i = 1, then we have our first success and we stop, in which case we accept X = i; otherwise, if Y_i = 0, then we reject and go back to step 2.

To illustrate, let us generate a geometric variate having success probability p = 0.3. Suppose we have at our disposal the following PRNs:

0.38  0.67  0.24  0.89  0.10  0.71.

Since U_1 = 0.38 ≥ p, we have Y_1 = 0, and so we reject X = 1. Since U_2 = 0.67 ≥ p, we have Y_2 = 0, and so we reject X = 2. Since U_3 = 0.24 < p, we have Y_3 = 1, and so we accept X = 3.
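A Python sketch of the algorithm (the function name is ours); fed the same PRNs, it accepts X = 3 as in the example:

def geometric(p, Us):
    for i, U in enumerate(Us, start=1):
        if U < p:                            # Bernoulli success: accept X = i
            return i
    return None                              # PRN stream exhausted without a success

print(geometric(0.3, [0.38, 0.67, 0.24, 0.89, 0.10, 0.71]))   # 3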

19-3 OUTPUT ANALYSIS

Simulation output analysis is one of the most important aspects of any proper and complete simulation study. Since the input processes driving a simulation are usually random variables (e.g., interarrival times, service times, and breakdown times), we must also regard the output from the simulation as random. Thus, runs of the simulation only yield estimates of measures of system performance (e.g., the mean customer waiting time). These estimators are themselves random variables and are therefore subject to sampling error, and sampling error must be taken into account to make valid inferences concerning system performance.

The problem is that simulations almost never produce convenient raw output that is IID normal data. For example, consecutive customer waiting times from a queueing system

are not independent; typically, they are serially correlated. If one customer at the post office waits in line a long time, then the next customer is also likely to wait a long time.

are not identically distributed; customers showing up early in the morning might have a much shorter wait than those who show up just before closing time.

are not normally distributed; they are usually skewed to the right (and are certainly never less than zero).

The point is that it is difficult to apply "classical" statistical techniques to the analysis of simulation output. Our purpose here is to give methods to perform statistical analysis of output from discrete-event computer simulations. To facilitate the presentation, we identify two types of simulations with respect to output analysis: terminating and steady state simulations.

1. Terminating (or transient) simulations. Here, the nature of the problem explicitly defines the length of the simulation run. For instance, we might be interested in simulating a bank that closes at a specific time each day.

2. Nonterminating (steady state) simulations. Here, the long-run behavior of the system is studied. Presumably this "steady state" behavior is independent of the simulation's initial conditions. An example is that of a continuously running production line for which the experimenter is interested in some long-run performance measure.

Techniques to analyze output from terminating simulations are based on the method of independent replications, discussed in Section 19-3.1. Additional problems arise for steady state simulations. For instance, we must now worry about the problem of starting the simulation: how should it be initialized at time zero, and how long must it be run before data representative of steady state can be collected? Initialization problems are considered in Section 19-3.2. Finally, Section 19-3.3 deals with point and confidence interval estimation for steady state simulation performance parameters.

19-3.1 Terminating Simulation Analysis

Here we are interested in simulating some system of interest over a finite time horizon. For now, assume we obtain discrete simulation output Y_1, Y_2, ..., Y_m, where the number of observations m can be a constant or a random variable. For example, the experimenter can specify the number m of customer waiting times Y_1, Y_2, ..., Y_m to be taken from a queueing simulation, or m could denote the random number of customers observed during a specified time period [0, T].

Alternatively, we might observe continuous simulation output {Y(t) | 0 ≤ t ≤ T} over a specified interval [0, T]. For instance, if we are interested in estimating the time-averaged number of customers waiting in a queue during [0, T], the quantity Y(t) would be the number of customers in the queue at time t.

The easiest goal is to estimate the expected value of the sample mean of the observations,

θ ≡ E[Ȳ_m],

where the sample mean in the discrete case is

Ȳ_m = (1/m)·Σ_{i=1}^{m} Y_i


(with a similar expression for the continuous case). For instance, we might be interested in estimating the expected average waiting time of all customers at a shopping center during the period 10 a.m. to 2 p.m.

Although Ȳ_m is an unbiased estimator for θ, a proper statistical analysis requires that we also provide an estimate of Var(Ȳ_m). Since the Y_i are not necessarily IID random variables, it may be that Var(Ȳ_m) ≠ Var(Y_i)/m for any i, a case not covered in elementary statistics courses. For this reason, the familiar sample variance

S² = [1/(m − 1)]·Σ_{i=1}^{m} (Y_i − Ȳ_m)²

is likely to be highly biased as an estimator of m·Var(Ȳ_m). Thus, one should not use S²/m to estimate Var(Ȳ_m).

The way around the problem is via the method of independent replications (IR). IR estimates Var(Ȳ_m) by conducting b independent simulation runs (replications) of the system under study, where each replication consists of m observations. It is easy to make the replications independent: simply reinitialize each replication with a different pseudorandom number seed.

To proceed, denote the sample mean from replication i by

Z_i = (1/m)·Σ_{j=1}^{m} Y_{i,j},

where Y_{i,j} is observation j from replication i, for i = 1, 2, ..., b and j = 1, 2, ..., m.


If each run is started under the same operating conditions (e.g., all queues empty and idle), then the replication sample means Z_1, Z_2, ..., Z_b are IID random variables. Then the obvious point estimator for Var(Ȳ_m) = Var(Z_i) is

V̂_R ≡ [1/(b − 1)]·Σ_{i=1}^{b} (Z_i − Z̄_b)²,

where the grand mean is defined as

Z̄_b ≡ (1/b)·Σ_{i=1}^{b} Z_i.

Notice how closely the forms of V̂_R and S²/m resemble each other. But since the replicate sample means are IID, V̂_R is usually much less biased for Var(Ȳ_m) than is S²/m.

In light of the above discussion, we see that V̂_R/b is a reasonable estimator for Var(Z̄_b). Further, if the number of observations per replication, m, is large enough, the Central Limit Theorem tells us that the replicate sample means are approximately IID normal. Then basic statistics (Chapter 10) yields an approximate 100(1 − α)% two-sided confidence interval (CI) for θ:

Z̄_b ± t_{α/2,b−1}·√(V̂_R/b),    (19-5)

where t_{α/2,b−1} is the 1 − α/2 percentage point of the t distribution with b − 1 degrees of freedom.


Suppose we want to estimate the expected average waiting time for the first 5000 customers in a certain queueing system. We will make five independent replications of the system, with each run initialized empty and idle and consisting of 5000 waiting times. The resulting replicate means are

i      1     2     3     4     5
Z_i    3.2   4.3   5.1   4.2   4.6

Then Z̄_5 = 4.28 and V̂_R = 0.487. For level α = 0.05, we have t_{0.025,4} = 2.776, and equation 19-5 gives [3.41, 5.15] as a 95% CI for the expected average waiting time for the first 5000 customers.
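The arithmetic of this example is easily checked in Python; in this sketch the t quantile is hard-coded rather than looked up:

import math

Z = [3.2, 4.3, 5.1, 4.2, 4.6]                # the five replicate means
b = len(Z)
Zbar = sum(Z) / b
VR = sum((z - Zbar)**2 for z in Z) / (b - 1)
half = 2.776 * math.sqrt(VR / b)             # t_{0.025,4} = 2.776
print(Zbar, round(VR, 3), (round(Zbar - half, 2), round(Zbar + half, 2)))
# 4.28 0.487 (3.41, 5.15)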

Independent replications can be used to calculate variance estimates for statistics other than sample means. The method can then be used to obtain CIs for quantities other than E[Ȳ_m], for example, quantiles. See any of the standard simulation texts for additional uses of independent replications.

19-3.2 Initialization Problems

Before a simulation can be run, one must provide initial values for all of the simulation's state variables. Since the experimenter may not know what initial values are appropriate for the state variables, these values might be chosen somewhat arbitrarily. For instance, we might decide that it is "most convenient" to initialize a queue as empty and idle. Such a choice of initial conditions can have a significant but unrecognized impact on the simulation run's outcome. Thus, the initialization bias problem can lead to errors, particularly in steady state output analysis.

Some examples of problems concerning simulation initialization are as follows.

1. Visual detection of initialization effects is sometimes difficult, especially in the case of stochastic processes having high intrinsic variance, such as queueing systems.

2. How should the simulation be initialized? Suppose that a machine shop closes at a certain time each day, even if there are jobs waiting to be served. One must therefore be careful to start each day with a demand that depends on the number of jobs remaining from the previous day.

3. Initialization bias can lead to point estimators for steady state parameters having high mean squared error, as well as to CIs having poor coverage.

Since initialization bias raises important concerns, how do we detect and deal with it? We first list methods to detect it.

1. Attempt to detect the bias visually by scanning a realization of the simulated process. This might not be easy, since visual analysis can miss bias. Further, a visual scan can be tedious. To make the visual analysis more efficient, one might transform the data (e.g., take logs or square roots), smooth it, average it across several independent replications, or construct moving average plots.


2. Conduct statistica.l testsJor initialization bias, Kelton and Law (1983) give anintu-

itively appealing sequential procedure to detect bias. Various omer tests check to see
whether the initial portion of the simulation output contains mOre variation than lat~
ter portions,

If initialization bias is detected, one may want to do something about it. There are tv/o
simple methods for dealing with bias. One is to truncate the output by allowing the simu~
!ation to '''warm up" before data are retained for analysis. The experimenter would then
hope that the remaining data are representative of the steady state system. Output truncation
is probably the most popular method for dealing with initialization bias, and all of the
major simulation languages have built-in truncation functions. But how can one find a good
truncation point? If the output is truncated "too early," significant bias might still exist in the
remaining data. If it is truncated "too late," then good observations might be wasted. Unfortunately, all simple rules to determine tr..mcation pomts do not perfort:l well in general, A
common practice lS to average observations across several replications and then visually
choose a truncation point hased on the averaged run. See Welch (1983) for a good
visual/graphical approach,
The second method is to make a very long run to overwhelm the effects of initialization bias. This method of bias control is conceptually simple to carry out and may yield point estimators having lower mean squared errors than the analogous estimators from truncated data (see, e.g., Fishman 1978). However, a problem with this approach is that it can be wasteful with observations; for some systems, an excessive run length might be required before the initialization effects are rendered negligible.
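To make the truncation idea concrete, here is a minimal sketch (not from the text) of the averaged-run approach just described. The queueing recursion, its rates, the replication count, and the cutoff w = 200 are all illustrative assumptions; in practice one would choose w by inspecting a plot of the averaged run.

```python
import random

def simulate_waits(n, seed):
    """Illustrative Lindley recursion for successive customer waiting times
    in a single-server queue started empty (a biased initial condition)."""
    rng = random.Random(seed)
    waits, w = [], 0.0
    for _ in range(n):
        # W_{k+1} = max(0, W_k + S_k - A_{k+1}); the rates 1.2 and 1.0 are
        # arbitrary choices giving a stable queue.
        w = max(0.0, w + rng.expovariate(1.2) - rng.expovariate(1.0))
        waits.append(w)
    return waits

# Average 10 independent replications point by point (Welch-style plot data).
reps = [simulate_waits(1000, seed=r) for r in range(10)]
avg_run = [sum(rep[k] for rep in reps) / len(reps) for k in range(1000)]

w = 200  # assumed warm-up cutoff; normally chosen by eyeballing avg_run
print("mean, all data:      ", sum(avg_run) / len(avg_run))
print("mean, truncated data:", sum(avg_run[w:]) / len(avg_run[w:]))
```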

19-3.3 Steady State Simulation Analysis


Now assume that we have on hand stationary (steady state) simulation output, Y_1, Y_2, ..., Y_n. Our goal is to estimate some parameter of interest, possibly the mean customer waiting time or the expected profit produced by a certain factory configuration. As in the case of terminating simulations, a good statistical analysis must accompany the value of any point estimator with a measure of its variance.

A number of methodologies have been proposed in the literature for conducting steady state output analysis: batch means, independent replications, standardized time series, spectral analysis, regeneration, time series modeling, as well as a host of others. We will examine the two most popular: batch means and independent replications. (Recall: As discussed earlier, confidence intervals for terminating simulations usually use independent replications.)

Batch Means

The method of batch means is often used to estimate Var(\bar{Y}_n) or to calculate CIs for the steady state process mean μ. The idea is to divide one long simulation run into a number of contiguous batches, and then to appeal to a Central Limit Theorem to assume that the resulting batch sample means are approximately IID normal.

In particular, suppose that we partition Y_1, Y_2, ..., Y_n into b nonoverlapping, contiguous batches, each consisting of m observations (assume that n = bm). Thus, the ith batch, i = 1, 2, ..., b, consists of the random variables

Y_{(i-1)m+1}, Y_{(i-1)m+2}, \ldots, Y_{im}.

The ith batch mean is simply the sample mean of the m observations from batch i, i = 1, 2, ..., b,

Z_i = \frac{1}{m} \sum_{j=1}^{m} Y_{(i-1)m+j}.


Similar to independent replications, we define the batch means estimator for Var(Z_i) as

\hat{V}_B = \frac{1}{b-1} \sum_{i=1}^{b} (Z_i - \bar{Z}_b)^2,

where

\bar{Y}_n = \bar{Z}_b = \frac{1}{b} \sum_{i=1}^{b} Z_i

is the grand sample mean. If m is large, then the batch means are approximately IID normal, and (as for IR) we obtain an approximate 100(1 − α)% CI for μ,

\mu \in \bar{Z}_b \pm t_{\alpha/2,\,b-1} \sqrt{\hat{V}_B / b}.

This equation is very similar to equation 19-5. Of course, the difference here is that batch means divides one long run into a number of batches, whereas independent replications uses a number of independent shorter runs. Indeed, consider the old IR example from Section 19-3.1 with the understanding that the Z_i must now be regarded as batch means (instead of replicate means); then the same numbers carry through the example.

The technique of batch means is intuitively appealing and easy to understand. But problems can come up if the Y_j are not stationary (e.g., if significant initialization bias is present), if the batch means are not normal, or if the batch means are not independent. If any of these assumption violations exist, poor confidence interval coverage may result, unbeknownst to the analyst.

To ameliorate the initialization bias problem, the user can truncate some of the data or make a long run, as discussed in Section 19-3.2. In addition, the lack of independence or normality of the batch means can be countered by increasing the batch size m.
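A minimal sketch of the batch-means CI just derived (illustrative; it assumes the SciPy library is available for the t quantile):

```python
import statistics
from scipy.stats import t  # assumed dependency, used only for t quantiles

def batch_means_ci(y, b, alpha=0.05):
    """100(1 - alpha)% CI for the steady-state mean via batch means."""
    m = len(y) // b                     # batch size (n = b*m; extras dropped)
    z = [statistics.fmean(y[i*m:(i+1)*m]) for i in range(b)]  # batch means Z_i
    zbar = statistics.fmean(z)          # grand sample mean
    vb = statistics.variance(z)         # (1/(b-1)) * sum (Z_i - Zbar)^2
    hw = t.ppf(1 - alpha/2, b - 1) * (vb / b) ** 0.5
    return zbar - hw, zbar + hw

# e.g., batch means 100, 80, 90, 110, 120 give roughly (80.4, 119.6)
print(batch_means_ci([100, 80, 90, 110, 120], b=5))
```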

Independent Replications

Of the difficulties encountered when using batch means, the possibility of correlation among the batch means might be the most troublesome. This problem is explicitly avoided by the method of IR, described in the context of terminating simulations in Section 19-3.1. In fact, the replicate means are independent by their construction. Unfortunately, since each of the b replications has to be started properly, initialization bias presents more trouble when using IR than when using batch means. The usual recommendation, in the context of steady state analysis, is to use batch means over IR because of the possible initialization bias in each of the replications.

19-4 COMPARISON OF SYSTEMS

One of the most important uses of simulation output analysis regards the comparison of competing systems or alternative system configurations. For example, suppose we wish to evaluate two different "restart" strategies that an airline can invoke following a major traffic disruption, such as a snowstorm in the Northeast. Which policy minimizes a certain cost function associated with the restart? Simulation is uniquely equipped to help the experimenter conduct this type of comparison analysis.

There are many techniques available for comparing systems, among them (i) classical statistical CIs, (ii) common random numbers, (iii) antithetic variates, and (iv) ranking and selection procedures.


19-4.1 Classical Confidence Intervals

With our airline example in mind, let Z_{i,j} be the cost from the jth simulation replication of strategy i, i = 1, 2; j = 1, 2, ..., b_i. Assume that Z_{i,1}, Z_{i,2}, ..., Z_{i,b_i} are IID normal with unknown mean μ_i and unknown variance, i = 1, 2, an assumption that can be justified by arguing that we can do the following:

1. Get independent data by controlling the random numbers between replications.

2. Get identically distributed costs between replications by performing the replications under identical conditions.

3. Get approximately normal data by adding up (or averaging) many subcosts to obtain overall costs for both strategies.

The goal here is to calculate a 100(1 − α)% CI for the difference μ_1 − μ_2. To this end, suppose that the Z_{1,j} are independent of the Z_{2,j} and define

\bar{Z}_i = \frac{1}{b_i} \sum_{j=1}^{b_i} Z_{i,j}, \quad i = 1, 2,

and

S_i^2 = \frac{1}{b_i - 1} \sum_{j=1}^{b_i} (Z_{i,j} - \bar{Z}_i)^2, \quad i = 1, 2.

An approximate 100(1 − α)% CI is

\mu_1 - \mu_2 \in \bar{Z}_1 - \bar{Z}_2 \pm t_{\alpha/2,\,\nu} \sqrt{S_1^2/b_1 + S_2^2/b_2},

where the approximate degrees of freedom ν (a function of the sample variances) is given in Chapter 10.

Suppose (as in the airline example) that small cost is good. If the interval lies entirely to the left [right] of zero, then system 1 [2] is better; if the interval contains zero, then the two systems must be regarded, in a statistical sense, as about the same.
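A small sketch of this two-sample comparison (illustrative; it assumes SciPy, and uses the standard Welch approximation for the degrees of freedom ν referenced in Chapter 10; the replication costs are made up):

```python
from math import sqrt
from scipy.stats import t  # assumed dependency for t quantiles

def two_sample_ci(z1, z2, alpha=0.05):
    """Approximate 100(1-alpha)% CI for mu_1 - mu_2, unequal variances."""
    b1, b2 = len(z1), len(z2)
    m1, m2 = sum(z1) / b1, sum(z2) / b2
    s1 = sum((z - m1) ** 2 for z in z1) / (b1 - 1)  # S_1^2
    s2 = sum((z - m2) ** 2 for z in z2) / (b2 - 1)  # S_2^2
    se2 = s1 / b1 + s2 / b2
    # Welch's approximate degrees of freedom
    nu = se2 ** 2 / ((s1 / b1) ** 2 / (b1 - 1) + (s2 / b2) ** 2 / (b2 - 1))
    hw = t.ppf(1 - alpha / 2, nu) * sqrt(se2)
    return (m1 - m2) - hw, (m1 - m2) + hw

# Hypothetical replication costs for two restart strategies; this interval
# lies entirely to the left of zero, so strategy 1 is the cheaper one here.
print(two_sample_ci([10.2, 9.8, 11.1, 10.5], [12.0, 11.4, 12.7, 11.9]))
```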
An alternative classical strategy is to use a CI that is analogous to a paired t-test. Here we take b replications from both strategies and set the differences D_j = Z_{1,j} − Z_{2,j} for j = 1, 2, ..., b. Then we calculate the sample mean and variance of the differences:

\bar{D}_b = \frac{1}{b} \sum_{j=1}^{b} D_j

and

S_D^2 = \frac{1}{b-1} \sum_{j=1}^{b} (D_j - \bar{D}_b)^2.

The resulting 100(1 − α)% CI is

\mu_1 - \mu_2 \in \bar{D}_b \pm t_{\alpha/2,\,b-1} \sqrt{S_D^2 / b}.

These paired-t intervals are very efficient if Corr(Z_{1,j}, Z_{2,j}) > 0, j = 1, 2, ..., b (where we still assume that Z_{1,1}, Z_{1,2}, ..., Z_{1,b} are IID and Z_{2,1}, Z_{2,2}, ..., Z_{2,b} are IID). In that case, it turns out that

V(\bar{D}_b) < \frac{V(Z_{1,j}) + V(Z_{2,j})}{b}.

If Z_{1,j} and Z_{2,j} had been simulated independently, then we would have equality in the above expression. Thus, the trick may result in a relatively small S_D^2 and, hence, small CI length. So how do we invoke the trick?


19-4.2 Common Random Numbers

The idea behind the above trick is to use common random numbers, that is, to use the same pseudorandom numbers in exactly the same ways for corresponding runs of each of the competing systems. For example, we might use precisely the same customer arrival times when simulating different proposed configurations of a job shop. By subjecting the alternative systems to identical experimental conditions, we hope to make it easy to distinguish which systems are best even though the respective estimators are subject to sampling error.

Consider the case in which we compare two queueing systems, A and B, on the basis of their expected customer transit times, θ_A and θ_B, where the smaller θ-value corresponds to the better system. Suppose we have estimators θ̂_A and θ̂_B for θ_A and θ_B, respectively. We will declare A as the better system if θ̂_A < θ̂_B. If θ̂_A and θ̂_B are simulated independently, then the variance of their difference,

V(\hat{\theta}_A - \hat{\theta}_B) = V(\hat{\theta}_A) + V(\hat{\theta}_B),

could be very large, in which case our declaration might lack conviction. If we could reduce V(θ̂_A − θ̂_B), then we could be much more confident about our declaration.

CRN sometimes induces a high positive correlation between the point estimators θ̂_A and θ̂_B. Then we have

V(\hat{\theta}_A - \hat{\theta}_B) = V(\hat{\theta}_A) + V(\hat{\theta}_B) - 2\,\mathrm{Cov}(\hat{\theta}_A, \hat{\theta}_B) < V(\hat{\theta}_A) + V(\hat{\theta}_B),

and we obtain a savings in variance.
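The variance reduction is easy to see in a toy experiment (a sketch, not an example from the text; the performance functions g_A and g_B are invented stand-ins for two simulated systems driven by the same uniform random number):

```python
import random, statistics

# Hypothetical "transit time" models for systems A and B, both monotone in u,
# so g_A(U) and g_B(U) are positively correlated when fed a common U.
g_A = lambda u: 10 * u ** 2
g_B = lambda u: 9 * u ** 2 + 0.5

rng = random.Random(1)
indep, crn = [], []
for _ in range(2000):
    u1, u2 = rng.random(), rng.random()
    indep.append(g_A(u1) - g_B(u2))   # independent random numbers
    u = rng.random()
    crn.append(g_A(u) - g_B(u))       # common random number for both systems

print("var of difference, independent:", statistics.variance(indep))
print("var of difference, CRN:        ", statistics.variance(crn))
# The CRN variance is far smaller, mirroring the covariance identity above.
```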

19-4.3 Antithetic Random Numbers

Alternatively, if we can induce negative correlation between two unbiased estimators θ̂_1 and θ̂_2 for some parameter θ, then the unbiased estimator (θ̂_1 + θ̂_2)/2 might have low variance.

Most simulation texts give advice on how to run the simulations of the competing systems so as to induce positive or negative correlation between them. The consensus is that if conducted properly, common and antithetic random numbers can lead to tremendous variance reductions.
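For instance, here is a minimal antithetic-variates sketch (an invented illustration, not from the text) for estimating θ = E[e^U] = e − 1 by Monte Carlo: pairing each U with 1 − U yields two negatively correlated unbiased estimators whose average is noticeably less variable than an average of two independent draws.

```python
import math, random, statistics

rng = random.Random(7)
plain, antithetic = [], []
for _ in range(5000):
    u1, u2 = rng.random(), rng.random()
    plain.append((math.exp(u1) + math.exp(u2)) / 2)         # two IID draws
    u = rng.random()
    antithetic.append((math.exp(u) + math.exp(1 - u)) / 2)  # U paired with 1-U

print("true value e - 1 =", math.e - 1)
print("plain:      mean", statistics.fmean(plain),
      " var", statistics.variance(plain))
print("antithetic: mean", statistics.fmean(antithetic),
      " var", statistics.variance(antithetic))
```

Both estimators are unbiased; the antithetic version wins because Cov(e^U, e^{1-U}) < 0.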

19-4.4 Selecting the Best System

Ranking, selection, and multiple comparisons methods form another class of statistical techniques used to compare alternative systems. Here, the experimenter is interested in selecting the best of a number of competing processes. Typically, one specifies the desired probability of correctly selecting the best process, especially if the best process is significantly better than its competitors. These methods are simple to use, fairly general, and intuitively appealing. See Bechhofer, Santner, and Goldsman (1995) for a synopsis of the most popular procedures.

19-5 SUMMARY

This chapter began with some simple motivational examples illustrating various simulation concepts. After this, the discussion turned to the generation of pseudorandom numbers, that is, numbers that appear to be IID uniform (0,1). PRNs are important because they drive the generation of a number of other important random variables, for example normal, exponential, and Erlang. We also spent a great deal of discussion on simulation output analysis; simulation output is almost never IID, so special care must be taken if we are to make statistically valid conclusions about the simulation's results. We concentrated on output analysis for both terminating and steady state simulations.

19-6 EXERCISES
19-1. Extension of Example 19-1.
(a) Flip a coin 100 times. How many heads do you observe?
(b) How many times do you observe two heads in a row? Three in a row? Four? Five?
(c) Find 10 friends and repeat (a) and (b) based on a total of 1000 flips.
(d) Now simulate coin flips via a spreadsheet program. Flip the simulated coin 10,000 times and answer (a) and (b).
19-2. Extension of Example 19-2. Throw n darts randomly at a unit square containing an inscribed circle. Use the results of your tosses to estimate π. Let n = 2^k for k = 1, 2, ..., 15, and graph your estimates as a function of k.
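A possible sketch of the dart-throwing estimator in this exercise (the seed is an arbitrary assumption, and plotting is left out):

```python
import random

rng = random.Random(0)

def pi_estimate(n):
    """Fraction of n random darts inside the inscribed circle, times 4."""
    hits = sum(1 for _ in range(n)
               if (rng.random() - 0.5) ** 2 + (rng.random() - 0.5) ** 2 <= 0.25)
    return 4 * hits / n  # P(hit) = pi/4 for a circle inscribed in the square

for k in range(1, 16):
    print(k, pi_estimate(2 ** k))
```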
19-3. Extension of Example 19-3. Show that the estimator defined in equation 19-3 is unbiased for the integral I defined in equation 19-2.

19-4. Other extensions of Example 19-3.
(a) Use Monte Carlo integration with n = 10 observations to estimate

\int_0^2 \frac{1}{\sqrt{2\pi}} e^{-x^2/2} \, dx.

Now use n = 1000. Compare to the answer that you can obtain via normal tables.
(b) What would you do if you had to estimate

\int_0^{10} \frac{1}{\sqrt{2\pi}} e^{-x^2/2} \, dx?

(c) Use Monte Carlo integration with n = 10 observations to estimate \int_0^1 \cos(2\pi x) \, dx. Now use n = 1000. Compare to the actual answer.

19-5. Extension of Example 19-4. Suppose that 10 customers arrive at a post office at the following times:

3 4 6 7 13 14 20 25 28 30

Upon arrival, customers queue up in front of a single clerk and are processed in a first-come-first-served manner. The service times corresponding to the arriving customers are as follows:

6.0 5.5 4.0 1.0 2.5 2.0 2.0 2.5 4.0 2.5

Assume that the post office opens at time 0 and closes its doors at time 30 (just after customer 10 arrives), serving any remaining customers.
(a) When does the last customer finally leave the system?
(b) What is the average waiting time for the 10 customers?
(c) What is the maximum number of customers in the system? When is this maximum achieved?
(d) What is the average number of customers in line during the first 30 minutes?
(e) Now repeat parts (a)-(d) assuming that the services are performed last-in-first-out.

19-6. Repeat Example 19-5, which deals with an (s, S) inventory policy, except now use order level S = 6.
19-7. Consider the pseudorandom number generator X_i = (5 X_{i-1} + 1) mod 16, with seed X_0 = 0.
(a) Calculate X_1 and X_2, along with the corresponding PRNs U_1 and U_2.
(b) Is this a full-period generator?
(c) What is X_{350}?
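A quick sketch of this generator (assuming the usual convention U_i = X_i / 16 for converting states to PRNs):

```python
def lcg(n, x=0):
    """First n states of X_i = (5*X_{i-1} + 1) mod 16 from seed x."""
    states = []
    for _ in range(n):
        x = (5 * x + 1) % 16
        states.append(x)
    return states

xs = lcg(16)
print(xs)                        # visits all 16 residues, i.e., full period
print([x / 16 for x in xs[:2]])  # U_1 and U_2
```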
19-8. Consider the "recommended" pseudorandom number generator X_i = 16807 X_{i-1} mod (2^{31} − 1), with seed X_0 = 1234567.
(a) Calculate X_1 and X_2, along with the corresponding PRNs U_1 and U_2.
(b) What is X_{100,000}?
19-9. Show how to use the inverse transform method to generate an exponential random variable with rate λ = 2. Demonstrate your technique using the PRN U = 0.75.
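For reference, a minimal inverse-transform sketch for this setup: solving U = F(x) = 1 − e^{−λx} gives X = −ln(1 − U)/λ.

```python
import math

def exp_inverse_transform(u, lam):
    """Inverse transform for an exponential random variable with rate lam."""
    return -math.log(1 - u) / lam

print(exp_inverse_transform(0.75, 2))  # -ln(0.25)/2, about 0.693
```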

19-10. Consider the inverse transform method to generate a standard normal (0, 1) random variable.
(a) Demonstrate your technique using the PRN U = 0.25.
(b) Using your answer in (a), generate a N(0, 9) random variable.
19-11. Suppose that X has probability density function

f(x) = |x| / 4, −2 < x < 2.

(a) Develop an inverse transform technique to generate a realization of X.
(b) Demonstrate your technique using U = 0.6.
(c) Sketch out f(x) and see if you can come up with another method to generate X.
19-12. Suppose that the discrete random variable X has probability function

p(x) = 0.35, if x = −2.5,
       0.25, if x = 1.0,
       0.40, if x = 10.5,
       0,    otherwise.

As in Example 19-2, set up a table to generate realizations from this distribution. Illustrate your technique with the PRN U = 0.86.
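A table-lookup sketch for this kind of discrete generator (illustrative; it walks the cumulative probabilities and returns the first value whose CDF reaches U):

```python
def discrete_inverse_transform(u, pmf):
    """pmf: list of (x, p) pairs; returns the x whose cumulative bin holds u."""
    cum = 0.0
    for x, p in pmf:
        cum += p
        if u <= cum:
            return x
    raise ValueError("u must lie in (0, 1]")

pmf = [(-2.5, 0.35), (1.0, 0.25), (10.5, 0.40)]
# Cumulative probabilities are 0.35, 0.60, 1.00, so U = 0.86 maps to 10.5.
print(discrete_inverse_transform(0.86, pmf))
```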
19-13. The Weibull (α, β) distribution, popular in reliability theory and other applied statistics disciplines, has CDF

F(x) = 1 − e^{−(x/β)^α} if x > 0, and F(x) = 0 otherwise.

(a) Show how to use the inverse transform method to generate a realization from the Weibull distribution.
(b) Demonstrate your technique for a Weibull (1.5, 2.0) random variable using the PRN U = 0.66.


19-22. The yearly unemployment rates for Andorra during the past 15 years are as follows:

6.9 9.9 8.3 9.2 8.8 11.4 11.8 12.1 10.6 11.0 12.3 13.9 9.2 8.2 8.9

Use the method of batch means on the above data to obtain a two-sided 95% confidence interval for the mean unemployment. Use five batches, each consisting of three years' data.
19-23. Suppose that we are interested in steady state confidence intervals for the mean of simulation output X_1, X_2, ..., X_{10000}. (You can pretend that these are waiting times.) We have conveniently divided the run up into five batches, each of size 2000; suppose that the resulting batch means are as follows:

100 80 90 110 120

Use the method of batch means on the above data to obtain a two-sided 95% confidence interval for the mean.

19-14. Suppose that U_1 = 0.45 and U_2 = 0.12 are two IID PRNs. Use the Box-Muller method to generate two N(0, 1) variates.
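A minimal Box-Muller sketch usable for this exercise (and for Exercise 19-29 below):

```python
import math

def box_muller(u1, u2):
    """Turn two IID uniform(0,1) PRNs into two independent N(0,1) variates."""
    r = math.sqrt(-2 * math.log(u1))
    return r * math.cos(2 * math.pi * u2), r * math.sin(2 * math.pi * u2)

print(box_muller(0.45, 0.12))
```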

19-15. Consider the following PRNs:

0.88 0.87 0.33 0.69 0.20 0.79 0.21 0.96 0.11 0.42 0.91 0.70

Use the Central Limit Theorem method to generate a realization that is approximately standard normal.
19-16. Prove equation 19-4 from the text. This shows that the sum of n IID exponential random variables is Erlang. Hint: Find the moment-generating function of Y, and compare it to that of the gamma distribution.


19-17. Using two PRNs, U_1 = 0.73 and U_2 = 0.11, generate a realization from an Erlang distribution with n = 2 and λ = 3.

19-18. Suppose that U_1, U_2, ..., U_n are PRNs.
(a) Suggest an easy inverse transform method to generate a sequence of IID Bernoulli random variables, each with success parameter p.
(b) Show how to use your answer to (a) to generate a binomial random variate with parameters n and p.
19-19. Use the acceptance-rejection technique to generate a geometric random variable with success probability 0.25. Use as many of the PRNs from Exercise 19-15 as necessary.

19-20. Suppose that \bar{Z}_1 = 3, \bar{Z}_2 = 5, and \bar{Z}_3 = 4 are three batch means resulting from a long simulation run. Find a 90% two-sided confidence interval for the mean.

19-21. Suppose that μ ∈ [−2.5, 3.5] is a 90% confidence interval for the mean cost incurred by a certain inventory policy. Further suppose that this interval was based on five independent replications of the underlying inventory system. Unfortunately, the boss has decided that she wants a 95% confidence interval. Can you supply it?
19-24. The yearly total snowfall figures for Siberacuse, NY, during the past 15 years are as follows:

100 103 88 72 98 121 106 110 99 162 123 139 92 142 169

(a) Use the method of batch means on the above data to obtain a two-sided 95% confidence interval for the mean yearly snowfall. Use five batches, each consisting of three years' data.
(b) The corresponding yearly total snowfall figures for Buffoonalo, NY (which is down the road from Siberacuse), are as follows:

90 95 72 68 95 110 112 144 110 123 81 130 145 90 75

How does Buffoonalo's snowfall compare to Siberacuse's? Just give an eyeball answer.
(c) Now find a 95% confidence interval for the difference in means between the two cities. Hint: Think common random numbers.
19-25. Antithetic variates. Suppose that X_1, X_2, ..., X_n are IID with mean μ and variance σ². Further suppose that Y_1, Y_2, ..., Y_n are also IID with mean μ and variance σ². The trick here is that we will also assume that Cov(X_i, Y_i) < 0 for all i. So, in other words, the observations within one of the two sequences are IID, but they are negatively correlated between sequences.
(a) Here is an example showing how we can end up with the above scenario using simulations. Let X_i = −ln(U_i) and Y_i = −ln(1 − U_i), where the U_i are the usual IID uniform (0,1) random variables.
   i. What is the distribution of X_i? Of Y_i?
   ii. What is Cov(U_i, 1 − U_i)?
   iii. Would you expect that Cov(X_i, Y_i) < 0? Answer: Yes.
(b) Let \bar{X}_n and \bar{Y}_n denote the sample means of the X_i and Y_i, respectively, each based on n observations. Without actually calculating Cov(X_i, Y_i), state how V((\bar{X}_n + \bar{Y}_n)/2) compares to V(\bar{X}_{2n}). In other words, should we do two negatively correlated runs, each consisting of n observations, or just one run consisting of 2n observations?
(c) What if you tried to use this trick when using Monte Carlo simulation to estimate \int_0^1 \sin(\pi x) \, dx?

19-26. Another variance reduction technique. Suppose that our goal is to estimate the mean μ of some steady state simulation output process, X_1, X_2, ..., X_n. Suppose we somehow know the expected value of some other RV Y, and we also know that Cov(\bar{X}, Y) > 0, where \bar{X} is the sample mean. Obviously, \bar{X} is the "usual" estimator for μ. Let us look at another estimator for μ, namely, the control-variate estimator,

C = \bar{X} - k[Y - E(Y)],

where k is some constant.
(a) Show that C is unbiased for μ.
(b) Find an expression for V(C). Comments?
(c) Minimize V(C) with respect to k.

19-27. A miscellaneous computer exercise. Make a histogram of X_i = −ln(U_i), for i = 1, 2, ..., 20,000, where the U_i are IID uniform (0,1). What kind of distribution does it look like?

19-28. Another miscellaneous computer exercise. Let us see if the Central Limit Theorem works. In Exercise 19-27, you generated 20,000 exponential(1) observations. Now form 1000 averages of 20 observations each from the original 20,000. More precisely, let

Y_j = \frac{1}{20} \sum_{i=1}^{20} X_{20(j-1)+i}, \quad j = 1, 2, \ldots, 1000.

Make a histogram of the Y_j. Do they look approximately normal?

19-29. Yet another miscellaneous computer exercise. Let us generate some normal observations via the Box-Muller method. To do so, first generate 1000 pairs of IID uniform (0,1) random numbers, (U_{1,1}, U_{2,1}), (U_{1,2}, U_{2,2}), ..., (U_{1,1000}, U_{2,1000}). Set

X_i = \sqrt{-2 \ln(U_{1,i})} \cos(2\pi U_{2,i})

and

Y_i = \sqrt{-2 \ln(U_{1,i})} \sin(2\pi U_{2,i})

for i = 1, 2, ..., 1000. Make a histogram of the resulting X_i. (The X_i's are N(0,1).) Now graph X_i vs. Y_i. Any comments?

Appendix

Table I Cumulative Poisson Distribution
Table II Cumulative Standard Normal Distribution
Table III Percentage Points of the χ² Distribution
Table IV Percentage Points of the t Distribution
Table V Percentage Points of the F Distribution
Chart VI Operating Characteristic Curves
Chart VII Operating Characteristic Curves for the Fixed-Effects Model Analysis of Variance
Chart VIII Operating Characteristic Curves for the Random-Effects Model Analysis of Variance
Table IX Critical Values for the Wilcoxon Two-Sample Test
Table X Critical Values for the Sign Test
Table XI Critical Values for the Wilcoxon Signed Rank Test
Table XII Percentage Points of the Studentized Range Statistic
Table XIII Factors for Quality-Control Charts
Table XIV k Values for One-Sided and Two-Sided Tolerance Intervals
Table XV Random Numbers

598

Appendix

Table! Cumulative Poisson Distribution~


C~N

om

0,05

0,10

0,20

0,30

0,40

0,50

0,60

0,990

0,951

0,904

0,818

0,7"'l

0.670

0,606

0.548

0.999

0.998
0,999

0.995
0,999

0,982
0,998
0.999

0.963
0,996
0.999
0,999

0.938
0,992
0.999
0,999

0.909
0.985
0.998
0,999
0.999

0.878

2
3
4

5
c:=

0,976
0.996
0,999
0.999

iJ.

0,70

0,80

0,90

LOO

LlO

1.20

1.30

lAO

0.496

0,449

0.406

0,367

0.332

0.301

0.272

0.246

1
2
3
4

0.844
0.965
O.99 L
0.999
0.999

0,808
0.952
0.990
0.998
0,999

0.772
0.937
0.986
0.997
0.999

0.735
0.919
0,981
0.996
0.999

0,699
0.900
0.974
0,994
0.999

0,662
0,879
0,966
0,992
0.998

0,626
0.857
0,956
0,989
0.997

0.591
0,833
0.946
0,985
0,996

0.999

0.999

0.999
0,999

0.999
0.999

0.999
0.999

0.999
0.999
0.999

0.999
0.999
0,999

5
6
7
8

Co::::

Ar

1.50

1,60

1.70

1.80

1.90

2.00

2,10

2,20

0.223

0201

0,182

0,165

0,149

0,135

0.122

0,110

0.557
0,808
0.934
0,981
0.995

0.524
0.783
0.921
0.976
0.993

0,493
0,757
0.906
0.970
0,992

0.462
0.730

O,~33

2
3
4
5

0.963
0.989

0.703
0,874
0.955
0.986

O."'l6
0.676
0.857
0.947
0.983

0.379
0.649
0.838
0.937
0,979

0.354
0.622
0.819
0.927
0.975

0.999
0,999
0.999

0.998
0.999
0.999

0.998
0.999
0.999
0.999

0.997
0.999
0.999
0,999

0.996
0.999
0.999
0.999

0.995
0,998

0.994
0,998
0.999
0.999
0.999

0,992
0,998
0.999
0.999
0.999

6
7

8
9
10

0,891

0.999
0,999

Appecdix

599

Tablel Cumu:.ative Poisson Distribution" (continued)

C=N
x

230

2.40

2.50

2,60

2.70

2,80

2.90

3,00

0.100

0.090

0.082

0,074

0,067

0.060

0,055

0.049

0.330
0.596
0.799
0.916
0,970

0.308
0569

0,287

0.267

0.248

0543
0,757
0,891
0,957

0518
0,877
0,950

0."93
0,714
0.862
0.943

0,231
0.469
0,691
0,847
0,934

0.214
0,445
0,669
0.831
0,925

0.199
0.423
0,647
0.815
0,916

0,990
0,997
0,999
0,999
0,999

0,988
0,996
0.999
0,999
0.999

0.985
0.995
0,998
0,999
0.999
0.999

0,982
0,994
0.998
0,999
0.999
0,999

0,979
0.993
0.998
0.999
0.999
0,999

0.975

0.971

0.991
0.997
0.999
0.999
0,999

0.990

0.966
0.988

3.50

4.00

4.50

c::;;: j"t
5.00

5.50

{I,030

0.018

0.011

0,006

0,135
0.320
0536
0,725
0,857

0,091
0,238
0,433
0.628
0,785

0,061
0.173
0,342
0.532
0,702

0,934
0,973
0,990
0.996
0,998

0,889
0,948
0,978
0,991
0,997

0.999
0,999
0,999

0,999
0,999
0,999
0,999

3
4
5

6
7

8
9

10

0.778
0,904
0,964

II
12

1
2

3
4

5
6
7

8
9

10
11
12
13
:4
15

16
17

18

,9
20

0.736

0.996

0,996

0.999
0,999
0,999
0.999

0,998
0.999
0.999
0.999

6.00

6.50

7,00

0.004

0.002

0,001

0.000

0,040
Q,!24
0,265
0,440
0.615

0.026
0,088
0,201
0.357
0.528

0,017
0.061
0.151
0.285
0,445

0.011
0.043
0,111
0.223
0,369

0,007
0,029

0,831
0,913
0,959
0,982
0,993

0,762
0,866
0,931
0.%8
0.986

0,686
0,809

0,606
0<743
0,847
0,916
0,957

0526
0,672
0,791
0,877
0.933

0.449
0.598
0.729
0,830
0,901

0,997
0,999
0,999
0,999
0,999

0,994

0,989
0,995
0.998
0.999
0,999

0.979

0,966
0,983
0.992
0,997
0,998

0.946
0,973
0,987
0.994
0,997

0,999
0.999

0,999
0,999
0.999

0,999
0,999
0.999
0,999

0,999
0,999
0.999
0.999
0.999

0,997
0.999
0,999
0,999
0,999

0,894
0,946
0,97L

0<991
0,996
0,998
0,999

0.081
0.172
0.300

(continues)

600

Appendix
TabId Cumulative Poisson Distribution Cconfinued)

c '= At
x

7.50

8.00

8.50

9.00

9.50

10.0

_ _ ~ _ _ ~ ___ ~ _ _ _ _ _ ~ _ _ ~ _ _ _ _ _ ~ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ c_~ _ _ _ ~ _ _ ~ _ _ _ _ _ _ _

15.0

20.0

----------~-~-~.

0.000

0.000

0.000

0.000

0.000

0.000

0.000

0.000

1
2
3
4

0.004
0.020
0.059
0.132
0.241

0.003
0.013
0.()42
0.099
0.191

0.001
0.009
0.030
0.074
0.149

0.001
0.006
0.021
0.054
0.115

0.000
0.004
0.014
O.MO
0.088

0.000
0.002
0.010
0.029
0.067

0.000
0.000
0.000
0.000
0.002

0.000
0.000
0.000
0.000 .
0.000

6
7
8
9
10

0.378
0.524
0.661
0.776
0.362

0.313
0.452
0.592
0.716
0.815

0.256
0.385
0.523
0.652
0.763

0.206
0.323
0.455
0.587
0.705

0.164

0.130
0.220
0.332
0.583

0.007
O.DlS
0.037
0.069
0.118

0.000
0.000
0.002
0.005
0.010

11
12

0.920
0.957

0.848
0.909
0.948
0.97Z
0.986

0.803
0.875
0.926
0.958
0.977

0.751
0.836
0.898
0.940
0.966

0.696
0.791
0.864
0.916
0.951

0.184
0.267
0.363
00465
0.568

0.021
0.039
0.066
O.IM
0.156

0.993
0.997
0.998
0.999
0.999

0.988
0.994
0.997
0.998
0.999

0.982
0.991
0.995
0.998
0.999

. 0.972
0.985
0.992
0.996
0.998

0.664
0.748
0.819
0.875
0.917

0.221
0.297
0.381
0.470
0.559

'-Q!l9.9

0.946
0.967
0.980
0.988
0.993

0.643
0.720
0.787
.0.843
0.887

26
27
28
29
30

0.996
0.998
0.999
0.999
0.999

0.922
0.947
0.965
0.978
0.986

31
32
33

0.999
0.999
0.999

0.991
0.995

13

0.978

14
15

0.989
0.995

0.888
0.936
0.965
0.982
0.991

16
17

0.998
0.999
0.999
0.999
0.999

0.996
0.998
0.999
0.999
0.999

18
19
20
21
22

0.999

0.999
0.999

23

0.999
0.999
0.999

0.268

0.391
0.521
0.645

0.999
0.999
0.999

0.999
0.999
0.999
0.999

24
25

column may be rend as 1.0.

r.
"

0.999
0.999
0.999
0.999

0.9~7

0.998

34
"E:nries in the table are ~ucs of

0.457

F(x) =: P(X s; x):. 'Le-cc1/i!. B1auic spaces below the last cnll')'in any
;==0

I ,-

......

-iOonC:'il(
, .:,)~.,'J

"'

,'-or", -;

:: ..

-!-o

i\.;lpendix

-",-

{, ,<",;;"
-,

")

..>.')1

f\ 0

s--:-Qr,,'!;:.f;f!_ -

'\-v';:')

Table II Cumulative Standard Normal Distribution

601
r lOUT ;:
-:"r{

<I>(z) =

L "2,, e-,'il
1

du

0.00

0.0:

0.Q2

om

0.04

0.0
0.1

0.50000
0.53983
0.57926

0.61791
.' 0.055-4'20.69146

0.59095
0.62930
0.66640
0.70194

0.51595
0.55567
0.59483
0.63307
0.67003
0.70540

0.0
0.1
0.2

OJ
0,4
0.5

0.507 98
0.54776
0.58706
0.62551
0.66276
0.69847

0.51l 97

0.2

0.50399
0.54379
0.583 17
0.62172
0.65910
0.69497

0.6
0.7
0.8
0.9
l.0

0.72575
0.75803
0.788 14
0.81594
0.84134

0.72907
0.761 15
0,79103
0.81859
0,84375

0,73237
0.76424
0.79389
0.82121
0.846 13

0,73565
036730
0.79673
0.82381
0.84849

0,73891
0.77035
0.799 54
0,82639
0.85083

L1
1.2

0,864 33
0.88493
0.90320

0.86650
0,88686
0.90490
0,92073
0,93448

0,86864
0,88877
0,906 58
0.92219

0,S70 76
0,89065
0,90824
0.92364
0.93699

0,872 85
0,892 51
0.909 SS
0,92506
0.93822

1.3
1.4
1.5

1.3
1.4
1.5
'1.6

0,~9.24~

-().93 :lJ 9

:cr9:i5JD
c:r.;-

O.S5l.72

0.3
0.4
0.5
0,6

0.7
0.8
0,9

LO
1.1
1.2

0,94520
0.95543
0,964 07
0.97128

0.946 30
0.95637
0,96485
0.97193
0,977 78

0.94738
0,95728
0.96562
0.972 57
0.97831

0.94845
0.958 18
0.966 37
0.97320
0,97882

0.94950
0.95907
0,967 II
0.973 81
0.97932

1.6
1.7
1.8
1.9
2.0

2.2
2.3
2,4
2.5

0.98214
0.98610
0.98928
0.99180
0.99379

0.98257
0,98645
0.98956
0.99202
0,99396

0,98300
0.98679
0,98983
0.99224
0.994 :3

0.98341
0,987 i3
0,990;'0
0,992 45
'--".
".0,994 30'

0.98382
0.98745
0.99036
0.99266
0.99446

2.1
2.2
2.3
2.4
2.5

2.6
2.7
2,8
2.9
3.0

0.99534
0,99653
0.99744
0.99813
0.99865

0.99547
0.99664
0.99752
0.99819
0,99869

0.99560
0.99674
0.99760
0.99825
0.99874

0.99573
0.99683
0,99767
0.99831
0,99878

0,99585
0.99693
0.9977.d.
0.99836
0.99382

2.6
2,7
2.9
3.0

3.1
3.2
3J
3.4
3.5

0.99903
0,99931
0.999 52
0.999 66
0.999 77

0.99906
0,999 34
0.99953
0.999 68
0.999 7S

0.99910
0.99936
0.99955
0,99969
0,99978

0.999 13
0,99938
0,999 57
0,99970
0.99979

0.99916
0.9994{)
0.99958
0.999 71
0.99980

3.1
3.2
3.3
3.4
3,5

3,6

0.99984
0.99989
0.99993
0.99995

0.999 85
0.999 90
0.99993
0,99995

0,99985
0,99990
0,999 93
0.99996

0,99986
0,99990
0.99994
0,99996

0.99986
0.99991
0.99994
0.99996

3.6
3.7
3,8
3.9

1.7

1.8
1.9
2:6' .'
2,1

3,7
3,8

3.9

,iXin is-'-'

2.8

(continues)

602

Appendix
Table II Cumulative Standard Normal Distribution (continued)

<I>(z) = J~

-,=~e

_uu .,

- ,21t

"du

0.05

0.06

0.07

0.08

0.09

0.0
0.1

0.51994
0.55962
0.59871
0.63683
0.67364
0.70884

0.52392
0.56356
0.60257
0.641J 58
0.67724
0.71226

0.52790
0.56749
0.60642
0.644 31
0.68082
0.71566

0.53188
0.57142
0.61026
0.64803
0.684 38
0.71904

0.53586
0.57534
0.61409
0.65173

0.0
0.1
0.2

0.74537
0.77637
0.80510
0.8~
0.83141
6~8.53 14) . '0.85543

0.74857
0.77935
0.80785
0.83397

0.2

0.3
0.4
0.5
0.6
0.7
0.8
0.9
1.0
1.1
1.2
1.3
1.4
1.5
1.6
1.7
1.8

1.9
2.0
2.1
2.2

2.3
2.4
2.5

2.6
2.7
2,8

2.9
3.0

3.1
3.2
3.3
3.4

3.5
3.6
3,7

3.8
3.9

D.742)~.

:1\

0.773 37 ~\
0.802 34 "

8:i~O::';b876 97

0.85769

0.87900

0,3

0.68793

0.4

0.72240

0.5

0.75175
0.78230
0.810 57
0.83646
0.85993

0.75490
0.78523
0.813 27
0.83891
0.86214

0.6
0.7
0.8

0.88100
0.89973
0.91621
0.93056
0.94295

0.88297
0.90147
0.91773
0.93189 .
0.944 08
0.954 48
0.96327
0.97062'
0.97670
0.98169

1.6
1.7
1.8
1.9
2.0

0.9

1.0
1.1

0.89435
0.91149
0.926 47
0.93943

0.89616

0.89796

0.913 08
0.927 85
0.94062

0.91465
0.92922
0.94179

0.95053
0.95994
0.96784
0.97441
0.97982

0.95154
0.960 80
0.96856
0.97500
0.98030

0.96926
0.97558
0.98077

0.95352
0.96246
0.96995
0.97615
0.98124

0.98422
0.98778
0.99061
0.99286
0.99461

0,98461
0.98809
0.990 86
0.99305
0.99477

0.98500
0.98840
0.99111
0.99324
0.994 92

0.98537
0.98870
0.99134
0.99343
0.99506

0.98574
0.98899
0.99158
0.99361
0.99520

2.1
2.2
2.3

0.99598
0.99702
0.99781
0.99841
0.99886

0.99609
0.99711
0.99788
0.99846
0.99839

0.99621
0.99720
0.99795
0.99851
0.99893

0.99632
0.99728
0.99801
0.99856
0.99897

0.99643
0.997 36
0.99807
0.99861
0.99900

2.6
2.7
2.S
2.9
3.0

0.99918
0.99942
0.999 60
0.99972
0.99981

0.99921
0.99944
0.999 61
0.99973

0.999 24
0.999 46
0.99962
0.99974
0.99982

0.99926
0.99948
0.99964
0.99975
0.99983

0.999 29
0.99950
0.99965
0.99976
0.999 83

3.1
3.2
3.3
3.4
3.5

0.999 88
0.99992
0.99995
0,99996

0.99988
0.999 92
0.99995
0.99997

0.99989
0.9999;l
0.99995
0.999 97

3.6

0.99987
0.999 91
0.99994
0.99996

0.99981

0.99987
0.999 92
0.99994

0.99996

0.95254
0.96164

1.2
1.3'

1.4
1.5

2,4

2.5

3,7
3,8
3.9

.
Appendix

603

Table ill Percentage Points of the Xl Distribution:


.'_.

0.995

0.990

0.975
0.00+
0.05
0.22
0,48
0.83

0.00+

0.00+

2
3

om

0.02

0,07

0.11

0.21
0041

0.30
0.55

0.68
0.99

0.87
124
1.65
2.09
2.56

5
6
7

,,34

9\
10

1.73
2.16

13
14

2.60
3.07
3.57
4.07

.15

4.60

5.23

16
17
18

5.14
5.70

11

r-,''',
'~::'

19

6.26
6.84

20

7.43

21
22

g.03

23

24
25

8.64
9.26
9.89
JO.52

:.24
~_69

2.18
2.70
325

0.950

0.900

0.500

0.100

0.050

0.00+

0.02

0.10

0.21
0.58
1.06
1.61

0.45
1.39
2.37
3.36
4.35

2.71
4.61

6.25
7.78
9.24

3.84
5.99
7.81

5.35
6.35

10.65
12.02

0.35
0.71

Ll5
1.64
2.17
2.73
3.33
3.94

5,()1

4.57
5.23
5.89

5.63
627

657
7.26

5.81

6.91

6.41
7.01

7.56
8.23

7.63
8.26

8.91
9.59

7.96
8.67
9.39
10.12
10.85

8.90
9.54

10.28
10.98

10.20
11.52

11.69
12.40
13.12

3.05

3.57
4.11
4.66

10.86

3.82
4.40

2.20

2.83

12.59
14.07
15.51
16.92

14.45
16.01
17.53
19.02

20.09
21.67

.J.U!

20,48

23.21

17.28
18.55
19.81
21.06
22.31

19.68
21.03
22.36
23.68
25.00

21.92
23.34

24.72
26.22
27.69
29.14
30.58

15.34
16.34
17.34
18.34
19.34

23.54
24.77
25.99
27.20
28.41

26.30
2759
28.87
30.14
31.41

28.85
30.19
31.53
32.85
34.17

32.00

36.19
3757

38.58

2034
21.34

29.62
30.81

32.67
33.92

22.3~

35.17

35.48
36.78
38.08

41.40
42.80
44.18

36.42
37.65

4>3.65

38.93
40.29
41.64
42.98
44.31

38.89
40.11
41.34
42.56
43.77

45.64
46.96
48.28
49.59
50.89

48.29

63.69
76.)5
88.38
100.42
112.33

66.77
79.49
91.95
104.22
116.32

118.14 124.12
129.56 135.81

128.30
14{i.17

5.58
10.34
6.30 . 11.34.
7.04
12:34'
7.79
13.34
8.55
14.34

12."4

11.59
12.34
13.09

1324

13.85

15.66
16.47

23.34
24.3~

32.01
33.20
.34.28

17.29

25.34
26.34
27.34
2834
29.34

35.56
36.74
37.92
39.09
40.26

51.8155.76
63,17 67.50
74.40 79.08
85.53 90.53
96.58 10l.88

14.61

14.04
14.85

26

ILl6

12.20

13.84

27
28

11.81

12.88

14.57

1357
14.26
14.95

15.31
16.05
24.43
32.36
40.48

26.51

29.05

39.34

34.76

5).74
60.39

37.69
46.46
55.33
64.28

49.33
59.33
69.33
79.33

65.65

69.13
77.93

73.29
82.36

89.33
99.33

29

12.46
13.12

30

13.79

4J
50
60
70
80'

20.71

22,16

27.99

29.71

35.53

37.48

43.28
51.17

45.44
5354

48.76
57.15

90
100

59.20
67.33

61.75
70.06

7~.22

'v:;:: cegreos off:ecdor:n.

16.79

15.38
16.15
16.93
>7.71
18.49

43.19

18.11
18.94
19.77
20.60

0.005

11.07

9,49

15.99 r..

10.87
11.65

0.010

5.02
7.38
9.35
11.14
12.83

13.36
14.68

9.31
10.09

0.G25

107.57
118.50

113.:4
12><.34

24.74

26.12
27,49

39.36

41.92
43.19

44.46
45.72

46.98
59.34
7!.42
83.30

95.02
106.63

6.63
9.21

11.34
13.28
15.09
16.81

18.48

33.41
34.81

7.88
10.60

12.84
14.86
16.75
18.55
20.28
21.96
23.59
25.19
26.76
28.30
29.82

31.32
32.80
34.27
35.72
37,16

40.00

45.56

46.93
49.65
50.99
52.34
53.67

\,:

604

Appendix

Table IV Percentage Points of the t DistDDution

~I O~___ IJ.2~ __!l:!o~_,

0.05 _0_.02_5_._0_'0__1_ _

0.~00_5_~:O_02_5_~O~_~_.000_5_

1
0.325
2
0.289
3 . 0.277
4
0.271
5
0.267

1.000
0.816
0.765
0.741
0.727

3.078
1.886
1.638
1.533
1.476

6.314
2.920
2.353
2.132
2.015

12.706
4.303
3.182
2.776
2.571

31.821
6.965
4.541
3.747
3.365

63.657
9.925
5.841
4.604
4.032

0.718
0.711
0.706

1.440
1.415
1.397

0.703

US3
1.372

1.943
1.895
1.860
1.833
1.812

2.447
2.365
2.306
2.262
2.228

3.143
2.998
2.8%
2.821
2.764

3.707
3.499
3.355
3.250
3.169

4.317
4.029
3.833
3.690
3.581

5.208
4.785
4.501
4.297
4.144

5.959
5.408
5.041
4.781
4.587

2.201

2.718
2.681
2.650
2.624
2.602

3.106
3.055
3.012
2.977
2.947

3.497
3.372
3.326
3.286

4.025
3.930
3.852
3.787
3.733

4.437
4.318
4.221
4.140
4.073

2.583
2.567
2.552
2.539
2.528

2.921
2.898
2.878
2.861
2.845

3.252
3.222
. 3.197
3,174
3.153

3.686
3.646
3.610
3.579
3.552

4.015
3.965
3.922
3.883
3.850

2.831
2.819
2.807
2.797
2.787

3.135
3.119
3.104
3.091
3.078

3.527
3.505

2.064
2.060

2.518
2.508
2.500
2.492
2.485

3.467
3.450

3.819
3.792
3.767
3.745
3.725

2.479
2.473
2.4<>7'
2.462
2.457

2.779
2.771
2.763
2.756
2.750

3.067
3.057
3.047
3.038
3.030

3.435
3.421
3.408
3.396
3.385

3.707
3.690
3.674
3.659
3.646

2.704
2.660
2.617
2.576

2.971
2.915
2.860
2.807

3.307
3.232
3.160
3.090

3.551
3.460
3.373
3.291

9
10

0.265
0.263
0.262
0.261
0.260

11
12
13
14
15

0.260
0.259
0.259
0.258
0.258

0.697
0.695
0.694
0.692
0.691

1.363
1.356
1.350
1.341

1.796
1.782
1.771
1.761
1.753

16
17
18
19
20

0.258
0.257
0,257
0.257

0.690
0.689
0.688
0.688
0.687

1.337
1.333
1.330
1.328
1.325

1.746
1.740
1.734
1.729
1.725

2.120
2.110
2.101
2.093

21
22
23
24
25

0.257
0.256
0.256
0.256

1.323
1.321
1.319
DI8
1.316

1.721
1.717

2,080
2.074
2.069

0,256

0.686
.0.686
0.685
0.685
0.684

26
27
28
29
30

0.256
0.256
0.256
0.256
0.256

0.684
0.684
0.683
0.683
0.683

1.315

1.706
1.703

1.313
1.311
1.310

1.701

1.699
1.697

2.056
2.052
2.048
2.M5
2.(/42

40
60
120

0.255

0.681
0.679

2.021

2.423

0.674

l.671
1.658
1.645

2.000

0,254
0.253

1.303
1.296
1.289
1.282

1.684

0.254

2.390
2.358
2.326

0.257

0.700

0.677

1.345

1.314

1.714
1.711
1.708

2,179

2.160
2.145
2.131

2.086

1.980
1.960

127.32 318.31
14.089 23.326
7.453
10.213
5.598
7.173
4.773
5.893

3A28

3.485

636.62
31.598
12.924
8.610
6.869

Source; TIlls table is adapted from Biometrika Tables for SraristiciAns, VoL t 3rd edi;iOD< 1966. by permissioI!. of dle
Biometrika Trustees.

Table V

Percentage Points of th~ F Distribution


F 025, V!,>'l

v,
v,
2
3
4
5
6

9
ill
11
12

"
~

7.50
3.00

2.02

2.28

1.81

2.00
LaS
1.76
1.70
1.66
1.62

1.69
1.62
LS7
1.54
LSI

1.49
1,47

1.46
1.45

1.44

li
16

.il

1.43
1,42
1,42
1.41
1,41
1,40
1,40
1,40

1.39

24

1.39
1.39

-1l

i
'"

2
5.83
2.57

n
19
20
21

1.60
1.58
1.56
LS5
LS3

2.06

2.(17
1.&9

1.78
1.72
1.67
1.63
1.60
LS8

1.79
1.72
1.66
1.63

1.56
LS5
1.53

LSI

1.51
LSO
1.49
1,49
1,48
1,48
1,47
1,47
1.46
1.46
1,45
1,45
1,45
1,45

LSO
1,49

1.49
1,48
1,48
1,47
1,47
1,47

1.38

1.46

1.38
1.38
1.38

1.46

30

1.38

120

8.82
3.28
2,41

1.52

U
D
U
B

60

8.58
3.23
2.39

152
151

8.20
3.15
2.36
2.05
1.88

1.46
1,45
1,45

1.44

1.36
1.35
1.34

1.44

1,42

1.42
1,40

1.41
1.39

1.32

1.39

1.37

1.&9

1.59
157
1.55
LS3
LS2
1.51
LSO
1.49

1.48
1,47
1,47
1.46
1,45
1,45

1.44
1.44
1.44

1.79
1.71
1.66
1.62
1.59
LS6
154
1.52
LSI
1,49
1,48
1,47
1,46
1.46
1,45
1,44

1.44
1.43
1,43
1,42
1.42
1,42
1,41
1.41
1.41

8.98

3.31
2042
2.08
1.&9
1.78
1.71
1.65.
1.61

1.5&
155
1.53
LSI

Degrees of fr~edom for the numerator (VI)


7
8
9
to
12

9.lD
3.34
2,43
2.08

9.19
3.35

1.89
1.78
1.70

1.89

I.M

1.6()

1.60

1.57

LS6
1.53

1.54

1.46

lAS

1.44

1.44

1.43
1.43
1,42
1.41
1.41
1,40
1,40

104&

1.47

1.44
1.43
1.42
1.42
1.41
1.41

1,41

1,43
1.43
1.43
1.42
1.40
1.38

1.39
1.37

lAO
lAO
lAO
1.39
[.37
US

1.37
1.35

1.35
1.33

1.33
1.31

2.08
1.78
1.70
1.64

1.52
1.50
1.49
1,47
1,46
1,45

150

2.44

1.39
1.39
1.39
1.38
1.38

1.36
1.33
1.31
1.29

LSI
1.49

1,48
1.46
1,45

1.44
1.43
1,42
1.42
1.41
lAO
lAO
1.39

15

20

24

30

40

60

120

1.89

1.88

1.77
1.69

1.76
1.68
1.62
1.57
1.53

1.75

1.75

1.66
1.6()

1.59
LS5
1.52
1.50
1048
1.46

1.77
1.68
1.62
1.58
1.54
LSI
1.49
1,47
1,45

U6
152

1,48

1.49
1,47

LS6
LS2
1.49
1.46

1.46

lAS

1.44

1.44

1.43

lAS

1.44

1.41

1.42
1.41

1.41
1.39

1.38

1.39
1.37

1.44

1.44

1.38

1.37

1.36

1.35

1.43
1.42
1,41
1,40
1.39

1,40
1.39

1.39

1.43
1.42
1.41

1.43
1.41
1,40
1,40

1.43
1.41
lAO
1.39
1.3&

LS5
1.51
1.48
1,45
1.43
1,41
1,40

1.66
1.59
LS4
LSI
1,47
1,45

3,46
2.47
2.08
1.87
1.74
1.65
1.59

3.47
2047

1.89

9.71
3,45
2,47
2.08
1.88

9.80

1.89

9.63
3,43
2,46
2.08
1.88
1.75
1.67
1.60

9.76

2.08

9.58
3,43
2,46
2.08
\.88
1.76
1.67
1.61

9.67

3.39
2,45
2.0&

9.49
3041
2046
2.08

1.38

1.35

1.34

1.37
1.36

1.37
1.36

1.36

1.38

1.35
1.34

1.34

1.39

l.37

1.38

1.37

1.37
1.37
'1.36

LJ6
1.35
1.35
1.34
1.34
1.33
1.33
1.32

9.26
3.37
2.44
2.0&
1.89
1.77
1.70
1.63
LS9
LS6
LS3

LSI
1.49
1.47
1.46

1.41

1,40
1.39

1.39

1.39
1.38
1.38

1.3&
1.38

1.37
1.37

1.38
1.37
1.37

1.37
1)6

1.35
1.32

1.30
1.2&

1.36
1.34
1.31

1.29
1.27

9.32
3.38

2.44

1.63

1.39
1.38

1.38
1.37
1.37
1.36
1.36
1.35
1.35

1.33
1.30
1.28
1.25

9,41

1.36
1.35

1.35
1.34
1.34
1.34
1.31
1.29
1.26
1.24

LSO

1.32
1.30
1.27

1.24
1.22

1.37
1.36

1.35

1.35
1.34

1.34

1.34

1.33

1.33

1.33

1.32

1.33
1.32

1.32

1.32
1.31
1.31
1.30

1.31

1.28
1.25
1.22
1.19

1.31
1.30

1.30
1.29
1.26
1.24
1.2\
1.18

3.44
2,47
2.08

1.35
1.34
1.33
1.32
1.32
1.31
1.31
1.30
1.30
1.29
1.29

1.28
1.25
1.22
1.19
1.16

1.42

1.54
LSO
1,47

1.44
1.42
1,40

2.08
1.87
1.74
1.65
1.58
1.53
1.49
1.46
1.43
1,41

1.33

1.33
1.32

1.33
1.32
1.31

1.32

1.31

1.30

1.31
1.31
1.30

1.30
1.30

1.21

1.27
1.27
1.26
1.26
1.22
1.19

LI8

Ll6

1.29
1.28
1.28
1.27
1.26
1.26
1.25
1.25
1.24
1.21
1.17
L13

1.14

Ll2

1.08

1.29
1.29
1.28
1.28
1.27

1.27
1.24

Source: Adapted with permhiion from Blome/rikn Tllbfes illr Swtisticillns, Vol. 1, 3rd edition, by E. S. Pear-sOD and H. O. HlITtI<lY, Cambridge Univenity Press, Cambridge, 1966,

1.29
1.28

1.28

9.85
3,48
2.47
2.08
1.87
1.74
1.65
1.5&
1.53
1.48

1.45
1.42
1,40
1.38

1.36
1.34
1.33
1.32
1.30

1.29
1.28
1.28
1.27
1.26
1.25

1.25
1.24
1.24
1.23
1.23

Ll9
LIS
1.10
1.00
(ContiHlles)

;g
g

Table V Percentage Points of !he F Distrihution (cmltinlJed)

l
FO$I.",,\.~,_ _ _-:-_ _ _ _ _ _ _ _ _ _ _ _ _ _ __

v,
2

Y,

31,l.86
2
3
4
5

it~3

5.$4

4.54

4.06

49JO

,.'5A6"

4.'\2
3.78

3.11

3.36

3.01

2.8\

3.29

,.'"

273

2~1

2.66

254

281
2."/6
2,73

2~1

2048

I'

3.lB

.~

13
14
IS

3.14

16

3.05
3.03

2.67

2,46

'.64

2.44
2.42

3.10
3.07

2.70

256
2.52
249

<l
.ll

17

3.01

1.62

19
20

2.99
2.97

'a

21

,.%

22

2.1,l5

23
24

'.94

2.61
2.59
2.57
2.56
255

293

254

25

2.92

U
27

2.91

2.53
252

'.90

2.51

'31
2JO

2.89
2.&9

2.50
250

2.75

2,49
244
219
235

2.71

2.30

"

28

"

30
40

60
1'0

2.B8

'.ll4

'79

4,11
352
'1.18

2.86

II

2.61
252
2A5
239

5.34

3.29
1.1l7
2.92

,."

'.73

'~9

5.39
4.19

1.46

10

2.81

9.24

3.78
3.59
1.46

1.26

2!J6

55.!B

9.16

3.62

2M)
238
1.36
2.35
2:>1
2.33
232

6
57.24
9.29
5.3]
4.{l5
't45
3.11
2.&8

53.59

2,43

2.39
2.%
233
211
2,29
227
2.21
2.13
2.22
2.21

2.19
2.18

58.20
9.33
5.28

4.Ot
'AD
3.05
2.33

,."

255
2A6

'39
2.33
2.28

2.35
2.31

2.24

22"/

1.21

224
222

2.18
2.15

2.18

2.ll

2.16
2.14
2.13
2.11
2.10

'J)9

2J)9

2.13

2.08

2.06

,.OS
2M
2J)'l
2.01

2.17
2,17

2.08
'1JfJ

2.29

2.16

2.28

21'

2.06
'.06

22'

2.l4

:un

'.M

2.'"
).95

1.93
1.81

1.99

1.00

1.82

1.94

1.85

1.77

2,23
1.18
2.13
2.03

,.'"

'.00
:2.00

1.99
1.98

D.!grees of fr~om for the numerator (VI}


8
9
10
U

58.91
!U5
5,27
198
3.17
3.01
2.78
2.62
B1
2A1
234
2.26
2.23
2.19
2.16
2.13
2.10
2.0B
2.06
2.04
2.02
2.01
1.99
1.9&
1.97
1.96
1.95
1.94
1.91
1.91
L87
U2
1.77
1.72

59.44
9.37
525
3.95
1.34
2.98
2.75
2.59
2A7
2.38
2.10
2.24
2.20
2.15
2,12
2.09
2.06
2.04
2.02
2JJO
1.98
1.97
1.95
1.94

un
1.92
1.91
}.90
1.89
UIS
1.83
1.77
, L72
1..67

5!tS6
9.3&
5,24
3.94
3.32
2.96
2.72
2.56
2.44
2.35
2.27
2.21
:Uti
2.12
2.09
2.M
2.03
2.00
1.98
1.96
1.95
L93
1.92
1.91
L89
U&
1.87
1.87
L86
U15
L79
1.74
1.68
1,63

60.l9
9.39
'5.23
3.92
,UO
2.94
2.70
2.54
2.42
2.12
2.25
2,19
2,14
2.tO
2.06
2.03
2.00
1.98
1.91i
1.94
1.92
1.90
1.89
1.88
1.87
Uti
I.SS
U!4

un
1.82
1.76
1.71
1.65
1.60

60.71
9.41
5.22
3.9{)
]'27
2.90
2.67
2JO
2.38
2.Ut
2.21
2.15
2.10
2.05
2.02
1.99
1.96
1.93
1.91
1.89
l.B7
L86
1.84
U3
1.82
1.81
1.80
1.79
1.78
1.77
l.71
1.66
1.60
U5

15

20

24

30

.6'

40

60

120

61.22

61.74

62.00

62,26

6253

61,79

63JJ6

6133

9.42

5.IR

9,45
5.18

9A1i
5,17

5.14

3,84

),Sl

),24

1.21

3.19

1.S2
3.17
2.80

9.47
'5.15
3.79

!JA9
5.13

3.87

9.47
5.16
).BO

9,48

5.20

'.44

3.16

).14

1.78
3.12

3.76
3.10

2.78

2.76

258

2.56

254
2:,6
2.21
2.l3

'"

,."

'.In

'.84

2.63

'39

2.46

2.42

2040

2.38

2.34
2.24
2.17

2.30

2.28

2.10
2,05

2.25
2.16
2.0Il
2.01

2.01
1.97
1.94

1.91
Ul9

1.86
I.Il4
1.83

220

Vi!

2.12

2.10

2J16

,{);

2.01
1.96

1.92
1.89
1.86

U4
L81
1.79

US

1.98

1.96

1.94
1.00
L87

1.91
1.87
Lll4

I.Il4

L31

2,05
1.99

1.93
Ul9
1.35
l.81
1.78

'.00

1.97
1.00
1.85

1.93

La&
l.83
1.79

2.06

L80
Uti

1.75

1.72

1.69

1.70

1.72
1.69
1.67

U5
1.72

Ug

1.75
1.73

1.74

1.71

1.6&

1.64

1.6J

1.72
1.70

1.69

1.";9
1,57

1.72

1.69

1.70

1.67

1.66
1.64
1.63

1.66
1.<>1
1.62

1.62

1.73

1.61

1.57

1.69

1.66

1.71

1.68

1.65

1.75
1.74

1.70

1.<>1
1.63

1.59

1.71

1.68
1.67

1.67
1.66
1.6'\

1.62

US

1.64

1.1;1

LS7

1.66
1.60
155

U;1
1.54

LS7
1.51

1.54

151
1.44

lAg

1,49

JA2

iA5
U8

1.72

2.29
2.16

1.76

1.76

1.69

2.32
2.18
2.0&

1.81

1.71

UlU
l:/8

2.11
2.03
1.96
1.00
1.86
l.S2
1,78

,."
2.47

1.79
1.7'1
L75

1.76
1.74
1.73
1.72

LSI

234
2.21

2:'4
2.'19

1.67

1.61
U50

1.60
159

,Jj6
If~

1.55
1.53
1.52

LSI)

156

1.58

1.54

1.50

1.57
1.56

153
1..52
1.51
l50

1.49

1042

US

US
L26
Ll7

U9

1.55
1.54
1.47

lAS
\.41

137

1.32

1.34

1.30

1.24

lAO

1048
IA7
lA6
1.29

1.00

If

Table V

Pen.:entage Points of the FDistrlbution (comitmed)

FilM,v"",
\

2
3
4
5
6
7

~.

.j
j
-!l
.!i
Il

..

199.5
19.00
9.55

l1S.1
19.16
9.23

224.6
19.25
9.12

,~

~~

'M

om

4.76

4.53

4.19

4.28

4.21

4.15

41'

1m

~
1.61

~
1,48

UI

5.99

5,14

4.26

IO

410

_ =
~l.8(j

230.2
19.30
9.01

11

12
13

Uri

ill

4.61
4.60

:UH
3.14

3.41
334

3,18
3.11

1.03
2.%

.m

15
16
lJ
18

19
20
21
23
24

,."
27

"
'"

30
40
60
120

iol.4
]8.5l
]0.13

~
5.12

'~" "

'3

8
9

.4

Degrees of freedom for the numerll.tOr (VI)


7
8
9
10
12

Vt

234.0

19.11
8.-94

U6

=.
3.37

236.8
1935
3.R9

3.29

=
= =

3,87

184

D'

~
1.21

~
:'\.18

,~
3.l4

3,07

3.01

~
2,94

~
2.90

I.

UI

'8

D9

.n

~J

UJ

'"

2.49

I~

UI

~J

,~

2.38
2.3\

2.:34
2.2/

2Jf!
2.30
2.22
2.16
2.11
2.1l6

,n

2.67
2.60

2.60
253

2.53
2.46

2.46
2.39

2.42
2.35

~I

2.61
253
254

2:5')
7..51
2AS

2,49
2.46
2.42

2.4S
2.41
2.38

2.3g
234
2.31

2.31

2.55

2.49
2.46

2,42
;>AO

2,37
2.34

2:12
2,30

2.25
2.23

2J3
2. t5

In

110

,.,

2.36
2,]4
2,32
2.11

2,30
2.211
2.27
2,25

2.25
2.24
2.22
2.20

2,18
2.16
2.15
2.13

f.ll
2.09
2JJ7
2.06

2,03
2.01

U9

1.2

ill

2.27
7.18
2.10
2.02
1.94

2.2'1
2.12
UM
1.96
LSS

2.16
2.0S:
1.99
1.91
1.113

2JO
2.09
2.00
L92
1..83
1.75

~IO

10

DI

I~

3.'17
3,44

3.07
3.05

2.84
2.8;>

2.68
2,66

2.~7

1M

4.26
4.24
4.23
4.21

lAO
'l39
3.37
3.35

3"01
2.99
2.98
2.%

2.78
2.76
2,74
2.71

2,62
2,60
:2-59
2.57

251
2.49
2.47
2.46

2,42
2.40
2.39
2.37

::tOo

3.94

::to.,

4.00

4.32
4.30

2.69
2.61
2.51
2,45
2,1'1

~D

2.70
2.66
2,61

2.92
2.84
2.76
2.68
2.60

Hi
2.77
2:J4

:t32
3.71
3.15

~
~

2.53
2,45
2.37
2.29
2.2l

2,42
2.'14
2.2.')
2.17
2.10

= =

= =

2.33
2.25
2.17
2J)9
2.01

2~27

2.23

7 O:l
1.92
1.84
1.75
1.67

250.1
19.46
8.62

2')Ll
19.41
8.59

6(J

2.7l
2iiS

2.%
2.93
2.90

4.17
4,08
4.00
192
3.84

4.M

40

2.77
2.70

3.20
3.16
1.J 3

,~

4.10

30

249.1
19.45
8.64

245.9
19.43
8.70

2.83
2.]6

3.59
3.55
352

24

248.0
19.45
8.66

= _

243.9
J9.41
8.74

20

2.92
2.85

4.45
4.41
4.38

~.,

v.

= =

2'11.9
19.40
8.19

1M

240.5
19.33
ItSl

=
=
_ = =
=
=

238.9
19.37
lUiS

15

= =

=
3.&1

~
2.&6

4AD

4.36

>.74

3.)0
3.01

1M

2.79
2.62

3,70
311
2.97
2.?S
258
2.45
2.34
2.25
2.18
2.11
2,06
2.01

1.67
3.23
2.93
2.11
2..54
2.40
2.30
2.21
2.13
2.07
2,01
L%

:un

~~

~B

2.15
2.ll
2.111

2,H)
2.06
2.00

2.10
2.07

2.05
2.03

2.01
1.98

1.96
1.94

WI

1M

1m

1.9~

L97

1.98
1.96
1.95
1.93

'M

.m

1.89
1.87
1.85
U!4

I~

.~

1.91
1.8.4
1.75
}'66
1.57

U9
1.19
1.70
1.6l
1.52

1M
1.84
1.74
1.6'5
1.5)
1.46

loY')

1.9(1

U:Hs

254,3
19.50
8.53
S.63

4,41

= =

J.92

19.49
8.SS
5.66

1.71

2.19
2.15
2.l1

5,69

2~3.3

2.21
2,19
2.16

..

252.2
t9.48
8.51

120

I.

L79
1.69
1.59
1.55
1.39

'.ill

un

1.92

1.9&

193
L9{)
1.87
1.M
1.81
L79
1.77

LSO

us

!JIg
UN
1.81
L7&
1.76
1.73
J.7l
1,69

L79

1.73

1.67

1.17

1.71
1.10
t68
1.58
IA7
US

1,95

1.92

1.89
UII'i

1.84
1.82

US
1.74

1.64
.>}
1.43
1.32

1.22

LG~

1.41
1.62
1.51
1.19
1.25
LOa
(cO'njimlfS)

'0

g
~

13

Table V Percenfage POints of lhe FDistribulion (colltr1H1ed)


FOf)'.!.5.Y,,~!
, - vIr"

~
~

1
647.&

2
799.5

864.7.

399.6

3851

39.00

39.17

39.25

17,44

16'(H

l5A4

15.l0

"D

HUll

,..

&.43

9
10

"

1"
I"

15
17

"q

t '"1
i" ""

921.8
3930
14,8&

6.&5

6.76

6.6&

662

6.9&

.n

20

24

9&4.9
39.41
\4.15

993.1
39.4'>
\4.17

997.2

100[

40

39.46

i4J2

"1

14.0&

13.99

UI

U.

6.13

6.1&

612

6.07

I.

.n

3.51

3.45

3.39

3.33

UI

6.2&

.~

un

lOW

120

:.lAo

6.33

1006
39,47
1<1,/)4

60

lOJ4
39.49
13.95

6.43

19.4.&

g"

HHB
3950
13.90

6.02

.n

SAil!

4.72

4.48

4.32

4.20

4,f(}

4.03

3.96

3,87

D,

'"

=
.n

356

1n
oM

3.67

3,71

~n

,~

3.51
].39

3.44
3.31

3.37
125

3,28
115

3.1&
3.05

3J.!7
2.95

3.0-1
2.89

2,%
2.84

2,91
2.1&

2.85
27'J;

2,79
2.66

2.72
2.60

,~

2.96
2.89

2.86
2J9

2.16
2.68

110

1.64
2.57

259

252
2A5

2.46
2:3&

2AO
2.32

~
~

.M

~.10

4.97

4A7
435

4.12
4.00

3.89
3.77

3.13
3.60

3.61
JAS

,~

U'

10

4,77
4.69

4.15
4,08

3.80
3,73

358
350

141
3.3'1

3.29
3.22

3.20
3.12

3.12
3.05

:t06
2.99

~
.~

~
~

~
.~

116
1m

1U

2%

'3.13
3.M
3.05
3.02
2.91:1
2.97
2.44

3.01
2.97
2m
2.90
2jn
/..85
2,&2

= =

2.91
2.&7
2.&4
2,81
2_78
2.75
2,73

'"

/.,?8
2.76
2.75
2,62
2.SI
239
229

2.69
2.67
2.65
2.53
2.41
2.30
2.19

2.61
2.59
257
2A5
2.33
2.22
2,11

UI
~

.~

22

5.79

4.1~

115

-1.32
4,29
4.27

172
\69
3.61

151
3A8
1.44
3Al
),38
':U5
3.33

3.29
3.?-5
:1.22
3,18
3.1'1
3.13
3.}0

UI

4,22
4.20
'LIS
4,05
3.93
3,go
3.69

3,63
3.6]
359
3.46
3.:)4
3.23
3.12

3.29
3.17
3.2)
3:13
3.01
2.89
2.79

106
3.{}4
3.03
2.W
2.79
2.67
2.57

U,

2.90
2.3S
2.B7
2,74
2.63
l.52
2.41

~
~

2.84
2.80
2.76
2.13
2.70
2.68
2.M

=
=

=
,~

=
=

2.17
/.73
210
2.61
2.64
2.61
2.59

HiS
2.64
2-60
2,51
254
2.51
2,49

257
2.53
:1.50
2A7
2.44
'.:AI
2.39

:t55
2.53
2.51
2,39
2.27
2,16
2.05

2.45
2.43
2,41
2.29
2.17
2.05
1.94

2,34
232
2.:H
U8
2.06
1.94
U3

=
~

--~-.-,--

.n

30

6.52

15

5,66
5.613
5.61
559
5.57
5042
5,19
5.15
5.02

J4.34

7.l5

3M)

7.39

3.116
3.&2
17&

I~
---

14,~4

976.7

7.76

14,62

6.94
6.72

~69

963.6
39.40
14.42

4.46
-1,42
4.3&

19.36

14,73

963,3
39.39
14.41

5.8:7
5.8"3

29
30

19,33

956,7
'l,9.31

5.11

5,15
5.72

943.2

5.92

""

Sm.l

9.

\9
10

II
26

7,2-1

6.55
6Al
6.30
6.2Q
6.12
6.;)4
5.98

12
11

, = _

"
-5

.'

lJ.!gTC;:S of fre.edom for lb(." mlmt13lOr (Y,)


7
&
9
10
12

'J2

3.61

2.63

UI

2.'16
2,42
139
236
2.3':1
2.30
228

2.4J
2,37
133
1.30
2,?7
:1-2<1
2.22

2.23
2.21
2.20
2.07
1.94
1.&2
L71

;1.17
2,15
2.14
2JH
1.8&
L76
1.64

= .,,

,m

= = =
= =
=
2.~1

,~

2.35
2.31
2.21
2.24
2.21
2.18
2.16

'D

~
.~

2~

2JI

2.16
1.1 t
2.0&
204
2.01
1.98
1.95

2.09
2.04
2.00
1.97
1.94
L9J
1.88

2.29
2.15
2.21
2.JR
2.15
2.12
2.09

2.22
2.18
2,14
;UJ
2.08
2.05
;.'! OJ

'n

,.

I.

2.11
2,09
2.07
1.94
1.82
1.69
157

2.05
Hj:l
2.0t
til!
1.74
L61
1.'18

1.98
1.96
L94

1.91
1.89
un
1.72
1.58
lA3
1.27

l.B3
LSI

L~O

un
1.53
1.39

1:19
}.6<1
1.48:
1.31
1.00

f!'l'

Table V Percentage Points: of the FDislrlbutlon (continued)

v,

F(lm,~",,:>

'.

Degrel!is of fr~~dorn fer the namenter (v,1


7
8
9
10
12

15

20

24

30

40

60

120

._---------

4052
2
3
4

I
]

..

9'),00

34.12

30.32

21.20

D.75
1.2,25

1L26

9
10

lQS6
1O.Q.I.

7.56

9.65

7.21

9.33
9.07
8,86

6.93

8.68

6.36

8.53
8,40
8.29
8.18
8.10

6.'n
6.11
6.01

11
J2
13

14
lS
16
17

"]e l'
."

98.50

18,00
1121
10.92
9.55
8.65
g,{)2

5
6

4m,5

19
20
21
22

J ""

24

1616

8.02
7.95
7.88
7.82
7.77

6.70

6.51

5403
99.17
29A6
16.69

99,25
2lt71
l't98

12.06
9.78

11.39

8,45

73'

U5
7,01

6.99

6.42

6.55
6.22

5.99

5.95
5.74

'AI

5,";6
5,42
5.29
5.1&
5.09

5m

'Di

5.S5
5.13

4.94
4.87

5.n

4.82

5.66
5.61
5,51

4.76
4,6&

4.n

26
27

7.n

5.53

4.64

7.8

"2.

7.64

5A9
5.45

4.60
4.57
4.>1

30
40
60
120

7.60
7.56
7.31

7.OS
6,35

6i;)

5.42
5.39
5.13
4.93
4.79
4.til

5625

4.51
4.31
4.13
3.')";

118

9.l5

5.67

5.21
5.04
4,89

4.'f'1
4.67
458
450
4.43
4.37
4,31
4,26
4,22

5764
99.30
2S,,24
15.52
10,91
K75
7.46

6,63
6.00
5.64
5,32
5.06
4.86
4.69

5859

"23

99.33
27.91
15.21
10,67

99,36
27.G7

8.47

~"
M9

14.98
10A6

WR2

6022

6056

6W6

613<;

99.39

99.40

99,42

99.43

99,45

99,46

ZiA9
14.&0
10.29
IUO

27.35
14.66
10.16

27.23

nos

26.87

:2(i}:;9

14.55
10.05

14.3'
9.89

14.02
9.55

7.&7

7.n

14.20
9.72
7.56

26.00
13.93
9A7
7.31

6.62
5.81
5.26
4.85
4.54
4.30

6,47

6.11

5.61

<;,";2

5.11
4.71
4.40
4.16

4.96

4.10

3.%

3.94

6.1<4
6,03
5A7
5.06

7.19
6.37

6.[8

5,80'

5.61

5.39
5.07

5.20
4.89

4.82

4.64
4.44

4.74
4.50
4,30

428

4.14

4.00

4,36
4,44

4.34
4.25

4.10
4.01

4.14
4.01
1.93
3.84

4.17

3S14

3.77

4,10

~87

3,70

3.63
3.56

4.04
3.99

3.81

364

3.5J

3.76

3.59

3,4<;

3,91

3,71
3.67

354

3,41

3.89

1.79

l71

7.93
6.72
5.91
5.35
4.94
4.63
4.39
4,19
4.03
3.89
3.78
3.68
3,60

3.52
lA6
lAO
335
330

3.W
3.69
3.59
3.51
J.43
3.37
33J
3.26
3.21

473

4.10

4.02

3.W
3.67
1.55

3.66

151

3.52
3.4}

1.46

3.31

3.37
3.30

3.23

3.13
3.17
3.12
3.07

3.50

3.36

3.26

3.17

3.03

3.63

JAG

3.22

3.13

3."

3.42

3.32
3.29

:1.18

3,09

4.11

3.78

356

3.39

3a6

3.15

3.06

4.07
4.01
4.02
3.1\3
3.65

3.75
3.73
3.70
3.51

3.53
350

3.21

3.12

'UO

3.(l9

3.01
3.00

3J7

'-07

2.98

3,12
2,95

2.%

2.79

2,72
2.56

lJlO
2.63

3.n

2.99
2,82
2.66

2)19

3.>4

3.17
3.29
3.12

3.36
3.33
3,1/)

2.'"
2.%
2.93
2.90

3.01

2.30

2.64

2.51

2.41

2.87
2.84
2.66
2.50
2.34
2.18

3.15

3m
3m

6.07
5.28

4.81
4..11
3.86
1.66

3,8':5
3J!2

2.47
2-32

7.'10
6.16
<;.16

4.56
4.25
4.01
3.82

4,l8
4.14

3AS
3.31

6.209

99.37

4.62
4,46
4.32
4.20

3.90

6j<;7

4.33

'"

'},59
3,43

3~1

3.37
3.26
3.16
1,08

3.29

3.21

3.1&

3,10

3.OS

3.00
2.92

3.00
2,94

2.92

2JP:l
2.33
2)3

:2.80

3.00

2,1:16

4.86
4.31

4,OR
3.78

4.00

3.9l

354

3.69
3.45

3.31

3.25

3.111:
1,0-;

3.00
2.96

HiO
3.36
3.17
3.00

3.02
2.92
2-'14
2.76

2,93
2JiJ

2.84
2:15

2.75
2.67

2.69

'2.61

2.66
2.58
2.52

2.64
2,58

2:'iS

lAG

250

lAO

2~1

2.45

2,)5

2.117

2J5
2.65
257

2.'"
2.42
2.36

2A0

2,31

2.58

2.27
2.23

23'

2,47

2.38

2,10

2.44

2.35

2,49
2.47
2,29
2,12
1.95

2.41

2.33

2.29
2.26
2.23

220

2~2

2.\'1
2.14

2,39

2.30
2.1 1

2,21

2.11

2.06
2,03
2.01

1.92

LSD

1.94

2.02
1.84

1.73

L76

1,66

1..53

1.59

1.17

1.32

1.60
1.38
1.00

2.63
Via

2.04

4.40

2.36

vS6

2.03
L88

4.95

4,48

2.'19

2.78
2.75
2,73

:,",19

S.UJ

6.8g
5.65

2,45
2.42

2.81

'55

9.11
6.9'}
5.74

9.m

7.06
5.8:2

2.31
2.26
2.21
2.17
2.11

2.66

2:17
2,20

99.50
26.13
13.46

2.67

2.62

2.51

6366

99,49
26.22
11;56

2.62
2,58
2.54
2.:50

214

2,70

2.1<4
:2.18
2.72

JA3
3.27
.1,13

6339

2."

:Uo

2.57

135

6313
99.4S
26.32
11.M
9.20

2.70

2.98
2.91
2.89
2.85

:,",35

6261
6287
99,47
99A7
26.50
26.41
11.7<;
13.84
9.38
9.29
7,23
7.14
5.91
'.99
5.20
5.12
4.65
4/17
4.23
4.17
3.86
3."
3:10
3.62

1:/9

2.20
2.03

LS6
1.70

'U'3

.i;'

g
~

610

Chart VI Operating Characteristic Curves

Each chart plots the probability of accepting H0 (vertical axis) against the standardized difference d (horizontal axis), for several sample sizes n.

(a) OC curves for different values of n for the two-sided normal test for a level of significance α = 0.05.

(b) OC curves for different values of n for the two-sided normal test for a level of significance α = 0.01.

Source: Charts VIa, e, f, k, m, and q are reproduced with permission from "Operating Characteristics for the Common Statistical Tests of Significance," by C. L. Ferris, F. E. Grubbs, and C. L. Weaver, Annals of Mathematical Statistics, June 1946.

Charts VIb, c, d, g, h, i, j, l, n, o, p, and r are reproduced with permission from Engineering Statistics, 2nd edition, by A. H. Bowker and G. J. Lieberman, Prentice-Hall, Englewood Cliffs, NJ, 1972.

(c) OC curves for different values of n for the one-sided normal test for a level of significance α = 0.05.

(d) OC curves for different values of n for the one-sided normal test for a level of significance α = 0.01.

(e) OC curves for different values of n for the two-sided t test for a level of significance α = 0.05.

(f) OC curves for different values of n for the two-sided t test for a level of significance α = 0.01.

(g) OC curves for different values of n for the one-sided t test for a level of significance α = 0.05.

(h) OC curves for different values of n for the one-sided t test for a level of significance α = 0.01.

(i) OC curves for different values of n for the two-sided chi-square test for a level of significance α = 0.05.

(j) OC curves for different values of n for the two-sided chi-square test for a level of significance α = 0.01.

(k) OC curves for different values of n for the one-sided (upper tail) chi-square test for a level of significance α = 0.05.

(l) OC curves for different values of n for the one-sided (upper tail) chi-square test for a level of significance α = 0.01.

(m) OC curves for different values of n for the one-sided (lower tail) chi-square test for a level of significance α = 0.05.

(n) OC curves for different values of n for the one-sided (lower tail) chi-square test for a level of significance α = 0.01.

(o) OC curves for different values of n for the two-sided F-test for a level of significance α = 0.05.

(p) OC curves for different values of n for the two-sided F-test for a level of significance α = 0.01.

(q) OC curves for different values of n for the one-sided F-test for a level of significance α = 0.05.

(r) OC curves for different values of n for the one-sided F-test for a level of significance α = 0.01.

Chart VII Operating Characteristic Curves for the Fixed-Effects Model Analysis of Variance

Each chart plots the probability of accepting the hypothesis against φ (for α = 0.05) and φ (for α = 0.01); ν1 = numerator degrees of freedom, ν2 = denominator degrees of freedom.

Source: Chart VII is adapted with permission from Biometrika Tables for Statisticians, Vol. 2, by E. S. Pearson and H. O. Hartley, Cambridge University Press, Cambridge, 1972.

Chart VIII Operating Characteristic Curves for the Random-Effects Model Analysis of Variance

Each chart plots the probability of accepting the hypothesis against λ (for α = 0.05) and λ (for α = 0.01).

Source: Reproduced with permission from Engineering Statistics, 2nd edition, by A. H. Bowker and G. J. Lieberman, Prentice-Hall, Englewood Cliffs, NJ, 1972.

Table IX Critical Values for the Wilcoxon Two-Sample Test^a

R_0.05: tabulated critical values for combinations of sample sizes n1 = 4, 5, ..., 15 and n2 up to 28.

Source: Reproduced with permission from "The Use of Ranks in a Test of Significance for Comparing Two Treatments," by C. White, Biometrics, 1952, Vol. 8, p. 37.

^a For large n1 and n2, R is approximately normally distributed, with mean n1(n1 + n2 + 1)/2 and variance n1n2(n1 + n2 + 1)/12.

Table IX Critical Values for the Wilcoxon Two-Sample Test^a (continued)

R_0.01: tabulated critical values for the same combinations of n1 and n2.
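For sample sizes beyond the range of the table, the normal approximation in the footnote can be applied directly. A minimal Python sketch of the resulting large-sample statistic (the rank sum and sample sizes below are hypothetical, purely for illustration):

    import math

    def ranksum_z(r1, n1, n2):
        # Large-sample z statistic for the Wilcoxon two-sample test.
        # r1 = sum of the ranks of the first sample; under H0 it is
        # approximately normal with the mean and variance in the footnote.
        mean = n1 * (n1 + n2 + 1) / 2
        var = n1 * n2 * (n1 + n2 + 1) / 12
        return (r1 - mean) / math.sqrt(var)

    print(ranksum_z(100, 12, 15))  # about -3.32; |z| > 1.96, significant at alpha = 0.05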

Table X Critical Values for the Sign Test^a

Tabulated critical values r*_α for α = 0.10, 0.05, and 0.01 and sample sizes n = 5, 6, ..., 40.

^a For n > 40, R is approximately normally distributed, with mean n/2 and variance n/4.
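For n > 40 the footnote's normal approximation replaces the table. A minimal sketch (the counts below are hypothetical):

    import math

    def sign_test_z(r, n):
        # Normal approximation to the sign test: under H0 the number of
        # positive differences R ~ binomial(n, 1/2), so E(R) = n/2, V(R) = n/4.
        return (r - n / 2) / math.sqrt(n / 4)

    print(sign_test_z(14, 50))  # about -3.11; reject H0 at alpha = 0.01 (two-sided)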
Table XI Critical Values for the Wilcoxon Signed Rank Test^a

Tabulated critical values R*_α for α = 0.10, 0.05, 0.02, and 0.01 and sample sizes n = 4, 5, ..., 50.

Source: Adapted with permission from "Extended Tables of the Wilcoxon Matched Pair Signed Rank Statistic," by Robert L. McCornack, Journal of the American Statistical Association, Vol. 60, September 1965.

^a If n > 50, R is approximately normally distributed, with mean n(n + 1)/4 and variance n(n + 1)(2n + 1)/24.

Table XII Percentage Points of the Studentized Range Statistic^a

q_0.01(p, f): upper 1% points of the studentized range for p = 2, 3, ..., 20 treatment means and f = 1, 2, ..., 20, 24, 30, 40, 60, 120, ∞ degrees of freedom.

q_0.05(p, f): upper 5% points of the studentized range for the same values of p and f.

^a f = degrees of freedom.

From J. M. May, "Extended and Corrected Tables of the Upper Percentage Points of the Studentized Range," Biometrika, Vol. 39, pp. 192-193, 1952. Reproduced by permission of the trustees of Biometrika.

Table XIII Factors for Quality-Control Charts

        X̄ Chart                R Chart
        Factors for             Factor for       Factors for
        Control Limits          Central Line     Control Limits
n^a     A1        A2            d2               D3        D4
2       3.760     1.880         1.128            0         3.267
3       2.394     1.023         1.693            0         2.575
4       1.880     0.729         2.059            0         2.282
5       1.596     0.577         2.326            0         2.115
6       1.410     0.483         2.534            0         2.004
7       1.277     0.419         2.704            0.076     1.924
8       1.175     0.373         2.847            0.136     1.864
9       1.094     0.337         2.970            0.184     1.816
10      1.028     0.308         3.078            0.223     1.777
11      0.973     0.285         3.173            0.256     1.744
12      0.925     0.266         3.258            0.284     1.717
13      0.884     0.249         3.336            0.308     1.692
14      0.848     0.235         3.407            0.329     1.671
15      0.816     0.223         3.472            0.348     1.652
16      0.788     0.212         3.532            0.364     1.636
17      0.762     0.203         3.588            0.379     1.621
18      0.738     0.194         3.640            0.392     1.608
19      0.717     0.187         3.689            0.404     1.596
20      0.697     0.180         3.735            0.414     1.586
21      0.679     0.173         3.778            0.425     1.575
22      0.662     0.167         3.819            0.434     1.566
23      0.647     0.162         3.858            0.443     1.557
24      0.632     0.157         3.895            0.452     1.548
25      0.619     0.153         3.931            0.459     1.541

^a n > 25: A1 = 3/√n. n = number of observations in sample.
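The tabled factors enter the usual control-limit formulas: for the X̄ chart, the limits are the grand mean plus and minus A2 times the average range; for the R chart, UCL = D4 R̄ and LCL = D3 R̄; and R̄/d2 estimates the process standard deviation. A minimal sketch using the n = 5 factors above and, for concreteness, the summary statistics from Exercise 17-1:

    # Factors from Table XIII for samples of size n = 5.
    A2, d2, D3, D4 = 0.577, 2.326, 0.0, 2.115

    xbarbar, rbar = 34.32, 5.65   # grand mean and average range

    xbar_lcl, xbar_ucl = xbarbar - A2 * rbar, xbarbar + A2 * rbar  # X-bar chart limits
    r_lcl, r_ucl = D3 * rbar, D4 * rbar                            # R chart limits
    sigma_hat = rbar / d2                                          # estimate of process sigma

    print(xbar_lcl, xbar_ucl, r_lcl, r_ucl, sigma_hat)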

Table XIV k Values for One-Sided and Two-Sided Tolerance Intervals

One-Sided Tolerance Intervals: tabulated k values for confidence levels 0.90, 0.95, and 0.99; percent coverage 0.90, 0.95, and 0.99; and sample sizes n = 2, 3, ..., 25, 30, 40, 50, 60, 70, 80, 90, 100.

Two-Sided Tolerance Intervals: tabulated k values for the same confidence levels, coverages, and sample sizes.
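A two-sided tolerance interval has the form x̄ ± ks, with k read from the table for the chosen confidence level, coverage, and n (a one-sided interval uses x̄ + ks or x̄ − ks). A minimal sketch; the data and the k value here are illustrative stand-ins, not entries verified against the table:

    import statistics

    data = [74.001, 74.003, 73.998, 74.000, 74.006, 74.002, 73.999, 74.004]
    k = 3.379  # hypothetical k for the chosen confidence/coverage and n = 8

    xbar = statistics.mean(data)
    s = statistics.stdev(data)
    print(xbar - k * s, xbar + k * s)  # two-sided tolerance interval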

Table XV Random Numbers

Rows of uniformly distributed five-digit random numbers (10480, 22368, 24130, ...), for use in random sampling and simulation experiments.
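Tables of random digits such as this one predate cheap computation; today uniform pseudorandom numbers are usually produced by a generator such as the linear congruential generator (LCG) discussed in Chapter 19. A minimal sketch using the classical multiplier 16807 with modulus 2^31 − 1, a well-known textbook choice shown here only as an illustration:

    def lcg(seed, a=16807, c=0, m=2**31 - 1):
        # Linear congruential generator: x_{i+1} = (a*x_i + c) mod m,
        # with each value scaled to the unit interval.
        x = seed
        while True:
            x = (a * x + c) % m
            yield x / m

    gen = lcg(seed=12345)
    print([round(next(gen), 5) for _ in range(5)])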

References

Agresti, A., and B. Coull (1998), "Approximate Is Better than 'Exact' for Interval Estimation of Binomial Proportions," The American Statistician, 52(2).
Anderson, V. L., and R. A. McLean (1974), Design of Experiments: A Realistic Approach, Marcel Dekker, New York.
Banks, J., J. S. Carson, B. L. Nelson, and D. M. Nicol (2001), Discrete-Event System Simulation, 3rd edition, Prentice-Hall, Upper Saddle River, NJ.
Bartlett, M. S. (1947), "The Use of Transformations," Biometrics, Vol. 3, pp. 39-52.
Bechhofer, R. E., T. J. Santner, and D. Goldsman (1995), Design and Analysis of Experiments for Statistical Selection, Screening and Multiple Comparisons, John Wiley & Sons, New York.
Belsley, D. A., E. Kuh, and R. E. Welsch (1980), Regression Diagnostics, John Wiley & Sons, New York.
Berrettoni, J. M. (1964), "Practical Applications of the Weibull Distribution," Industrial Quality Control, Vol. 21, No. 2, pp. 71-79.
Box, G. E. P., and D. R. Cox (1964), "An Analysis of Transformations," Journal of the Royal Statistical Society, B, Vol. 26, pp. 211-252.
Box, G. E. P., and M. E. Muller (1958), "A Note on the Generation of Normal Random Deviates," Annals of Mathematical Statistics, Vol. 29, pp. 610-611.
Bratley, P., B. L. Fox, and L. E. Schrage (1987), A Guide to Simulation, 2nd edition, Springer-Verlag, New York.
Cheng, R. C. (1977), "The Generation of Gamma Variables with Nonintegral Shape Parameters," Applied Statistics, Vol. 26, No. 1, pp. 71-75.
Cochran, W. G. (1947), "Some Consequences When the Assumptions for the Analysis of Variance Are Not Satisfied," Biometrics, Vol. 3, pp. 22-38.
Cochran, W. G. (1977), Sampling Techniques, 3rd edition, John Wiley & Sons, New York.
Cochran, W. G., and G. M. Cox (1957), Experimental Designs, John Wiley & Sons, New York.
Cook, R. D. (1977), "Detection of Influential Observations in Linear Regression," Technometrics, Vol. 19, pp. 15-18.
Cook, R. D. (1979), "Influential Observations in Linear Regression," Journal of the American Statistical Association, Vol. 74, pp. 169-174.
Crowder, S. (1987), "A Simple Method for Studying Run-Length Distributions of Exponentially Weighted Moving Average Charts," Technometrics, Vol. 29, pp. 401-407.
Daniel, C., and F. S. Wood (1980), Fitting Equations to Data, 2nd edition, John Wiley & Sons, New York.
Davenport, W. B., and W. L. Root (1958), An Introduction to the Theory of Random Signals and Noise, McGraw-Hill, New York.
Draper, N. R., and W. G. Hunter (1969), "Transformations: Some Examples Revisited," Technometrics, Vol. 11, pp. 23-40.
Draper, N. R., and H. Smith (1998), Applied Regression Analysis, 3rd edition, John Wiley & Sons, New York.
Duncan, A. J. (1986), Quality Control and Industrial Statistics, 5th edition, Richard D. Irwin, Homewood, IL.
Duncan, D. B. (1955), "Multiple Range and Multiple F Tests," Biometrics, Vol. 11, pp. 1-42.
Efron, B., and R. Tibshirani (1993), An Introduction to the Bootstrap, Chapman and Hall, New York.
Elsayed, E. (1996), Reliability Engineering, Addison Wesley Longman, Reading, MA.
Epstein, B. (1960), "Estimation from Life Test Data," IRE Transactions on Reliability, Vol. RQC-9.
Feller, W. (1968), An Introduction to Probability Theory and Its Applications, 3rd edition, John Wiley & Sons, New York.
Fishman, G. S. (1978), Principles of Discrete Event Simulation, John Wiley & Sons, New York.
Furnival, G. M., and R. W. Wilson, Jr. (1974), "Regression by Leaps and Bounds," Technometrics, Vol. 16, pp. 499-512.
Hahn, G., and S. Shapiro (1967), Statistical Models in Engineering, John Wiley & Sons, New York.
Hald, A. (1952), Statistical Theory with Engineering Applications, John Wiley & Sons, New York.
Hawkins, S. (1993), "Cumulative Sum Control Charting: An Underutilized SPC Tool," Quality Engineering, Vol. 5, pp. 463-477.
Hocking, R. R. (1976), "The Analysis and Selection of Variables in Linear Regression," Biometrics, Vol. 32, pp. 1-49.
Hocking, R. R., F. M. Speed, and M. J. Lynn (1976), "A Class of Biased Estimators in Linear Regression," Technometrics, Vol. 18, pp. 425-437.
Hoerl, A. E., and R. W. Kennard (1970a), "Ridge Regression: Biased Estimation for Non-Orthogonal Problems," Technometrics, Vol. 12, pp. 55-67.
Hoerl, A. E., and R. W. Kennard (1970b), "Ridge Regression: Application to Non-Orthogonal Problems," Technometrics, Vol. 12, pp. 69-82.
Kelton, W. D., and A. M. Law (1983), "A New Approach for Dealing with the Startup Problem in Discrete Event Simulation," Naval Research Logistics Quarterly, Vol. 30, pp. 641-658.
Kendall, M. G., and A. Stuart (1963), The Advanced Theory of Statistics, Hafner Publishing Company, New York.
Keuls, M. (1952), "The Use of the Studentized Range in Connection with an Analysis of Variance," Euphytica, Vol. 1, p. 112.
Law, A. M., and W. D. Kelton (2000), Simulation Modeling and Analysis, 3rd edition, McGraw-Hill, New York.
Lloyd, D. K., and M. Lipow (1972), Reliability: Management, Methods, and Mathematics, Prentice-Hall, Englewood Cliffs, NJ.
Lucas, J., and M. Saccucci (1990), "Exponentially Weighted Moving Average Control Schemes: Properties and Enhancements," Technometrics, Vol. 32, pp. 1-12.
Marquardt, D. W., and R. D. Snee (1975), "Ridge Regression in Practice," The American Statistician, Vol. 29, pp. 3-20.
Montgomery, D. C. (2001), Design and Analysis of Experiments, 5th edition, John Wiley & Sons, New York.
Montgomery, D. C. (2001), Introduction to Statistical Quality Control, 4th edition, John Wiley & Sons, New York.
Montgomery, D. C., E. A. Peck, and G. G. Vining (2001), Introduction to Linear Regression Analysis, 3rd edition, John Wiley & Sons, New York.
Montgomery, D. C., and G. C. Runger (2003), Applied Statistics and Probability for Engineers, 3rd edition, John Wiley & Sons, New York.
Mood, A. M., F. A. Graybill, and D. C. Boes (1974), Introduction to the Theory of Statistics, 3rd edition, McGraw-Hill, New York.
Neter, J., M. Kutner, C. Nachtsheim, and W. Wasserman (1996), Applied Linear Statistical Models, 4th edition, Irwin Press, Homewood, IL.
Newman, D. (1939), "The Distribution of the Range in Samples from a Normal Population Expressed in Terms of an Independent Estimate of Standard Deviation," Biometrika, Vol. 31, p. 20.
Odeh, R., and D. Owen (1980), Tables for Normal Tolerance Limits, Sampling Plans, and Screening, Marcel Dekker, New York.
Owen, D. B. (1962), Handbook of Statistical Tables, Addison-Wesley Publishing Company, Reading, MA.
Page, E. S. (1954), "Continuous Inspection Schemes," Biometrika, Vol. 41, pp. 100-115.
Roberts, S. (1959), "Control Chart Tests Based on Geometric Moving Averages," Technometrics, Vol. 1, pp. 239-250.
Scheffé, H. (1953), "A Method for Judging All Contrasts in the Analysis of Variance," Biometrika, Vol. 40, pp. 87-104.
Snee, R. D. (1977), "Validation of Regression Models: Methods and Examples," Technometrics, Vol. 19, No. 4, pp. 415-428.
Tucker, H. G. (1962), An Introduction to Probability and Mathematical Statistics, Academic Press, New York.
Tukey, J. W. (1953), "The Problem of Multiple Comparisons," unpublished notes, Princeton University.
Tukey, J. W. (1977), Exploratory Data Analysis, Addison-Wesley, Reading, MA.
United States Department of Defense (1957), Military Standard Sampling Procedures and Tables for Inspection by Variables for Percent Defective (MIL-STD-414), Government Printing Office, Washington, DC.
Welch, P. D. (1983), "The Statistical Analysis of Simulation Results," in The Computer Performance Modeling Handbook (ed. S. Lavenberg), Academic Press, Orlando, FL.

Answers to Selected Exercises

Chapter 1
1-1. (a) 0.75. (b) 0.18.
1-3. (a) A ∩ B = {5}. (b) A ∪ B = {1, 3, 4, 5, 6, 7, 8, 9, 10}. (c) A ∩ B̄ = {2, 3, 4, 5}. (d) U = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}. (e) A ∪ (B ∩ C) = {1, 2, 5, 6, 7, 8, 9, 10}.
1-5. 𝒮 = {(t1, t2): t1 ≥ 0, t2 ≥ 0}; A = {(t1, t2): t1 ≥ 0, t2 ≥ 0, (t1 + t2)/2 ≤ 0.15}; B = {(t1, t2): t1 ≥ 0, t2 ≥ 0, max(t1, t2) ≤ 0.15}; C = {(t1, t2): t1 ≥ 0, t2 ≥ 0, |t1 − t2| ≤ 0.06}.
1-7. 𝒮 = {NNNNN, NNNND, NNNDN, NNNDD, NNDNN, NNDND, NNDD, NDNNN, NDNND, NDND, NDD, DNNNN, DNNND, DNND, DND, DD}.
1-9. N = not defective, D = defective. (a) 𝒮 = {NNN, NND, NDN, NDD, DNN, DND, DDN, DDD}. (b) 𝒮 = {NNNN, NNND, NNDN, NDNN, DNNN}.
1-11. 30.
1-13. 560,560 ways.
1-15. P(Accept | p′) = Σ for x = 0 to 1 of C(300p′, x)C(300(1 − p′), 10 − x)/C(300, 10).
1-17. 28 comparisons.
1-19. (40)(39) = 1560 tests.
1-21. (a) C(5, 1)C(5, 1) = 25 ways. (b) C(10, 1)C(10, 1) = 100 ways.
1-23. Rs = [1 − (0.2)(0.1)(0.1)][1 − (0.2)(0.1)](0.9) = 0.880.
1-25. S = Siberia, U = Ural; P(S) = 0.6, P(U) = 0.4, P(E|S) = P(F|S) = 0.5, P(F|U) = 0.3; P(S|F) = (0.6)(0.5)/[(0.6)(0.5) + (0.4)(0.3)] = 0.714.
1-27. (m² − m − 1)/[m(m − 1)²].
1-29. P(woman | 6 feet tall) ≈ 0.03226.
1-31. p(n) = (365)(364)⋯(365 − n + 1)/365ⁿ.
1-37.
n       21      22      30      40      60
P(B)    0.444   0.476   0.706   0.891   0.994
1-39. 0.441.

Chapter 2
2-1. pX(x), x = 0, 1, 2, 3, 4.
2-3. (a) Yes. (b) No. (c) Yes.
2-7. a, b.
2-9. P(X ≤ 29) = 0.978.
2-11. (a) k = 1/4. (b) μ = 2, σ² = 2/3. (c) FX(x) = 0, x < 0; = x²/8, 0 ≤ x < 2; = −1 + x − x²/8, 2 ≤ x < 4; = 1, x ≥ 4.
2-13. b = 2; [14 − 2√2, 14 + 2√2].
2-15. (a) k = 1/2. (c) FX(x) = 0 for x < 1; = 1 for x ≥ 3.
2-17. k = 10; approximately 8.3 days.
2-21. (a) FX(x) = 0, x < 0; = x²/9, 0 ≤ x < 3; = 1, x ≥ 3. (b) μ = 2, σ² = 1/2. (c) 1/3. (d) m = 3/√2.
2-23. F(x) = 1 − e^(−λx), x ≥ 0; = 0, x < 0.
2-25. k = 1, μ = 1.
Chapter 3
3-1. (a) pY(0) = 0.6, pY(20) = 0.3, pY(80) = 0.1; = 0.0 otherwise. (b) E(Y) = 14, V(Y) = 564.
3-3. (a) 0.221. (b) $155.80.
3-5. fZ(z) = e^(−z), z ≥ 0; = 0 otherwise.
3-7. 93.5 ¢/gal.
3-9. (a) fY(y) = (1/(2√y)) e^(−√y), y > 0; = 0 otherwise. (b) fV(v) = 2v e^(−v²), v > 0; = 0 otherwise. (c) fU(u) = e^(u − e^u); = 0 otherwise.
3-11. s = (5/3) × 10⁵.
3-13. (a) fY(y) = 1/(2√(4 − y)), 0 ≤ y ≤ 3; = 0 otherwise. (b) fY(y) = 1, 0 ≤ y ≤ 1; = 0 otherwise.
3-15. E(X) = M′X(0); V(X) = M″X(0) − [M′X(0)]².
3-17. E(Y) = 1, V(Y) = 1.
3-19. MX(t) = (1 − t/λ)^(−1); E(X) = M′X(0) = 1/λ; V(X) = M″X(0) − [M′X(0)]² = 1/λ².
3-21. MY(t) = E(e^(tY)) = E(e^(t(aX + b))) = e^(tb) E(e^((at)X)) = e^(tb) MX(at).
3-23. (a) MX(t) = 1/4 + (1/4)e^t + (1/4)e^(2t) + (1/4)e^(3t); E(X) = M′X(0) = 3/2; V(X) = M″X(0) − [M′X(0)]² = 5/4. (b) FY(y) = 0, y < 0; = 1/4, 0 ≤ y < 1; = 3/4, 1 ≤ y < 4; = 1, y ≥ 4.
Chapter 4
4-1. (a), (b) Marginals: pX(x) = 27/50, 11/50, 6/50, 3/50, 2/50, 1/50; pY(y) = 20/50, 15/50, 10/50, 4/50, 1/50. (c) Conditionals: pY|X(y) = 11/27, 4/27, 3/27, 1/27, 8/27; pX|Y(x) = 11/20, 4/20, 2/20, 1/20, 1/20, 1/20.
4-3. (a) k = 1/1000. (b) fX1(x1) = 1/100, 0 ≤ x1 ≤ 100; = 0 otherwise; fX2(x2) = 1/10, 0 ≤ x2 ≤ 10; = 0 otherwise. (c) FX1,X2(x1, x2) = 0 if x1 < 0 or x2 < 0; = x1x2/1000, 0 ≤ x1 < 100, 0 ≤ x2 < 10; = x1/100, 0 ≤ x1 < 100, x2 ≥ 10; = x2/10, x1 ≥ 100, 0 ≤ x2 < 10; = 1, x1 ≥ 100, x2 ≥ 10.
4-5. fW(w) = 2w, 0 ≤ w ≤ 1; = 0 otherwise.
4-15. E(Y) = 80, V(Y) = 36.
4-21. X and Y are not independent; ρ = −0.135.
4-23. (a) fX(x) = (2/π)√(1 − x²), −1 < x < 1; = 0 otherwise; fY(y) = (4/π)√(1 − y²), 0 < y < 1; = 0 otherwise. (b) fX|y(x) = 1/(2√(1 − y²)), −√(1 − y²) < x < √(1 − y²); = 0 otherwise; fY|x(y) = 1/√(1 − x²), 0 < y < √(1 − x²); = 0 otherwise. (c) E(X|y) = 0; E(Y|x) = (1/2)√(1 − x²).
4-27. (a) E(X|y) = (2 + 3y)/[3(1 + 2y)], 0 < y < 1.
4-29. (a) k = (n − 1)(n − 2). (b) F(x, y) = 1 − (1 + x)^(−(n−2)) − (1 + y)^(−(n−2)) + (1 + x + y)^(−(n−2)), x ≥ 0, y ≥ 0. (c) Not independent.
4-31. (a) Independent. (b) Not independent.
4-35. (a) FZ(z) = FX((z − a)/b). (b) FZ(z) = 1 − FX(1/z). (c) FZ(z) = FX(e^z). (d) FZ(z) = FX(ln z).
Chapter 5
5-1. P(X = x) = C(4, x) p^x (1 − p)^(4−x), x = 0, 1, 2, 3, 4.
5-3. Assuming independence, P(W ≥ 4) = 1 − (0.5)⁴.
5-5. P(X > 2) = 1 − Σ for x = 0 to 2 of C(50, x)(0.02)^x(0.98)^(50−x) = 0.078.
5-7. P(p̂ ≤ 0.03) ≈ 0.98.
5-9. P(X = 5) ≈ 0.0407.
5-11. p ≈ 0.8.
5-13. E(X) = M′X(0) = 1/p; E(X²) = M″X(0) = (2 − p)/p²; σX² = E(X²) − [E(X)]² = (1 − p)/p².
5-15. P(X = 36) ≈ 0.0083.
5-17. P(X = 4) = 0.077; P(X < 4) = 0.896.
5-19. E(X) = 6.25, V(X) = 1.5625.
5-21. p(3, 0, 0) + p(0, 3, 0) + p(0, 0, 3) = 0.118.
5-23. p(4, 1, 3, 2) ≈ 0.005.
5-25. P(X ≤ 2) = 0.98; binomial approximation, P(X ≤ 2) ≈ 0.97.
5-27. P(X ≥ 1) ≈ 0.95; binomial approximation gives n = 9.
5-29. P(X < 10) = (1.3888 × 10⁻¹¹) Σ for x = 0 to 9 of 25^x/x!.
5-31. P(X > 5) = 0.215.
5-33. Poisson model, c = 30: P(X ≤ 3) = e⁻³⁰ Σ for x = 0 to 3 of 30^x/x!; P(X ≥ 5) = 1 − e⁻³⁰ Σ for x = 0 to 4 of 30^x/x!.
5-35. Poisson model, c = 2.5; P(X ≤ 2) ≈ 0.544.
5-37. X = number of errors on n pages ~ Bin(n, 5/200). (a) n = 50: P(X ≥ 1) ≈ 0.763. (b) P(X ≥ 3) ≈ 0.90 if n = 151.
5-39. P(X ≥ 2) = 0.0047.

Chapter 6
6-3. fY(y) = 1/4, 5 < y < 9; = 0 otherwise.
6-5. E(X) = M′X(0) = (β + α)/2; V(X) = M″X(0) − [M′X(0)]² = (β − α)²/12.
6-7.
y           y < 1   1 ≤ y < 2   2 ≤ y < 3   3 ≤ y < 4   y ≥ 4
FY(y)       0       0.3         0.5         0.9         1.0
Generate realizations ui ~ uniform [0, 1] as random numbers, as is described in Section 6-6; use these in the inverse as yi = FY⁻¹(ui), i = 1, 2, ….
6-9. E(X) = M′X(0) = 1/λ; V(X) = M″X(0) − [M′X(0)]² = 1/λ².
6-11. 1 − e^(−1/6) ≈ 0.154.
6-13. 1 − e^(−1/3) ≈ 0.283.
6-15. C1 = C, x > 15; = C + Z, x ≤ 15. C2 = 3C, x > 15; = 3C + Z, x ≤ 15. E(C1) = Ce^(−3/5) + (C + Z)[1 − e^(−3/5)] ≈ C + (0.4512)Z; E(C2) = 3Ce^(−3/7) + (3C + Z)[1 − e^(−3/7)] ≈ 3C + (0.3486)Z; choose Process 1 if C > (0.0513)Z.
6-19. 0.8305.
6-23. 0.8488.
6-27. For λ = 1, r = 2: fX(x) = [Γ(3)/(Γ(1)Γ(2))] x⁰(1 − x) = 2(1 − x), 0 < x < 1; = 0 otherwise.
6-31. 1 − e⁻¹ ≈ 0.63.
6-33. 0.24.
6-35. (a) 0.22. (b) 4800.
6-37. 0.3528.
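The inverse-transform recipe in the answer to 6-7 can be coded directly; a minimal sketch using the CDF tabulated there:

    import random

    values = [1, 2, 3, 4]          # support of Y from Exercise 6-7
    cdf = [0.3, 0.5, 0.9, 1.0]     # F(1), F(2), F(3), F(4)

    def draw():
        # Inverse transform: return the smallest y with F(y) >= u.
        u = random.random()
        for y, p in zip(values, cdf):
            if u <= p:
                return y

    print([draw() for _ in range(10)])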

Chapter 7
7-1. (a) 0.4772. (b) 0.6827. (c) 0.9505. (d) 0.9750. (e) 0.1336. (f) 0.9485. (g) 0.9147. (h) 0.9898.
7-3. (a) c = 1.56. (b) c = 1.96. (c) c = 2.57. (d) c = −1.645.
7-5. (a) 0.9772. (b) 0.50. (c) 0.6687. (d) 0.6915. (e) 0.3085.
7-7. 30.85%.
7-9. 2376.63 fc.
7-13. (a) 0.0455. (b) 0.0730. (c) 0.95. (d) 0.3085. (e) 6.018.
7-15. B, if cost of A < 0.1368.
7-17. μ = 7.
7-19. (a) 0.6687. (b) 0.3085.
7-23. 0.616.
7-25. 0.00714.
7-27. (a) 0.276. (b) μ = 12.0.
7-29. (a) 0.552. (b) 0.100. (c) 0.758. (d) 0.09.
7-30. n = 139.
7-36. 2497.24.
7-37. E(X) = e^6.25; V(X) = e^12.5(e^2.5 − 1); median = e⁵; mode = e^2.5.
7-41. 0.9788.
7-42. 0.4681.

Chapter 8
8-1. x̄ = 131.30, s² = 113.85, s = 10.67.
8-21. x̄ = 74.002, s² = 6.875 × 10⁻⁶, s = 0.0026.
8-23. (a) The sample average will be reduced by 63. (b) The sample mean and standard deviation will be 100 times larger; the sample variance will be 10,000 times larger.
8-25. a = x̄.
8-29. (a) x̄ = 120.22, s² = 5.66, s = 2.38. (b) median = 120, mode = 121.
8-31. For 8-29, CV = 0.0198; for 8-30, CV = 9.72.
8-33. x̄ = 22.41, s² = 208.25, median = 22.81, mode = 23.64.

Chapter 9
9-1. f(x1, x2, …, xn) = (2πσ²)^(−n/2) exp[−(1/(2σ²)) Σ(xi − μ)²].
9-3. f(x1, x2, x3, x4) = 1.
9-5. N(5.00, 0.00125).
9-7. Use S/√n.
9-9. The standard error of X̄1 − X̄2 is √(σ1²/n1 + σ2²/n2) = √(σ1²/25 + σ2²/30) = 0.47.
9-11. N(0, 1).
9-13. se(p̂) = √(p(1 − p)/n); the estimated standard error is √(p̂(1 − p̂)/n).
9-15. μ = u, σ² = 2u.
9-17. For F(m, n) we have μ = n/(n − 2) for n > 2 and σ² = 2n²(m + n − 2)/[m(n − 2)²(n − 4)] for n > 4.
9-21. F(t) = 1 − e^(−nλt).
9-23. (a) 2.73. (b) 11.34. (c) 34.17. (d) 20.48.
9-25. (a) 1.63. (b) 2.85. (c) 0.241. (d) 0.588.

Chapter 10
10-1. Both estimators are unbiased. V(X̄1) = σ²/n whereas V(X̄2) = 2σ²/n. Since V(X̄1) < V(X̄2), X̄1 is a more efficient estimator than X̄2.
10-3. θ̂2, because it would have a smaller MSE.
10-7. λ̂ = (1/n) Σ Xi = X̄.
10-9. f(μ | x1, x2, …, xn) = (C/2π)^(1/2) exp{−(C/2)[μ − (μ0/σ0² + nx̄/σ²)/C]²}, where C = 1/σ0² + n/σ².
10-21. The posterior density for p is a beta distribution with parameters a + n and b + Σxi − n.
10-29. The posterior density for λ is gamma with parameters r = m + Σxi + 1 and δ = n + (m + 1)/λ0.
10-31. 0.967.
10-33. 0.3783.
10-35. (b) 1/2.
10-37. α1 = α2 = α/2 is shorter.
10-39. (a) 74.03533 ≤ μ ≤ 74.03666. (b) 74.0356 ≤ μ.
10-41. (a) 3232.11 ≤ μ ≤ 3267.89.
10-43. 150 or 151.
10-45. (a) 0.0723 ≤ μ1 − μ2. (b) 0.049 ≤ μ1 − μ2 ≤ 0.33. (c) μ1 − μ2 ≤ 0.3076.
10-47. −3.68 ≤ μ1 − μ2 ≤ −2.12.
10-49. 183.0 ≤ μ ≤ 256.6.
10-51. 13.
10-53. 94.282 ≤ μ ≤ 111.518.
10-55. −0.839 ≤ μ1 − μ2 ≤ −0.679.
10-57. 0.355 ≤ μ1 − μ2 ≤ 0.455.
10-59. (a) 649.60 ≤ σ² ≤ 2853.69. (b) 714.56 ≤ σ². (c) σ² ≤ 2460.62.
10-61. 0.0039 ≤ σ² ≤ 0.0124.
10-63. 0.514 ≤ σ² ≤ 3.614.
10-65. 0.11 ≤ σ1²/σ2² ≤ 0.86.
10-67. 0.088 ≤ p ≤ 0.152.
10-69. 16,577.
10-71. −0.0244 ≤ p1 − p2 ≤ 0.0024.
10-73. −2038 ≤ μ1 − μ2 ≤ 3714.8.
10-75. −3.1529 ≤ μ1 − μ2 ≤ 0.1529; −1.9015 ≤ μ1 − μ2 ≤ 0.9015; −0.1775 ≤ μ1 − μ2 ≤ 2.1775.

Chapter 11
11-1. (a) z0 = −1.333, do not reject H0. (b) 0.05.
11-3. (a) z0 = −12.65, reject H0. (b) 3.
11-5. z0 = 2.50, reject H0.
11-7. (a) z0 = 1.349, do not reject H0. (b) 2. (c) 1.
11-9. z0 = 2.656, reject H0.
11-11. z0 = −7.25, reject H0.
11-13. t0 = 1.842, do not reject H0.
11-15. t0 = 1.47, do not reject H0 at α = 0.05.
11-17. 3.
11-19. (a) t0 = 8.49, reject H0. (b) t0 = −2.35, do not reject H0. (c) 1. (d) 5.
11-21. F0 = 0.8832, do not reject H0.
11-23. (a) F0 = 1.07, do not reject H0. (b) 0.15. (c) 15.
11-25. t0 = 0.56, do not reject H0.
11-27. (a) χ0² = 43.75, reject H0. (b) 0.3078 × 10⁻⁴. (c) 0.30. (d) 17.
11-29. (a) χ0² = 2.28, reject H0. (b) 0.58.
11-31. F0 = 30.69, reject H0; P = 0.65.
11-33. z0 = 2.4465, do not reject H0.
11-35. t0 = 5.21, reject H0.
11-37. z0 = 1.333, do not reject H0.
11-41. z0 = −2.023, do not reject H0.
11-47. χ0² = 2.915, do not reject H0.
11-49. χ0² = 4.724, do not reject H0.
11-53. χ0² = 0.0331, do not reject H0.
11-55. χ0² = 2.465, do not reject H0.
11-57. χ0² = 34.896, reject H0.
11-59. χ0² = 22.06, reject H0.

Chapter 12
12-1. (a) F0 = 3.17.
12-3. (a) F0 = 12.13. (b) Mixing technique 4 is different from 1, 2, and 3.
12-5. (a) F0 = 2.62. (b) μ̂ = 21.70, τ̂1 = 0.023, τ̂2 = −0.166, τ̂3 = 0.029, τ̂4 = 0.059.
12-7. (a) F0 = 4.01. (b) Mean 3 differs from 2. (c) SSE = 246.33. (d) 0.88.
12-9. (a) F0 = 2.38. (b) None.
12-11. n = 3.
12-15. (a) μ̂ = 20.47, τ̂1 = 0.33, τ̂2 = 1.73, τ̂3 = 2.07. (b) τ̂1 − τ̂2 = −1.40.
Chapter 13
13-1.
Source    DF    SS           MS           F        P
CS         2    0.0317805    0.0158903    15.94    0.000
DC         2    0.0271854    0.0135927    13.64    0.000
CS*DC      4    0.0006873    0.0001718     0.17    0.950
Error     18    0.0179413    0.0009967
Total     26    0.0775945
Main effects are significant; interaction is not significant.
13-3. −23.93 ≤ μ1 − μ2 ≤ 5.15.
13-5. No change in conclusions.
13-7.
Source        DF    SS         MS         F         P
glass          1    14450.0    14450.0    273.79    0.000
phos           2      933.3      466.7      8.84    0.004
glass*phos     2      133.3       66.7      1.26    0.318
Error         12      633.3       52.8
Total         17    16150.0
Significant main effects.
13-9.
Source                DF    SS         MS         F        P
Conc                   2     7.7639     3.8819    10.62    0.001
Freeness               2    19.3739     9.6869    26.50    0.000
Time                   1    20.2500    20.2500    55.40    0.000
Conc*Freeness          4     6.0911     1.5228     4.17    0.015
Conc*Time              2     2.0817     1.0408     2.85    0.084
Freeness*Time          2     2.1950     1.0975     3.00    0.075
Conc*Freeness*Time     4     1.9733     0.4933     1.35    0.290
Error                 18     6.5800     0.3656
Total                 35    66.3089
Concentration, Time, Freeness, and the interaction Time*Freeness are significant at 0.05.
13-11. Main effects A, B, D, E and the interaction AB are significant.
13-15. Block 1: (1), ab, ac, bc. Block 2: a, b, c, abc.
13-19. Block 1: (1), ab, bcd, acd. Block 2: a, b, cd, abcd. Block 3: c, abc, bd, ad. Block 4: d, abd, bc, ac.
13-21. A and C are significant.
13-25. (a) D = ABC. (b) A is significant.
13-27. A 2^(4−1) design with two replicates.
13-29. A 2^(5−2) design. Estimates for A, E, and AB are large.

Chapter 14
14-1. (a) ŷ = 10.4391 − 0.00156x. (b) F0 = 2.052. (c) −0.0038 ≤ β1 ≤ 0.00068. (d) 7.316%.
14-3. (a) ŷ = 1.656 − 0.041x. (b) F0 = 57.639. (c) 81.59%. (d) (19.374, 21.388).
14-7. (a) ŷ = 93.3399 + 15.6485x. (b) Lack of fit not significant; regression significant. (c) 7.997 ≤ β1 ≤ 23.299. (d) 74.828 ≤ β0 ≤ 111.852. (e) (126.012, 138.910).
14-9. (a) ŷ = −6.3378 + 9.20836x. (b) Regression is significant. (c) t0 = −23.41, reject H0. (d) (525.58, 529.91).
14-11. (a) ŷ = 3.96 + 0.00169x. (b) Regression is significant. (c) 0.3933. (d) 95.2%. (e) (0.0015, 0.0019).
14-13. (a) ŷ = 69.1044 + 0.4194x. (b) Lack of fit not significant; regression is significant. (d) (4.5661, 19.1607). (e) (521.22, 534.28).
14-15. (a) ŷ = 77.7895 + 11.8634x. (b) 77.35%. (c) t0 = 5.85, reject H0. (d) 1.61. (e) (0.5513, 0.8932).

Chapter 15
15-1. (a) ŷ = 7.30 + 0.0183x1 − 0.399x2. (b) F0 = 15.19.
15-3. (−0.8024, 0.0044).
15-5. ŷ = −1.808372 + 0.003598x1 + 0.193936x2 − 0.004815x3.
15-7. (a) ŷ = −102.713 + 0.605x1 + 8.924x2 + 1.437x3 + 0.014x4. (b) F0 = 5.106. (c) For β3, F0 = 0.361; for β4, F0 = 0.0004.
15-9. (a) ŷ = −13729 + 105.0x − 0.18954x².
15-13. (a) ŷ = −4.459 + 1.384x + 1.467x². (b) Significant lack of fit. (c) F0 = 16.68.
15-15. t0 = 1.7898.
15-21. VIF1 = VIF2 = 1.4.
Chapter 16
16-1. R = 2.
16-5. R = 2.
16-9. R = 88.5.
16-13. R1 = 75.
16-15. Z0 = −2.117.
16-17. K = 4.835.

Chapter 17
17-1. (a) x̄ = 34.32, R̄ = 5.65. (b) PCR = 1.228. (c) 0.205%.
17-7. LCL = 34.55, CL = 49.85, UCL = 65.14.
17-9. Process is not in control.
17-13. 0.1587; n = 6 or 7.
17-15. Revised control limits: LCL = 0, UCL = 17.32.
17-17. UCL = 16.485; 0.434.
17-19. LCL = 0.01, UCL = 4.378.
17-25. (a) R(60) = 0.0105. (b) 13.16.
17-27. 0.98104.
17-29. 0.84, 0.85.
17-31. (a) 3842. (b) (913.63, ∞).

Chapter 18
18-1. (a) 0.088. (b) L = 2, Lq = 1.33. (c) W = 1 h.
18-3. pj = (1 − ρ)ρ^j, j = 0, 1, 2, ….
18-7. (a) Initial probability vector A = [1, 0, 0, 0].
18-9. (c) 3. (d) 0.03. (e) 0.10.
18-11. (a) 0.555. (b) 56.378 min. (c) 244.18 min.
18-13. (a) pj = [(λ/μ)^j/j!] p0, j = 0, 1, 2, …, s; = 0 otherwise, where p0 = [Σ for j = 0 to s of (λ/μ)^j/j!]^(−1). (b) s = 6, ρ = 0.417. (c) 0.354. (d) From 41.6 to 8.33%. (e) 4.17; 0.377.

Chapter 19
19-3. E(Î) = E[((b − a)/n) Σ f(a + (b − a)Ui)] = (b − a)E[f(a + (b − a)U1)] = (b − a)∫ from 0 to 1 of f(a + (b − a)u) du = I.
19-5. (a) 35. (b) 4.75. (c) 5 (at time 14).
19-7. (a) X1 = 6, U1 = 6/16. (b) Yes. (c) X = 2.
19-9. (a) X = −(1/λ) ln(1 − U). (b) 0.693.
19-11. (a) X = −2√(1 − 2U) if 0 < U < 1/2; = 2√(2U − 1) if 1/2 < U < 1. (b) X = 0.894.
19-13. (a) X = α[−ln(1 − U)]^(1/β). (b) 1.558.
19-15. Z = Σ from i = 1 to 12 of Ui − 6 = 1.07.
19-17. X = −(1/λ) ln(U1U2) = 0.841.
19-19. X = 5 trials.
19-21. [−3.41, 4.41].
19-23. [80.4, 119.6].
19-25. Exponential with parameter 1; Cov(Ui, 1 − Ui) = −V(Ui) = −1/12.
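The crude Monte Carlo estimator whose unbiasedness is verified in 19-3 is easy to run; a minimal sketch (the integrand is an arbitrary example):

    import random

    def mc_integral(f, a, b, n=100_000):
        # Crude Monte Carlo: (b - a)/n times the sum of f at uniform points on [a, b].
        total = sum(f(a + (b - a) * random.random()) for _ in range(n))
        return (b - a) * total / n

    print(mc_integral(lambda x: x * x, 0.0, 1.0))  # approaches 1/3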

Index

22 factorial design. 373


2 3 factorial design. 379
2); factorial designs+ 373, 379
2k- l .fractional factorial design,
394
2k."? fractional factorial design.

400
3-sigma control limits, 511

nonnal approximation to
binomial, 155
Assignable cause of variation,

Bootstrap confidence intervals,

509
Asymptotic prOperties of the

Box plot, 179

maximum likelihood
estimator, 223
Asymptotic relative effiCiency.

499,501
Absorbi.:lg sta~e for a Markov

Attribute data. 170


Attocorrelarion in regression,

cbain, 558, 560


Acceptance-rejection method of
random number generation,

472
Average ron length (ARL), 532,
534,535

586

BOA-Mill.1ermethod of
generating normal random
numbers, 5&4

Burn-in, 540

c
c chart, 523
Car..esian product, 5, 71
Cauchy distribution, 168
Cause-and-effect diagram, 509,
535

Active redunda:Lcy, 544


Actual process capabilit)'. 517
Additivity theorem of chl~

square. 207
, Adjusted R', 453, 476
Aliasing of effects in a fractional
fact,:)rlal,395,400

All possible regressions, 474


Alternate fraction, 396

Alternative hypothesis, 266


Analysis of variance, 321. 323,
324,331,342
Analysis of variance model, 323,

328,337,341,359,366,367
Analysis of variance tests in
regression, 416, 448
Analytic study, 172
A'iOVA, see analysis of

B
Bacbvard elimination of
variables in regression,
483
Batcb means, 590
Bayes' estimator. 228. 230
Bayes' theorem, 27, 226
Bayesian confidence intervals,

252
Bayesian inference, 226,227,
252

Bernoulli distribution, 106, 124


Bernoulli process, 106
Bernoulli rlIDdom variable, S82
Bernoulli tr',aJs, 106,555
Best linear unbiased estimator,
260
Beta. distribution, 141

binomial approximation to

Biased estimator, 220, 467, 588,


589
Binomial approximation to
hypergeometric, 122
Binomial distribution, 40, 46, 66,
106, !O8, Ill, 124,492
melID and variance of, 109
cumulative tlmomial
distribution, 110
Birth-a""", equations, 555, 564
Bivariate normal distribution.

hypergeometric, 122
Poisson approximation to
binom:ial. 122

Blocking Pl'_,c;p]e, 341, 390


Bootstrap, 231

variance
A:.ltithetic variates, 591, 593,
595
Approximate coniidence

intervals in. ma.--umum

likelihood estimation. 251


Approximation to the mean and
variance,

253

Bootstrap standard etror, 231

62

Approxin:tations to Cistributions.
122,

160,427

Censored:ife teSt, 548, 549


Census, 172
Centerline on control chart,
510
Central limit theorem, 152, 202,
233,546,584,588
Central moments, 48, 58, 189

Chance cause of variation, 509


Chapman-Kolmogorov
equations, 555, 556, 561
Characteristic function, 66
Chebysbev's incq:Hhlty, 48
Check sheet, 509

Chi-square distribution, 137,

168,206,238,300,549
mean and variance of, 206
additivity theorem of, 207
percentage points of. 603
Chi~square goodness-of-fit test,
300
Class intervals for histogram,
176
Cochran's Theorem, 325
Coefficient of deternrination,
426
Coefficient of multipie
determ.ination, 453
Combinations, 16
Common random numbers i.rl
simulation, 59:. 593
Comparison of sign ~t a:J.d
I-test, 496

649

650

Index

Comparison of\,\'ilooxon rank~


sum test and t-test, 501
Comparison ofW::.1ooxon signed
rank test and I-test. 499
Complement (set complement),
3
Completely randomized design,
359

Components of variance modei.


323
Conditional distributions. 71. 79,
80,128,162

Conditional expectation. 71, 82,


84
Conditional probzbLity, 19.21
Confidence coefficient. 232
Confidence inter>al. 232. 233
Co:iidence interval on the
difference in means for
paired observations, 247
Confidence inter.'ai on the
difference in means of two
:1ormal distnDutions,
variances known, 242
Confidence interval on the
difference in means of two
:.ormal distributions,
variances unknown, 244,

246
Confidence interval on the
difference in two
proportions? 250
Confidence interval on mean
response in regression., 418
Confidence interval on the mean
of a nor!:1al distribution,
variance lcnOv.n, 233, 236
Confidence interval on the mean
of a normal distribution,
variance unknown., 236
Confidence interval on a
proportion, 239, 241
Confidence interval On me ratio
ofvariancC$ oftwn normal
distributior.s, 248
Confidence i::1tervaLs on
regression coefficients. 4:7,
444
Confider.ce interVal on
sim:llation output. 588, 591,

592
Confidence i:.terval OIl. thc
variance of a normal
distribution. 238
Confidence level, 234, 235

Confidence limitS, 232


Confounding, 390, 394

Consistent estimator, 220, 223


Contingency table, 307
Continuity corrections, 155
Continuous data, 170
Continuous function of a
contirtuous random
variable, 55
Continuous random ...a...-iable. 41,

55
Continuous sample space, 6
Continuous sim.I.l.,iation output,
587

Continuous uniform distribution,


128,140

mean ar.d \-'Rriance of. 129


Continuous-time :Markov chain.
561

Contour plot of response, 355,


358
Contrast, 374

Control chart. 509


ContrOl charts for attributes, 520,

522,523,524
Control chart for individuals,
518

Control charts for measurements.


510,518,522
Control limits, 510, 511

Convolution, 585
CODYoh:tion method of random
number generation. 585
Cook's distlltlce, 470
Correlation coefficient, 190
Correlation matrix of :egressor
variables, 461

Correlation, 71, 87, 88, 101,.409,


427

Covariance, 71, 87, 88,161


Covariance mat:ri.x of regression
coefficients. ~3
Cp Statistic in :regression, 475
Cramer-Rao lower bound, 219,
223,251
Critical region, 267
Critical values of a test statistic,
267

Cumulative distributiQn function


(CDF), see distribution
function
CUll':.U.lao.vc nor:::nal distribution.
145

Curr:.ulative sun: (Cl!SL"M)


control chart, 525, 526, 534

Cycle length of a random


DllIl:ber generator, 581
D
Data, 173
Decision int-.'TIal for the

CUSUM,527
Defect concentration diagram
536
Defects, 523

Defining contTIlSts, 391


Defining relation for a desig:1:,
395,400
Degrees offreedom. 188
Delta method, 62
Demonstration and acceptance
testing. 551
Descriptive statistics, 172
Design generator, 395,. 400
Design of experimentS, 353, 354
Design resolution, 399,402
Designed experiment, 321, 322.
536
Determ!.nistic versus
nondererrr.inistic systems, 1
Discrete data, 170
Discrete distributions, 106
Discrete random variable, 38, 54
Discrete sample space, 6
Discrete simulation output, 587
Discrete-event simulation. 576
Distribution free statistical
methods, see nonpararnetric
:nethods
Distribution function 36,37,38,
91,110
Dot plots, 173, 175
Durbin-Watson test, 472
Dynamic simulation, 576
E

Enumerative study, 172.204


Equally likely outcomes, 10
Equivalent events, 34, 52

Ergodic property of a Markov


chain,559
Erlang distributiou, 5&5
Error sum of squares, 325, 413
Estiro.able functions; 329
Estin::.ated standard error, 203,
230,240
Estirr:.at:i.on of r:? in regression,-

414,444
EStiInationofvariance
components, 338,. 367, 368

Index

Events. 8
Expectation, 58
E.~pected JJie, 62
Expeeted mean squares, 326,

Function of a random variable,


52, 53
Functions of two random
variables, 92

338,360,361,366,368
..E.xpected :ecurre:nce time in a

Markov chaw, 558


Expected value of a random
variable, 58, 77
Experimental design, 508
Exponential distribution, 42, 46,

130,140.540,541.548.
546,564,583
relationship of exponeIltia.!
, and Poisson, 131
mean and variance of 131
memoryless property. 133
..E.xponentially weigh:ed
mOving average (EViM,A.)
control ehart, 525. 526.

529.534
EXtra sum of squares method.

450
Extrapolation, 447

Gamma distribution, 67,134,


140.540,546

relationship of gamma and


exponential, 135
mean and variance of. 135
relationship to chi~square
distribution, 137
three-parameter gamma,
141

Gamma funetion, 134


General regression signifieance
!eslS,450
Generalized ir.teraction. 393
Generation of random variables.
580,582
Generation of realizations of
random variables, 123, 138,

164
Geometric distribution, ] 06, 112,

Factor effects, 356

124.535,586
mean and variance of. 113

Factorial experiments. 355. 359,

memoryless property, 114

369
Failure rate, 62, also see bazard
function
F-distribution, 211
mean and va.."ianec of, 212
percentage points of, 605,

606,607,608.609
Finite population, 172, 204
Finite sample spaces, 14
Finite~state Markov ch.a.in, 556
First passage tiJ:Ge, 557
Fin;t~order autoregressive model,

472
Fixed effects ANOVA, 323
Fixed effects model, 359
Forward selection of variables in
regression, 482
Fraction <!efective or
nonconforming, 520
Fractional factorial design,
394
Frequency distn'bution,175,
176
FuU-cyele :random number
generator. 581
Function of a discrete tandom
variable. 54

H
Half-interval corrections, 155
Half-nonnal dlstribution, 168
Hat matrix in regression, 471
Hazard function, 538, 539,

541
Histogram. 175, 177, 509. 514
Hypergeometric distribution, 40,
106,117.124

mean and variance of, lIS


Hypothesis testing, 216, 266,

267
Hypothesis tests in the
correlation model, 429
Hypothesis tests in multiple
linear regression, 447
Hypothesis tests in simple linear
regression. 414
Hypothesis tests on a proportion,
283
Hypothesis testS on individual
coefficients in multiple
regression, 450
Hypothesis tests on the equality
of two variances, 295,

296

651

Hypothesis tests on the mean of


a norrn.a1 distribution.
varianee known, 271
Hypothesis teStS on the mean of
a normal distribution
variance ur.known, 278
Hypothesis tests on the means of
tvlo normal distributions.,
variances knmvn. 286
Hypothesis tests on the means of
two normal distributions,
variances unknmvn, 288.

290
Hypothesis tests on the vatianee
0; a nol1ll.al distribution.
281

Hypothesis tests on tvlO

propoctions, 297,299
I

Idealized experimentS, 5
ldenti:y element. 381
In-control process, 509
Independent events, 20, 23
Independent expet'.ments, 25
Independent ra.1rlom variables,
71,86,88.95
Indiearor variables, 458

Inferential statistics, 170


In.fi:lential observations in
regression, 470
Initialization bias in simulation,
589

Instantaneous failure rate. see


hazard function
Intcns,ty of passage, 562
Intensity of transition, 562
Interaction, 356, 374.438
Interaction term in a regression
model, 438
Intersection (set intersection), 3
Interval failure rate, 538
Intri.:J:\sieally linear ;egression
model,426

InvarianCc property of the


ma."<.imum likelihood
estimator, 223

Inventory system, 580


Inverse transfonn method for
generating ::an.dom
nll1tlbers. 582
Inverse transform theorem, 582
Inverse transformation method,

64
In'educ-;o1e Markov chai:.!. 559

J
Jitter, 174
Joint probability distributions, 71, 72, 73, 94
Judgment sample, 198

K
Kruskal-Wallis test, 501, 503, 504
Kurtosis, 189, 307

L
Lack of fit test in regression, 422
Large-sample confidence interval, 236
Law of large numbers, 71, 99, 101
Law of the unconscious statistician, 58
Least squares estimation, 328, 410, 438
Least squares normal equations, 328, 410, 439, 440
Life testing, 547
Likelihood function, 221, 226
Linear combinations of random variables, 96, 99, 151
Linear congruential random number generator (LCG), 581, 582
Linear regression model, 409, 437
Little's law, 568
Lognormal distribution, 157, 159
  mean and variance of, 158
Loss function, 227
Lower control limit, 510

M
Main effects, 355, 374
Mann-Whitney test, see Wilcoxon rank-sum test
Marginal distribution, 71, 75, 76, 161
Markov chain, 555, 558, 559, 560, 561
Markov process, 555, 573
Markov property, 555
Maximum likelihood estimator, 221, 223, 230, 251, 428
Mean of a random variable, 44, 58
Mean per unit estimate, 204
Mean square error of an estimator, 218
Mean squares, 325, 342, 360
Mean time to failure (MTTF), 62, 540
Measurement data, 170
Median, 158, 492
Median of the population, 185
Memoryless property of the exponential distribution, 133, 541
Memoryless property of the geometric distribution, 114
Method of least squares, see least squares estimation
Method of maximum likelihood, see maximum likelihood estimator
Minimum variance unbiased estimator, 219
Mixed model, 367
Mode, 158
Model adequacy checking, 330, 345, 364, 384, 414, 421, 452, 454
Moment estimator, 224
Moment generating function, 65, 99, 107, 110, 114, 116, 121, 129, 132, 135, 145, 150, 202
Moments, 44, 47, 48, 58, 65
Monte Carlo integration, 578
Monte Carlo simulation, 577
Moving range, 518
mpu, see mean per unit estimator
Multicollinearity, 464
Multicollinearity diagnostics, 466
Multinomial distribution, 106, 116
Multiple comparison procedures, 593
Multiple regression model, 437
Multiplication principle, 14, 22
Multiplicative linear congruential random number generator, 582
Mutually exclusive, 22, 24

N
Natural tolerance limits of a process, 516
Negative binomial distribution, 106, 124
Noncentral F distribution, 347
Noncentral t-distribution, 279
Nonparametric ANOVA, see Kruskal-Wallis test
Nonparametric confidence interval, 237
Nonparametric methods, 237, 491
Nonterminating (steady state) simulations, 587, 590
Normal approximation for the sign test, 493
Normal approximation for the Wilcoxon rank-sum test, 501
Normal approximation for the Wilcoxon signed rank test, 497
Normal approximation to the binomial, 155, 241
Normal distribution, 143, 583
  mean and variance of, 144
  cumulative distribution, 145
  reproductive property of, 150
Normal probability plot, 304, 305
Normal probability plot of effects, 387, 398
Normal probability plot of residuals, 330
Null hypothesis, 266

O
One factor at a time experiment, 356
One-half fraction, 394
One-sided alternative hypothesis, 266, 269-271
One-sided confidence interval, 233, 236
One-step transition matrix for a Markov chain, 556, 563
One-way classification ANOVA, 323
Operating characteristic (OC) curve, 274, 275, 280, 347
  charts for, 610-626
Optimal estimator, 220
Order statistics, 214
Origin moments, 47, 58, 224
Orthogonal contrasts, 332
Orthogonal design, 381
Outlier, 180, 421
Output analysis of simulation models, 586, 587, 591

P
p chart, 520
Paired data, 292, 494, 498
Paired t-test, 292
Parameter estimation, 216
Pareto chart, 181, 509, 535
Partial regression coefficients, 437
Partition of the sample space, 25
Pascal distribution, 106, 115, 124
  mean and variance of, 115
Pearson correlation coefficient, see correlation coefficient
Permutations, 15, 19
Piecewise linear regression, 490
Point estimate, 216
Point estimator, 216
Poisson approximation to binomial, 122
Poisson distribution, 41, 106, 118, 120, 124, 523
  mean and variance of, 120
  cumulative probabilities for, 598, 599, 600
Poisson process, 119, 582
Polynomial regression model, 438, 456
Pooled estimator of variance, 289
Pooled t-test, 289
Population, 170, 171
Population mean, 184
Population mode, 185
Positive recurrent Markov chain, 558
Posterior distribution, 226
Potential process capability, 517
Power of a statistical test, 267, 347
Practical versus statistical significance in hypothesis tests, 277
Precision of estimation, 234, 235
Prediction in regression, 420, 446
Prediction interval, 255
Prediction interval in regression, 420, 446
Principal block, 392
Principal fraction, 396
Prior distribution, 226
Probability, 1, 8, 9, 10, 11, 12, 13
Probability density function, 42
Probability distribution, 39, 42
Probability mass function, 39
Probability plotting, 303
Probability sampling, 172
Process capability, 514, 516
Process capability ratio, 516, 518
Projection of 2k designs, 384
Projection of 2k-1 designs, 398
Properties of estimated regression coefficients, 412, 413, 443
Properties of probability, 9
Pseudorandom numbers (PRNs), 581

Q
Qualitative regressor variables, 458
Quality improvement, 507
Quality of conformance, 507
Quality of design, 507
Queuing, 555, 564, 568, 570, 572, 573, 579

R
R chart, 510, 512
R2, 426, 429, 453, 475
R2adj, see adjusted R2
Random effects ANOVA, 323, 337
Random effects model, 366
Random experiments, 5
Random number, see pseudorandom number
Random sample, 198, 199
Random sampling, 172
Random variable, 33, 38
Random vector, 71
Randomization, 322, 359
Randomized block ANOVA, 342
Randomized block design, 341
Rank transformation in ANOVA, 504
Ranking and selection procedures, 591, 593
Ranks, 496, 498, 499, 502, 504
Rational subgroup, 510
Rayleigh distribution, 168
Recurrence time, 557
Recurrent state for a Markov chain, 558
Redundant systems, 544, 545
Regression analysis, 409
Regression model, 437
Regression of the mean, 71, 85, 163
Regression sum of squares, 416
Regressor variable, 409
Rejection region, see critical region
Relationship between hypothesis tests and confidence intervals, 276
Relationship of exponential and Poisson random variables, 131
Relationship of gamma and exponential random variables, 135
Relative efficiency of an estimator, 218
Relative frequency, 9, 10
Relative range, 511
Reliability, 507, 538
Reliability engineering, 24, 507, 537
Reliability estimation, 548
Reliability function, 538, 539
Reliability of serial systems, 542
Replication, 322
Reproductive property of the normal distribution, 150, 151
Residual analysis, 330, 345, 364, 384, 414, 421, 454
Residuals, 330, 364, 377, 421
Resolution III design, 399
Resolution IV design, 399
Resolution V design, 400
Response variable, 409
Ridge regression, 466, 467
Risk, 227

S
Sample correlation coefficient, 428
Sample mean, 184
Sample median, 184, 185
Sample mode, 185
Sample range, 188
Sample size for ANOVA, 347
Sample size for confidence intervals, 235, 237, 241, 244
Sample size for hypothesis tests, 273, 279, 282, 284, 287, 291, 296, 298
Sample spaces, 5, 6, 8
Sample standard deviation, 181
Sample variance, 186
Sampling distribution, 201, 202
Sampling fraction, 204
Sampling with replacement, 199, 204
Sampling without replacement, 172, 199, 204
Saturated fractional factorial, 402
Scatter diagram, see scatter plot
Scatter plot, 173, 174, 175, 411, 412, 509
Seed for a random number generator, 581
Selecting the form of a distribution, 306
Set operations, 4
Sets, 2
Shewhart control charts, 509, 525
Sign test, 491, 492, 493, 494, 496
  critical values for, 629
Sign test for paired samples, 494
Significance level of a statistical test, 267
Significance levels in the sign test, 493
Significance of regression, 415, 447
Simple correlation coefficient, see correlation coefficient
Simple linear regression model, 409
Simulation, 576
Simultaneous confidence intervals, 252
Single replicate of a factorial experiment, 364, 386
Skewness, 189, 203, 306, 307
Sparsity of effects principle, 386
Specification limits, 514
Standard deviation, 45
Standard error, 203
Standard error of a point estimator, 230
Standard error of factor effect, 383
Standard normal distribution, 145, 152, 233
  cumulative probabilities for, 601, 602
Standardized regression coefficients, 462
Standardized residual, 421
Standardizing, 146
Standby redundancy, 545
State equations for a Markov chain, 559
Statistic, 201, 216
Statistical control, 509
Statistical hypotheses, 266
Statistical inference, 216
Statistical process control (SPC), 508
Statistical quality control, 507
Statistically based sampling plans, 508
Statistics, field of, 169
Stem and leaf plot, 178
Stepwise regression, 479
Stirling's formula, 145
Stochastic process, 555
Stochastic simulation, 576
Strata, 200, 204
Stratified random sample, 200, 204
Strong versus weak conclusions in hypothesis testing, 268
Studentized residual, 471
Sum of Poisson random variables, 121
Summary statistics for grouped data, 191
System failure rate, 543

T
Tabular form of the CUSUM, 526
Taylor series, 62
t-distribution, 208, 236
  percentage points of, 604
Terminating (transient) simulations, 587
Three-factor analysis of variance, 366
Three-factor factorial, 369
Three-dimensional scatter plot, 174, 176
Tier chart, see tolerance chart
Ties in the Kruskal-Wallis test, 503
Ties in the sign test, 493
Ties in the Wilcoxon signed rank test, 497
Time plots, 183
Time-to-failure distribution, 538, 540, 541
Tolerance chart, 514
Tolerance interval, 257
Total Probability Law, 26
Transformation in regression, 426
Transformation of the response, 330
Transient state for a Markov chain, 558
Transition probabilities, 555, 556, 563
Treatment, 321
Treatment sum of squares, 325
Tree diagram, 14
Trial control limits, 512
Triangular distribution, 43
Trimmed mean, 195
t-tests on individual regression coefficients, 450
Tukey's test, 335, 336, 344, 363
Two-factor factorial, 359
Two-factor random effects ANOVA, 366
Two-factor mixed model ANOVA, 367
Two-sided alternative hypothesis, 266, 269, 271
Two-sided confidence interval, 233
Type I error, 267
Type II error, 267, 494
Type II error for the sign test, 494

U
u chart, 524
Unbalanced design, 331
Unbiased estimator, 217, 219, 220
Uncorrelated random variables, 88
Uniform (0, 1) random numbers, 581, 582
Union of sets, 3
Universal set, 3
Universe, 170
Upper control limit, 510

V
Variability, 169
Variable selection in regression, 474, 479, 482, 483
Variance components, 337, 366, 368
Variance inflation factors, 464
Variance of a random variable, 45, 58
Variance of an estimator, 218
Variance reduction methods in simulation, 591, 593, 596
Venn diagram, 4

W
Waiting-line theory, see queuing
Weibull distribution, 137, 140, 540
  mean and variance of, 137
Weighted least squares, 435
Wilcoxon rank-sum test, 499
  critical values for, 627, 628
Wilcoxon signed rank test, 496
  critical values for, 630
Wilcoxon signed rank test for paired samples, 498

X
X chart, 510, 511

Y
Yates' algorithm for the 2k, 385
