THIRD EDITION

Applied Regression Analysis and Other Multivariable Methods

David G. Kleinbaum
Emory University

Lawrence L. Kupper
University of North Carolina, Chapel Hill

Keith E. Muller
University of North Carolina, Chapel Hill

Azhar Nizam
Emory University

An Alexander Kugushev Book

Duxbury Press


An Imprint of Brooks/Cole Publishing Company
An International Thomson Publishing Company
Pacific Grove Albany Belmont Bonn Boston Cincinnati Detroit Johannesburg London
Madrid Melbourne Mexico City New York Paris Singapore Tokyo Toronto Washington

Contents

1 CONCEPTS AND EXAMPLES OF RESEARCH

1-1 Concepts
1-2 Examples
1-3 Concluding Remarks
References

2 CLASSIFICATION OF VARIABLES AND THE CHOICE OF ANALYSIS

2-1 Classification of Variables
2-2 Overlapping of Classification Schemes
2-3 Choice of Analysis
References

3 BASIC STATISTICS: A REVIEW

3-1 Preview
3-2 Descriptive Statistics
3-3 Random Variables and Distributions
3-4 Sampling Distributions of t, χ², and F
3-5 Statistical Inference: Estimation
3-6 Statistical Inference: Hypothesis Testing
3-7 Error Rates, Power, and Sample Size
Problems
References

4 INTRODUCTION TO REGRESSION ANALYSIS

4-1 Preview
4-2 Association versus Causality
4-3 Statistical versus Deterministic Models
4-4 Concluding Remarks
References

5 STRAIGHT-LINE REGRESSION ANALYSIS

5-1 Preview
5-2 Regression with a Single Independent Variable
5-3 Mathematical Properties of a Straight Line
5-4 Statistical Assumptions for a Straight-line Model
5-5 Determining the Best-fitting Straight Line
5-6 Measure of the Quality of the Straight-line Fit and Estimate of σ²
5-7 Inferences About the Slope and Intercept
5-8 Interpretations of Tests for Slope and Intercept
5-9 Inferences About the Regression Line μY|X = β₀ + β₁X
5-10 Prediction of a New Value of Y at X₀
5-11 Assessing the Appropriateness of the Straight-line Model
Problems
References

6 THE CORRELATION COEFFICIENT AND STRAIGHT-LINE REGRESSION ANALYSIS

6-1 Definition of r
6-2 r as a Measure of Association
6-3 The Bivariate Normal Distribution
6-4 r and the Strength of the Straight-line Relationship
6-5 What r Does Not Measure
6-6 Tests of Hypotheses and Confidence Intervals for the Correlation Coefficient
6-7 Testing for the Equality of Two Correlations
Problems
References

7 THE ANALYSIS-OF-VARIANCE TABLE

7-1 Preview
7-2 The ANOVA Table for Straight-line Regression
Problems

8 MULTIPLE REGRESSION ANALYSIS: GENERAL CONSIDERATIONS

8-1 Preview
8-2 Multiple Regression Models
8-3 Graphical Look at the Problem
8-4 Assumptions of Multiple Regression
8-5 Determining the Best Estimate of the Multiple Regression Equation
8-6 The ANOVA Table for Multiple Regression
8-7 Numerical Examples
Problems
References

9 TESTING HYPOTHESES IN MULTIPLE REGRESSION

9-1 Preview
9-2 Test for Significant Overall Regression
9-3 Partial F Test
9-4 Multiple Partial F Test
9-5 Strategies for Using Partial F Tests
9-6 Tests Involving the Intercept
Problems
References

10 CORRELATIONS: MULTIPLE, PARTIAL, AND MULTIPLE PARTIAL

10-1 Preview
10-2 Correlation Matrix
10-3 Multiple Correlation Coefficient
10-4 Relationship of RY|X₁,X₂,...,Xₖ to the Multivariate Normal Distribution
10-5 Partial Correlation Coefficient
10-6 Alternative Representation of the Regression Model
10-7 Multiple Partial Correlation
10-8 Concluding Remarks
Problems
Reference

11 CONFOUNDING AND INTERACTION IN REGRESSION

11-1 Preview
11-2 Overview
11-3 Interaction in Regression
11-4 Confounding in Regression
11-5 Summary and Conclusions
Problems
Reference

12 REGRESSION DIAGNOSTICS

12-1 Preview
12-2 Simple Approaches to Diagnosing Problems in Data
12-3 Residual Analysis
12-4 Treating Outliers
12-5 Collinearity
12-6 Scaling Problems
12-7 Treating Collinearity and Scaling Problems
12-8 Alternate Strategies of Analysis
12-9 An Important Caution
Problems
References

13 POLYNOMIAL REGRESSION

13-1 Preview
13-2 Polynomial Models
13-3 Least-squares Procedure for Fitting a Parabola
13-4 ANOVA Table for Second-order Polynomial Regression
13-5 Inferences Associated with Second-order Polynomial Regression
13-6 Example Requiring a Second-order Model
13-7 Fitting and Testing Higher-order Models
13-8 Lack-of-fit Tests
13-9 Orthogonal Polynomials
13-10 Strategies for Choosing a Polynomial Model
Problems

14 DUMMY VARIABLES IN REGRESSION

14-1 Preview
14-2 Definitions
14-3 Rule for Defining Dummy Variables
14-4 Comparing Two Straight-line Regression Equations: An Example
14-5 Questions for Comparing Two Straight Lines
14-6 Methods of Comparing Two Straight Lines
14-7 Method I: Using Separate Regression Fits to Compare Two Straight Lines
14-8 Method II: Using a Single Regression Equation to Compare Two Straight Lines
14-9 Comparison of Methods I and II
14-10 Testing Strategies and Interpretation: Comparing Two Straight Lines
14-11 Other Dummy Variable Models
14-12 Comparing Four Regression Equations
14-13 Comparing Several Regression Equations Involving Two Nominal Variables
Problems
References

15 ANALYSIS OF COVARIANCE AND OTHER METHODS FOR ADJUSTING CONTINUOUS DATA

15-1 Preview
15-2 Adjustment Problem
15-3 Analysis of Covariance
15-4 Assumption of Parallelism: A Potential Drawback
15-5 Analysis of Covariance: Several Groups and Several Covariates
15-6 Comments and Cautions
15-7 Summary
Problems
Reference

16 SELECTING THE BEST REGRESSION EQUATION

16-1 Preview
16-2 Steps in Selecting the Best Regression Equation
16-3 Step 1: Specifying the Maximum Model
16-4 Step 2: Specifying a Criterion for Selecting a Model
16-5 Step 3: Specifying a Strategy for Selecting Variables
16-6 Step 4: Conducting the Analysis
16-7 Step 5: Evaluating Reliability with Split Samples
16-8 Example Analysis of Actual Data
16-9 Issues in Selecting the Most Valid Model
Problems
References

17 ONE-WAY ANALYSIS OF VARIANCE

17-1 Preview
17-2 One-way ANOVA: The Problem, Assumptions, and Data Configuration
17-3 Methodology for One-way Fixed-effects ANOVA
17-4 Regression Model for Fixed-effects One-way ANOVA
17-5 Fixed-effects Model for One-way ANOVA
17-6 Random-effects Model for One-way ANOVA
17-7 Multiple-comparison Procedures for Fixed-effects One-way ANOVA
17-8 Choosing a Multiple-comparison Technique
17-9 Orthogonal Contrasts and Partitioning an ANOVA Sum of Squares
Problems
References

18 RANDOMIZED BLOCKS: SPECIAL CASE OF TWO-WAY ANOVA

18-1 Preview
18-2 Equivalent Analysis of a Matched Pairs Experiment
18-3 Principle of Blocking
18-4 Analysis of a Randomized-blocks Experiment
18-5 ANOVA Table for a Randomized-blocks Experiment
18-6 Regression Models for a Randomized-blocks Experiment
18-7 Fixed-effects ANOVA Model for a Randomized-blocks Experiment
Problems
References

19 TWO-WAY ANOVA WITH EQUAL CELL NUMBERS

19-1 Preview
19-2 Using a Table of Cell Means
19-3 General Methodology
19-4 F Tests for Two-way ANOVA
19-5 Regression Model for Fixed-effects Two-way ANOVA
19-6 Interactions in Two-way ANOVA
19-7 Random- and Mixed-effects Two-way ANOVA Models
Problems
References

20 TWO-WAY ANOVA WITH UNEQUAL CELL NUMBERS

20-1 Preview
20-2 Problems with Unequal Cell Numbers: Nonorthogonality
20-3 Regression Approach for Unequal Cell Sample Sizes
20-4 Higher-way ANOVA
Problems
References

21 ANALYSIS OF REPEATED MEASURES DATA

21-1 Preview
21-2 Examples
21-3 General Approach for Repeated Measures ANOVA
21-4 Overview of Selected Repeated Measures Designs and ANOVA-based Analyses
21-5 Repeated Measures ANOVA for Unbalanced Data
21-6 Other Approaches to Analyzing Repeated Measures Data
Appendix 21-A Examples of SAS's GLM and MIXED Procedures
Problems
References

22 THE METHOD OF MAXIMUM LIKELIHOOD

22-1 Preview
22-2 The Principle of Maximum Likelihood
22-3 Statistical Inference via Maximum Likelihood
22-4 Summary
Problems
References

23 LOGISTIC REGRESSION ANALYSIS

23-1 Preview
23-2 The Logistic Model
23-3 Estimating the Odds Ratio Using Logistic Regression
23-4 A Numerical Example of Logistic Regression
23-5 Theoretical Considerations
23-6 An Example of Conditional ML Estimation Involving Pair-matched Data with Unmatched Covariates
23-7 Summary
Problems
References

24 POISSON REGRESSION ANALYSIS

24-1 Preview
24-2 The Poisson Distribution
24-3 An Example of Poisson Regression
24-4 Poisson Regression: General Considerations
24-5 Measures of Goodness of Fit
24-6 Continuation of Skin Cancer Data Example
24-7 A Second Illustration of Poisson Regression Analysis
24-8 Summary
Problems
References

APPENDIX A TABLES

A-1 Standard Normal Cumulative Probabilities
A-2 Percentiles of the t Distribution
A-3 Percentiles of the Chi-square Distribution
A-4 Percentiles of the F Distribution
A-5 Values of ½ ln[(1 + r)/(1 − r)]
A-6 Upper α Point of Studentized Range
A-7 Orthogonal Polynomial Coefficients
A-8 Bonferroni Corrected Jackknife and Studentized Residual Critical Values
A-9 Critical Values for Leverages
A-10 Critical Values for the Maximum of N Values of Cook's d(i) times (n − k − 1)

APPENDIX B MATRICES AND THEIR RELATIONSHIP TO REGRESSION ANALYSIS

APPENDIX C ANOVA INFORMATION FOR FOUR COMMON BALANCED REPEATED MEASURES DESIGNS

C-1 Balanced Repeated Measures Design with One Crossover Factor (Treatments)
C-2 Balanced Repeated Measures Design with Two Crossover Factors
C-3 Balanced Repeated Measures Design with One Nest Factor (Treatments)
C-4 Balanced Repeated Measures Design with One Crossover Factor and One Nest Factor
C-5 Balanced Two-group Pre/Posttest Design
References

SOLUTIONS TO EXERCISES

INDEX
