
Adivitya Sharma, 80303120046

BADM

Assignment 2

Question
A local health clinic sent flyers to encourage everyone, but especially older persons at high risk of complications, to get a flu shot in time for protection against an expected flu epidemic. In a pilot follow-up study, 50 clients were randomly selected and asked whether they actually received a flu shot. This serves as the binary dependent variable (SHOT): a client who received a flu shot was coded SHOT=1, and a client who did not receive a shot was coded SHOT=0. In addition, data were collected on their age (AGE) and on a health awareness index (HAI), for which higher values indicate greater awareness.

OUTPUT (LOGISTIC REGRESSION)

Logistic Regression Results
The LOGISTIC Procedure

Model Information

Data Set                     WORK.SORTTEMPTABLESORTED
Response Variable            shot
Number of Response Levels    2
Model                        binary logit
Optimization Technique       Fisher's scoring

Number of Observations Read  50
Number of Observations Used  50
Response Profile

Ordered Value  shot  Total Frequency
1              0     29
2              1     21

Probability modeled is shot=1.

Model Convergence Status

Convergence criterion (GCONV=1E-8) satisfied.


Model Fit Statistics

Criterion  Intercept Only  Intercept and Covariates
AIC        70.029          38.416
SC         71.941          44.152
-2 Log L   68.029          32.416
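The AIC and SC values follow directly from the -2 Log L values. As a minimal sketch, assuming the usual definitions AIC = -2 Log L + 2k and SC = -2 Log L + k ln(n), with k the number of estimated parameters and n = 50 observations (the helper names are illustrative, not SAS functions):

```python
import math

N_OBS = 50  # observations used, from the Model Information table


def aic(neg2_log_l: float, k: int) -> float:
    """Akaike criterion: -2 Log L plus 2 per estimated parameter."""
    return neg2_log_l + 2 * k


def sc(neg2_log_l: float, k: int, n: int = N_OBS) -> float:
    """Schwarz criterion: -2 Log L plus ln(n) per estimated parameter."""
    return neg2_log_l + k * math.log(n)


# (label, -2 Log L, number of estimated parameters)
models = [
    ("Intercept only", 68.029, 1),
    ("Intercept and covariates", 32.416, 3),
]

for label, neg2ll, k in models:
    print(f"{label}: AIC={aic(neg2ll, k):.3f}  SC={sc(neg2ll, k):.3f}")
```

Both criteria drop sharply once AGE and HAI are added, which is the sense in which the two-predictor model fits better than the intercept-only model.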

Testing Global Null Hypothesis: BETA=0

Test   Chi-Square  DF  Pr > ChiSq
Score  26.4760     2   <.0001
Wald   11.6027     2   0.0030
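For 2 degrees of freedom the chi-square upper-tail probability has the closed form exp(-x/2), so these p-values can be checked by hand. The sketch below also forms the likelihood-ratio statistic as the difference of the two -2 Log L values (68.029 and 32.416); that statistic is not listed among the tests shown here, and the function name is illustrative.

```python
import math


def chi2_sf_df2(x: float) -> float:
    """Upper-tail probability of a chi-square variable with 2 df."""
    return math.exp(-x / 2.0)


score_stat = 26.4760
wald_stat = 11.6027
# Likelihood ratio: drop in -2 Log L when AGE and HAI are added.
lr_stat = 68.029 - 32.416

tests = [("Likelihood ratio", lr_stat), ("Score", score_stat), ("Wald", wald_stat)]
for name, stat in tests:
    print(f"{name}: chi-square={stat:.4f}, p={chi2_sf_df2(stat):.2e}")
```

All three statistics reject the hypothesis that both slopes are zero at any conventional significance level.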

Analysis of Maximum Likelihood Estimates

Parameter  DF  Estimate  Standard Error  Wald Chi-Square  Pr > ChiSq
Intercept  1   -21.5826  6.4176          11.3102          0.0008
age        1   0.2218    0.0744          8.8958           0.0029
hai        1   0.2035    0.0627          10.5248          0.0012
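Each Wald chi-square is the squared ratio of an estimate to its standard error, referred to a chi-square distribution with 1 degree of freedom. The sketch below recomputes them from the rounded table values, so small differences in the second or third decimal are expected. The intercept is entered as negative, consistent with the fitted equation used in the answers, and the helper name is illustrative.

```python
import math


def chi2_sf_df1(x: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 df."""
    return math.erfc(math.sqrt(x / 2.0))


# parameter: (estimate, standard error, reported Wald chi-square, reported p)
estimates = {
    "Intercept": (-21.5826, 6.4176, 11.3102, 0.0008),
    "age": (0.2218, 0.0744, 8.8958, 0.0029),
    "hai": (0.2035, 0.0627, 10.5248, 0.0012),
}

for name, (beta, se, reported, _) in estimates.items():
    wald = (beta / se) ** 2  # recomputed from rounded inputs
    p = chi2_sf_df1(reported)
    print(f"{name}: recomputed={wald:.4f} reported={reported:.4f} p={p:.4f}")
```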

Odds Ratio Estimates

Effect  Point Estimate  95% Wald Confidence Limits
age     1.248           1.079    1.444
hai     1.226           1.084    1.386
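The odds ratios and their Wald limits are the exponentiated coefficient and the exponentiated ends of the interval estimate ± z × standard error. A minimal sketch, assuming z = 1.959964 for a 95% interval and using the rounded estimates from the maximum likelihood table (the function name is illustrative):

```python
import math

Z_975 = 1.959964  # 97.5th percentile of the standard normal distribution


def odds_ratio_ci(beta: float, se: float, z: float = Z_975):
    """Return (point estimate, lower limit, upper limit) on the odds scale."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Estimates and standard errors for the two predictors.
for name, beta, se in [("age", 0.2218, 0.0744), ("hai", 0.2035, 0.0627)]:
    point, lower, upper = odds_ratio_ci(beta, se)
    print(f"{name}: {point:.3f} ({lower:.3f}, {upper:.3f})")
```

Neither interval contains 1, which agrees with both predictors being significant in the Wald tests.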

Association of Predicted Probabilities and Observed Responses

Percent Concordant  92.3    Somers' D  0.846
Percent Discordant  7.7     Gamma      0.846
Percent Tied        0.0     Tau-a      0.420
Pairs               609     c          0.923
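The rank-correlation measures can be rebuilt from the concordant, discordant, and tied percentages. The sketch below assumes the standard definitions of Somers' D, Gamma, Tau-a, and c, and works from the rounded percentages, so the last digit may differ slightly from the SAS values. The function name is illustrative.

```python
def association_measures(pct_concordant, pct_discordant, pct_tied,
                         n_events, n_nonevents):
    """Rebuild pair-based association measures from reported percentages."""
    n = n_events + n_nonevents
    pairs = n_events * n_nonevents  # every (shot=1, shot=0) pair
    nc = pct_concordant / 100 * pairs
    nd = pct_discordant / 100 * pairs
    nt = pct_tied / 100 * pairs
    return {
        "pairs": pairs,
        "somers_d": (nc - nd) / pairs,
        "gamma": (nc - nd) / (nc + nd),
        "tau_a": (nc - nd) / (0.5 * n * (n - 1)),
        "c": (nc + 0.5 * nt) / pairs,
    }


# 21 clients with SHOT=1 and 29 with SHOT=0, from the response profile.
measures = association_measures(92.3, 7.7, 0.0, n_events=21, n_nonevents=29)
for key, value in measures.items():
    print(f"{key}: {value:.3f}")
```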

1. Run a model with both predictors, HAI and AGE. What is the -2 Log Likelihood associated with the two-predictor model?

Ans. With two predictors, the -2LL is 32.416.

2. For the two-predictor model, write the regression equation in three forms.

i. log(p/(1-p)) = B0 + B1*HAI + B2*AGE
   log(p/(1-p)) = -21.5821 + 0.2035*HAI + 0.2218*AGE

ii. Odds = e^(B0 + B1*HAI + B2*AGE)
    Odds = e^(-21.5821 + 0.2035*HAI + 0.2218*AGE)

iii. p = 1/(1 + e^-(B0 + B1*HAI + B2*AGE))
     p = 1/(1 + e^-(-21.5821 + 0.2035*HAI + 0.2218*AGE))

3. What is the interpretation of the value of Beta2 (for AGE) in terms of the change in Odds?

For a one-unit increase in AGE we expect, on average, the odds of getting a shot to be multiplied by e^0.2218 = 1.2483, holding HAI constant. That is, we expect a 24.83% increase in the odds of getting a shot for each one-unit increase in AGE.

4. For the subject with ID#40, compute the values of [log (odds)], (odds), and p. What prediction do all of these lead to for this subject? Is your prediction correct for this subject?

Data: Case #40: SHOT=1; AGE=57; HAI=54

i. [log (odds)]' = -21.5821 + 0.2035*HAI + 0.2218*AGE = -21.5821 + 0.2035(54) + 0.2218(57) = 2.0495
   This is above 0, so we predict a shot (SHOT=1).

ii. (odds)' = e^(-21.5821 + 0.2035*HAI + 0.2218*AGE) = e^(2.0495) = 7.764
    This is above 1, so we predict a shot (SHOT=1).

iii. p' = 1/[1 + e^-(-21.5821 + 0.2035*HAI + 0.2218*AGE)] = 1/[1 + e^-(2.0495)] = 1/[1 + 0.1288] = 0.8859
     This is above 0.5, so we predict a shot (SHOT=1).

Hence, the two-predictor model predicts a shot for this subject. The observed value in the data is also SHOT=1, so the model's prediction is correct for this subject.
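The three calculations for case #40 can be reproduced in a few lines. This is a minimal sketch using the coefficients as written in the answers; the function and variable names are illustrative.

```python
import math

# Fitted coefficients as used in the answers.
B0, B_HAI, B_AGE = -21.5821, 0.2035, 0.2218


def predict(age: float, hai: float):
    """Return (log-odds, odds, probability of a shot) for one client."""
    log_odds = B0 + B_HAI * hai + B_AGE * age
    odds = math.exp(log_odds)
    p = odds / (1.0 + odds)  # same as 1 / (1 + exp(-log_odds))
    return log_odds, odds, p


log_odds, odds, p = predict(age=57, hai=54)  # case #40
predicted_shot = int(p > 0.5)  # cutoff 0.5 <=> odds > 1 <=> log-odds > 0
print(f"log-odds={log_odds:.4f} odds={odds:.3f} p={p:.4f} "
      f"-> predicted SHOT={predicted_shot}")
```

The three cutoffs (log-odds above 0, odds above 1, p above 0.5) are equivalent, so the three forms always lead to the same prediction.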

5. Perform Discriminant Analysis with Prior Probabilities set as equal. Compare the Misclassification rates from both the Logistic Regression and Discriminant Analysis models.

Discriminant Analysis Results

The DISCRIM Procedure

Total Sample Size  50    DF Total            49
Variables          2     DF Within Classes   48
Classes            2     DF Between Classes  1

Number of Observations Read  50
Number of Observations Used  50

Class Level Information

shot  Variable Name  Frequency  Weight   Proportion  Prior Probability
0     0              29         29.0000  0.580000    0.500000
1     1              21         21.0000  0.420000    0.500000


Pooled Covariance Matrix Information

Covariance Matrix Rank  Natural Log of the Determinant of the Covariance Matrix
2                       8.56706

Generated by the SAS System ('Local', XP_PRO) on July 26, 2013 at 1:23:00 PM

Generalized Squared Distance to shot

From shot  0        1
0          0        4.43542
1          4.43542  0

Linear Discriminant Function for shot

Variable  0          1
Constant  -38.49756  -58.98085
Age       1.00183    1.21658
Hai       0.74476    0.94304
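A client is assigned to the class whose linear discriminant function gives the larger score. As a hedged illustration, the sketch below scores case #40 from Question 4 (AGE=57, HAI=54) with the coefficients above. It assumes equal priors, in which case the posterior probability depends only on the difference between the two scores; the function names are illustrative.

```python
import math

# Linear discriminant function coefficients: class -> (constant, age, hai)
LDF = {
    0: (-38.49756, 1.00183, 0.74476),
    1: (-58.98085, 1.21658, 0.94304),
}


def ldf_scores(age: float, hai: float) -> dict:
    """Evaluate each class's linear discriminant function."""
    return {cls: c + a * age + h * hai for cls, (c, a, h) in LDF.items()}


def classify(age: float, hai: float) -> int:
    """Assign the class with the larger discriminant score."""
    scores = ldf_scores(age, hai)
    return max(scores, key=scores.get)


def posterior_shot(age: float, hai: float) -> float:
    """Posterior probability of shot=1, assuming equal priors."""
    s = ldf_scores(age, hai)
    return 1.0 / (1.0 + math.exp(s[0] - s[1]))


scores = ldf_scores(57, 54)  # case #40
print(scores, classify(57, 54), round(posterior_shot(57, 54), 4))
```

For this client the discriminant analysis agrees with the logistic model: both predict SHOT=1.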

Classification Summary for Calibration Data: WORK.SORTTEMPTABLESORTED
Resubstitution Summary using Linear Discriminant Function

Number of Observations and Percent Classified into shot

From shot  0       1       Total
0          21      8       29
           72.41   27.59   100.00
1          3       18      21
           14.29   85.71   100.00
Total      24      26      50
           48.00   52.00   100.00
Priors     0.5     0.5

Error Count Estimates for shot

        0       1       Total
Rate    0.2759  0.1429  0.2094
Priors  0.5000  0.5000



Classification Results for Calibration Data: WORK.SORTTEMPTABLESORTED
Cross-validation Results using Linear Discriminant Function

Posterior Probability of Membership in

* Misclassified observation

Classification Summary for Calibration Data: WORK.SORTTEMPTABLESORTED
Cross-validation Summary using Linear Discriminant Function

Number of Observations and Percent Classified into shot

From shot  0       1       Total
0          21      8       29
           72.41   27.59   100.00
1          3       18      21
           14.29   85.71   100.00
Total      24      26      50
           48.00   52.00   100.00
Priors     0.5     0.5

Error Count Estimates for shot

        0       1       Total
Rate    0.2759  0.1429  0.2094
Priors  0.5000  0.5000

Misclassification rate = (FP + FN)/(TP + TN + FP + FN) = (8 + 3)/(18 + 21 + 8 + 3) = 11/50 = 0.22
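The rate of 0.22 counts every observation equally, while the Total error rate of 0.2094 in the SAS output weights each class's error rate by its prior (0.5 each). A short sketch computing both from the classification counts (the function names are illustrative):

```python
# Classification counts: actual class -> (classified as 0, classified as 1)
counts = {0: (21, 8), 1: (3, 18)}
priors = {0: 0.5, 1: 0.5}


def raw_error_rate(counts: dict) -> float:
    """Misclassified observations divided by all observations."""
    errors = sum(n for cls, row in counts.items()
                 for pred, n in enumerate(row) if pred != cls)
    total = sum(sum(row) for row in counts.values())
    return errors / total


def prior_weighted_error_rate(counts: dict, priors: dict) -> float:
    """Per-class error rates averaged with the prior probabilities."""
    rate = 0.0
    for cls, row in counts.items():
        class_error = 1 - row[cls] / sum(row)
        rate += priors[cls] * class_error
    return rate


print(f"raw: {raw_error_rate(counts):.4f}")
print(f"prior-weighted: {prior_weighted_error_rate(counts, priors):.4f}")
```

The two figures differ because the classes are unequal in size (29 versus 21) while the priors are equal.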
