Succimer    78    76    54
Placebo     16    26    26
Total       47    51    40
Normal Distribution
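$$
\begin{aligned}
f(Y_i; \mu_i, \sigma^2) &= (2\pi\sigma^2)^{-1/2} \exp\{-(Y_i - \mu_i)^2 / 2\sigma^2\} \\
&= \exp\{-\tfrac{1}{2}\ln(2\pi\sigma^2)\} \exp\{-(Y_i - \mu_i)^2 / 2\sigma^2\} \\
&= \exp\{-(Y_i^2 - 2Y_i\mu_i + \mu_i^2)/2\sigma^2 - \tfrac{1}{2}\ln(2\pi\sigma^2)\} \\
&= \exp[(Y_i\mu_i - \mu_i^2/2)/\sigma^2 - \tfrac{1}{2}\{Y_i^2/\sigma^2 + \ln(2\pi\sigma^2)\}]
\end{aligned}
$$
is an exponential family distribution with $\theta_i = \mu_i$ and $\phi = \sigma^2$.
Thus, $\theta_i$ is the location parameter (mean) and $\phi$ the scale parameter.
Note that $a(\theta_i) = \theta_i^2/2$ and
$$
b(Y_i, \phi) = -\tfrac{1}{2}\{Y_i^2/\sigma^2 + \ln(2\pi\sigma^2)\} = -\tfrac{1}{2}\{Y_i^2/\phi + \ln(2\pi\phi)\}.
$$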
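As a numerical sanity check, the sketch below evaluates the normal density both directly and through the exponential family form exp[{Y theta - a(theta)}/phi + b(Y, phi)] implied by the a and b given above. The function names and the test values are illustrative assumptions, not part of the original notes.

```python
import math

def normal_pdf(y, mu, sigma2):
    """Normal density written in the usual way."""
    return (2 * math.pi * sigma2) ** -0.5 * math.exp(-(y - mu) ** 2 / (2 * sigma2))

def normal_expfam(y, theta, phi):
    """Same density via exp[{y*theta - a(theta)}/phi + b(y, phi)]."""
    a = theta ** 2 / 2
    b = -0.5 * (y ** 2 / phi + math.log(2 * math.pi * phi))
    return math.exp((y * theta - a) / phi + b)

# Arbitrary illustrative values (assumed, not from the notes).
y, mu, sigma2 = 1.3, 0.4, 2.5
direct = normal_pdf(y, mu, sigma2)
via_family = normal_expfam(y, theta=mu, phi=sigma2)
print(direct, via_family)
```

The two printed values should agree up to floating-point rounding, which confirms that theta = mu and phi = sigma^2 reproduce the density.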
Two other important exponential family distributions are the Bernoulli and
the Poisson distributions.
Bernoulli Distribution
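$$f(Y_i; \pi_i) = \pi_i^{Y_i}(1 - \pi_i)^{1 - Y_i}$$
where $\pi_i = \Pr(Y_i = 1)$.
Note that
$$
\begin{aligned}
f(Y_i; \pi_i) &= \pi_i^{Y_i}(1 - \pi_i)^{1 - Y_i} \\
&= \exp\{Y_i \ln(\pi_i) + (1 - Y_i)\ln(1 - \pi_i)\} \\
&= \exp\{Y_i \ln(\pi_i/(1 - \pi_i)) + \ln(1 - \pi_i)\}
\end{aligned}
$$
Thus $\theta_i = \ln(\pi_i/(1 - \pi_i))$ and $\phi = 1$.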
Poisson Distribution
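$$
\begin{aligned}
f(Y_i; \lambda_i) &= e^{-\lambda_i}\lambda_i^{Y_i}/Y_i! \\
&= \exp\{Y_i \ln\lambda_i - \lambda_i - \ln(Y_i!)\}
\end{aligned}
$$
Thus $\theta_i = \ln(\lambda_i)$ and $\phi = 1$.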
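The canonical parameters for the Bernoulli and Poisson cases can be checked numerically in the same spirit. The sketch below assumes a(theta) = ln(1 + e^theta) for the Bernoulli and a(theta) = e^theta for the Poisson; these follow from rewriting ln(1 - pi) and lambda in terms of theta, but they are not stated explicitly in the notes. The function names and parameter values are illustrative.

```python
import math

def bernoulli_pmf(y, pi):
    return pi ** y * (1 - pi) ** (1 - y)

def bernoulli_expfam(y, theta):
    # Assumed a(theta) = ln(1 + e^theta) = -ln(1 - pi); b = 0; phi = 1.
    return math.exp(y * theta - math.log1p(math.exp(theta)))

def poisson_pmf(y, lam):
    return math.exp(-lam) * lam ** y / math.factorial(y)

def poisson_expfam(y, theta):
    # Assumed a(theta) = e^theta = lambda; b(y) = -ln(y!); phi = 1.
    return math.exp(y * theta - math.exp(theta) - math.lgamma(y + 1))

pi, lam = 0.3, 2.5  # arbitrary illustrative values
theta_bern = math.log(pi / (1 - pi))  # logit of pi
theta_pois = math.log(lam)            # log of lambda

bern_gap = max(abs(bernoulli_pmf(y, pi) - bernoulli_expfam(y, theta_bern))
               for y in (0, 1))
pois_gap = max(abs(poisson_pmf(y, lam) - poisson_expfam(y, theta_pois))
               for y in range(10))
print(bern_gap, pois_gap)
```

Both gaps should be at the level of rounding error, so the logit and the log are indeed the canonical parameters.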
Bernoulli distribution:
For the Bernoulli distribution, the mean $\mu_i$ is $\pi_i$, with $0 < \pi_i < 1$, so we would
prefer a link function that maps the interval $(0, 1)$ onto the entire real
line $(-\infty, \infty)$.
There are several possibilities:
logit: $g(\mu_i) = \ln[\mu_i/(1 - \mu_i)]$
probit: $g(\mu_i) = \Phi^{-1}(\mu_i)$
where $\Phi(\cdot)$ is the standard normal cumulative distribution function.
Another possibility is the complementary log-log link function:
$g(\mu_i) = \ln[-\ln(1 - \mu_i)]$
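A minimal sketch of these three link functions and their inverses, using only the Python standard library (`statistics.NormalDist` supplies the standard normal CDF and its inverse). The function names are our own choice for illustration.

```python
import math
from statistics import NormalDist

_std_normal = NormalDist()  # mean 0, standard deviation 1

def logit(mu):
    return math.log(mu / (1 - mu))

def inv_logit(eta):
    return 1 / (1 + math.exp(-eta))

def probit(mu):
    return _std_normal.inv_cdf(mu)

def inv_probit(eta):
    return _std_normal.cdf(eta)

def cloglog(mu):
    return math.log(-math.log(1 - mu))

def inv_cloglog(eta):
    return 1 - math.exp(-math.exp(eta))

# Each link sends a probability in (0, 1) to a value on the real line.
for mu in (0.1, 0.5, 0.9):
    print(mu, logit(mu), probit(mu), cloglog(mu))
```

The logit and probit are symmetric about 0.5 (both map 0.5 to 0), whereas the complementary log-log link is not.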
Poisson Distribution:
For count data, we must have $\mu_i = \lambda_i > 0$.
The Poisson distribution is often used to model count data with the log
link function
$g(\mu_i) = \ln(\mu_i)$
with inverse $\mu_i = \exp(\eta_i)$.
With the log link function, additive effects contributing to $\eta_i$ act
multiplicatively on $\mu_i$.
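To make the multiplicative interpretation concrete, the sketch below uses a hypothetical linear predictor with one covariate; the coefficient values are made up. A one-unit increase in the covariate adds b1 to the linear predictor and therefore multiplies the mean count by exp(b1), whatever the starting value of the covariate.

```python
import math

# Hypothetical coefficients for eta = b0 + b1 * x (assumed for illustration).
b0, b1 = 0.2, 0.5

def mean_count(x):
    eta = b0 + b1 * x      # linear predictor
    return math.exp(eta)   # inverse of the log link

# Ratio of means for a one-unit increase in x, at several starting values.
ratios = [mean_count(x + 1) / mean_count(x) for x in (0, 1, 5)]
print(ratios, math.exp(b1))
```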
Solving
$$\sum_i X_i'(Y_i - \mu_i) = 0$$
yields the maximum likelihood estimates of $\beta$.
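One way to solve equations of this form is Newton-Raphson. The sketch below is a minimal illustration, assuming a Poisson model with log link, an intercept plus a single covariate, and made-up toy data; it is not the fitting routine used in the notes.

```python
import math

# Toy data, made up for illustration only.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1, 1, 3, 4, 9]

def score_and_info(b0, b1):
    """Score sum_i X_i'(Y_i - mu_i) and information sum_i mu_i X_i'X_i."""
    s0 = s1 = i00 = i01 = i11 = 0.0
    for xi, yi in zip(x, y):
        mu = math.exp(b0 + b1 * xi)  # log link: mu_i = exp(X_i beta)
        s0 += yi - mu
        s1 += xi * (yi - mu)
        i00 += mu
        i01 += mu * xi
        i11 += mu * xi * xi
    return (s0, s1), (i00, i01, i11)

# Start at the intercept-only fit.
b0, b1 = math.log(sum(y) / len(y)), 0.0
for _ in range(25):
    (s0, s1), (i00, i01, i11) = score_and_info(b0, b1)
    det = i00 * i11 - i01 * i01
    # Newton-Raphson step: beta <- beta + I^{-1} score (2x2 inverse by hand).
    b0 += (i11 * s0 - i01 * s1) / det
    b1 += (-i01 * s0 + i00 * s1) / det

score, _ = score_and_info(b0, b1)
print(b0, b1, score)
```

At convergence both components of the score are zero up to rounding, which is exactly the estimating equation above evaluated at the fitted coefficients.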
Marginal Models
One approach is to specify the marginal distribution at each time point:
$f(Y_{ij})$ for $j = 1, 2, \ldots, p$
along with some assumptions about the covariance structure of the
observations.
The basic premise of marginal models is to make inferences about
population averages.
The term 'marginal' is used here to emphasize that the mean response
modeled is conditional only on covariates and not on other responses (or
random effects). That is, we specify the univariate distribution for the
response at each time point.
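As a small illustration of what is being modeled, the sketch below estimates the marginal mean at each occasion by averaging over subjects, ignoring each subject's other responses. The response matrix is hypothetical.

```python
# Hypothetical binary responses Y_ij: rows are subjects i, columns are occasions j.
responses = [
    [1, 1, 0],
    [0, 1, 1],
    [1, 0, 0],
    [0, 0, 0],
]

def marginal_means(rows):
    """Estimate E(Y_ij) at each occasion j by averaging over subjects."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

print(marginal_means(responses))
```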
Succimer    Placebo
28          42
78          16
76          26
54          26
Suppose, for example, that the probability of a blood lead < 20 for
participants in the TLC trial is described by a logistic model, but that the
risk for an individual child depends on that child's latent (perhaps
environmentally and genetically determined) 'random response level'.
Then we might consider a model where
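$$\operatorname{logit} \Pr(Y_{ij} = 1 \mid b_i) = \beta_0 + \beta_1\,\mathrm{time1}_{ij} + \beta_2\,\mathrm{time2}_{ij} + \beta_3 X_i + b_i$$
Note that such a model also requires specification of $F(b_i)$, the distribution of $b_i$.
One common assumption is that $b_i$ is normally distributed with mean 0 and
unknown variance.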
This is an example of a generalized linear mixed effects model.
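The sketch below shows how the random intercept shifts an individual child's risk under such a model. The coefficient values are hypothetical, and time1 and time2 are treated as 0/1 indicators, which is an assumption on our part.

```python
import math

# Hypothetical coefficient values, chosen only to make the sketch concrete.
beta0, beta1, beta2, beta3 = -1.0, 0.8, 0.5, 1.2

def prob_given_b(time1, time2, x, b_i):
    """Pr(Y_ij = 1 | b_i) under the random-intercept logistic model."""
    eta = beta0 + beta1 * time1 + beta2 * time2 + beta3 * x + b_i
    return 1 / (1 + math.exp(-eta))

# Same covariates, three different latent response levels b_i.
probs = [prob_given_b(1, 0, 1, b) for b in (-1.0, 0.0, 1.0)]
print(probs)
```

Children with identical covariates have different probabilities because of b_i, so the regression coefficients describe effects for a given child rather than for the population as a whole.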
Thus, for the blood lead data in the TLC trial, we could write the
probability model as
$f(Y_{i1})\, f(Y_{i2} \mid Y_{i1}) \cdots f(Y_{ip} \mid Y_{i1}, \ldots, Y_{i,p-1})$
That is, the probability of a blood lead value below 20 at time 2 is
modeled conditional on whether blood lead was below 20 at time 1,
and so on.
If the conditional probability depends only on the previous state, we have
$f(Y_{i1})\, f(Y_{i2} \mid Y_{i1}) \cdots f(Y_{ip} \mid Y_{i,p-1})$
As noted above, transitional models potentially provide a complete
description of the joint distribution.
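For a binary response, the first-order factorization can be sketched as follows. The initial and transition probabilities are hypothetical; the point is only that multiplying the conditional pieces gives a proper joint distribution over all sequences.

```python
from itertools import product

# Hypothetical probabilities for a binary response (assumed for illustration).
p_first = {1: 0.4, 0: 0.6}          # f(Y_i1)
p_next = {0: {1: 0.3, 0: 0.7},      # f(Y_ij | Y_i,j-1 = 0)
          1: {1: 0.8, 0: 0.2}}      # f(Y_ij | Y_i,j-1 = 1)

def joint_prob(seq):
    """Joint probability f(Y_i1) * prod_j f(Y_ij | Y_i,j-1) of a 0/1 sequence."""
    prob = p_first[seq[0]]
    for prev, cur in zip(seq, seq[1:]):
        prob *= p_next[prev][cur]
    return prob

# Summing over all 2^3 sequences of length 3 should give 1.
total = sum(joint_prob(seq) for seq in product((0, 1), repeat=3))
print(joint_prob((1, 1, 0)), total)
```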
For the linear model, the regression coefficients of the marginal, mixed
effects and transitional models can be directly related to one another.
For example, coefficients from random effects and transitional models can
be given marginal interpretations.
However, with non-linear link functions, such as the logit or log link
functions, this is not the case.
We will return to this point later.
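A small numerical sketch of this contrast, under assumed values: a subject-specific model eta = b0 + b1 x + b_i with b_i normal with mean 0, averaged over b_i by a simple trapezoid rule. With the identity link the population-averaged effect of x equals b1; with the logit link the population-averaged log odds ratio is smaller in magnitude than b1. All names and numbers here are illustrative.

```python
import math

def marginal_mean(eta, sd_b, inv_link, n=4001, width=8.0):
    """Average inv_link(eta + b) over b ~ N(0, sd_b^2) with a trapezoid rule."""
    lo, hi = -width * sd_b, width * sd_b
    h = (hi - lo) / (n - 1)
    total = 0.0
    for k in range(n):
        b = lo + k * h
        dens = math.exp(-b * b / (2 * sd_b ** 2)) / (sd_b * math.sqrt(2 * math.pi))
        w = 0.5 if k in (0, n - 1) else 1.0  # trapezoid end-point weights
        total += w * inv_link(eta + b) * dens * h
    return total

def expit(eta):
    return 1 / (1 + math.exp(-eta))

def logit(p):
    return math.log(p / (1 - p))

def identity(eta):
    return eta

# Hypothetical subject-specific coefficients and random-effect SD.
b0, b1, sd_b = -0.5, 1.0, 2.0

linear_effect = (marginal_mean(b0 + b1, sd_b, identity)
                 - marginal_mean(b0, sd_b, identity))
logit_effect = (logit(marginal_mean(b0 + b1, sd_b, expit))
                - logit(marginal_mean(b0, sd_b, expit)))
print(linear_effect, logit_effect)
```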
For now, we describe the development and application of marginal models
for the analysis of repeated responses.