MRD
Sri Chavva
February 27, 2019
Introduction
In this document, I briefly explain what random intercept and random slope models are, how they differ from
conventional regression models, and how a random intercept model applies to Dr. Rootman's ptosis
experiment.
library(readxl)
##
## Call:
## lm(formula = MRD ~ `Weight (g)`, data = weightedEye)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.68434 -0.55467 -0.01001 0.45801 2.92132
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.31832 0.09549 34.751 < 2e-16 ***
## `Weight (g)` -0.52831 0.07097 -7.445 9.62e-13 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.955 on 311 degrees of freedom
## Multiple R-squared: 0.1513, Adjusted R-squared: 0.1485
## F-statistic: 55.42 on 1 and 311 DF, p-value: 9.618e-13
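# Grid of weights (g) at which to evaluate the fitted line
xFixed = data.frame(`Weight (g)`= seq(0,2.4,length=20))
names(xFixed) = names(weightedEye)[5]
# Fitted MRD from the fixed effect model at each weight
yFixed = predict(fixedModel, newdata = xFixed)
prediction = data.frame(xFixed,yFixed)
prediction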
## Weight..g. yFixed
## 1 0.0000000 3.318316
## 2 0.1263158 3.251582
## 3 0.2526316 3.184848
## 4 0.3789474 3.118114
## 5 0.5052632 3.051381
## 6 0.6315789 2.984647
## 7 0.7578947 2.917913
## 8 0.8842105 2.851179
## 9 1.0105263 2.784445
## 10 1.1368421 2.717711
## 11 1.2631579 2.650977
## 12 1.3894737 2.584243
## 13 1.5157895 2.517509
## 14 1.6421053 2.450775
## 15 1.7684211 2.384041
## 16 1.8947368 2.317307
## 17 2.0210526 2.250573
## 18 2.1473684 2.183839
## 19 2.2736842 2.117106
## 20 2.4000000 2.050372
ggplot(data = weightedEye,aes(x=weightedEye$`Weight (g)`,y = weightedEye$MRD)) + geom_line(aes(colour =
[Figure: MRD (y-axis, ticks at 2, 4, 6) versus weight in grams, one line per subject, coloured by Subject ID (24 subjects, N1 to N2-18).]
Now let's create a similar regression model that treats the intercept as a random effect. In effect, we
assume that every subject's intercept is different. This relaxes the assumption that everyone
has the same MRD at weight = 0 (MRD without a weight). This is beneficial because people probably do
have different baseline MRDs, and treating the intercept as a random effect lets the model capture
that.
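In symbols, the random intercept model described above can be written as follows. The notation is an assumption introduced here for illustration (i indexes subjects, j indexes measurements within a subject, and u_i is the subject-specific offset from the average intercept); it is not taken from the original analysis.

```latex
\begin{aligned}
\mathrm{MRD}_{ij} &= (\beta_0 + u_i) + \beta_1\,\mathrm{Weight}_{ij} + \varepsilon_{ij},\\
u_i &\sim \mathcal{N}(0,\ \sigma_u^2), \qquad
\varepsilon_{ij} \sim \mathcal{N}(0,\ \sigma^2).
\end{aligned}
```

Setting \sigma_u^2 = 0 forces every u_i to zero and recovers the conventional regression, in which all subjects share the single intercept \beta_0.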
library(lme4)
## [1] 3.334749
xFixed = data.frame(`Weight (g)`= seq(0,2.4,length=20))
names(xFixed) = names(weightedEye)[5]
ggplot(data = weightedEye,aes(x=`Weight (g)`,y = MRD)) + geom_line(aes(colour = `Subject ID`)) +
geom_line(data = xFixed,aes(`Weight (g)`,mean(coef(modelRandomIntercept)[[1]][[1]]) + (`Weight (g)`)*mea
[Figure: MRD (y-axis, ticks at 2, 4, 6) versus weight in grams, one line per subject, coloured by Subject ID (24 subjects, N1 to N2-18).]
Great! We have now created two regression lines, and they look nearly identical. The first comes from a fixed
effect model (created with the lm function) and the second from a mixed effect model that treats the intercept as
a random effect (created with the lmer function). There is, however, a slight difference. Let's look at the
intercepts of both models.
cat("fixed effect model intercept : ", coef(fixedModel)[1], "\n")
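The same comparison can be tried without the original spreadsheet. The sketch below uses simulated data as a stand-in for the ptosis measurements: the data frame `sim`, its columns, the subject labels, and every numeric setting are hypothetical choices made for illustration. Rows are dropped at random so that the design is unbalanced, which is one situation in which the two intercepts can differ.

```r
library(lme4)

set.seed(42)
# Hypothetical stand-in data: 24 subjects, weights from 0 to 2.4 g in 0.4 g steps.
subjects = paste0("S", 1:24)
baseline = rnorm(24, mean = 3.3, sd = 0.8)   # subject-specific MRD at weight = 0
names(baseline) = subjects
sim = expand.grid(subject = subjects, weight = seq(0, 2.4, by = 0.4),
                  stringsAsFactors = FALSE)
sim$mrd = baseline[sim$subject] - 0.5 * sim$weight + rnorm(nrow(sim), sd = 0.3)
sim$subject = factor(sim$subject)

# Keep 130 of the 168 rows so that subjects have unequal numbers of measurements.
sim = sim[sample(nrow(sim), size = 130), ]

fixedFit = lm(mrd ~ weight, data = sim)                      # one shared intercept
mixedFit = lmer(mrd ~ weight + (1 | subject), data = sim)    # random intercept

cat("fixed effect model intercept : ", coef(fixedFit)[1], "\n")
cat("mixed effect model intercept : ", fixef(mixedFit)[1], "\n")
```

Here `fixef()` returns the population-level coefficients of the mixed model, which are the counterpart of `coef()` on an `lm` fit.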
Applying splines
What we have so far is useful, but we can see that the trends are not very linear. For any specific
patient, the effect of added weight on MRD ebbs and flows: in some segments
additional weight causes a large change in MRD, and in others
it makes little difference. We can model these changes in intensity using what
are known as splines.
Essentially, we create breakpoints on the predictor variable (x-axis) and fit a separate regression
model to each segment. Typically we make sure that the lines on adjacent segments connect,
and connect smoothly (differentiably), but here we will only worry about connecting the lines.
Choosing how many segments to use and where to place the breakpoints is often done through EDA or
expert judgement.
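The breakpoint idea can be sketched on its own before fitting anything. The example below builds a piecewise-linear basis with `bs()` by setting `degree = 1` and passing explicit knots. The breakpoints (0.8 g and 1.6 g) and the coefficient values are hypothetical, chosen only to show how segments that connect but are not smooth arise.

```r
library(splines)

# Hypothetical breakpoints at 0.8 g and 1.6 g. degree = 1 gives straight
# segments that join at the knots (continuous, but not differentiable there).
weight = seq(0, 2.4, length = 25)
basis = bs(weight, knots = c(0.8, 1.6), degree = 1)

dim(basis)   # 25 rows, 3 basis columns (2 knots + degree 1)

# A piecewise-linear curve is the intercept plus a weighted sum of the basis
# columns. These coefficient values are made up for illustration.
intercept = 3.3
beta = c(-0.2, -0.5, -1.4)
fittedLine = as.vector(intercept + basis %*% beta)

# With degree = 1, each coefficient is the change from the intercept at one
# knot (or at the right boundary), so segment slopes are differences of
# neighbouring knot values divided by the segment width.
knotValues = intercept + c(0, beta)
segmentSlopes = diff(knotValues) / 0.8
segmentSlopes
```

With these made-up values the last segment is the steepest, the kind of pattern a segmented model is meant to pick up.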
library(splines)
splinedModel = lmer(MRD~bs(`Weight (g)`,df=3)+(1|`Subject ID`),data=weightedEye)
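# Collect one value per coefficient column (intercept + 3 spline terms).
# The spline terms are fixed effects, so they are identical for every subject.
nCoefs = ncol(coef(splinedModel)[[1]])
betaCoefs = rep(0,nCoefs)
for(i in 1:nCoefs){
betaCoefs[i] = coef(splinedModel)[[1]][[i]][1]
}
# Treat each spline coefficient as the slope of one segment.
# Note: bs() defaults to degree = 3, so this is an approximation.
slopes = rep(0,nCoefs - 1)
for(i in 1:(nCoefs - 1)){ # parentheses matter: 1:n-1 means 0:(n-1) in R
slopes[i] = betaCoefs[i+1]
}
# Build the line on a 4-point grid, starting from the mean subject intercept.
xLine = seq(0,2.4,length=4)
yline = rep(0,length(xLine))
yline[1] = mean(coef(splinedModel)[[1]][[1]])
for(i in 1:(length(xLine)-1)){
yline[i+1] = yline[i] + slopes[i]*(2.4/3)
}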
testData= data.frame(xLine,yline)
[Figure: MRD (y-axis, ticks at 2, 4, 6) versus weight in grams, one line per subject, coloured by Subject ID (24 subjects, N1 to N2-18).]
Well! This is a major improvement over the initial linear regression model we started with. Let's look at the
summary table of our mixed effect segmented linear regression model.
summary(splinedModel)
##
## Correlation of Fixed Effects:
## (Intr) b(`W()`,d=3)1 b(`W()`,d=3)2
## b(`W()`,d=3)1 -0.357
## b(`W()`,d=3)2 -0.032 -0.480
## b(`W()`,d=3)3 -0.323 0.656 -0.399
All the slope coefficients are negative and the intercept seems to make sense. We can see that the area of
highest impact on MRD is the third (and last) segment (from 1.6 to 2.4 grams). This makes sense because
this is probably where most people reach their natural breakpoint (the weight beyond which they cannot handle
more), and thus the researcher chose to end the experiment with 2.4 gram weights.
Conclusion
We can see that introducing random effects and splines improves the accuracy of our model
substantially. From here, we can add additional predictors and random slopes. Give it a try!
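As a starting point for the random slope extension, here is a minimal sketch on simulated data. The data frame `sim`, the column names, and all numeric settings are hypothetical and not drawn from the ptosis experiment. The only change from a random intercept model is the term `(1 + weight | subjectId)`, which lets each subject have its own weight effect as well as its own baseline.

```r
library(lme4)

set.seed(7)
# Hypothetical data: each subject has its own baseline AND its own weight effect.
nSubj = 24
subjectId = factor(rep(paste0("S", 1:nSubj), each = 7))
weight = rep(seq(0, 2.4, by = 0.4), times = nSubj)
baseline = rep(rnorm(nSubj, mean = 3.3, sd = 0.8), each = 7)
slope = rep(rnorm(nSubj, mean = -0.5, sd = 0.2), each = 7)
sim = data.frame(subjectId, weight,
                 mrd = baseline + slope * weight + rnorm(nSubj * 7, sd = 0.3))

# (1 + weight | subjectId): random intercept and random slope per subject,
# with their correlation estimated as well.
randomSlopeModel = lmer(mrd ~ weight + (1 + weight | subjectId), data = sim)

fixef(randomSlopeModel)                  # population-average intercept and slope
head(coef(randomSlopeModel)$subjectId)   # per-subject intercepts and slopes
```

Comparing this fit against the random intercept version, for example with `anova()` on the two models, is one way to judge whether the extra flexibility is supported by the data.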