9.1 Overly simple random effects structure in LMMs inflate Type I error
We return here to the data-set we saw earlier, the Grodner and Gibson (2005) self-paced reading study (their Experiment 1) on English relative clauses. First, load the data, fit a model with a full variance-covariance matrix for subjects and items, and then extract the parameters from the fitted model, just like we did in the simulation chapter.
library(lingpsych)
## Grodner and Gibson 2005 Expt 1 data:
data("df_gg05e1")
<-df_gg05e1
gg05e1## look at the data frame:
head(gg05e1)
## subject item condition rawRT
## 6 1 1 objgap 320
## 19 1 2 subjgap 424
## 34 1 3 objgap 309
## 49 1 4 subjgap 274
## 68 1 5 objgap 333
## 80 1 6 subjgap 266
## convert raw RTs to log RTs:
$logrt<-log(gg05e1$rawRT)
gg05e1## sum-coding of predictor:
$so<-ifelse(gg05e1$condition=="objgap",1/2,-1/2)
gg05e1## fit model to log RTs:
<-lmer(logrt ~ so +
m1+so|subject) +
(1+so|item),
(data=gg05e1,
## "switch off" warnings:
control=lmerControl(calc.derivs=FALSE))
## boundary (singular) fit: see help('isSingular')
## extract parameter estimates:
<-round(summary(m)$coefficients[,1],4)
beta<-round(attr(VarCorr(m),"sc"),2)
sigma_e<-round(attr(VarCorr(m)$subject,"stddev"),4)
subj_ranefsd<-round(attr(VarCorr(m)$subject,"corr"),1)
subj_ranefcorr
## assemble variance-covariance matrix for subjects:
<-SIN::sdcor2cov(stddev=subj_ranefsd,
Sigma_ucorr=subj_ranefcorr)
<-round(attr(VarCorr(m)$item,"stddev"),4)
item_ranefsd<-round(attr(VarCorr(m)$item,"corr"),1)
item_ranefcorr
## assemble variance matrix for items:
## choose some intermediate values for correlations:
<-(diag(2) + matrix(rep(1,4),ncol=2))/2
corr_matrix
<-SIN::sdcor2cov(stddev=item_ranefsd,
Sigma_wcorr=corr_matrix)
We are now ready to compute Type I error using simulation.
9.1.1 Type I error with a varying intercepts-only model
Compute Type I error for a varying intercepts model using log reading times:
## null is true:
2]<-0
beta[<-100
nsim<-rep(NA,nsim)
significant
for(i in 1:nsim){
#generate sim data:
<-gen_sim_lnorm2(nitem=16,
simdatnsubj=42,
beta=beta,
Sigma_u=Sigma_u,
Sigma_w=Sigma_w,
sigma_e=sigma_e)
## fit model to sim data:
<-lmer(log(rt)~so+(1|subj)+(1|item),simdat,
mcontrol=lmerControl(calc.derivs=FALSE))
## record whether t-value is significant:
<-ifelse(abs(summary(m)$coefficients[2,3])>2,1,0)
significant[i] }
## boundary (singular) fit: see help('isSingular')
## boundary (singular) fit: see help('isSingular')
## boundary (singular) fit: see help('isSingular')
## boundary (singular) fit: see help('isSingular')
## boundary (singular) fit: see help('isSingular')
## boundary (singular) fit: see help('isSingular')
## boundary (singular) fit: see help('isSingular')
## boundary (singular) fit: see help('isSingular')
## boundary (singular) fit: see help('isSingular')
## boundary (singular) fit: see help('isSingular')
## boundary (singular) fit: see help('isSingular')
## boundary (singular) fit: see help('isSingular')
## boundary (singular) fit: see help('isSingular')
## boundary (singular) fit: see help('isSingular')
## boundary (singular) fit: see help('isSingular')
## boundary (singular) fit: see help('isSingular')
## boundary (singular) fit: see help('isSingular')
## boundary (singular) fit: see help('isSingular')
## boundary (singular) fit: see help('isSingular')
## boundary (singular) fit: see help('isSingular')
## boundary (singular) fit: see help('isSingular')
## boundary (singular) fit: see help('isSingular')
## boundary (singular) fit: see help('isSingular')
## Type I error is quite high:
mean(significant)
## [1] 0.31
We find that Type I error is quite high. This is the problem with fitting varying intercepts-only models that Barr et al. (2013) point out: if the data-generative process is such that varying slopes should have been included in the model, fitting a varying intercepts-only model generally will lead to Type I error inflation. Of course, the extent of the inflation will vary from data-set to data-set; no simulation can give us a general answer to the question of how much inflation will happen.
9.1.2 Type I error with a varying intercepts and varying slopes model
Next, let’s compute Type I error for a model with varying intercepts and varying slopes, but without correlations for the random effects (because these would be difficult to estimate with the number of subjects (42) and items (16) used in Grodner and Gibson (2005)). The dependent measure is still log rt.
<-rep(NA,nsim)
significant
for(i in 1:nsim){
#generate sim data:
<-gen_sim_lnorm2(nitem=16,
simdatnsubj=42,
beta=beta,
Sigma_u=Sigma_u,
Sigma_w=Sigma_w,
sigma_e=sigma_e)
## fit model to sim data:
<-lmer(log(rt)~so+(1+so|subj)+(1+so|item),simdat,
mcontrol=lmerControl(calc.derivs=FALSE))
## is the t-value significant?
<-ifelse(abs(summary(m)$coefficients[2,3])
significant[i]>2,1,0)
}
## Type I error is the expected value:
mean(significant)
[1] 0.06
Now we find that Type I error is near the expected 5%. As Barr et al. (2013) pointed out in their paper, when varying slopes are included, Type I error will have the expected value.
9.1.3 Type I error inflation due to model mis-specification
Although this point is not discussed in Barr et al. (2013), an inappropriate likelihood function, like the Normal likelihood in the case where the true generative process involves a log-normal distribution, can also lead to inflated Type I error. This can happen even if the model has a full variance-covariance matrix specified for it. This is because of extreme values generated by the log-normal distribution being excessively influential in generating Type I errors.
Type I error is inflated even with a maximal model if the likelihood is mis-specified:
mean(significantlog)
## [1] 0.04
mean(significantraw)
## [1] 0.12
One can also compare the estimates (along with their 95% confidence intervals) that were significant in the raw and log ms scales:
<- position_dodge(.3)
pd ggplot(ests, aes(x=n, y=mn, colour=transform, group=transform)) + geom_errorbar(aes(ymin=lower, ymax=upper),
width=.2, size=0.25, colour="black", position=pd) + geom_line(position=pd) +
geom_point(position=pd, size=2.5)+theme_bw()+theme(axis.title.x = element_blank())
with(ests,tapply(width,transform,mean))
## log raw
## 71.59 94.42
We see that the effect estimates that were significant were based on over-/mis-estimates on the raw scale (the true effect is 0 ms).