3.5 Linear mixed models
We return to our subject and object relative clause data from English (Experiment 1 of Grodner and Gibson 2005). First load the data as usual, define relative clause type as a sum-coded predictor, and create a new column called so
that represents the contrast coding (\(\pm 1\) sum contrasts). From this point on, we will fit log-transformed reading times. To this end, create a column that holds log-transformed reading time.
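For readers who want a self-contained starting point, here is a minimal sketch of the loading step; the file path below is only a placeholder, and the data should be read from whichever source was used in the earlier chapters.
## sketch of the loading step; the file path is a placeholder:
gg05e1 <- read.table("data/grodnergibsonE1crit.txt", header = TRUE)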
gg05e1$so <- ifelse(gg05e1$condition == "objgap", 1, -1)
gg05e1$logrt <- log(gg05e1$rawRT)
Recall that these data have multiple measurements from each subject for each condition:
t(xtabs(~ subject + condition, gg05e1))
## subject
## condition 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
## objgap 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8
## subjgap 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8
## subject
## condition 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33
## objgap 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8
## subjgap 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8
## subject
## condition 34 35 36 37 38 39 40 41 42
## objgap 8 8 8 8 8 8 8 8 8
## subjgap 8 8 8 8 8 8 8 8 8
We can visualize the different reading times of each of the 42 subjects as in Figure 3.4.
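The plotting code for Figure 3.4 is not shown here; a minimal sketch using lattice (one panel per subject, with a regression line added in each panel) would look something like this:
library(lattice)
## one panel per subject; "r" adds a per-panel regression line:
xyplot(logrt ~ so | factor(subject), gg05e1,
  type = c("p", "r"),
  xlab = "relative clause type (sum-coded)",
  ylab = "log reading time"
)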
It’s clear from this figure that different subjects have different effects of the relative clause manipulation: some slopes are positive, some are flat, and some are negative. There is between-subject variability in the relative clause effect.
Given these differences between subjects, you could fit a separate linear model for each subject, collect together the slopes for each subject, and then check if the slopes are significantly different from zero. There is a function in the package lme4 that computes separate linear models for each subject: lmList.
library(lme4)
lmlist_m1 <- lmList(logrt ~ so | subject, gg05e1)
One can extract the intercept and slope estimates for each subject. For example, for subject 1:
lmlist_m1$`1`$coefficients
## (Intercept) so
## 5.76962 0.04352
Figure 3.5 shows the individual fitted linear models for each subject, as well as the fit of a simple linear model m1log for all the data taken together (the gray line); this figure shows how each subject deviates in intercept and slope from m1log’s intercept and slope.
To find out if there is an effect of relative clause type, we simply need to check whether the slopes of the individual subjects’ fitted lines taken together are significantly different from zero. A one-sample t-test will achieve this:
summary_ttest(t.test(coef(lmlist_m1)[2]))
## [1] "t(41)=2.81 p=0.008"
## [1] "est.: 0.06 [0.02,0.11] ms"
The above test is exactly the same as the paired t-test and the varying intercepts linear mixed model that we fit earlier using the by-subject aggregated data. The only difference is that we are now using log reading times.
bysubj <- aggregate(log(rawRT) ~ subject + condition,
  data = gg05e1, FUN = mean
)
colnames(bysubj)[3] <- "logrt"
Compare the paired t-test and linear mixed model results (the t-values):
summary_ttest(t.test(logrt ~ condition, bysubj,
paired = TRUE))
## [1] "t(41)=2.81 p=0.008"
## [1] "est.: 0.12 [0.03,0.21] ms"
## compare with linear mixed model:
summary(lmer(logrt ~ condition + (1 | subject),
  bysubj
))$coefficients[2, ]
## Estimate Std. Error t value
## -0.12403 0.04414 -2.81021
The t-test we carried out using the slopes from the lmList
function is called repeated measures regression. This repeated measures regression model is now largely of historical interest, and useful only for understanding the linear mixed model, which will be our standard approach.
We turn next to three main types of linear mixed model; other variants will be introduced in later chapters.
3.5.1 Model type 1: Varying intercepts
We have already encountered this model. In the model shown below, the statement
\[\begin{equation} (1 \mid \hbox{subject}) \end{equation}\]
adjusts the grand mean estimates of the intercept by a term (a number) for each subject. That was the term called \(u_{0i}\) earlier in the chapter.
m0_lmer <- lmer(logrt ~ so + (1 | subject), gg05e1)
Notice that we did not aggregate the data this time.
Here is the abbreviated output from model m0_lmer:
Random effects:
Groups Name Variance Std.Dev.
subject (Intercept) 0.09983 0.3160
Residual 0.14618 0.3823
Number of obs: 672, groups: subject, 42
Fixed effects:
Estimate Std. Error t value
(Intercept) 5.88306 0.05094 115.497
so 0.06202 0.01475 4.205
One thing to notice in the present example is that the coefficients (intercept and slope) of the fixed effects of the above model are identical to those in the linear model:
m0_lm <- lm(logrt ~ so, gg05e1)
summary(m0_lm)
##
## Call:
## lm(formula = logrt ~ so, data = gg05e1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.183 -0.308 -0.109 0.260 2.575
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 5.8831 0.0191 308.78 <2e-16 ***
## so 0.0620 0.0191 3.26 0.0012 **
## ---
## Signif. codes:
## 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.494 on 670 degrees of freedom
## Multiple R-squared: 0.0156, Adjusted R-squared: 0.0141
## F-statistic: 10.6 on 1 and 670 DF, p-value: 0.00119
What is different between the linear model and the linear mixed model is the standard error. In the latter, the standard error is determined by more than one source of variance, as we explain below.
The intercept adjustments for each subject can be viewed by typing:
## first 10 subjects' intercept adjustments:
ranef(m0_lmer)$subject[, 1][1:10]
## [1] -0.103928 0.077195 -0.230621 0.234198 0.008828
## [6] -0.095363 -0.205571 -0.155371 0.075944 -0.364367
Figure 3.6 shows another way to summarize the adjustments to the grand mean intercept by subject. The error bars represent 95% confidence intervals.
library(lattice)
print(dotplot(ranef(m0_lmer, condVar = TRUE)))
## $subject
Figure 3.6 only shows the distribution of the adjustments \(u_{0i}\) around zero. One can also work out the estimates \(\hat\beta_0 + u_{0i}\), i.e., each subject’s predicted intercept from the model. However, some random variable theory is needed.
If we have two random variables \(X\) and \(Y\) that are independent, the variance of the sum of these two random variables is \(Var(X) + Var(Y)\). The sampling distribution of the estimate \(\hat\beta_0\) has some standard error \(SE_{\beta_0}\), and each subject’s intercept adjustment \(u_{0i}\) has an estimated standard deviation; both are available from the fitted linear mixed model. Treating the two as independent, the standard deviation of the sum \(\hat\beta_0 + u_{0i}\) is \(\sqrt{SE_{\beta_0}^2 + SD(u_{0i})^2}\).
The code below uses the above fact about random variable theory. The steps are:
- extract the random effect adjustments to the intercept for each subject; also extract the standard deviation of each estimate
- extract the intercept from the linear mixed model (\(\hat\beta_0\)), and its standard error
- work out the estimates \(\hat\beta_0 + u_{0i}\) and the standard deviation of the sum of these two random variables (keep in mind that for \(\hat\beta_0\) we are looking at the sampling distribution of the mean; the parameter \(\beta_0\) itself is not a random variable but a fixed value).
- use these estimates to work out the confidence intervals, and plot.
Here is the code that implements the above steps:
## extract the by-subject intercept adjustments u_0i and their SDs:
ranefu0 <- as.data.frame(ranef(m0_lmer, condVar = TRUE))
ranefu0SE <- transform(ranefu0,
  lwr = condval - 1.96 * condsd,
  upr = condval + 1.96 * condsd
)
u0 <- ranefu0SE$condval
u0sd <- ranefu0SE$condsd
## extract the estimated intercept and its standard error:
b0 <- summary(m0_lmer)$coefficients[1, 1]
b0SE <- summary(m0_lmer)$coefficients[1, 2]
## each subject's predicted intercept and its standard deviation:
b0u0 <- b0 + u0
b0u0SD <- sqrt(b0SE^2 + u0sd^2)
b0u0lower <- b0u0 - 2 * b0u0SD
b0u0upper <- b0u0 + 2 * b0u0SD
## assemble a data frame, ordered by the intercept estimates:
bysubjintercept <- data.frame(
  b = b0u0,
  lower = b0u0lower, upper = b0u0upper
)
bysubjintercept$subj <- factor(1:42)
bysubjintercept <- bysubjintercept[order(bysubjintercept$b), ]
bysubjintercept$subj <- factor(bysubjintercept$subj,
  levels = as.numeric(bysubjintercept$subj)
)
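The data frame bysubjintercept can now be used to draw each subject’s intercept estimate with its interval. The exact plotting code is not shown in the text; a minimal base-R sketch is:
## plot each subject's estimated intercept with its interval:
n <- nrow(bysubjintercept)
with(bysubjintercept, {
  plot(1:n, b,
    ylim = range(lower, upper), pch = 16,
    xlab = "subject (ordered by estimated intercept)",
    ylab = "estimated intercept (log rt)"
  )
  arrows(1:n, lower, 1:n, upper, angle = 90, code = 3, length = 0.02)
})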
3.5.2 The formal statement of the varying intercepts model
The model m0_lmer above implements the following type of linear model, where \(i\) indexes subjects and \(j\) indexes items.
Because this is a Latin square design, once we know the subject id and the item id, we know which subject saw which condition:
subset(gg05e1, subject == 1 & item == 1)
## subject item condition rawRT so logrt
## 6 1 1 objgap 320 1 5.768
The mathematical form of the linear mixed model is:
\[\begin{equation} y_{ij} = \beta_0 + u_{0i}+\beta_1\times so_{ij} + \varepsilon_{ij} \end{equation}\]
The only new thing here beyond the linear model we saw earlier is the by-subject adjustment to the intercept. These by-subject adjustments to the intercept \(u_{0i}\) are assumed by lmer
to come from a normal distribution centered around 0:
\[\begin{equation} u_{0i} \sim Normal(0,\sigma_{u0}) \end{equation}\]
The ordinary linear model m0_lm
has one intercept \(\beta_0\) for all subjects, whereas this linear mixed model with varying intercepts m0_lmer
has a different intercept (\(\beta_0 + u_{0i}\)) for each subject \(i\).
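One can read off each subject’s intercept \(\hat\beta_0 + u_{0i}\) directly: the coef() extractor in lme4 returns the fixed-effects estimates combined with the by-subject adjustments.
## subject-specific coefficients (intercept = beta_0 + u_0i); first few subjects:
head(coef(m0_lmer)$subject)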
Figure 3.8 visualizes the adjustments for each subject to the intercepts.
An important point is that in this model there are two variance components or sources of variance (cf. the linear model, which had only one):
- Between-subject variability in the intercept:
- \(u_0 \sim Normal(0,\sigma_{u0})\)
- Within-subject variability:
- \(\varepsilon \sim Normal(0,\sigma)\)
These two standard deviations determine the standard error of the \(\beta_1\) slope parameter.
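Both variance components can be read off the fitted model with the VarCorr() extractor:
## the two sources of variance in the varying intercepts model:
VarCorr(m0_lmer)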
3.5.3 Model type 2: Varying intercepts and varying slopes, without a correlation
Unlike the lmlist_m1 model above, which involved fitting a separate model for each subject, the model m0_lmer assumes different intercepts for each subject but the same slope for all subjects.
We can choose to fit different intercepts as well as different slopes for each subject. To achieve this, assume now that each subject’s slope is also adjusted:
\[\begin{equation} y_{ij} = \beta_0 + u_{0i}+(\beta_1+u_{1i})\times so_{ij} + \varepsilon_{ij} \end{equation}\]
That is, we additionally assume that \(u_{1i} \sim Normal(0,\sigma_{u1})\). The lmer notation for fitting separate intercepts and slopes is (1 + so || subject). The double vertical bar syntax will be explained soon.
m1_lmer <- lmer(logrt ~ so + (1 + so || subject), gg05e1)
The output of this model shows that now there are not two but three sources of variability. These are:
- Between-subject variability in the intercept:
- \(u_0 \sim Normal(0,\sigma_{u0})\)
- Between-subject variability in the slope:
- \(u_1 \sim Normal(0,\sigma_{u1})\)
- Within-subject variability:
- \(\varepsilon \sim Normal(0,\sigma)\)
The model output reports the estimates of the three standard deviations:
- \(\hat\sigma_{u0}=0.317\)
- \(\hat\sigma_{u1}=0.110\)
- \(\hat\sigma = 0.365\).
Random effects:
Groups Name Variance Std.Dev.
subject (Intercept) 0.1006 0.317
subject.1 so 0.0121 0.110
Residual 0.1336 0.365
Number of obs: 672, groups: subject, 42
Fixed effects:
Estimate Std. Error t value
(Intercept) 5.8831 0.0509 115.50
so 0.0620 0.0221 2.81
Figure 3.9 visualizes these fits for each subject. As before, the gray line shows the model with a single intercept and slope, i.e., the simple linear model m0_lm.
3.5.3.1 Comparing the lmList model with the varying intercepts and varying slopes model
Compare this m1_lmer model with the lmlist_m1 model we fitted earlier. Figure 3.10 visualizes the difference between the two models.
What is striking in Figure 3.10 is that in the linear mixed model plot, each subject’s estimated best-fit line is “gravitating” towards the gray line, which represents the simple linear model fit. Compare these lines to those from the lmList fits; there, each subject’s fitted line is based only on that subject’s data, and the lines do not gravitate towards the gray line. This gravitation of the individual subjects’ intercepts and slopes towards the grand mean intercept and slope in the linear mixed model is called shrinkage. What is happening here is that the estimates of the intercept and slope for each subject are informed by two sources of information: the average behavior of all the subjects taken together, and the individual subject’s own measurements. By contrast, in the lmList model, the individual subjects’ fits are informed only by the individual subjects’ data, ignoring the estimates of the average intercept and slope over all subjects.
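Shrinkage can also be seen numerically by placing the two sets of estimates side by side; the coef() extractor again gives each subject’s partially pooled intercept and slope (a sketch):
## no-pooling estimates (separate model per subject):
head(coef(lmlist_m1))
## partial-pooling estimates from the linear mixed model:
head(coef(m1_lmer)$subject)
## the grand-mean (fixed-effects) estimates they shrink towards:
fixef(m1_lmer)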
3.5.3.2 Visualizing random effects
As before, it is instructive to visualize the by-subject adjustments to the intercept and slope; see Figure 3.11. Figure 3.11 shows wide variability in mean reading times between subjects, but relatively small variation in the slope between subjects.
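A sketch of how such a plot can be produced with lattice, analogously to the earlier varying intercepts plot:
## by-subject intercept and slope adjustments with 95% intervals:
print(dotplot(ranef(m1_lmer, condVar = TRUE)))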
3.5.3.3 The formal statement of the varying intercepts and varying slopes linear mixed model
Here is the formal statement of the varying intercepts and varying slopes model. Again, \(i\) indexes subjects and \(j\) indexes items.
\[\begin{equation} y_{ij} = \beta_0 + u_{0i}+(\beta_1+u_{1i})\times so_{ij} + \varepsilon_{ij} \end{equation}\]
There are three variance components, which partition the variation in the data into between-subjects and within-subjects variability:
- Between-subjects variability in the intercept:
- \(u_0 \sim Normal(0,\sigma_{u0})\)
- Between-subjects variability in the slope:
- \(u_1 \sim Normal(0,\sigma_{u1})\)
- Within-subject variability:
- \(\varepsilon \sim Normal(0,\sigma)\)
3.5.3.4 Crossed random effects for subjects and for items
The varying intercepts and varying slopes model doesn’t capture the full story yet. The items also contribute sources of variance: just like subjects, items may vary in their reading times, or in the extent to which their reading times are affected by condition. In other words, items might also have different intercepts and slopes.
Notice that in the Latin square design (which is standard in psycholinguistic and linguistics research), each subject sees all the items, but only one condition of each item. In such cases, we say that subjects and items are crossed.
head(xtabs(~ subject + item, gg05e1))
## item
## subject 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
## 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
## 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
## 3 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
## 4 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
## 5 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
## 6 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
A design in which each subject sees only some of the items would be called partially crossed. A linear mixed model with (partially) crossed random effects for subjects and items can be defined with the following syntax:
m2_lmer <- lmer(logrt ~ so + (1 + so || subject) +
  (1 + so || item), gg05e1)
Analogously to the preceding example, now there are five variance components:
Random effects:
Groups Name Variance Std.Dev.
subject (Intercept) 0.10090 0.3177
subject.1 so 0.01224 0.1106
item (Intercept) 0.00127 0.0356
item.1 so 0.00162 0.0402
Residual 0.13063 0.3614
Number of obs: 672, groups: subject, 42; item, 16
Fixed effects:
Estimate Std. Error t value
(Intercept) 5.8831 0.0517 113.72
so 0.0620 0.0242 2.56
The item intercept and slope adjustments can be visualized as well; see Figure 3.12. Notice that in the present data, there is less item-level variation than subject-level variation; this is often the case in planned experiments in sentence processing, where the experimental items are carefully constructed to vary as little as possible. However, there could still be important variation among items, and for that reason one should always take item-level variability into account.
In the above models, there is an explicit assumption that there is no correlation between the varying intercept and slope adjustments by subject, and no correlation between the varying intercept and slope adjustments by item. It is possible that the intercept and slope adjustments are in fact correlated. We turn to this model next.
3.5.4 Model type 3: Varying intercepts and varying slopes, with correlation
A correlation can be introduced between the intercept and slope adjustments by using a single vertical bar instead of two vertical bars in the random effects structure:
m3_lmer <- lmer(
  logrt ~ so + (1 + so | subject) + (1 + so | item),
  gg05e1
)
## boundary (singular) fit: see help('isSingular')
To understand what this model is doing, we have to recall what a bivariate/multivariate distribution is.
Random effects:
Groups Name Variance Std.Dev. Corr
subject (Intercept) 0.10103 0.3178
so 0.01228 0.1108 0.58
item (Intercept) 0.00172 0.0415
so 0.00196 0.0443 1.00 <= degeneracy
Residual 0.12984 0.3603
Number of obs: 672, groups: subject, 42; item, 16
Fixed effects:
Estimate Std. Error t value
(Intercept) 5.8831 0.0520 113.09
so 0.0620 0.0247 2.51
The correlations (0.58 and 1.00) shown in the model output above are the estimated correlations between the varying intercepts and slopes for subjects and for items, respectively. The variance-covariance matrix for items is called “degenerate”: because its correlation is 1, the matrix cannot be inverted, and variance-covariance matrices must be invertible. The model output also issues a “singularity” warning, which is another way of saying that the matrix is not invertible.
When the correlation is \(\pm 1\) or very close to \(\pm 1\), this means that the optimizer in lme4
is unable to estimate the correlation parameter. This failure to estimate the correlation is almost always due to there not being enough data. If you are in such a situation, you are better off not trying to estimate this parameter with the data you have, and instead fitting one of the simpler models. We will return to this point when discussing model selection. For further discussion, see Barr et al. (2013), Bates et al. (2015), and Matuschek et al. (2017).
3.5.4.1 Formal statement of varying intercepts and varying slopes linear mixed model with correlation
As usual, \(i\) indexes subjects and \(j\) indexes items. The vector so contains the sum-coded condition: \(+1\) for object relatives and \(-1\) for subject relatives. The new elements in this model are the item-level effects, and the specification of the variance-covariance matrices for subjects and items, which include the correlation parameters.
\[\begin{equation} y_{ij} = \beta_0 + u_{0i} + w_{0j} + (\beta_1 + u_{1i} + w_{1j}) \times so_{ij} + \varepsilon_{ij} \end{equation}\]
where \(\varepsilon_{ij} \sim Normal(0,\sigma)\) and
\[\begin{equation}\label{eq:covmatLM} \Sigma_u = \begin{pmatrix} \sigma _{u0}^2 & \rho _{u}\sigma _{u0}\sigma _{u1}\\ \rho _{u}\sigma _{u0}\sigma _{u1} & \sigma _{u1}^2\\ \end{pmatrix} \quad \Sigma _w = \begin{pmatrix} \sigma _{w0}^2 & \rho _{w}\sigma _{w0}\sigma _{w1}\\ \rho _{w}\sigma _{w0}\sigma _{w1} & \sigma _{w1}^2\\ \end{pmatrix} \end{equation}\]
\[\begin{equation}\label{eq:jointpriordistLM} \begin{pmatrix} u_0 \\ u_1 \\ \end{pmatrix} \sim \mathcal{N} \left( \begin{pmatrix} 0 \\ 0 \\ \end{pmatrix}, \Sigma_{u} \right), \quad \begin{pmatrix} w_0 \\ w_1 \\ \end{pmatrix} \sim \mathcal{N}\left( \begin{pmatrix} 0 \\ 0 \\ \end{pmatrix}, \Sigma_{w} \right) \end{equation}\]
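The estimated matrices \(\hat\Sigma_u\) and \(\hat\Sigma_w\) can be inspected directly in the fitted model (a sketch):
## estimated variance-covariance matrices for subjects and items:
VarCorr(m3_lmer)$subject
VarCorr(m3_lmer)$item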
3.5.4.2 Visualizing the random effects
One can visualize the correlation between intercepts and slopes by subjects; see Figure 3.13. The positive correlation of 0.58 between subject intercept and slope adjustments implies that slower subjects show larger effects. The correlation pattern isn’t easy to see in Figure 3.13.
The correlation pattern becomes more apparent if we plot the slope adjustments against the intercept adjustments. See Figure 3.14.
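A minimal sketch of such a plot, using the conditional modes extracted with ranef():
## by-subject slope adjustments plotted against intercept adjustments:
u <- ranef(m3_lmer)$subject
plot(u[, "(Intercept)"], u[, "so"],
  xlab = "intercept adjustment (u_0)",
  ylab = "slope adjustment (u_1)"
)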
When we talk about hypothesis testing, we will look at what inferences we can draw from this correlation.
Figure 3.15 is a dotplot showing the item-level effects. The plot suggests a perfect correlation between intercept and slope adjustments, but as mentioned above, this correlation is a result of a failure to estimate the correlation. It is not meaningful.
For these data, the appropriate model would be one without correlations for items:
m4_lmer <- lmer(
  logrt ~ so + (1 + so | subject) + (1 + so || item),
  gg05e1
)
summary(m4_lmer)
## Linear mixed model fit by REML ['lmerMod']
## Formula:
## logrt ~ so + (1 + so | subject) + ((1 | item) + (0 + so | item))
## Data: gg05e1
##
## REML criterion at convergence: 696.8
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.000 -0.509 -0.129 0.292 6.180
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## subject (Intercept) 0.10088 0.3176
## so 0.01220 0.1105 0.58
## item (Intercept) 0.00132 0.0363
## item.1 so 0.00163 0.0403
## Residual 0.13060 0.3614
## Number of obs: 672, groups: subject, 42; item, 16
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 5.8831 0.0518 113.67
## so 0.0620 0.0242 2.56
##
## Correlation of Fixed Effects:
## (Intr)
## so 0.388
The double vertical bar for the item random effects is the syntax in lme4
for removing the correlation term that the model was unable to estimate. For the Grodner and Gibson (2005) data, model m4_lmer
is the one we would report in a paper. Specifically, the wording that would be appropriate is the following: “The estimate of the relative clause effect was 43 ms, with a t-value of 2.56. We can therefore reject the null hypothesis of no difference between the two relative clause types.”
In a balanced data set like this one (i.e., with no missing data), the t-test on the slope \(\beta_1\) is sufficient for drawing inferences about the null hypothesis. However, when there are missing data (as happens, for example, in eyetracking studies), the likelihood ratio test discussed in the next chapter is the more general method. In practice, researchers often interpret the t-value as we have done here, but in this book we will always use the likelihood ratio test (chapter 4) for hypothesis testing, as that is the more general approach.