8.1 A reminder: The maximal linear mixed model

Recall the structure of the linear mixed model that can be used to fit the Grodner and Gibson (2005) data. We will discuss the so-called maximal model here—varying intercepts and slopes for subject and for item, with correlations—because that is the most general case.

In the model specification below, \(i\) indexes subjects and \(j\) indexes items. The vector \(so\) uses sum contrast coding as usual: object relatives are coded as \(+1/2\) and subject relatives as \(-1/2\). We use this coding instead of \(\pm 1\) as before because the slope then reflects the full effect size (the difference between the two condition means) rather than half of it (see the hypothesis testing chapter).
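
As a quick check of what the slope represents under each coding, consider only the fixed-effects part of the model, writing \(\beta_0\) for the intercept and \(\beta_1\) for the slope and setting the subject and item adjustments aside. The symbols \(\mu_{OR}\) and \(\mu_{SR}\) (the predicted means for object and subject relatives) are introduced here only for this illustration. Under the \(\pm 1/2\) coding:

\[\begin{equation} \begin{aligned} \mu_{OR} &= \beta_0 + \tfrac{1}{2}\beta_1 \\ \mu_{SR} &= \beta_0 - \tfrac{1}{2}\beta_1 \\ \mu_{OR} - \mu_{SR} &= \beta_1 \end{aligned} \end{equation}\]

Under \(\pm 1\) coding the same subtraction gives \(\mu_{OR} - \mu_{SR} = 2\beta_1\), so the slope there equals half the difference between the two conditions.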

Every row in the data frame is uniquely identified by the combination of subject and item id, because this is a Latin square design: each subject sees each item exactly once, in only one of the two conditions.

\[\begin{equation} y_{ij} = \beta_0 + u_{0i} + w_{0j} + (\beta_1 + u_{1i} + w_{1j}) \times so_{ij} + \varepsilon_{ij} \end{equation}\]

where \(\varepsilon_{ij} \sim Normal(0,\sigma)\), with \(\sigma\) the residual standard deviation, and

\[\begin{equation}\label{eq:covmatsimulations} \Sigma_u = \begin{pmatrix} \sigma _{u0}^2 & \rho _{u}\sigma _{u0}\sigma _{u1}\\ \rho _{u}\sigma _{u0}\sigma _{u1} & \sigma _{u1}^2\\ \end{pmatrix} \quad \Sigma _w = \begin{pmatrix} \sigma _{w0}^2 & \rho _{w}\sigma _{w0}\sigma _{w1}\\ \rho _{w}\sigma _{w0}\sigma _{w1} & \sigma _{w1}^2\\ \end{pmatrix} \end{equation}\]

\[\begin{equation}\label{eq:jointpriordistsimulation} \begin{pmatrix} u_{0i} \\ u_{1i} \\ \end{pmatrix} \sim \mathcal{N} \left( \begin{pmatrix} 0 \\ 0 \\ \end{pmatrix}, \Sigma_{u} \right), \quad \begin{pmatrix} w_{0j} \\ w_{1j} \\ \end{pmatrix} \sim \mathcal{N}\left( \begin{pmatrix} 0 \\ 0 \\ \end{pmatrix}, \Sigma_{w} \right) \end{equation}\]
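
To make the role of the covariance matrices concrete, here is a minimal sketch in Python (standard library only) of how correlated intercept and slope adjustments could be drawn. It builds the \(2 \times 2\) matrix from two standard deviations and a correlation and samples from the bivariate normal using a Cholesky factor. The function names and the numerical values are assumptions chosen for illustration; they are not estimates from any data set.

```python
import math
import random

def cov_matrix(sd0, sd1, rho):
    """Build the 2x2 covariance matrix from two standard deviations and a correlation."""
    off = rho * sd0 * sd1
    return [[sd0 ** 2, off], [off, sd1 ** 2]]

def cholesky_2x2(sigma):
    """Lower-triangular factor L with L L^T = sigma, written out for the 2x2 case."""
    l00 = math.sqrt(sigma[0][0])
    l10 = sigma[1][0] / l00
    l11 = math.sqrt(sigma[1][1] - l10 ** 2)
    return [[l00, 0.0], [l10, l11]]

def draw_adjustments(sigma, n, rng):
    """Draw n (intercept, slope) adjustment pairs from N((0, 0), sigma)."""
    L = cholesky_2x2(sigma)
    pairs = []
    for _ in range(n):
        z0, z1 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        pairs.append((L[0][0] * z0, L[1][0] * z0 + L[1][1] * z1))
    return pairs

def sample_corr(pairs):
    """Pearson correlation between the two components of the drawn pairs."""
    n = len(pairs)
    m0 = sum(p[0] for p in pairs) / n
    m1 = sum(p[1] for p in pairs) / n
    s00 = sum((p[0] - m0) ** 2 for p in pairs)
    s11 = sum((p[1] - m1) ** 2 for p in pairs)
    s01 = sum((p[0] - m0) * (p[1] - m1) for p in pairs)
    return s01 / math.sqrt(s00 * s11)

# Illustrative values only (assumed, not estimated from data).
rng = random.Random(1)
sigma_u = cov_matrix(sd0=0.3, sd1=0.2, rho=0.6)
u = draw_adjustments(sigma_u, n=20000, rng=rng)
print(sample_corr(u))  # should be close to the rho used above
```

With many draws, the sample correlation of the pairs should approach the \(\rho\) that went into the matrix, which is a simple way to check that the off-diagonal term \(\rho\sigma_0\sigma_1\) was specified as intended.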

\(\beta_0\) and \(\beta_1\) are the intercept and slope. With the \(\pm 1/2\) coding, \(\beta_0\) is the grand mean and \(\beta_1\) is the difference between the object relative and subject relative condition means. \(u\) are the subject-level adjustments, and \(w\) the item-level adjustments, to the intercept and slope.

The above mathematical model expresses a generative process. To produce simulated data from this process, we have to decide on some parameter values. We do this by estimating the parameters from the Grodner and Gibson (2005) study.
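
The generative process can be sketched in a few lines of Python (standard library only). This is a rough illustration under stated assumptions, not the procedure used with the real data: the function names, the numbers of subjects and items, and all parameter values below are placeholders, and in practice the values would come from a model fit to the Grodner and Gibson (2005) data. Conditions are assigned by alternating over subjects and items, which is one simple way to obtain a two-condition Latin square.

```python
import math
import random

def draw_pair(sd0, sd1, rho, rng):
    """One (intercept, slope) adjustment pair with correlation rho."""
    z0, z1 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return sd0 * z0, sd1 * (rho * z0 + math.sqrt(1.0 - rho ** 2) * z1)

def simulate(n_subj, n_item, beta0, beta1, sigma,
             sd_u, rho_u, sd_w, rho_w, seed=0):
    """Simulate one data set from the maximal model; returns a list of row dicts."""
    rng = random.Random(seed)
    # Subject-level (u) and item-level (w) adjustments to intercept and slope.
    u = [draw_pair(sd_u[0], sd_u[1], rho_u, rng) for _ in range(n_subj)]
    w = [draw_pair(sd_w[0], sd_w[1], rho_w, rng) for _ in range(n_item)]
    rows = []
    for i in range(n_subj):
        for j in range(n_item):
            # Latin square: each subject sees each item once, in one condition.
            so = 0.5 if (i + j) % 2 == 0 else -0.5
            mu = beta0 + u[i][0] + w[j][0] + (beta1 + u[i][1] + w[j][1]) * so
            rows.append({"subj": i + 1, "item": j + 1, "so": so,
                         "y": rng.gauss(mu, sigma)})
    return rows

# Placeholder values for illustration only (not estimates from the study).
data = simulate(n_subj=40, n_item=16, beta0=6.0, beta1=0.1, sigma=0.4,
                sd_u=(0.3, 0.2), rho_u=0.6, sd_w=(0.1, 0.1), rho_w=0.0)
print(len(data), data[0])
```

Each simulated row is identified by its subject and item id, and every subject contributes the same number of observations to each condition, mirroring the design described above.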

References

Grodner, Daniel, and Edward Gibson. 2005. “Consequences of the Serial Nature of Linguistic Input.” Cognitive Science 29: 261–90.