3.4 From the paired t-test to the linear mixed model

One important point to notice is that the observed t-value from the paired t-test and the t-value printed out by the simple linear model don’t match:

summary_ttest(t.test(rawRT ~ condition, bysubj, 
                     paired = TRUE))
## [1] "t(41)=3.11 p=0.003"
## [1] "est.: 102.29 [35.85,168.72] ms"
round(summary(m0raw)$coefficients, 2)[, c(1:3)]
##                  Estimate Std. Error t value
## (Intercept)         471.4      31.13   15.14
## conditionsubjgap   -102.3      44.02   -2.32

This is because the linear model is equivalent to the unpaired (i.e., two-sample) t-test:

summary_ttest(t.test(rawRT ~ condition, bysubj,
  paired = FALSE
))
## [1] "t(57)=2.32 p=0.024"
## [1] "est.: 471.36 [14.14,190.43] ms"
## [2] "est.: 369.07 [14.14,190.43] ms"

As discussed earlier, the paired t-test’s analog in the linear modeling framework is the linear mixed model with varying intercepts. We turn next to this extension of the simple linear model. The command corresponding to the paired t-test in the linear modeling framework is as follows. Notice that we are working with the aggregated data, as in the paired t-test. Later, we will switch to the unaggregated data; that will be the correct analysis of these data.

m0raw_lmer <- lmer(rawRT ~ condition + (1 | subject), bysubj)
summary(m0raw_lmer)$coefficients
##                  Estimate Std. Error t value
## (Intercept)         471.4      31.13  15.143
## conditionsubjgap   -102.3      32.90  -3.109

To understand the connection between the paired t-test and the above command, it is necessary to consider how a paired t-test is assembled.

First, some background knowledge from random variable theory. If you have two random variables that have correlation \(\rho\), the variance of the difference between the two random variables is:

\[\begin{equation} Var(Y_1-Y_2)=Var(Y_1) + Var(Y_2) - 2\times Cov(Y_1, Y_2) \end{equation}\]

\(Cov(Y_1, Y_2)\) is the covariance between the two random variables and is defined as:

\[\begin{equation} Cov(Y_1, Y_2) = \rho \sqrt{Var(Y_1)}\sqrt{Var(Y_2)} \end{equation}\]

You can find the proofs of the above assertions in books like Rice (1995).
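
These identities hold exactly for sample variances and covariances as well, so they are easy to check numerically. A minimal sketch (the two vectors below are arbitrary, made-up numbers used only for illustration):

a <- c(1.2, 0.4, -0.3, 2.1, 0.8)
b <- c(0.9, 0.2, 0.5, 1.7, -0.1)
## variance of the difference:
var(a - b)
## the same quantity assembled from the two variances and the covariance:
var(a) + var(b) - 2 * cov(a, b)
## the covariance re-expressed through the correlation:
cor(a, b) * sd(a) * sd(b)
cov(a, b)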

As discussed earlier, a paired t-test is used when you have paired data from subjects \(i=1,\dots,n\) in two conditions; assume that they are labeled conditions 1 and 2. Let’s write the data as two vectors \(Y_{1}, Y_{2}\). Because the two data points in each pair come from the same subject, they are correlated with some correlation \(\rho\); recall Figure 2.23. Assume that both conditions 1 and 2 have standard deviation \(\sigma\).

To make this discussion concrete, let’s generate some simulated bivariate data that are correlated. At this point, you may want to review chapter 1’s section on bivariate/multivariate distributions. Assume that \(\sigma = 1\), \(\rho=0.5\), and that the data are balanced.

samplesize <- 12
mu <- c(.3, .2)
rho <- 0.5
stddev <- 1
Sigma <- matrix(stddev, nrow = 2, ncol = 2) + diag(2)
Sigma <- Sigma / 2
Sigma
##      [,1] [,2]
## [1,]  1.0  0.5
## [2,]  0.5  1.0
## simulated data:
y <- mvrnorm(n = samplesize, mu = mu, Sigma = Sigma, 
             empirical = TRUE)
head(y)
##         [,1]     [,2]
## [1,]  1.0736 -1.42696
## [2,]  0.2057  0.01514
## [3,] -0.4906  0.26519
## [4,]  0.9771  1.14904
## [5,] -0.3875  0.52204
## [6,]  0.8582  0.14618
n <- samplesize
y1 <- y[, 1]
y2 <- y[, 2]
y1
##  [1]  1.0736  0.2057 -0.4906  0.9771 -0.3875  0.8582
##  [7]  0.1865 -1.5633  1.0768 -0.7866  0.3463  2.1040
y2
##  [1] -1.42696  0.01514  0.26519  1.14904  0.52204
##  [6]  0.14618 -0.47579 -0.09902  0.79568 -1.22516
## [11]  0.45791  2.27576
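
A note on the code above: mvrnorm() comes from the MASS package, and the shortcut used to construct Sigma works only because stddev is 1 and rho is 0.5. A more general construction that makes the roles of the standard deviation and the correlation explicit would be the following sketch (Sigma_general is just an illustrative name):

## general 2 x 2 covariance matrix for two variables with a common
## standard deviation stddev and correlation rho:
Sigma_general <- matrix(c(stddev^2, rho * stddev^2,
                          rho * stddev^2, stddev^2),
                        nrow = 2, ncol = 2)
Sigma_general
## sanity check: cov2cor() recovers the correlation matrix
cov2cor(Sigma_general)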

To carry out the paired t-test, we need to know the variance of \(Y_1-Y_2\) because the t-statistic will be:

\[\begin{equation} T = \frac{(\mu_{Y_1} - \mu_{Y_2})-0}{\sqrt{Var(Y_1 - Y_2)/n}} \end{equation}\]

Now,

\[\begin{equation} Var(Y_1 - Y_2) = \sigma^2 + \sigma^2 - 2 \rho \sigma\sigma = 2\sigma^2 (1-\rho) \end{equation}\]
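
For the simulated data above, with \(\sigma = 1\) and \(\rho = 0.5\), this works out to \(Var(Y_1 - Y_2) = 1 + 1 - 2 \times 0.5 \times 1 \times 1 = 1\).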

Now let’s compute the t-statistic using the above formula. Let the actual data vectors be \(y_1, y_2\).

\[\begin{equation} t_{obs} = \frac{(mean(y_1) - mean(y_2))-0}{\sqrt{Var(y_1 - y_2)/n}} \end{equation}\]

This simplifies to:

\[\begin{equation} t_{obs} = \frac{(mean(y_1) - mean(y_2))-0}{\sqrt{2\sigma^2(1-\rho)/n}} \end{equation}\]

Now compare the paired t-test output and the by-hand calculation:

summary_ttest(t.test(y1, y2, paired = TRUE))
## [1] "t(11)=0.35 p=0.736"
## [1] "est.: 0.1 [-0.54,0.74] ms"
round(((mean(y1) - mean(y2))-0) / 
        sqrt((2 * stddev^2 * (1 - rho)) / n),2)
## [1] 0.35

The two t-values are identical. (This works out exactly here because empirical = TRUE in mvrnorm() forces the sample means, variances, and correlation to equal the true \(\mu\), \(\sigma\), and \(\rho\); with empirical = FALSE, the sample estimates would have to be used in the by-hand calculation.)
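
Another way to see the same result: the paired t-test is just a one-sample t-test on the by-subject differences, and the by-hand calculation can equally well be done with the sample variance of the differences instead of the true \(\sigma\) and \(\rho\). A sketch:

d <- y1 - y2
## a one-sample t-test on the differences gives the same
## t-value as the paired t-test above:
t.test(d)
## the same t-statistic computed by hand from the sample
## variance of the differences:
round(mean(d) / sqrt(var(d) / n), 2)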

The linear mixed model we present next fits exactly the same model as the paired t-test above; we now show why the two are equivalent.

Suppose we have \(n\) subjects, indexed \(i=1,\dots,n\), and two conditions, labeled 1 and 2. For now, assume that each subject sees each condition only once (e.g., the by-subjects aggregated English relative clause data), so we have two data points from each subject. In other words, the data are paired.

Then, for condition 1, the dependent variable can be described by the equation:

\(y_{i1} = \beta_0 + u_{0i} + \varepsilon_{i1}\)

Here, \(\beta_0\) is the mean reading time in condition 1, and \(\varepsilon_{i1}\) is the usual residual error term. The interesting new term is \(u_{0i}\). This is a numerical value that constitutes an adjustment to the intercept \(\beta_0\) for subject \(i\): if a subject is slower than \(\beta_0\), \(u_{0i}\) will be positive; if a subject is faster than \(\beta_0\), that subject’s adjustment \(u_{0i}\) will be negative; and if a subject has exactly the same reading time as the intercept \(\beta_0\), then \(u_{0i}\) for that subject will be exactly 0.

Similarly, for condition 2, the dependent variable is described by the equation:

\(y_{i2} = \beta_0 + \delta + u_{0i} + \varepsilon_{i2}\)

Here, \(\delta\) is the additional time taken to process condition 2 (thus, this is the treatment contrast coding we saw earlier in this chapter).
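
To make these two equations concrete, here is a minimal sketch that simulates paired data from this generative process; the parameter values below are arbitrary choices made only for illustration:

set.seed(1)
nsubj <- 12
beta0 <- 0.3   # mean reading time in condition 1
delta <- -0.1  # additional time for condition 2
sigma_u <- 0.7 # standard deviation of the by-subject adjustments
sigma_e <- 0.7 # residual standard deviation
## one intercept adjustment per subject, shared across conditions:
u0 <- rnorm(nsubj, mean = 0, sd = sigma_u)
y_cond1 <- beta0 + u0 + rnorm(nsubj, mean = 0, sd = sigma_e)
y_cond2 <- beta0 + delta + u0 + rnorm(nsubj, mean = 0, sd = sigma_e)
## the shared u0 induces a positive correlation between the two
## measurements from the same subject:
cor(y_cond1, y_cond2)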

If we subtract the equation for condition 1 from the equation for condition 2, the resulting equation is:

\(d_i=y_{i2} - y_{i1}= \delta + (\varepsilon_{i2}-\varepsilon_{i1})\)

The \(u_{0i}\) terms cancel out. The expectation or mean value of \(d_i\) is \(\delta\) because the expectation of the \(\varepsilon\) terms is 0 (we set up the model such that \(\varepsilon \sim Normal(0,\sigma)\)).

Now, because both measurements from subject \(i\) share the adjustment \(u_{0i}\), \(y_{i1}\) and \(y_{i2}\) are correlated with some correlation \(\rho\), and the result presented at the beginning of this section applies:

\[\begin{equation} Var(y_{i1}-y_{i2}) = \sigma^2 + \sigma^2-2\rho\sigma^2=2\sigma^2(1-\rho) \end{equation}\]

The generative distribution for \(d_i\), the pairwise differences in the two conditions, is

\[\begin{equation} d_i \sim Normal(\delta, \sqrt{2\sigma^2(1-\rho)}) \end{equation}\]

But that is exactly the same standard deviation as the one used in the paired t-test.

The above result proves that the paired t-test delivers exactly the same t-value as the linear mixed model with the additional term \(u_{0i}\). The model is called a linear mixed model because it mixes a fixed effect, the intercept \(\beta_0\), with a by-subject adjustment, the term \(u_{0i}\). This term is called a varying intercept by subject: the \(u_{0i}\) adjustment makes the intercept for a particular subject the sum of a fixed, unvarying term and a term that varies by subject, \(\beta_0 + u_{0i}\). Compare this to the fixed intercept \(\beta_0\) in the simple linear model that corresponds to the unpaired t-test. In the linear mixed model, the intercept term is different for each subject because of the \(u_{0i}\) term.
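
In lme4, the estimated by-subject adjustments \(u_{0i}\) and the resulting per-subject intercepts \(\beta_0 + u_{0i}\) can be inspected directly. For example, for the model m0raw_lmer fit earlier on the aggregated data:

## estimated adjustments u_0i, one per subject:
ranef(m0raw_lmer)$subject
## per-subject intercepts (beta_0 + u_0i), alongside the fixed slope:
coef(m0raw_lmer)$subject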

Let’s check that the linear mixed model delivers exactly the same t-value as our paired t-test above. We will use our simulated data. In the code below, the term (1 | subj) specifies the by-subject adjustment to the intercept, the term \(u_{0i}\) above.

dat <- data.frame(y = c(y1, y2), 
                  condition = rep(letters[1:2], 
                                  each = n), 
                  subj = rep(1:n, 2))
dat$condition <- factor(dat$condition)
## sum coding
dat$cond <- ifelse(dat$condition == "a", 1, -1)
## sanity check!
xtabs(~ cond + condition, dat)
##     condition
## cond  a  b
##   -1  0 12
##   1  12  0
summary(m2 <- lmer(y ~ cond + (1 | subj), dat))
## Linear mixed model fit by REML ['lmerMod']
## Formula: y ~ cond + (1 | subj)
##    Data: dat
## 
## REML criterion at convergence: 65.6
## 
## Scaled residuals: 
##    Min     1Q Median     3Q    Max 
## -1.899 -0.410  0.193  0.558  1.496 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  subj     (Intercept) 0.5      0.707   
##  Residual             0.5      0.707   
## Number of obs: 24, groups:  subj, 12
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)    0.250      0.250    1.00
## cond           0.050      0.144    0.35
## 
## Correlation of Fixed Effects:
##      (Intr)
## cond 0.000

The t-statistic from the linear mixed model is exactly the same as that from the paired t-test. Notice that because of the ±1 sum coding, the estimate for cond is half the condition difference (0.05 rather than 0.10), but this rescaling of the predictor leaves the t-value unchanged.
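
The same invariance holds under the default treatment coding: the estimate for the condition effect is then the full condition difference rather than half of it, but the magnitude of the t-value does not change. A sketch, reusing the dat data frame from above:

## refit using the factor condition (treatment contrasts, "a" as baseline):
m2_trt <- lmer(y ~ condition + (1 | subj), dat)
summary(m2_trt)$coefficients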

Now we have the necessary background for investigating the linear mixed model in detail.

References

Rice, John A. 1995. Mathematical Statistics and Data Analysis. Belmont, CA: Duxbury Press.