3.8 Exercises

Download the data-set E1_data.csv. Then run the following commands to load the lme4 library and to set up your data for analysis:

library(lme4)

## load data:
dat<-read.csv("data/E1_data.csv",header=TRUE)
## convert RT to milliseconds:
dat$RT<-dat$RT*1000
## choose critical region:
word_n<-4
## subset critical data:
crit<-subset(dat,Position==word_n)

The data come from a repeated measures experiment comparing two conditions, labeled Type 1 and Type 2. The column Sub contains the subject id, and the column ID contains the item id. RT is the reading time, recorded in seconds in the raw file and converted to milliseconds above; NA marks missing data. You can ignore the other columns. This is a standard Latin square design. We will work with the data frame crit below.

3.8.1 By-subjects t-test

Using RT as the dependent variable, carry out the appropriate by-subjects t-test to evaluate the null hypothesis that there is no difference between the conditions labeled Type 1 and Type 2. Write down all the R commands needed to do the appropriate t-test, and the resulting t-value and p-value. State whether we can reject the null hypothesis given the results of the t-test; explain why.
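As a reminder of what "by-subjects" means in practice, here is a minimal sketch on simulated data. The data frame sim, its sizes, and all numeric values are made up for illustration; only the column names Sub, ID, Type, and RT mirror the ones described above. The sketch assumes that each subject contributes one mean per condition and that these means are then compared with a paired t-test.

```r
## Simulated stand-in for crit; every value here is hypothetical.
set.seed(1)
sim <- expand.grid(Sub = 1:20, ID = 1:16)
## Each subject sees half of the items in each condition
sim$Type <- ifelse((sim$Sub + sim$ID) %% 2 == 0, 1, 2)
sim$RT <- rnorm(nrow(sim), mean = 500 + 20 * (sim$Type == 2), sd = 100)
sim$RT[sample(nrow(sim), 5)] <- NA  # a few missing values, as in real data

## One mean per subject per condition; rows with NA in RT are dropped
## by aggregate's default na.action (na.omit)
bysubj <- aggregate(RT ~ Sub + Type, data = sim, FUN = mean)

## Sort so that the two vectors are aligned subject by subject
bysubj <- bysubj[order(bysubj$Type, bysubj$Sub), ]
t1 <- bysubj$RT[bysubj$Type == 1]
t2 <- bysubj$RT[bysubj$Type == 2]

## Paired t-test on the by-subject condition means
res <- t.test(t1, t2, paired = TRUE)
res$statistic
res$p.value
```

Aggregating first matters because the t-test assumes independent data points; the raw trial-level rows from the same subject are not independent.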

3.8.2 Fitting a linear mixed model

Now, using the data frame crit above, recode the column Type as a sum contrast and fit a linear mixed model (called M0).

Assume varying intercepts for subjects and varying intercepts for items (varying intercepts are sometimes called random intercepts). Write down the linear mixed model command, and write down the fixed-effects estimates (intercept and slope) along with their standard errors. State whether we can reject the null hypothesis given the t-value shown in the linear mixed model output; explain why.
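The general shape of such a model can be sketched on simulated data. Everything below is hypothetical: the data frame sim, the variable c_type, the model name m_sim, and all numeric values are invented for illustration, and the sign convention of the sum contrast (Type 1 coded -1, Type 2 coded +1) is one possible choice, not necessarily the one you should use.

```r
library(lme4)

## Simulated stand-in for crit; every value here is hypothetical.
set.seed(2)
sim <- expand.grid(Sub = 1:30, ID = 1:16)
sim$Type <- ifelse((sim$Sub + sim$ID) %% 2 == 0, 1, 2)

## Sum contrast: Type 1 coded -1, Type 2 coded +1 (an assumed convention)
sim$c_type <- ifelse(sim$Type == 1, -1, 1)

## Generate RTs with by-subject and by-item intercept adjustments
subj_int <- rnorm(30, mean = 0, sd = 50)
item_int <- rnorm(16, mean = 0, sd = 30)
sim$RT <- 500 + 10 * sim$c_type +
  subj_int[sim$Sub] + item_int[sim$ID] +
  rnorm(nrow(sim), mean = 0, sd = 100)

## Varying intercepts for subjects and for items
m_sim <- lmer(RT ~ c_type + (1 | Sub) + (1 | ID), data = sim)

## Fixed-effects table: estimate, standard error, t value
cf <- summary(m_sim)$coefficients
cf
```

Note that lmer uses the trial-level data directly; no aggregation by subject is needed because the varying intercepts account for the repeated measurements.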

3.8.3 t-test vs. linear mixed model

Why do the results of the t-test and the linear mixed model M0 differ?

3.8.4 Power calculation using power.t.test

The researcher wants to achieve 80% statistical power in a future study. Based on the available data above, she determines that the standard error (note: not the standard deviation!) of the difference in means between the conditions Type 1 and Type 2 is 21. She has reason to believe that the true difference in means is 30 ms. What is the number of participants (to the nearest whole number) needed to achieve approximately 80% power? Use the power.t.test function to compute your answer. Write down the power.t.test function specification you used, as well as the number of participants needed, based on the output of the power.t.test function.
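For orientation, the following sketch shows one way to call power.t.test when the available quantity is a standard error. All numbers (se_hyp, n_hyp, delta_hyp) are hypothetical and deliberately differ from the exercise. The sketch assumes that the comparison is within subjects, so that the relevant standard deviation is that of the by-subject difference scores, and it relies on the relationship SE = SD / sqrt(n).

```r
## Hypothetical inputs, not the values from the exercise
se_hyp    <- 15   # standard error of the mean difference
n_hyp     <- 25   # number of subjects the standard error was based on
delta_hyp <- 40   # assumed true difference in means (ms)

## power.t.test expects a standard deviation, so convert: SD = SE * sqrt(n)
sd_hyp <- se_hyp * sqrt(n_hyp)

## Leave n unspecified so that power.t.test solves for it.
## type = "one.sample" treats the difference scores as a single sample.
pw <- power.t.test(delta = delta_hyp, sd = sd_hyp, power = 0.80,
                   type = "one.sample", alternative = "two.sided")
pw$n

## Participants come in whole numbers, so round up
n_needed <- ceiling(pw$n)
n_needed
```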

3.8.5 Residuals

The plot below shows the distribution of the residuals from model M0 plotted against the standard normal distribution with mean 0 and standard deviation 1. Explain what the plot tells us about one of the model assumptions of the linear mixed model M0 that you fit earlier.

(You can ignore the numbers below the plot.)

3.8.6 Understanding contrast coding

Using only your estimates of the intercept and the slope in model M0’s fixed effects output, write down the mean of the condition labeled Type 1 in the data, and the mean of the condition labeled Type 2.
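The logic behind this exercise can be checked on simulated data with an ordinary linear model. The sketch below assumes a -1/+1 sum coding (first condition -1, second condition +1); the variable names and all values are hypothetical. Under that assumed coding, the intercept is the average of the two condition means and the slope is half their difference.

```r
## Hypothetical data for two conditions
set.seed(3)
y1 <- rnorm(50, mean = 480, sd = 60)  # condition coded -1
y2 <- rnorm(50, mean = 520, sd = 60)  # condition coded +1
d <- data.frame(RT = c(y1, y2), so = rep(c(-1, 1), each = 50))

## Ordinary linear model with the sum-coded predictor
b <- coef(lm(RT ~ so, data = d))

## Recover the condition means from intercept and slope alone
m1 <- unname(b["(Intercept)"] - b["so"])
m2 <- unname(b["(Intercept)"] + b["so"])

## These match the sample means of the two conditions
all.equal(m1, mean(y1))
all.equal(m2, mean(y2))
```

In a linear mixed model with missing data, the reconstructed values are model-based estimates and need not match the raw sample means exactly.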

3.8.7 Understanding the fixed-effects output

Suppose that model M0’s fixed-effects output were as follows. SO is a sum-coded contrast specification for the conditions in the column labeled Type.

results

##             Estimate Std. Error t value
## (Intercept)   686.01      47.54   14.43
## SO             18.94         NA    2.00

What is the value of the standard error of the slope (SO), which is labeled NA above?

3.8.8 Understanding the null hypothesis test

A researcher fits a linear mixed model to compare the reading times between two conditions (a) and (b), just like in the above study. Her hypothesis is that the mean for condition (a) is larger than the mean for condition (b). She observes that condition (a) has sample mean 500 ms, and condition (b) has sample mean 450 ms. She also establishes from the linear mixed model that the t-value is 1.94. The approximate p-value associated with this t-value is 0.052. Answer the following: (A) Do we have evidence against the null hypothesis? (B) Do we have evidence for the particular research hypothesis that the researcher has?

The researcher runs the same analysis as above on a new data-set that has the same design as above, and now gets a p-value of 0.001. Now she has stronger evidence than in the above case where the p-value was 0.052. What does she have stronger evidence for?