7.5 Exercises
Exercise 7.1 ANOVA coding for a four-condition design.
Load the following data. These data are from Experiment 1 in a set of reading studies on Persian (Safavi, Husain, and Vasishth 2016); we encountered these data in the preceding chapter’s exercises.
library(lingpsych)
data("df_persianE1")
dat1 <- df_persianE1
head(dat1)
## subj item rt distance predability
## 60 4 6 568 short predictable
## 94 4 17 517 long unpredictable
## 146 4 22 675 short predictable
## 185 4 5 575 long unpredictable
## 215 4 3 581 long predictable
## 285 4 7 1171 long predictable
The four conditions are:
- Distance=short and Predictability=unpredictable
- Distance=short and Predictability=predictable
- Distance=long and Predictability=unpredictable
- Distance=long and Predictability=predictable
For the data given above, define an ANOVA-style contrast coding, and compute the main effects and the interaction. Use hypr to check which hypotheses this ANOVA coding tests, and write down the null hypotheses.
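One way to see which hypotheses a given coding tests, sketched here in base R rather than with hypr, is to stack the intercept and the contrast columns into a square matrix and invert it. Each row of the inverse gives the weights that the corresponding coefficient places on the four condition means. The condition labels and the particular +1/-1 assignment below are illustrative assumptions, not the only valid ANOVA coding.

```r
# Rows are the four conditions; the labels are illustrative only.
cond_names <- c("short_unpred", "short_pred", "long_unpred", "long_pred")

# ANOVA-style +1/-1 coding (assumed sign convention: long = +1, predictable = +1).
Xc <- cbind(
  int  = 1,
  dist = c(-1, -1, +1, +1),
  pred = c(-1, +1, -1, +1)
)
# The interaction column is the product of the two main-effect columns.
Xc <- cbind(Xc, dist_x_pred = Xc[, "dist"] * Xc[, "pred"])
rownames(Xc) <- cond_names

# Inverting the full contrast matrix gives the hypothesis matrix:
# one row per coefficient, one weight per condition mean.
H <- solve(Xc)
print(H)
```

With this coding every weight in H is +1/4 or -1/4, so each slope estimates half of the corresponding difference between averaged condition means. The null hypothesis for a slope states that its weighted sum of condition means is zero.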
Exercise 7.2 ANOVA and nested comparisons in a \(2\times 2\times 2\) design.
Load the following data set. This is a \(2\times 2\times 2\) design from Lena A. Jäger et al. (2020b), with the factors Grammaticality (grammatical vs. ungrammatical), Dependency (Agreement vs. Reflexives), and Interference (Interference vs. no interference). The experiment is a replication attempt of Experiment 1 reported in B. W. Dillon et al. (2013).
library(lingpsych)
data("df_dillonrep")
- The grammatical conditions are a, b, e, f. The rest of the conditions are ungrammatical.
- The agreement conditions are a, b, c, d. The other conditions are reflexives.
- The interference conditions are a, d, e, h, and the others are the no-interference conditions.
The dependent measure of interest is TFT (total fixation time, in milliseconds).
Using a linear model, set up an ANOVA-style contrast coding for main effects and interactions, and obtain estimates of the main effects of Grammaticality, Dependency, and Interference, and of all interactions. You may find it easier to code each main effect as +1, -1, using ifelse in R to build one vector per main effect. The interactions can then be specified easily, as products of these vectors.
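A minimal sketch of this ifelse-based coding, run on simulated data rather than on df_dillonrep. The column names `cond` and `TFT`, the derived vector names, and the choice of which level receives +1 are all assumptions made for illustration.

```r
set.seed(1)
# Simulated stand-in: 8 conditions (a-h), 20 observations each.
# Column names `cond` and `TFT` are assumed, not taken from df_dillonrep.
toy <- data.frame(cond = rep(letters[1:8], each = 20))
toy$TFT <- rnorm(nrow(toy), mean = 600, sd = 100)

# One +1/-1 vector per main effect, following the condition lists in the exercise.
toy$gram   <- ifelse(toy$cond %in% c("a", "b", "e", "f"), +1, -1)  # grammatical = +1
toy$dep    <- ifelse(toy$cond %in% c("a", "b", "c", "d"), +1, -1)  # agreement = +1
toy$interf <- ifelse(toy$cond %in% c("a", "d", "e", "h"), +1, -1)  # interference = +1

# With numeric predictors, `*` expands to all main effects plus
# all two-way and three-way products, i.e. the interactions.
m <- lm(TFT ~ gram * dep * interf, data = toy)
print(round(coef(m), 1))
```

Because the predictors are coded +1/-1, each main-effect coefficient is half the difference between the two averaged sets of condition means.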
The researchers had a further research hypothesis: in ungrammatical sentences only, agreement would show an interference effect but reflexives would not. In grammatical sentences, both agreement and reflexives are expected to show interference effects. This kind of research question can be answered with nested contrast coding.
To carry out the relevant nested contrasts, define contrasts that estimate the effects of
- grammaticality
- dependency type
- the interaction between grammaticality and dependency type
- reflexives interference within grammatical conditions
- agreement interference within grammatical conditions
- reflexives interference within ungrammatical conditions
- agreement interference within ungrammatical conditions
Do the estimates match expectations? Check this by computing the condition means and checking that the estimates from the models match the relevant differences between conditions or clusters of conditions.
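The nested coding and the check against condition means can be sketched as follows, again on simulated data. The column names `cond` and `TFT`, the helper `nested_pm`, and the contrast names are hypothetical. Each nested contrast is +1 for the interference condition and -1 for the no-interference condition within one Grammaticality-by-Dependency cell, and 0 elsewhere.

```r
set.seed(2)
# Simulated stand-in for the data; `cond` and `TFT` are assumed column names.
toy <- data.frame(cond = rep(letters[1:8], each = 20))
toy$TFT <- rnorm(nrow(toy), mean = 600, sd = 100)

toy$gram       <- ifelse(toy$cond %in% c("a", "b", "e", "f"), +1, -1)
toy$dep        <- ifelse(toy$cond %in% c("a", "b", "c", "d"), +1, -1)
toy$gram_x_dep <- toy$gram * toy$dep

# Hypothetical helper: +1 for `plus`, -1 for `minus`, 0 for every other condition.
nested_pm <- function(cond, plus, minus) {
  ifelse(cond == plus, +1, ifelse(cond == minus, -1, 0))
}
toy$int_agr_gram    <- nested_pm(toy$cond, "a", "b")
toy$int_agr_ungram  <- nested_pm(toy$cond, "d", "c")
toy$int_refl_gram   <- nested_pm(toy$cond, "e", "f")
toy$int_refl_ungram <- nested_pm(toy$cond, "h", "g")

m_nested <- lm(TFT ~ gram + dep + gram_x_dep +
                 int_refl_gram + int_agr_gram +
                 int_refl_ungram + int_agr_ungram, data = toy)

# Compare a nested estimate with the matching difference of condition means.
mu <- tapply(toy$TFT, toy$cond, mean)
print(coef(m_nested)["int_agr_gram"])
print((mu[["a"]] - mu[["b"]]) / 2)
```

Under this +1/-1/0 coding, each nested coefficient equals half the difference between the two condition means it compares, so the two printed values should agree.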
Exercise 7.3 Another ANOVA comparison in a \(2\times 2\) design.
Hypothesis testing can be a very fragile business. One situation that sometimes occurs is that a few influential data points can change the result from significant to non-significant. An example is a self-paced reading experiment (Experiment 1) by Smith, Franck, and Tabor (2021). This is a \(2\times 2\) design, with a factor called N2Factor with two levels representing the number marking on the second of two nouns in the sentence, and another factor called SemFactor representing the semantic similarity of the two nouns in the sentence. The dependent measure is the reading time, in milliseconds, at the critical region of the sentence (a verb). The expectation is that when the two nouns are similar, reading times at the critical region are longer.
library(lingpsych)
data("df_smithE1")
head(df_smithE1)
## Participant StimSet RT N2Factor SemFactor
## 203 101 32 976 N2pl SemSim
## 245 101 20 640 N2pl SemSim
## 277 101 7 448 N2sg SemSim
## 304 101 10 640 N2pl SemDissim
## 344 101 14 448 N2pl SemDissim
## 383 101 18 432 N2pl SemDissim
First, fit a linear mixed model using an ANOVA contrast coding, and determine whether there is a main effect of semantic similarity. Then refit the model after removing all data points with reading times above 3000 milliseconds, and reassess the main effect of semantic similarity. This should remove 9 of the 3441 data points. Does the significant effect of similarity disappear?
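The fit, trim, and refit workflow can be sketched on simulated data. To keep the example self-contained it uses lm from base R instead of a linear mixed model; for the exercise itself, a mixed model (for example lme4::lmer with random effects for Participant and StimSet) would take the place of lm. All values below are simulated, and the planted extreme values are an assumption chosen to show the mechanism.

```r
set.seed(3)
n <- 400
sim <- data.frame(
  N2Factor  = rep(c("N2sg", "N2pl"), times = n / 2),
  SemFactor = rep(c("SemSim", "SemDissim"), each = n / 2)
)
# Simulated reading times in ms, capped at 2500 so that the only
# extreme values are the ones planted on the next line.
sim$RT <- pmin(rlnorm(n, meanlog = 6.3, sdlog = 0.4), 2500)
sim$RT[1:5] <- 5000  # five extreme values, all in the SemSim condition

# ANOVA-style +1/-1 coding for both factors.
sim$n2  <- ifelse(sim$N2Factor == "N2pl", +1, -1)
sim$sem <- ifelse(sim$SemFactor == "SemSim", +1, -1)

m_all <- lm(RT ~ n2 * sem, data = sim)

# Drop observations above 3000 ms and refit the same model.
trimmed <- subset(sim, RT <= 3000)
m_trim  <- lm(RT ~ n2 * sem, data = trimmed)

print(nrow(sim) - nrow(trimmed))        # number of removed data points
print(coef(summary(m_all))["sem", ])    # similarity effect, all data
print(coef(summary(m_trim))["sem", ])   # similarity effect, trimmed data
```

Comparing the two `sem` rows shows how much the estimate and its test statistic depend on a handful of extreme observations.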