7.5 Exercises

Exercise 7.1 ANOVA coding for a four-condition design.

Load the following data. These data are from Experiment 1 in a set of reading studies on Persian (Safavi, Husain, and Vasishth 2016); we encountered these data in the preceding chapter’s exercises.

library(lingpsych)
data("df_persianE1")
dat1 <- df_persianE1
head(dat1)
##     subj item   rt distance   predability
## 60     4    6  568    short   predictable
## 94     4   17  517     long unpredictable
## 146    4   22  675    short   predictable
## 185    4    5  575     long unpredictable
## 215    4    3  581     long   predictable
## 285    4    7 1171     long   predictable

The four conditions are:

  • Distance=short and Predictability=unpredictable
  • Distance=short and Predictability=predictable
  • Distance=long and Predictability=unpredictable
  • Distance=long and Predictability=predictable

For the data given above, define an ANOVA-style contrast coding, and compute the main effects and the interaction. Use hypr to check which hypotheses are tested under this ANOVA coding, and write down the null hypotheses.
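The following is a minimal sketch of one possible setup, not the intended solution. It uses simulated data so that it runs without the lingpsych package; the data frame `toy`, the column names `dist` and `pred`, and the choice of +1/2, -1/2 scaling are all assumptions made for illustration. In place of hypr, the hypothesis matrix is obtained here by inverting the contrast matrix with base R's `solve()`.

```r
# Toy 2 x 2 data standing in for df_persianE1 (simulated, NOT the real data)
set.seed(1)
toy <- expand.grid(
  distance    = c("short", "long"),
  predability = c("unpredictable", "predictable"),
  rep         = 1:20
)
toy$rt <- rnorm(nrow(toy), mean = 600, sd = 50)

# ANOVA-style coding with +1/2 and -1/2 (the scaling is a choice)
toy$dist <- ifelse(toy$distance == "long", 0.5, -0.5)
toy$pred <- ifelse(toy$predability == "predictable", 0.5, -0.5)

m <- lm(rt ~ dist * pred, data = toy)

# Condition means and the differences the coefficients should equal
cm <- tapply(toy$rt, list(toy$distance, toy$predability), mean)
main_dist <- mean(cm["long", ]) - mean(cm["short", ])
main_pred <- mean(cm[, "predictable"]) - mean(cm[, "unpredictable"])
int_dd <- (cm["long", "predictable"] - cm["long", "unpredictable"]) -
  (cm["short", "predictable"] - cm["short", "unpredictable"])
round(cbind(coef = coef(m)[-1], from_means = c(main_dist, main_pred, int_dd)), 2)

# Contrast matrix for the four conditions (with intercept column) and its
# inverse: each row of H gives the weights on the four condition means
# that define the corresponding coefficient
conds <- unique(toy[, c("distance", "predability", "dist", "pred")])
Xc <- cbind(int = 1, dist = conds$dist, pred = conds$pred,
            dp = conds$dist * conds$pred)
rownames(Xc) <- paste(conds$distance, conds$predability)
H <- solve(Xc)
round(H, 2)
```

Each row of `H` can be read as a null hypothesis: the weighted sum of condition means in that row is zero.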

Exercise 7.2 ANOVA and nested comparisons in a \(2\times 2\times 2\) design.

Load the following data set. This is a \(2\times 2\times 2\) design from Jäger et al. (2020b), with the factors Grammaticality (grammatical vs. ungrammatical), Dependency (Agreement vs. Reflexives), and Interference (Interference vs. no interference). The experiment is a replication attempt of Experiment 1 reported in Dillon et al. (2013).

library(lingpsych)
data("df_dillonrep")

  • The grammatical conditions are a,b,e,f. The rest of the conditions are ungrammatical.
  • The agreement conditions are a,b,c,d. The other conditions are reflexives.
  • The interference conditions are a,d,e,h, and the others are the no-interference conditions.

The dependent measure of interest is TFT (total fixation time, in milliseconds).

Using a linear model, set up an ANOVA-style contrast coding for main effects and interactions, and obtain estimates of the main effects of Grammaticality, Dependency, and Interference, and of all interactions. You may find it easier to code each main effect as +1/-1, using ifelse in R to create a vector for each main effect. The interactions can then be specified simply as products of these vectors.
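The +1/-1 coding described above might be sketched as follows. The table `design` and its column `condition` are hypothetical stand-ins; whether df_dillonrep stores the labels a to h under that column name is an assumption that should be checked against the data.

```r
# Hypothetical design table: one row per condition label a-h
design <- data.frame(condition = letters[1:8])

# +1/-1 coding of the main effects, following the condition lists above
design$Gram <- ifelse(design$condition %in% c("a", "b", "e", "f"), 1, -1)
design$Dep  <- ifelse(design$condition %in% c("a", "b", "c", "d"), 1, -1)
design$Int  <- ifelse(design$condition %in% c("a", "d", "e", "h"), 1, -1)

# Interactions are elementwise products of the main-effect vectors
design$Gram_x_Dep       <- design$Gram * design$Dep
design$Gram_x_Int       <- design$Gram * design$Int
design$Dep_x_Int        <- design$Dep * design$Int
design$Gram_x_Dep_x_Int <- design$Gram * design$Dep * design$Int

# Sanity check: in a full 2 x 2 x 2 design all seven columns are
# mutually orthogonal, so the cross-product is 8 times the identity
X <- as.matrix(design[, -1])
crossprod(X)
```

Note that with +1/-1 coding each coefficient estimates half the difference between the two sets of conditions being compared; a +1/2, -1/2 coding would estimate the full difference.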

The researchers had a further research hypothesis: in ungrammatical sentences only, agreement would show an interference effect but reflexives would not. In grammatical sentences, both agreement and reflexives are expected to show interference effects. This kind of research question can be answered with nested contrast coding.

To carry out the relevant nested contrasts, define contrasts that estimate the effects of

  • grammaticality
  • dependency type
  • the interaction between grammaticality and dependency type
  • reflexives interference within grammatical conditions
  • agreement interference within grammatical conditions
  • reflexives interference within ungrammatical conditions
  • agreement interference within ungrammatical conditions

Do the estimates match expectations? Check this by computing the condition means and checking that the estimates from the models match the relevant differences between conditions or clusters of conditions.
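A sketch of how such nested contrasts and the check against condition means could look is given below. It runs on simulated data; the data frame `toy`, the helper `nested()`, and the +1/2, -1/2 scaling are assumptions introduced for illustration, and the simulated TFT values carry no information about the real results.

```r
# Simulated stand-in for df_dillonrep: 30 observations per condition a-h
set.seed(2)
toy <- data.frame(condition = rep(letters[1:8], each = 30))
toy$TFT <- rnorm(nrow(toy), mean = 500, sd = 100)
cnd <- toy$condition

# Grammaticality, dependency type, and their interaction
toy$Gram    <- ifelse(cnd %in% c("a", "b", "e", "f"), 0.5, -0.5)
toy$Dep     <- ifelse(cnd %in% c("a", "b", "c", "d"), 0.5, -0.5)
toy$GramDep <- toy$Gram * toy$Dep

# Hypothetical helper: +1/2 for the interference condition, -1/2 for the
# matching no-interference condition, 0 for all other conditions
nested <- function(cnd, pos, neg) {
  ifelse(cnd == pos, 0.5, ifelse(cnd == neg, -0.5, 0))
}
toy$Int_refl_gram   <- nested(cnd, "e", "f")
toy$Int_agr_gram    <- nested(cnd, "a", "b")
toy$Int_refl_ungram <- nested(cnd, "h", "g")
toy$Int_agr_ungram  <- nested(cnd, "d", "c")

m <- lm(TFT ~ Gram + Dep + GramDep + Int_refl_gram + Int_agr_gram +
          Int_refl_ungram + Int_agr_ungram, data = toy)

# Compare selected coefficients with differences of condition means
mu <- tapply(toy$TFT, toy$condition, mean)
round(c(
  Gram_coef           = unname(coef(m)["Gram"]),
  Gram_means          = mean(mu[c("a", "b", "e", "f")]) -
                        mean(mu[c("c", "d", "g", "h")]),
  Int_agr_ungram_coef = unname(coef(m)["Int_agr_ungram"]),
  d_minus_c           = unname(mu["d"] - mu["c"])
), 2)
```

With this scaling, each nested coefficient equals the difference between the interference and no-interference condition means within the relevant pair of conditions.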

Exercise 7.3 Another ANOVA comparison in a \(2\times 2\) design.

Hypothesis testing can be a very fragile business. One situation that sometimes occurs is that a few influential data points change the result from significant to non-significant. An example is a self-paced reading experiment (Experiment 1) by Smith, Franck, and Tabor (2021). This is a \(2\times 2\) design, with a factor called N2Factor with two levels representing the number marking on the second of two nouns in the sentence, and another factor called SemFactor representing the semantic similarity of the two nouns in the sentence. The data are reading times, in milliseconds, from the critical region in the sentence (a verb). The expectation is that when the two nouns are similar, reading times are longer at the critical region.

library(lingpsych)
data("df_smithE1")
head(df_smithE1)
##     Participant StimSet  RT N2Factor SemFactor
## 203         101      32 976     N2pl    SemSim
## 245         101      20 640     N2pl    SemSim
## 277         101       7 448     N2sg    SemSim
## 304         101      10 640     N2pl SemDissim
## 344         101      14 448     N2pl SemDissim
## 383         101      18 432     N2pl SemDissim

First, fit a linear mixed model using an ANOVA contrast coding, and determine whether there is a main effect of semantic similarity. Then refit the model after removing all data points with reading times larger than 3000 milliseconds, and reassess the main effect of semantic similarity. This should remove 9 of the 3441 data points. Does the significant effect of similarity disappear?
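The trimming and refitting steps could be organized roughly as below. This is a sketch on simulated data with three planted extreme values, so the numbers say nothing about the real experiment. The helper `fit_sem()`, the columns `N2` and `Sem`, and the varying-intercepts-only random-effects structure are simplifying assumptions; the model is fit only if the lme4 package is installed.

```r
# Simulated stand-in for df_smithE1 (NOT the real data)
set.seed(3)
toy <- expand.grid(Participant = 1:20, StimSet = 1:16)
idx <- (toy$Participant + toy$StimSet) %% 4
toy$N2Factor  <- c("N2sg", "N2pl")[idx %/% 2 + 1]
toy$SemFactor <- c("SemDissim", "SemSim")[idx %% 2 + 1]
subj <- rnorm(20, 0, 0.2)
item <- rnorm(16, 0, 0.1)
toy$RT <- pmin(exp(6 + subj[toy$Participant] + item[toy$StimSet] +
                     rnorm(nrow(toy), 0, 0.4)), 2500)
toy$RT[c(5, 50, 200)] <- c(3500, 4200, 6000)  # planted extreme values

# ANOVA-style +1/2, -1/2 coding of the two factors
toy$N2  <- ifelse(toy$N2Factor == "N2pl", 0.5, -0.5)
toy$Sem <- ifelse(toy$SemFactor == "SemSim", 0.5, -0.5)

# Keep only reading times of at most 3000 ms and count what was dropped
trimmed   <- subset(toy, RT <= 3000)
n_removed <- nrow(toy) - nrow(trimmed)
n_removed

# Hypothetical helper: fit the model, return the row for the Sem effect
fit_sem <- function(d) {
  m <- lme4::lmer(RT ~ N2 * Sem + (1 | Participant) + (1 | StimSet),
                  data = d)
  coef(summary(m))["Sem", ]
}

if (requireNamespace("lme4", quietly = TRUE)) {
  print(rbind(all_data = fit_sem(toy), trimmed = fit_sem(trimmed)))
}
```

Comparing the estimate, standard error, and t value for `Sem` across the two fits shows how much the conclusion depends on the excluded data points.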

References

Dillon, Brian W., Alan Mishler, Shayne Sloggett, and Colin Phillips. 2013. “Contrasting Intrusion Profiles for Agreement and Anaphora: Experimental and Modeling Evidence.” Journal of Memory and Language 69 (2): 85–103.
Jäger, Lena A., Daniela Mertzen, Julie A. Van Dyke, and Shravan Vasishth. 2020b. “Interference Patterns in Subject-Verb Agreement and Reflexives Revisited: A Large-Sample Study.” Journal of Memory and Language 111: 104063.
Safavi, Molood Sadat, Samar Husain, and Shravan Vasishth. 2016. “Dependency Resolution Difficulty Increases with Distance in Persian Separable Complex Predicates: Implications for Expectation and Memory-Based Accounts.” Frontiers in Psychology 7.
Smith, Garrett, Julie Franck, and Whitney Tabor. 2021. “Encoding Interference Effects Support Self-Organized Sentence Processing.” Cognitive Psychology 124: 101356.