4.6 Exercises
Exercise 4.1 Chinese relative clauses
Load the following two data-sets:
data("df_gibsonwu")
gibsonwu <- df_gibsonwu
data("df_gibsonwu2")
gibsonwu2 <- df_gibsonwu2
The data are taken from two experiments that investigate (inter alia) the effect of relative clause type on reading time in Chinese. The data are from Gibson and Wu (2013) and Vasishth et al. (2013), respectively. The second data-set is a direct replication attempt of the first.
Chinese relative clauses are interesting theoretically because they are prenominal: the relative clause appears before the head noun.
As discussed in Gibson and Wu (2013), the consequence of Chinese relative clauses being prenominal is that the distance between the gap in the relative clause and the head noun is larger in subject relatives than in object relatives. Hsiao and Gibson (2003) were the first to suggest that the larger distance in subject relatives leads to longer reading times at the head noun. Under this view, the prediction is that subject relatives are harder to process than object relatives. If this is true, it is interesting because in most other languages that have been studied, subject relatives are easier to process than object relatives; Chinese would then be a very unusual exception cross-linguistically.
The data provided are for the critical region (the head noun). The experimental method is self-paced reading, so we have reading times in milliseconds.
The research question is whether the difference in reading times between object and subject relative clauses is negative. For both data-sets, investigate this question by (a) carrying out a paired t-test (by subjects and by items), and (b) fitting the most complex linear mixed model you can to the data and then interpreting the result using the t-value as well as the likelihood ratio test. What can we conclude about the research question?
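As a starting point for part (a), the sketch below shows one way to set up by-subjects and by-items paired t-tests. It uses simulated data rather than the data-sets loaded above: the column names subj, item, cond, and rt, and the condition labels "OR" and "SR", are assumptions made for illustration and may not match the real data frames. The essential step is to aggregate to one mean per subject (or per item) per condition before calling t.test.

```r
# Simulated stand-in data; subj, item, cond, rt and the labels "OR"/"SR"
# are hypothetical and may differ from the columns in the real data frames.
set.seed(1)
n_subj <- 20
n_item <- 16
sim <- expand.grid(subj = factor(1:n_subj), item = factor(1:n_item))
s <- as.integer(sim$subj)
i <- as.integer(sim$item)
# Each subject sees half the items in each condition, and vice versa.
sim$cond <- ifelse((s + i) %% 2 == 0, "OR", "SR")
subj_eff <- rnorm(n_subj, 0, 0.2)
sim$rt <- exp(6 + subj_eff[s] - 0.05 * (sim$cond == "OR") +
                rnorm(nrow(sim), 0, 0.4))

# By-subjects analysis: one mean reading time per subject per condition.
by_subj <- aggregate(rt ~ subj + cond, data = sim, FUN = mean)
by_subj <- by_subj[order(by_subj$cond, by_subj$subj), ]
or_subj <- by_subj$rt[by_subj$cond == "OR"]
sr_subj <- by_subj$rt[by_subj$cond == "SR"]
t_subj <- t.test(or_subj, sr_subj, paired = TRUE)

# By-items analysis: one mean reading time per item per condition.
by_item <- aggregate(rt ~ item + cond, data = sim, FUN = mean)
by_item <- by_item[order(by_item$cond, by_item$item), ]
or_item <- by_item$rt[by_item$cond == "OR"]
sr_item <- by_item$rt[by_item$cond == "SR"]
t_item <- t.test(or_item, sr_item, paired = TRUE)

print(t_subj)
print(t_item)
```

Sorting by condition and then by subject (or item) keeps the two vectors in the same order, so that each pair of values belongs to the same subject (or item). The estimated difference is OR minus SR, so a negative value is in the direction stated in the research question.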
Exercise 4.2 Agreement attraction in comprehension
Load the following data:
data("df_dillonE1")
dillonE1 <- df_dillonE1
head(dillonE1)
##         subj       item   rt int     expt
## 49 dillonE11 dillonE119 2918 low dillonE1
## 56 dillonE11 dillonE119 1338 low dillonE1
## 63 dillonE11 dillonE119  424 low dillonE1
## 70 dillonE11 dillonE119  186 low dillonE1
## 77 dillonE11 dillonE119  195 low dillonE1
## 84 dillonE11 dillonE119 1218 low dillonE1
The data are taken from an experiment that investigates (inter alia) the effect of number similarity between a noun and the auxiliary verb in sentences like the following. There are two levels to a factor called Int(erference): low and high.
- low: The key to the cabinet are on the table
- high: The key to the cabinets are on the table
Here, the auxiliary verb are is predicted to be read faster in the high condition than in the low condition, because the plural marking on the noun cabinets leads the reader to think that the sentence is grammatical. (Note that both sentences are ungrammatical.) This phenomenon, where the high condition is read faster than the low condition, is called agreement attraction.
The data provided are for the critical region (the auxiliary verb are). The experimental method is eyetracking; we have total reading times in milliseconds.
The research question is whether the difference in reading times between high and low conditions is negative.
- First, figure out which linear mixed model is appropriate for these data (varying intercepts only? varying intercepts and slopes? with or without correlations?).
- Then, carry out a statistical test using (a) the paired t-test (using the t.test function), (b) the t-test of the linear mixed model, and (c) the likelihood ratio test. What is your conclusion? Is there evidence for agreement attraction in the data?
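The candidate random-effects structures named in the first step, and the tests in the second, can be sketched as follows. This is a minimal illustration on simulated data, assuming the lme4 package is installed; the names subj, item, x, and log_rt, the ±0.5 coding of the predictor, and the use of log reading times are all assumptions for the example, not properties of dillonE1. Because the simulated data contain little or no slope variance, lme4 may report a singular fit for the more complex models.

```r
library(lme4)

# Simulated stand-in data; subj, item, x, and log_rt are hypothetical names.
set.seed(42)
n_subj <- 30
n_item <- 24
sim <- expand.grid(subj = factor(1:n_subj), item = factor(1:n_item))
s <- as.integer(sim$subj)
i <- as.integer(sim$item)
# Numeric contrast (an assumption): +0.5 = high, -0.5 = low interference.
sim$x <- ifelse((s + i) %% 2 == 0, 0.5, -0.5)
u0 <- rnorm(n_subj, 0, 0.25)  # by-subject intercept adjustments
u1 <- rnorm(n_subj, 0, 0.10)  # by-subject slope adjustments
w0 <- rnorm(n_item, 0, 0.15)  # by-item intercept adjustments
sim$log_rt <- 6 + u0[s] + w0[i] + (-0.05 + u1[s]) * sim$x +
  rnorm(nrow(sim), 0, 0.5)

# Three candidate random-effects structures, fit by maximum likelihood.
# Varying intercepts only:
m_int <- lmer(log_rt ~ x + (1 | subj) + (1 | item),
              data = sim, REML = FALSE)
# Varying intercepts and slopes, no correlations (|| needs a numeric x):
m_nocorr <- lmer(log_rt ~ x + (1 + x || subj) + (1 + x || item),
                 data = sim, REML = FALSE)
# Varying intercepts and slopes, with correlations:
m_full <- lmer(log_rt ~ x + (1 + x | subj) + (1 + x | item),
               data = sim, REML = FALSE)

# t-value for the effect of x under one of the candidate models.
t_x <- coef(summary(m_nocorr))["x", "t value"]
print(t_x)

# Likelihood ratio test: identical random effects, with and without
# the fixed effect of x.
m_null <- lmer(log_rt ~ 1 + (1 + x || subj) + (1 + x || item),
               data = sim, REML = FALSE)
lrt <- anova(m_null, m_nocorr)
print(lrt)
```

The null and alternative models in the likelihood ratio test differ only in the fixed effect being tested; the random-effects structure is held constant so that the comparison isolates the effect of x.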
Exercise 4.3 The grammaticality illusion
Load the following two data-sets:
data("df_english")
english <- df_english
data("df_dutch")
dutch <- df_dutch
In an offline accuracy rating study on English double center-embedding constructions, Gibson and Thomas (1999) found that ungrammatical constructions in which a middle verb phrase (e.g., was cleaning every week) was missing (e.g., the second example below, marked with an asterisk) were no less acceptable than their grammatical counterparts (e.g., the first example below).
The apartment that the maid who the service had sent over was cleaning every week was well decorated.
*The apartment that the maid who the service had sent over — was well decorated
Based on these results from English, Gibson and Thomas (1999) proposed that working-memory overload leads the comprehender to forget the prediction of the upcoming verb phrase (VP), which reduces working-memory load. This came to be known as the VP-forgetting hypothesis. The prediction is that at the word immediately following the final verb, the grammatical condition (which is coded as +1 in the data-frames) should be harder to read than the ungrammatical condition (which is coded as -1).
The data provided above test this hypothesis using self-paced reading for English (Vasishth et al. 2011), and for Dutch (Frank, Trompenaars, and Vasishth 2015). The data provided are for the critical region (the noun phrase, labeled NP1, following the final verb). We have reading times in log milliseconds.
Is there support for the VP-forgetting hypothesis cross-linguistically, from English and Dutch?
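When interpreting the estimates, it may help to recall what the +1/-1 coding and the log scale imply. The sketch below uses simulated data and an ordinary linear model (lm) purely to illustrate the coding; it ignores the subject and item grouping, so it is not a substitute for the linear mixed model the exercise calls for. The variable names cond and log_rt are hypothetical.

```r
# Simulated stand-in data; cond and log_rt are hypothetical names.
set.seed(7)
n <- 200
cond <- rep(c(1, -1), each = n / 2)  # +1 = grammatical, -1 = ungrammatical
log_rt <- 6 + 0.03 * cond + rnorm(n, 0, 0.4)

fit <- lm(log_rt ~ cond)
alpha <- unname(coef(fit)["(Intercept)"])
beta <- unname(coef(fit)["cond"])

# With +1/-1 coding, the slope is half the difference between the two
# condition means on the log scale.
half_diff <- (mean(log_rt[cond == 1]) - mean(log_rt[cond == -1])) / 2
print(c(beta = beta, half_diff = half_diff))

# Back-transforming to milliseconds: the difference between conditions
# is exp(alpha + beta) - exp(alpha - beta).
diff_ms <- exp(alpha + beta) - exp(alpha - beta)
print(diff_ms)
```

Under this coding, a positive slope means that the grammatical condition is read more slowly than the ungrammatical one, which is the direction the VP-forgetting hypothesis predicts.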