6.7 Exercises

Exercise 6.1 Contrast coding for a four-condition design.

Load the following data. These data are from Experiment 1 in a set of reading studies on Persian (Safavi, Husain, and Vasishth 2016). This is a self-paced reading study on particle-verb constructions, with a \(2\times 2\) design: distance (short, long) and predictability (predictable, unpredictable). The data are from a critical region in the sentence. All the data from the Safavi, Husain, and Vasishth (2016) paper are available from https://github.com/vasishth/SafaviEtAl2016.

library(lingpsych)
data("df_persianE1")
dat1 <- df_persianE1
head(dat1)
##     subj item   rt distance   predability
## 60     4    6  568    short   predictable
## 94     4   17  517     long unpredictable
## 146    4   22  675    short   predictable
## 185    4    5  575     long unpredictable
## 215    4    3  581     long   predictable
## 285    4    7 1171     long   predictable

The four conditions are:

  • Distance=short and Predictability=unpredictable
  • Distance=short and Predictability=predictable
  • Distance=long and Predictability=unpredictable
  • Distance=long and Predictability=predictable

The researcher wants to do the following sets of hypothesis tests:

Compare the condition labeled Distance=short and Predictability=unpredictable with each of the following conditions:

  • Distance=short and Predictability=predictable
  • Distance=long and Predictability=unpredictable
  • Distance=long and Predictability=predictable

Questions:

  • Which contrast coding is needed for such a comparison?
  • First, define the relevant contrast coding. Hint: You can do this by creating a condition column with the levels a, b, c, d and then using a built-in contrast coding function.
  • Then, use the hypr library function to confirm that your contrast coding actually does the hypothesis tests you need.
  • Fit a simple linear model with the above contrast coding and display the slopes, which constitute the relevant comparisons.
  • Now, compute each of the four conditions’ means and check that the slopes from the linear model correspond to the relevant differences between means that you obtained from the data.
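The mechanics of these steps can be sketched on simulated data. This is a minimal illustration, not a solution for `df_persianE1`: the data frame `sim`, its values, and the choice of `contr.treatment()` as the built-in function are assumptions made for the example, and the hypr check asked for above is left out so that the sketch needs only base R.

```r
set.seed(1)

# Simulated stand-in for the real data (hypothetical values, not df_persianE1)
sim <- expand.grid(
  rep = 1:20,
  distance = c("short", "long"),
  predability = c("predictable", "unpredictable"),
  stringsAsFactors = FALSE
)
sim$rt <- round(rnorm(nrow(sim), mean = 600, sd = 100))

# Combine the two factors into one four-level condition column;
# level a (short, unpredictable) comes first and so acts as the baseline
key <- paste(sim$distance, sim$predability)
sim$cond <- factor(
  key,
  levels = c("short unpredictable", "short predictable",
             "long unpredictable", "long predictable"),
  labels = c("a", "b", "c", "d")
)

# Assign a built-in contrast coding and fit a simple linear model
contrasts(sim$cond) <- contr.treatment(4)
m_sim <- lm(rt ~ cond, data = sim)
slopes <- unname(coef(m_sim)[-1])

# Condition means and their differences from the baseline condition a
cond_means <- sapply(split(sim$rt, sim$cond), mean)
mean_diffs <- unname(cond_means[c("b", "c", "d")] - cond_means[["a"]])

# The slopes should agree with the differences between means
all.equal(slopes, mean_diffs)
```

With this coding the intercept estimates the mean of condition a, and each slope estimates the difference between one of the other conditions and a.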

Exercise 6.2 Helmert coding for a six-condition design.

Load the following data:

library(bcogsci)
## 
## Attaching package: 'bcogsci'
## The following objects are masked _by_ '.GlobalEnv':
## 
##     df_contrasts1, df_contrasts2, df_contrasts3
data("df_polarity")
head(df_polarity)
##   subject item condition times value
## 1       1    6         f   SFD 327.8
## 2       1   24         f   SFD 205.9
## 3       1   35         e   SFD 315.2
## 4       1   17         e   SFD 264.8
## 5       1   34         d   SFD 252.2
## 6       1    7         a   SFD 155.5

The data come from an eyetracking study in German reported in Vasishth et al. (2008). The experiment is a reading study involving six conditions. The example sentences below are given in English, but the original design involved German sentences. In German, the word durchaus (certainly) is a positive polarity item: in the constructions used in this experiment, durchaus cannot have a c-commanding element that is a negative polarity item licensor. Here are the conditions:

  • Negative polarity items
      1. Grammatical: No man who had a beard was ever thrifty.
      2. Ungrammatical (Intrusive NPI licensor): A man who had no beard was ever thrifty.
      3. Ungrammatical: A man who had a beard was ever thrifty.
  • Positive polarity items
      1. Ungrammatical: No man who had a beard was certainly thrifty.
      2. Grammatical (Intrusive NPI licensor): A man who had no beard was certainly thrifty.
      3. Grammatical: A man who had a beard was certainly thrifty.

We will focus only on re-reading time in this data set. Subset the data so that the data frame contains only re-reading times:

dat2 <- subset(df_polarity, times == "RRT")
head(dat2)
##      subject item condition times  value
## 6365       1   20         b   RRT  239.6
## 6366       1    3         c   RRT 1866.2
## 6367       1   13         a   RRT  529.6
## 6368       1   19         a   RRT  269.0
## 6369       1   27         c   RRT  844.8
## 6370       1   26         b   RRT  634.7

The comparisons we are interested in are:

  • What is the difference in reading time between negative polarity items and positive polarity items?
  • Within negative polarity items, what is the difference between grammatical and ungrammatical conditions?
  • Within negative polarity items, what is the difference between the two ungrammatical conditions?
  • Within positive polarity items, what is the difference between grammatical and ungrammatical conditions?
  • Within positive polarity items, what is the difference between the two grammatical conditions?

Use the hypr package to specify the comparisons listed above, and then extract the contrast matrix. Finally, assign the contrasts to the condition column in the data frame. Fit a linear model using this contrast specification, and then check that the estimates from the model match the mean differences between the conditions being compared.
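The step from comparisons to a contrast matrix can be illustrated on a toy three-condition design using base R only. The sketch below assumes simulated data (`toy`, with hypothetical conditions x, y, z) and two nested comparisons; it writes the comparisons as rows of a hypothesis matrix and inverts that matrix to obtain the contrast matrix. It is meant to show the logic that hypr automates, not to replace hypr or to solve the six-condition exercise.

```r
set.seed(2)

# Toy data with three hypothetical conditions (not df_polarity)
toy <- data.frame(
  cond = factor(rep(c("x", "y", "z"), each = 10)),
  value = rnorm(30, mean = 500, sd = 50)
)

# Hypothesis matrix: one row per estimated quantity, one column per condition
H <- rbind(
  int = c(x = 1/3, y = 1/3, z = 1/3),  # grand mean of the condition means
  h1  = c(1, -1/2, -1/2),              # x versus the average of y and z
  h2  = c(0, 1, -1)                    # y versus z
)

# Inverting the hypothesis matrix gives the design-matrix columns;
# the first column is the intercept, so it is dropped
C <- solve(H)[, -1]
dimnames(C) <- list(levels(toy$cond), c("h1", "h2"))

# Assign the contrasts to the factor and fit the model
contrasts(toy$cond) <- C
m_toy <- lm(value ~ cond, data = toy)
est <- unname(coef(m_toy))

# The estimates should match the corresponding combinations of condition means
mu <- sapply(split(toy$value, toy$cond), mean)
expected <- c(
  mean(mu),
  mu[["x"]] - (mu[["y"]] + mu[["z"]]) / 2,
  mu[["y"]] - mu[["z"]]
)
all.equal(est, expected)
```

The same check, comparing model estimates with differences between condition means, applies to the six-condition design once the contrast matrix has been extracted from hypr.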

References

Safavi, Molood Sadat, Samar Husain, and Shravan Vasishth. 2016. “Dependency Resolution Difficulty Increases with Distance in Persian Separable Complex Predicates: Implications for Expectation and Memory-Based Accounts.” Frontiers in Psychology 7.
Vasishth, Shravan, Sven Bruessow, Richard L. Lewis, and Heiner Drenhaus. 2008. “Processing Polarity: How the Ungrammatical Intrudes on the Grammatical.” Cognitive Science 32 (4): 685–712.