8.3 Decide on a range of plausible values of the effect size

Notice that the effect in milliseconds is relatively large, given the estimates from similar phenomena in reading studies in psycholinguistics (Jäger, Engelmann, and Vasishth 2017):

b0 <- summary(m)$coefficients[1, 1]
b1 <- summary(m)$coefficients[2, 1]
## effect estimate in log ms:
b1
## [1] 0.124
## effect estimate in ms:
exp(b0 + b1 * (0.5)) - exp(b0 + b1 * (-0.5))
## [1] 44.54

But the standard error of the estimate tells us that the effect, on the log ms scale, could be as small or as large as the following values:

b1_stderr <- summary(m)$coefficients[2, 2]
lower <- b1 - 2 * b1_stderr
upper <- b1 + 2 * b1_stderr
lower
## [1] 0.02539
upper
## [1] 0.2227

The range from 0.03 to 0.22 arises because we take the plausible effect sizes to lie within \(\hat\beta_1 \pm 2SE\) on the log ms scale.

On the ms scale, the range is:

exp(b0 + lower * (0.5)) - exp(b0 + lower * (-0.5))
## [1] 9.113
exp(b0 + upper * (0.5)) - exp(b0 + upper * (-0.5))
## [1] 80.09
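
The same back-transformation is applied three times above, so it can be wrapped in a small function. The following is only a sketch: the name effect_ms and the intercept value b0_example are hypothetical and are not part of the original analysis. The sketch assumes the ±0.5 contrast coding used in the expressions above.

```r
## Hypothetical helper: converts a slope on the log ms scale into a
## difference in ms, assuming the predictor is coded as +0.5 / -0.5.
effect_ms <- function(b0, b1) {
  exp(b0 + b1 * 0.5) - exp(b0 + b1 * (-0.5))
}

## Illustrative intercept only (an assumption); in practice, pass the
## intercept b0 extracted from summary(m).
b0_example <- 5.88

## The function is vectorized over b1, so the lower bound, the estimate,
## and the upper bound can be converted in one call:
effect_ms(b0_example, c(lower = 0.02539, estimate = 0.124, upper = 0.2227))
```

Because the difference depends on the intercept as well as the slope, the same log-scale effect translates into a larger ms effect when baseline reading times are longer.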

On the ms scale, that is a lot of uncertainty in the effect size! With some experience, you will come to recognize that such a wide confidence interval is a sign of low power. We will establish the prospective power properties of this study shortly.

We can take the above uncertainty of the \(\hat\beta_1\) estimator into account (on the log ms scale—remember that the model is based on log rt) by assuming that the effect has the following uncertainty on the log ms scale:

\[\begin{equation} \beta_1 \sim Normal(0.12,0.05) \end{equation}\]
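
The standard deviation of 0.05 is the rounded standard error: the interval from 0.02539 to 0.2227 spans four standard errors. As a minimal sketch, one could recover this value and draw plausible effect sizes from the distribution as follows. The seed, the number of draws, and the names se, nsim, and b1_draws are arbitrary choices made for illustration.

```r
## Recover the standard error from the interval bounds reported above:
lower <- 0.02539
upper <- 0.2227
se <- (upper - lower) / 4
round(se, 2)
## [1] 0.05

## Draw plausible effect sizes on the log ms scale.
set.seed(1)     # arbitrary seed, for reproducibility only
nsim <- 10000   # arbitrary number of draws
b1_draws <- rnorm(nsim, mean = 0.12, sd = 0.05)

## Central 95% of the draws; this should be close to 0.12 +/- 2 * 0.05:
quantile(b1_draws, probs = c(0.025, 0.975))
```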

Here, we are doing something that is, strictly speaking, Bayesian in thinking. We are describing our uncertainty about the true effect from the best estimate we have: existing data. To talk about the uncertainty, we are (ab)using the 95% confidence interval, treating it as if it were telling us the range of plausible values. Recall that, strictly speaking, in the frequentist paradigm one cannot talk about the probability distribution of the effect size: in frequentist theory, the true value of the parameter is a point value; it has no distribution. The range \(\hat\beta_1 \pm 2\times SE\) refers to the estimated mean of the sampling distribution of the sample means, and to the standard deviation of this sampling distribution. Thus, strictly speaking, this range does not reflect our uncertainty about the true parameter’s value. Having said this, we are going to use the effect estimates from our model fit as a starting point for our power analysis because this is the best information we have so far about the English relative clause design.

References

Jäger, Lena A., Felix Engelmann, and Shravan Vasishth. 2017. “Similarity-Based Interference in Sentence Comprehension: Literature Review and Bayesian Meta-Analysis.” Journal of Memory and Language 94: 316–39. https://doi.org/10.1016/j.jml.2017.01.004.