Welcome to the Sixth Summer School on Statistical Methods for Linguistics and Psychology, 12-16 September 2022
Application, dates, location
- Dates: 12-16 September 2022.
- Times: 9AM-5PM daily.
- Location: The summer school will be held at the Griebnitzsee campus in Potsdam.
- Application period: 17 Sept 2021 to 1 April 2022. Applications are closed. Decisions will be announced around 15 April 2022.
Brief history of the summer school, and motivation
The summer school was started by Shravan Vasishth in 2017, as part of a methods project funded within the SFB 1287. The summer school aims to fill a gap in statistics education, specifically within the fields of linguistics and psychology. One goal of the summer school is to provide comprehensive training in the theory and application of statistics, with a special focus on the linear mixed model. Another major goal is to make Bayesian data analysis a standard part of the toolkit for the linguistics and psychology researcher. Over time, the summer school has evolved to have at least four parallel streams: beginning and advanced courses in frequentist and Bayesian statistics. These may be expanded to more parallel sessions in future editions. We typically admit a total of 120 participants (in 2019, we had some 450 applications). In addition to the all-day courses, we regularly invite speakers to give lectures on important current issues relating to statistics.
Previous editions of the summer school:
Code of conduct
All participants will be expected to follow the code of conduct, taken from StanCon 2018. Participants who have any concerns should contact any of the following instructors: Audrey Bürki, Anna Laurinavichyute, Shravan Vasishth, Bruno Nicenboim, or Reinhold Kliegl.
Invited keynote speakers
- Prof. Dr. Lena Jäger, Zürich, Switzerland. (Tuesday, 13 Sept 2022, 5-6PM). Title and abstract coming soon.
- Prof. Dr. Riccardo Fusaroli. (Thursday, 15 Sept 2022, 5-6PM).
Title: Standing on the shoulders of normal-sized people. Promise and challenges of cumulative statistical approaches
We often hear that Newton stood on the shoulders of giants and that science is a cumulative enterprise where new research builds on previous results. This conception of science relates to a commonly cited benefit of Bayesian approaches: their ability to integrate diverse sources of information, e.g. results of previous studies as informed priors. However, this practice is rarely seen in the literature. One possible explanation could be that we remain skeptical of scientific findings in our own field; that is, we know that we stand on the shoulders of normal-sized, fallible people (just as we ourselves are), rather than on the shoulders of giants. This raises the question of how we best integrate fallible findings from previous analyses in our studies.
In this talk, I will tackle this issue using a combination of simplified simulations and the concerns that have arisen in concrete studies using informed priors.
First I will cover simulation-based studies of posterior passing: what happens when we use previous posterior estimates as priors, in a sterilized in silico environment? I will then let the complexity of real research slowly creep in: from linear chains of one study following the other, to interrupted chains due to publication bias, to meandering forking paths where studies know and include only some of the literature. These simulations show that posterior passing is slowed down by complexity, but still provides the best solution for this cumulative enterprise.
With the simulations at hand, I will turn to real application scenarios, where previous literature and expert opinions are used to build informed priors. Novel concerns arise: hierarchical structures of expectations, heterogeneity of studies, undue levels of confidence, etc.
Based on these results, I will advocate for a critical use of informed priors, involving comparisons between informed priors and alternative (e.g. skeptical) priors, and explicit testing of inferential robustness.
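As a rough illustration of the posterior passing idea described in the abstract (this sketch is not taken from the talk), the following Python code assumes a normal model with a known observation standard deviation, so that each study's posterior is available in closed form and can be handed on as the next study's prior. The function names, the vague starting prior, and all numerical settings are hypothetical choices made only for this example; it covers just the simplest case of a linear chain of studies, without publication bias or partial knowledge of the literature.

```python
import random


def update_normal(prior_mean, prior_sd, data, sigma):
    """Conjugate update for a normal mean, assuming a known observation sd `sigma`."""
    prior_prec = 1.0 / prior_sd ** 2      # precision of the prior
    data_prec = len(data) / sigma ** 2    # precision contributed by the data
    post_prec = prior_prec + data_prec
    sample_mean = sum(data) / len(data)
    post_mean = (prior_prec * prior_mean + data_prec * sample_mean) / post_prec
    return post_mean, post_prec ** -0.5


def posterior_passing(true_effect, n_studies, n_per_study, sigma, seed=1):
    """Simulate a linear chain of studies; each posterior becomes the next prior."""
    rng = random.Random(seed)
    mean, sd = 0.0, 10.0                  # vague starting prior (an assumption)
    history = []
    for _ in range(n_studies):
        data = [rng.gauss(true_effect, sigma) for _ in range(n_per_study)]
        mean, sd = update_normal(mean, sd, data, sigma)
        history.append((mean, sd))
    return history


# Hypothetical settings: five studies of 20 observations each.
history = posterior_passing(true_effect=0.5, n_studies=5, n_per_study=20, sigma=1.0)
for i, (m, s) in enumerate(history, 1):
    print(f"study {i}: posterior mean = {m:.3f}, sd = {s:.3f}")
```

In this idealized setting the posterior standard deviation shrinks with every study in the chain. Replacing the starting prior with a skeptical one is a simple way to try out the kind of robustness comparison the abstract advocates.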
Curriculum and schedule
All participants are invited to an evening of snacks and general hanging out together on Monday and Wednesday, 5PM onwards (CEST).
We offer foundational/introductory and advanced courses in Bayesian and frequentist statistics. When applying, participants are expected to choose only one stream.
This year, there will be a special series of lectures by Ralf Engbert that everyone is welcome to attend.
- Special short course: Introduction to Dynamical Models in Cognitive Science (all participants are welcome, no need to register). Taught by Ralf Engbert, assisted by Lisa Schwetlick and Maximilian Rabe. Tuesday and Thursday afternoon.
This course is an introduction to Dynamical Models with examples from various fields of cognitive science. Preliminary list of topics: Analysis and modeling of dynamical systems and random walks, decision times, eye movement control in reading and scene perception, likelihood function for dynamical models. Course Materials will be made available online before the summer school. Textbook:
Timing: Monday, Wednesday, Friday: 3:00-4:30.
How to apply: People who are selected for the summer school will be asked to apply to this special course through a separate application form.
- Introduction to Bayesian data analysis (maximum 30 participants). Taught by Shravan Vasishth, assisted by Anna Laurinavichyute and Paula Lissón.
This course is an introduction to Bayesian modeling, oriented towards linguists and psychologists. Topics to be covered: Introduction to Bayesian data analysis, Linear Modeling, Hierarchical Models. We will cover these topics within the context of an applied Bayesian workflow that includes exploratory data analysis, model fitting, and model checking using simulation. Participants are expected to be familiar with R, and must have some experience in data analysis, particularly with the R package lme4.
Previous year's course web page: all materials (videos etc.) from the previous year are available here.
Textbook: here. We will work through the first six chapters.
- Advanced Bayesian data analysis (maximum 30 participants). Taught by Bruno Nicenboim, assisted by Himanshu Yadav
This course assumes that participants already have some experience in Bayesian modeling using brms and want to transition to Stan to learn more advanced methods and start building simple computational cognitive models. Participants should have worked through or be familiar with the material in the first five chapters of our book draft: Introduction to Bayesian Data Analysis for Cognitive Science. In this course, we will cover Parts III to V of our book draft: model comparison using Bayes factors and k-fold cross validation, introductory and relatively advanced models with Stan, and simple computational cognitive models.
Textbook here. We will start from Part III of the book (Advanced models with Stan). Participants are expected to be familiar with the first five chapters.
- Foundational methods in frequentist statistics (maximum 30 participants). Taught by Audrey Bürki, Daniel Schad, and João Veríssimo.
Participants are expected to have used linear mixed models before, to the level of the textbook by Winter (2019, Statistics for Linguists), and to want a deeper knowledge of frequentist foundations and a deeper understanding of the linear mixed modeling framework. Participants are also expected to have fit multiple regressions. We will cover model selection and contrast coding, with a heavy emphasis on simulations to compute power and to understand what the model implies. We will work on (at least some of) the participants' own datasets. This course is not appropriate for researchers new to R or to frequentist statistics.
Textbook draft here.
- Advanced methods in frequentist statistics with Julia (maximum 30 participants). Taught by Reinhold Kliegl, Phillip Alday, Julius Krumbiegel, and Doug Bates.
Applicants must have experience with linear mixed models and be interested in learning how to carry out such analyses with the Julia-based MixedModels.jl package (i.e., the analogue of the R-based lme4 package). MixedModels.jl has some significant advantages. Some of them are: (a) a new and more efficient computational implementation, (b) speed, needed for, e.g., complex designs and power simulations, (c) more flexibility for selection of parsimonious mixed models, and (d) more flexibility in taking into account autocorrelations or other dependencies typical of EEG- and fMRI-based time series (under development).
We do not expect profound knowledge of Julia from participants; the necessary subset of knowledge will be taught on the first day of the course. We do expect a readiness to install Julia and the confidence that, with some basic instruction, participants will be able to adapt prepared Julia scripts for their own data or to adapt some of their own lme4 commands to the equivalent MixedModels.jl commands. The course will be taught in a hybrid IDE. There is already the option to execute R chunks from within Julia, meaning one needs Julia primarily for executing MixedModels.jl commands as a replacement for lme4. There is also an option to call MixedModels.jl from within R and process the resulting object like an lme4 object. Thus, much of the pre- and postprocessing (e.g., data simulation for complex experimental designs; visualization of partial-effect interactions or shrinkage effects) can be carried out in R.
GitHub repo: here
Fees and accommodation
If the summer school is held in person (as is the plan), there will be a 40 Euro fee; this covers costs for coffee and snacks. Participants who are accepted are expected to arrange their own accommodation. We strongly advise participants to find a place to stay near Griebnitzsee campus, and not in Berlin. The reason is that German train personnel tend to go on strike every year around the time of the summer school. You will be better off if you can get easily to the Griebnitzsee campus.
For any questions regarding this summer school that have not been addressed on this home page already, please contact Shravan Vasishth