Introduction to Bayesian data analysis (SMLP 2021)


Instructors

Shravan Vasishth, Dr. Anna Laurinavichyute, Paula Lissón.

Dates and location

6-10 Sept 2021, taught online at SMLP.


Want to clone the materials on this website?

You can download everything on this website from here.

Overview and schedule

In recent years, Bayesian methods have come to be widely adopted in all areas of science. This is in large part due to the development of sophisticated software for probabilistic programming; a recent example is the astonishing computing capability afforded by the language Stan (mc-stan.org). However, the underlying theory needed to use this software sensibly is often inaccessible because end-users don't necessarily have the statistical and mathematical background to read the primary textbooks (such as Gelman et al.'s classic Bayesian Data Analysis, 3rd edition). In this course, we seek to bridge this gap by providing a relatively accessible and technically non-demanding introduction to the basic workflow for fitting different kinds of linear models using Stan. To illustrate the capability of Bayesian modeling, we will use the R package RStan and a powerful front-end R package for Stan called brms.
All times are Berlin time (CEST). The chapters refer to this book: see here. You can start reading the book already.
Social hour interface: All participants will meet on Monday and Wednesday, 5-6PM CEST, on wonder.me. The link is here. The password will be sent to participants within each stream.
Schedule:
Monday 6 Sept
9-9:15 Welcome to all participants in SMLP (Zoom link will be provided)
9:15-11 lecture chapter 1 (Shravan) 
11-11:30 Break
11:30-12:30 Shravan lecture+exercises/demos
12:30-1:30 Lunch
1:30-3:30 Time for exercises for the students 
          (Anna + Paula available for questions)
3:30-5:00 Anna will present the logit transform in preparation
          for a later chapter
5-6PM Social hour (link will be provided)

Tuesday 7 Sept
9-11 HW solutions + lecture chapter 2 (Shravan)
11-11:30 Break
11:30-12:30 Shravan lecture+exercises/demos
12:30-1:30 Lunch
1:30-3:30 Time for HW exercises for the students 
          (Shravan + Anna + Paula available for questions)
3:30-5:00 Spillover time (Shravan)
5:00-6:00 PM Talk by Dora Matzke (Zoom link will be provided)

Wednesday 8 Sept
9-11 HW solutions + lecture chapter 3 (Shravan) 
11-11:30 Break
11:30-12:30 Shravan lecture+exercises/demos
12:30-1:30 Lunch
1:30-3:30 Time for exercises for the students (Anna + Paula 
          available for questions)
3:30-5:00 Anna will present example analyses
5-6PM Social hour (link will be provided)

Thursday 9 Sept
9-11 HW solutions + lecture chapter 4 (Shravan)
11-11:30 Break
11:30-12:30 Shravan lecture+exercises/demos
12:30-1:30 Lunch
1:30-3:30 Time for HW exercises for the students 
          (Shravan + Anna + Paula available for questions)
3:30-5:00 Anna will present example analyses 
          (HW solutions will be released)
5:00-6:00 PM Talk by Phillip Alday (Zoom link will be provided)

Friday 10 Sept
9-11 HW solutions + lecture chapters 5+6 (Shravan) 
11-11:30 Break
11:30-12:30 Shravan lecture+exercises/demos
12:30-1:30 Lunch
1:30-3:30 Time for exercises for the students 
          (Anna + Paula available for questions)
3:30-5:00 Anna will talk about R Markdown and developing 
          a reproducible workflow (45 mins); time for free discussion (45 mins)
5-5:30 Farewell and wrap-up (link will be provided)


Prerequisites

You must have a functioning computer to do this course.
We also assume familiarity with R. Participants will benefit most if they have previously fit linear models and linear mixed models (using lme4) in R, in any scientific domain within linguistics and psychology. No knowledge of calculus or linear algebra is assumed, although it is helpful. Basic school-level mathematics is assumed; this will be quickly revisited in class.

Please install the following software before coming to the course

We will be using R and RStudio, so make sure you install both on your computer. You should also install the R packages rstan and brms. Please follow the installation instructions carefully.
Install the R package bcogsci from here.
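
Once the installation is done, a quick check along the following lines can confirm that R finds the packages. This is only a minimal sketch, not part of the official installation instructions: the variable names are illustrative, and the commented-out install.packages() call assumes you want the CRAN versions of rstan and brms. The bcogsci package is checked separately because it is installed from the link above.

```r
# CRAN packages named on this page
cran_pkgs <- c("rstan", "brms")

# requireNamespace() returns TRUE if a package can be loaded, FALSE otherwise
is_installed <- vapply(cran_pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- cran_pkgs[!is_installed]

if (length(missing_pkgs) > 0) {
  message("Still to install: ", paste(missing_pkgs, collapse = ", "))
  # Assumption: CRAN versions are wanted; read the installation
  # instructions first, then uncomment the next line.
  # install.packages(missing_pkgs)
} else {
  message("rstan and brms are installed.")
}

# bcogsci is installed from the link given above, so only report its status
bcogsci_ok <- requireNamespace("bcogsci", quietly = TRUE)
message("bcogsci installed: ", bcogsci_ok)
```

Note that this only confirms the packages can be loaded; it does not test whether Stan models compile on your machine, which the installation instructions cover.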

Outcomes

After completing this course, participants will be familiar with the foundations of Bayesian inference using brms, and will be able to fit a range of multiple regression models and hierarchical models for normally, lognormally, and binomially distributed data. They will know how to calibrate their models using prior and posterior predictive checks.

Course materials, HW assignments

We will work through the first six chapters of this book. Please note that the slides do not strictly follow the chapters' contents; the slides should be seen as complementing the chapters.
  1. Day 1: HW: Exercises 1.2, 1.4, 1.6, 1.8
    1. Videos:
    2. Slides: Rmd file, pdf file.
  2. Day 2: HW: Exercises 2.1, 2.2, 2.5
    1. Video:
    2. Slides: Rmd file, pdf file.
    3. Additional slides on sampling: Rmd file, pdf file.
  3. Day 3: HW: Exercises 3.1, 3.2, 3.3, Optional: 3.4
    1. Video:
    2. Slides: Rmd file, pdf file.
  4. Day 4: HW: Exercises 4.2, 4.3, 4.4
    1. Videos:
    2. Slides: Rmd file, pdf file.
  5. Day 5: HW: Exercises 5.1, 5.2, 5.4, 5.5
    1. Video:
    2. Additional notes, spillover from day 4: shrinkage in LMMs
    3. Slides: Rmd file, pdf file.
Solutions to exercises are not publicly available; they will only be provided to participants.

Tutorial articles

Here are some articles that provide background reading and cover some additional topics that we will skip:
  1. brms tutorial by the author of the package, Paul Bürkner.
  2. Ordinal regression models in psychological research: A tutorial, by Bürkner and Vuorre.
  3. Contrast coding tutorial, by Schad, Hohenstein, Vasishth, Kliegl.
  4. Bayesian workflow tutorial, by Schad, Betancourt, Vasishth.
  5. Linear mixed models tutorial, Sorensen, Hohenstein, Vasishth.
  6. brms tutorial for phonetics/phonology, Vasishth, Nicenboim, Beckman, Li, Kong.
  7. Workflow Techniques for the Robust Use of Bayes Factors by Daniel J. Schad, Bruno Nicenboim, Paul-Christian Bürkner, Michael Betancourt, Shravan Vasishth.
  8. Sample size determination for Bayesian hierarchical models commonly used in psycholinguistics. Shravan Vasishth, Himanshu Yadav, Daniel Schad, and Bruno Nicenboim. Submitted to Computational Brain and Behavior, 2021.
  9. Michael Betancourt's resources: These are a must if you want to get deeper into Stan and Bayesian modeling.
  10. MCMC animations/visualizations, McElreath's blog post on MCMC.

Some example articles from our lab and other groups that use Bayesian methods

A frequently asked question is: how should the results of a Bayesian analysis be summarized? Here are some examples of articles we have published using Bayesian data analysis. Our presentation of results is continuously evolving; there is no fixed answer to the question of how to display results. Use your judgement. The most important thing you can do to facilitate understanding of your work is to be open and transparent about your analyses. This means releasing all data and code with the published paper. We will be providing some guidelines on how to do this in the summer school course.
  1. Example random-effects meta-analysis.
  2. Example of finite mixture models using Stan.
  3. Replication attempt of a published study.
  4. Bayesian analysis of relatively large-sample psycholinguistic experiment.
  5. Examples of regression analyses by Vehtari and colleagues.