Introduction to Bayesian data analysis (home page)


Instructors

Bruno Nicenboim and Shravan Vasishth

Dates and location

March 2020, taught online.

Overview

In recent years, Bayesian methods have come to be widely adopted in all areas of science. This is in large part due to the development of sophisticated software for probabilistic programming; a recent example is the astonishing computing capability afforded by the language Stan (mc-stan.org). However, the underlying theory needed to use this software sensibly is often inaccessible, because end-users don't necessarily have the statistical and mathematical background to read the primary textbooks (such as Gelman et al.'s classic Bayesian Data Analysis, 3rd edition). In this course, we seek to close this gap by providing a relatively accessible and technically non-demanding introduction to the basic workflow for fitting different kinds of linear models using Stan. To illustrate the capability of Bayesian modeling, we will use the R package RStan and a powerful front-end R package for Stan called brms.

Prerequisites

We assume familiarity with R. Participants will benefit most if they have previously fit linear models and linear mixed models (using lme4) in R, in any scientific domain within linguistics and psychology. No knowledge of calculus or linear algebra is assumed (although it will be helpful), but basic school-level mathematics is assumed (and will be quickly revisited in class).

Please install the following software before coming to the course

We will be using the software R and RStudio, so make sure you install these on your computer. You should also install the R packages rstan and brms.
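The setup above amounts to a couple of lines of R; a minimal sketch (run once before the course):

```r
## Install the two required packages from CRAN:
install.packages(c("rstan", "brms"))

## Verify that both load without errors:
library(rstan)
library(brms)
```

Note that rstan compiles Stan models with a C++ toolchain, so the installation can take a few minutes; the rstan website documents platform-specific setup if compilation fails.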

Outcomes

After completing this course, participants will be familiar with the foundations of Bayesian inference using Stan (via RStan and brms), and will be able to fit a range of multiple regression models and hierarchical models, for normally, lognormally, and binomially distributed data. They will know how to calibrate their models using prior and posterior predictive checks, and will be able to establish true and false discovery rates to validate discovery claims. If there is time, we will discuss how to carry out model comparison using Bayes factors and k-fold cross-validation.
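As a taste of the workflow described above, a hierarchical lognormal model can be fit in brms in a few lines. This is only a sketch: the data frame df and its columns rt, condition, and subj are hypothetical placeholders, not course materials.

```r
library(brms)

## Hypothetical data: reaction times (rt) per subject (subj) in two conditions.
fit <- brm(
  rt ~ condition + (1 + condition | subj),  # hierarchical (by-subject) structure
  family = lognormal(),                     # for lognormally distributed data
  prior = prior(normal(0, 1), class = b),   # weakly informative prior on slopes
  data = df
)

## Calibrate the model with a posterior predictive check:
pp_check(fit)
```

The formula syntax deliberately mirrors lme4, which is why prior experience with lme4 (see Prerequisites) transfers directly.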

Online interaction

We will use Google Groups and Zoom. A link to the private group will be sent to participants.

Course materials

Click here to download everything. If you use GitHub, you can clone this repository: https://github.com/vasishth/IntroductionBDA

Textbook (in progress): See here. PDF version available on request.

Part 1 (Monday-Tuesday): Shravan Vasishth
The lectures correspond roughly to chapters 1 and 2 of our textbook in preparation
  1. Monday:
    1. Introductory video
    2. PDF: 00 Frequentist Foundations (review of some basic ideas)
      Exercises: 00 Frequentist Foundations Exercises
    3. PDF: 01 Foundations
      Exercises Part 1: 01 Foundations Exercises Part 1
      Exercises Part 2: 01 Foundations Exercises Part 2
  2. Tuesday:
    1. PDF: 02 Introduction to Bayesian methods
      Exercises: 02 Introduction to Bayesian methods Exercises
    2. PDF: 02 Sampling
      02 Sampling, Additional Notes
Part 2 (Wednesday-Friday): Bruno Nicenboim
For this part of the workshop, besides rstan and brms, be sure to have the following packages installed (and loaded in your session): MASS, dplyr, tidyr, purrr, readr, extraDistr, ggplot2, bayesplot, tictoc, and gridExtra.
The lectures correspond roughly to chapters 3, 4, and 5 of our textbook in preparation.
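The package list above (plus brms) can be loaded in a single call; a sketch:

```r
## Load all workshop packages in the order listed above:
pkgs <- c("MASS", "dplyr", "tidyr", "purrr", "readr", "extraDistr",
          "ggplot2", "brms", "bayesplot", "tictoc", "gridExtra")
invisible(lapply(pkgs, library, character.only = TRUE))
```

Keeping MASS before dplyr in the load order matters: loading MASS after dplyr would mask dplyr's select() with MASS::select().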

  1. Wednesday - 03 Computational Bayesian data analysis
    Slides and exercises
    Stan slides
    Part 1

    Part 2

    Part 3

    A brief intro to Stan
  2. Thursday - 04 - Bayesian regression models
    Slides and exercises
    More exercises
    Part 1 (Linear model)

    Part 2 (Log-normal regression)

    Part 3 (Logistic regression)
  3. Friday
    05 - Bayesian hierarchical models
    Slides
    Exercises

    06 - Model comparison with Bayes factors
    Slides
    Exercises
Case studies:
Three case studies (zip archive): meta-analysis, measurement error models, and an example of pre-registration.

Tentative schedule

Depending on the class, we may go faster or slower, so we may not adhere to this exact schedule.

Additional readings

R programming
  1. Getting started with R
  2. R for data science
  3. Efficient R programming.
Books
  1. A Student's Guide to Bayesian Statistics, by Ben Lambert: A good, non-technical introduction to Stan and Bayesian modeling.
  2. Statistical Rethinking, by Richard McElreath: A classic introduction.
  3. Doing Bayesian Data Analysis, Second Edition: A Tutorial with R, JAGS, and Stan, by John Kruschke: A good introduction specifically for psychologists.
Tutorial articles and materials
  1. brms tutorial by the author of the package, Paul Buerkner.
  2. Ordinal regression models in psychological research: A tutorial, by Buerkner and Vuorre.
  3. Contrast coding tutorial, by Schad, Hohenstein, Vasishth, Kliegl.
  4. Bayesian workflow tutorial, by Schad, Betancourt, Vasishth.
  5. Linear mixed models tutorial, Sorensen, Hohenstein, Vasishth.
  6. brms tutorial for phonetics/phonology, Vasishth, Nicenboim, Beckman, Li, Kong.
  7. Reproducible workflows tutorial
  8. Michael Betancourt's resources: These are a must if you want to get deeper into Stan and Bayesian modeling.
  9. MCMC animations/visualizations; McElreath's blog post on MCMC
Some example articles from our lab and other groups that use Bayesian methods
  1. Example random-effects meta-analysis (phonetics data on neutralization).
  2. A second example of a large-scale study and a random-effects meta-analysis (EEG data)
  3. A third example of a large-scale study and a random-effects meta-analysis (reading data)
  4. Example of a hierarchical finite mixture model using Stan.
  5. Replication attempt of a published study.
  6. Another (large-sample) replication attempt of a published study.
  7. Bayesian analysis of relatively large-sample psycholinguistic experiment.
  8. Examples of regression analyses by Vehtari and colleagues