Introduction to Bayesian data analysis (SMLP 2019)
Instructor
Shravan Vasishth
Teaching Assistants
Anna Laurinavichyute,
Dr. Dario Paape.
Dates and location
Taught at
SMLP 2019.
Sept 9-13, 2019. Haus 6, room S24, Griebnitzsee campus, University of Potsdam.
Overview
In recent years, Bayesian methods have come to be widely adopted in all areas of science. This is in large part due to the development of sophisticated software for probabilistic programming; a recent example is the astonishing computing capability afforded by the language Stan (mc-stan.org). However, the underlying theory needed to use this software sensibly is often inaccessible because end-users don't necessarily have the statistical and mathematical background to read the primary textbooks (such as Gelman et al.'s classic Bayesian Data Analysis, 3rd edition). In this course, we seek to fill this gap by providing a relatively accessible and technically non-demanding introduction to the basic workflow for fitting different kinds of linear models using Stan. To illustrate the capability of Bayesian modeling, we will use the R package RStan and a powerful front-end R package for Stan called brms.
Prerequisites
We assume familiarity with R. Participants will benefit most if they have previously fit linear models and linear mixed models (using lme4) in R, in any scientific domain within linguistics and psychology. No knowledge of calculus or linear algebra is assumed, although it would be helpful. Basic school-level mathematics is assumed; this will be quickly revisited in class.
Please install the following software before coming to the course
We will be using the software R and RStudio, so make sure you install these on your computer.
You should also install the R packages rstan and brms.
Outcomes
After completing this course, the participant will have become familiar with the foundations of Bayesian inference using Stan (RStan and brms), and will be able to fit a range of multiple regression models and hierarchical models for normally distributed data, and for lognormal and binomially distributed data. They will know how to calibrate their models using prior and posterior predictive checks; they will be able to establish true and false discovery rates to validate discovery claims. If there is time, we will discuss how to carry out model comparison using Bayes factors and k-fold cross-validation.
Course materials
Click here to download everything. If you use GitHub, you can clone this repository:
https://github.com/vasishth/IntroductionBayes
Solutions to exercises are not publicly available; they will only be provided to participants.
Moodle page:
click here
lecture notes:
Download from
here.
slides and exercises:
 01 Foundations
 02 Introduction to Bayesian methods
 03 Linear Modeling
 04 Hierarchical Linear Models
 05 Model Comparison using Bayes Factors
case studies:
Three case studies (zip archive): meta-analysis, measurement error models, and an example of preregistration.
Tentative schedule
Depending on the class, I may go faster or slower, so I may not adhere to this exact schedule.
 Monday: Foundations of Bayesian inference
Probability theory and Bayes' rule, Probability distributions, Understanding and eliciting priors, Analytical Bayes: Beta-Binomial, Poisson-Gamma, Normal-Normal
 Tuesday: Linear models
Basic theory of linear modeling; generating prior predictive distributions using RStan and R; fake-data simulation for model evaluation. Sampling methods will be skipped in class, but please read the lecture notes later, which cover inverse sampling, Gibbs sampling, random-walk Metropolis, and Hamiltonian Monte Carlo.
 Wednesday: Hierarchical linear models
HLMs using RStan and brms, fake-data generation, true and false discovery rate, logistic mixed-effects models, individual differences, shrinkage.
 Thursday: HLMs continued, exercises
Here we will get some hands-on experience with real-life problems.
 Friday: keynote lectures
Please see the SMLP schedule.
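As a rough illustration of the "Analytical Bayes" topic listed for Monday, the Beta-Binomial case can be sketched in a few lines of base R. This is not taken from the course materials: the prior parameters and the data below are made-up values chosen only for illustration, and the variable names are hypothetical. The sketch relies on the standard conjugacy result that a Beta(a, b) prior combined with k successes in n Binomial trials gives a Beta(a + k, b + n - k) posterior, and then checks the analytical posterior mean against a simple grid approximation.

```r
# Hypothetical prior and data, chosen only for illustration.
a_prior <- 2
b_prior <- 2
successes <- 7
trials <- 10

# Conjugate update: Beta(a, b) prior + Binomial data
# gives a Beta(a + k, b + n - k) posterior.
a_post <- a_prior + successes
b_post <- b_prior + (trials - successes)

# Analytical posterior mean and a 95% credible interval.
post_mean <- a_post / (a_post + b_post)
cred_int <- qbeta(c(0.025, 0.975), a_post, b_post)

# Sanity check: approximate the same posterior on a grid of theta values
# by multiplying prior density and likelihood, then normalizing.
theta <- seq(0.0005, 0.9995, by = 0.001)
unnorm <- dbeta(theta, a_prior, b_prior) * dbinom(successes, trials, theta)
grid_post <- unnorm / sum(unnorm)
grid_mean <- sum(theta * grid_post)
```

With these made-up numbers the posterior is Beta(9, 5), so the posterior mean is 9/14, and the grid-based mean should agree with it closely. The same pattern (write down the conjugate update, then check it numerically) carries over to the Poisson-Gamma and Normal-Normal cases.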
Additional readings
R programming
 Getting started with R
 R for data science
 Efficient R programming.
Books

A Student's Guide to Bayesian Statistics, by Ben Lambert: A good, non-technical introduction to Stan and Bayesian modeling.
 Statistical Rethinking, by Richard McElreath: A classic introduction.
 Doing Bayesian Data Analysis, Second Edition: A Tutorial with R, JAGS, and Stan, by John Kruschke: A good introduction specifically for psychologists.
Tutorial articles

brms tutorial by the author of the package, Paul Buerkner.
 Ordinal regression models in psychological research: A tutorial, by Buerkner and Vuorre.

Contrast coding tutorial, by Schad, Hohenstein, Vasishth, Kliegl.

Bayesian workflow tutorial, by Schad, Betancourt, Vasishth.

Linear mixed models tutorial, by Sorensen, Hohenstein, Vasishth.

brms tutorial for phonetics/phonology, by Vasishth, Nicenboim, Beckman, Li, Kong.
 Michael Betancourt's resources: These are a must if you want to get deeper into Stan and Bayesian modeling.
 MCMC animations/visualizations, McElreath's blog post on MCMC
Some example articles from our lab and other groups that use Bayesian methods

Example random-effects meta-analysis.

Example of finite mixture models using Stan.
 Replication attempt of a published study.
 Bayesian analysis of a relatively large-sample psycholinguistic experiment.
 Examples of regression analyses by Vehtari and colleagues