Workshop and hands-on session materials for the MPI Leipzig workshop on Nov 14-15, 2019.

A recent analysis of publicly released data accompanying papers published in Cognition showed that not all published numbers could be reproduced, even though the data and code were available (see here). The authors state: "...suboptimal data curation, unclear analysis specification and reporting errors can impede analytic reproducibility, undermining the utility of data sharing and the credibility of scientific findings." In this workshop, I will suggest one way to minimize the chances of producing irreproducible results, using a repeated measures two-condition design as a case study. The steps I will discuss are:
  1. Experiment design, and planning sample size using simulated data
  2. Defining the analysis plan using simulated data
  3. Checking that your experiment software actually collects the data you need
  4. Once data are collected, visualizing and summarizing the data
  5. Creating an R package to document and release your data and analyses
  6. Code refactoring
  7. Integrating the data analysis into the manuscript
  8. Releasing data and code: a suggested checklist
Download everything from here.
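
As an illustration of step 1, the following is a minimal sketch of simulation-based sample size planning for a repeated measures two-condition design. It is not taken from the workshop materials: the function names (simulate_experiment, estimate_power) and all parameter values (a 400 ms baseline, a 20 ms effect, the standard deviations) are assumptions chosen only for illustration, and a paired t-test stands in for whatever analysis the real plan specifies.

```r
# Simulate one experiment in which every subject contributes one mean
# per condition. All default values are illustrative assumptions.
simulate_experiment <- function(n_subj, effect = 20,
                                sd_subj = 50, sd_resid = 100) {
  # Subject-level intercepts are shared across both conditions.
  subj_intercept <- rnorm(n_subj, mean = 0, sd = sd_subj)
  cond_a <- 400 + subj_intercept + rnorm(n_subj, mean = 0, sd = sd_resid)
  cond_b <- 400 + effect + subj_intercept +
    rnorm(n_subj, mean = 0, sd = sd_resid)
  data.frame(subj = seq_len(n_subj), cond_a = cond_a, cond_b = cond_b)
}

# Estimate power as the proportion of simulated experiments in which
# a paired t-test gives p < alpha.
estimate_power <- function(n_subj, n_sim = 1000, alpha = 0.05,
                           effect = 20, sd_subj = 50, sd_resid = 100) {
  p_values <- vapply(seq_len(n_sim), function(i) {
    d <- simulate_experiment(n_subj, effect, sd_subj, sd_resid)
    t.test(d$cond_b, d$cond_a, paired = TRUE)$p.value
  }, numeric(1))
  mean(p_values < alpha)
}

set.seed(1)  # fix the seed so the simulation can be rerun exactly
sample_sizes <- c(20, 40, 80, 160)
power_by_n <- sapply(sample_sizes, estimate_power, n_sim = 500)
print(data.frame(n_subj = sample_sizes, power = power_by_n))
```

Because the data are simulated before any real data exist, the same script can double as a first draft of the analysis plan in step 2: the analysis code is written and checked against data whose true effect is known.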

Preparation for the workshop

  • R and RStudio
  • the R package papaja
  • GitHub

Workshop materials

Hands-on session materials

In this hands-on session, I will demonstrate a workflow for storing, analyzing, and maintaining the data and code associated with a published paper. A major practical problem in open science is how to provide materials to one's audience (which could be just one's PhD advisor, or the wider scientific community) in a form that lets them actually use and evaluate the work. We will develop practical skills in using tools like knitr, R Markdown, LaTeX, and GitHub, and learn to avoid common pitfalls when managing the workflow for releasing data publicly. If there is time, I will also discuss the workflow and protocols that I follow when I discover mistakes, problems, and/or outright errors in other researchers' data.
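
One way to keep reported numbers tied to the analysis, in the spirit of the knitr and R Markdown workflow mentioned above, is to generate the text of a result from the test object instead of typing it by hand. The following is a minimal sketch in base R. The helper name format_paired_t and the example data are hypothetical and are not part of the workshop materials.

```r
# Hypothetical helper: run a paired t-test and return a string
# suitable for inline reporting in a manuscript.
format_paired_t <- function(x, y) {
  stopifnot(length(x) == length(y))
  tt <- t.test(x, y, paired = TRUE)
  sprintf("t(%d) = %.2f, p = %.3f",
          as.integer(tt$parameter),  # degrees of freedom
          unname(tt$statistic),      # t statistic
          tt$p.value)
}

# Made-up example data: one mean per subject in each condition.
cond_a <- c(412, 398, 455, 430, 401, 420)
cond_b <- c(430, 405, 470, 428, 415, 441)

result_text <- format_paired_t(cond_b, cond_a)
print(result_text)
```

In an R Markdown manuscript, a helper like this could be called from an inline code chunk, for example `r format_paired_t(cond_b, cond_a)`, so that the reported statistic is updated whenever the data or the analysis change.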

Contact details

For any questions regarding these materials, please contact Shravan Vasishth.