Tutorial
Introduction
The Github repository located here contains a template for a reproducible research project. The fundamental idea of reproducible research is that the steps that take your research from raw data to manuscript, thesis, or report should be fully automated. This way, your work can be checked by your adviser, mentors, collaborators, others working in your area, journal reviewers, and your future self.
R and RStudio are an excellent vehicle for conducting reproducible
research. You write manuscripts and reports in .rmarkdown
documents
that includes code chunks that perform analysis. The code chunks are
evaluated by R
and incorporated in the document by the tools in the
knitr
package to produce a markdown .md
document. From there a
program called pandoc
converts your markdown document to whatever file
format you like: PDF (formatted with latex .csl files), html, or
Microsoft Word. This all happens without the user really knowing what is
going on, which makes it easy to get started.
Table of Contents
Introduction to R
Getting Started
Data
Analysis
Manuscript
Markdown
Latex
If Your Collaborators Use Word
Download Repository
Contents
On Your Own
Once you understand how all the peices fit together you can modify these
files to conduct your own reproducable project. Just make sure your
data-raw
is accessed by your analysis scripts and that your results
are stored in the anlaysis-output
folder. Then make sure your
manuscript pulls the data and analysis results automatically.
Background
I became interested in reproducible research because I was tired of being terrified of my own analysis. I was constantly petrified someone would question my work and ask me to open the black box and verify what I did was correct; in some cases even being asked reproduce a result was terrifying because I knew the convoluted path of data prep and cleaning that I took in arriving at my result.
I read the book Reproducable Research with R and RStudio by Christopher Gandrud (2013), and I read a lot of blog posts and tutorials by Karl Broman and Carl Boettiger and I struck out on my own path to execute a reproducible research project from start to finish. The repository for that project is here, while I was successful in learning the basics of how r, rmarkdown, knitr, and pandoc combine to make reproducible research possible, you can tell just by looking at the project’s Github repository that I failed miserably at making the project reproducible. This is because the repository is totally unorganized, and I am pretty sure I am the only one who could reproduce the results from this project.
But in that failure, I learned a lot about how a reproducible research project should be organized. I built this template for my future students and for my future self.
References
Gandrud, Christopher. 2013. Reproducible Research with R and R Studio. CRC Press.