Version control and collaborative scientific coding

Reproducibility and transparency are aspects of scientific practice that can strongly affect the quality of scientific results: quality in the sense that results are available for scrutiny by researchers, funding agencies, and the public. As scientists, we have a moral obligation to be transparent and to strive towards reproducibility, and treating these as goals in their own right will likely make us better scientists.

Version control software allows researchers to automate the process of keeping a record of changes in a project. This record creates transparency and allows for reproducibility. Shifting our focus from a single end product, such as a scientific paper, to conducting and communicating research transparently and reproducibly may also alter our perspective on the scientific process.

Collaboration in complex projects is messy. Multiple files exist in various versions, and changes are made in parallel with little to no control over what is lost or gained. Although most scientific collaborations would benefit from a more formal structure for planning, data analysis, and writing, such structures are difficult to establish without a common point of departure. This part of our course focuses on tools and workflows that make collaboration more effective. We will introduce version control as a tool in collaborative scientific writing and discuss how to establish a transparent framework for a collaborative writing project.