Versioning of data and code using Git

Version control is the lab notebook of the digital world: it’s what professionals use to keep track of what they’ve done and to collaborate with other people. Every large software development project relies on it, and most programmers use it for their small jobs as well. And it isn’t just for software: books, papers, small data sets, and anything that changes over time or needs to be shared can and should be stored in a version control system.

Teams are not the only ones to benefit from version control: lone researchers can benefit immensely. Keeping a record of what was changed, when, and why is extremely useful for all researchers if they ever need to come back to the project later on (e.g., a year later, when memory has faded).

Consider two collaborators who want to work on a shared set of plans at the same time, but who have run into problems doing this in the past. If they take turns, each one will spend a lot of time waiting for the other to finish. If they instead work on their own copies and email changes back and forth, things will be lost, overwritten, or duplicated.

You will learn how to use version control with Git to:

  • Avoid mailing revisions back and forth
  • Keep track of important revisions and changes in your files
  • Collaborate with others and merge revisions in a manner that avoids work getting lost, overwritten, or duplicated

Getting Started

Before you begin this lesson, you need to 1) create a GitHub account and 2) install GitHub Desktop on your computer. See Setup for further instructions.