Lecture 2: Version control and collaboration with Git!

Daniel Hammarström

Why version control?

  • Reproducibility and transparency → Better science
  • Collaboration and robustness → Better science
  • Formal structures and workflows → Better science

Introduction to git

  • git is version control software that is installed locally
  • It tracks changes to files in a specific repository (folder)
  • The version history is stored in a hidden folder, .git
  • git is really good at tracking plain text files, but can also track other files…
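
As a rough illustration of the points above, the following sketch creates a repository in a throwaway folder, records two versions of a plain text file and lists the history. The file name notes.txt, the commit messages and the placeholder identity are assumptions made for the example; they are not part of the lecture.

```shell
set -e
# Placeholder identity (an assumption) so commits work without a global git config
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

demo=$(mktemp -d)        # throwaway folder that becomes the repository
cd "$demo"
git init -q              # creates the hidden .git folder

echo "First draft" > notes.txt
git add notes.txt        # stage the new file
git commit -q -m "Add first draft of notes"

echo "Second draft" >> notes.txt
git commit -q -am "Extend notes with a second draft"

git log --oneline        # one line per commit, newest first
ls -a                    # .git is listed among the hidden entries
```

Everything git knows about the history lives in .git, so deleting that folder turns the repository back into an ordinary folder.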

Introduction to GitHub

  • GitHub is a collaborative platform that allows you to host version controlled repositories online
  • GitHub makes it possible to share code, collaborate on developing code, host websites, and more

A list of tools for version control

  • GitHub CLI: Command line interface to GitHub
  • GitHub Desktop: Graphical user interface to GitHub/git

Contributing to a central repository by pull requests

  • A pull request is “all or nothing” → smaller changes are easier to pull into the central repository
  • The owner of the central repository can incorporate and work on a large pull request in a separate “branch”

Contributing to a repository by branching

  • A branch can contain edits to the project that we want to make without risking breaking the main branch.
  • Changes in a branch are merged into the main branch using pull requests.
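
A minimal local sketch of the branching idea, assuming a hypothetical branch name add-description and a placeholder identity. On GitHub the final merge would normally happen through a pull request; here git merge plays that role so the example runs without a remote.

```shell
set -e
# Placeholder identity (an assumption) so commits work without a global git config
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

repo=$(mktemp -d)
cd "$repo"
git init -q
echo "## This is an example" > README.md
git add README.md
git commit -q -m "Add README"
git branch -M main                   # make sure the branch is called main

git checkout -q -b add-description   # hypothetical branch for the new edits
echo "It has some content that needs to be version controlled" >> README.md
git commit -q -am "Describe the example"

git checkout -q main                 # main does not contain the edit yet
git merge -q --no-ff -m "Merge add-description" add-description
git log --oneline --graph            # shows the branch joining main
```

Until the merge, main is unaffected by whatever happens in add-description, which is what makes the branch a safe place for experiments.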

Contributing to a repository directly by “pull” and “push”

  • You could collaborate on a repository by pulling from and pushing to the main branch directly…
  • This may be risky, as parallel changes to the same files create merge conflicts
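
The direct pull and push workflow can be tried without GitHub. In this sketch a local bare repository (central.git) stands in for the remote, and the two working copies alice and bob are hypothetical collaborators; all names and the placeholder identity are assumptions.

```shell
set -e
# Placeholder identity (an assumption) so commits work without a global git config
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work=$(mktemp -d)
cd "$work"

# Alice starts the project
git init -q alice
cd alice
echo "## This is an example" > README.md
git add README.md
git commit -q -m "Add README"
git branch -M main
cd "$work"

# A bare copy plays the role of the shared remote
git clone -q --bare alice central.git
git -C alice remote add origin "$work/central.git"

# Bob gets his own working copy
git clone -q central.git bob

# Alice commits and pushes directly to main
cd "$work/alice"
echo "Notes from Alice" > alice.txt
git add alice.txt
git commit -q -m "Add notes from Alice"
git push -q origin main

# Bob pulls to receive the new commit
cd "$work/bob"
git pull -q --ff-only origin main
ls                       # alice.txt is now present in Bob's copy
```

This works smoothly only because Bob made no commits of his own in the meantime; once both sides change the same lines, the pull ends in a merge conflict.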

Merge and conflicts

Merge conflicts

Local repository:

## This is an example

It has some content that needs to be version controlled

Remote repository:

## This is an example

It has some content that needs to be version controlled. We are adding some information in the remote repository

Local repository:

## This is an example

It has some content that needs to be version controlled. Adding local changes.

Pull from remote:

## This is an example

<<<<<<< HEAD
It has some content that needs to be version controlled. Adding local changes.
=======
It has some content that needs to be version controlled. We are adding some information in the remote repository.
>>>>>>> aac1016966305b6d8dd91aea5f8194fdfb929171

Merge conflicts

Local repository:

## This is an example

<<<<<<< HEAD
It has some content that needs to be version controlled. Adding local changes.
=======
It has some content that needs to be version controlled. We are adding some information in the remote repository.
>>>>>>> aac1016966305b6d8dd91aea5f8194fdfb929171

Pull from remote:

<<<<<<< HEAD
This is the state of the file in your copy
=======
This is what you get from the remote
>>>>>>> aac1016966305b6d8dd91aea5f8194fdfb929171
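
The conflict shown in this section can be reproduced and resolved locally. The sketch below is one possible way to do it, assuming a bare repository (remote.git) as a stand-in for GitHub, a hypothetical second working copy called collaborator, a file named example.md and a placeholder identity. The commit hash in the closing marker will differ from the one on the slide.

```shell
set -e
# Placeholder identity (an assumption) so commits work without a global git config
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work=$(mktemp -d)
cd "$work"
base="It has some content that needs to be version controlled."

# Local repository with the shared starting point
git init -q local
cd local
printf '## This is an example\n\n%s\n' "$base" > example.md
git add example.md
git commit -q -m "Add example"
git branch -M main
cd "$work"
git clone -q --bare local remote.git
git -C local remote add origin "$work/remote.git"

# A collaborator edits the line and pushes to the remote
git clone -q remote.git collaborator
cd collaborator
printf '## This is an example\n\n%s We are adding some information in the remote repository.\n' "$base" > example.md
git commit -q -am "Add information in the remote"
git push -q origin main

# Meanwhile the same line is edited locally
cd "$work/local"
printf '## This is an example\n\n%s Adding local changes.\n' "$base" > example.md
git commit -q -am "Add local changes"

# The pull stops with a conflict and a non-zero exit code
git pull --no-rebase origin main || echo "Merge conflict in example.md"
cat example.md           # contains the <<<<<<<, ======= and >>>>>>> markers

# Resolve: rewrite the file to the text we want, then commit the merge
printf '## This is an example\n\n%s Adding local changes and the remote information.\n' "$base" > example.md
git add example.md
git commit -q -m "Merge remote changes and resolve conflict"
git push -q origin main
```

Resolving a conflict always comes down to the same three steps: edit the file until no markers remain, stage it with git add, and commit.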

Keep a list of files that you do not want to track with .gitignore

# History files

# Session Data files

# User-specific files

# produced output can be rebuilt from source

  • The .gitignore file lets you decide which files to leave out of your history.
  • By adding e.g. *.pdf and *.html to .gitignore we avoid merge conflicts in files that can be built from source.
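
To see the effect of such rules, the sketch below writes a small .gitignore and asks git which files it would track. The report.* file names are hypothetical, and the two patterns are only the ones named in the bullet above, not the full list from the slide.

```shell
set -e
proj=$(mktemp -d)
cd "$proj"
git init -q

# Ignore rules for rendered output (patterns taken from the bullet above)
cat > .gitignore <<'EOF'
# produced output can be rebuilt from source
*.pdf
*.html
EOF

# Hypothetical source file and its rendered outputs
touch report.md report.pdf report.html

git status --short                           # lists .gitignore and report.md only
git check-ignore -v report.pdf report.html   # shows which rule matches each file
```

The .gitignore file itself is normally committed, so every collaborator ignores the same files.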

Git and GitHub: Best practices

  • Do not push directly to main/master; use pull requests.
  • Do not store sensitive information on GitHub.
  • Use .gitignore to avoid pushing/pulling files that do not need version control.
  • Do not push temporary files or files that are built from source. Add e.g. PDF files to your .gitignore file.
  • Write good, descriptive commit messages.
  • Commit often and work in small increments.
  • Do not use GitHub to store large files.
  • Always “test” before committing.