Collaboration with version control

Licensed under CC-BY 4.0 and OSI-approved licenses; see licensing.

Overview

Teaching: 10 min
Exercises: 30 min
Questions
  • How do you keep track of versions when working in a group?

Objectives
  • Add collaborators to your repository
  • Contribute to existing repositories using pull requests
  • Submit small changes using pull requests
  • Propose and submit larger changes
  • Review and discuss changes

About this episode

In this session we will work in pairs or small groups to learn how to collaborate in repositories that belong to a group you are part of. We will do this through a change proposal (pull request) and a review, and we will also discuss the pros and cons of different strategies.

Step 1: Learn how to add collaborators to your repository

In the previous episodes we have learned how to work with our own local and remote repositories. The first step in allowing others to make changes is to add your group members as “collaborators” on GitHub. By default this allows them to change things directly (you can prevent this by altering the settings of your repository).

  • Navigate to your repository analysis-recipe.
  • Click on “Settings” (top right), then “Collaborators” (left), then “Add people” (green button).
  • Add the collaborator’s GitHub username (shared on a post-it).

[Screenshot: inviting collaborators]

From here on the collaborators can push changes in the same way as we have done in a single-person repository in the previous episodes.

Step 2: Submit a small change via the web interface as collaborator

As a collaborator, it is possible to make changes on the main branch directly. However, in this exercise we will learn a better way by submitting a “pull request” (a change proposal) towards the main branch for review. We will practice this by suggesting a change in a data file:

  • Navigate to the repository of your collaborator by entering the following in your browser: “https://github.com/username_of_collaborator/analysis-recipe” (replace username_of_collaborator with the username shared with you on the post-it)
  • Once in the correct repository, click on the file samples_openrefine_lesson.csv
  • Open the file editor via web (click on the edit pen)
  • Modify or extend one or a few rows in the data file, for example: in rows 2-4 change the dates to “2020-03-31”
  • Click on Commit changes and write a suitable commit message
  • Instead of choosing “Commit directly to main branch”, choose “Create a new branch for this commit and start a pull request.”
  • Enter a meaningful branch name (it can be useful to prefix it with your name so that we know who this branch belongs to), e.g.: YvonneKallberg-date-fix

[Screenshot: proposing a file change]

  • After we click “Propose changes” we are taken to the pull request form. [Screenshot: pull request form]

  • Verify the source and target branches: the source is the branch where you made your commits, and the base is the branch into which you want your changes to be integrated.
  • If desired: edit the title and description of the “pull request” (change proposal)
  • Click “Create pull request”
    Note: If you click on the arrow of “Create pull request”, you can change the action to “Create draft pull request”. This is handy when you want to collect feedback on unfinished work.
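
For readers who prefer scripting, the same choice of source and target branch can also be expressed as a request to the GitHub REST API. The sketch below only builds the request and never sends it. The helper name build_pull_request and the token placeholder are hypothetical, and the endpoint and field names (head for the source branch, base for the target branch, draft for a draft pull request) should be checked against the GitHub REST API documentation before use.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def build_pull_request(owner, repo, head, base, title, body="", draft=False,
                       token="YOUR_TOKEN"):
    """Build (but do not send) a request that would open a pull request.

    head is the source branch with your commits, base is the target branch.
    token is a placeholder; a real personal access token would be needed.
    """
    payload = {
        "title": title,
        "head": head,
        "base": base,
        "body": body,
        "draft": draft,
    }
    return urllib.request.Request(
        url=f"{API_ROOT}/repos/{owner}/{repo}/pulls",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
    )


# Values taken from the exercise; the owner is the placeholder used in the text.
request = build_pull_request(
    owner="username_of_collaborator",
    repo="analysis-recipe",
    head="YvonneKallberg-date-fix",
    base="main",
    title="Fixed date issue",
    draft=True,
)
print(request.get_method(), request.full_url)
print(json.loads(request.data))
```

Keeping head and base as explicit arguments mirrors the verification step in the form: it is easy to see at a glance which branch is proposed and where it would be merged.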

Step 3: Review the pull request

Once the “pull request” is submitted, it can be reviewed by a repository member:

  • Click on Pull requests in the top menu, and then click on the pull request you want to review (e.g. “Fixed date issue”)
  • Review the changes by clicking on the Files changed tab and check whether you agree with them. [Screenshot: files changed]
  • In the top right, click on Review changes to add comments, approve, or request changes. [Screenshot: review]
  • Once you want to integrate the suggested changes, go to the Conversation tab and press “Merge pull request”. [Screenshot: merge]

Discussion

  1. Why is it considered better to work in branches, compared to committing directly to “main”?
  2. What are possible benefits of having a review process?

Answer

  • The “main” branch should be kept as clean as possible, since it is the ‘production level’ of the repository, while other branches are suitable for testing alternative routes, discussing proposed changes, etc. Allowing collaborators to work directly on the main branch also increases the risk of introducing errors.
  • A review process, where the proposed changes are checked by someone else, increases quality in terms of context (is the text understandable?), errors (does a script work for others?), typos, missing information or links, etc. It is also an opportunity for your collaborators to enhance their knowledge.

Tips

  • Ideally the submitter and the reviewer should be two different people. Decide within the group of collaborators how you will work in common repositories.
  • You can modify an open “pull request” by committing new changes to its branch.
  • Review not only assures quality but also enhances learning and knowledge transfer within the group.
  • Protecting the main branch “forces” all changes to it to be reviewed first. We recommend this for group repositories. To read more see: GitHub Documentation about protecting branches

Step 4: Resolving a conflict

Demo: let us experience a conflict

When merging two branches a conflict can arise when the same part of a file has been modified in two different ways on the two branches.
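
To make this rule concrete, here is a deliberately simplified sketch of a three-way comparison for a single line. It assumes all three versions have the same number of lines and compares them position by position; Git's real merge works on changed regions rather than fixed line positions, so this is an illustration of the idea and not Git's algorithm. The function name merge_line and the sample rows are made up for this example.

```python
def merge_line(base, ours, theirs):
    """Return (merged_line, is_conflict) for one line in a three-way comparison.

    base is the line on main before branching, ours and theirs are the
    versions of that line on the two branches.
    """
    if ours == theirs:
        return ours, False      # both branches agree (or neither changed it)
    if ours == base:
        return theirs, False    # only the other branch changed the line
    if theirs == base:
        return ours, False      # only our branch changed the line
    return None, True           # both changed it differently: a conflict


# Hypothetical file contents, not taken from the lesson's data file.
base = ["sample,date", "A1,2019-12-30", "A2,2019-12-31"]
ours = ["sample,date", "A1,2020-03-31", "A2,2019-12-31"]
theirs = ["sample,date", "A1,2020-04-01", "A2,2020-01-15"]

for number, lines in enumerate(zip(base, ours, theirs), start=1):
    merged, conflict = merge_line(*lines)
    status = "CONFLICT" if conflict else merged
    print(f"line {number}: {status}")
# line 1: sample,date
# line 2: CONFLICT
# line 3: A2,2020-01-15
```

Only line 2 conflicts, because both branches changed it in different ways; line 3 was changed on one branch only and merges cleanly.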

We can demonstrate how a conflict looks and how to resolve it:

  • Imagine that two collaborators have created “pull requests” (change proposals), both branching from main, that change the same line in two different ways

[Screenshots: the two conflicting edits]

  • We merge one of the pull requests to main (this will work)
  • Then we try to merge the other and we see a conflict:

[Screenshot: pull request showing a conflict]

  • We will try to resolve the conflict via the web browser.
  • Choose the version that you wish to keep by removing the rows to be discarded as well as the conflict markers, then click on Mark as resolved and then on Commit the merge

[Screenshot: conflict resolution]
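
The manual resolution described above (keep one version, delete the other along with the conflict markers) can be sketched in a few lines of Python. This is only an illustration under simple assumptions: each conflict uses the standard markers `<<<<<<<`, `=======` and `>>>>>>>`, and you always want to keep the same side. The function name resolve_conflicts, the branch labels, and the sample rows are hypothetical.

```python
def resolve_conflicts(text, keep="ours"):
    """Drop conflict markers and keep one side of every conflict.

    keep="ours" keeps the part between <<<<<<< and =======,
    keep="theirs" keeps the part between ======= and >>>>>>>.
    """
    if keep not in ("ours", "theirs"):
        raise ValueError("keep must be 'ours' or 'theirs'")

    resolved = []
    section = None  # None outside a conflict, otherwise "ours" or "theirs"
    for line in text.splitlines():
        if line.startswith("<<<<<<<"):
            section = "ours"
        elif line.startswith("=======") and section == "ours":
            section = "theirs"
        elif line.startswith(">>>>>>>") and section == "theirs":
            section = None
        elif section is None or section == keep:
            resolved.append(line)
    return "\n".join(resolved)


# Made-up example of a file with one conflict.
conflicted = "\n".join([
    "id,date",
    "<<<<<<< YvonneKallberg-date-fix",
    "1,2020-03-31",
    "=======",
    "1,2020-04-01",
    ">>>>>>> main",
    "2,2019-05-02",
])

print(resolve_conflicts(conflicted, keep="ours"))
# id,date
# 1,2020-03-31
# 2,2019-05-02
```

In practice a conflict often needs a human decision, or a combination of both versions, which is why the web editor asks you to edit the text yourself.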

Discussion

  • Compare with Google Docs: can you get conflicts there? What are the advantages and disadvantages?
  • What can we do to avoid conflicts?

Extra Exercise: Submit a small change via the web interface as external contributor (optional)

When you submit a change proposal as an external contributor (we assume you have not been added as a “collaborator” and thus have no write permissions to the repository), GitHub prompts you to make a fork of the repository (a copy of the original repository in your user space). You can then submit a “pull request” in a similar way as before, only this time you have no other choice than “Propose file change”.

  • Navigate to the repository of your instructor by entering the following in your browser: “https://github.com/username_of_instructor/analysis-recipe” (replace username_of_instructor with the username written on the whiteboard)
  • Once in the correct repository, click on the file samples_openrefine_lesson.csv
  • Open the file editor via web (click on the edit pen); this will prompt you to create a fork of the repository.

[Screenshot: fork prompt]

  • Modify or extend one or a few rows in the data file, for example: in row 14 change the date to “2020-03-31”. Note that there is a blue banner at the top to make you aware that you will be committing the changes to a new branch in your fork.

[Screenshot: committing in the fork]

  • Click on Commit changes and enter a meaningful commit message. This will take you to a pull request form:

[Screenshot: pull request form for a fork]

  • This time, in addition to the source and target branches, there are also source and target repositories that we need to verify.
  • Review the change, edit the title and description of the “pull request” if desired, and then submit it via the green button “Create pull request”.
  • The fork can be removed later.

Summary

  • In this episode we learned how to propose and submit changes via “pull requests”.
  • Protecting the main branch and insisting on every change going through a pull request can be useful to get feedback on your changes and to improve knowledge transfer.
  • Conflicts may arise when collaborating but are typically easy to solve.