1 Introduction to Git
1.1 What is Git
Git is nowadays the most widely used distributed version control system, especially in software development. By opposition to centralized version control systems, with Git, the code source, including its full history, is mirrored on every developer’s computer.
Git is the most popular tool, even though it might not be the most user-friendly one. It has a lot of options/commands and specific jargon. Fortunately there are many “Git cheat sheets” (such as https://education.github.com/git-cheat-sheet-education.pdf).
1.2 What should Git be used for
In software development, Git is mostly used for version control of code. In our bioinformatics projects, we can also track our report files, environment files, and other small files.
Git should NOT be used for storing data, particularly large data. Sensitive data (passwords, usernames, API keys…) should not be put in a Git repository, because they can be then exposed to the world. If one commits sensitive data by mistake, one can go back into the git history and remove it, but it is not a simple task.
1.3 Git Repositories
- A git repository (repo) is any folder structure that is version-controlled by git.
- A git repo can be initialized from a local folder, or cloned from a remote repo.
- To initialize a repo from a local folder:
cd myfolder git init
- To clone a git repo from a remote source:
git clone https://github.com/user/repo
- Regardless of how you obtain it, your local copy of the git repo will contain a
.git
folder. That is where the change history of your project is stored and maintained by git.
1.4 Git branches
Once you have cloned a specific git repository locally on your computer, you can navigate and/or create new branches on it using git CLI. Here some examples:
- To create a branch and switch to it type
git checkout -b branch_name
- To push the newly created branch to the remote repository type
git push -u origin branch_name
- To display all branches on the local and remote repository type
git branch -a
- To switch to one of the displayed branches type
git checkout name_of_branch
. Once a change is committed to that branch, pushing the committed change will be pushed to that specific branch on the remote repository. - To delete a branch type
git branch -d name_of_branch_to_delete
1.5 File staging and git commit
Staging in Git involves adding new, modified, or deleted files to a staging area before committing them. This allows for flexibility in choosing the files to commit.
- Check status via
git status
You’ll see what branch you are on and status of files (untracked, modified, or deleted).
- Stage Files to Prepare for Commit
- Stage all files:
git add .
- Stage a file:
git add example.html
- Stage a folder:
git add myfolder
- Check status again:
git status
You should see there are changes ready to be committed. - Unstage a File
- If you accidental stage something, use the following command to unstage it:
git reset HEAD example.html
- Deleting Files
- If you delete files they will appear in git status as deleted, and you must use git add to stage them. Another way to do this is using git rm command, which both deletes a file and stages it all with one command:
git rm example.html
to remove a file (and stage it)git rm -r myfolder
to remove a folder (and stage it)
- Commit Files
git commit -m "Message that describes what this change does"
- Check status again:
git status
If all changes have been committed, and there are no untracked files, it should say: nothing to commit, working tree clean. - View a List of Commits
- When viewing a list of commits, there are various commands depending on how much info you want to see.
- To see a simplified list of commits, run this command:
git log --oneline
- To see a list of commits with more detail (such who made the commit and when), run this command:
git log
NOTE: If the list is long, use the Down/Up Arrow keys to scroll and hit Q to quit. - To see a list of commits with even more detail (including which files changed), run this command:
git log --stat
- Fixing Your Last Commit Message
git commit --amend -m "Put your corrected message here"
: to correct a mistake in your last commit message
- Changing committed files
- The
--no-edit
flag will allow you to make the amendment to your commit without changing its commit message. Example:
# Edit hello.py and main.py
git add hello.py
git commit
# Realize you forgot to add the changes from main.py
git add main.py
git commit --amend --no-edit
The resulting commit will replace the incomplete one, and it will look like we committed the changes to hello.py and main.py in a single snapshot.
1.6 Git push/pull
You can use git push
to sync a remote repository with the changes you’ve done locally. The most basic example would be that you’ve first cloned a repository with git clone
then made some changes in that local copy and want to update the original remote repository.
Similarly, if for example, someone else made changes to the remote and you want to incorporate those changes into your local copy you will run git pull
to make sure you are up to date with the changes in the remote repository before working on your local copy.
1.7 Git merge and git rebase
Git merge and git rebase can be said to be used to solve similar things.
When working on a feature in a separate branch while someone else updates the main branch you often want to incorporate the new changes from the main branch into your feature branch.
First you would probably like to use git pull
as described above to make sure your local copy is up-to-date with changes made by others.
Then it could be done with merge like this:
git checkout my_new_feature
followed by adding your new code/feature and then merge it:
git merge main
This will create what is called a “merge commit” and put the changes from main into your feature branch.
The alternative way would be to use rebase:
git checkout my_new_feature
git rebase main
This will sort of re-write the project history by moving the feature branch to the “tip” of the main and create new commits in the original branch.