Variant calling index
About
A generic variant calling workflow consists of the following basic steps:
- read quality control and filtering
- read mapping
- removal / marking of duplicate reads
- joint / sample-based variant calling and genotyping
Each of these steps can be tweaked or extended depending on the application and method. The variant calling exercises here present the basic steps to go from raw data to variant calls.
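As a rough orientation, the basic steps listed above can be sketched as an ordered list of commands. The tool choices here (FastQC, bwa, samtools, GATK) are assumptions based on common practice, not something this page specifies, and the sample name, file names, and output directory are hypothetical. The function only builds the commands; it does not run them, and it assumes the reference has already been indexed.

```python
import shlex


def basic_steps(sample, ref, r1, r2, outdir="results"):
    """Return one argv list per workflow step; nothing is executed here.

    Assumptions (not taken from the course material): tool choice,
    file naming scheme, and an already indexed reference `ref`.
    """
    sam = f"{outdir}/{sample}.sam"
    sorted_bam = f"{outdir}/{sample}.sorted.bam"
    dedup_bam = f"{outdir}/{sample}.dedup.bam"
    metrics = f"{outdir}/{sample}.dedup.metrics.txt"
    vcf = f"{outdir}/{sample}.vcf.gz"
    # bwa expands the literal "\t" sequences in the read group string itself.
    read_group = f"@RG\\tID:{sample}\\tSM:{sample}\\tPL:ILLUMINA"
    return [
        # 1. Read quality control
        ["fastqc", "-o", outdir, r1, r2],
        # 2. Read mapping, followed by coordinate sorting
        ["bwa", "mem", "-R", read_group, "-o", sam, ref, r1, r2],
        ["samtools", "sort", "-o", sorted_bam, sam],
        # 3. Duplicate marking, then indexing the resulting BAM
        ["gatk", "MarkDuplicates", "-I", sorted_bam, "-O", dedup_bam, "-M", metrics],
        ["samtools", "index", dedup_bam],
        # 4. Sample-based variant calling
        ["gatk", "HaplotypeCaller", "-R", ref, "-I", dedup_bam, "-O", vcf],
    ]


if __name__ == "__main__":
    # Hypothetical sample and file names, for illustration only.
    for cmd in basic_steps("sampleA", "ref.fasta", "sampleA_R1.fastq.gz", "sampleA_R2.fastq.gz"):
        print(shlex.join(cmd))
```

Printing the commands rather than executing them makes it easy to inspect each step before running anything; the exercises themselves describe the actual commands and options to use.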
The exercises are based on the Monkeyflowers dataset. Make sure to read the dataset document before running any commands, as it gives you the biological background and general information about where to find the data and how to set it up. We will focus on the red and yellow ecotypes in what follows.
Intended learning outcomes
- Perform QC on sequencing reads and interpret results
- Prepare reference for read mapping
- Map reads to reference
- Mark duplicates
- Perform raw variant calling to generate a set of sites to exclude from recalibration
- Perform base quality score recalibration
- Perform variant calling on base recalibrated data
- Genotype all samples and combine the results into a raw variant call set
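The last four outcomes describe a bootstrapped recalibration: without a curated set of known variants, a first round of raw calls stands in as the sites to exclude when modelling base quality errors. A minimal sketch of that ordering follows, assuming GATK4 is the caller (this page does not name the tool) and using hypothetical sample and file names. In practice the raw calls are often filtered before being used as known sites, a step omitted here.

```python
def bqsr_bootstrap_plan(samples, ref, outdir="results"):
    """Build (but do not run) the commands for bootstrapped BQSR and genotyping.

    Assumptions: GATK4 command names and options, duplicate-marked BAM files
    named <outdir>/<sample>.dedup.bam, and an indexed reference `ref`.
    """
    plan = []
    gvcfs = []
    for sample in samples:
        bam = f"{outdir}/{sample}.dedup.bam"
        raw_vcf = f"{outdir}/{sample}.raw.vcf.gz"
        table = f"{outdir}/{sample}.recal.table"
        recal_bam = f"{outdir}/{sample}.recal.bam"
        gvcf = f"{outdir}/{sample}.g.vcf.gz"
        plan += [
            # Raw calls: the sites to exclude from recalibration
            ["gatk", "HaplotypeCaller", "-R", ref, "-I", bam, "-O", raw_vcf],
            # Model base quality errors outside the raw call sites
            ["gatk", "BaseRecalibrator", "-R", ref, "-I", bam,
             "--known-sites", raw_vcf, "-O", table],
            # Write a BAM with recalibrated base qualities
            ["gatk", "ApplyBQSR", "-R", ref, "-I", bam,
             "--bqsr-recal-file", table, "-O", recal_bam],
            # Call again on recalibrated data, in GVCF mode for joint genotyping
            ["gatk", "HaplotypeCaller", "-R", ref, "-I", recal_bam,
             "-O", gvcf, "-ERC", "GVCF"],
        ]
        gvcfs.append(gvcf)

    # Combine per-sample GVCFs, then genotype all samples together
    combined = f"{outdir}/combined.g.vcf.gz"
    combine = ["gatk", "CombineGVCFs", "-R", ref]
    for gvcf in gvcfs:
        combine += ["-V", gvcf]
    combine += ["-O", combined]
    plan.append(combine)
    plan.append(["gatk", "GenotypeGVCFs", "-R", ref, "-V", combined,
                 "-O", f"{outdir}/raw_variants.vcf.gz"])
    return plan
```

Each sample contributes four commands, followed by one combine step and one genotyping step for the whole cohort.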
Listing
Title | Description |
---|---|
Variant calling introduction | Introduction to variant calling and the command line interface. |
Data quality control | Introduction to the command line interface. Preparation of data, raw data quality control and filtering for downstream analyses. |
Read mapping and duplicate removal | Read mapping to reference sequence and removal of duplicate reads. |
Variant calling workflow | Perform variant calling and genotyping. Introduction to workflow manager systems. |
Additional material
- Variant calling, long description: Describes all steps of a standard variant calling workflow, from data preparation to final summary QC. All commands are run manually, without the aid of a workflow manager. From an earlier course round.