The syllabus for this workshop is as follows.
- Working on the Unix/Linux command line
- Command line navigation and related commands: cd, mkdir, rm, rmdir
- Commonly used Linux tools: cp, mv, tar, less, more, head, tail, nano, grep, top, man
- Wildcards
- Ownership and permissions
- Symbolic links
- Piping commands
- Working on a remote computing cluster
- Logging on to HPC
- Booking resources
- Job templates, submission and queues
- Modules
- Commonly used bioinformatic tools and pipelines
- Working with the Integrative Genomics Viewer (IGV)
- Variant-calling workflow
- Mapping reads to the reference genome
- Variant detection
- VCF file format
- RNA-Seq workflow
- RNA-Seq experimental design and considerations
- QC, mapping and gene expression counts
- Differential gene expression analyses
- Current advances in NGS technologies
After this workshop you should be able to:
- Describe the basic principles of next-generation sequencing (NGS).
- Use the Linux command line interface to manage simple file processing operations and organise directory structures.
- Connect to and work on a remote compute cluster.
- Apply programs in Linux for analysis of NGS data.
- Summarise the applications of current NGS technologies, including the strengths and weaknesses of each approach and when each is appropriate to use.
- Explain common NGS file formats.
- Interpret quality control of NGS reads.
- Explain the steps involved in variant calling using whole-genome sequencing data.
- Independently perform a basic variant calling workflow on example data.
- Explain the steps involved in a differential gene expression workflow using RNA-Seq data.
- Handle raw RNA sequencing data, perform QC, and quantify gene expression.
- Explain the concepts behind differential gene expression analysis.