Friday morning session

Welcome to Friday! Almost there…

In your group, discuss:

yesterday’s questions (below); can you agree on the answers?
anything that you found confusing or useful yesterday
and post your answers and comments on our whiteboard

Thursday’s questions

Why should you use a workflow manager for your analysis instead of manually running tools?

More accurate results
Better documentation of results
To avoid accidental mistakes
Better reproducibility / portability of analysis

What is the difference between single hyphens and double hyphens in a Nextflow command?

There is no difference
Single hyphens denote core Nextflow options, double are pipeline parameters
Double hyphens denote core Nextflow options, single are pipeline parameters
Depends on the pipeline

Which two commands are useful for re-running a failed pipeline?

nextflow pull
nextflow log
nextflow run -name
nextflow run -resume

Which peak caller does the nf-core/chipseq pipeline use?

MACS2
Genrich
SEACR
All of the above

What was the name given to the new, modular language implemented within Nextflow?

Nextflow 2.0
Nextflow modules
DSL2
nf-core/modules

Which QC metric is not generated by the nf-core/chipseq pipeline?

Strand cross-correlation plots
Normalized strand cross-correlation coefficient (NSC) and relative strand cross-correlation coefficient (RSC)
TSS enrichment plots
Gene count suitability plots

Why does the nf-core/atacseq pipeline merge peak sets across technical and biological replicates?

To find a consensus set of peaks for downstream analysis
To give the best genome coverage possible
To reduce background noise in the data
To make the pipeline run faster

Why are mitochondrial reads removed in the nf-core/atacseq pipeline?

To get a better idea of the genome-wide duplication rate
To avoid bias in downstream analysis
Because some tissues and cell types can give libraries with a large fraction of mito reads
All of the above