Welcome to Friday! Almost there…
In your group, discuss:
- yesterday’s questions (below); can you agree on the answers?
- anything that you found confusing or useful yesterday
Then post your answers and comments on our whiteboard.
Thursday’s questions
- Why should you use a workflow manager for your analysis instead of manually running tools?
  - More accurate results
  - Better documentation of results
  - To avoid accidental mistakes
  - Better reproducibility / portability of analysis
- What is the difference between single hyphens and double hyphens in a Nextflow command?
  - There is no difference
  - Single hyphens denote core Nextflow options, double are pipeline parameters
  - Double hyphens denote core Nextflow options, single are pipeline parameters
  - Depends on the pipeline
- Which two commands are useful for re-running a failed pipeline?
  - nextflow pull
  - nextflow log
  - nextflow run -name
  - nextflow run -resume
- Which peak caller does the nf-core/chipseq pipeline use?
  - MACS2
  - Genrich
  - SEACR
  - All of the above
- What is the name given to the new, modular language implemented within Nextflow?
  - Nextflow 2.0
  - Nextflow modules
  - DSL2
  - nf-core/modules
- Which QC metric is not generated by the nf-core/chipseq pipeline?
  - Strand cross-correlation plots
  - Normalized strand coefficient & Relative strand correlation ratios
  - TSS enrichment plots
  - Gene count suitability plots
- Why does the nf-core/atacseq pipeline merge peak sets across technical and biological replicates?
  - To find a consensus set of peaks for downstream analysis
  - To give the best genome coverage possible
  - To reduce background noise in the data
  - To make the pipeline run faster
- Why are mitochondrial reads removed in the nf-core/atacseq pipeline?
  - To get a better idea of the genome-wide duplication rate
  - To avoid bias in downstream analysis
  - Because some tissues and cell types can give libraries with a large fraction of mito reads
  - All of the above