This is a team project, so split up the workload as you see fit.
You have three datasets from the E. coli K-12 substrain MG1655, sequenced using Illumina, PacBio, and Nanopore.
The aim is for you to explore different assemblers and see what you get. It will be impossible to
evaluate every combination, so choose your tasks wisely. Document your commands and share them with
each other.
Working directories have been created for each team:
Illumina data.
Subsample the data to 50x, 150x, and 250x coverage.
Produce summaries of the subsampled data.
Assemble the data using SPAdes, ABySS, and MaSuRCA.
Polish using Pilon or Racon.
Evaluate the assemblies with QUAST, BUSCO, KAT, Bandage, and FRC.
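The subsampling step comes down to simple arithmetic: target coverage times genome size gives the number of bases to keep. Here is a minimal Python sketch of that calculation. The genome size of about 4.64 Mb for MG1655 is an assumption you may want to adjust, and the read length and total number of pairs in the example are hypothetical placeholders, so replace them with the values from your own data summary.

```python
import math

# Approximate MG1655 chromosome length in bp (assumption; adjust if needed).
GENOME_SIZE = 4_641_652


def pairs_for_coverage(target_cov, read_length, genome_size=GENOME_SIZE):
    """Number of read pairs needed to reach target_cov with paired reads of read_length bp."""
    bases_needed = target_cov * genome_size
    # Each pair contributes two reads of read_length bases.
    return math.ceil(bases_needed / (2 * read_length))


def sampling_fraction(target_cov, total_pairs, read_length, genome_size=GENOME_SIZE):
    """Fraction of the full dataset to keep, capped at 1.0 if the data are too shallow."""
    needed = pairs_for_coverage(target_cov, read_length, genome_size)
    return min(1.0, needed / total_pairs)


if __name__ == "__main__":
    # Hypothetical dataset: 2 x 150 bp reads, 5 million pairs in total.
    for cov in (50, 150, 250):
        n = pairs_for_coverage(cov, read_length=150)
        f = sampling_fraction(cov, total_pairs=5_000_000, read_length=150)
        print(f"{cov}x: {n} pairs, fraction {f:.3f}")
```

The resulting fraction or pair count can be given to whichever subsampling tool you choose. Use the same random seed for both read files so that the pairs stay in sync.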
Running SPAdes example:
Running ABySS example:
Running MaSuRCA:
Running Pilon:
PacBio data.
Subsample the data to 10x, 30x, and 70x coverage.
Produce summaries of the subsampled data.
Assemble the data using Canu, Miniasm, and wtdbg2.
Polish with Arrow or Racon.
Evaluate the assemblies with QUAST, BUSCO, and Bandage.
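For long reads, a useful summary of a subsample includes the number of reads, total bases, mean read length, read N50, and the implied coverage. The sketch below shows one way to compute these in plain Python. It assumes an uncompressed FASTQ with exactly four lines per record and a genome size of about 4.64 Mb; the function names are illustrative and do not come from any particular tool.

```python
def read_lengths(fastq_lines):
    """Yield the sequence length of each record in a 4-line-per-record FASTQ stream."""
    for i, line in enumerate(fastq_lines):
        if i % 4 == 1:  # the second line of each record holds the sequence
            yield len(line.strip())


def n50(lengths):
    """Length L such that reads of length >= L contain at least half of all bases."""
    lengths = list(lengths)
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty input


def summarize(lengths, genome_size=4_641_652):
    """Return basic read statistics; genome_size is an assumed value for MG1655."""
    lengths = list(lengths)
    total = sum(lengths)
    return {
        "reads": len(lengths),
        "bases": total,
        "mean_length": total / len(lengths) if lengths else 0.0,
        "n50": n50(lengths),
        "coverage": total / genome_size,
    }
```

With a real file, you would open it and pass the file handle to `read_lengths`, then pass the result to `summarize`. Comparing the reported coverage with the 10x, 30x, and 70x targets is a quick check that the subsampling worked.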
Running Canu:
Running Minimap:
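Miniasm builds its layout from all-vs-all read overlaps computed with minimap2 and writes the result as a GFA graph, not as FASTA. Bandage can open the GFA directly, but polishing and evaluation tools such as Racon and QUAST expect FASTA, so the segment sequences need to be extracted first. The following is a small Python sketch of that conversion, not the course's original command; the function name is hypothetical.

```python
def gfa_to_fasta(gfa_lines):
    """Convert GFA segment (S) lines to FASTA text, skipping segments without sequence."""
    records = []
    for line in gfa_lines:
        fields = line.rstrip("\n").split("\t")
        # GFA S-lines hold: record type, segment name, sequence, then optional tags.
        if len(fields) >= 3 and fields[0] == "S" and fields[2] != "*":
            records.append(f">{fields[1]}\n{fields[2]}\n")
    return "".join(records)
```

To use it, read the GFA file line by line, pass the lines to `gfa_to_fasta`, and write the returned text to a FASTA file.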
Running Wtdbg2:
Running Racon:
Nanopore data.
Subsample the data to 10x, 30x, and 70x coverage.
Produce summaries of the subsampled data.
Assemble the data using Canu, Miniasm, and wtdbg2.
Polish with Medaka or Racon.
Evaluate the assemblies with QUAST, BUSCO, and Bandage.
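Long reads vary a lot in length, so picking a fixed number of reads gives only a rough coverage. One alternative is to shuffle the reads and keep adding them until the total number of bases reaches the target. The sketch below illustrates that idea in plain Python under some assumptions: reads are already loaded as (name, sequence) tuples, everything fits in memory (reasonable for an E. coli dataset), and the genome size is about 4.64 Mb. The function name and seed are illustrative.

```python
import random


def subsample_to_coverage(records, target_cov, genome_size=4_641_652, seed=42):
    """Randomly pick reads until their combined length reaches target_cov * genome_size.

    records is a list of (name, sequence) tuples; returns the selected subset.
    If the data are too shallow, all reads are returned.
    """
    target_bases = target_cov * genome_size
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the subsample reproducible
    selected, bases = [], 0
    for name, seq in shuffled:
        if bases >= target_bases:
            break
        selected.append((name, seq))
        bases += len(seq)
    return selected
```

Because selection stops only after the target is reached, the final coverage slightly overshoots the target by at most one read. Record the seed along with your commands so that teammates can reproduce the same subsample.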
Running Canu:
Running Medaka:
How to load the tools.
You have already used most of the tools needed for this task. Here is how to
load the tools you have not yet encountered.