DNA sequencing approaches and data
Scale up and down with a tunable output of up to 6 Tb and 20B single reads in < 2 days.
Up to 2X250 bp read length. Price example: 8,000 SEK total for resequencing 3Gbp genome to 30X
Up to 360 Gb of HiFi reads per day, equivalent to 1,300 human whole genomes per year.
Tens of kilobases long HiFi reads. Price example (Sequel II): ~35kSEK per library and SMRT cell
Despite price drop, still need to make choices regarding depth and breadth of sequencing coverage and number of samples.
Lou et al. (2021)
Our focus will be on Whole Genome reSequencing (WGS), mostly high-coverage.
Allendorf et al. (2022)
-rw-r--r-- 1 runner runner 302K Sep 18 22:10 fastq/PUN-Y-INJ_R1.fastq.gz
-rw-r--r-- 1 runner runner 393K Sep 18 22:10 fastq/PUN-Y-INJ_R2.fastq.gz
Format:
@
)+
)Quality values represent the probability P that the call is incorrect. They are coded as Phred quality scores Q. Here, Q=20 implies 1% probability of error, Q=30 implies 0.1% and so on. Typically you should not rely on quality values below 20.
Q = -10 \log_{10} P
A common way to do QC is with fastqc
:
DNA data