Population genomics in practice

What is population genomics?

Per Unneberg

Intended learning outcomes

Course

  • Present minimum toolkit of methods that should be known to anyone starting out in population genomics
  • Sufficiently small for one-week workshop

Lecture

Example: Population genetics of the coral Acropora millepora

Motivation: corals are facing hard times and to prevent future losses of coral cover a better understanding of genetics is warranted.

Genome assembly and sampling

Motivation: most analyses require a reference sequence with which to compare resequenced samples

  • Assemble high-quality reference genome
  • Choice of populations, sampling locations

Figure 1: Genome assembly by hybrid sequencing (A) 253 individuals from 12 reefs (B) PCA with environmental and spatial variables (C) phenotype distributions (D)

Fuller et al. (2020)

Describe genetic structure and demographic history

Motivation:

  • address basic question of why genetic structure looks the way it does
  • demographic history may generate signals similar to selection

Figure 2: Variation and demographic history inferred from 44 resequenced individuals. LD decay (r^2) from 1% of markers (A) nucleotide diversity (\pi) in 1 kb windows (B) effective population sizes (\mathrm{N_e}) estimated with PSMC (C)

Fuller et al. (2020)
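
As a rough illustration of the windowed nucleotide diversity (\pi) shown in panel B, the sketch below sums per-site pairwise differences in fixed-size windows. It is a minimal sketch, not the pipeline used in the study: the function names, the toy positions and allele counts, and the assumptions that sites are biallelic, positions are 0-based, and every base in a window is callable are all illustrative.

```python
from collections import defaultdict


def site_diversity(derived_count, n_haplotypes):
    """Average number of pairwise differences at one biallelic site."""
    k, n = derived_count, n_haplotypes
    return 2.0 * k * (n - k) / (n * (n - 1))


def windowed_pi(positions, derived_counts, n_haplotypes, window_size=1000):
    """Return {window_start: pi per base pair}.

    Only windows containing at least one SNP are reported, and every
    base in a window is assumed callable (a simplification).
    """
    totals = defaultdict(float)
    for pos, k in zip(positions, derived_counts):
        start = (pos // window_size) * window_size
        totals[start] += site_diversity(k, n_haplotypes)
    return {start: total / window_size for start, total in sorted(totals.items())}


# Toy data (hypothetical): 5 SNPs typed in 8 haplotypes
positions = [120, 640, 1500, 2750, 2900]
derived_counts = [1, 4, 2, 3, 5]
pi = windowed_pi(positions, derived_counts, n_haplotypes=8)
for start, value in pi.items():
    print(start, round(value, 6))
```

In real data the denominator should be the number of callable sites per window rather than the window size, otherwise \pi is underestimated in poorly covered regions.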

Characterize population structure

Motivation:

  1. identify populations for contrasts in e.g. selection scans
  2. identify admixed individuals that should be removed from analyses
  3. identify barriers to gene flow etc

Figure 3: Characterizing population structure and gene flow across 12 reefs. F_{\mathrm{ST}} as a function of geographic distance (A) PCA from LD-pruned genome-wide SNPs for 44 resequenced samples (B) estimation of relative effective migration surface from LD-pruned and common (MAF > 5%) SNPs (C)

Fuller et al. (2020)
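
To make the F_{\mathrm{ST}} in panel A concrete, the following sketch computes Hudson's estimator as a ratio of averages over sites for two populations. This is one common estimator and not necessarily the one used in the study; the function name, argument names, and toy frequencies are assumptions made for illustration.

```python
def hudson_fst(freqs1, freqs2, n1, n2):
    """Hudson's F_ST for two populations, as a ratio of averages over sites.

    freqs1, freqs2: per-site allele frequencies in each population
    n1, n2: number of sampled gene copies (haplotypes) in each population
    """
    numerator = 0.0
    denominator = 0.0
    for p1, p2 in zip(freqs1, freqs2):
        # Squared frequency difference, corrected for sampling noise
        numerator += (
            (p1 - p2) ** 2
            - p1 * (1 - p1) / (n1 - 1)
            - p2 * (1 - p2) / (n2 - 1)
        )
        # Probability that two copies, one from each population, differ
        denominator += p1 * (1 - p2) + p2 * (1 - p1)
    if denominator == 0:
        raise ValueError("no variable sites between the two populations")
    return numerator / denominator


# Toy frequencies (hypothetical) at four SNPs, 20 gene copies per population
reef_a = [0.10, 0.50, 0.80, 0.30]
reef_b = [0.15, 0.45, 0.75, 0.35]
print(round(hudson_fst(reef_a, reef_b, n1=20, n2=20), 4))
```

Values near zero (or slightly negative, owing to the sampling correction) indicate little differentiation; a value of 1 means the populations are fixed for different alleles.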

Genomic scans for selection

Motivation: identify loci associated with adaptation / selection

  • little differentiation among reefs, despite differing thermal regimes
  • genomic scan for \pi (diversity) outliers
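
An outlier scan of this kind flags windows whose statistic falls in the extreme tail of the genome-wide distribution. The sketch below shows the idea with a rank-based cutoff; the function name, the `fraction` parameter, and the toy values are hypothetical, and an empirical cutoff by itself does not establish that selection is the cause.

```python
import math


def top_fraction_outliers(values, fraction=0.0001):
    """Return the indices of windows whose value lies in the top `fraction`.

    At least one window is always returned; ties are broken by window order.
    """
    n_top = max(1, math.ceil(len(values) * fraction))
    ranked = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    return sorted(ranked[:n_top])


# Toy per-window diversity values (hypothetical); flag the top 25% here
pi_values = [0.1, 0.5, 0.2, 0.9, 0.3, 0.1, 0.2, 0.4]
print(top_fraction_outliers(pi_values, fraction=0.25))
```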

Figure 4: Genomic scans for local adaptation detect a signal at sacsin. Nucleotide diversity (\pi) in 1 kb windows, values in top 0.01% genome-wide in red (A, top). Close-up view around sacsin gene, with predicted gene structure above (A, bottom). H12 summary statistic, based on the combined frequency of the two most common haplotypes (B). sacsin gene tree (blue) together with random gene trees, indicating deep genealogy at sacsin (C)

Fuller et al. (2020)
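
The H12 statistic in panel B can be illustrated with a short sketch. H12 pools the two most common haplotype frequencies before squaring, H12 = (p_1 + p_2)^2 + \sum_{i>2} p_i^2. The function name and the toy haplotypes below are assumptions for illustration; real scans compute H12 in sliding windows of phased SNPs.

```python
from collections import Counter


def h12(haplotypes):
    """H12 haplotype homozygosity for the haplotypes observed in one window."""
    counts = Counter(haplotypes)
    n = len(haplotypes)
    freqs = sorted((c / n for c in counts.values()), reverse=True)
    top_two = sum(freqs[:2])  # pooled frequency of the two most common haplotypes
    return top_two ** 2 + sum(f * f for f in freqs[2:])


# Toy window (hypothetical): 10 phased haplotypes over three SNPs
window = ["AAT"] * 4 + ["AGT"] * 4 + ["CGT"] * 2
print(round(h12(window), 2))
```

High H12 means one or two haplotypes dominate the window; low values mean many haplotypes at similar frequencies.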

Example: Population genetics of the coral Acropora millepora

The study highlights common analyses in a population genomics study:

  1. Genome assembly, resequencing, variant calling and filtering
  2. Description of variation (e.g., \pi) and genetic structure (LD)
  3. Description of population structure (admixture, PCA)
  4. Modelling of demographic history (PSMC)
  5. Genome scans for adaptive traits

Population genetics

  • Mutation
  • Selection
  • Recombination
  • Drift
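
How three of these forces interact at a single locus can be sketched with a small Wright-Fisher simulation. This is a minimal sketch under stated assumptions, not a model from the lecture: the function name, parameters, genic selection scheme, and symmetric mutation are illustrative choices, and recombination is not modelled because only one locus is followed.

```python
import random


def wright_fisher(n_diploid, p0, generations, s=0.0, mu=0.0, seed=None):
    """Simulate the frequency of allele A in a Wright-Fisher population.

    n_diploid: number of diploid individuals (2N gene copies)
    p0: starting frequency of allele A
    s: selection coefficient (relative fitness of A is 1 + s)
    mu: symmetric mutation rate between the two alleles
    """
    rng = random.Random(seed)
    n_copies = 2 * n_diploid
    p = p0
    trajectory = [p]
    for _ in range(generations):
        # Selection: deterministic change in frequency before sampling
        p = p * (1 + s) / (1 + s * p)
        # Mutation: alleles convert into each other at rate mu
        p = p * (1 - mu) + (1 - p) * mu
        # Drift: binomial sampling of 2N gene copies for the next generation
        count = sum(rng.random() < p for _ in range(n_copies))
        p = count / n_copies
        trajectory.append(p)
    return trajectory


# Drift alone versus drift plus positive selection, same starting frequency
neutral = wright_fisher(n_diploid=100, p0=0.1, generations=50, seed=1)
selected = wright_fisher(n_diploid=100, p0=0.1, generations=50, s=0.1, seed=1)
print(neutral[-1], selected[-1])
```

Rerunning with different seeds or smaller population sizes shows how strongly drift can scatter trajectories around the deterministic expectation.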

From population genetics to population genomics

The variable sites at the Drosophila melanogaster ADH locus (Kreitman, 1983)

First study of DNA sequence variation in a natural population. However, it was limited to one locus.

Patterns of polymorphism and divergence (Begun et al., 2007)

Same genus (here Drosophila simulans) but genome-wide. Plots cover all chromosomes, that is, the entire genome.

Numbers of polymorphic and fixed variants (Begun et al., 2007)

Novelty: now possible to do genome-wide characterization of variation in different functional contexts

The technological revolution in sequencing and computing

Figure 5: Sequencing cost ($) per megabase (Wetterstrand, KA)

Moore’s law

Statistical inference in population genomics

The data deluge requires advanced statistical methods and models for inference. Today, data production outpaces theoretical advances. Therefore, take care not to place too much faith in a test or model merely because it explains the data well.

A population genomics study should aim at generating a baseline model that takes into account the processes that shape genetic variation (Johri et al., 2022):

  1. mutation
  2. recombination
  3. gene conversion
  4. purifying selection acting on functional regions and its effects on linked variants (background selection)
  5. genetic drift with demographic history and geographic structure

Applications of population genomics

Conservation genomics (Webster et al., 2023)

Speciation genomics (Stankowski et al., 2019)

Disentangling the forces that create variation (Rodrigues et al., 2024)

Paleogenomics (aDNA) (van der Valk et al., 2021)

Bibliography

Barrera-Redondo, J., Piñero, D., & Eguiarte, L. E. (2020). Genomic, Transcriptomic and Epigenomic Tools to Study the Domestication of Plants and Animals: A Field Guide for Beginners. Frontiers in Genetics, 11.
Begun, D. J., Holloway, A. K., Stevens, K., Hillier, L. W., Poh, Y.-P., Hahn, M. W., Nista, P. M., Jones, C. D., Kern, A. D., Dewey, C. N., Pachter, L., Myers, E., & Langley, C. H. (2007). Population Genomics: Whole-Genome Analysis of Polymorphism and Divergence in Drosophila simulans. PLOS Biology, 5(11), e310. https://doi.org/10.1371/journal.pbio.0050310
Fuller, Z. L., Mocellin, V. J. L., Morris, L. A., Cantin, N., Shepherd, J., Sarre, L., Peng, J., Liao, Y., Pickrell, J., Andolfatto, P., Matz, M., Bay, L. K., & Przeworski, M. (2020). Population genetics of the coral Acropora millepora: Toward genomic prediction of bleaching. Science, 369(6501), eaba4674. https://doi.org/10.1126/science.aba4674
Hahn, M. (2019). Molecular Population Genetics (1st ed.). Oxford University Press.
Hartl, D. L., & Clark, A. G. (1997). Principles of population genetics. Sinauer Associates.
Johri, P., Aquadro, C. F., Beaumont, M., Charlesworth, B., Excoffier, L., Eyre-Walker, A., Keightley, P. D., Lynch, M., McVean, G., Payseur, B. A., Pfeifer, S. P., Stephan, W., & Jensen, J. D. (2022). Recommendations for improving statistical inference in population genomics. PLOS Biology, 20(5), e3001669. https://doi.org/10.1371/journal.pbio.3001669
Kreitman, M. (1983). Nucleotide polymorphism at the alcohol dehydrogenase locus of Drosophila melanogaster. Nature, 304(5925), 412. https://doi.org/10.1038/304412a0
Li, H., & Durbin, R. (2011). Inference of human population history from individual whole-genome sequences. Nature, 475(7357), 493–496. https://doi.org/10.1038/nature10231
Rodrigues, M. F., Kern, A. D., & Ralph, P. L. (2024). Shared evolutionary processes shape landscapes of genomic variation in the great apes. Genetics, 226(4), iyae006. https://doi.org/10.1093/genetics/iyae006
Stankowski, S., Chase, M. A., Fuiten, A. M., Rodrigues, M. F., Ralph, P. L., & Streisfeld, M. A. (2019). Widespread selection and gene flow shape the genomic landscape during a radiation of monkeyflowers. PLOS Biology, 17(7), e3000391. https://doi.org/10.1371/journal.pbio.3000391
Unneberg, P., Larsson, M., Olsson, A., Wallerman, O., Petri, A., Bunikis, I., Vinnere Pettersson, O., Papetti, C., Gislason, A., Glenner, H., Cartes, J. E., Blanco-Bercial, L., Eriksen, E., Meyer, B., & Wallberg, A. (2024). Ecological genomics in the Northern krill uncovers loci for local adaptation across ocean basins. Nature Communications, 15(1), 6297. https://doi.org/10.1038/s41467-024-50239-7
van der Valk, T., Pečnerová, P., Díez-del-Molino, D., Bergström, A., Oppenheimer, J., Hartmann, S., Xenikoudakis, G., Thomas, J. A., Dehasque, M., Sağlıcan, E., Fidan, F. R., Barnes, I., Liu, S., Somel, M., Heintzman, P. D., Nikolskiy, P., Shapiro, B., Skoglund, P., Hofreiter, M., … Dalén, L. (2021). Million-year-old DNA sheds light on the genomic history of mammoths. Nature, 591(7849), 265–269. https://doi.org/10.1038/s41586-021-03224-9
Webster, M. T., Beaurepaire, A., Neumann, P., & Stolle, E. (2023). Population Genomics for Insect Conservation. Annual Review of Animal Biosciences, 11(1), 115–140. https://doi.org/10.1146/annurev-animal-122221-075025
Wetterstrand, K. A. DNA Sequencing Costs: Data from the NHGRI Genome Sequencing Program (GSP). www.genome.gov/sequencingcostsdata