Selection

Per Unneberg


Selection and fitness

Figure 1: The life cycle used in the fundamental model of selection (Gillespie, 2004, Figure 3.2)

Much confusion exists in the literature regarding how various types of selection are defined, in particular because some of the terminology is used slightly differently within different scientific communities (Nielsen, 2005).

\begin{matrix} \mathrm{Genotype} & AA & Aa & aa \\ \mathrm{Frequency\ in\ newborns} & p^2 & 2pq & q^2\\ \mathrm{Viability} & w_{AA} & w_{Aa} & w_{aa}\\ \mathrm{Frequency\ after\ selection} & p^2w_{AA} / \bar{w} & 2pqw_{Aa} / \bar{w} & q^2w_{aa} / \bar{w} \\ \mathrm{Relative\ fitness} & 1 & 1-hs & 1-s\\ \end{matrix}

where \bar{w} = p^2w_{AA} + 2pqw_{Aa} + q^2w_{aa} is the mean fitness.
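As a minimal sketch of the table above, the following Python function applies viability selection to Hardy-Weinberg newborn frequencies and returns the mean fitness together with the genotype frequencies after selection. The function name and the example viabilities are illustrative assumptions, not values taken from the source.

```python
def select_genotypes(p, w_AA, w_Aa, w_aa):
    """Return (mean fitness, genotype frequencies after viability selection)."""
    q = 1.0 - p
    # Hardy-Weinberg genotype frequencies in newborns
    newborn = {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}
    viability = {"AA": w_AA, "Aa": w_Aa, "aa": w_aa}
    # Mean fitness: viabilities weighted by newborn frequencies
    w_bar = sum(newborn[g] * viability[g] for g in newborn)
    # Frequency after selection: newborn frequency times viability, normalised
    after = {g: newborn[g] * viability[g] / w_bar for g in newborn}
    return w_bar, after


# Hypothetical example: p = 0.5 and viabilities 1, 0.9, 0.8
w_bar, after = select_genotypes(0.5, 1.0, 0.9, 0.8)
print(round(w_bar, 3), {g: round(f, 3) for g, f in after.items()})
```

Dividing by the mean fitness is what makes the three frequencies after selection sum to one.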

h=0 A dominant, a recessive
h=1 a dominant, A recessive
0<h<1 incomplete dominance
h<0 overdominance (heterozygote advantage)
h>1 underdominance

Notation follows Gillespie (2004), pp. 61–64.

The most important equation in population genetics

p^\prime - p = \Delta_sp = \frac{pq[p(w_{AA} - w_{Aa}) + q(w_{Aa} - w_{aa})]}{p^2w_{AA} + 2pqw_{Aa} + q^2w_{aa}}
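The expression follows from the genotype frequencies after selection: every surviving AA individual carries only A alleles, and every surviving Aa individual carries A in half of its gene copies. A short derivation, using only the symbols defined above and q = 1 - p:

p^\prime = \frac{p^2w_{AA} + pqw_{Aa}}{\bar{w}}

\begin{aligned} p^\prime - p &= \frac{p^2w_{AA} + pqw_{Aa} - p\bar{w}}{\bar{w}} \\ &= \frac{p^2w_{AA}(1-p) + pqw_{Aa}(1-2p) - pq^2w_{aa}}{\bar{w}} \\ &= \frac{pq[pw_{AA} + (q-p)w_{Aa} - qw_{aa}]}{\bar{w}} \\ &= \frac{pq[p(w_{AA} - w_{Aa}) + q(w_{Aa} - w_{aa})]}{\bar{w}} \end{aligned}

The second step substitutes \bar{w} = p^2w_{AA} + 2pqw_{Aa} + q^2w_{aa}; the third uses 1-p = q and 1-2p = q-p.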

Figure 2: Allele frequency change over time for directional, balancing, and disruptive selection, for different values of p_0.
Figure 3: Rate of allele frequency change as a function of allele frequency for directional, balancing, and disruptive selection.
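The kind of trajectories summarised in Figures 2 and 3 can be sketched by iterating the equation for \Delta_sp with the relative fitnesses 1, 1-hs, and 1-s. The parameter values below (s = 0.1 and one value of h per regime) are hypothetical choices for illustration and are not necessarily those used to draw the figures; the sketch also assumes s > 0.

```python
def delta_p(p, w_AA, w_Aa, w_aa):
    """Change in the frequency of allele A after one generation of selection."""
    q = 1.0 - p
    w_bar = p * p * w_AA + 2 * p * q * w_Aa + q * q * w_aa
    return p * q * (p * (w_AA - w_Aa) + q * (w_Aa - w_aa)) / w_bar


def trajectory(p0, h, s, generations):
    """Iterate p' = p + delta_p using relative fitnesses 1, 1 - hs, 1 - s."""
    freqs = [p0]
    for _ in range(generations):
        p = freqs[-1]
        freqs.append(p + delta_p(p, 1.0, 1.0 - h * s, 1.0 - s))
    return freqs


# Hypothetical dominance coefficients, one per selection regime
regimes = {"directional": 0.5, "balancing": -0.5, "disruptive": 1.5}
for name, h in regimes.items():
    ends = [round(trajectory(p0, h, 0.1, 2000)[-1], 3) for p0 in (0.1, 0.9)]
    print(name, ends)
```

With these choices, directional selection drives A towards fixation from either starting frequency, balancing selection (h < 0) converges on an internal equilibrium, and disruptive selection (h > 1) fixes or loses A depending on which side of the unstable equilibrium p_0 lies.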

Selection and drift - population size matters

Figure 4: The fixation probability relative to the neutral probability of fixation (p=1/(2N)) under the assumption s<0.1. Red highlights the region where |N_es|<0.05. Adapted from Lynch (2007), Fig. 4.2.

In the red region (|N_es|<0.05), the probability of fixation is within about 10% of the neutral fixation probability.

Consequence: for any population size there exists a range of selection coefficients within which mutant alleles are \approx neutral (effective neutrality).
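The 10% statement can be checked numerically. The sketch below assumes Kimura's diffusion approximation for a new mutation with N_e = N and small s, under which the fixation probability relative to the neutral value 1/(2N) is approximately 4N_es / (1 - e^{-4N_es}). This parameterisation is an assumption made here and may differ in detail from the one behind Figure 4.

```python
import math


def relative_fixation_probability(ne_s):
    """Fixation probability of a new mutation relative to the neutral 1/(2N).

    Assumes Kimura's diffusion approximation with N_e = N and small s,
    giving the ratio 4*N_e*s / (1 - exp(-4*N_e*s)).
    """
    x = 4.0 * ne_s
    if x == 0.0:
        return 1.0  # neutral limit of the ratio
    # expm1(-x) = exp(-x) - 1, so -x / expm1(-x) = x / (1 - exp(-x))
    return -x / math.expm1(-x)


for ne_s in (-1.0, -0.05, 0.0, 0.05, 1.0):
    print(ne_s, round(relative_fixation_probability(ne_s), 3))
```

Under this approximation the ratio depends on s only through the product N_es, which is why the width of the effectively neutral range of s scales inversely with population size.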

Direct selection can be inferred from protein substitutions

For genes, the ratio of nonsynonymous to synonymous substitutions can tell us about protein evolution:

Synonymous substitution

Protein          L
DNA         --- CTT ---
                  *
DNA         --- CTC ---
Protein          L

Nonsynonymous substitution

Protein          L
DNA         --- CTT ---
                 *
DNA         --- CAT ---
Protein          H

\mathbf{d_N/d_S \ll 1}: negative (purifying) selection
\mathbf{d_N/d_S < 1}: majority of nonsynonymous mutations deleterious, some advantageous
\mathbf{d_N/d_S = 1}: neutral, or a mix of neutral, advantageous, and deleterious mutations
\mathbf{d_N/d_S > 1}: positive selection
Figure 5: d_N/d_S comparisons for human-rat orthologs. For most genes, d_N/d_S \ll 1, indicating purifying selection. A handful of genes (n=9) have d_N/d_S > 1.0, which could indicate positive selection.
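The distinction between the two kinds of substitution can be made concrete by translating codons with the standard genetic code. The sketch below classifies the two single-codon changes CTT to CTC and CTT to CAT (leucine to histidine); the function and variable names are illustrative. Estimating d_N/d_S itself additionally requires counting synonymous and nonsynonymous sites and correcting for multiple hits, which this sketch does not attempt.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(codon_before, codon_after):
    """Label a codon change as synonymous or nonsynonymous."""
    aa_before = CODON_TABLE[codon_before]
    aa_after = CODON_TABLE[codon_after]
    kind = "synonymous" if aa_before == aa_after else "nonsynonymous"
    return kind, aa_before, aa_after


print(classify_substitution("CTT", "CTC"))
print(classify_substitution("CTT", "CAT"))
```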

Not all mutations fall in genes. Methods for detecting direct selection are not applicable to studying selection on a single mutation or, e.g., balancing selection. These cases require looking for specific patterns of diversity surrounding the locus under selection.

Linked selection reduces diversity at neighbouring loci

Figure 6: A selective sweep of an advantageous mutation (gray dot). Adapted from Charlesworth & Charlesworth (2010), Fig. 8.13

Example of a selective sweep. If a sweep completes at a locus, that locus becomes monomorphic, as do the neighbouring sites. Mutation can reintroduce variation. Recombination can increase diversity in the neighbourhood, but in a manner that depends on the distance from the locus under selection.

The effect of a selective sweep on diversity

Code
pgip-slim --seed 42 -n 1000 -r 1e-6 -m 1e-7 --threads 12 recipes/slim/selective_sweep.slim -l 1000000 --outdir results/slim
pgip-tsstat results/slim/slim*.trees -n 10 --seed 31 -s pi -s S -s TajD -w 500 --threads 10 | gzip -v - > results/slim/selective_sweep.w500.csv.gz
Figure 7: The effect of a selective sweep on diversity. The arrow points to the site under selection. The y-axis shows Tajima’s D, which is proportional to the difference between two measures of diversity, nucleotide diversity \pi and Watterson’s \theta_W.
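To make the statistic concrete, the following sketch computes \pi, \theta_W, and Tajima's D from a small set of haplotypes coded as 0/1 strings. It is a plain-Python illustration of the standard definition, not the implementation used by the commands above, and the two toy data sets are made up. An excess of rare variants, as expected around a recent sweep, gives \pi < \theta_W and therefore a negative D.

```python
import math


def tajimas_d(haplotypes):
    """Return (pi, theta_W, D) for equal-length 0/1 haplotype strings."""
    n = len(haplotypes)
    n_sites = len(haplotypes[0])
    # Derived-allele count per site; keep only segregating sites
    counts = [sum(int(h[i]) for h in haplotypes) for i in range(n_sites)]
    counts = [k for k in counts if 0 < k < n]
    s = len(counts)
    if s == 0:
        return 0.0, 0.0, float("nan")  # D is undefined without variation
    # Nucleotide diversity: mean number of pairwise differences
    pi = sum(2 * k * (n - k) for k in counts) / (n * (n - 1))
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    theta_w = s / a1  # Watterson's estimator
    # Normalising constants from the standard definition of Tajima's D
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    d = (pi - theta_w) / math.sqrt(e1 * s + e2 * s * (s - 1))
    return pi, theta_w, d


# Made-up toy data: only singletons versus only intermediate-frequency variants
rare = ["1000", "0100", "0010", "0001"]
common = ["1111", "1111", "0000", "0000"]
print(tajimas_d(rare))
print(tajimas_d(common))
```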

The effect of a selective sweep on diversity

Figure 8: The effect of a selective sweep on diversity. The arrow points to the site under selection. The y-axis shows Tajima’s D, which is proportional to the difference between two measures of diversity, nucleotide diversity \pi and Watterson’s \theta_W.

The effect of a selective sweep on diversity

Figure 9: The effect of a selective sweep on diversity. The figure shows the mean of 1000 simulations with the selected locus indicated with an arrow.

The phases of a selective sweep

Figure 10: Time goes from left to right. As the sweep progresses, the tree topology changes. Adapted from Hahn (2019), Figure 8.1

The amount of diversity depends on the fixation time. A neutral allele that reaches fixation takes on average 4N_e generations to do so; an advantageous allele with s=0.0001 takes approximately 0.29N_e generations.
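The two fixation times can be compared with a short calculation. The sketch below uses a commonly quoted approximation for the fixation time of a beneficial allele, 2\ln(2N_e)/s generations. Both the use of this formula and the population size N_e = 10^6 are assumptions made here, since the source states neither; with them, s = 0.0001 reproduces the figure of roughly 0.29N_e.

```python
import math


def sweep_fixation_time(n_e, s):
    """Approximate fixation time in generations of a beneficial allele: 2*ln(2*N_e)/s."""
    return 2.0 * math.log(2.0 * n_e) / s


n_e = 1_000_000  # assumed population size; not stated in the source
t_sel = sweep_fixation_time(n_e, 1e-4)
t_neutral = 4 * n_e  # mean fixation time of a neutral allele that fixes
print(t_sel / n_e, t_neutral / n_e)
```

Because the approximation depends on N_e only logarithmically, the fixation time expressed in units of N_e changes with population size, so the 0.29N_e figure should not be read as general.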

Selection changes the genealogy (different topology, shorter branches), an aspect used in many linkage-based tests for selection.

Linked selection may constrain levels of diversity

Figure 11: Hitchhiking (left) versus background selection (right).

Hitchhiking

Background selection

  • neutral variants linked to a deleterious allele are purged from the population along with it, which reduces diversity (Charlesworth et al., 1993)
  • similar patterns to hitchhiking

Summary

We have looked at the Wright-Fisher model as a model of populations and genealogies.

Genetic drift moves allele frequencies up and down at random and removes variation at rate \propto 1/(2N).

Mutation reintroduces variation. The neutral theory posits that most mutations are neutral and that their dynamics follow mutation-drift equilibrium.

Methods to detect selection are based on direct selection or studying patterns of variation caused by linked selection.
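The statement about drift in the summary can be illustrated with a small sketch: under the Wright-Fisher model, expected heterozygosity decays by a factor (1 - 1/(2N)) per generation, and a single population can be simulated by binomial sampling of 2N gene copies. The function names, the seed, and the parameter values are arbitrary choices for illustration.

```python
import random


def expected_heterozygosity(h0, n, t):
    """Expected heterozygosity after t generations of drift, diploid size n."""
    return h0 * (1.0 - 1.0 / (2 * n)) ** t


def wright_fisher(p0, n, generations, rng):
    """Simulate allele frequency under pure drift by sampling 2n gene copies."""
    p = p0
    freqs = [p]
    for _ in range(generations):
        # Each of the 2n gene copies is allele A with probability p
        copies = sum(rng.random() < p for _ in range(2 * n))
        p = copies / (2 * n)
        freqs.append(p)
    return freqs


rng = random.Random(1)  # arbitrary seed for reproducibility
print(round(expected_heterozygosity(0.5, 100, 139), 3))
print(wright_fisher(0.5, 100, 5, rng))
```

With N = 100, the expected heterozygosity roughly halves in 139 generations, close to 2N\ln 2.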

Bibliography

Charlesworth, B., & Charlesworth, D. (2010). Elements of Evolutionary Genetics. Roberts and Company Publishers.
Charlesworth, B., Morgan, M. T., & Charlesworth, D. (1993). The Effect of Deleterious Mutations on Neutral Molecular Variation. Genetics, 134(4), 1289–1303.
Corbett-Detig, R. B., Hartl, D. L., & Sackton, T. B. (2015). Natural Selection Constrains Neutral Diversity across A Wide Range of Species. PLOS Biology, 13(4), e1002112. https://doi.org/10.1371/journal.pbio.1002112
Gillespie, J. H. (2004). Population Genetics: A Concise Guide (2nd edition). Johns Hopkins University Press.
Hahn, M. (2019). Molecular Population Genetics (1st edition). Oxford University Press.
Lynch, M. (2007). The Origins of Genome Architecture. Sinauer Associates.
Nielsen, R. (2005). Molecular Signatures of Natural Selection. Annual Review of Genetics, 39(1), 197–218. https://doi.org/10.1146/annurev.genet.39.073003.112420
Smith, J. M., & Haigh, J. (1974). The hitch-hiking effect of a favourable gene. Genetics Research, 23(1), 23–35. https://doi.org/10.1017/S0016672300014634