Population Genomics in Practice 2025

Selection and fitness

Figure 1: The life cycle used in the fundamental model of selection (Gillespie, 2004, Figure 3.2)

Much confusion exists in the literature regarding how various types of selection are defined, in particular because some of the terminology is used slightly differently within different scientific communities (Nielsen, 2005).

\begin{matrix} \mathrm{Genotype} & AA & Aa & aa \\ \mathrm{Frequency\ in\ newborns} & p^2 & 2pq & q^2\\ \mathrm{Viability} & w_{AA} & w_{Aa} & w_{aa}\\ \mathrm{Frequency\ after\ selection} & p^2w_{AA} / \bar{w} & 2pqw_{Aa} / \bar{w} & q^2w_{aa} / \bar{w} \\ \mathrm{Relative\ fitness} & 1 & 1-hs & 1-s\\ \end{matrix}

where \bar{w} = p^2w_{AA} + 2pqw_{Aa} + q^2w_{aa} is the mean fitness.

h=0	A dominant, a recessive
h=1	a dominant, A recessive
0<h<1	incomplete dominance
h<0	overdominance (heterozygote advantage)
h>1	underdominance

Notation follows Gillespie (2004), pp. 61–64.
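
As a minimal sketch of the table above, the following Python function applies one round of viability selection to Hardy-Weinberg newborn frequencies, using the relative fitnesses 1, 1-hs, and 1-s. The function name `select` and the example values of p, h, and s are illustrative assumptions and do not come from Gillespie (2004).

```python
def select(p: float, h: float, s: float) -> dict:
    """Genotype frequencies after one round of viability selection."""
    q = 1.0 - p
    # Relative fitnesses of AA, Aa, aa, as in the last row of the table.
    w = {"AA": 1.0, "Aa": 1.0 - h * s, "aa": 1.0 - s}
    # Hardy-Weinberg frequencies in newborns.
    newborn = {"AA": p * p, "Aa": 2.0 * p * q, "aa": q * q}
    # Mean fitness: fitness-weighted sum over genotypes.
    w_bar = sum(newborn[g] * w[g] for g in w)
    after = {g: newborn[g] * w[g] / w_bar for g in w}
    return {"w_bar": w_bar, "after": after}


# Illustrative values only: incomplete dominance (h = 0.5), s = 0.1.
result = select(p=0.5, h=0.5, s=0.1)
after = result["after"]
# Frequency of A among survivors: all of AA plus half of Aa.
p_next = after["AA"] + 0.5 * after["Aa"]
print(result["w_bar"], p_next)
```

With these values A is the favored allele, so its frequency among survivors is slightly above the starting value of 0.5.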

The most important equation in population genetics

p^\prime - p = \Delta_sp = \frac{pq[p(w_{AA} - w_{Aa}) + q(w_{Aa} - w_{aa})]}{p^2w_{AA} + 2pqw_{Aa} + q^2w_{aa}}

Figure 2: Allele frequency change over time for directional, balancing, and disruptive selection, for different values of p_0.

Figure 3: Rate of allele frequency change as a function of allele frequency for directional, balancing, and disruptive selection.

Given the genotype fitnesses (viabilities), we can work out the change in allele frequency p^\prime - p between successive generations as a function of the viabilities (or relative fitnesses) of the different genotypes, as in the equation above. It is instructive to study the equation and plot trajectories to gain an intuition for how allele frequencies evolve under the different selection regimes.
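
One way to build that intuition is to iterate the equation numerically. The sketch below implements \Delta_sp exactly as written above and follows the allele frequency for three sets of fitness values. The fitness values, the starting frequencies, and the number of generations are assumptions chosen for illustration; they are not the values behind Figures 2 and 3.

```python
def delta_p(p, w_AA, w_Aa, w_aa):
    """Change in the frequency of A over one generation of selection."""
    q = 1.0 - p
    w_bar = p * p * w_AA + 2.0 * p * q * w_Aa + q * q * w_aa
    return p * q * (p * (w_AA - w_Aa) + q * (w_Aa - w_aa)) / w_bar


def trajectory(p0, fitness, generations):
    """Allele frequencies p_0, p_1, ..., p_generations."""
    freqs = [p0]
    for _ in range(generations):
        p = freqs[-1]
        freqs.append(p + delta_p(p, *fitness))
    return freqs


# (w_AA, w_Aa, w_aa); hypothetical values, one set per selection regime.
regimes = {
    "directional": (1.0, 0.95, 0.9),
    "balancing": (0.7, 1.0, 0.9),
    "disruptive": (1.0, 0.8, 1.0),
}

for name, fitness in regimes.items():
    for p0 in (0.1, 0.9):
        final = trajectory(p0, fitness, 2000)[-1]
        print(f"{name:12s} p0={p0:.1f} -> p={final:.3f}")
```

Under these assumed fitnesses the directional runs approach p = 1 from both starting points, the balancing runs converge on the same intermediate frequency, and the disruptive runs go to 0 or 1 depending on which side of the unstable point they start.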

Figure 2 shows that the three modes of selection have different equilibrium points: under directional selection the favored allele eventually reaches a frequency of 100% (fixation), balancing selection has a stable equilibrium point (attractor) at a given allelic ratio (here 25:75), and disruptive selection has an unstable equilibrium point (repeller) at a given allelic ratio.

Figure 3 shows the rate of change as a function of allele frequency. Note how the direction of change switches sign at the equilibrium point for balancing and disruptive selection, whereas under directional selection it keeps the same sign at all intermediate frequencies.
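
The interior equilibrium can be located directly: setting the bracketed term of \Delta_sp to zero gives \hat{p} = (w_{Aa} - w_{aa}) / (2w_{Aa} - w_{AA} - w_{aa}). The sketch below computes \hat{p} and checks its stability from the sign of \Delta_sp on either side. The fitness values are hypothetical, chosen so that the balancing case yields \hat{p} = 0.25; the values used for the figures are not given in the text.

```python
def delta_p(p, w_AA, w_Aa, w_aa):
    """Change in the frequency of A over one generation of selection."""
    q = 1.0 - p
    w_bar = p * p * w_AA + 2.0 * p * q * w_Aa + q * q * w_aa
    return p * q * (p * (w_AA - w_Aa) + q * (w_Aa - w_aa)) / w_bar


def interior_equilibrium(w_AA, w_Aa, w_aa):
    """Equilibrium frequency strictly between 0 and 1, or None if absent."""
    denom = 2.0 * w_Aa - w_AA - w_aa
    if denom == 0.0:
        return None
    p_hat = (w_Aa - w_aa) / denom
    return p_hat if 0.0 < p_hat < 1.0 else None


def is_stable(p_hat, fitness, eps=1e-3):
    """Stable if p increases just below p_hat and decreases just above it."""
    below = delta_p(p_hat - eps, *fitness)
    above = delta_p(p_hat + eps, *fitness)
    return below > 0.0 and above < 0.0


balancing = (0.7, 1.0, 0.9)     # heterozygote fittest (assumed values)
disruptive = (1.0, 0.8, 1.0)    # heterozygote least fit (assumed values)
directional = (1.0, 0.95, 0.9)  # A favored, additive (assumed values)

for name, fitness in [("balancing", balancing), ("disruptive", disruptive)]:
    p_hat = interior_equilibrium(*fitness)
    print(name, round(p_hat, 3), "stable" if is_stable(p_hat, fitness) else "unstable")

print("directional", interior_equilibrium(*directional))
```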

Selection and drift - population size matters

Figure 4: The fixation probability relative to the neutral probability of fixation (p=1/2N) under the assumption s<0.1. Red highlights region where |N_es|<0.05. Adapted from Lynch (2007), Fig. 4.2.

In the red region (|N_es|<0.05) the probability of fixation is within about 10% of the neutral fixation probability.

Consequence: for any population size there exists a range of selection coefficients within which mutant alleles behave as if they were neutral (effective neutrality).
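
The 10% figure can be checked with a short calculation. The sketch below assumes Kimura's diffusion approximation for a new mutation with additive effect, N_e = N, and small s, under which the fixation probability relative to the neutral value 1/2N is 4N_es / (1 - e^{-4N_es}). This parameterization is an assumption made here for illustration and may differ in detail from the one behind Figure 4.

```python
import math


def relative_fixation_probability(ne_s: float) -> float:
    """Fixation probability divided by the neutral value 1/(2N).

    Assumes the diffusion approximation with N_e = N and small s.
    """
    x = 4.0 * ne_s
    if x == 0.0:
        return 1.0  # neutral limit
    # -expm1(-x) equals 1 - exp(-x) and is accurate for small |x|.
    return x / -math.expm1(-x)


for ne_s in (-0.05, 0.0, 0.05, 1.0):
    print(f"N_e*s = {ne_s:+.2f}: {relative_fixation_probability(ne_s):.3f}")
```

At the edges of the region |N_es| = 0.05 the ratio is roughly 0.90 and 1.10, whereas at N_es = 1 an advantageous mutation is already about four times more likely to fix than a neutral one.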

Direct selection can be inferred from protein substitutions

For genes, the ratio of nonsynonymous to synonymous substitutions can tell us about protein evolution:

Synonymous substitution

Protein          L
DNA         --- CTT ---
                  *
DNA         --- CTC ---
Protein          L

Nonsynonymous substitution

Protein          L
DNA         --- CTT ---
                 *
DNA         --- CAT ---
Protein          H
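
The two cases can be reproduced by translating the codons before and after the change. The sketch below builds the standard genetic code from its usual compact representation (bases in TCAG order) and classifies a codon change as synonymous or nonsynonymous. The function name is an illustrative assumption.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def classify_substitution(codon_before: str, codon_after: str):
    """Return (amino acid before, amino acid after, kind of substitution)."""
    aa_before = CODON_TABLE[codon_before]
    aa_after = CODON_TABLE[codon_after]
    kind = "synonymous" if aa_before == aa_after else "nonsynonymous"
    return aa_before, aa_after, kind


print(classify_substitution("CTT", "CTC"))  # third-position change
print(classify_substitution("CTT", "CAT"))  # second-position change
```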

\mathbf{d_N/d_S << 1}: negative (purifying) selection
\mathbf{d_N/d_S < 1}: majority of nonsynonymous mutations deleterious, some advantageous
\mathbf{d_N/d_S = 1}: neutral, or a mix of neutral, advantageous, and deleterious mutations
\mathbf{d_N/d_S > 1}: positive selection

Figure 5: d_N/d_S comparisons for human-rat orthologs. For most genes, d_N/d_S << 1, indicating purifying selection. A handful of genes (n=9) have d_N/d_S > 1.0, which could indicate positive selection.

Not all mutations fall in genes. Methods for detecting direct selection are not applicable to studying selection on a single mutation or, e.g., balancing selection. This requires looking for specific patterns of diversity surrounding the locus under selection.

Linked selection reduces diversity at neighbour loci

Figure 6: A selective sweep of an advantageous mutation (gray dot). Adapted from Charlesworth & Charlesworth (2010), Fig. 8.13

Example of a selective sweep. If a sweep completes at a locus, the locus becomes monomorphic, as do the neighbouring sites. Mutation can reintroduce variation. Recombination can increase diversity in the neighbourhood, but in a manner that depends on the distance from the locus under selection.

The effect of a selective sweep on diversity

Code

pgip-slim --seed 42 -n 1000 -r 1e-6 -m 1e-7 --threads 12 recipes/slim/selective_sweep.slim -l 1000000 --outdir results/slim
pgip-tsstat results/slim/slim*.trees -n 10 --seed 31 -s pi -s S -s TajD -w 500 --threads 10 | gzip -v - > results/slim/selective_sweep.w500.csv.gz
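
The statistic names passed with -s above (pi, S, TajD) presumably refer to nucleotide diversity, the number of segregating sites, and Tajima's D. As a sketch of what these quantities measure, the function below computes them from a small matrix of 0/1 haplotypes using the standard textbook formulas. It is not the pgip-tsstat implementation, and the toy data are made up for illustration.

```python
import math


def diversity_stats(haplotypes):
    """Return (pi, S, Tajima's D) for equal-length 0/1 haplotypes.

    Tajima's D needs at least 4 haplotypes; for smaller samples its
    variance term is zero.
    """
    n = len(haplotypes)
    n_sites = len(haplotypes[0])
    # Number of copies of allele 1 at each site.
    counts = [sum(h[i] for h in haplotypes) for i in range(n_sites)]
    segregating = [k for k in counts if 0 < k < n]
    S = len(segregating)
    # pi: mean number of pairwise differences, summed over sites.
    pi = sum(2.0 * k * (n - k) / (n * (n - 1)) for k in segregating)
    if S == 0 or n < 4:
        return pi, S, float("nan")
    # Constants from Tajima (1989).
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    # Difference between pi and Watterson's estimate S/a1, standardized.
    D = (pi - S / a1) / math.sqrt(e1 * S + e2 * S * (S - 1))
    return pi, S, D


# Toy sample: 4 haplotypes, 3 sites (made-up data).
sample = [
    [0, 0, 1],
    [0, 1, 1],
    [1, 0, 0],
    [0, 0, 0],
]
pi, S, D = diversity_stats(sample)
print(pi, S, D)
```

A recent sweep is expected to lower pi and S near the selected site and to push Tajima's D towards negative values, because the variants that arise after the sweep are mostly rare.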

The phases of a selective sweep

Figure 10: Time goes from left to right. As the sweep progresses, the tree topology changes. Adapted from Hahn (2019), Figure 8.1

The amount of diversity remaining after a sweep depends on the fixation time. A neutral mutation that fixes takes on average 4N_e generations to do so; an advantageous mutation with s=0.0001 takes approximately 0.29N_e generations.
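
The figure of 0.29N_e can be reproduced with a common approximation for the fixation time of an additive advantageous allele, t \approx 2\ln(2N_e)/s generations. Expressed in units of N_e the result depends on N_e itself; the value N_e = 10^6 used below is an assumption that reproduces the number quoted above and is not stated in the text.

```python
import math


def sweep_fixation_time(ne: float, s: float) -> float:
    """Approximate fixation time in generations: 2 * ln(2 * N_e) / s."""
    return 2.0 * math.log(2.0 * ne) / s


ne = 1_000_000  # assumed effective population size (not given in the text)
s = 0.0001

t_selected = sweep_fixation_time(ne, s)
t_neutral = 4 * ne  # mean fixation time of a neutral mutation that fixes

print(f"selected: {t_selected / ne:.2f} N_e generations")
print(f"neutral is {t_neutral / t_selected:.1f} times slower")
```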

Selection changes the genealogy (different topology, shorter branches), an aspect used in many linkage-based tests for selection.

Linked selection may constrain levels of diversity

Hitchhiking

alleles linked to locus under selection “hitchhike” to high frequencies (Smith & Haigh, 1974)
evidence: positive correlation between putatively neutral diversity and recombination rate (Corbett-Detig et al., 2015)

Background selection

alleles at loci linked to a deleterious mutation are purged from the population together with it, which reduces diversity (Charlesworth et al., 1993)
similar patterns to hitchhiking

Summary

We have looked at the Wright-Fisher model as a model of populations and genealogies

Genetic drift moves allele frequencies up and down at random and removes variation at rate \propto 1/2N
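
The rate of loss can be made concrete with the standard Wright-Fisher expectation that heterozygosity declines by a factor (1 - 1/2N) per generation, so that H_t = H_0(1 - 1/2N)^t. The sketch below evaluates this expectation; the population size and starting heterozygosity are arbitrary illustrative values.

```python
import math


def expected_heterozygosity(h0: float, n: int, t: int) -> float:
    """Expected heterozygosity after t generations of drift (diploid size n)."""
    return h0 * (1.0 - 1.0 / (2.0 * n)) ** t


n = 100   # illustrative population size
h0 = 0.5  # illustrative starting heterozygosity

# Heterozygosity halves after roughly 2N * ln(2) generations.
half_life = round(2 * n * math.log(2))
for t in (0, 50, half_life, 500):
    print(t, round(expected_heterozygosity(h0, n, t), 4))
```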

Mutation reintroduces variation. The neutral theory posits that most mutations are neutral and that their dynamics follow mutation-drift equilibrium.

Methods to detect selection are based on direct selection or studying patterns of variation caused by linked selection.

Bibliography

Charlesworth, B., & Charlesworth, D. (2010). Elements of Evolutionary Genetics. Roberts and Company Publishers.

Charlesworth, B., Morgan, M. T., & Charlesworth, D. (1993). The Effect of Deleterious Mutations on Neutral Molecular Variation. Genetics, 134(4), 1289–1303.

Corbett-Detig, R. B., Hartl, D. L., & Sackton, T. B. (2015). Natural Selection Constrains Neutral Diversity across A Wide Range of Species. PLOS Biology, 13(4), e1002112. https://doi.org/10.1371/journal.pbio.1002112

Gillespie, J. H. (2004). Population Genetics: A Concise Guide (2nd edition). Johns Hopkins University Press.

Hahn, M. (2019). Molecular Population Genetics (1st edition). Oxford University Press.

Lynch, M. (2007). The Origins of Genome Architecture. Sinauer Associates.

Nielsen, R. (2005). Molecular Signatures of Natural Selection. Annual Review of Genetics, 39(1), 197–218. https://doi.org/10.1146/annurev.genet.39.073003.112420

Smith, J. M., & Haigh, J. (1974). The hitch-hiking effect of a favourable gene. Genetics Research, 23(1), 23–35. https://doi.org/10.1017/S0016672300014634