Question 1

What is a quantitative trait locus?

Accepted Answer

A quantitative trait locus, or QTL, is a region of the genome where allelic variation correlates with variation in a continuous phenotype across a population or experimental cross. Unlike Mendelian genes that produce discrete categorical phenotypes (round versus wrinkled), each QTL typically contributes a small fraction of the phenotypic variance, often less than 5 percent, and most traits are influenced by tens to thousands of QTLs acting together. The term applies to any continuous trait: height, blood pressure, blood glucose, crop yield, milk fat percentage, behavioural latency, or even gene expression levels (loci affecting these are called eQTLs). The loci themselves are not different in kind from Mendelian genes; they differ in effect size and in the statistical methodology required to find them.

Question 2

How does QTL mapping work?

Accepted Answer

Classical QTL mapping in experimental species starts from an F2 or recombinant inbred line (RIL) population derived from two divergent parental strains, genotypes each individual at hundreds to thousands of markers spanning the genome, and tests at each marker (or in each interval between markers) whether genotype is associated with the phenotype. Lander and Botstein (1989) introduced interval mapping, which uses maximum likelihood to estimate the position and effect size of a putative QTL between two flanking markers and reports a LOD score (the base-10 logarithm of the likelihood ratio between a model with a QTL at that position and a model with none). LOD > 3 (a pointwise P on the order of 1e-4 to 1e-3, depending on the degrees of freedom of the test) is the conventional significance threshold, often raised to LOD > 4 or higher to control for multiple testing across the genome. Composite interval mapping (Zeng 1994) adds other markers as covariates to absorb the effects of QTLs elsewhere and so remove spurious peaks. The resolution of mapping is set by the number of recombination events in the population and typically yields 10–30 cM intervals, meaning hundreds of genes per QTL window.
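The marker-by-marker test described in this answer can be sketched in a few lines. This is a minimal illustration using single-marker regression, a simpler stand-in for interval mapping: it tests only at the markers themselves, not at positions between them. The function name `lod_scan`, the 0/1/2 genotype coding, and the simulated data (unlinked markers, one causal marker) are assumptions made for the example.

```python
import numpy as np

def lod_scan(genotypes, phenotype):
    """Return one LOD score per marker from single-marker linear regression.

    genotypes: (n_individuals, n_markers) array coded 0/1/2 (assumed coding)
    phenotype: (n_individuals,) array of trait values
    """
    g = np.asarray(genotypes, dtype=float)
    y = np.asarray(phenotype, dtype=float)
    n = y.size
    # Residual sum of squares under the null model (mean only, no QTL).
    rss_null = np.sum((y - y.mean()) ** 2)
    lods = np.empty(g.shape[1])
    for j in range(g.shape[1]):
        # Alternative model: intercept plus additive effect of marker j.
        design = np.column_stack([np.ones(n), g[:, j]])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        rss_qtl = np.sum((y - design @ coef) ** 2)
        # For normal errors, LOD = (n / 2) * log10(RSS_null / RSS_qtl).
        lods[j] = (n / 2.0) * np.log10(rss_null / rss_qtl)
    return lods

# Simulated F2-like data: genotype frequencies 1/4, 1/2, 1/4 at every marker.
# Markers are drawn independently here, which ignores linkage (a simplification).
rng = np.random.default_rng(0)
n_individuals, n_markers, causal = 200, 50, 20
genotypes = rng.binomial(2, 0.5, size=(n_individuals, n_markers))
phenotype = 1.0 * genotypes[:, causal] + rng.normal(0.0, 1.0, n_individuals)

lods = lod_scan(genotypes, phenotype)
print("peak marker:", int(np.argmax(lods)), "LOD:", round(float(lods.max()), 1))
```

In a real cross, neighbouring markers are linked, so the LOD profile forms a broad peak around the QTL rather than a single spike, and the genome-wide threshold is usually set by permutation.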

Question 3

How is QTL different from GWAS?

Accepted Answer

QTL mapping uses experimental crosses (F2, backcross, RIL) with known parents and only a few generations of recombination, and relies on the meiotic recombination within the cross to localise variants. GWAS uses nominally unrelated individuals from a population and exploits historical recombination, which is reflected in the linkage-disequilibrium structure of the genome. QTL mapping has more statistical power per individual (because allele frequencies are 0.5 by design and family structure is known) but lower resolution (intervals of roughly 1–30 cM). GWAS has lower per-individual power but far higher resolution, often narrowing an association to within a few kilobases when LD blocks are short. The two approaches have converged: large biobank cohorts (UK Biobank, FinnGen, Million Veteran Program) are essentially GWAS at scale, while in model organisms Diversity Outbred mice and multi-parent advanced generation intercross (MAGIC) lines combine both strategies.
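One way to see why an allele frequency of 0.5 helps power: for a biallelic locus with a purely additive effect, the variance it contributes is 2p(1 − p)a², which peaks at p = 0.5. The sketch below assumes Hardy-Weinberg genotype proportions (an F2 satisfies this at p = 0.5); the function name and the unit effect size are illustrative only.

```python
def additive_variance(allele_freq, effect):
    """Variance contributed by one biallelic additive locus.

    Assumes Hardy-Weinberg proportions: variance = 2 * p * (1 - p) * a**2,
    where p is the allele frequency and a is the per-allele effect.
    """
    p = allele_freq
    return 2.0 * p * (1.0 - p) * effect ** 2

# Same per-allele effect, different allele frequencies.
for p in (0.5, 0.2, 0.05, 0.01):
    print(f"p = {p:<4}  variance = {additive_variance(p, 1.0):.4f}")
```

The same allele therefore generates much less trait variance, and is harder to detect, when it is rare in a natural population than when it segregates at 0.5 in a cross.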

Question 4

Why are QTL effect sizes typically small?

Accepted Answer

Most quantitative traits are polygenic: they are controlled by many loci, each contributing a small effect. The mathematical reason is that if many loci affect a trait additively, the total heritable variance is split among them, so each individual locus accounts for a small fraction. The evolutionary reason is that for a trait under stabilising selection (like body mass index in many environments), large-effect alleles are driven to extreme frequencies (fixed or lost) by selection, leaving mostly small-effect variation segregating in the population. Empirically, in humans the largest height-associated SNPs explain roughly 0.4 percent of variance, the largest BMI SNPs around 0.3 percent, and the largest blood-pressure SNPs around 0.1 percent. Hundreds to thousands of variants each contribute a small slice, and polygenic scores sum these contributions.
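The variance-splitting argument is simple arithmetic, sketched below under the strong assumption that all loci have equal effects (real effect-size distributions are skewed, so this is only a rough guide). The heritability of 0.8 and the locus counts are illustrative values, and `per_locus_share` is a hypothetical helper name.

```python
def per_locus_share(heritability, n_loci):
    """Fraction of phenotypic variance per locus when `n_loci` equal-effect
    additive loci jointly account for `heritability`."""
    if n_loci <= 0:
        raise ValueError("n_loci must be positive")
    return heritability / n_loci

# The more loci share the heritable variance, the smaller each slice.
for n_loci in (10, 100, 1_000, 10_000):
    share = per_locus_share(0.8, n_loci)
    print(f"{n_loci:>6} loci -> {100 * share:.3f} percent of variance each")
```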

Question 5

What is missing heritability?

Accepted Answer

Missing heritability is the gap between heritability estimated from twin or family studies (e.g., 80 percent for adult human height) and the variance explained by the sum of all genome-wide-significant GWAS hits (which until ~2018 amounted to less than 30 percent for height even with hundreds of thousands of subjects). Several explanations have been proposed, with varying degrees of support. First, many small-effect loci fall below the genome-wide significance threshold and contribute only when the significance filter is relaxed. Second, rare variants (MAF < 1 percent) are poorly tagged on common-variant arrays. Third, gene-gene and gene-environment interactions add variance not captured in additive models. Fourth, some twin and family estimates may be inflated (for example by shared environment or assortative mating). The largest height GWAS to date (5.4 million individuals, Yengo 2022) recovered ~40 percent of variance from common variants alone, with rare-variant studies and biobank-scale designs continuing to close the gap.
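The first explanation (many loci sitting below the significance threshold) can be illustrated with an approximate power calculation. This is a sketch under stated assumptions, not a model of any real study: all loci have equal effects, the test statistic at each locus is treated as Normal(sqrt(n·q), 1) where q is the fraction of variance the locus explains, and the values of `h2` and `n_loci` are invented for illustration.

```python
from math import sqrt
from statistics import NormalDist

def detection_power(variance_explained, n_samples, alpha=5e-8):
    """Approximate power of a two-sided z-test to detect one locus.

    The statistic is approximated as Normal(sqrt(n * q), 1), where q is the
    fraction of trait variance explained by the locus (reasonable for small q).
    """
    std_normal = NormalDist()
    z_crit = -std_normal.inv_cdf(alpha / 2.0)       # about 5.45 for alpha = 5e-8
    shift = sqrt(n_samples * variance_explained)    # expected z-score of the locus
    upper_tail = 1.0 - std_normal.cdf(z_crit - shift)
    lower_tail = std_normal.cdf(-z_crit - shift)
    return upper_tail + lower_tail

# Illustrative trait: heritable variance 0.5 split equally over 10,000 loci.
h2, n_loci = 0.5, 10_000
q = h2 / n_loci
for n_samples in (100_000, 1_000_000):
    power = detection_power(q, n_samples)
    # With equal effects, the expected share of h2 among significant hits is the power.
    print(f"n = {n_samples:>9,}: power = {power:.4f}, "
          f"variance captured by hits ~ {h2 * power:.3f}")
```

Under these assumptions almost none of the loci reach significance at the smaller sample size, while most do at the larger one, which is the pattern the answer describes as the gap closing with cohort size.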

Question 6

What are eQTLs?

Accepted Answer

An expression QTL (eQTL) is a locus where genotype is associated with the expression level of a gene measured by RNA sequencing or microarray. cis-eQTLs sit close to the gene whose expression they affect (typically within 1 Mb), and trans-eQTLs sit elsewhere in the genome. The GTEx Project (2010–2017) measured eQTLs across 49 human tissues in ~1,000 donors and identified cis-eQTLs for nearly every protein-coding gene. eQTLs help interpret GWAS hits in non-coding regions: a disease-associated SNP whose effect on a nearby gene's expression is consistent with the disease biology gives a strong candidate causal mechanism. About 60–70 percent of GWAS-significant loci colocalise with at least one significant eQTL in a relevant tissue, supporting regulatory effects as the dominant mechanism for non-coding common-variant disease associations.
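The cis/trans distinction comes down to a distance rule, sketched below. Measuring distance from the gene's transcription start site and using a 1 Mb window on either side are assumptions for the example (the answer only says "typically within 1 Mb"); the function name and coordinates are hypothetical.

```python
def classify_eqtl(variant_chrom, variant_pos, gene_chrom, gene_tss,
                  cis_window=1_000_000):
    """Label a variant-gene pair as 'cis' or 'trans' by genomic distance.

    A pair is 'cis' when the variant is on the same chromosome as the gene and
    within `cis_window` base pairs of the gene's transcription start site.
    """
    same_chrom = variant_chrom == gene_chrom
    if same_chrom and abs(variant_pos - gene_tss) <= cis_window:
        return "cis"
    return "trans"

print(classify_eqtl("chr1", 1_250_000, "chr1", 1_000_000))  # cis: 250 kb away
print(classify_eqtl("chr1", 6_000_000, "chr1", 1_000_000))  # trans: 5 Mb away
print(classify_eqtl("chr2", 1_000_000, "chr1", 1_000_000))  # trans: other chromosome
```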

| Feature | Classical QTL mapping | Genome-wide association (GWAS) |
| --- | --- | --- |
| Sample | F2 / RIL / backcross from defined parents | Unrelated individuals from a population |
| Sample size | 100–1,000 individuals | 10,000 to several million |
| Recombination resolution | 1–30 cM (a few generations) | 1–100 kb via population LD |
| Statistical method | LOD interval mapping (Lander-Botstein 1989) | Mixed-model linear regression at each SNP |
| Significance threshold | LOD > 3 (locus-wise) or empirical permutation | P < 5 × 10⁻⁸ (genome-wide Bonferroni) |
| Strengths | High per-individual power, allele frequency 0.5 by design | High resolution, captures common-variant architecture |
| Weaknesses | Confined to parental allele set, low resolution | Misses rare variants, needs huge cohorts, population stratification |
| Typical organism | Mouse, rat, maize, Drosophila, Arabidopsis, yeast | Human (UK Biobank), large outbred animal populations |
| Output | 10–30 cM QTL peaks, dozens per genome | 1–100 kb associated regions, hundreds–thousands per trait |
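The two significance thresholds in the table can be put on a common scale with a little arithmetic. The sketch below assumes roughly one million independent common-variant tests for the Bonferroni correction (the conventional figure behind 5 × 10⁻⁸) and converts a LOD score to a pointwise p-value by treating 2·ln(10)·LOD as a chi-square statistic with one degree of freedom; tests with more degrees of freedom give larger p-values for the same LOD. Both function names are hypothetical.

```python
from math import erfc, log, sqrt

def bonferroni_threshold(alpha, n_tests):
    """Per-test p-value threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests

def lod_to_pointwise_p(lod):
    """Pointwise p-value for a LOD score, assuming a 1-df likelihood-ratio test.

    The statistic 2 * ln(10) * LOD is chi-square with 1 df under the null, and
    the 1-df chi-square survival function at x equals erfc(sqrt(x / 2)).
    """
    return erfc(sqrt(log(10.0) * lod))

print(f"GWAS threshold: {bonferroni_threshold(0.05, 1_000_000):.1e}")
print(f"LOD 3 pointwise p: {lod_to_pointwise_p(3.0):.1e}")
print(f"LOD 4 pointwise p: {lod_to_pointwise_p(4.0):.1e}")
```

The LOD thresholds look lenient next to the GWAS threshold because a cross contains far fewer effectively independent tests: linkage within a few generations of recombination makes neighbouring markers highly correlated.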
