Video Abstract (AI generated) (01:23)PaperPreprint
Identifying the genetic and environmental factors underlying phenotypic differences between populations is fundamental to multiple research communities. To date, studies have focused on the relationship between population and phenotypic mean. Here we consider the relationship between population and phenotypic variance, i.e., "population variance structure." In addition to gene-gene and gene-environment interaction, we show that population variance structure is a direct consequence of natural selection. We develop the ancestry double generalized linear model (ADGLM), a statistical framework to jointly model population mean and variance effects. We apply ADGLM to several deeply phenotyped datasets and observe ancestry-variance associations with 12 of 44 tested traits in ~113K British individuals and 3 of 14 tested traits in ~3K Mexican, Puerto Rican, and African-American individuals. We show through extensive simulations that population variance structure can both bias and reduce the power of genetic association studies, even when principal components or linear mixed models are used. ADGLM corrects this bias and improves power relative to previous methods in both simulated and real datasets. Additionally, ADGLM identifies 17 novel genotype-variance associations across six phenotypes.