A growing number of genetic loci have been shown to influence individual predisposition to type 2 diabetes (T2D). Despite longstanding interest in understanding whether nonlinear interactions between these risk variants additionally influence T2D risk, the ability to detect significant gene-gene interaction (GGI) effects has been limited to date. To increase power to detect GGI effects, we combined recent advances in the fine-mapping of causal T2D risk variants with the increased sample size available within UK Biobank (375,736 unrelated European participants, including 16,430 with T2D). In addition to conventional single variant–based analysis, we used a complementary polygenic score–based approach, which included partitioned T2D risk scores that capture biological processes relevant to T2D pathophysiology. Nevertheless, we found no evidence in support of GGI effects influencing T2D risk. The current study was powered to detect interactions between common variants with odds ratios >1.2, so these findings place limits on the contribution of GGIs to the overall heritability of T2D.

Genome-wide association studies (GWAS) have provided a detailed inventory of genetic loci conferring susceptibility to type 2 diabetes (T2D). A study of ∼900,000 individuals of European descent identified >400 association signals (1). To date, most studies of T2D predisposition have focused on the detection of main effects attributable to individual genetic variants. However, there has been longstanding interest in understanding the contributions of gene-gene interactions (GGIs) to individual predisposition to T2D. This interest was initially driven by the possibility that non–log-additive interactions might explain the apparent failure of main effects to account for the observed heritability of T2D (2). More recently, the search for GGIs has been motivated by the desire to establish whether second-order genetic effects could improve disease prediction models on the basis of genotype data and by the potential for statistical interactions to provide clues to underlying disease mechanisms (3). To date, there has been little evidence to indicate that GGIs have any appreciable impact on T2D risk (4,5).

However, it is clear that detection of all but the largest GGI effects require sample sizes substantially larger than those used to identify the main effects (1,6,7). Furthermore, the sample sizes necessary to detect GGI scale exponentially (precisely, to the fourth power of the correlation coefficient) when the variants being tested are only partially correlated with the (often unknown) causal variant (3).

The UK Biobank study, which provides genetic data for ∼500,000 individuals, offers a singular opportunity to advance the exploration of GGIs in T2D. We sought to capitalize on recent advances in the characterization of T2D risk loci resulting from fine-mapping efforts that have improved localization of the causal variant at many risk loci (1).

UK Biobank Study Population

The UK Biobank (8) is a prospective cohort study of ∼500,000 individuals from across the U.K. We used imputed genetic data from the March 2018 release (version 3); details regarding quality control and imputation are provided elsewhere (9). We generated discrete genotypes using a genotype probability threshold of 0.8 and excluded variants with info score <0.5 and Hardy-Weinberg disequilibrium (P < 10−6). Individuals with discordant sex, putative sex chromosome aneuploidy, withdrawal of consent, and diagnosis of another form of diabetes, such as type 1 or gestational diabetes mellitus, were excluded. We further selected a subset of 375,736 individuals (16,430 T2D cases, 359,306 controls) who were 1) unrelated (up to second-degree relatives determined using the KING toolset [https://people.virginia.edu/∼wc9c/KING/manual.html] [10]) and 2) of European ancestry (determined using a combination of 15 principal components [PCs] provided by UK Biobank and self-reported white British ancestry). Variance-weighted PC scores were used to calculate the “genetic distance” with a hypothetical median white British participant to identify European individuals (genetic distance <60 units) (11). Prevalent T2D status was defined using self-reported medical history and medication information (12).

Prioritization of Variants for GGI Analysis

We used the following two approaches. The first approach was to select variants associated with T2D risk (the T2D risk set). We selected the index variant at each of the 403 conditionally independent association signals reported in the largest T2D GWAS in Europeans (1).

The second approach was to select variants associated with heterogeneity in T2D variance (the T2D variance set): variants that demonstrate marked heterogeneity in phenotype variance across genotypes represent potential candidates for GGI effects (1315). To detect such variants, we selected a random subset of 100,000 unrelated European subjects (4,284 T2D cases, 95,716 controls), adjusted their T2D status for age, sex, genotyping batch, and 10 PCs; standardized the residuals; and used Levene test to assess equality of variance across genotypes (including only variants with minor allele frequency [MAF] >5%). For each 1-Mb block of the genome, we identified the variant most significantly associated with T2D variance (index variant). Finally, we selected the 100 most significant index variants that did not overlap with variants in the T2D risk set (Supplementary Table 1). The union of T2D risk and T2D variance sets provided a T2D joint set of 503 variants.

Analysis of GGIs for T2D

GGI effects were sought using two complementary strategies (Supplementary Fig. 1). The first strategy was single variant-based GGI analysis. We tested for interactions between individual variants using the –epistasis function in PLINK (https://www.cog-genomics.org/plink2) (16), which fits a logistic regression model (assuming a log-additive model of disease risk), as follows: ln(P [T2D] / P [control]) = β0 +β1SNP1 +β2SNP2 +β3SNP1SNP2, where SNP1 and SNP2 refer to the variants being tested, β1 and β2 refer to their main effects, and β3 refers to their interaction effect on T2D.

We deployed two analytical approaches: in the T2D joint set pairwise analysis, interaction was tested between each pair of variants in the T2D joint set; and, in the T2D joint set-versus-genome set analysis, interaction testing for the T2D joint set variants was extended to variants in the remainder of the genome (the genome set). Power calculations were performed (17) to estimate the MAF threshold (10%) for a variant in the genome set above which there was adequate power (>75%) to detect a substantial interaction effect (which we defined as an odds ratio [OR] >1.5) with a variant of MAF = 5% from the T2D joint set on the basis of the following parameters: 1) main effect OR on T2D for T2D joint set variant = 1.1; 2) main effect OR on T2D for genome set variant = 1.0; 3) interaction effect OR between the variants = 1.5; 4) sample size = 375,736, with a T2D case:control ratio (per UK Biobank) of 0.044; and 5) single test α = 0.05, with Bonferroni adjustment for 503 genome-wide analyses. Since the presence of linkage disequilibrium (LD) can confound interaction tests, we removed variants in the genome set in LD (r2 > 0.1 within 1 Mb) with a given T2D joint set index variant.

The second strategy was polygenic score (PS)–based GGI analysis. Compared with the single variant–based approach, the PS-based approach for testing GGIs offers greater statistical power if the underlying interaction effects for multiple T2D risk alleles are shared across alleles. We aggregated effects of the 403 variants in the T2D risk set to construct an overall T2D PS using the –score function in PLINK (16). The risk allele dosage for each variant was weighed by the effect size obtained from a T2D meta-analysis of 455,302 European individuals that included all studies from Mahajan et al. (1) except UK Biobank.

Similarly, a set of 93 T2D variants, which were either members of the T2D risk set or proxies (median r2 = 0.91) thereof, was used to construct six (weighted) partitioned PS (pPS) that captured biological processes relevant to T2D pathophysiology (18,19) (Supplementary Table 2). These pPS are referred to as pPSIS1, pPSIS2 (both reflecting processes involved in insulin secretion), pPSIA (insulin action), pPSadiposity (overall adiposity), pPSdyslipidemia (predominantly affecting liver metabolism), and pPSmix (a mixture of the above). The overall PS and the pPS were standardized to represent SD units. For the PS-based interaction analysis, we tested interactions between each PS (overall PS and six pPS) and genome-wide variants (MAF ≥1%; ∼6 million) variants.

Data Resource and Availability

The summary statistics of genome-wide analyses performed in this study are available at https://zenodo.org/record/3978776#.XzIAzC3MzRY.

The approaches deployed to characterize GGIs in T2D are summarized in Supplementary Fig. 1. These analyses were performed in 375,736 unrelated European individuals from UK Biobank (16,430 T2D cases, 359,306 controls) (researchdesign andmethods).

Analysis of GGIs for T2D

Since testing all genome-wide variants for interaction presents computational and statistical challenges, we prioritized two sets of variants for single variant analysis: the T2D risk set of 403 index variants, representing conditionally independent association signals for T2D risk in Europeans (1), and the T2D variance set, a nonoverlapping set of 100 index variants selected for the most extreme effects on T2D variance within UK Biobank (indicative of possible interaction effects) (1315). Our primary analyses considered the combined T2D joint set of 503 variants (researchdesign andmethods).

We first sought pairwise interactions between variants in the T2D joint set. None of the pairwise interactions crossed the Bonferroni-corrected significance threshold (α = 4 × 10−7, correcting for 126,253 pairwise tests) (Supplementary Table 3). We estimate >70% power to detect an interaction effect of OR ≥1.5 between two variants in the T2D joint set, when both have an MAF at the lower end of the common variant range (5%) and main effect T2D OR = 1.1 (Supplementary Fig. 2). For more common variants (MAF = 50%), we were powered for interaction effects OR >1.2. The quantile-quantile plot provided no evidence that the distribution of GGI effects departed from the null (Fig. 1A). The strongest signal in this pairwise GGI analysis, involving rs629137 (near UVRAG, T2D variance set) and rs76011118 (near CDKN2A/B, T2D risk set), had an interaction OR of 1.24 (95% CI 1.21–1.27; P = 1 × 10−5). No significant pairwise interactions were observed when analyses were restricted to just the T2D risk set (81,003 tests, α = 6 × 10−7) or the T2D variance set (4,950 tests, α = 1 × 10−5) (Fig. 1A).

Figure 1

Quantile-quantile (Q-Q) plots for the GGI analyses. A: Pairwise interaction analysis for T2D joint set variants. The figure shows the Q-Q plot for the pairwise interaction analysis for the index variants in the T2D joint set. In addition, the Q-Q plots when the pairwise interaction analysis was restricted to the index variants in the T2D risk set and to the index variants in the T2D variance set are shown. B: Interaction analysis between variants in the T2D joint set and the genome set. The Q-Q plot for the interaction analysis between variants in the T2D joint set and the genome set is shown as two separate curves: the red curve demonstrates the results of the genome-wide interaction with T2D risk set variants, and the blue curve demonstrates the results for T2D variance set variants. For simplicity, the results shown are restricted to the 10 variants in each set with the strongest associations for the respective measure (T2D risk or T2D variance).

Figure 1

Quantile-quantile (Q-Q) plots for the GGI analyses. A: Pairwise interaction analysis for T2D joint set variants. The figure shows the Q-Q plot for the pairwise interaction analysis for the index variants in the T2D joint set. In addition, the Q-Q plots when the pairwise interaction analysis was restricted to the index variants in the T2D risk set and to the index variants in the T2D variance set are shown. B: Interaction analysis between variants in the T2D joint set and the genome set. The Q-Q plot for the interaction analysis between variants in the T2D joint set and the genome set is shown as two separate curves: the red curve demonstrates the results of the genome-wide interaction with T2D risk set variants, and the blue curve demonstrates the results for T2D variance set variants. For simplicity, the results shown are restricted to the 10 variants in each set with the strongest associations for the respective measure (T2D risk or T2D variance).

Close modal

Next, we tested interactions between T2D joint set variants and the remainder of the genome. For the latter, we focused on ∼3.2 million variants with MAF >10%, since power calculations indicated >75% power to detect an interaction effect of OR ≥1.5 for variants above this MAF threshold, given a joint set MAF at the lower end of the common variant range (5%) (Supplementary Fig. 3). Again, we found no evidence for significant interactions at α = 1 × 10−10 (accounting for 503 genome-wide analyses) (Figs. 1B and 2 and Supplementary Table 4). The strongest signal involved two variants near TCF7L2: rs184509201, a T2D risk set variant corresponding to one of seven secondary signals at this locus, and rs10885397, a variant from the genome set (interaction OR 1.55 [1.51–1.59]; P = 2 × 10−9). These two variants, located ∼28 kb apart, are not in LD (r2 = 0.0). Genome-wide interaction analyses with variants exclusive to the T2D risk set (α = 1 × 10−10) or the T2D variance set (α = 5 × 10−10) were similarly negative.

Figure 2

Manhattan plot for the GGI analysis between variants in the T2D joint set and the genome set. The figure demonstrates the results of 503 genome-wide interaction analyses, where each genome-wide analysis corresponds to interaction testing between a variant in the T2D joint set (N = 503) and the genome set variants. The dotted line demarcates the conventional genome-wide significance threshold (P = 5 × 10−8), and the red line demarcates the Bonferroni-corrected significance threshold for 503 genome-wide analyses (P = 1 × 10−10). The gray zone at the bottom of the plot represents association P values that were not plotted (P > 0.001).

Figure 2

Manhattan plot for the GGI analysis between variants in the T2D joint set and the genome set. The figure demonstrates the results of 503 genome-wide interaction analyses, where each genome-wide analysis corresponds to interaction testing between a variant in the T2D joint set (N = 503) and the genome set variants. The dotted line demarcates the conventional genome-wide significance threshold (P = 5 × 10−8), and the red line demarcates the Bonferroni-corrected significance threshold for 503 genome-wide analyses (P = 1 × 10−10). The gray zone at the bottom of the plot represents association P values that were not plotted (P > 0.001).

Close modal

Since the power to detect GGI effects for individual variants can be limited even with large sample sizes, we sought to bolster GGI detection by aggregating T2D risk variants into PS. In addition to an overall PS generated from the 403 variants in the T2D risk set, we constructed six additional pPS using a subset of 93 T2D risk variants that were stratified into physiological clusters capturing biological processes relevant to T2D pathophysiology (researchdesign andmethods).

Analyses evaluating GGIs using a PS-based approach failed to detect significant interactions between these PS and variants in the genome set (α = 7 × 10−9, accounting for seven GWAS) (Supplementary Table 5 and Supplementary Figs. 4 and 5). The strongest signal from this analysis was between variants at the CHN2 locus and pPSdyslipidemia (interaction OR 1.22; P = 1 × 10−7); this locus has previously been associated with diabetic retinopathy and development of severe insulin resistance (20).

By addressing several limitations of previous analyses, the current study offers substantially improved power for detecting GGI effects influencing T2D risk. First, we conducted GGI analyses in a much larger sample size than earlier efforts (4,5). Second, we prioritized sets of genetic variants likely to be enriched for interaction effects. Third, we leveraged improved fine-mapping of association signals, avoiding the attrition of power for signals only partially correlated with the causal variant. Fourth, we implemented a PS-based approach to complement the single variant–based interaction analysis. Despite using all these measures, we found no credible evidence either in the overall distribution of interaction effects or in individual signals meeting study-wide significance for GGI effects influencing T2D risk.

There are some clear limitations to this analysis. For computational and statistical reasons, we chose not to test all genome-wide variants for interaction (requiring ∼1013 tests). Nevertheless, the sets of variants prioritized were those most likely to demonstrate interaction effects, and the lack of significant associations within this subset should be indicative of patterns seen more broadly. Additionally, our study did not rule out the possibility of higher-order interactions and interactions detectable on a scale of disease-risk other than log-additive.

We are unable to determine the extent to which more subtle or infrequent GGI effects, singly or in combination, contribute to the residual heritability not attributable to known T2D risk variants. Nevertheless, the absence of detectable major GGI effects on T2D risk implies that for many practical clinical and epidemiological purposes, the joint effects of multiple genetic risk factors for T2D can be derived purely from the combination of main effects. Although it has been suggested that statistical GGIs might be indicative of functional interactions (3) and, thereby, provide mechanistic insights into disease biology, the extent to which this holds for common variants influencing complex human diseases remains to be established.

M.I.M. and A.M. are currently affiliated with Genentech, South San Francisco, CA.

This article contains supplementary material online at https://doi.org/10.2337/figshare.12827219.

Acknowledgments. This research has been conducted using the UK Biobank Resource (https://www.ukbiobank.ac.uk) under application number 9161. We acknowledge Andrew P. Morris (University of Manchester) for useful input on some of the analyses.

Funding. This study received funding from the National Institute of Diabetes and Digestive and Kidney Diseases (U01-DK-105535) and the Wellcome Trust (090532, 098381, 106130, 203141, 212259). M.I.M. was a Wellcome investigator and an NIHR senior investigator.

The views expressed in this article are those of the authors and not necessarily those of the NHS, the NIHR, or the Department of Health.

Duality of Interest. As of January 2020, A.M. is an employee of Genentech and a holder of Roche stock. M.I.M. has served on advisory panels for Pfizer, Novo Nordisk, and Zoe Global; has received honoraria from Merck, Pfizer, Novo Nordisk, and Eli Lilly; and has received research funding from AbbVie, AstraZeneca, Boehringer Ingelheim, Eli Lilly, Janssen, Merck, Novo Nordisk, Pfizer, Roche, Sanofi, SERVIER, and Takeda. As of June 2019, M.I.M. is an employee of Genentech and a holder of Roche stock. No other potential conflicts of interest relevant to this article were reported.

Author Contributions. A.N., M.I.M., and A.M. wrote the manuscript. A.N. and A.M. analyzed the data. M.I.M. and A.M. conceived and designed the study. All authors revised the manuscript. A.M. is the guarantor of this work and, as such, had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

Prior Presentation. Parts of this study were presented at the 79th Scientific Sessions of the American Diabetes Association, San Francisco, CA, 7–11 June 2019.

1.
Mahajan
A
,
Taliun
D
,
Thurner
M
, et al
.
Fine-mapping type 2 diabetes loci to single-variant resolution using high-density imputation and islet-specific epigenome maps
.
Nat Genet
2018
;
50
:
1505
1513
2.
Zuk
O
,
Hechter
E
,
Sunyaev
SR
,
Lander
ES
.
The mystery of missing heritability: genetic interactions create phantom heritability
.
Proc Natl Acad Sci U S A
2012
;
109
:
1193
1198
3.
Wei
W-H
,
Hemani
G
,
Haley
CS
.
Detecting epistasis in human complex traits
.
Nat Rev Genet
2014
;
15
:
722
733
4.
Ortega-Azorín
C
,
Sorlí
JV
,
Asensio
EM
, et al
.
Associations of the FTO rs9939609 and the MC4R rs17782313 polymorphisms with type 2 diabetes are modulated by diet, being higher when adherence to the Mediterranean diet pattern is low
.
Cardiovasc Diabetol
2012
;
11
:
137
5.
Villegas
R
,
Goodloe
RJ
,
McClellan
BEJ
 Jr
.,
Boston
J
,
Crawford
DC
.
Gene-carbohydrate and gene-fiber interactions and type 2 diabetes in diverse populations from the National Health and Nutrition Examination Surveys (NHANES) as part of the Epidemiologic Architecture for Genes Linked to Environment (EAGLE) study
.
BMC Genet
2014
;
15
:
69
6.
Wong
MY
,
Day
NE
,
Luan
JA
,
Chan
KP
,
Wareham
NJ
.
The detection of gene-environment interaction for continuous traits: should we deal with measurement error by bigger studies or better measurement?
Int J Epidemiol
2003
;
32
:
51
57
7.
Wong
MY
,
Day
NE
,
Luan
JA
,
Wareham
NJ
.
Estimation of magnitude in gene-environment interactions in the presence of measurement error
.
Stat Med
2004
;
23
:
987
998
8.
Sudlow
C
,
Gallacher
J
,
Allen
N
, et al
.
UK biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age
.
PLoS Med
2015
;
12
:
e1001779
9.
Bycroft
C
,
Freeman
C
,
Petkova
D
, et al
.
The UK Biobank resource with deep phenotyping and genomic data
.
Nature
2018
;
562
:
203
209
10.
Manichaikul
A
,
Mychaleckyj
JC
,
Rich
SS
,
Daly
K
,
Sale
M
,
Chen
W-M
.
Robust relationship inference in genome-wide association studies
.
Bioinformatics
2010
;
26
:
2867
2873
11.
Astle
WJ
,
Elding
H
,
Jiang
T
, et al
.
The allelic landscape of human blood cell trait variation and links to common complex disease
.
Cell
2016
;
167
:
1415
1429.e19
12.
Eastwood
SV
,
Mathur
R
,
Atkinson
M
, et al
.
Algorithms for the capture and adjudication of prevalent and incident diabetes in UK biobank
.
PLoS One
2016
;
11
:
e0162388
13.
Wang
H
,
Zhang
F
,
Zeng
J
, et al
.
Genotype-by-environment interactions inferred from genetic effects on phenotypic variability in the UK Biobank
.
Sci Adv
2019
;
5
:
eaaw3538
14.
Wei
W-H
,
Bowes
J
,
Plant
D
, et al
.
Major histocompatibility complex harbors widespread genotypic variability of non-additive risk of rheumatoid arthritis including epistasis
.
Sci Rep
2016
;
6
:
25014
15.
Shungin
D
,
Deng
WQ
,
Varga
TV
, et al.;
GIANT Consortium
.
Ranking and characterization of established BMI and lipid associated loci as candidates for gene-environment interactions
.
PLoS Genet
2017
;
13
:
e1006812
16.
Purcell
S
,
Neale
B
,
Todd-Brown
K
, et al
.
PLINK: a tool set for whole-genome association and population-based linkage analyses
.
Am J Hum Genet
2007
;
81
:
559
575
17.
Kooperberg
C
,
Leblanc
M
.
Increasing the power of identifying gene x gene interactions in genome-wide association studies
.
Genet Epidemiol
2008
;
32
:
255
263
18.
Mahajan
A
,
Wessel
J
,
Willems
SM
, et al.;
ExomeBP Consortium; MAGIC Consortium; GIANT Consortium
.
Refining the accuracy of validated target identification through coding variant fine-mapping in type 2 diabetes
.
Nat Genet
2018
;
50
:
559
571
19.
Udler
MS
,
McCarthy
MI
,
Florez
JC
,
Mahajan
A
.
Genetic risk scores for diabetes diagnosis and precision medicine
.
Endocr Rev
2019
;
40
:
1500
1520
20.
Buniello
A
,
MacArthur
JAL
,
Cerezo
M
, et al
.
The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019
.
Nucleic Acids Res
2019
;
47
(
D1
):
D1005
D1012
Readers may use this article as long as the work is properly cited, the use is educational and not for profit, and the work is not altered. More information is available at https://www.diabetesjournals.org/content/license.