In this study we examine the instrument selection strategies currently used throughout the type 2 diabetes and HbA1c Mendelian randomization (MR) literature. We then argue for a more integrated and thorough approach, providing a framework to do this in the context of HbA1c and diabetes. We conducted a literature search for MR studies that have instrumented diabetes and/or HbA1c. We also used data from the UK Biobank (UKB) (N = 349,326) to calculate instrument strength metrics that are key in MR studies (the F statistic for average strength and R2 for total strength) with two different methods (“individual-level data regression” and the Cragg-Donald formula). We used a 157–single nucleotide polymorphism (SNP) instrument for diabetes and a 51-SNP instrument (also partitioned into glycemic and erythrocytic variants) for HbA1c. Our literature search yielded 48 studies for diabetes and 22 for HbA1c. Our UKB empirical examples showed that irrespective of the method used to calculate metrics of strength and whether the instrument was the main one or included partition by function, the HbA1c genetic instrument is strong in terms of both average and total strength. For diabetes, a 157-SNP instrument was shown to have good average strength and total strength, but these were both substantially lower than those of the HbA1c instrument. We provide a careful set of five recommendations to researchers who wish to genetically instrument type 2 diabetes and/or HbA1c. In MR studies of glycemia, investigators should take a more integrated approach when selecting genetic instruments, and we give specific guidance on how to do this.

Mendelian randomization (MR) has markedly enhanced our ability to determine the true causal nature of associations between states of diabetes (1–45)/hyperglycemia (46–59) and presumed consequences. MR uses genetic variants as unconfounded instruments for the exposure (60). As MR has come of age in recent years alongside the advent of large-scale genome-wide association studies (GWAS), numerous genetic instruments for glycemic traits have become available (61–65). Choosing the most appropriate instrument is one of the most important decisions in designing an MR study (66), as an ill-informed choice may lead to misleading or conflicting findings.

Broadly, criteria for instrument selection (which are intrinsically linked to the core assumptions underlying MR [Fig. 1]) include 1) ensuring that there is no sample overlap between the samples used in the discovery GWAS and the data under analysis, as this helps minimize bias arising from “winner’s curse” and the use of weak instruments (67); 2) selecting independent variants from the latest and largest GWAS for the exposure (at a threshold of P < 5 × 10⁻⁸); 3) choosing variants based on the amount of variance explained in the exposure (R2); 4) selecting variants on the basis of biology and function; and 5) deciding whether variants for a continuous or binary exposure are more appropriate. However, although the criteria described above in 1, 2, and perhaps 3 are often prioritized in glycemic MR studies, the remainder are not always taken into consideration. In relation to 2, we argue that bigger is not always better, as the greater the number of genetic variants, the more we increase our chances of including pleiotropic variants. Including such variants violates a core MR assumption of no horizontal pleiotropy (that variants for the exposure should not be associated with common confounders or directly with the outcome under study but should only associate with the outcome via the exposure being instrumented) (60). A balance is needed between including sufficient genetic variants to enable well-powered analyses but not so many that pleiotropy is inevitable.

Figure 1

Summary of genetic instrument selection criteria in MR studies.


Currently, few, if any, journals demand a clear explanation for choice of genetic instrument. While some determinants of choice, such as overlap between the GWAS used to derive the genetic instrument and the data under study, variant function, and whether the trait is continuous or binary, may be gleaned from the manuscript without being explicit, key statistical characteristics, specifically R2 and F, which may make a major contribution to the power of an MR analysis, cannot. Here the R2 is the amount of variance in the exposure that is accounted for by the selected genetic variants; generally, the larger the R2 the better, as this directly contributes to the power of an MR analysis. The F statistic provides information about the average strength of a genetic variant for the exposure of interest. An F of >10 indicates that substantial weak instrument bias is unlikely (the bias is approximately 1/F of the bias of the observational estimate) (68). Weak instrument bias is of concern in MR studies, as weak instruments can bias MR estimates toward the confounded observational estimate (68), making results less robust than those obtained with a strong instrument.
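As a brief worked illustration of the 1/F rule of thumb described above, the relative bias of the MR estimate toward the observational estimate can be written approximately as follows. The F values used here are round, hypothetical numbers chosen only to show the scale of the effect; they are not results from this study.

```latex
\[
\text{relative bias} \;\approx\; \frac{1}{F},
\qquad
F = 10 \;\Rightarrow\; \approx 10\%,
\qquad
F = 100 \;\Rightarrow\; \approx 1\%.
\]
```

Under this approximation, the conventional F > 10 threshold corresponds to limiting the bias to roughly one-tenth of the bias of the observational estimate.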

Therefore, our overall objectives were to understand instrument selection approaches currently used in MR studies of diabetes and HbA1c, to present why we need integrated approaches (described below) for this, and to provide a framework for how this can be done in practical terms. Our specific aims were as follows:

  1. Conduct a literature search for MR studies that have instrumented type 2 diabetes and/or HbA1c to understand which exposure is instrumented more frequently and whether metrics of instrument strength are reported.

  2. Argue for the use of integrated approaches for the selection of HbA1c and type 2 diabetes genetic instruments, with recent examples from the MR literature.

  3. Use empirical examples to compare the total and average strength of an HbA1c genetic instrument (with partition by function) with those of a type 2 diabetes instrument to show that an HbA1c instrument may be superior.

  4. Provide an overall framework for how to best select instruments for HbA1c and type 2 diabetes in an MR setting, considering 1 and 2 above.

Here we highlight recent examples from the MR literature where HbA1c and/or diabetes genetic variants were used in MR studies, in what we are naming “an integrated approach.” An integrated approach to genetic instrument selection is one that considers factors that are sometimes overlooked in MR studies of glycemic traits. These include the use of novel approaches, such as that of Burgess et al. (57) described here; more careful consideration of which exposure GWAS is used; where possible, prioritizing instrumentation of a continuous rather than a binary exposure; and, finally, ensuring that both the variance explained (R2) and measures of instrument strength (F statistic) are always calculated and presented.

Example A: Published MR Study of Glycemia and Coronary Heart Disease Using an Integrated Approach to HbA1c Genetic Instrument Selection

In a recent MR study, Burgess et al. (57) used HbA1c genetic variants to investigate associations between genetically instrumented glycemic status and incident coronary heart disease. The authors used a novel approach to genetic instrument selection: they took 40 independent HbA1c single nucleotide polymorphisms (SNPs) based on their associations with diabetes at genome-wide significance from a recent GWAS (64) and their association with HbA1c in the 2017 Meta-Analyses of Glucose and Insulin-related traits Consortium (MAGIC) GWAS by Wheeler et al. (61). They then calculated a weighted allele score for each individual in their data (UK Biobank [UKB]) whereby they multiplied each diabetes risk–increasing allele dosage by the SNP’s HbA1c β-coefficient from the MAGIC GWAS. By doing so, the authors ensured that their allele score reflected average blood glucose levels, as opposed to only HbA1c or risk of diabetes. This also relates to our earlier point about selecting instruments based on biological function. Corresponding metrics for their instrument were F = 144.5 and R2 = 0.018, indicating that although they had fewer variants, this was a strong instrument, both in terms of total (R2) and average (F statistic) strength, and thus carried a low risk of weak instrument bias.
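The weighted allele score described above (allele dosage multiplied by the SNP's β-coefficient, summed across SNPs) can be sketched as follows. This is a minimal illustration, not the code used by Burgess et al.; the function name `weighted_allele_score` and all toy values are assumptions introduced here.

```python
import numpy as np

def weighted_allele_score(dosages, weights):
    """Return one weighted allele score per participant.

    dosages: (n_participants, n_snps) array of risk-increasing allele
             dosages (0 to 2).
    weights: (n_snps,) array of per-allele beta-coefficients for the
             exposure (e.g., taken from an external GWAS).
    """
    dosages = np.asarray(dosages, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if dosages.ndim != 2 or dosages.shape[1] != weights.shape[0]:
        raise ValueError("dosages must be (n, k) and weights must be (k,)")
    # Score for each person = sum over SNPs of dosage * weight.
    return dosages @ weights

# Toy data: 3 participants and 4 SNPs (values invented for illustration).
toy_dosages = np.array([[0, 1, 2, 1],
                        [2, 2, 0, 0],
                        [1, 0, 1, 2]])
toy_weights = np.array([0.02, 0.01, 0.03, 0.05])
toy_scores = weighted_allele_score(toy_dosages, toy_weights)
```

In practice the dosages would come from imputed genotype data and the weights from the published GWAS summary statistics, with alleles aligned so that each dosage counts the risk-increasing allele.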

Example B: Published MR Study of Glycemia and Cognitive/Brain Health

As mentioned earlier, an assumption that is often made in approaching genetic instrument selection in MR studies is that “bigger is better.” Therefore, researchers are likely to take as many SNPs (genome-wide significant and independent) as possible from the largest and latest GWAS. However, our own recently published MR study shows that this is not necessarily the case (69). We instrumented diabetes using both a 157- and 77-SNP genetic instrument, as we needed to try to mitigate issues of sample overlap between the GWAS for the exposure and the data under study (both UKB). Therefore, we took the 157 diabetes SNPs included in our instrument and looked them up in an older diabetes GWAS from 2014 (70). We found 77 of the diabetes SNPs (reduced number could be due to differences in coverage of imputation panels, for example) and observed that although this was an older GWAS in a different and smaller sample, the log(β values) for each SNP were comparable, even though most of the variants did not reach conventional genome-wide significance (P < 5 × 10⁻⁸). When we calculated the average strength (F statistic) of our 77-SNP instrument and compared this with the 157-SNP F statistic, they were 31 and 27, respectively. This indicates that an instrument with more genetic variants is not necessarily better in terms of average strength; moreover, the greater the number of variants, the greater the likelihood of including pleiotropic variants.

A greater number of SNPs not always being better is also supported by recent MR studies that have instrumented BMI (71). The authors used an “older” instrument containing 96 BMI SNPs that performed well, which suggests that it is not always necessary to use an instrument with hundreds of SNPs. Larsson et al. (71) showed that this BMI instrument explained 1.6% of the variance in BMI and had an F statistic of 61, while another recent MR study that instrumented BMI, for understanding of its association with chronic kidney disease, used a 773-SNP instrument, which explained ∼6% of the variance in BMI but had an F statistic of only 23.6 (72). It is important to note that we need to balance these metrics against one another when selecting a genetic instrument for an MR study. This is because an instrument with more genetic variants has a larger R2 (total strength) and more power but is also more likely to include pleiotropic variants, which could lead to violation of a core MR assumption. An instrument with a larger R2 usually has a lower F statistic (average strength), which, if <10, will carry a greater risk of weak instrument bias.

Literature Search for MR Studies That Instrument Type 2 Diabetes and/or HbA1c

We were interested in how many studies have instrumented HbA1c and type 2 diabetes to date, whether there is a preference for one over the other, and whether they report metrics of instrument strength. Thus, we conducted a literature review in PubMed up until March 2021 (for details of our search terms and strategy see Supplementary Material) of MR studies that instrumented these exposures. We excluded anything that was not a research article, i.e., conference abstracts, letters, editorials, reviews, opinion pieces, and commentaries. Studies that evidently did not instrument HbA1c or type 2 diabetes were not included. Supplementary Tables 1 and 2 list all the studies for diabetes and HbA1c, respectively, that were included.

Empirical Examples in UKB: Calculation of Total (R2) and Average (F Statistic) Strength Metrics for HbA1c and Type 2 Diabetes Instruments

The aim of these empirical examples was to show the reader that 1) calculating R2 and F statistic metrics as part of an MR study is important to understand both the total and average strength of the instrument of choice; and 2) irrespective of whether individual- or summary-level data are used for an MR study, options for obtaining these metrics are available. We chose two approaches, as there have been no quantitative comparisons of how they perform for glycemic instruments when considering both the R2 and F statistic. These methods are “individual-level data regression” and the Cragg-Donald F statistic.

The UKB is a cohort of ∼500,000 adults recruited across the U.K.’s general population, aged 40–69 years at baseline (2006–2010), for which more details are published elsewhere (73). For the empirical examples in the individual-level data regression and the Cragg-Donald method we used individual-level data from 349,326 UKB participants of White European ancestry, who had complete genotype (quality-controlled) and phenotype (type 2 diabetes and HbA1c) data. Details of the genotype quality control can be found in our previous MR article (69). The UKB received ethics approval from the North West Multicentre Research Ethics Committee, and informed consent was obtained from participants.

Selection of Type 2 Diabetes and HbA1c Genetic Instruments

For both phenotypes, we used previously described genetic instruments (69). Briefly, for type 2 diabetes the genetic instrument comprised 157 SNPs from a 2018 GWAS of European ancestry (62), while the 51-SNP HbA1c instrument came from a 2017 trans-ethnic GWAS (61). We filtered SNPs on minor allele frequency (>0.01) and used linkage disequilibrium (LD) clumping in PLINK, with P < 5 × 10⁻⁸ (69). For HbA1c we also partitioned the instrument into 16 glycemic SNPs and 19 erythrocytic SNPs (the remainder are unclassified, as per the 2017 GWAS), with the aim of testing whether the HbA1c instrument is strong in terms of both average (measured by the F statistic) and total (measured by the R2) strength when using all the SNPs, as well as with partition by biological function. Similar to our previously published MR study of glycemia and brain health/cognition/dementia outcomes, we suggest that it is worth doing three things when using an HbA1c genetic instrument: 1) perform MR using all of the HbA1c SNPs; 2) perform MR using only the glycemic SNPs; and 3) perform MR using only the erythrocytic SNPs.

Calculation of the F Statistic as a Measure of Average Instrument Strength and the R2 as a Measure of Total Strength

Individual-level Data Regression Approach.

This approach involves fitting a multivariable linear regression between SNPs and the exposure (treated as an outcome y here), where one evaluates the relationship between the j-th SNP and the outcome y while holding all the other SNPs constant. In the regression equation below, β0 represents the constant and ε the residual or error term. As with any multivariable regression, the output includes the F statistic and R2, which conventionally indicate the model fit, and in this case, we are likely not concerned with the interpretation of the coefficients of each SNP on the exposure. Linear regression can also be used when the exposure is binary (e.g., in this case, we used it for genetic liability for diabetes), whereby the coefficients and statistics represent associations on an absolute scale rather than a relative risk or odds ratio scale. Therefore, here we calculated R2 and the F statistic for liability for diabetes using linear regression.

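The formula is

$$y = \beta_0 + \beta_1 \mathrm{SNP}_1 + \beta_2 \mathrm{SNP}_2 + \dots + \beta_k \mathrm{SNP}_k + \varepsilon$$

where k is the number of SNPs in the instrument.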
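The individual-level data regression approach can be sketched in a few lines. This is a minimal illustration on simulated data, assuming an unadjusted ordinary least squares fit of the exposure on all SNPs; the function name `instrument_strength`, the simulated allele frequency, and the simulated effect sizes are all assumptions introduced here and do not reflect the UKB analysis.

```python
import numpy as np

def instrument_strength(snps, exposure):
    """Fit exposure ~ intercept + all SNPs by OLS; return (R2, overall F)."""
    snps = np.asarray(snps, dtype=float)
    y = np.asarray(exposure, dtype=float)
    n, k = snps.shape
    # Design matrix with a leading column of ones for the constant (beta_0).
    X = np.column_stack([np.ones(n), snps])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r2 = 1.0 - ss_res / ss_tot
    # Overall F test of the regression with k and n - k - 1 degrees of freedom.
    f_stat = (r2 / k) / ((1.0 - r2) / (n - k - 1))
    return r2, f_stat

# Simulated example: 5,000 people, 10 SNPs, each with a small additive effect.
rng = np.random.default_rng(0)
n_people, n_snps = 5000, 10
sim_snps = rng.binomial(2, 0.3, size=(n_people, n_snps))
sim_exposure = sim_snps @ np.full(n_snps, 0.1) + rng.normal(size=n_people)
sim_r2, sim_f = instrument_strength(sim_snps, sim_exposure)
```

A real analysis would typically include covariates such as genetic principal components, which this sketch omits.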

Cragg-Donald F Statistic Formula.

For this method we use the Cragg-Donald F statistic formula provided in the article by Burgess et al. (68), which requires a value for R2 (previously calculated R2 values were 0.028 and 0.030 for HbA1c and 0.015 for diabetes), k (number of SNPs = 51, 275, and 157) and n (349,326). For consistency and comparability, we kept the R2, k, and n the same as in the individual-level data regression approach above. Above, we were able to calculate the R2 ourselves, but GWAS authors sometimes provide the R2 for the top SNPs, which could then be used in this formula.

The Cragg-Donald formula, as outlined by Burgess et al. (68), is

$$F = \left(\frac{n - k - 1}{k}\right)\left(\frac{R^2}{1 - R^2}\right)$$
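For readers working from summary-level information only, the calculation from R2, n, and k can be sketched as below. This assumes the commonly cited single-exposure form F = ((n − k − 1)/k) × (R2/(1 − R2)); the function name `cragg_donald_f` and the toy inputs are hypothetical and chosen only so the arithmetic is easy to follow.

```python
def cragg_donald_f(r2, n, k):
    """F statistic from variance explained (r2), sample size (n), SNP count (k).

    Assumes the single-exposure form ((n - k - 1) / k) * (r2 / (1 - r2)).
    """
    if not 0.0 <= r2 < 1.0:
        raise ValueError("r2 must be in [0, 1)")
    if k < 1 or n <= k + 1:
        raise ValueError("need k >= 1 and n > k + 1")
    return ((n - k - 1) / k) * (r2 / (1.0 - r2))

# Toy inputs (hypothetical): 5% variance explained, 1,001 people, 10 SNPs.
example_f = cragg_donald_f(r2=0.05, n=1001, k=10)
```

Because published R2 values are usually rounded, an F statistic computed this way should be treated as approximate.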

Literature Search Results

Our search yielded a total of 657 studies for diabetes, of which 609 did not instrument this phenotype; thus, 48 remained. For HbA1c, we found a total of 77 articles, of which 55 did not instrument HbA1c and were excluded, leaving 22 articles. From this literature search it was clear that many more studies currently choose to instrument type 2 diabetes over HbA1c.

Results of F Statistic (Average Instrument Strength) and R2 (Total Instrument Strength)

HbA1c 51- and 275-SNP Instrument and Partitioned Glycemic/Erythrocytic Instruments.

As per Table 1 below, when using 51 and 275 HbA1c SNPs in UKB, the individual-level data regression and Cragg-Donald formulae gave similar F statistics (using the same R2 values of 2.8% and 3%). The two methods yielded somewhat different F statistics for the 16-SNP glycemic instrument, but both were substantially greater than 10, indicating no cause for concern (Table 1). For the 19-SNP erythrocytic instrument the F statistics obtained using both methods were comparable (Table 1).

Table 1

Instrument strength metrics in UKB (N = 349,326)

Trait | Method | Variance explained (R2) | F statistic
Diabetes (157 SNPs) | ILDR | 0.015 (1.5%) | 27.43
Diabetes (157 SNPs) | CD | 0.015 (1.5%) | 27.9
HbA1c main instrument (51 SNPs) | ILDR | 0.028 (2.8%) | 164.6
HbA1c main instrument (51 SNPs) | CD | 0.028 (2.8%) | 164.8
HbA1c main instrument (275 SNPs) | ILDR | 0.030 (3%) | 33.24
HbA1c main instrument (275 SNPs) | CD | 0.030 (3%) | 38.08
HbA1c 16-SNP glycemic instrument | ILDR | 0.011 (1.1%) | 201.1
HbA1c 16-SNP glycemic instrument | CD | 0.011 (1.1%) | 182.3
HbA1c 19-SNP erythrocytic instrument | ILDR | 0.012 (1.2%) | 187.5
HbA1c 19-SNP erythrocytic instrument | CD | 0.012 (1.2%) | 184.3

CD, Cragg-Donald; ILDR, individual-level data regression.

Type 2 Diabetes 157-SNP Instrument in UKB.

Table 1 presents F statistics and R2 metrics using both methods. Results were comparable irrespective of which formula was used (with the same R2 of 1.5%).

Which Approach Should I Use in My Study?

The individual-level data regression approach naturally requires individual-level data for the exposure of interest, which are not always available to researchers. The Cragg-Donald formula, however, relies on having information about the R2, which could come from the published GWAS for the exposure, yet this is not always included in GWAS papers. The “t statistic” approach can be used to calculate the F statistic when the R2 is not known if β values or log(β values) and SEs are provided in the summary-level GWAS exposure data set. Thus, if individual-level data are available then the individual-level data regression may be recommended, but if this is not the case then the Cragg-Donald formula can be used.

Consideration of Total and Average Instrument Strength for HbA1c and Type 2 Diabetes

Across our empirical examples in the UKB, the HbA1c instrument outperformed the instrument for type 2 diabetes, in terms of total strength (R2) and average strength (F statistic), even though it contained markedly fewer SNPs. Specifically, the 16-SNP glycemic instrument had the highest average strength and explained 1% of the variance in HbA1c, which is lower than the 2.8% variance explained for the 51-SNP instrument but certainly still appropriate for use in MR. The type 2 diabetes 157-SNP instrument had a much smaller F statistic (F < 30) for UKB overall and explained ∼1.5% of the variance in diabetes in UKB. On the other hand, the HbA1c erythrocytic instrument also demonstrated that it is more than adequate for use in MR studies, with a similar R2 to the glycemic variants and an F value of just under 200. Therefore, whether or not it is partitioned into glycemic and erythrocytic, the HbA1c genetic instrument with 51 SNPs is overall a strong instrument for use in MR studies, as indicated by both R2 and F statistic metrics, even in comparison with the newer 275-SNP HbA1c instrument. However, the type 2 diabetes instrument appears to be somewhat weaker both in terms of total and average strength compared with the HbA1c genetic instrument(s).

Potential Recommendations for MR Studies Instrumenting Diabetes and/or HbA1c

First, as demonstrated in our empirical examples and argued above, “bigger is not always better” when it comes to selection of instruments for glycemic MR studies. Above we show that in some cases glycemic instruments with fewer SNPs may be stronger and therefore more robust for use in MR when it comes to trying to minimize the important issue of “weak instrument bias.” This is the case for both HbA1c and diabetes, with the HbA1c instrument being superior. We therefore recommend that researchers do not assume that the latest and largest GWAS will always yield the best genetic instrument for these exposures and that careful consideration be given to which GWAS is selected for the exposure. Genetic variants identified in older GWAS may of course also be pleiotropic. Thus, researchers might choose to empirically test this in their MR study by, for example, performing a phenome-wide association study. It is important to note that in instrument selection one will likely have to balance an instrument with a larger number of genetic variants (greater R2 = total strength) against an instrument with greater average strength (higher F statistic). When the former is prioritized, it is more likely that the instrument will include pleiotropic variants, which violates a core MR assumption. If the latter is prioritized, the total instrument strength may be weakened, as fewer variants often yield a larger F statistic but a lower variance explained in the exposure (R2). More variants also provide opportunities to use more robust methods, including common sensitivity analyses such as the MR-Egger test. Nevertheless, for the HbA1c instrument exemplified above in the UKB cohort, when we partitioned by glycemic versus erythrocytic variants, the R2 remained at approximately 1% for a small number of SNPs. Therefore, this example is a demonstration of an integrated approach with consideration of the total and average strength of the instrument, alongside biological function of the variants. In addition, another way to avoid pleiotropy is to use an approach such as that of Luo et al. (74), who adjusted for erythrocytic properties to control for unknown sources of pleiotropy.

Second, to reiterate the recommendation made by Boef et al. (75), and the more recent STROBE-MR guidelines (66), authors of MR studies should calculate and report the F statistic for the association between their genetic instrument and the exposure of interest in their study. As outlined earlier, this can be calculated using one of three approaches, depending on whether researchers have access to individual-level data. If individual-level data are available for the exposure of interest, then researchers should likely prioritize calculating the F statistic using the individual-level data regression approach. If individual-level data are not accessible, but the exposure GWAS article provides the R2 for the (exact) instrument that is being used, then we recommend using the Cragg-Donald F statistic method. An additional method exists, namely the t statistic method (F = β²/SE²), which we did not implement here because it is intended for situations in which the R2 is not known (i.e., not provided in the article for the GWAS for the exposure). In this equation, β represents the coefficient for each SNP’s association with the exposure and SE its standard error. With the t statistic method, the obtained F statistic is more of an approximation because the sample size of the discovery GWAS (usually for the exposure) is used rather than that of the outcome data set.
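The t statistic method can be sketched from summary-level data as follows. This is a minimal illustration, assuming per-SNP β values and SEs are available from the exposure GWAS; the function name `per_snp_f` and the toy values are hypothetical, and summarizing the per-SNP values by their mean is one common convention assumed here rather than a step specified in this article.

```python
import numpy as np

def per_snp_f(betas, ses):
    """Per-SNP F statistic, computed as (beta / SE) squared."""
    betas = np.asarray(betas, dtype=float)
    ses = np.asarray(ses, dtype=float)
    if betas.shape != ses.shape:
        raise ValueError("betas and ses must have the same shape")
    if np.any(ses <= 0):
        raise ValueError("standard errors must be positive")
    return (betas / ses) ** 2

# Toy summary statistics for 3 SNPs (values invented for illustration).
toy_betas = np.array([0.10, -0.06, 0.04])
toy_ses = np.array([0.01, 0.01, 0.02])
toy_f = per_snp_f(toy_betas, toy_ses)
# Assumed summary: the mean of the per-SNP F statistics.
toy_mean_f = float(toy_f.mean())
```

Inspecting the individual values as well as the mean can be useful, because a high average may conceal individual SNPs with F below 10.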

Third, and related to our earlier point, there are some complex issues surrounding genetic instrumentation of binary disease exposures such as diabetes (76). In instrumenting these types of disease exposures, it is important to note that we are modeling an underlying continuous measure where liability thresholds are used to separate individuals into different categories (76,77); thus, we should interpret MR using binary exposures in terms of genetic liability (77). If MR instrumental variable assumptions are met for the underlying continuous exposure that is used to categorize individuals, then we assume that we can infer causality using the binary exposure (76). There may be circumstances in which researchers feel the need to genetically instrument diabetes itself, as it may prove to be clinically informative. We still recommend that researchers interested in how hyperglycemia might causally impact a range of important health outcomes take advantage of what is evidently a strong HbA1c instrument. This instrument is currently underused, as we found only 22 studies where it was used as an exposure in MR studies, and thus we recommend that researchers exploit this instrument to a much greater extent. Also, the MAGIC GWAS does not include UKB, making this instrument very attractive for use in two-sample MR studies of HbA1c and important health outcomes. In terms of instrument metrics, our applied example in the UKB data showed that the HbA1c instrument clearly outperformed the diabetes instrument. The HbA1c instrument can also be split by biological function, into erythrocytic and glycemic SNPs, as shown above in our examples. Genetic instrumentation of a continuous exposure, such as HbA1c, also enables the application of nonlinear MR methods (78), which are also somewhat underused in MR. Using nonlinear MR methods can help define levels of risk and may also aid in understanding that it is both low and high levels of HbA1c that are associated with risk. While understanding the causal impact of disease status (e.g., diabetes) on a range of outcomes is both interesting and important, it is well established that continuous measures are superior and should be used where possible.

Fourth, we recommend that where plausible, researchers adopt an instrument selection approach such as that of Burgess et al. (57), which we described above (example A) with the aim of illustrating a novel line of thinking to integrate both diabetes and HbA1c into an MR study. In this study authors used a method that exploited properties of each of these exposures, and this yielded an instrument with good average (F = 144) and total strength (2.8% variance explained). An alternative form of biological integration is illustrated in the works of Yuan et al. (79) and Au Yeung et al. (80) who integrated expression of relevant genes and HbA1c in their instrument selection process.

Fifth, another example of an integrated approach to instrument selection is provided in example B above, in which we sought to bypass the issue of sample overlap in our previous MR study. To try to mitigate this we took as many of the newer diabetes variants as possible (from a more recent GWAS that contained overlap with our data under study) and used the effect estimates from the earlier GWAS. The most popular approach to instrument selection is, naturally, to take the most recent, largest GWAS (which often includes the UKB), due to assumptions that the benefits (e.g., large number of genetic variants) outweigh the risks (e.g., sample overlap). However, we show that a diabetes instrument with 77 SNPs had a larger F statistic (average strength), indicating that, if anything, this instrument carried a lower risk of weak instrument bias compared with our original 157-SNP instrument.

While our article focuses on genetic instrument selection for MR studies of HbA1c and/or liability for diabetes, we acknowledge that as a method, MR has limitations and is not a panacea for causality. As such, triangulation of findings is crucial whereby different study designs are used to enable robust causal statements. Key limitations of MR include confounding by ancestry, confounding by LD, confounding by horizontal pleiotropy, and canalization (81). Confounding by ancestry, or population stratification, refers to the fact that allele frequencies of common genetic variants, as well as disease frequencies, may differ by population. However, it is now common to adjust for genetic principal components in MR studies to correct for residual confounding by population structure. Confounding by LD refers to when the selected genetic variant(s) is/are in LD with (i.e., correlated with) another genetic variant associated with the outcome under study, which may produce a confounded causal estimate. Confounding by horizontal pleiotropy is when a single genetic variant influences the outcome under study directly rather than via the exposure being instrumented. However, numerous methods have been developed to detect and correct for horizontal pleiotropy (82). Canalization is when an individual develops a compensatory mechanism for disruptive genetic or environmental influences, as a response to higher or lower levels of a risk factor (e.g., higher or lower BMI).

In summary, we recommend that MR studies of glycemia take a more integrated approach when it comes to selection of genetic instruments. Therefore, careful consideration should be given to the following: 1) whether novel approaches such as those described here from the literature might be used; 2) which GWAS is used to select the instrument for the exposure; 3) whether a continuous, as opposed to a binary, exposure can be instrumented; and 4) inclusion of both variance explained (R2 = total strength of the instrument) and the F statistic (average strength).

This article contains supplementary material online at https://doi.org/10.2337/figshare.21390879.

Acknowledgments. The authors thank the volunteer participants of the UKB and the UKB researchers. The authors acknowledge Dr. Chloe Park (MRC Unit for Lifelong Health and Ageing, University College London) for her help with creating the graphical overview.

Funding. This work was conducted under UKB project no. 7661. V.G. is jointly funded by British Heart Foundation (SP/16/6/32726) and Diabetes UK (15/0005250) and is supported by the Professor David Matthews Non-Clinical Fellowship from the Diabetes Research and Wellness Foundation (SCA/01/NCF/22). A.S. and N.C. are supported by the U.K. Medical Research Council (MC_ST_LHA_2019, MC_UU_0019/2). S.B. is supported by a Sir Henry Dale Fellowship jointly funded by the Wellcome Trust and the Royal Society (204623/Z/16/Z). This research was supported by the National Institute for Health Research Cambridge Biomedical Research Centre (BRC-1215-20014).

The views expressed are those of the authors and not necessarily those of the National Institute for Health Research or the Department of Health and Social Care.

Duality of Interest. N.C. receives funds for serving on clinical trial data safety and monitoring committees sponsored by AstraZeneca. No other potential conflicts of interest relevant to this article were reported.

Author Contributions. V.G. conceived the idea for the study and performed the analyses. V.G. and A.S. each conducted the literature search for MR studies of diabetes/HbA1c. S.B. provided important intellectual input. N.C. supervised the project and provided important intellectual input. All authors provided feedback on drafts of the manuscript and approved the final version.

Data and Resource Availability. The UKB data are available at www.ukbiobank.ac.uk/using-the-resource/. This study was conducted using the UKB resource, application identifier 7661.

Readers may use this article as long as the work is properly cited, the use is educational and not for profit, and the work is not altered. More information is available at https://www.diabetesjournals.org/journals/pages/license.