Type 2 diabetes shows an increasing prevalence in both adults and children. Identification of biomarkers for both youth and adult-onset type 2 diabetes is crucial for development of screening tools or drug targets. In this study, using two-sample Mendelian randomization (MR), we identified 22 circulating proteins causally linked to adult type 2 diabetes and 11 proteins with suggestive evidence for association with youth-onset type 2 diabetes. Among these, colocalization analysis further supported a role in type 2 diabetes for C-type mannose receptor 2 (MR odds ratio [OR] 0.85 [95% CI 0.79–0.92] per genetically predicted SD increase in protein level), MANSC domain containing 4 (MR OR 0.90 [95% CI 0.88–0.92]), sodium/potassium-transporting ATPase subunit β2 (MR OR 1.10 [95% CI 1.06–1.15]), endoplasmic reticulum oxidoreductase 1β (MR OR 1.09 [95% CI 1.05–1.14]), spermatogenesis-associated protein 20 (MR OR 1.12 [95% CI 1.06–1.18]), haptoglobin (MR OR 0.96 [95% CI 0.94–0.98]), and α1–3-N-acetylgalactosaminyltransferase and α1–3-galactosyltransferase (MR OR 1.04 [95% CI 1.03–1.05]). Our findings support a causal role in type 2 diabetes for a set of circulating proteins, which represent promising type 2 diabetes drug targets.

Type 2 diabetes is considered the fifth leading cause of death worldwide among adults, leading to early morbidity and mortality (1). Over the past two decades, the prevalence of type 2 diabetes has increased substantially in children in all ancestries. Interestingly, youth-onset type 2 diabetes is associated with greater mortality and earlier complications, and its treatment options are limited (2). Primary prevention of both adult- and youth-onset type 2 diabetes consists of lifestyle interventions in individuals at high risk of developing the disease. Therefore, identifying early disease biomarkers is important for both screening for type 2 diabetes as early as in childhood and characterizing novel drug targets.

It has been reported that ∼150 Food and Drug Administration–approved biomarkers target plasma circulating proteins (3). Identification of circulating protein biomarkers using population-based proteomics can improve our understanding of the etiology of type 2 diabetes and enhance strategies for screening, diagnosis, and treatment of this disease. In several observational studies, mostly cross-sectional, ∼142 plasma proteins have been associated with risk of type 2 diabetes (3–5). However, these studies are prone to bias from unmeasured confounders, such as adiposity and inflammation. Reverse causation may also occur when type 2 diabetes itself leads to changes in circulating protein levels, for instance, due to the profound metabolic derangement of overt diabetes. Therefore, the available observational evidence has not been able to establish a causal association between these proteins and type 2 diabetes (4).

Mendelian randomization (MR) uses instrumental variable analysis to minimize confounding and reverse causation when estimating the causal effect of an exposure (such as a circulating protein) on a disease outcome (6). Adding to existing evidence from observational studies (3–5), a recent MR study identified candidate protein biomarkers for adult type 2 diabetes (7) using a set of genetic instruments associated with the tested proteins in a large proteomic cohort, but not at a genome-wide level. Expanding this approach, in the current study, we used genome-wide significant single nucleotide polymorphisms (SNPs) from the five largest protein genome-wide association study (GWAS) consortia available to date (8–12) as genetic instruments for our protein exposures and queried their effects on type 2 diabetes in the largest adult type 2 diabetes GWAS available to date (13). Furthermore, we applied a similar approach to identify causal circulating proteins for youth-onset type 2 diabetes (2) using data from the only available, recently published pediatric type 2 diabetes GWAS.

Study Exposures

We used the five largest proteomic GWAS to date (8–12) to identify SNPs to serve as MR instruments for circulating proteins. These instruments, termed cis-protein quantitative trait loci (cis-pQTLs), were defined as the genome-wide significant SNPs within 1 Mb of the gene encoding the measured protein. The circulating proteins in the Sun et al. (8), Emilsson et al. (9), Suhre et al. (11), and Yao et al. (12) GWAS were quantified using aptamer-based (SOMAmer) technology, whereas in the GWAS by Folkersen et al. (10), they were measured using the Olink platform (Supplementary Table 1).

Study Outcomes

Adult Type 2 Diabetes GWAS: Adjusted and Unadjusted for BMI

To test whether the proteins instrumented by the aforementioned cis-pQTLs are associated with adult type 2 diabetes risk, we retrieved the effects of these cis-pQTLs on type 2 diabetes from the DIAMANTE consortium GWAS for type 2 diabetes, which is a meta-analysis of 32 European type 2 diabetes cohorts with available GWAS data (13) (n = 74,124 case and 824,006 control subjects) (Fig. 1). As a sensitivity analysis, and since there is a known overlap in the genetic architecture of obesity and type 2 diabetes (14), we repeated our MR analysis using effects from the BMI-adjusted DIAMANTE GWAS to assess whether BMI could have affected the association of cis-pQTLs with adult type 2 diabetes risk. This GWAS included up to 50,409 case subjects with type 2 diabetes and 523,897 control subjects of European ancestry (13). Details on these cohorts can be found in the prior publication (13).

Figure 1

Flowchart of our MR studies assessing the causal role of circulating proteins on adult and youth-onset type 2 diabetes, and of sensitivity analyses testing the MR assumptions.


Youth-Onset Type 2 Diabetes GWAS

To test whether any circulating proteins from the aforementioned MR analysis had evidence for a causal role in youth-onset type 2 diabetes, we used proteins nominally associated (MR P value <0.05) with BMI-unadjusted adult type 2 diabetes risk as exposures and queried the effects of their respective cis-pQTLs in a recently published GWAS of youth-onset type 2 diabetes (2) (Fig. 1). This GWAS included 3,006 youth case subjects and 6,000 adult control subjects from the multiethnic Progress in Diabetes Genetics in Youth (ProDiGY) consortium. For the purpose of our MR study, we used GWAS data from the European ancestry subset of ProDiGY, including 664 case subjects with type 2 diabetes and 1,976 control subjects (2).

Statistical Analyses

Two-Sample MR

To test for causal evidence between circulating protein levels and type 2 diabetes risk in adults or children, we performed two-sample MR analyses using the Wald ratio method, as implemented in the “TwoSampleMR” R package (15). First, we identified the lead SNPs (cis-pQTLs) with the lowest P value for association with protein levels in the five proteomic GWAS (8–12). We then performed linkage disequilibrium (LD) clumping (R² < 0.3) using the 1000 Genomes “EUR” reference panel to avoid including more than a single cis-pQTL instrument per protein exposure. LD clumping was performed within each proteomic GWAS, but not across studies. We next combined all of the candidate proteins prioritized by our MR analysis across the five proteomic studies (8–12) and, to further ensure that the first MR assumption was satisfied, assessed two metrics of instrument strength for each cis-pQTL: the proportion of variance in protein level explained by the SNP (R²) and the F statistic of the cis-pQTL association (an F statistic >10 implying a strong instrument). We calculated R² using the formula $R^2 \approx 2\beta^2 f(1 - f)$, where β is the effect estimate of the allele on a standardized phenotype and f is the effect allele frequency (16). We computed the F statistic of each cis-pQTL using the formula $F = \dfrac{R^2/k}{(1 - R^2)/(n - k - 1)}$, where R² is the proportion of the variance of the respective protein level explained by the cis-pQTL, k is the number of instruments used in the model (in this case, k = 1, since there was a single cis-pQTL per protein), and n is the GWAS sample size (17).
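The two instrument-strength formulas above can be sketched in a few lines of Python. This is an illustrative sketch only, not the code used in the study; the function names and the input values in the usage note are hypothetical.

```python
def variance_explained(beta: float, eaf: float) -> float:
    """Approximate R^2 for one SNP: 2 * beta^2 * f * (1 - f).

    beta is the per-allele effect on a standardized phenotype and
    eaf is the effect allele frequency f.
    """
    return 2.0 * beta ** 2 * eaf * (1.0 - eaf)


def f_statistic(r2: float, n: int, k: int = 1) -> float:
    """F = (R^2 / k) / ((1 - R^2) / (n - k - 1)).

    k is the number of instruments (1 for a single cis-pQTL) and
    n is the sample size of the proteomic GWAS.
    """
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


def is_strong_instrument(r2: float, n: int, k: int = 1) -> bool:
    """Apply the conventional F > 10 rule of thumb."""
    return f_statistic(r2, n, k) > 10.0
```

For example, with a hypothetical effect of 0.5 SD per allele and an allele frequency of 0.3, `variance_explained(0.5, 0.3)` gives an R² of 0.105, which can then be passed to `f_statistic` together with the sample size of the relevant proteomic GWAS.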

Then, we tested the effects of the lead cis-pQTL in the adult type 2 diabetes DIAMANTE GWAS, unadjusted or adjusted for BMI (13), and in the youth-onset type 2 diabetes ProDiGY GWAS (2). Because we used single-variant MR, the Wald ratio for each protein was calculated by dividing the SNP-outcome effect by the SNP-exposure effect, yielding a single MR estimate for the effect of each protein on type 2 diabetes risk. Next, we applied Bonferroni correction to control for the total number of proteins tested in our MR experiments. After prioritizing proteins based on Bonferroni correction, the findings for proteins measured in more than one proteomic GWAS were cross-validated. Although we allowed for overlaps of tested proteins across proteomic GWAS, LD clumping was performed to ensure that there was a single cis-pQTL per protein per proteomic GWAS. As a final stage, the results of the five independent MR analyses, one per proteomic GWAS, were combined and compared.

The findings of our single-variant MR studies are presented as MR odds ratios (ORs) and 95% CIs for risk of type 2 diabetes per genetically predicted 1 SD increase in circulating protein level.
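As an illustration of how a single-variant Wald ratio translates into the reported MR OR and 95% CI, a minimal Python sketch follows. It assumes the common first-order approximation in which the standard error of the ratio is the SNP-outcome standard error divided by the absolute SNP-exposure effect (uncertainty in the exposure effect is ignored); the function names are hypothetical and this is not the study's own code.

```python
import math


def wald_ratio(beta_exposure: float, beta_outcome: float, se_outcome: float):
    """Single-SNP MR estimate (log OR per 1 SD increase in protein level).

    beta_exposure: SNP effect on the protein level (SD units).
    beta_outcome:  SNP effect on type 2 diabetes (log OR).
    se_outcome:    standard error of beta_outcome.
    """
    if beta_exposure == 0:
        raise ValueError("SNP-exposure effect must be non-zero")
    beta_mr = beta_outcome / beta_exposure
    # First-order approximation: exposure-side uncertainty is ignored.
    se_mr = se_outcome / abs(beta_exposure)
    return beta_mr, se_mr


def odds_ratio_with_ci(beta_mr: float, se_mr: float, z: float = 1.96):
    """Convert a log OR and its SE into (OR, lower 95% bound, upper 95% bound)."""
    return (
        math.exp(beta_mr),
        math.exp(beta_mr - z * se_mr),
        math.exp(beta_mr + z * se_mr),
    )
```

Both functions expect the exposure and outcome effects to be aligned to the same effect allele before the ratio is taken.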

MR Assumptions

In each MR analysis, three assumptions need to be satisfied. The first MR assumption requires that the genetic instrument must be strongly associated with the exposure; we thus used cis-pQTLs, which have been associated with their respective protein’s level at a genome-wide significant level (P ≤ 5 × 10⁻⁸). The cis-pQTLs were defined as the genome-wide significant SNPs with the lowest P value within 1 Mb of the transcription start site of the gene encoding the measured protein. For the cis-pQTLs that were not present in the type 2 diabetes GWAS, SNPs in high LD (defined by an LD R² ≥ 0.8 in the 1000 Genomes phase 3 European panel) were selected as proxies in the LDlink website (https://ldlink.nci.nih.gov/?tab=ldproxy).
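The cis-pQTL definition above (genome-wide significant, within 1 Mb of the transcription start site, lowest P value) can be expressed as a simple filter. The sketch below is illustrative only; the `PqtlHit` record, its field names, and the example variants in the usage note are hypothetical and do not reflect the data structures used in the study.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

GENOME_WIDE_P = 5e-8        # significance threshold stated in the text
CIS_WINDOW_BP = 1_000_000   # 1 Mb around the transcription start site


@dataclass(frozen=True)
class PqtlHit:
    """One SNP-protein association from a proteomic GWAS (hypothetical layout)."""
    rsid: str
    chrom: str
    position: int
    p_value: float


def lead_cis_pqtl(hits: Iterable[PqtlHit], gene_chrom: str, gene_tss: int) -> Optional[PqtlHit]:
    """Return the genome-wide significant cis SNP with the lowest P value, if any."""
    candidates = [
        h for h in hits
        if h.chrom == gene_chrom
        and abs(h.position - gene_tss) <= CIS_WINDOW_BP
        and h.p_value <= GENOME_WIDE_P
    ]
    return min(candidates, key=lambda h: h.p_value, default=None)
```

A protein with no qualifying SNP returns `None` and would simply have no cis instrument under this definition.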

According to the second MR assumption, the genetic instrument should not be associated with confounders that link the exposure to outcome. We therefore used the PhenoScanner v2 (18) database to determine any reported associations of the cis-pQTLs of the MR-prioritized proteins with potential confounders at a genome-wide significant level (P ≤ 5 × 10⁻⁸).

The third MR assumption, known as the exclusion restriction assumption, requires that the genetic instruments should be associated with the outcome only via the exposure. To satisfy this assumption, we elected to use only cis-acting SNPs (located within 1 Mb of the genes that encode the proteins) (19) as instruments in our MR studies. Since cis-pQTLs are considered to have a more direct influence on their protein than trans-pQTLs, they are less likely to affect the outcome independently of the level of the protein encoded by their respective gene.

Sensitivity Analyses

Assessment for Confounding

In order to explore the second MR assumption, we queried reported genome-wide significant associations of the cis-pQTLs of the MR-prioritized proteins with potential confounders, such as body fat mass and waist-to-hip ratio, using PhenoScanner v2 (18).

Colocalization Analysis

We further assessed potential confounding by LD by checking whether the cis-pQTL of each MR-prioritized protein is itself associated with adult- or youth-onset type 2 diabetes or is instead in LD with a separate causal variant for type 2 diabetes. To do so, we used colocalization, as implemented in the coloc R package (20). The colocalization analysis provides posterior probabilities for H0 (no association of the genomic locus with either trait), H1 (association with type 2 diabetes but not with the protein level), H2 (association with the protein level but not with type 2 diabetes), H3 (association with type 2 diabetes and the protein level through two different SNPs), and H4 (association with type 2 diabetes and the protein level through one shared SNP). To determine the posterior probability of each genomic locus containing a single variant affecting both the protein level and type 2 diabetes, we analyzed all SNPs within 1 Mb of the cis-pQTL. Colocalization analyses were performed only for proteins with evidence of significant association with type 2 diabetes in our MR analysis and with available summary-level results from the GWAS by Sun et al. (8). Visualization of colocalization results was performed using the LocusCompareR R package (21).
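To make the five hypotheses concrete, the following Python sketch computes H0 to H4 posterior probabilities from per-SNP summary statistics using approximate Bayes factors under a single-causal-variant model. It is a simplified illustration of the general approach, not the coloc package itself: the prior probabilities (1 × 10⁻⁴, 1 × 10⁻⁴, and 1 × 10⁻⁵) and the prior effect-size standard deviations (0.15 for the protein level, 0.2 for type 2 diabetes) are assumed values chosen for illustration, and all function and parameter names are hypothetical.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) for a non-empty list."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def log_abf(beta, se, prior_sd):
    """Log approximate Bayes factor for one SNP (normal prior on the effect)."""
    z = beta / se
    shrink = prior_sd ** 2 / (prior_sd ** 2 + se ** 2)
    return 0.5 * (math.log(1.0 - shrink) + shrink * z * z)


def coloc_posteriors(protein, diabetes, p_protein=1e-4, p_t2d=1e-4,
                     p_shared=1e-5, sd_protein=0.15, sd_t2d=0.2):
    """Posterior probabilities of H0..H4, labeled as in the text above.

    protein, diabetes: lists of (beta, se) for the same SNPs in the same order.
    All priors are illustrative assumptions, not values taken from the study.
    """
    if len(protein) != len(diabetes) or len(protein) < 2:
        raise ValueError("need the same SNPs (at least two) for both traits")
    l_prot = [log_abf(b, s, sd_protein) for b, s in protein]
    l_t2d = [log_abf(b, s, sd_t2d) for b, s in diabetes]
    # H4: the same SNP drives both traits; H3: two different SNPs.
    same = [a + b for a, b in zip(l_prot, l_t2d)]
    different = [a + b for i, a in enumerate(l_prot)
                 for j, b in enumerate(l_t2d) if i != j]
    log_h = {
        "H0": 0.0,
        "H1": math.log(p_t2d) + log_sum_exp(l_t2d),
        "H2": math.log(p_protein) + log_sum_exp(l_prot),
        "H3": math.log(p_protein) + math.log(p_t2d) + log_sum_exp(different),
        "H4": math.log(p_shared) + log_sum_exp(same),
    }
    total = log_sum_exp(list(log_h.values()))
    return {name: math.exp(value - total) for name, value in log_h.items()}
```

The pairwise enumeration for H3 scales quadratically with the number of SNPs; it is used here because it avoids the loss of precision that subtracting two nearly equal sums can cause, which matters more for a readable sketch than speed does.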

Multiple Locus Analysis

As a further sensitivity analysis, we assessed whether including multiple genetic instruments per protein exposure, which together explain a larger portion of its variance, would affect the results of our main MR analysis. To do this, we performed a multiple locus analysis, including both cis- and trans-pQTLs (i.e., pQTLs that did not satisfy the aforementioned criteria for a cis-pQTL) whenever the latter were available for our protein exposures. We used the inverse variance–weighted MR approach, as implemented in the “TwoSampleMR” R package (15), to meta-analyze the MR effects of the SNPs used as instruments for the subset of candidate proteins from our main MR analysis that had available trans-pQTLs.
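The idea behind inverse variance–weighted MR can be illustrated as a fixed-effect meta-analysis of per-SNP Wald ratios, in which each ratio is weighted by the inverse of its variance. The Python sketch below shows only this basic form, with a hypothetical function name; the estimator in the “TwoSampleMR” package may differ in detail (for example, in how heterogeneity between SNPs is handled).

```python
import math


def ivw_fixed_effect(wald_estimates):
    """Combine per-SNP Wald ratios into one inverse variance-weighted estimate.

    wald_estimates: list of (beta_mr, se_mr) pairs, one per instrument
    (cis- or trans-pQTL), each on the log OR scale.
    Returns the pooled log OR and its standard error.
    """
    if not wald_estimates:
        raise ValueError("at least one instrument is required")
    weights = [1.0 / se ** 2 for _, se in wald_estimates]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, (b, _) in zip(weights, wald_estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se
```

With a single instrument the pooled estimate reduces to the Wald ratio itself, which is why the main single-variant analysis and this sensitivity analysis are directly comparable.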

Assessment for Protein-Altering Variants

For cis-pQTLs of our MR-prioritized proteins quantified on the SomaLogic platform, we assessed the possibility of aptamer-binding effects, in which the presence of protein-altering variants (PAVs) may affect protein measurements. We checked whether each MR-prioritized cis-pQTL is itself a PAV or is in LD (R² > 0.8) with a PAV. Since such variants may impact direct measurements of the respective proteins using antibody-based or aptamer-based methods, this assessment was important for the interpretation of the results of our MR study and for future validation studies.

Expression QTL Assessment

To assess whether the cis-pQTLs of the MR-prioritized proteins also affect gene expression, that is, whether they show evidence of being expression quantitative trait loci (eQTLs), we used the Genotype-Tissue Expression (GTEx) database (22) (https://www.gtexportal.org).

Data and Resource Availability

Data from proteomics studies are available from the referenced peer-reviewed studies or their corresponding authors, as applicable. Summary statistics for the type 2 diabetes GWAS are publicly available for download from the GWAS catalog. The statistical code needed to reproduce the results in the article is available upon request.

Combining all five proteomic GWAS (8–12), we obtained 1,690 cis-pQTLs, including cis-pQTLs in overlapping loci. These 1,690 cis-pQTLs correspond to 1,089 unique circulating proteins and have available GWAS effects on BMI-unadjusted adult type 2 diabetes for our MR studies. After Bonferroni correction for multiple testing (P value threshold for significance = 0.05/1,089 or 4.6 × 10⁻⁵), our MR analyses revealed associations for 20 circulating proteins, which all had cis-pQTLs with an F statistic >10, indicating that these SNPs were strong instruments (Table 1 and Supplementary Table 2). As shown in Table 1, these 20 proteins included cyclin-dependent kinase 2–associated protein 1 (CDK2AP1), cyclin H (CCNH), tyrosine-protein kinase receptor (TYRO3), mitogen-activated protein kinase 3 (MAPK3), tubulin folding cofactor E (TBCE), TNF receptor superfamily member 6B (TNFRSF6B), arginase 1 (ARG1), drebrin-like (DBNL), C-type mannose receptor 2 (MRC2), sex hormone–binding globulin (SHBG), activating transcription factor 6β (ATF6B), spermatogenesis-associated protein 20 (SPATA20), sodium/potassium-transporting ATPase subunit β2 (ATP1B2), MANSC domain containing 4 (MANSC4), haptoglobin (HP), β-mannosidase (MANBA), α1–3-N-acetylgalactosaminyltransferase and α1–3-galactosyltransferase (ABO), ACE I (peptidyl-dipeptidase A) 1 (ACE), peptidyl-glycine α-amidating monooxygenase (PAM), and neural EGFL-like 1 (NELL1). Importantly, as demonstrated in Fig. 2A, we observed MR ORs ranging from 0.69 (for CDK2AP1) to 1.30 (for TYRO3) per SD increase in protein levels. This means that the risk of type 2 diabetes increased 1.3-fold per genetically predicted SD increase in TYRO3 level (MR OR 1.30 [95% CI 1.19–1.41]; P = 2.0 × 10⁻⁹), while a genetically predicted SD increase in CDK2AP1 level was associated with a ∼30% decrease in the risk of type 2 diabetes (MR OR 0.69 [95% CI 0.61–0.78]; P = 2.3 × 10⁻⁹).
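The Bonferroni threshold quoted above is simply the nominal significance level divided by the number of proteins tested. A short Python sketch of this arithmetic follows; the function names are hypothetical.

```python
def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test significance threshold after Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be a positive integer")
    return alpha / n_tests


def passes_bonferroni(p_value: float, n_tests: int, alpha: float = 0.05) -> bool:
    """True if an MR P value is below the corrected threshold."""
    return p_value < bonferroni_threshold(n_tests, alpha)
```

With the 1,089 unique proteins tested here, `bonferroni_threshold(1089)` is approximately 4.6 × 10⁻⁵, so a protein with an MR P value of 3.43 × 10⁻⁵ (the value listed for NELL1 in Table 1) passes the corrected threshold.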

Figure 2

Forest plots displaying the results of the MR analyses. A: Forest plot displaying the MR OR and 95% CIs of BMI-unadjusted adult type 2 diabetes per genetically predicted 1 SD increase of each candidate protein level. B: Forest plot displaying the MR OR and 95% CIs of youth-onset type 2 diabetes per genetically predicted 1 SD increase of each candidate protein level. C: Forest plot displaying the MR OR and 95% CIs of BMI-adjusted adult type 2 diabetes per genetically predicted 1 SD increase of each candidate protein level.

Table 1

MR results for circulating proteins associated with BMI-unadjusted adult type 2 diabetes, after Bonferroni correction

| Protein | Chr. | Position | rs number cis-pQTL | EAF | EA | MR OR | 95% CI | MR P value | R² | F statistic | Source (first author, reference) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MANSC4 | 12 | 27927881 | rs36138811 | 0.23 | | 0.90 | 0.88–0.92 | 3.81 × 10⁻¹⁸ | 0.14 | 557.01 | Sun et al. (8) |
| MANSC4 | 12 | 27923241 | rs11049131 | 0.77 | | 0.92 | 0.91–0.94 | 3.81 × 10⁻¹⁸ | 0.24 | 1018.26 | Emilsson et al. (9) |
| ABO | | 136149229 | rs505922 | 0.31 | | 1.04 | 1.03–1.05 | 6.62 × 10⁻¹² | 0.72 | 8668.98 | Sun et al. (8) |
| ABO | | 136144960 | rs492488 | 0.74 | | 1.00 | 1.00–1.00 | 1.39 × 10⁻⁹ | 0.82 | 14796.78 | Emilsson et al. (9) |
| TYRO3 | 15 | 41860698 | rs2289743 | 0.69 | | 1.30 | 1.19–1.41 | 1.97 × 10⁻⁹ | 0.01 | 35.80 | Emilsson et al. (9) |
| CDK2AP1 | 12 | 123614813 | rs2510885 | 0.75 | | 0.69 | 0.61–0.78 | 2.31 × 10⁻⁹ | 0.01 | 18.45 | Emilsson et al. (9) |
| PAM | | 102418604 | rs257309 | 0.35 | | 0.92 | 0.89–0.94 | 2.37 × 10⁻⁹ | 0.10 | 363.29 | Sun et al. (8) |
| CCNH | | 86577352 | rs7719891 | 0.76 | | 0.72 | 0.64–0.81 | 2.77 × 10⁻⁸ | 0.01 | 17.59 | Emilsson et al. (9) |
| TBCE | | 235594951 | rs10802708 | 0.65 | | 0.79 | 0.73–0.86 | 4.91 × 10⁻⁸ | 0.09 | 314.04 | Emilsson et al. (9) |
| MANBA | | 103680984 | rs223489 | 0.66 | | 0.92 | 0.90–0.95 | 4.91 × 10⁻⁸ | 0.01 | 34.98 | Emilsson et al. (9) |
| MANBA | | 103612043 | rs227370 | 0.67 | | 0.94 | 0.92–0.96 | 3.88 × 10⁻⁷ | 0.14 | 525.84 | Sun et al. (8) |
| ACE | 17 | 61566724 | rs4344 | 0.50 | | 0.95 | 0.93–0.97 | 5.73 × 10⁻⁷ | 0.17 | 654.75 | Emilsson et al. (9) |
| ATF6B | | 32113980 | rs114887538 | 0.76 | | 1.15 | 1.09–1.23 | 1.68 × 10⁻⁶ | 0.02 | 65.55 | Emilsson et al. (9) |
| DBNL | | 44156146 | rs3087367 | 0.56 | | 0.87 | 0.82–0.93 | 5.86 × 10⁻⁶ | 0.02 | 73.28 | Emilsson et al. (9) |
| HP | 16 | 72105965 | rs217184 | 0.20 | | 0.96 | 0.94–0.98 | 1.13 × 10⁻⁵ | 0.24 | 1027.97 | Sun et al. (8) |
| SHBG | 17 | 7531965 | rs858519 | 0.47 | | 0.88 | 0.83–0.93 | 1.21 × 10⁻⁵ | 0.04 | 138.35 | Emilsson et al. (9) |
| ATP1B2 | 17 | 7554772 | rs1642762 | 0.59 | | 1.10 | 1.06–1.15 | 1.21 × 10⁻⁵ | 0.02 | 74.74 | Sun et al. (8) |
| ARG1 | | 131897278 | rs2781668 | 0.80 | | 0.80 | 0.72–0.88 | 1.34 × 10⁻⁵ | 0.01 | 27.59 | Emilsson et al. (9) |
| HP | 16 | 72114002 | rs217181 | 0.20 | | 1.04 | 1.02–1.05 | 1.55 × 10⁻⁵ | 0.29 | 403.95 | Suhre et al. (11) |
| TNFRSF6B | 20 | 62370349 | rs1056441 | 0.61 | | 1.23 | 1.12–1.35 | 2.00 × 10⁻⁵ | 0.01 | 30.57 | Emilsson et al. (9) |
| SPATA20 | 17 | 48624523 | rs9890200 | 0.38 | | 1.12 | 1.06–1.18 | 2.21 × 10⁻⁵ | 0.03 | 102.32 | Sun et al. (8) |
| MRC2 | 17 | 60637258 | rs146385050 | 0.20 | | 0.85 | 0.79–0.92 | 2.84 × 10⁻⁵ | 0.02 | 50.35 | Sun et al. (8) |
| MAPK3 | 16 | 30134656 | rs28529403 | 0.40 | | 1.27 | 1.13–1.42 | 3.27 × 10⁻⁵ | 0.01 | 19.75 | Emilsson et al. (9) |
| HP | 16 | 72079657 | rs77303550 | 0.82 | | 0.96 | 0.94–0.98 | 3.38 × 10⁻⁵ | 0.23 | 952.73 | Emilsson et al. (9) |
| NELL1 | 11 | 20952237 | rs16907058 | 0.95 | | 1.06 | 1.03–1.09 | 3.43 × 10⁻⁵ | 0.09 | 334.45 | Emilsson et al. (9) |

MR OR represents the OR for type 2 diabetes per 1 SD increase in the protein level.

Chr., chromosome; EA, effect allele; EAF, effect allele frequency.

Next, we undertook an MR study using cis-pQTLs linked to 278 proteins that were nominally associated with adult type 2 diabetes in the previous MR study to evaluate the causal role of these prioritized proteins in youth-onset type 2 diabetes (2) (Fig. 1). By querying the 278 cis-pQTLs in the youth-onset type 2 diabetes GWAS, we retrieved effects of cis-pQTLs for 174 unique circulating proteins (Table 2 and Supplementary Table 3), which were used as genetic instruments for their respective proteins in our MR studies. As shown in Table 2, our MR analyses identified 11 proteins nominally associated with risk of youth-onset type 2 diabetes, but after Bonferroni correction (P value threshold for significance = 0.05/174 or 2.8 × 10⁻⁴), no protein was significantly associated with risk of youth-onset type 2 diabetes. The 11 circulating proteins, all with F statistics >10, were growth differentiation factor 15 (GDF15), antiselectin-like osteoblast-derived protein 1 (SVEP1), surface glycoprotein, Ig superfamily member (CDON), kinase insert domain receptor (KDR), cytochrome B5 type A (CYB5A), complement component 4A/B (Rodgers blood group) (C4A/C4B) complex, fibroblast growth factor 2 (FGF2), CCNH, TNFRSF6B, ABO, and ACE; the latter four proteins were also associated with adult type 2 diabetes. As demonstrated in Fig. 2B, we observed MR ORs ranging from 0.77 (for CYB5A and CCNH) to 1.25 (for TNFRSF6B) per SD increase in protein levels. Specifically, a genetically predicted SD increase in CYB5A and CCNH levels was associated with a ∼23% decreased risk of youth-onset type 2 diabetes (MR OR 0.77 [95% CI 0.61–0.97], P = 0.03; and MR OR 0.77 [95% CI 0.61–0.98], P = 0.03, respectively), while a 25% increase in the risk of youth-onset type 2 diabetes was detected per genetically predicted SD increase in TNFRSF6B protein levels (MR OR 1.25 [95% CI 1.01–1.54]; P = 0.037).

Table 2

MR results for circulating proteins associated with youth-onset type 2 diabetes

| Protein | Chr. | Position | rs number cis-pQTL | EAF | EA | MR OR | 95% CI | MR P value | R² | F statistic | Source (first author, reference) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GDF15 | 19 | 18503194 | rs45543339 | 0.26 | | 1.07 | 1.02–1.13 | 0.009 | 0.13 | 479.97 | Sun et al. (8) |
| SVEP1 | | 113312231 | rs61751937 | 0.03 | | 0.89 | 0.81–0.98 | 0.015 | 0.08 | 294.12 | Sun et al. (8) |
| SVEP1 | | 113260708 | rs78742138 | 0.96 | | 0.82 | 0.69–0.97 | 0.020 | 0.03 | 101.27 | Emilsson et al. (9) |
| CDON | 11 | 125897840 | rs60929339 | 0.93 | | 0.91 | 0.84–0.99 | 0.021 | 0.06 | 214.61 | Emilsson et al. (9) |
| CDON | 11 | 125889526 | rs3740909 | 0.07 | | 1.08 | 1.01–1.15 | 0.022 | 0.09 | 102.35 | Suhre et al. (11) |
| KDR | | 55979558 | rs2305948 | 0.93 | | 0.89 | 0.81–0.98 | 0.023 | 0.03 | 86.60 | Emilsson et al. (9) |
| CYB5A | 18 | 71945370 | rs7239618 | 0.68 | | 0.77 | 0.61–0.97 | 0.030 | 0.01 | 22.21 | Emilsson et al. (9) |
| CCNH | | 86577352 | rs7719891 | 0.76 | | 0.77 | 0.61–0.98 | 0.032 | 0.01 | 17.59 | Emilsson et al. (9) |
| ABO | | 136149229 | rs505922 | 0.31 | | 1.02 | 1.00–1.05 | 0.036 | 0.72 | 8668.98 | Sun et al. (8) |
| TNFRSF6B | 20 | 62370349 | rs1056441 | 0.61 | | 1.25 | 1.01–1.54 | 0.037 | 0.01 | 30.57 | Emilsson et al. (9) |
| ACE | 17 | 61566724 | rs4344 | 0.50 | | 1.05 | 1.00–1.10 | 0.042 | 0.17 | 654.75 | Emilsson et al. (9) |
| C4A/C4B | | 31928691 | rs2280774 | 0.37 | | 1.06 | 1.00–1.13 | 0.046 | 0.12 | 134.84 | Suhre et al. (11) |
| FGF2 | | 123757748 | rs308403 | 0.32 | | 0.95 | 0.91–1.00 | 0.046 | 0.15 | 177.84 | Suhre et al. (11) |

MR OR represents the OR for youth-onset type 2 diabetes per 1 SD increase in the protein level.

Chr., chromosome; EA, effect allele; EAF, effect allele frequency.

Obesity, expressed as an increased BMI, increases the risk of type 2 diabetes (14,23). Therefore, we assessed whether BMI could have affected the findings of our main MR analysis on adult type 2 diabetes risk. Using cis-pQTLs for candidate proteins from the same five proteomic GWAS (8–12), we queried their effects on BMI-adjusted adult type 2 diabetes risk in the DIAMANTE consortium (13). We identified effects on BMI-adjusted type 2 diabetes for 915 unique circulating proteins with distinct cis-pQTLs, which were used as instruments in our MR studies (Table 3 and Supplementary Table 4). After Bonferroni correction for multiple testing (P value threshold for significance = 0.05/915 or 5.45 × 10⁻⁵), MR effects for nine of the candidate proteins from the main MR analysis were attenuated after adjusting the outcome (type 2 diabetes) for BMI (Table 3). However, 13 circulating proteins, all with cis-pQTL F statistics >10, remained causally associated with type 2 diabetes (Table 3). These 13 proteins included TYRO3, TNFRSF6B, TBCE, SHBG, MRC2, DBNL, ATP1B2, ATF6B, PAM, ABO, and MANSC4, which were also associated with the risk of BMI-unadjusted type 2 diabetes, and 2 novel proteins, endoplasmic reticulum oxidoreductase 1β (ERO1LB) and MHC class I polypeptide–related sequence B (MICB) (Table 3). As demonstrated in Fig. 2C, after adjusting for BMI, we obtained comparable ORs ranging from 0.79 (0.72–0.86) (for MRC2) to 1.34 (1.21–1.48) (for TYRO3) per SD increase in protein levels.

Table 3

MR results for circulating proteins associated with BMI-adjusted adult type 2 diabetes, after Bonferroni correction

| Protein | Chr. | Position | rs number cis-pQTL | EAF | EA | MR OR | 95% CI | MR P value | R² | F statistic | Source (first author, reference) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MANSC4 | 12 | 27927881 | rs36138811 | 0.23 | | 0.91 | 0.88–0.93 | 3.25 × 10⁻¹² | 0.144 | 557.006 | Sun et al. (8) |
| MANSC4 | 12 | 27923241 | rs11049131 | 0.77 | | 0.93 | 0.91–0.95 | 3.25 × 10⁻¹² | 0.242 | 1018.259 | Emilsson et al. (9) |
| ATP1B2 | 17 | 7554772 | rs1642762 | 0.59 | | 1.17 | 1.11–1.23 | 7.06 × 10⁻⁹ | 0.040 | 138.346 | Sun et al. (8) |
| TYRO3 | 15 | 41860698 | rs2289743 | 0.69 | | 1.34 | 1.21–1.48 | 9.94 × 10⁻⁹ | 0.011 | 35.796 | Emilsson et al. (9) |
| PAM | | 102418604 | rs257309 | 0.35 | | 0.91 | 0.88–0.94 | 1.69 × 10⁻⁸ | 0.099 | 363.292 | Sun et al. (8) |
| SHBG | 17 | 7531965 | rs858519 | 0.47 | | 0.83 | 0.77–0.88 | 4.59 × 10⁻⁸ | 0.023 | 74.736 | Emilsson et al. (9) |
| MRC2 | 17 | 60637258 | rs146385050 | 0.20 | | 0.79 | 0.72–0.86 | 1.99 × 10⁻⁷ | 0.015 | 50.350 | Sun et al. (8) |
| ABO | | 136149229 | rs505922 | 0.31 | | 1.03 | 1.02–1.04 | 4.12 × 10⁻⁷ | 0.724 | 8668.978 | Sun et al. (8) |
| ABO | | 136144960 | rs492488 | 0.74 | | 1.03 | 1.02–1.04 | 7.95 × 10⁻⁷ | 0.811 | 13727.667 | Emilsson et al. (9) |
| ATF6B | | 32113980 | rs114887538 | 0.76 | | 1.19 | 1.11–1.28 | 1.06 × 10⁻⁶ | 0.020 | 65.546 | Emilsson et al. (9) |
| DBNL | | 44156146 | rs3087367 | 0.56 | | 0.85 | 0.79–0.91 | 3.06 × 10⁻⁶ | 0.022 | 73.276 | Emilsson et al. (9) |
| MICB | | 31472720 | rs2855812 | 0.80 | | 0.95 | 0.92–0.97 | 5.76 × 10⁻⁶ | 0.156 | 589.275 | Emilsson et al. (9) |
| ERO1LB | | 236399442 | rs1254194 | 0.60 | | 1.09 | 1.05–1.14 | 7.69 × 10⁻⁶ | 0.070 | 249.683 | Sun et al. (8) |
| TBCE | | 235594951 | rs10802708 | 0.65 | | 0.81 | 0.73–0.89 | 1.82 × 10⁻⁵ | 0.011 | 34.978 | Emilsson et al. (9) |
| TNFRSF6B | 20 | 62370349 | rs1056441 | 0.61 | | 1.27 | 1.14–1.42 | 2.70 × 10⁻⁵ | 0.009 | 30.568 | Emilsson et al. (9) |

MR OR represents the OR for type 2 diabetes per 1 SD increase in the protein level.

Chr., chromosome; EA, effect allele; EAF, effect allele frequency.
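
To make the MR OR, 95% CI, and F statistic columns concrete, the following is a minimal sketch (not the authors' code) of how such quantities can be derived from summary statistics for a single cis-pQTL. It assumes a Wald ratio estimator with a first-order delta-method SE, and the common approximation F = R2(n − k − 1)/[k(1 − R2)]. All function and parameter names are illustrative.

```python
import math


def wald_ratio_or(beta_exposure, beta_outcome, se_outcome, z=1.96):
    """Single-instrument (Wald ratio) MR estimate, returned as (OR, lower, upper).

    beta_exposure: SNP effect on the protein level (SD units per effect allele)
    beta_outcome:  SNP effect on type 2 diabetes (log-odds per effect allele)
    se_outcome:    standard error of beta_outcome
    Assumption: first-order delta-method SE, which ignores the uncertainty
    in beta_exposure.
    """
    log_or = beta_outcome / beta_exposure
    se = abs(se_outcome / beta_exposure)
    return (
        math.exp(log_or),
        math.exp(log_or - z * se),
        math.exp(log_or + z * se),
    )


def f_statistic(r2, n, k=1):
    """Instrument strength from the variance explained (r2), the sample size
    of the protein GWAS (n), and the number of instruments (k)."""
    return r2 * (n - k - 1) / (k * (1 - r2))
```

A ratio estimate of this kind is interpreted as the OR for type 2 diabetes per 1 SD increase in the genetically predicted protein level, and an F statistic well above 10 is conventionally taken to indicate a strong instrument.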

Sensitivity Analyses

Colocalization Analyses

Our colocalization analyses showed that the posterior probability that MRC2 levels and type 2 diabetes share a single causal signal was H4 = 0.92, suggesting that the two traits share a single causal variant within the 1-Mb locus around the rs146385050 cis-pQTL (Supplementary Fig. 1A). Similar colocalization results were observed for circulating ATP1B2 levels with H4 = 0.96 (Supplementary Fig. 1B), SPATA20 levels with H4 = 0.84 (Supplementary Fig. 1C), HP levels with H4 = 0.95 (Supplementary Fig. 1D), ABO levels with H4 = 0.54 (Supplementary Fig. 1E), MANSC4 levels with H4 = 0.52 (Supplementary Fig. 1F), and ERO1LB levels with H4 = 0.88 (Supplementary Fig. 1G), implying single shared causal signals between these protein levels and risk of adult type 2 diabetes. However, for all of the above proteins except ABO, the lead SNP shared by type 2 diabetes and the respective protein differed from the cis-pQTL used as an instrument in our MR analyses, and only for ERO1LB was the lead SNP (rs2463185) in LD (R2 = 0.95) with its cis-pQTL (rs1254194). Interestingly, for circulating PAM levels, colocalization yielded a posterior probability of H3 = 1.0 (Supplementary Fig. 1H), similar to MANBA levels with H3 = 0.83 (Supplementary Fig. 1I), suggesting that, in each case, the protein level and type 2 diabetes are driven by two independent SNPs in the same locus, which implies possible bias due to LD.

For youth-onset type 2 diabetes, the posterior probability that GDF15 levels and type 2 diabetes shared a single causal signal was low (H4 = 0.19); GDF15 levels therefore did not colocalize with youth-onset type 2 diabetes (Supplementary Fig. 1J), and a similar result was found for SVEP1 (Supplementary Fig. 1K). Conversely, as in adult type 2 diabetes, ABO protein levels colocalized with youth-onset type 2 diabetes (H4 = 0.55) (Supplementary Fig. 1L). The results of our colocalization analyses for both adult and youth-onset type 2 diabetes are illustrated in Fig. 3.
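
For readers less familiar with the H3 and H4 posterior probabilities reported above, the sketch below illustrates how they can be obtained from per-SNP log approximate Bayes factors, following the Bayesian colocalization framework cited in the text (20). It is an illustration under stated assumptions, not the authors' pipeline: the function names are hypothetical, the inputs are assumed to be precomputed log Bayes factors for the same ordered set of SNPs in both traits, and the prior values shown are commonly used defaults that may differ from those used in this study.

```python
import math


def _logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of log values."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def _logdiffexp(a, b):
    """log(exp(a) - exp(b)); returns -inf when a <= b."""
    if a <= b:
        return float("-inf")
    return a + math.log1p(-math.exp(b - a))


def coloc_posteriors(labf1, labf2, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities [H0, H1, H2, H3, H4] for one locus.

    labf1, labf2: per-SNP log approximate Bayes factors for trait 1
                  (protein level) and trait 2 (type 2 diabetes).
    p1, p2, p12:  prior probabilities that a SNP is causal for trait 1 only,
                  trait 2 only, or both (assumed values).
    """
    if not labf1 or len(labf1) != len(labf2):
        raise ValueError("need two non-empty lists of equal length")
    s1 = _logsumexp(labf1)
    s2 = _logsumexp(labf2)
    s12 = _logsumexp([a + b for a, b in zip(labf1, labf2)])
    log_h = [
        0.0,                                                      # H0: no association
        math.log(p1) + s1,                                        # H1: trait 1 only
        math.log(p2) + s2,                                        # H2: trait 2 only
        math.log(p1) + math.log(p2) + _logdiffexp(s1 + s2, s12),  # H3: two distinct SNPs
        math.log(p12) + s12,                                      # H4: one shared SNP
    ]
    total = _logsumexp(log_h)
    return [math.exp(h - total) for h in log_h]
```

In this framing, a high H4 supports a single shared causal variant, whereas a high H3 indicates that both traits have signals in the locus but at different variants, which is the pattern described above for PAM and MANBA.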

Figure 3

Venn diagram summarizing the candidate proteins prioritized by our MR analyses.

Assessment for PAV

We then assessed whether the cis-pQTLs of our MR-prioritized proteins were PAVs or in LD (R2 > 0.8) with PAVs. Except for the cis-pQTLs of SVEP1 (rs61751937), CDON (rs3740909), and KDR (rs2305948), which are missense variants and therefore PAVs, the MR-prioritized cis-pQTLs are not PAVs (Supplementary Table 5). We further showed that the cis-pQTL rs60929339 (CDON) is in LD (R2 = 0.91) with the missense variant rs3740909, and that the cis-pQTL rs2855812 of MICB is in LD with two missense variants (rs1065075, R2 = 0.805; rs1051788, R2 = 0.805). In addition, the cis-pQTL rs8176786 for NELL1 is in perfect LD with rs16907058 (missense variant, R2 = 1), and the cis-pQTL rs1056441 for TNFRSF6B is in LD with rs8957 (missense variant, R2 = 0.807). For SPATA20, the cis-pQTL rs9890200 is in LD with rs8076632 (missense variant, splice region variant, R2 = 1), while for ACE, the cis-pQTL rs4344 is in LD with rs4316 (missense variant, R2 = 0.959) and with rs4362 (missense variant, R2 = 0.920) (Supplementary Tables 6 and 7). Taken together, these findings suggest that, except for SVEP1, CDON, KDR, MICB, NELL1, TNFRSF6B, ACE, and SPATA20, the affinity and binding sites of the MR-prioritized proteins are not affected by PAVs and that these proteins can therefore be reliably quantified in further validation studies.
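
As a rough illustration of the LD criterion used above (R2 > 0.8), the sketch below computes the squared Pearson correlation between the genotype dosages of two SNPs and applies the threshold. This is an assumption-laden simplification: LD is normally taken from a phased reference panel, and the genotype correlation only approximates the haplotype-based value. The function names and the dosage coding (0, 1, or 2 copies of the effect allele per individual) are illustrative.

```python
def ld_r2(dosages_a, dosages_b):
    """Squared Pearson correlation between two SNPs' genotype dosages."""
    n = len(dosages_a)
    if n == 0 or n != len(dosages_b):
        raise ValueError("need two non-empty dosage lists of equal length")
    mean_a = sum(dosages_a) / n
    mean_b = sum(dosages_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(dosages_a, dosages_b))
    var_a = sum((a - mean_a) ** 2 for a in dosages_a)
    var_b = sum((b - mean_b) ** 2 for b in dosages_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("monomorphic SNP: LD is undefined")
    return cov * cov / (var_a * var_b)


def tags_pav(r2, threshold=0.8):
    """True when a cis-pQTL is in LD with a PAV at the threshold used in the text."""
    return r2 > threshold
```

Under this rule, a cis-pQTL with R2 = 0.805 against a missense variant is flagged, which is why MICB appears among the exceptions listed above.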

Confounder Assessment

Our confounder assessment using PhenoScanner (18) demonstrated that the cis-pQTLs for TNFRSF6B, ATF6B, CDK2AP1, MAPK3, ARG1, GDF15, C4A/C4B complex, and FGF2 were associated at genome-wide significance with whole-body fat mass, body fat percentage, and whole-body fat-free mass (Supplementary Tables 8 and 9 and Fig. 3). This indicates that our MR estimates of the effect of these proteins on type 2 diabetes risk might have been driven by these confounders. However, for the remaining 21 candidate proteins for adult and youth-onset type 2 diabetes, we observed either no association or only nominal association with confounding traits related to type 2 diabetes, such as whole-body fat-free mass, whole-body fat mass, obesity/overweight, waist-to-hip ratio, and body fat percentage (Supplementary Tables 8 and 10). This reinforces the hypothesis that these proteins may have a direct causal effect on type 2 diabetes that is not mediated by adiposity.

eQTL Assessment

Our eQTL assessment using GTEx (22) demonstrated that the cis-pQTLs for CDK2AP1, CCNH, TYRO3, MAPK3, TBCE, TNFRSF6B, ARG1, MRC2, SPATA20, PAM, ERO1LB, ACE, HP, and CYB5A were eQTLs for their respective genes in tissues such as whole blood, skin, fibroblasts, pancreas, visceral or subcutaneous adipose tissue, and skeletal muscle (Supplementary Table 11). Interestingly, the cis-pQTLs of SHBG and CDON are eQTLs in pituitary and thyroid; the cis-pQTLs for MANSC4, C4A/C4B complex, and FGF2 are eQTLs for their respective genes in adrenal gland, testis, coronary artery, and esophagus; and the cis-pQTLs of ATP1B2, MICB, and MANBA are eQTLs in tibial artery tissue (Supplementary Table 12). The cis-pQTLs associated with DBNL, ATF6B, ABO, NELL1, GDF15, SVEP1, and KDR proteins were not identified as eQTLs in the GTEx database.

Multi-Instrument MR Analyses

Our multi-SNP MR analysis including trans-pQTLs showed a very slight tightening of the 95% CI of the MR OR for type 2 diabetes, and only for PAM: specifically, the 95% CI changed from 0.89–0.94 to 0.90–0.94. This result suggests that adding trans-pQTLs to our MR instruments did not substantially increase the power of our MR analyses to detect associations between circulating proteins and adult type 2 diabetes risk (Supplementary Table 13 and Supplementary Fig. 2).
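
The limited gain from adding trans-pQTLs can be understood from how a multi-instrument estimate is typically pooled. The sketch below assumes a fixed-effect inverse-variance weighted (IVW) estimator over independent (uncorrelated) instruments, which may differ in detail from the method used in this study. Names are illustrative.

```python
import math


def ivw_or(beta_exposure, beta_outcome, se_outcome, z=1.96):
    """Fixed-effect IVW MR estimate, returned as (OR, lower, upper).

    Each argument is a list with one entry per instrument:
    SNP effect on the protein level, SNP effect on type 2 diabetes
    (log-odds), and the standard error of the latter.
    Assumption: instruments are mutually independent (not in LD).
    """
    # Each instrument's weight grows with the square of its effect on the protein.
    weights = [bx * bx / (sy * sy) for bx, sy in zip(beta_exposure, se_outcome)]
    numerator = sum(
        bx * by / (sy * sy)
        for bx, by, sy in zip(beta_exposure, beta_outcome, se_outcome)
    )
    total_weight = sum(weights)
    log_or = numerator / total_weight
    se = math.sqrt(1.0 / total_weight)
    return (
        math.exp(log_or),
        math.exp(log_or - z * se),
        math.exp(log_or + z * se),
    )
```

Because the weights scale with the squared SNP effect on the protein, a trans-pQTL with a much smaller effect than the cis-pQTL contributes little to the total weight, so the pooled CI narrows only marginally, consistent with the small change reported for PAM.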

Using MR, we provided evidence that genetically altered levels of 22 circulating proteins (CDK2AP1, CCNH, TYRO3, MAPK3, TBCE, TNFRSF6B, ARG1, DBNL, MRC2, SHBG, ATF6B, SPATA20, ATP1B2, MANSC4, HP, MANBA, ABO, ACE, PAM, NELL1, ERO1LB, and MICB) are likely to be causally linked to adult type 2 diabetes risk, while 11 proteins (GDF15, SVEP1, CDON, KDR, CYB5A, C4A/C4B complex, FGF2, CCNH, TNFRSF6B, ABO, and ACE) presented suggestive evidence of association with risk of youth-onset type 2 diabetes. Our sensitivity analysis indicated that, after adjusting our adult type 2 diabetes outcome for BMI, 13 proteins still showed significant MR associations. These findings are supported by evidence from colocalization, showing that 7 of the above-mentioned 22 proteins (MRC2, ATP1B2, ERO1LB, HP, ABO, SPATA20, and MANSC4) share a single causal SNP with adult type 2 diabetes, while PAM and MANBA are linked to type 2 diabetes via 2 independent SNPs. In addition, our follow-up analyses showed that the majority of our candidate proteins are not significantly associated with confounding traits linked to type 2 diabetes risk, which reinforces their causal role in type 2 diabetes (Fig. 3). We demonstrated that the cis-pQTLs associated with the candidate proteins affect expression of their genes in skeletal muscle, adipose tissue, and pancreas, which are all relevant tissues in the pathophysiology of type 2 diabetes (24). Finally, we demonstrated that the majority of the cis-pQTLs used as instruments for the above proteins are not sequence variants or in LD with such variants, indicating that these proteins can be reliably quantified in future validation studies. While the individual effects of genetically altered levels of these proteins on risk of type 2 diabetes are small, these molecules represent potential druggable targets and point to causal pathways that can be targeted for intervention.

Among the 22 proteins prioritized by our main MR analysis on BMI-unadjusted type 2 diabetes, the associations of 9 proteins, namely CDK2AP1, CCNH, ACE, MAPK3, SPATA20, MANBA, HP, ARG1, and NELL1, were attenuated after adjusting for BMI, which indicates that their effects on type 2 diabetes risk were probably mediated by BMI. Interestingly, these proteins are involved in insulin secretion (25,26), diabetic cardiomyopathy (27), arm adiposity (28), vascular dysfunction (29,30), and diabetic nephropathy (31).

After adjusting for BMI in adult type 2 diabetes, our MR study replicated three proteins with evidence for association with type 2 diabetes from a previous MR study (7), namely SHBG, DBNL, and ATP1B2, with effects on type 2 diabetes in the same direction as in the previous report. We also identified PAM and ABO, with prior evidence of involvement in β-cell function (32) and insulin secretion (33,34), respectively, and TYRO3 (35), which was shown to be associated with type 2 diabetes in individuals with cardiovascular diseases. In addition, our MR study prioritized seven novel candidate proteins, among which MANSC4, TNFRSF6B, and MRC2 have known roles in diabetic nephropathy (36–38). Notably, it has been shown that MRC2 promotes proliferation and inhibits apoptosis in the diabetic kidney, while it also contributes to type 2 diabetes pathogenesis (38); however, this is in the opposite direction from our MR result, which showed that increased levels of MRC2 are associated with decreased type 2 diabetes risk. Thus, further investigation with direct measurement of this protein in case-control cohorts with type 2 diabetes is required to clarify the role of MRC2 in type 2 diabetes. Among the remaining four novel candidate proteins, ATF6B and ERO1LB are known to be involved in pancreatic β-cell function (39,40), and TBCE has been associated with glycemic traits and obesity in humans (41). While there is contradictory evidence on whether ERO1LB is protective or pathogenic in type 2 diabetes (40,42), our study indicated that increased circulating levels of ERO1LB increase type 2 diabetes risk. Finally, MICB, encoded by the human MHC class I chain–related gene, has a known association with type 1 diabetes risk (43).

Our youth-onset type 2 diabetes MR analysis identified 11 proteins with suggestive evidence of association with type 2 diabetes risk in childhood, of which 4 proteins (CCNH, ABO, ACE, and TNFRSF6B) are in common with adult type 2 diabetes, and GDF15 was replicated from a previous MR study for adult type 2 diabetes (7) (Fig. 3). SVEP1, KDR, and FGF2 proteins have been shown to be involved in cardiovascular disease (44) and diabetic retinopathy (45,46), while CDON (47) and CYB5A (48) have been implicated in adiposity associated with type 2 diabetes. In a whole-exome sequencing study in an American Indian population, CYB5A was positively associated with obesity and nominally associated with increased risk of type 2 diabetes (49). However, the direction of the effect of CYB5A on type 2 diabetes in that study (49) is not in agreement with our MR result, as we show that genetically increased CYB5A levels decrease the risk of youth-onset type 2 diabetes. Thus, direct measurement of the protein in an independent case-control cohort is required to validate this MR result.

Although observational studies (3–5,50) have identified candidate plasma proteins as biomarkers of adult type 2 diabetes, the major strength of our study is that we used MR, an established approach in genetic epidemiology known to limit bias from confounding and reverse causation. Although a recent study using a similar MR design sought to identify causal proteins for type 2 diabetes in a large European cohort, the SNPs used as genetic instruments were not associated with the protein exposures at a genome-wide level (7). In our MR study, we leveraged data from the largest protein GWAS consortia available to date to maximize the number of tested proteins with available genome-wide significant cis-pQTLs (8–12) and from the largest available type 2 diabetes GWAS (13) to ensure adequate statistical power for our MR analyses. Moreover, we sought to identify child-specific protein biomarkers using data from the only available youth-onset type 2 diabetes GWAS (2). By using solely genome-wide significant cis-acting pQTLs as instruments for our protein exposures, we limited the risk of horizontal pleiotropy in our MR (51). We undertook multiple sensitivity analyses to account for possible mediating or confounding effects related to obesity and adiposity affecting the findings of our main MR analysis on both adult- and youth-onset type 2 diabetes. Finally, we performed colocalization analyses as an additional strategy to explore association between the candidate protein biomarkers and type 2 diabetes risk for a subset of candidate proteins with available summary-level GWAS results.

Our study has several limitations. First, the small sample size of the European subset in the only available youth-onset type 2 diabetes GWAS could not ensure adequate statistical power for discovery of candidate proteins with small individual effects on disease risk. Thus, we elected to report proteins with suggestive MR evidence for association with risk of youth-onset type 2 diabetes, although no proteins survived multiple-testing correction. These findings should be validated in future MR studies using data from emerging larger pediatric type 2 diabetes GWAS consortia. Second, we did not perform validation studies to confirm our MR findings by directly measuring the candidate protein levels in independent case-control studies for type 2 diabetes, as these validation studies were outside the scope of this work. Also, such experiments are often strongly influenced by confounding and reverse causation. However, our PAV analysis supported the feasibility of such studies, showing that most of these proteins can be measured in a clinical setting using aptamer- or antibody-based bioassays. Third, although our colocalization analysis showed that type 2 diabetes and proteins such as MRC2, ATP1B2, SPATA20, ERO1LB, HP, ABO, and MANSC4 are linked via a single causal variant in the same locus, the lead SNPs were not the same as the corresponding cis-pQTLs, which implies possible bias due to LD. That said, colocalization itself is limited by its assumption of a single shared causal SNP, whereas in reality genetic loci may contain several causal SNPs. Moreover, we observed that, among our MR-prioritized proteins, 10 proteins (CDK2AP1, CCNH, TNFRSF6B, DBNL, MRC2, ATF6B, PAM, HP, GDF15, and C4A/C4B) have instruments (SNPs) that do not map within the protein-coding gene itself but rather next to it. Nevertheless, these SNPs still satisfied the definition of a cis-pQTL for their respective protein in the proteomic GWAS (8–12), as they were located within a maximum of 1 Mb of the transcription start site of the gene encoding the measured protein. Finally, our study was performed using only GWAS data from cohorts of European ancestry, and, as such, our results cannot be generalized to other ancestries. Once large proteomic and type 2 diabetes GWAS in diverse ancestries become available, ancestry-specific MR studies will be needed to cross-validate our findings in non-Europeans.

In conclusion, our two-sample MR approach provides evidence for a causal role in adult type 2 diabetes and suggestive evidence for a role in youth-onset type 2 diabetes for the above-mentioned circulating proteins. While for a set of these circulating proteins, previous evidence for association with type 2 diabetes exists from observational and MR studies, we also identified novel candidate proteins with previously known involvement in type 2 diabetes complications, pancreatic β-cell functions, and adiposity. Our findings highlight a possible role of these circulating proteins in type 2 diabetes pathophysiology and support a potential utility of these molecules in drug development for type 2 diabetes in adults and children.

This article contains supplementary material online at https://doi.org/10.2337/figshare.19233129.

Funding. D.M. received a Pediatric Endocrine Society Clinical Scholar Award and is a Fonds de recherche Québec–Santé and Canadian Child Health Clinician Scientist Program scholar.

The funding body had no involvement in the study design, data collection, analysis and interpretation of results, or writing of this manuscript.

Duality of Interest. J.B.R. has served as an advisor to GlaxoSmithKline and Deerfield Capital; his institution has received investigator-initiated grant funding from Eli Lilly and Company, GlaxoSmithKline, and Biogen for projects unrelated to this research; and he is the founder of 5 Prime Sciences. No other potential conflicts of interest relevant to this article were reported.

Author Contributions. F.G., N.Y., M.Y., and D.M. conducted the analyses and interpretation of data. F.G. and D.M. produced the first draft of the manuscript. D.M. designed the study. All authors reviewed and approved the final version. D.M. is the guarantor of this work and, as such, had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

1. Kohner EM, Aldington SJ, Stratton IM, et al. United Kingdom Prospective Diabetes Study, 30: diabetic retinopathy at diagnosis of non-insulin-dependent diabetes mellitus and associated risk factors. Arch Ophthalmol 1998;116:297–303
2. Srinivasan S, Chen L, Todd J, et al.; ProDiGY Consortium. The first genome-wide association study for type 2 diabetes in youth: The Progress in Diabetes Genetics in Youth (ProDiGY) Consortium. Diabetes 2021;70:996–1005
3. Zanini JC, Pietzner M, Langenberg C. Integrating genetics and the plasma proteome to predict the risk of type 2 diabetes. Curr Diab Rep 2020;20:60
4. Beijer K, Nowak C, Sundström J, Ärnlöv J, Fall T, Lind L. In search of causal pathways in diabetes: a study using proteomics and genotyping data from a cross-sectional study. Diabetologia 2019;62:1998–2006
5. Ferrannini G, Manca ML, Magnoni M, et al. Coronary artery disease and type 2 diabetes: a proteomic study. Diabetes Care 2020;43:843–851
6. Burgess S, Butterworth A, Thompson SG. Mendelian randomization analysis with multiple genetic variants using summarized data. Genet Epidemiol 2013;37:658–665
7. Gudmundsdottir V, Zaghlool SB, Emilsson V, et al. Circulating protein signatures and causal candidates for type 2 diabetes. Diabetes 2020;69:1843–1853
8. Sun BB, Maranville JC, Peters JE, et al. Genomic atlas of the human plasma proteome. Nature 2018;558:73–79
9. Emilsson V, Ilkov M, Lamb JR, et al. Co-regulatory networks of human serum proteins link genetics to disease. Science 2018;361:769–773
10. Folkersen L, Gustafsson S, Wang Q, et al. Genomic and drug target evaluation of 90 cardiovascular proteins in 30,931 individuals. Nat Metab 2020;2:1135–1148
11. Suhre K, Arnold M, Bhagwat AM, et al. Connecting genetic risk to disease end points through the human blood plasma proteome. Nat Commun 2017;8:14357
12. Yao C, Chen G, Song C, et al. Genome-wide mapping of plasma protein QTLs identifies putatively causal genes and pathways for cardiovascular disease. Nat Commun 2018;9:3268
13. Mahajan A, Taliun D, Thurner M, et al. Fine-mapping type 2 diabetes loci to single-variant resolution using high-density imputation and islet-specific epigenome maps. Nat Genet 2018;50:1505–1513
14. Cox RD, Church CD. Mouse models and the interpretation of human GWAS in type 2 diabetes and obesity. Dis Model Mech 2011;4:155–164
15. Yavorska OO, Burgess S. MendelianRandomization: an R package for performing Mendelian randomization analyses using summarized data. Int J Epidemiol 2017;46:1734–1739
16. Park JH, Wacholder S, Gail MH, et al. Estimation of effect size distribution from genome-wide association studies and implications for future discoveries. Nat Genet 2010;42:570–575
17. Palmer TM, Lawlor DA, Harbord RM, et al. Using multiple genetic variants as instrumental variables for modifiable risk factors. Stat Methods Med Res 2012;21:223–242
18. Kamat MA, Blackshaw JA, Young R, et al. PhenoScanner V2: an expanded tool for searching human genotype-phenotype associations. Bioinformatics 2019;35:4851–4853
19. Abecasis GR, Auton A, Brooks LD, et al.; 1000 Genomes Project Consortium. An integrated map of genetic variation from 1,092 human genomes. Nature 2012;491:56–65
20. Giambartolomei C, Vukcevic D, Schadt EE, et al. Bayesian test for colocalisation between pairs of genetic association studies using summary statistics. PLoS Genet 2014;10:e1004383
21. Liu B, Gloudemans MJ, Rao AS, Ingelsson E, Montgomery SB. Abundant associations with gene expression complicate GWAS follow-up. Nat Genet 2019;51:768–769
22. Carithers LJ, Moore HM. The Genotype-Tissue Expression (GTEx) Project. Biopreserv Biobank 2015;13:307–308
23. Corbin LJ, Richmond RC, Wade KH, et al. BMI as a modifiable risk factor for type 2 diabetes: refining and understanding causal estimates using Mendelian randomization. Diabetes 2016;65:3002–3007
24. Taylor R. Insulin resistance and type 2 diabetes. Diabetes 2012;61:778–779
25. Kim SY, Lee J-H, Merrins MJ, et al. Loss of cyclin-dependent kinase 2 in the pancreas links primary β-cell dysfunction to progressive depletion of β-cell mass and diabetes. J Biol Chem 2017;292:3841–3853
26. Bindom SM, Hans CP, Xia H, Boulares AH, Lazartigues E. Angiotensin I-converting enzyme type 2 (ACE2) gene therapy improves glycemic control in diabetic mice. Diabetes 2010;59:2540–2548
27. Xu Z, Sun J, Tong Q, et al. The role of ERK1/2 in the development of diabetic cardiomyopathy. Int J Mol Sci 2016;17:2001
28. Neville MJ, Wittemans LBL, Pinnick KE, et al. Regional fat depot masses are influenced by protein-coding gene variants. PLoS One 2019;14:e0217644
29. Pernow J, Kiss A, Tratsiakovich Y, Climent B. Tissue-specific up-regulation of arginase I and II induced by p38 MAPK mediates endothelial dysfunction in type 1 diabetes mellitus. Br J Pharmacol 2015;172:4684–4698
30. Asleh R, Levy AP. In vivo and in vitro studies establishing haptoglobin as a major susceptibility gene for diabetic vascular disease. Vasc Health Risk Manag 2005;1:19–28
31. Sethi S. New ‘antigens’ in membranous nephropathy. J Am Soc Nephrol 2021;32:268–278
32. Thomsen SK, Raimondo A, Hastoy B, et al. Type 2 diabetes risk alleles in PAM impact insulin release from human pancreatic β-cells. Nat Genet 2018;50:1122–1131
33. Li-Gao R, Carlotti F, de Mutsert R, et al. Genome-wide association study on the early-phase insulin response to a liquid mixed meal: results from the NEO Study. Diabetes 2019;68:2327–2336
34. Meo SA, Rouq FA, Suraya F, Zaidi SZ. Association of ABO and Rh blood groups with type 2 diabetes mellitus. Eur Rev Med Pharmacol Sci 2016;20:237–242
35. Gilly A, Park Y-C, Png G, et al. Whole-genome sequencing analysis of the cardiometabolic proteome. Nat Commun 2020;11:6336
36. Bai Y, Zhu R, Tian Y, et al. Catalpol in diabetes and its complications: a review of pharmacology, pharmacokinetics, and safety. Molecules 2019;24:3302
37. Tseng W-C, Yang W-C, Yang A-H, Hsieh S-L, Tarng D-C. Expression of TNFRSF6B in kidneys is a novel predictor for progression of chronic kidney disease. Mod Pathol 2013;26:984–994
38. Li L, Chen X, Zhang H, Wang M, Lu W. MRC2 promotes proliferation and inhibits apoptosis of diabetic nephropathy. Anal Cell Pathol (Amst) 2021;2021:6619870
39. Seo H-Y, Kim YD, Lee K-M, et al. Endoplasmic reticulum stress-induced activation of activating transcription factor 6 decreases insulin gene expression via up-regulation of orphan nuclear receptor small heterodimer partner. Endocrinology 2008;149:3832–3841
40. Awazawa M, Futami T, Sakada M, et al. Deregulation of pancreas-specific oxidoreductin ERO1β in the pathogenesis of diabetes mellitus. Mol Cell Biol 2014;34:1290–1299
41. Skovsø S, Panzhinskiy E, Kolic J, et al. Beta-cell specific insulin resistance promotes glucose-stimulated insulin hypersecretion. Nat Commun 2022;13:735
42. Zito E, Chin KT, Blais J, Harding HP, Ron D. ERO1-beta, a pancreas-specific disulfide oxidase, promotes insulin biogenesis and glucose homeostasis. J Cell Biol 2010;188:821–832
43. Field SF, Nejentsev S, Walker NM, et al. Sequencing-based genotyping and association analysis of the MICA and MICB genes in type 1 diabetes. Diabetes 2008;57:1753–1756
44. Kessler T, Vilne B, Schunkert H. The impact of genome-wide association studies on the pathophysiology and therapy of cardiovascular disease. EMBO Mol Med 2016;8:688–701
45. Smith G, McLeod D, Foreman D, Boulton M. Immunolocalisation of the VEGF receptors FLT-1, KDR, and FLT-4 in diabetic retinopathy. Br J Ophthalmol 1999;83:486–494
46. Hill DJ, Flyvbjerg A, Arany E, Lauszus FF, Klebe JG. Increased levels of serum fibroblast growth factor-2 in diabetic pregnant women with retinopathy. J Clin Endocrinol Metab 1997;82:1452–1457
47. van der Kolk BW, Kalafati M, Adriaens M, et al. Subcutaneous adipose tissue and systemic inflammation are associated with peripheral but not hepatic insulin resistance in humans. Diabetes 2019;68:2247–2258
48. Paton CM, Ntambi JM. Biochemical and physiological function of stearoyl-CoA desaturase. Am J Physiol Endocrinol Metab 2009;297:E28–E37
49. Huang K, Nair AK, Muller YL, et al. Whole exome sequencing identifies variation in CYB5A and RNF10 associated with adiposity and type 2 diabetes. Obesity (Silver Spring) 2014;22:984–988
50. Elhadad MA, Jonasson C, Huth C, et al. Deciphering the plasma proteome of type 2 diabetes. Diabetes 2020;69:2766–2778
51. Abecasis GR, Auton A, Brooks LD, et al.; 1000 Genomes Project Consortium. An integrated map of genetic variation from 1,092 human genomes. Nature 2012;491:56–65
Readers may use this article as long as the work is properly cited, the use is educational and not for profit, and the work is not altered. More information is available at https://diabetesjournals.org/journals/pages/license.