Progression of prediabetes to type 2 diabetes has been associated with β-cell dysfunction, whereas its remission to normoglycemia has been related to improvement of insulin sensitivity. To understand the mechanisms and identify potential biomarkers related to prediabetes trajectories, we compared the proteomics and metabolomics profile of people with prediabetes progressing to diabetes or reversing to normoglycemia within 1 year.
The fasting plasma concentrations of 1,389 proteins and the fasting, 30-min, and 120-min post–oral glucose tolerance test (OGTT) plasma concentrations of 152 metabolites were measured in up to 134 individuals with new-onset diabetes, prediabetes, or normal glucose tolerance. For 108 participants, the analysis was repeated with samples from 1 year before, when all had prediabetes.
The plasma concentrations of 14 proteins were higher in diabetes compared with normoglycemia in a population with prediabetes 1 year before, and they correlated with indices of insulin sensitivity. Higher levels of dicarbonyl/L-xylulose reductase and glutathione S-transferase A3 in the prediabetic state were associated with an increased risk of diabetes 1 year later. Pathway analysis pointed toward differences in immune response between diabetes and normoglycemia that were already recognizable in the prediabetic state 1 year prior at baseline. The area under the curve during OGTT of the concentrations of IDL particles, IDL apolipoprotein B, and IDL cholesterol was higher in new-onset diabetes compared with normoglycemia. The concentration of glutamate increased in prediabetes progressing to diabetes.
We identify new candidates associated with the progression of prediabetes to diabetes or its remission to normoglycemia. Pathways regulating the immune response are related to prediabetes trajectories.
Introduction
Type 2 diabetes (T2D) is a prevalent disease that significantly affects life expectancy and quality of life. Identification and treatment of people at high risk of developing T2D can prevent or delay disease onset. Although extensive research has been conducted on diabetes pathophysiology, the exact mechanisms, particularly in relation to temporal dynamics, remain incompletely understood. Prediabetes is considered the strongest risk factor for the development of T2D. Progression from prediabetes to diabetes has been primarily attributed to pancreatic β-cell dysfunction (1). In contrast, as recently shown by our consortium (2), reversal of prediabetes to normoglycemia is characterized by improvements in insulin sensitivity and reductions in liver fat content and visceral adipose tissue volume. A better understanding of the mechanisms underlying prediabetes progression or remission is important to improve T2D prediction and prevention.
Advancements in proteomics and metabolomics technologies offer the opportunity for accurate measurement of a large number of proteins and metabolites, thus facilitating the discovery of new biomarkers or therapeutic targets. Proteomics studies so far have mostly focused on cross-sectional comparisons of prevalent T2D or prevalent prediabetes with healthy control participants (3–8). Known duration of diabetes, degree of hyperglycemia, medical treatment, comorbidities, and other confounding factors might affect the circulating proteome, thus complicating the interpretation of these findings. Very few studies have followed a longitudinal design, and they have assessed the proteome and/or metabolome at baseline and in the long term (5–14 years follow-up) in population cohorts (7). Several of the reported proteins from these studies are established markers of adiposity (e.g., leptin, FABP4), thus reflecting the chronic detrimental impact of obesity on the development of T2D rather than any possible acute systemic changes that might contribute to prediabetes progression or remission.
Here, we designed an explorative case-control study with participants from the Prediabetes Lifestyle Intervention Study (PLIS) (9) to compare both cross-sectionally and longitudinally the proteomic and nuclear magnetic resonance (NMR)–based metabolomics signatures of people with prediabetes progressing to diabetes versus reversing to normoglycemia within 1 year. Finally, we regrouped the participants based on their BMI to evaluate the differences in the circulating proteome among obesity, overweight, and normal weight and compare them with the differences observed in various glycemic states.
Research Design and Methods
Study Design
A nested case-control study with a cross-sectional part and a prospective part was designed with samples from participants of the PLIS (ClinicalTrials.gov identifier NCT01947595) who were recruited and followed in the University Study Center for Metabolic Diseases in Dresden from 2016 to 2022. The PLIS design has been previously described (9) and is summarized in the Supplementary Material. Briefly, participants with prediabetes were randomized to receive conventional, intensive, or no lifestyle intervention for 1 year with 2 years’ follow-up initially, which has been further extended to 12 years. The study was conducted with the approval of the ethics committee of the Technische Universität Dresden and in accordance with the Declaration of Helsinki. All participants provided written informed consent.
Cross-Sectional Analysis
A proteomics and an NMR-based metabolomics analysis was performed in randomly selected participants of the PLIS recruited at the Dresden clinical center who had material collected at the cross-sectional time point (follow-up) and who 1) were diagnosed for the first time with T2D (n = 31 for metabolomics and n = 25 for proteomics) (diabetes group) or 2) maintained their prediabetic state (n = 72 for metabolomics and n = 50 for proteomics) (prediabetes group) or 3) had a normal glucose tolerance (NGT) test (n = 31 for metabolomics and n = 25 for proteomics) (NGT group). The cohort used for the proteomics study is a subset of the metabolomics cohort.
Prospective Analysis
For most of the participants (108 of 134 for NMR metabolomics and 80 of 100 for proteomics), the analysis was performed also in samples from 1 year before (baseline), when they all had prediabetes. Participants in the diabetes, prediabetes, and NGT groups were classified according to American Diabetes Association criteria (10).
Procedures and Tests
Details about the performance of the 75-g oral glucose tolerance test (OGTT), the measurements of glucose and insulin, as well as the calculation of the insulinogenic index (IGI), insulin sensitivity index (ISI), disposition index (DI), and liver lipid content are provided in the Supplementary Material. Briefly, both for the proteomics and metabolomics analysis, previously collected, unthawed aliquots stored at −80°C were used. The measurements in the proteomics analysis were performed in plasma samples after a 12-h overnight fast (0 min of OGTT) and in the metabolomics analysis in samples collected in sodium fluoride tubes at 0, 30, and 120 min of OGTT.
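As a point of orientation for the HOMA indices reported in Table 1, the sketch below uses the widely cited HOMA formulas. It is an assumption that these match the definitions used in the study, which are given in its Supplementary Material; the function names and the insulin unit (µU/mL) are illustrative choices, not taken from the source.

```python
def homa_ir(fasting_glucose_mmol_l, fasting_insulin_uU_ml):
    """HOMA of insulin resistance: glucose (mmol/L) x insulin (µU/mL) / 22.5."""
    return fasting_glucose_mmol_l * fasting_insulin_uU_ml / 22.5


def homa_b(fasting_glucose_mmol_l, fasting_insulin_uU_ml):
    """HOMA of β-cell function: 20 x insulin (µU/mL) / (glucose (mmol/L) - 3.5).

    Only meaningful for fasting glucose above 3.5 mmol/L.
    """
    return 20.0 * fasting_insulin_uU_ml / (fasting_glucose_mmol_l - 3.5)


# Hypothetical participant, not study data.
example_ir = homa_ir(5.0, 9.0)   # 45 / 22.5
example_b = homa_b(5.0, 9.0)     # 180 / 1.5
print(example_ir, example_b)
```

Both indices rely on fasting values only, which is why they can be computed from the 0-min OGTT sample alone.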
Plasma Proteomics
A proteomics analysis was performed by Olink in Uppsala, Sweden, by using the cardiometabolic, inflammation, neurology, and oncology predesigned panels as previously described (11). The analysis was performed by experienced personnel blinded to the different study groups. A total of 1,470 proteins were initially measured. Three proteins (tumor necrosis factor-α [TNF-α], interleukin-6 [IL-6], and IL-8) were measured in all four predesigned panels, and thus, only the measurements obtained from the inflammation panel were included in the analysis for these proteins. Of the data points, 94% (248,410 of 264,600) passed the three quality controls. Seventy-two proteins with >60% of their values not passing the quality controls were excluded completely from further analysis, so the final number of proteins analyzed was 1,389. Values were provided as protein relative abundances, and the data were processed in the Olink standard normalized protein expression (NPX) form on a log2 scale.
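The counts in this paragraph can be reconciled with a short calculation. The sketch below is one reading that reproduces the reported numbers, under two assumptions not stated explicitly in the text: that the 1,470 measured proteins count TNF-α, IL-6, and IL-8 once per panel, and that the 264,600 data points cover the 100 follow-up and 80 baseline proteomics samples.

```python
# Bookkeeping for the proteomics counts (assumptions flagged inline).
ASSAYS_MEASURED = 1_470        # reported as "proteins initially measured"
PROTEINS_ON_ALL_PANELS = 3     # TNF-α, IL-6, IL-8
PANELS = 4
EXCLUDED_FOR_QC = 72

# Assumption: each repeated protein adds (PANELS - 1) redundant assays,
# because only the inflammation-panel measurement was kept.
redundant_assays = PROTEINS_ON_ALL_PANELS * (PANELS - 1)
unique_proteins = ASSAYS_MEASURED - redundant_assays
final_proteins = unique_proteins - EXCLUDED_FOR_QC

# Assumption: data points = assays x samples (100 follow-up + 80 baseline).
samples = 100 + 80
data_points = ASSAYS_MEASURED * samples
passed_percent = round(100 * 248_410 / data_points, 1)

print(final_proteins)   # 1389
print(data_points)      # 264600
print(passed_percent)   # 93.9
```

Under this reading, the final count of 1,389 follows from removing nine redundant assays and then the 72 proteins that failed quality control.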
NMR-Based Metabolomics
Concentrations of 152 metabolites and lipid parameters were measured in sodium fluoride plasma by the Quantitative Metabolomics Platform, Institute of Clinical Chemistry and Laboratory Medicine, University Hospital of Dresden, using a Bruker 600-MHz Avance III Neo equipped with a BBI Probe and a Bruker SampleJet robot with a cooling system for sample storage at 4°C. Details about the procedures and measurements are described in the Supplementary Material.
Bioinformatics Analysis
For each data set, the analytical approach consisted of an overview analysis that included multidimensional scaling and sparse partial least squares-discriminant analysis (sPLS-DA). For differential abundance analysis, several linear models were created; the two most appropriate ones selected by Akaike information criterion were used, including a crude (unadjusted) model and a model adjusted for sex and type of intervention (conventional, intensive, or no lifestyle intervention) performed in the PLIS (Supplementary Material). In proteomics, a functional enrichment analysis examining Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Ontology data sets and a weighted correlation network analysis were performed. All data processing steps were done in R 4.3 using the RStudio integrated development environment. A detailed description of the statistical methods is provided in the Supplementary Material.
Data and Resource Availability
The data sets generated or analyzed during the current study are not available publicly because they are subject to national data protection laws and restrictions imposed by the ethics committee to ensure privacy of study participants. However, they can be requested after publication through an individual project agreement with the corresponding author. The request will be reviewed by the PLIS data steering committee. A data access agreement will have to be reached after request approval.
Results
Cohort Characteristics
Demographic, anthropometric, and relevant clinical data at follow-up of the three main groups (NGT, prediabetes, and diabetes) are presented in Table 1 for the population with metabolomics measurements and in Supplementary Table 1 for the population with proteomics measurements (part of the metabolomics population). For a subset of participants, samples were also available from 1 year before (baseline), when all were in the prediabetic state (Supplementary Tables 2 and 3). The proteomics and metabolomics profiles were analyzed both cross-sectionally (at follow-up and baseline) and prospectively (from baseline to follow-up) (see Supplementary Fig. 1 for schematic description of the analysis and Supplementary Tables 4 and 5 for paired analysis of selected clinical parameters). The cross-sectional analysis at follow-up was repeated after reclassifying the study participants according to their BMI (normal weight vs. overweight vs. obese).
Demographic, anthropometric, and relevant clinical data: the metabolomics cohort at follow-up
Characteristic | NGT group | Prediabetes group | Diabetes group |
---|---|---|---|
Sex | |||
Female | 12 | 36 | 9 |
Male | 19 | 36 | 22 |
Intervention | |||
Control | 2 | 11 | 1 |
Conventional | 17 | 30 | 15 |
Intensive | 12 | 31 | 15 |
Age (years) | 69.9 (64.7–73.1) | 70.7 (65.1–74.9) | 68.8 (62.0–72.9) |
Glucose AUC (mmol · min/L) | 866 (803–960) | 1,050 (962–1,152)*** | 1,345 (1,224–1,481)*** |
Fasting glucose (mmol/L) | 5.41 (5.24–5.47) | 5.97 (5.67–6.36)*** | 7.12 (7.01–7.35)*** |
HOMA-IR | 2.45 (1.84–3.32) | 3.48 (2.31–5.40)** | 4.69 (3.58–7.43)*** |
HOMA-B | 122 (100.6–173.9) | 128 (74.36–182.7) | 108 (81.16–130.4) |
IGI | 108.6 (68.8–175.6) | 101.0 (68.0–198.8) | 63.2 (48.0–112.0)* |
ISI | 10.93 (8.66–13.88) | 6.92 (4.76–9.71)** | 5.20 (3.49–6.97)*** |
DI | 1,358 (916–1,523) | 847 (509–1,457)* | 367 (255–550)*** |
Insulin AUC (pmol · min/L) | 53,820 (40,406–67,331) | 70,890 (45,071–108,814)* | 73,290 (50,895–102,443)*** |
HbA1c (%) | 5.6 (5.5–5.80) | 5.6 (5.5–5.83) | 5.9 (5.8–6.0)*** |
BMI (kg/m²) | 26.3 (23.21–28.32) | 29.3 (26.08–32.30)** | 30.1 (26.40–32.47)** |
WHR | 0.930 (0.895–0.965) | 0.946 (0.877–0.990) | 0.980 (0.937–1.058)* |
Total adipose tissue volume (L) | 12.9 (10.24–15.96) | 18.3 (13.48–21.91)** | 20.2 (16.10–23.12)** |
Visceral adipose tissue (L) | 4.04 (3.17–5.92) | 5.28 (4.27–6.71) | 6.25 (4.75–7.85)** |
Subcutaneous adipose tissue (L) | 8.61 (7.24–10.08) | 13.01 (8.52–16.62)** | 12.71 (9.40–16.88)* |
Lean tissue (L) | 19.0 ± 3.36 | 20.0 ± 3.62 | 20.6 ± 3.36 |
Liver fat (%) | 9.09 (7.30–11.08) | 10.88 (8.63–15.40)* | 15.68 (12.76–22.99)*** |
Data are mean ± SD or median (interquartile range). Significance determined using ANOVA or Kruskal-Wallis test and post hoc test with Bonferroni multiple testing correction, with NGT as the reference condition. WHR, waist-to-hip ratio.
*P < 0.05, **P < 0.01, ***P < 0.001.
Plasma Proteomics at Follow-up
Fourteen proteins were identified that were elevated in the diabetes compared with NGT group (false discovery rate [FDR] <0.05) (Fig. 1 and Supplementary Table 6). The 14 proteins correlated significantly both with fasting glucose and glucose area under the curve (AUC) and to a lesser extent with HbA1c (Fig. 1B). They were more strongly associated with indices of insulin sensitivity (ISI, HOMA of insulin resistance [HOMA-IR]) than with markers of β-cell function (HOMA-B, IGI). Most of the proteins also correlated with indices of adiposity (BMI, waist-to-hip ratio, subcutaneous and visceral adipose tissue volume, liver fat percentage). After adjusting for sex and type of intervention in the PLIS, 13 of the 14 proteins remained significantly different in diabetes compared with the NGT group (apart from tartrate-resistant acid phosphatase type 5 [ACP5], adjusted P = 0.069) (Supplementary Table 6D).
Differential abundance analysis of proteomics data at the follow-up time point according to glycemic states. A: Volcano plots, with logFC of normalized abundances on the x-axis and log10-transformed nonadjusted P value on the y-axis. Proteins with nonadjusted P < 0.05 (FDR >0.05) are marked blue, and proteins with adjusted P (FDR) < 0.05 are marked red. Prediabetes includes all participants with impaired fasting glucose and/or impaired glucose tolerance. B: Correlation heat map of the 14 differentially abundant proteins in the NGT vs. diabetes comparison, with selected clinical parameters presented as Spearman correlation coefficients with corresponding P values. WHR, waist-to-hip ratio.
Among the 14 proteins, study participants in the highest tertiles of concentrations for liver carboxylesterase 1 (CES1), carboxypeptidase M (CPM), keratin type I cytoskeletal 18 (KRT18), integrin α-5 (ITGA5), lysosomal Pro-X carboxypeptidase (PRCP), and ACP5 had significantly higher odds of having diabetes and lower odds of being normoglycemic compared with participants in the lowest tertiles, both before and after adjustment for sex, type of intervention, and BMI in the respective regression models (Supplementary Table 7). A subanalysis in the prediabetes group comparing study participants with isolated impaired fasting glucose versus those with both impaired fasting glucose and impaired glucose tolerance did not identify proteins with significant differences in their concentrations between the two groups (Supplementary Table 8).
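The tertile comparison can be illustrated with a crude (unadjusted) odds ratio. The sketch below is not the regression model used in the study, which also adjusted for sex, type of intervention, and BMI; the function names and the data are hypothetical.

```python
def tertile_labels(values):
    """Assign 0 (lowest), 1 (middle), or 2 (highest) tertile by rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    labels = [0] * n
    for rank, idx in enumerate(order):
        labels[idx] = min(2, rank * 3 // n)
    return labels


def odds_ratio_high_vs_low(values, has_diabetes):
    """Crude odds ratio of diabetes, highest vs. lowest tertile.

    Raises ZeroDivisionError if a required cell of the 2x2 table is empty.
    """
    counts = {0: [0, 0], 2: [0, 0]}  # tertile -> [non-cases, cases]
    for label, case in zip(tertile_labels(values), has_diabetes):
        if label in counts:
            counts[label][int(case)] += 1
    (low_non, low_case), (high_non, high_case) = counts[0], counts[2]
    return (high_case * low_non) / (high_non * low_case)


# Hypothetical protein levels and outcomes for 12 participants.
levels = list(range(1, 13))
diabetes = [1, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 0]
print(odds_ratio_high_vs_low(levels, diabetes))  # 9.0
```

For a single binary predictor, this 2×2 odds ratio equals the one from an unadjusted logistic regression; adjusted estimates require fitting the full model.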
To evaluate the relative influence of adiposity versus glycemic state on the circulating proteome, we categorized the study participants at follow-up according to their BMI as obese (n = 38), overweight (n = 38), and normal weight (n = 24). The concentrations of 39 proteins were identified as increased and 6 proteins as decreased in obesity compared with normal weight (Supplementary Fig. 2A and Supplementary Table 9). Of those, only leptin was also significantly increased in obesity compared with overweight. Six proteins (IL-1 receptor antagonist [IL1RN], IL-6, glutathione S-transferase A3 [GSTA3], 6-pyruvoyl-tetrahydrobiopterin synthase [PTS], CES1, CPM) were elevated both in diabetes compared with NGT and in obesity compared with normal weight. Leptin, FABP4, furin, and proadrenomedullin (ADM) demonstrated a strong positive correlation (r ≥0.6) with total and subcutaneous adipose tissue volume (Supplementary Fig. 2B). Furthermore, we calculated the log-fold changes (logFCs) of the 14 significant proteins comparing NGT with diabetes, after adjusting for BMI (Supplementary Table 6D and E). In four of the proteins (KRT18, ACP5, arylsulfatase A [ARSA], L-xylulose reductase [DCXR]), the absolute logFCs were slightly higher after adjusting for BMI, suggesting that adiposity does not affect their concentrations. In the other 10 proteins, the absolute logFCs were slightly (sorbitol dehydrogenase [SORD], PRCP, ITGA5), moderately (GSTA3, CPM, β-glucuronidase [GUSB]), or profoundly (IL1RN, CES1, PTS, IL-6) lower after adjustment for BMI, suggesting that adiposity impacts their concentrations to different degrees. Altogether, the differences in the circulating proteome were more profound when participants were grouped based on their BMI than based on their glycemic state. This observation was further supported by sPLS-DA, which showed a clear progression pattern in the proteome from normal weight to overweight and obesity (Supplementary Fig. 3B). In contrast, there was an overlap between the prediabetes and NGT states. Nevertheless, at least a limited separation of the diabetes group, which demonstrated large heterogeneity, from the prediabetes and NGT groups could be observed (Supplementary Fig. 3A).
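Because the protein values are normalized protein expression (NPX) units on a log2 scale, a logFC is a difference in log2 abundance and can be converted to a fold change by exponentiation. The sketch below shows the crude case, in which the logFC of an unadjusted linear model with a single group indicator reduces to the difference in group means; the adjusted logFCs discussed above come from models with covariates and are not reproduced here. All values and names are hypothetical.

```python
import statistics


def log_fold_change(npx_reference, npx_comparison):
    """Difference in mean NPX (log2 scale): comparison minus reference."""
    return statistics.mean(npx_comparison) - statistics.mean(npx_reference)


# Hypothetical NPX values for one protein, not study data.
ngt = [3.0, 3.2, 3.4]
diabetes = [3.9, 4.2, 4.5]

log_fc = log_fold_change(ngt, diabetes)   # about 1.0 on the log2 scale
fold_change = 2 ** log_fc                 # about a twofold higher abundance
print(round(log_fc, 3), round(fold_change, 3))
```

A logFC that shrinks toward zero after adding BMI to the model indicates that part of the crude group difference is attributable to BMI.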
Plasma Proteomics at Baseline and Longitudinally (1 Year Later)
We compared the proteomic profiles of the diabetes, prediabetes, and NGT groups (follow-up classification) in plasma samples of the same study participants from 1 year before (baseline), when all of them were in a prediabetic state. The differential abundance analysis identified 184 (NGT vs. diabetes), 160 (prediabetes vs. diabetes), and 33 (NGT vs. prediabetes) proteins as differentially abundant according to the unadjusted P value, with none of them meeting the cutoff of statistical significance after multiple comparison correction (Supplementary Table 10).
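The gap between nominal and corrected significance can be illustrated with a false discovery rate adjustment. The sketch below implements the Benjamini-Hochberg procedure, which is an assumption: this section does not name the FDR method that was used. The P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical screen: 5 nominal hits among 100 tested proteins.
nominal = [0.01, 0.02, 0.03, 0.04, 0.045] + [0.5] * 95
adjusted = benjamini_hochberg(nominal)

n_nominal = sum(p < 0.05 for p in nominal)
n_fdr = sum(q < 0.05 for q in adjusted)
print(n_nominal, n_fdr)  # 5 0
```

With many tests and only modestly small P values, every nominal hit can fail the corrected threshold, which is the pattern reported for the baseline comparison.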
We further assessed whether the concentrations of the 14 proteins identified to be elevated in the diabetes group compared with the NGT group at follow-up were associated at baseline (in the prediabetic state) with the risk for developing diabetes 1 year later. Among the 14 proteins, the study participants in the highest tertiles of baseline concentrations of DCXR and GSTA3 had significantly higher odds (DCXR: odds ratio 4.6 [P = 0.022]; GSTA3: odds ratio 6.4 [P = 0.010]) of having diabetes 1 year later compared with participants in the lowest tertiles (Supplementary Table 11 and Supplementary Fig. 4A). For GSTA3 and DCXR, the higher odds remained significant after adjusting for sex and type of intervention. Next, we evaluated the impact of the relative changes (Δ) of the concentrations of the 14 proteins (from baseline to follow-up) on the odds of developing diabetes or reversing to NGT. After adjusting for sex, type of intervention, and Δ of BMI, the study participants in the highest tertiles of Δ of four proteins (CPM, ITGA5, IL-6, PRCP) had significantly higher odds of developing diabetes, and those in the highest tertiles of Δ of two proteins (CPM, IL-6) had significantly lower odds of reversing to normoglycemia, compared with study participants in the lowest tertiles (Supplementary Table 11). A STRING network analysis indicated possible functional and physical associations among the 14 proteins. IL-6, SORD, and GUSB demonstrated the most and strongest associations with the other 11 proteins, whereas PRCP was related only to CPM, and GSTA3 to none of the other 13 proteins (Fig. 2A).
Protein interaction, functional enrichment, and weighted protein network correlation analysis. A: Protein interaction map for the differentially abundant proteins in the NGT vs. diabetes comparison based on STRING version 12 analysis. B: Dot plot with the top enriched Gene Ontology biological process terms for proteins differentially abundant in the comparison of normal weight vs. obesity. C: Enrichment map of the biological theme comparison analysis for KEGG terms in the baseline and follow-up differential abundance analyses of NGT vs. diabetes. NGT baseline and diabetes baseline designate participants who had prediabetes at baseline but reversed to NGT or progressed to diabetes 1 year later (at follow-up). D: Correlation heat map of the protein coexpression modules from the weighted protein correlation network analysis with selected clinical parameters. For each correlation, Spearman correlation coefficients and the corresponding P values are presented. E: Dot plot of the proteins in the brown module and their correlation with the module eigenprotein (y-axis) and glucose AUC (x-axis). Proteins with correlation coefficient ≥0.35 with trait (glucose AUC) and >0.60 with eigenprotein are marked in color and named. AGE-RAGE, advanced glycation end products-receptor for advanced glycation end products; ECM, extracellular matrix; EGFR, epidermal growth factor receptor; JAK, Janus kinase; MAPK, mitogen-activated protein kinase; NF, nuclear factor; PI3K, phosphoinositide 3-kinase; Th, T helper cell; WHR, waist-to-hip ratio.
When comparing follow-up with baseline, 136 (NGT vs. baseline), 186 (prediabetes vs. baseline), and 45 (diabetes vs. baseline) proteins were differentially abundant according to the unadjusted P value, with one of them (glucose-fructose oxidoreductase domain-containing protein 2 [GFOD2]) having an FDR-adjusted P < 0.05 and another 20 an FDR-adjusted P value < 0.1 (Supplementary Fig. 5 and Supplementary Table 12). Of note, participants with prediabetes at baseline who remained in the prediabetic state at follow-up had a lower HOMA-IR and higher ISI at follow-up, indicating improved insulin sensitivity compared with baseline, despite maintaining the same glycemic state (prediabetes) (Supplementary Table 5). The concentrations of many of the proteins correlated more strongly with markers of insulin sensitivity than with markers of glucose homeostasis or β-cell function (Supplementary Fig. 5B).
To identify pathways related to the distinct proteomic signatures between participants reversing to NGT or progressing to diabetes, a functional enrichment analysis was performed with the whole proteome both at baseline and follow-up. The enrichment signature in the KEGG pathway gene set database pointed to various primarily inflammatory pathways, such as leukocyte chemotaxis, cytokine-cytokine receptor interactions, and TNF signaling pathways, which are related, among other functions, to the response to different pathogens (Fig. 2B and C). The enrichment signatures showed a remarkable functional consensus between the two time points in the biological theme comparison of the two differentially abundant protein lists, i.e., baseline and follow-up (Fig. 2C). Furthermore, the plasma proteome profiles at the follow-up time point were used to perform a weighted protein correlation network analysis. Four modules were identified, with one of them correlating with parameters of glucose homeostasis (Fig. 2D). Interestingly, this module showed no correlation with HbA1c levels, whereas its strongest correlations were with plasma glucose AUC during a 2-h OGTT and with BMI (r = 0.37 and 0.35, respectively). The module consisted of 79 proteins, and 19 of them were designated as hub proteins (correlation with trait r >0.3 and with module eigengene r >0.6). Among the hub proteins, 26S proteasome non-ATPase regulatory subunit 9 (PSMD9), aldehyde dehydrogenase 1A1 (ALDH1A1), and oncostatin M (OSM) were previously described in association with T2D, whereas the others have not previously been related to glucose homeostasis (Fig. 2D and E and Supplementary Table 13). Furthermore, among the hub proteins, six were particularly associated with glucose AUC (Fig. 2E) and three with BMI (Supplementary Fig. 4B).
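The hub protein designation amounts to a two-threshold filter on each module member. The sketch below applies the cutoffs stated in the text (trait r >0.3, module eigengene r >0.6) as adjustable parameters; the class, the protein names, and the correlation values are hypothetical, and computing the correlations themselves is outside its scope.

```python
from dataclasses import dataclass


@dataclass
class ModuleMember:
    name: str
    r_trait: float   # correlation with the clinical trait (e.g., glucose AUC)
    r_eigen: float   # correlation with the module eigengene


def hub_proteins(members, trait_cutoff=0.3, eigen_cutoff=0.6):
    """Return names of members exceeding both correlation cutoffs."""
    return [
        m.name
        for m in members
        if m.r_trait > trait_cutoff and m.r_eigen > eigen_cutoff
    ]


# Hypothetical module members, not study data.
module = [
    ModuleMember("protein_a", r_trait=0.41, r_eigen=0.72),
    ModuleMember("protein_b", r_trait=0.25, r_eigen=0.80),  # fails trait cutoff
    ModuleMember("protein_c", r_trait=0.36, r_eigen=0.55),  # fails eigen cutoff
    ModuleMember("protein_d", r_trait=0.33, r_eigen=0.61),
]
print(hub_proteins(module))  # ['protein_a', 'protein_d']
```

Keeping the cutoffs as parameters makes it easy to see how the hub list changes under a stricter trait threshold, such as the ≥0.35 used for labeling in the figure legend.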
Plasma Metabolomics at Baseline and Follow-up
NMR plasma metabolomics was performed at the 0-, 30-, and 120-min time points of the OGTT (Supplementary Table 14). The supervised analysis with sPLS-DA in the whole population demonstrated a partial separation of the 120-min samples from the 0- and 30-min samples (Fig. 3A), indicating differences in the metabolomic profile. The separation of 120 min from 0 and 30 min was mainly explained by lower concentrations of several amino acids at 120 min (separation from right to left, Fig. 3A). Both the 0-min (fasting) concentrations and the AUCs during the OGTT were used for further analysis.
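An AUC over the three OGTT time points can be computed with the trapezoidal rule. The sketch below is an assumption about the integration method, which is described in the study's Supplementary Material rather than in this section; the function name and the concentrations are hypothetical.

```python
def ogtt_auc(times_min, concentrations):
    """Trapezoidal area under the concentration-time curve."""
    if len(times_min) != len(concentrations):
        raise ValueError("times and concentrations must have equal length")
    area = 0.0
    for i in range(1, len(times_min)):
        width = times_min[i] - times_min[i - 1]
        mean_height = (concentrations[i] + concentrations[i - 1]) / 2
        area += width * mean_height
    return area


# Hypothetical metabolite concentrations at 0, 30, and 120 min.
auc = ogtt_auc([0, 30, 120], [1.0, 2.0, 1.0])
print(auc)  # 180.0 (30 * 1.5 + 90 * 1.5), in concentration units x min
```

Because the 30- to 120-min interval is three times longer than the first one, the 120-min value carries more weight in the AUC than the fasting value.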
Metabolomics data analysis. A: Individual sPLS-DA plots of metabolomics data at follow-up, with the time point in OGTT as the response variable (left) and variable plot of the 10 variables with the highest contribution to the model components (right). B: Volcano plot of the differential abundance analysis of metabolites at fasting state (0 min) between prediabetes and diabetes at follow-up. C: Volcano plot of the differential abundance analysis of AUCs in OGTT of metabolites between NGT and diabetes at follow-up. D: Volcano plot of the differential abundance analysis of AUCs in OGTT of metabolites between prediabetes and diabetes at follow-up. LogFC of normalized abundances on the x-axis, log10-transformed nonadjusted P values on the y-axis. Metabolites with nonadjusted P < 0.05 (FDR >0.05) are marked blue, and metabolites with adjusted P value (FDR) < 0.05 are marked red. The prediabetes group included all participants with impaired fasting glucose and/or impaired glucose tolerance. E: Correlation heat map of the 15 differentially abundant metabolites (AUCs in OGTT) in participants with diabetes compared with those with prediabetes or NGT at follow-up. For each correlation, Spearman correlation coefficients with corresponding P values are presented. expl. var, explained variance; WHR, waist-to-hip ratio.
When comparing NMR analytes at 0 min of the OGTT (fasting state) among groups at follow-up, the concentrations of seven analytes (HDL-3 apolipoprotein A2 [H3A2], HDL-2 apolipoprotein A2 [H2A2], total HDL apolipoprotein A2 [HDA2], total apolipoprotein A2 [TPA2], lactate, methionine, calcium) were elevated in the diabetes group compared with the prediabetes group after adjusting for multiple comparisons (Fig. 3B). No differences were observed between the NGT group and diabetes group or prediabetes group (data not shown). Next, we compared the AUC derived from the 0-, 30-, and 120-min OGTT measurements of the metabolites among groups at follow-up. Ten analytes had higher AUCs in the diabetes compared with NGT group (Fig. 3C), and nine analytes had higher AUCs in the diabetes compared with prediabetes group (Fig. 3D). These were primarily IDL parameters (concentration of particles [IDPN], apoB [IDAB], and cholesterol [IDCH]), HDL-3 parameters (apoA2 [H3A2], triglycerides [H3TG]), amino acids (isoleucine, leucine, valine, alanine, methionine, tyrosine), and lactate. The upregulated analytes correlated more strongly with indices of body composition (visceral adipose tissue volume and liver fat percentage) and of insulin sensitivity (ISI, HOMA-IR) than with markers of β-cell function (HOMA-B, IGI) (Fig. 3E). After adjusting additionally for sex and type of intervention, the AUCs of IDCH, IDAB, and IDPN remained further significantly upregulated in the diabetes group compared with the NGT group and the AUC of lactate in the diabetes compared with prediabetes group (Supplementary Table 15). No significant differences after adjusting for multiple comparisons were observed among groups at baseline for fasting concentrations or AUCs of metabolites (Supplementary Table 16). 
Similarly, the concentrations of the analytes identified to be elevated in the diabetes group at follow-up were not associated at baseline (in the prediabetic state) with the risk of developing diabetes 1 year later (data not shown). When comparing baseline with follow-up among groups, the concentrations at 0-, 30-, and 120-min OGTT measurements, as well as the AUC of glutamate, increased in participants progressing from prediabetes to diabetes. On the other hand, fasting glutamate concentrations were reduced at follow-up in study participants maintaining their prediabetic state but who had improved insulin sensitivity (Supplementary Table 17).
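The AUCs referred to above summarize the three OGTT time points (0, 30, and 120 min) as a single value per metabolite. As a minimal sketch, assuming the trapezoidal rule was used (the integration method is not stated in this section), the calculation could look as follows. The function name `ogtt_auc` and the example concentrations are hypothetical and serve only as an illustration.

```python
from typing import Sequence


def ogtt_auc(times_min: Sequence[float], concentrations: Sequence[float]) -> float:
    """Area under the concentration-time curve by the trapezoidal rule.

    Assumption: linear interpolation between sampling times; the study's
    actual AUC method is not specified in this section.
    """
    if len(times_min) != len(concentrations):
        raise ValueError("times and concentrations must have the same length")
    if len(times_min) < 2:
        raise ValueError("at least two time points are required")

    auc = 0.0
    for i in range(1, len(times_min)):
        dt = times_min[i] - times_min[i - 1]
        if dt <= 0:
            raise ValueError("time points must be strictly increasing")
        # Area of one trapezoid: interval width times mean of the two heights
        auc += dt * (concentrations[i] + concentrations[i - 1]) / 2.0
    return auc


# Hypothetical metabolite concentrations at 0, 30, and 120 min (arbitrary units)
example_times = [0, 30, 120]
example_values = [50.0, 60.0, 55.0]
print(ogtt_auc(example_times, example_values))  # 6825.0
```

Because the 30- to 120-min interval is three times as long as the 0- to 30-min interval, the later measurements carry more weight in an AUC computed this way.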
Conclusions
We identified 14 proteins with significantly elevated concentrations in patients with prediabetes progressing to diabetes compared with those reversing to normoglycemia within 1 year. Six of these 14 proteins are reported for the first time (DCXR, SORD, PTS, KRT18, GSTA3, ARSA), whereas the concentrations of the other eight (IL1RA, IL-6, CPM, CES1, GUSB, ITGA5, PRCP, ACP5) have been associated with incident and/or prevalent T2D in previous proteomics studies (3–8).
Among the six new proteins, four (DCXR, SORD, PTS, KRT18) show enriched expression in the liver, whereas GSTA3 is highly expressed in the adrenal gland and ARSA is expressed ubiquitously. Elevated concentrations of two of the proteins (DCXR and GSTA3) in the prediabetic state are associated with an increased risk of diabetes 1 year later. DCXR is an enzyme catalyzing the reduction of pentoses, tetroses, trioses, and L-xylulose. It is involved in the uronate cycle of glucose metabolism and in water absorption and prevention of osmolytic stress in the renal tubules (12,13). Our STRING analysis supports interactions of DCXR with members of the aldo/keto reductase superfamily, which contribute to the detoxification of dietary and lipid-derived unsaturated carbonyls (AKR1B10) or to the conversion of glucose to sorbitol (AKR1B1) and of sorbitol to fructose (SORD) (14). Thus, the first plasma proteomics signature in our study indicates increased carbohydrate metabolism through possible activation of the polyol pathway or uronate cycle of glucose metabolism in prediabetes progressing to T2D versus reversing to normoglycemia.
The second plasma proteomic signature indicates increased inflammation in new-onset T2D and specifically possible involvement of pathways related to leukocyte chemotaxis, chemokine signaling, cytokine interactions, and immune response to infections. Importantly, as indicated by biological theme comparison, the inflammation-related differences in the circulating proteome between new-onset T2D and reversion to normoglycemia were also observed 1 year before, when all participants were in the prediabetic state. Regarding specific proteins related to inflammation, IL-6, IL1RN, and ACP5 were among the 14 significant ones identified in our study after FDR adjustment. The relationship of IL-6 and IL1RN with diabetes development has been investigated in several previous studies (15,16), whereas the role of ACP5 in glucose homeostasis remains largely unknown. ACP5 has two isoforms, with one being detected primarily in immune cells (macrophages and dendritic cells) and the other in osteoclasts. In osteoclasts, ACP5 stimulates their activity, and its overexpression has been found to lead to mild osteoporosis in transgenic mice (17). Furthermore, macrophage-expressed ACP5 has been shown to promote pulmonary fibrosis progression and mediate the recruitment of neutrophils and macrophages in bacterial lung infections (18,19). Thus, future studies are needed to investigate the role of ACP5 in immune cell function and organ fibrosis in obesity and diabetes.
GSTA3 was the second protein for which increased levels in prediabetes were associated with a high risk of diabetes 1 year later. GSTA3 catalyzes the double-bond isomerization of precursors of testosterone and progesterone, thus contributing to steroid hormone biosynthesis (20). Additionally, it acts as an antioxidant protecting against aflatoxin B1 injury (21). Finally, GSTA3 might protect against hepatic or renal fibrosis (21–23). The role of GSTA3 in the development of diabetes or diabetes-related complications remains largely unexplored.
Assuming that a large part of the variation in plasma protein concentrations in the study population might be explained by differences not in the glycemic state but in other metabolic parameters, we regrouped the study population based on BMI. Indeed, when comparing obesity with overweight or with normal weight, more profound differences in the circulating proteome were observed than when comparing different glycemic states. Several of the significant proteins are established markers of adipose tissue mass (leptin, FABP4), inflammation (IL1RN, IL-6, OSM), hormones regulating growth function (IGFBP1, IGFBP2, HGF), and members of the inhibin-activin-follistatin system (FST, FSTL3, INHBC). Many of these proteins have been associated with glucose homeostasis and prevalent or incident T2D in previous studies (3,5,7,24,25). Finally, by performing a weighted protein correlation network analysis, we aimed to identify clusters of proteins whose concentrations demonstrate high collinearity, independently of disease state. Four modules were identified, one of which was associated with markers of glucose homeostasis. Nineteen proteins correlated most strongly with this module, and some of them have been previously associated with T2D (PSMD9, OSM, ALDH1A1, TGM2), whereas others (CCS, DNPH1, PPME1) are new and their potential role in glucose homeostasis should be investigated in future studies.
Very few studies so far have assessed NMR metabolomics traits in different glycemic states both at fasting and during an OGTT (26). In our study, the differences in metabolite signatures according to glycemic state were more profound when AUCs of OGTT were compared between groups than when fasting concentrations were compared. Specifically, in the fasting state, participants with diabetes demonstrated mainly elevated apoA2 concentrations compared with those with prediabetes. ApoA2 is a major component of the HDL particles and is thought to be involved in HDL remodeling and cholesterol efflux (27). However, the role of apoA2 in metabolism is complex and not fully understood. On the one hand, glucose might induce apoA2 gene expression in the liver, which can increase plasma triglycerides and further aggravate hyperglycemia (28). On the other hand, apoA2 concentrations have been negatively associated with the risk of coronary artery disease (29) and of microvascular complications in T2D (30). When comparing the AUCs of OGTT, we observed mainly elevated concentrations of IDL particles and components and of several amino acids (branched-chain amino acids, methionine, alanine, tyrosine). These findings agree with previous NMR metabolomics studies in plasma samples that have reported positive associations of IDL, VLDL, small LDL, and small HDL particles with incident or prevalent diabetes (31,32). Similarly, branched-chain amino acids as well as tyrosine, alanine, and glutamate have also been positively associated with diabetes (32–34). Notably, in our study, the concentrations of metabolites and lipid parameters at baseline (in the prediabetic state) were not associated with the risk of diabetes development 1 year later and, thus, showed less predictive potential for prediabetes trajectories compared with proteomic profiles.
A possible explanation for this is that metabolites are often the end products of gene expression, protein activity, and environmental influences, offering a direct snapshot of the (patho)physiological state while also being more susceptible to within-individual variability. In contrast, proteins might be involved earlier in cellular processes and upstream of metabolite production, thus providing earlier indications of changes in a biological system. As prediabetes progressed to diabetes, glutamate levels increased. Elevated glutamate levels have been previously associated with β-cell dysfunction, insulin resistance, and prevalent and incident T2D (35,36). Whether glutamate is causally related to insulin resistance and development of diabetes remains under investigation.
A limitation of our study is the relatively small sample size, which requires large effect sizes to reach FDR-adjusted significance for this large number of investigated proteins and metabolites. For this reason, we also report in our data the unadjusted P values of the group comparisons for all proteins. Moreover, we emphasize the 14 proteins reaching FDR-adjusted significance in our study. Several of these proteins have also been reported in other proteomics analyses of T2D populations, supporting the validity of our findings. Another limitation is that our population consisted primarily of participants of European descent; thus, the novel findings should be confirmed in other more diverse cohorts. Our study also shows that BMI is an important confounder affecting the associations between plasma proteome and glucose homeostasis or prediabetes trajectories. Moreover, the observational character of the study does not allow us to claim causality, and our findings should be considered exploratory. Important strengths of our study are the combined longitudinal and cross-sectional approach supported by comprehensive phenotyping of study participants, as well as the evaluation of two disease trajectories (not only progression to T2D but also remission to normoglycemia) compared with steady state. The 1-year follow-up time enabled us to also assess relatively short-term changes in the proteome and metabolome associated with prediabetes progression or remission, in contrast to the few previous longitudinal omic studies that had reported omic changes after a longer period (5–14 years).
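To illustrate why a large number of tested analytes demands small unadjusted P values, the following minimal sketch adjusts a list of P values for the false discovery rate. It assumes the Benjamini-Hochberg step-up procedure, which is a common choice but is not named in this section; the function name `benjamini_hochberg` and the example P values are hypothetical.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted P values in the input order.

    Assumption: the BH procedure is used for FDR control; the study's
    exact adjustment method is not stated in this section.
    """
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical unadjusted P values for four analytes
raw_p = [0.01, 0.04, 0.03, 0.005]
for p, q in zip(raw_p, benjamini_hochberg(raw_p)):
    print(f"P = {p:.3f}  FDR-adjusted = {q:.3f}")
```

Under this procedure, the k-th smallest of m P values is compared with k × α / m. With the 1,389 proteins measured here and α = 0.05, a single top-ranked protein would need an unadjusted P value below roughly 3.6 × 10⁻⁵ to pass on its own, which in a small cohort requires a large effect size.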
In conclusion, we identify new candidate proteins associated with the progression of prediabetes to diabetes or its remission to normoglycemia within 1 year. The candidate proteins are involved in pathways regulating immune response and carbohydrate metabolism. Finally, we report lipoprotein and amino acid profiles associated with hyperglycemia. The accuracy of the new candidate proteins for the prediction of the progression of prediabetes to diabetes or its remission to normoglycemia should be further assessed in large and ethnically diverse cohorts. Future research should also evaluate whether the observed proteomic and metabolomic signatures are secondary or causally related to the development of diabetes or diabetes-related complications.
This article contains supplementary material online at https://doi.org/10.2337/figshare.27961080.
Article Information
Acknowledgments. The authors thank Alexia Belavgeni, Institute of Clinical Chemistry and Laboratory Medicine, University Hospital and Faculty of Medicine, Technische Universität Dresden, Dresden, Germany, for creating Supplementary Fig. 1 and the graphical abstract, which provides an overview of the study design and analysis being performed.
Funding. N.P. has received transCampus Dresden-King’s College London Science to Business Initiative funding, as well as funding by Deutsche Zentrum für Diabetesforschung (DZD e.V.), German Ministry of Research and Education. J.J.H. was supported by a fellowship by the Deutsche Diabetes Gesellschaft. R.J.S. was funded by Helmholtz Young Investigator Group (VH-GN-1619) and EXC-2124 (03.007_0). A.F.H.P. and S.K. were funded by DZD e.V., German Ministry of Research and Education.
Duality of Interest. P.M. reports research support by grants from the DFG, Else-Kröner Fresenius Center for Digital Health Dresden, BMBF, and National Center for Tumor Diseases. S.K. reports grants outside the study from Wilhelm-Doerenkamp-Foundation, Almond Board of California, J. Rettenmaier & Söhne, German Diabetes Association, and German Association for Diabetes Patients; lecture honoraria from Sanofi, Berlin-Chemie, Boehringer Ingelheim, JuZo-Akademie, and Lilly Deutschland; support for meetings from Wilhelm-Doerenkamp-Foundation; and being a Nutrition Board, German Diabetes Association guidelines team member. A.F.H.P. reports consulting fees from Abbott, Berlin-Chemie, and Novo Nordisk; honoraria from Abbott, BDI, FOMF, AstraZeneca, Sanofi, Berlin-Chemie, Novo Nordisk, Medpoint GmbH, and Eli Lilly; grants through the European Foundation for the Study of Diabetes, and Bundesministerium für Wirtschaft und Energie; payment for expert testimony from Bundesministerium für Ernährung und Landwirtschaft; being a member of Deutsche Diabetes Stiftung; and other financial interests in Institut für Diabetes Technologie–Ulm. M.Bl. reports private consulting fees from Amgen, AstraZeneca, Bayer, Boehringer Ingelheim, Daiichi Sankyo, Eli Lilly, Novo Nordisk, MSD, Sanofi, and Pfizer; consulting fees to his institution from Novo Nordisk and Boehringer Ingelheim; honoraria for lectures from Amgen, AstraZeneca, Bayer, Boehringer Ingelheim, Daiichi Sankyo, Eli Lilly, Novo Nordisk, MSD, Sanofi, and Pfizer; and being on a board for Boehringer Ingelheim.
N.S. reports grants through BMBF, DZD e.V., and European Union (EU)-Innovative Health Initiative Stratification of Obesity Phenotypes to Optimise Future Obesity Therapy; consulting fees from Pfizer, GlaxoSmithKline, and AstraZeneca; honoraria from Pfizer, GlaxoSmithKline, AstraZeneca, Sanofi, Novo Nordisk, and Eli Lilly; meeting or travel support from AstraZeneca, Sanofi, and Eli Lilly; and participation on data safety monitoring or advisory boards of GlaxoSmithKline and AstraZeneca. R.W. reports honoraria from Boehringer Ingelheim, Novo Nordisk, Eli Lilly, and Sanofi; meeting or financial support from Sanofi and Novo Nordisk; and participation on data safety monitoring or advisory boards of Eli Lilly. A.F. reports honoraria from Sanofi, Novo Nordisk, AstraZeneca, Eli Lilly, Boehringer Ingelheim, and Abbott. C.S.M. reports grants through his institution from Merck, Massachusetts Life Sciences Center, and Boehringer Ingelheim; personal consulting fees and support through his institution from Ansh Inc.; collaborative research support from LabCorp Inc.; personal consulting fees from Nestlé, Olympus, Genfit, Lumos, Novo Nordisk, Amgen, Corcept, Intercept, 89Bio, Madrigal, Aligos, Esperion, and Regeneron; educational activity meals through his institution or national conferences from Esperion, Merck, and Boehringer Ingelheim; and travel support and fees from UpToDate, The Metabolyte Institute of America, Elsevier, and the Cardio Metabolic Health Conference. T.C. reports grants from European Research Council, DFG, U.S. National Institutes of Health, Else Kröner Fresenius Center for Digital Health Dresden, BMBF, DZD e.V., National Center for Tumor Diseases, Sächsisches Staatsministerium für Wissenschaft, Kultur und Tourismus, and Deutsche Krebshilfe.
M.R. reports grants from Ministry of Culture and Science of the State of Northrhine Westphalia: German Diabetes Center (DDZ) Profilbildung 2020 (grant PROFILNRW-2020-107-A), BMBF: German Center for Diabetes Research GRK 2576 vivid 493659010 Future4CSPMM, German Federal Ministry of Health: DDZ, European Community (HORIZON-HLTH-2022-STAYHLTH-02-01: P.A.) INTERCEPT-T2D Consortium; consulting fees from Echosens, Novo Nordisk, and Target RWE; lecture honoraria from AstraZeneca, Boehringer Ingelheim, Novo Nordisk, Kenes Group, Madrigal, and MSD; research support from Boehringer Ingelheim and Novo Nordisk; and participation on data safety monitoring or advisory boards of Boehringer Ingelheim, Eli Lilly, and Novo Nordisk. S.R.B. reports grants from Boehringer Ingelheim and advisory board honoraria from Novo Nordisk and Boehringer Ingelheim. N.P. reports consulting fees from Bayer Vital GmbH; support for attending meetings and/or travel from Eli Lilly and Novo Nordisk; speaker honoraria from Novo Nordisk, APOGEPHA, GWT-TUD, Transmedac Innovations AG, Elbe-Gesundsheizszentrum GmbH, and DACH; and guest editor honorarium from Open Exploration outside the submitted work. No other potential conflicts of interest relevant to this article were reported.
Author Contributions. M.Ba., J.J.H., A.H., T.A., P.S., P.M., A.F., S.K., A.F.H.P., M.Bl., J.S., N.S., R.W., A.F., R.J.v.S., S.C., H.H., C.S.M., T.C., A.S., A.L.B., M.R., M.S., S.R.B., and N.P. contributed to data acquisition and data interpretation. M.B., J.J.H., and N.P. performed the data analysis. M.Ba. and N.P. wrote the manuscript with input from all other authors. All authors read and approved the final version of the manuscript. N.P. designed the experiment. M.Ba. and N.P. are the guarantors of this work and, as such, had full access to all the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.
Prior Presentation. Parts of this study were presented in abstract form at the 58th Diabetes Congress of the German Diabetes Society, Berlin, Germany, 8–11 May 2023.
Handling Editors. The journal editors responsible for overseeing the review of the manuscript were Cheryl A.M. Anderson and Amalia Gastaldelli.