Gestational diabetes mellitus (GDM) affects 3–14% of pregnancies, with 20–50% of these women progressing to type 2 diabetes (T2D) within 5 years. This study sought to develop a metabolomics signature to predict the transition from GDM to T2D. A prospective cohort of 1,035 women with GDM pregnancy was enrolled at 6–9 weeks postpartum (baseline) and screened for T2D annually for 2 years. Of 1,010 women without T2D at baseline, 113 progressed to T2D within 2 years. T2D developed in another 17 women between 2 and 4 years. A nested case-control design used 122 incident case patients matched to non–case patients by age, prepregnancy BMI, and race/ethnicity. We conducted metabolomics with baseline fasting plasma and identified 21 metabolites that significantly differed by incident T2D status. Machine learning optimization resulted in a decision tree model that predicted T2D incidence with a discriminative power of 83.0% in the training set and 76.9% in an independent testing set, which exceeds the discriminative power of fasting plasma glucose level alone (72.4% and 70.6%, respectively). The American Diabetes Association recommends T2D screening in the early postpartum period via oral glucose tolerance testing after GDM, which is a time-consuming and inconvenient procedure. Our metabolomics signature predicted T2D incidence from a single fasting blood sample. This study represents the first metabolomics study of the transition from GDM to T2D validated in an independent testing set, facilitating early interventions.
Introduction
Currently, gestational diabetes mellitus (GDM) occurs in 3–14% of pregnancies, and type 2 diabetes (T2D) develops in 20–50% of women with GDM within 5 years of the index pregnancy (1,2). The American Diabetes Association (ADA) thus recommends T2D screening at 6–12 weeks postpartum and every 1–3 years thereafter via testing fasting plasma glucose (FPG) level using a 2-h 75-g oral glucose tolerance test (OGTT), or hemoglobin A1c level for women in this high-risk population (3). However, the screening of women after GDM pregnancy remains suboptimal, with very low compliance rates of 16–19% (4,5), although integrated health care systems report screening rates of 60% (2). The reasons for low rates include logistical difficulties in administering an OGTT, fear of receiving a diagnosis of diabetes (6), and failure to attend the postpartum follow-up examination (7). Furthermore, many women with a previous GDM pregnancy hold a faulty low-risk perception of T2D incidence (8,9). A metabolic risk score that can quantify risk, for prediction of the transition from GDM to T2D with a single nonfasting test, would thus be beneficial, but is currently unavailable. Although several risk scores have been developed for T2D (10,11), none of them consider a history of GDM diagnosis. Thus, the prediction of T2D in women with a previous GDM pregnancy is critical for individual risk stratification and early prevention after delivery.
Herein, we used a metabolomics approach combined with machine learning methods to identify early diagnostic biomarkers with the best predictive ability for diabetes, a heterogeneous disorder of glucose metabolism that can have diverse root causes across racial and ethnic subgroups (12). We measured metabolites in stored frozen fasting plasma samples drawn at 6–9 weeks postpartum under standardized research protocols from women with recent GDM who were free of diabetes according to the 2-h 75-g OGTT, and in whom annual follow-up screening with 2-h 75-g OGTTs identified incident cases of T2D within 2 years.
Previous metabolomic investigations of T2D in the general population have revealed significant differences between patients with diabetes and normal glucose-tolerant (NGT) control subjects (13–22), although the majority of these were cross-sectional studies of T2D prevalence. Recently, a study (23) performed lipidomic analysis and evaluated the risk of T2D among women of northern European ancestry with previous GDM. In that study, clinical variables combined with lipid species predicted 21 cases of T2D during 8.5 years of follow-up with ∼80% accuracy. However, this signature has not been independently validated or tested among other ethnicities. Thus, there is an unmet need to predict T2D after GDM pregnancy with a more convenient and accurate method. This study represents the first metabolomics study of the transition from GDM to T2D and offers a quantitative measure of risk, as well as insight into the etiology of the transition.
Research Design and Methods
Study Design
The Study of Women, Infant Feeding, and Type 2 Diabetes Mellitus After GDM Pregnancy (SWIFT) is a prospective cohort study that enrolled 1,035 racially and ethnically diverse women (age 20–45 years) in whom GDM was diagnosed via a 3-h 100-g OGTT based on the Carpenter and Coustan criteria (24), who had no history of diabetes or other serious health conditions, received prenatal care, and delivered singleton pregnancies after ≥35 weeks of gestation at Kaiser Permanente Northern California (KPNC) hospitals during 2008–2011 (25). Details of the study recruitment, selection criteria, methodologies, and baseline characteristics of the cohort (75% minority women [Asian, Hispanic, and black] and 25% of low income), have been described previously (25,26). The SWIFT participants provided written consent to attend three in-person study visits at baseline (6–9 weeks postpartum), and 1 year and 2 years postpartum that included a 2-h 75-g OGTT, and assessments of lactation intensity and duration, sociodemographics, medical and reproductive history, lifestyle behaviors, and anthropometry (25). At each study visit, trained research staff collected and processed plasma samples at the fasting and 2-h time points during the 75-g OGTT and completed assessments. These plasma samples were analyzed within several weeks for levels of glucose and insulin, and subsequently for selected levels of lipids and lipoproteins, as previously described (26,27). The study design and all procedures were approved by the KPNC Institutional Review Board for the protection of human subjects. Of 1,010 women without T2D at baseline, 959 (95%) had follow-up assessments for T2D status within 2 years after baseline via annual study OGTTs and electronic medical records to capture diagnoses of diabetes from KPNC clinical laboratory tests within and beyond the 2 years after baseline (28). T2D diagnosis was based on ADA criteria (29).
Design of Experiment
Of the 130 incident cases of T2D, 113 cases developed within 2 years after baseline (28), and another 17 cases developed beyond 2 years as of December 2014. Using a nested case-control study design within the prospective cohort, 122 incident cases of T2D (105 within 2 years, and 17 beyond 2 years postbaseline) were matched to non-T2D control subjects in a 1:1 ratio based on age, prepregnancy BMI, and race/ethnicity. Age, prepregnancy BMI, and ethnicity/race distributions for the excluded incident T2D cases were not significantly different from incident T2D cases included in the analysis. The 122 incident T2D cases were split in a 2:1 ratio for the training and testing sets. Importantly, for the training set incident T2D cases were matched to control subjects on time of the annual screening tests within the 2 years of follow-up and used to develop a metabolic risk signature. Subsequently, the testing set, comprising 28 incident cases within 2 years as well as 14 incident cases beyond 2 years, was used to independently ensure the generalizability of the model. Fig. 1 displays the study design and work flow.
Metabolite Assay Development
To assay all metabolites of interest, a total of 182 metabolites were subpaneled into four major methods and evaluated in fasting plasma samples collected at 6–9 weeks postpartum. The subpanel of 13 free fatty acids and 4 amino acids were selected based on a literature review of over a dozen T2D metabolomics studies (13–22,30,31). These metabolites were chosen on the basis of consistency in trend direction and significance in a minimum of two studies. Both free fatty acid and amino acid subpanel assays were developed in-house, as described below in the following relevant sections. In addition, a total of 163 metabolites were assayed using the p150 AbsoluteIDQ plate technology according to the manufacturer instructions (Biocrates Life Sciences AG, Innsbruck, Austria). All assays were performed by the Analytical Facility for Bioactive Molecules (The Hospital for Sick Children, Toronto, ON, Canada). β-Hydroxybutyrate (catalog #700190; Cayman Chemicals, Ann Arbor, MI) was assayed by ELISA, whereas FPG level and 2-h OGTT postload glucose (2hPG) were assayed as previously described (26). Only metabolites with a coefficient of variation of <20% for each batch were accepted for the multiplex methods, although the majority had coefficients of variation of <15%. In addition, values were accepted only if the read concentration was within the dynamic range of the assay.
Amino Acid Analysis
For amino acid analyses, aliquots (10 µL) of plasma samples and standard mix samples (0.05–50 µg/mL leucine [Leu] and isoleucine [Ile], 0.005–5 µg/mL 2-aminoadipic acid [2-AAA] and phenyl acetyl glutamine [PAG]) were spiked with the internal standard mixture (5 µg/mL Leu-d10 and Glu-d3, 0.5 µg/mL PAG-d5 in H2O plus 0.1% fatty acid) and extracted by protein precipitation using 600 µL of methanol. Samples were then derivatized with 100 µL of 3N HCl in n-butanol, evaporated, and reconstituted in 500 µL of the liquid chromatography–tandem mass spectrometry (LC-MS/MS) mobile phase. LC-MS/MS analysis was performed on a 1290 Infinity LC System (Agilent Technologies) with a Q-Trap 5500 Mass Spectrometer (AB Sciex). Chromatography was performed isocratically on a Kinetex HILIC Column (2.6 µm, 100 Å, 50 × 4.6 mm) (Phenomenex) at a flow rate of 500 µL/min using 5 mmol/L ammonium formate (pH 3.2) in 10/90 water/acetonitrile as the mobile phase. Data were acquired by scheduled multiple reaction monitoring.
Free Fatty Acid Analysis
For selected fatty acids, aliquots (20 µL) of plasma samples and standard mix samples (palmitic [C16:0], palmitoleic [C16:1 n-7], cis-7-hexadecenoic [C16:1 n-9], stearic [C18:0], oleic [C18:1 n-9], vaccenic [C18:1 n-7], linoleic [C18:2], α-linolenic [C18:3], arachidic [C20:0], eicosenoic [C20:1 n-7], arachidonic [C20:4], eicosapentaenoic [C20:5], docosapentaenoic [C22:5], and docosahexaenoic [C22:6] acids) were spiked with internal standards (myristic acid-d3 [C14:0-d3], palmitoleic acid-d14 [C16:1-d14], heptadecanoic acid [C17:0], and eicosanoic acid-d3 [C20:0-d3]). Samples were then acidified with 1 mol/L HCl, and extracted twice with 1 mL of hexane. The combined hexane phases were taken to dryness and derivatized with equal amounts of 1% pentafluorobenzyl bromide and 1% diisopropylamine, evaporated, and reconstituted in 200 µL of hexane. The samples were then injected on the gas chromatography–mass spectrometry system. Excellent separation on the chromatograph was observed for every fatty acid, except for oleate and vaccenate. These two were thus combined to give a total concentration for C18:1.
Statistical Analysis
Testing and training set characteristics at baseline were compared using χ2 statistics for categorical variables (race, education, perinatal characteristics, and medication use), ANOVA for the means of continuous variables (levels of fasting plasma lipids and glucose, age, and BMI), and the Wilcoxon rank sum test for the median months of follow-up. A two-tailed independent t test was computed to determine significant differences between non-T2D and incident T2D in the baseline metabolite concentrations, with an α-value set at P < 0.05 using SPSS Statistics version 20 (IBM, Armonk, NY), and P values were then corrected for multiple comparisons with the Benjamini-Hochberg method using RStudio software version 0.99.486 (Boston, MA). Predictive modeling was performed using WEKA software (University of Waikato, Hamilton, New Zealand). The best model was selected as the one with the highest sum of the discriminative power from the receiver operating characteristic (ROC) curves and the F score (32), which is a measure that places greater weight on detecting future cases. The J48 machine learner was optimized to develop a broad classifier by setting the confidence threshold to 0.5 and the minimum number of objects in a leaf node to 14. The naive Bayes classifier was used with the default parameter settings in the WEKA software. Sensitivity (Se), specificity (Sp), and precision (P) were further calculated from the classification plot for both the training and testing sets.
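The classification measures named above follow directly from the counts in a two-class confusion table. The sketch below uses Python rather than the WEKA software used in this study and shows only the arithmetic; the function name `classification_measures` and the counts are hypothetical and are not taken from the SWIFT data.

```python
def classification_measures(tp, fn, tn, fp):
    """Return Se, Sp, P, and accuracy from confusion-table counts.

    tp/fn: incident T2D cases predicted as T2D / as non-T2D
    tn/fp: non-T2D subjects predicted as non-T2D / as T2D
    """
    return {
        "Se": tp / (tp + fn),                         # sensitivity
        "Sp": tn / (tn + fp),                         # specificity
        "P": tp / (tp + fp),                          # precision
        "Accuracy": (tp + tn) / (tp + fn + tn + fp),  # overall agreement
    }


# Hypothetical counts, for illustration only.
example = classification_measures(tp=30, fn=10, tn=28, fp=12)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

In a 1:1 matched design such as this one, accuracy is the simple average of Se and Sp, because cases and control subjects are equal in number.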
Pearson correlation coefficients were calculated to analyze the relationship between significant metabolites and baseline clinically relevant parameters (6–9 weeks postpartum BMI, FPG, 2hPG, fasting insulin, and HOMA-insulin resistance [IR]) using SAS for Windows (version 9.1.3; SAS Institute Inc., Cary, NC).
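For readers less familiar with the statistic, the Pearson coefficient can be sketched in a few lines. This is an illustrative Python version, not the SAS procedure used in this study; the function name `pearson_r` and the paired values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of the deviations.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical paired measurements, for illustration only.
hexoses_mmol_l = [4.5, 4.8, 5.0, 5.3, 5.6]
fpg_mg_dl = [88, 93, 97, 101, 110]
r = pearson_r(hexoses_mmol_l, fpg_mg_dl)
print(f"r = {r:.3f}")
```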
Results
Baseline sociodemographic and clinical characteristics of training and testing sets are summarized in Table 1. Although the mean age of women in the training set was significantly younger (P < 0.05) compared with the testing set, no statistically significant differences in any other baseline or prenatal clinical characteristics were found. The race/ethnicity distributions in both the training and testing sets were similar. There was no statistically significant difference in either prepregnancy or baseline (6–9 weeks postpartum) BMI, total caloric intake, or physical activity. A greater proportion of T2D incident case patients had a family history of T2D in the testing set compared with the training set. At baseline, there were significantly higher mean levels of FPG, 2hPG, and fasting insulin, and a higher proportion of case patients treated with insulin or oral diabetes medication during pregnancy among incident T2D case patients compared with non-T2D case patients (P < 0.05) in both sets. The mean HOMA-IR was higher for T2D versus non-T2D case patients (P < 0.05) only in the training set.
Characteristics | Training set, non-T2D (n = 80) | Training set, incident T2D (n = 80) | Testing set, non-T2D (n = 42) | Testing set, incident T2D (n = 42) |
---|---|---|---|---|
Sociodemographic/clinical | ||||
Age, years | 33.1 (4.5) | 33.3 (5.2) | 35.1 (5.5)† | 35.4 (5.5)† |
Race/ethnicity, n (%) | ||||
Non-Hispanic white | 13 (16) | 12 (15) | 8 (19) | 9 (21) |
Asian (East, South, Southeast) | 26 (33) | 26 (33) | 13 (31) | 10 (24) |
Non-Hispanic black | 10 (12) | 10 (12) | 2 (5) | 5 (12) |
Hispanic | 31 (39) | 31 (39) | 17 (41) | 17 (41) |
Other | 0 (0) | 1 (1) | 2 (5) | 1 (2) |
Parity, n (%) | ||||
Primiparous (1 birth) | 31 (39) | 26 (33) | 13 (31) | 16 (38) |
Biparous (2 births) | 27 (34) | 29 (36) | 14 (33) | 16 (38) |
Multiparous (>2 births) | 22 (27) | 25 (31) | 15 (36) | 10 (24) |
GDM prenatal treatment, n (%) | ||||
Diet only | 50 (63) | 33 (41)*‡ | 29 (69) | 19 (45)*‡ |
Oral medications | 28 (35) | 38 (48) | 13 (31) | 17 (40) |
Insulin | 2 (2) | 9 (11) | 0 (0) | 6 (14) |
Gestational age at GDM diagnosis (weeks) | 24.4 (7.5) | 22.0 (8.6) | 25.0 (7.1) | 23.3 (8.1) |
Prepregnancy BMI, kg/m2 | 33.3 (8.3) | 33.5 (8.4) | 32.6 (7.5) | 33.1 (7.6) |
Postpartum 6–9 weeks BMI, kg/m2 | 33.2 (7.8) | 33.5 (7.7) | 32.4 (6.6) | 33.3 (7.6) |
Hypertension history, n (%) | 16 (20) | 19 (24) | 8 (19) | 8 (19) |
Family history of diabetes, n (%) | 42 (53) | 45 (56) | 19 (33) | 27 (64)*‡ |
6–9 weeks postpartum, lifestyle | ||||
Smoker, n (%) | 2 (3) | 4 (5) | 1 (2) | 1 (2) |
Physical activity, met-h/week | 47.4 (21.0) | 54.2 (25.1) | 49.4 (21.6) | 48.8 (24.9) |
Total energy intake, kcal/day | 811 (319) | 805 (338) | 774 (340) | 900.4 (297) |
Lactation intensity groups, n (%) | ||||
Exclusive lactation | 20 (25) | 10 (12) | 8 (19) | 8 (19) |
Mostly lactation | 30 (38) | 28 (35) | 15 (36) | 17 (41) |
Mostly formula/mixed | 18 (22) | 19 (24) | 10 (24) | 12 (29) |
Exclusive formula | 12 (15) | 23 (29) | 9 (21) | 5 (12) |
6–9 weeks postpartum, plasma | ||||
FPG, mg/dL | 95 (8.4) | 103 (10.5)* | 93.5 (7.8) | 101.4 (11.3)* |
2hPG, mg/dL | 109 (25.9) | 132 (29.5)* | 116 (28.5) | 132 (30.2)* |
Fasting insulin, µU/mL | 26 (14.8) | 33 (17.7)* | 25.6 (12.1) | 29.1 (20) |
Fasting triglycerides, mg/dL | 128 (90.7) | 150 (105.2) | 134 (79.6) | 151.3 (106) |
Fasting HDL-C, mg/dL | 49 (13.2) | 49 (13.0) | 51.5 (13.0) | 49.4 (10.9) |
HOMA-IR | 6.1 (3.7) | 8.6 (5.0)* | 5.97 (3.0) | 7.47 (5.9) |
HOMA-B | 299 (183) | 305 (156) | 313 (153) | 284 (193) |
Postbaseline, 2-year follow-up | ||||
Subsequent birth, n (%) | 5 (6) | 5 (6) | 9 (21) | 2 (5)*‡ |
Follow-up in months, median (IQR) | 22.4 (1.9) | 16.4 (11.6)*‡§ | 21.8 (2.8) | 18.3 (12.5) |
Data are presented as the mean (SD) unless otherwise noted. Plasma values are from the SWIFT database (26).
*P < 0.05 between incident T2D and non-T2D groups;
‡Determined by χ2 test;
†P < 0.05 between training and testing sets. Specific differences between specific characteristics are shown in boldface type.
§Determined by Wilcoxon rank sum test for medians.
A total of 110 metabolites passed all quality control criteria, as described above. In the training set, a two-tailed independent t test was carried out, with 21 metabolites found to significantly differ between T2D and non-T2D case patients (Table 2). The levels of metabolites 2-AAA (P < 0.009), Ile (P < 0.009), Leu (P < 0.007), threonine (Thr) (P < 0.02), tryptophan (Trp) (P < 0.02), tyrosine (Tyr) (P < 0.0008), valine (Val) (P < 0.002), xleucine (xLeu) (P < 0.0009), hexoses (P < 0.000002), and the acylcarnitine (AC) AC3 (P < 0.05) were significantly elevated in incident T2D compared with non-T2D case patients. In contrast, levels of the metabolite glycine (Gly) (P < 0.04), sphingomyelin (SM) metabolites SM (OH) C16:1 (P < 0.04), SM (OH) C22:2 (P < 0.04), SM C18:0 (P < 0.03), SM C18:1 (P < 0.005), SM C20:2 (P < 0.0002), SM C24:1 (P < 0.02), phosphatidylcholine (PC) metabolites PC ae C40:5 (P < 0.05), PC ae C42:5 (P < 0.03), PC ae C44:5 (P < 0.05), AC10 (P < 0.05), and free fatty acid palmitoleic acid (C16:1 n9) (P < 0.04) were decreased in incident T2D compared with non-T2D case patients. Furthermore, levels of Tyr, Val, xLeu, hexoses, and SM C20:2 remained statistically significant after Benjamini-Hochberg correction for multiple comparisons (Table 2).
No. | Metabolites | Non-T2D | Incident T2D | Uncorrected P value | Corrected P value* |
---|---|---|---|---|---|
1 | 2-AAA | 1.06 ± 0.44 | 1.27 ± 0.54 | 8.02E-03 | 1.01E-01 |
2 | Gly | 311.1 ± 112.63 | 279.14 ± 71.7 | 3.38E-02 | 2.31E-01 |
3 | Ile | 46.94 ± 9.09 | 51.39 ± 11.8 | 8.30E-03 | 1.01E-01 |
4 | Leu | 115.05 ± 21.79 | 126.34 ± 29.01 | 6.05E-03 | 9.50E-02 |
5 | Thr | 141.13 ± 27.78 | 154.77 ± 43.81 | 1.99E-02 | 1.83E-01 |
6 | Trp | 66.76 ± 8.31 | 70.52 ± 10.99 | 1.57E-02 | 1.57E-01 |
7 | Tyr | 94.82 ± 17.48 | 106.33 ± 24.51 | 7.95E-04 | 2.23E-02 |
8 | Val | 230.79 ± 35.52 | 252.44 ± 45.63 | 1.01E-03 | 2.23E-02 |
9 | xLeu+ | 200.69 ± 29.18 | 220.64 ± 43.67 | 8.63E-04 | 2.23E-02 |
10 | Hexoses | 4.7 ± 0.51 | 5.16 ± 0.63 | 1.13E-06 | 1.24E-04 |
11 | SM (OH) C16:1 | 2.87 ± 0.69 | 2.62 ± 0.8 | 3.87E-02 | 2.31E-01 |
12 | SM (OH) C22:2 | 7.13 ± 1.45 | 6.59 ± 1.83 | 3.90E-02 | 2.31E-01 |
13 | SM C18:0 | 17.21 ± 3.83 | 15.82 ± 4.19 | 2.98E-02 | 2.31E-01 |
14 | SM C18:1 | 8.91 ± 2.01 | 7.94 ± 2.21 | 4.11E-03 | 7.54E-02 |
15 | SM C20:2 | 0.42 ± 0.12 | 0.34 ± 0.12 | 1.33E-04 | 7.33E-03 |
16 | SM C24:1 | 26.86 ± 5.52 | 24.52 ± 6.44 | 1.47E-02 | 1.57E-01 |
17 | PC ae C40:5 | 4.81 ± 1.21 | 4.36 ± 1.59 | 4.32E-02 | 2.31E-01 |
18 | PC ae C42:5 | 2.27 ± 0.46 | 2.08 ± 0.59 | 2.42E-02 | 2.05E-01 |
19 | PC ae C44:5 | 1.18 ± 0.25 | 1.09 ± 0.32 | 4.47E-02 | 2.31E-01 |
20 | AC10 | 0.25 ± 0.08 | 0.22 ± 0.06 | 4.63E-02 | 2.31E-01 |
21 | AC3 | 0.28 ± 0.08 | 0.31 ± 0.1 | 4.55E-02 | 2.31E-01 |
22 | Palmitoleic acid (C16:1 n9) | 2.76 ± 0.96 | 2.45 ± 0.86 | 3.86E-02 | 2.31E-01 |
Data are presented as the mean ± SD, unless otherwise noted. Concentrations of metabolites are in μmol/L except for hexoses (mmol/L).
*P values are corrected for multiple comparisons with the Benjamini-Hochberg method and significant metabolites are shown in boldface type.
+Metabolites assayed using both the Biocrates plate technology and the in-house method; xLeu was excluded from the prediction analysis.
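The corrected P values in Table 2 are consistent with a Benjamini-Hochberg adjustment across the 110 metabolites that passed quality control. The sketch below is an illustrative Python version, not the RStudio code used in this study. It takes the five smallest uncorrected P values from Table 2 and, as a stated assumption, pads the remaining 105 unreported P values with 1.0, which leaves the adjusted values of these five metabolites unaffected. The function name `benjamini_hochberg` is hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so that each adjusted value
    # is the minimum of p * m / rank over all equal or larger ranks.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Five smallest uncorrected P values reported in Table 2.
top5 = {
    "Hexoses": 1.13e-06,
    "SM C20:2": 1.33e-04,
    "Tyr": 7.95e-04,
    "xLeu": 8.63e-04,
    "Val": 1.01e-03,
}
# Assumption: the other 105 P values are not reported; 1.0 is a placeholder.
padded = list(top5.values()) + [1.0] * (110 - len(top5))
adjusted = dict(zip(top5, benjamini_hochberg(padded)))
for name, value in adjusted.items():
    print(f"{name}: {value:.2e}")
```

Under these assumptions, Tyr, xLeu, and Val share one adjusted value because the step-up rule carries the smallest value of p × m / rank down to every lower rank, which matches the identical corrected values listed for these three metabolites in Table 2. Small differences in the last digit are expected because the uncorrected P values in the table are rounded.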
To identify a set of metabolites with an accurate prediction of future occurrence of T2D, we selected a rigorous method of splitting data into training (model building) and testing (model verification) over methods such as cross-validation and holdout. Several methods of attribute selection were explored. First, attributes were ranked by predictive capacity and then trained and tested in a naive Bayes model. Although this initial model worked well in a 10-fold cross-validation, it performed poorly in the testing set, indicating that this method of attribute selection contained data set–specific biases (data not shown). Next, the J48 decision tree method using random sampling of attributes to build trees and then selecting and pruning the trees to identify the best performing attributes (the metabolite model) was used to create the model. We optimized the J48 model by increasing the confidence threshold to 0.5 and the minimum number of subjects to 14. These settings ensured a broad classifier model that was not prone to overfitting.
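To illustrate how a minimum leaf size keeps a tree broad, the sketch below searches for a single threshold on one metabolite while requiring at least 14 subjects on each side of the split. It is a deliberately simplified Python illustration and not the WEKA J48 implementation: it scores splits by plain information gain, whereas C4.5-style learners such as J48 use the gain ratio and add pruning. The function names and the concentrations are hypothetical.

```python
import math


def entropy(labels):
    """Binary entropy (bits) of a list of 0/1 labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return -sum(q * math.log2(q) for q in (p, 1 - p) if q > 0)


def best_threshold(values, labels, min_leaf=14):
    """Best single split by information gain with >= min_leaf per side.

    Returns (gain, threshold, n_left, n_right), or None if no split
    satisfies the minimum leaf size.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    parent = entropy([y for _, y in pairs])
    best = None
    for i in range(min_leaf, n - min_leaf + 1):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # identical values cannot be separated
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        gain = parent - (i / n) * entropy(left) - ((n - i) / n) * entropy(right)
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        if best is None or gain > best[0]:
            best = (gain, threshold, i, n - i)
    return best


# Hypothetical concentrations (mmol/L): 20 control subjects, then 20 cases.
values = [4.0 + 0.05 * i for i in range(40)]
labels = [0] * 20 + [1] * 20
split = best_threshold(values, labels)
# With only 20 subjects, no split can leave 14 on each side.
too_small = best_threshold(values[:20], labels[:20])
print(split, too_small)
```

The second call shows the effect of the constraint: small subgroups cannot be split further, so the tree cannot grow branches that fit only a handful of subjects.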
The resulting metabolite model had a high summation of the area under the curve (AUC) and F score in the training set (Fig. 2A), relying on only a few metabolites, as follows: PC ae C40:5, hexoses, branched-chain amino acids (BCAAs) (Val, Leu, Ile), and SM (OH) C14:1. Baseline (6–9 weeks postpartum) FPG alone predicted T2D incidence in the training set, with an AUC of 0.724 (95% CI 0.645–0.803, P < 0.0001), an Se of 60.0%, an Sp of 75.0%, an F score of 0.649, and a total score of 1.373. In contrast, the metabolite model resulted in an AUC of 0.830 (95% CI 0.765–0.894, P < 0.000001), with an Se of 86.3%, an Sp of 68.8%, an F score of 0.793, and a total score of 1.623. We next applied the metabolite model and the FPG model to the testing data set and assessed relative performance using ROC curves (Fig. 2B). The FPG model was worse at predicting the occurrence of T2D, with an AUC of 0.706 (95% CI 0.596–0.816, P < 0.01), an Se of 57.1%, an Sp of 66.7%, an F score of 0.600, and a total score of 1.306. In contrast, the metabolite model performed well, with an AUC of 0.769 (95% CI 0.667–0.871, P < 0.001), an Se of 73.8%, an Sp of 69.1%, an F score of 0.721, and a total score of 1.490 (Table 3). The metabolite model also outperformed the use of 2hPG in both the training set (2hPG: AUC 0.726, F score 0.631, total score 1.357) and the testing set (2hPG: AUC 0.661, F score 0.615, total score 1.276).
Sets | Parameters | Optimized machine learner algorithm | AUC* | Se | Sp | Accuracy | P | F score | Best model score (F score plus AUC) |
---|---|---|---|---|---|---|---|---|---|
Training | FPG | LR | 0.724 (0.645–0.803) | 60.00% | 75.00% | 67.50% | 70.60% | 64.90% | 1.373 |
Training | 2hPG | LR | 0.726 (0.648–0.804) | 58.75% | 72.50% | 65.63% | 68.12% | 63.09% | 1.3569 |
Training | Metabolite model | DT | 0.830 (0.765–0.894) | 86.30% | 68.80% | 77.50% | 73.40% | 79.30% | 1.623 |
Testing | FPG | LR | 0.706 (0.596–0.816) | 57.10% | 66.70% | 61.90% | 63.20% | 60.00% | 1.306 |
Testing | 2hPG | LR | 0.661 (0.543–0.779) | 57.10% | 71.40% | 64.30% | 66.70% | 61.50% | 1.276 |
Testing | Metabolite model | DT | 0.769 (0.667–0.871) | 73.80% | 69.10% | 71.40% | 70.50% | 72.10% | 1.490 |
Testing | Glucose model (FPG and 2hPG) | DT | 0.732 | 88.10% | 47.60% | 67.90% | 62.70% | 73.30% | 1.465 |
Testing | Combined model | NB | 0.754 | 54.80% | 76.20% | 65.50% | 69.70% | 61.30% | 1.367 |
DT, J48 decision tree; LR, logistic regression; NB, naive Bayes.
*Data are presented as the mean and 95% CI.
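The F scores and best model scores in Table 3 can be reproduced from the reported precision, sensitivity, and AUC if the F score is taken to be the harmonic mean of P and Se. That reading is an assumption, although it agrees with the tabulated values to three decimal places. The short Python check below uses hypothetical function names and is not the WEKA output.

```python
def f_score(precision, sensitivity):
    """Harmonic mean of precision and sensitivity (assumed F score)."""
    return 2 * precision * sensitivity / (precision + sensitivity)


def best_model_score(auc, precision, sensitivity):
    """Sum of AUC and F score, the model selection criterion in Table 3."""
    return auc + f_score(precision, sensitivity)


# (AUC, P, Se) as reported in Table 3.
reported = {
    "Training, FPG": (0.724, 0.706, 0.600),
    "Training, metabolite model": (0.830, 0.734, 0.863),
    "Testing, FPG": (0.706, 0.632, 0.571),
    "Testing, metabolite model": (0.769, 0.705, 0.738),
}
scores = {name: best_model_score(*row) for name, row in reported.items()}
for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```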
Using FPG and 2hPG, we built a model with the J48 decision tree method (the glucose model). The glucose model had greater Se but worse P and Sp compared with the metabolite model (glucose model: P 0.627, Se 0.881, Sp 0.476; metabolite model: P 0.705, Se 0.738, Sp 0.690). To determine whether combining the glucose model and metabolite model (the combined model) could improve prediction, we built an optimized naive Bayes classifier model combining the four metabolite species and glucose data (FPG and 2hPG). The combined model showed worse prediction compared with metabolites alone (P 0.697, Se 0.548, Sp 0.762). Of the three models, the metabolite-only model outperformed the other two, with the highest AUC and best model score (Table 3). The predictions from the three models (metabolite, glucose, and combined metabolite-glucose) were directly compared in a Venn diagram to determine the similarities and differences among the models (Fig. 3).
From the comparisons of the three models (Fig. 3), the combined model captured all six future T2D case patients who were predicted solely by the glucose model and missed by the metabolite model. The glucose model captured only 11 of 16 future T2D case patients predicted by the metabolite model. The combined model fared worse in the prediction of control subjects, with eight unique false-positive findings (control subjects predicted as patients with diabetes; Fig. 3).
Pearson correlation coefficients were calculated between the 22 metabolites that significantly differed between incident T2D cases and non-T2D cases in the training set, together with the metabolites selected by machine learning, and five baseline clinical parameters that significantly differed between incident T2D and non-T2D case patients in both the training and testing sets (BMI, FPG, 2hPG, fasting insulin level, and HOMA-IR). SM C24:1 most significantly and negatively correlated with BMI (P < 0.0005, r = −0.277). The correlations of 2-AAA, Ile, AC3, hexoses, and SM C20:2 were most significant with FPG (P < 0.0005, and r = 0.283, 0.278, 0.306, 0.826, and −0.284, respectively). For 2hPG, total hexoses were the most significantly correlated metabolite (P < 0.005, r = 0.211), as expected. All other metabolites, with the exception of palmitoleic acid, significantly correlated with both fasting insulin level and HOMA-IR (Table 4). Interestingly, among all 22 significant metabolites, Gly and hexoses were the only metabolites to correlate significantly with all five of the following clinical parameters: BMI (r = −0.151, 0.160), FPG (r = −0.192, 0.826), 2hPG (r = −0.173, 0.211), fasting insulin (r = −0.279, 0.311), and HOMA-IR (r = −0.281, 0.429). SM (OH) C14:1 correlated negatively with BMI, FPG, 2hPG, fasting insulin level, and HOMA-IR, like the other SMs investigated in this study.
Parameter and metabolite | BMI (kg/m²) | Fasting glucose (mg/dL) | 2hPG (mg/dL) | Fasting insulin (μU/mL) | HOMA-IR |
---|---|---|---|---|---|
2-AAA | 0.210** | 0.283*** | 0.115 | 0.335*** | 0.353*** |
Gly | −0.151+ | −0.192* | −0.173* | −0.279*** | −0.281*** |
Ile | 0.230** | 0.278*** | 0.144 | 0.415*** | 0.437*** |
Leu | 0.055 | 0.242** | 0.15* | 0.343*** | 0.367*** |
Thr | 0.218** | 0.156* | 0.025 | 0.150+ | 0.153+ |
Trp | −0.161* | 0.22** | 0.061 | 0.171* | 0.187* |
Tyr | 0.205** | 0.252** | 0.028 | 0.335*** | 0.353*** |
Val | 0.073 | 0.235** | 0.161* | 0.409*** | 0.418*** |
AC10 | −0.022 | −0.165* | 0.139 | −0.201* | −0.202* |
AC3 | 0.104 | 0.306*** | 0.184* | 0.362*** | 0.387*** |
xLeu+ | 0.118 | 0.311*** | 0.197* | 0.481*** | 0.508*** |
Hexoses | 0.16* | 0.826*** | 0.211** | 0.311*** | 0.429*** |
Palmitoleic acid (C16:1n9) | 0.246** | −0.1 | −0.009 | 0.098 | 0.068 |
PC ae C40:5 | −0.252** | −0.054 | 0.081 | −0.329*** | −0.311*** |
PC ae C42:5 | −0.115 | −0.033 | 0.018 | −0.266*** | −0.252** |
PC ae C44:5 | −0.006 | −0.177* | −0.182* | −0.204** | −0.217** |
SM C18:0 | −0.181* | −0.150* | 0.028 | −0.266*** | −0.272*** |
SM C18:1 | −0.049 | −0.157* | −0.039 | −0.254** | −0.263*** |
SM C20:2 | −0.092 | −0.284*** | −0.122 | −0.358*** | −0.376*** |
SM C24:1 | −0.277*** | −0.246** | −0.025 | −0.475*** | −0.475*** |
SM (OH) C14:1 | −0.136 | −0.207* | −0.175* | −0.257** | −0.279*** |
SM (OH) C16:1 | −0.161* | −0.199* | −0.087 | −0.315*** | −0.329*** |
SM (OH) C22:2 | −0.201* | −0.226** | −0.034 | −0.378*** | −0.385*** |
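Each entry in the table above is a Pearson correlation coefficient between one metabolite and one clinical parameter. The sketch below shows how such a coefficient and its t statistic (for the null hypothesis of zero correlation, with n − 2 degrees of freedom) can be computed. It is an illustration only: the function names and the sample values are hypothetical, and the software the authors used is not stated in this section.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r: float, n: int) -> float:
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    if n < 3:
        raise ValueError("at least 3 observations are required")
    if abs(r) >= 1.0:
        # Perfect correlation: the statistic diverges.
        return math.copysign(math.inf, r)
    return r * math.sqrt((n - 2) / (1.0 - r * r))


if __name__ == "__main__":
    # Hypothetical values for illustration only (not study data).
    hexoses = [4.1, 4.6, 5.0, 5.3, 5.9, 6.4]  # arbitrary units
    fpg = [88.0, 92.0, 95.0, 101.0, 104.0, 112.0]  # mg/dL
    r = pearson_r(hexoses, fpg)
    print(f"r = {r:.3f}, t = {t_statistic(r, len(fpg)):.2f}")
```

Converting the t statistic to a P value requires the t distribution; in practice a library routine such as `scipy.stats.pearsonr`, which returns both the coefficient and a two-sided P value, would typically be used.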
Discussion
GDM represents one of the strongest risk factors for the development of T2D among young women, with 20–50% of affected women developing T2D within 5 years after delivery (1). Metzger et al. (33) reported that greater severity of hyperglycemia during pregnancy predicted T2D conversion within 6 months postpartum as opposed to 5 years, and that higher prepregnancy BMI increased the risk of T2D within 5 years postpartum. The Diabetes Prevention Program Research Group (34) reported a greatly reduced risk of T2D progression among women with a history of GDM with either lifestyle modification or metformin treatment, with a T2D incidence of 10–15% within 10 years compared with 50% in the standard care group. Nevertheless, many women with GDM hold a false perception of low-risk status for future diabetes (8,9). Moreover, diabetes screening is suboptimal during the postpartum period because glucose tolerance testing is time-consuming and requires a fasting period.
Herein, we explored a combination of several significantly altered metabolites for the prediction of incident T2D compared with the clinical parameters FPG and 2hPG among women matched on age, race/ethnicity, and BMI. Our metabolite model predicts T2D above and beyond the risk contributed by obesity. Several metabolites were statistically significant predictors of incident T2D, and they were previously associated with T2D in cross-sectional metabolomics studies, suggesting that women with GDM who are at risk for progressing to T2D present a more T2D-like metabolite profile within the very short time frame of 6–9 weeks postpartum compared with women who will remain without diabetes. Women in whom T2D developed were also more likely to have been treated with insulin or oral medication during pregnancy, underscoring the predictive value of the severity of glucose intolerance during pregnancy.
Comparison of the three T2D predictive models identified the metabolite model as the most balanced for type I (false positive) and type II (false negative) errors over the glucose model. A combined model of metabolites and glucose could improve the capture of future T2D over glucose alone, but with higher false-positive prediction rates. This increased type I error suggests a conflict between the predictions arising from the metabolite or glucose models. Alternatively, these false-positive predictions of future diabetes may represent the detection of individuals in whom diabetes will develop beyond the 2-year window of our current study.
The levels of several amino acids (2-AAA, Ile, Leu, Thr, Trp, Tyr, Val) were increased in subjects with incident T2D, except for Gly, which was significantly decreased. These amino acids are known predictors of T2D (19). The metabolite 2-AAA has been reported to be increased up to 12 years before T2D onset (30). In our study, 2-AAA levels were elevated in women with incident T2D after a previous pregnancy with GDM and were positively correlated with IR. However, in a cross-sectional study by Fiehn et al. (15) that assessed 2-AAA levels in African American women with prevalent T2D, no statistically significant difference was observed. Mechanistically, in murine models treated with 2-AAA, decreased levels of FPG and enhanced glucose-stimulated insulin secretion in β-cell models were observed (30). It is still to be determined whether a similar response exists in humans.
BCAA levels correlate with IR in obese subjects (35). Catabolism of BCAAs plays an important role in T2D and impaired fasting glucose levels (36). Clinical trials (18) have also demonstrated that levels of BCAAs, such as Leu, Ile, and Val, are increased up to 7 years before T2D onset. In this study, BCAAs were elevated at 6–9 weeks postpartum among women who were at the highest risk of subsequent progression to T2D, indicating that this metabolic profile precedes the onset of disease rather than being a consequence of T2D.
In our cohort, we observed higher levels of the hexoses (all six-carbon sugars, such as glucose, fructose, and mannose) in women with incident T2D, which is consistent with the findings of others (18). Interestingly, in a T2D metabolomics study, Fiehn et al. (15) characterized carbohydrates and found fructose levels to be significantly elevated in obese women with T2D. Unlike glucose, fructose stimulates hepatic lipogenesis, which may result in hepatic IR, a key feature of T2D (37).
We also observed an overall reduction of sphingomyelin species in individuals with incident T2D compared with non-T2D. Wang-Sattler et al. (19) confirmed a decrease in SM C20:2, SM C16:0, and SM C16:1, among other SM species, and Floegel et al. (21) observed a decrease in SM C16:1 and an inverse association with insulin secretion. In these nested case-control studies, the decreases were found up to 7 years before T2D incidence. The metabolic breakdown of SM results in ceramides, which are known to induce β-cell apoptosis (38,39). Further research is required to determine whether altered concentrations of ceramides mechanistically contribute to T2D, and specifically to levels of SM C20:2, the sphingomyelin species most significant in this cohort.
Anderson et al. (40) investigated the lipidome of postpartum women who were normal, had hyperglycemia (non-GDM), or had GDM. They observed that phosphatidylcholine, lysophosphatidylcholine, ACs, and free fatty acids had the strongest correlations. Lappas et al. (23) applied lipidomics analysis of plasma collected at 12 weeks postpartum in 104 women with a GDM pregnancy who were NGT postpartum and later evaluated T2D again at 8–10 years after delivery in a model including age, BMI, pregnancy FPG, postnatal FPG, triacylglycerol, and total cholesterol, and three metabolites (CE 20:4, PE(P-36:2), and PS 38:4). In our study, palmitoleic acid, AC3, and AC10 were significantly altered with incident T2D. Palmitoleic acid levels were positively related to T2D among older adults (41), and AC3 is known to be integral in the pathway of BCAA catabolism (35). In previous studies, AC10 level has been associated with a graded increase among individuals who were NGT, had impaired glucose tolerance, and had T2D, but others found no significant difference in the association of AC10 levels with T2D compared with female control subjects (14,42). In contrast, our study revealed a decrease in AC10 levels.
Prediction revealed two novel metabolites, PC ae C40:5 and SM (OH) C14:1, as being predictive of incident T2D. Interestingly, PC ae C40:5 was not only significantly decreased in women with incident T2D, but also negatively correlated with BMI, fasting insulin levels, and HOMA-IR. Importantly, machine learning selected metabolite SM (OH) C14:1, a metabolite not associated with T2D incidence. This is because in predictive modeling, as opposed to traditional exploratory research, association is not a requirement for variable inclusion (43). Interestingly, similar to other SMs, SM (OH) C14:1 correlated negatively with BMI, FPG level, and 2hPG, which may partially explain why the combined model did not outperform the metabolite-only model.
Presently, the ADA recommends T2D screening via measuring fasting glucose levels or conducting a 2-h 75-g OGTT at 6–12 weeks postpartum and thereafter every 1–3 years for women with a prior GDM diagnosis, with more frequent testing if screening results fall within the prediabetes ranges. Our metabolomics signature holds the potential to replace the requirement for frequent OGTTs, overcoming both loss to follow-up and low screening rates with a single fasting measurement. In addition, this signature performed comparably to or better than the 2-h postload plasma glucose level after the OGTT in predicting future T2D incidence within 2 years. Furthermore, this signature presents valuable insight into the etiology of the transition to T2D in women with previous GDM.
Clinical trial reg. no. NCT01967030, clinicaltrials.gov.
Article Information
Acknowledgments. The authors thank Michael Leadley, Ashley St. Pierre, Hayley Craig-Barnes, and Denis Reynaud of the Analytical Facility for Bioactive Molecules of The Centre for the Study of Complex Childhood Diseases, The Hospital for Sick Children, Toronto, Ontario, Canada, for services in the development of the selected reaction monitoring–mass spectrometry protocol for the free fatty acids and special amino acids, as well as for assaying the p150 AbsoluteIDQ plate technology (Biocrates Life Sciences AG, Innsbruck, Austria).
Funding. The SWIFT study (E.P.G., Principal Investigator) was funded by the National Institute of Child Health and Human Development grants R01-HD-050625, R01-HD-050625-03S1, and R01-HD-050625-05S (to E.P.G.). This project was also supported in part by National Institutes of Health National Center for Research Resources grant UCSF-CTSI UL1-RR-024131 and by grants from the Kaiser Permanente Community Benefit Program (Northern California) and the W.K. Kellogg Foundation (to E.P.G.). The metabolomics study was funded by Canadian Institutes of Health Research (CIHR) grants FDN-143219 (to M.B.W.) and MOP-136810 and by Canadian Diabetes Association grant CG-3-12-37 (to M.B.W.). A.A. was supported by an Ontario Graduate Scholarship and the Banting & Best Diabetes Centre (BBDC), University of Toronto. A.N. was supported by a postdoctoral fellowship from the Danish Diabetes Academy supported by Novo Nordisk Foundation. K.J.P. was supported by a CIHR doctoral research award. Y.L. and M.Z. were supported by postdoctoral fellowships from BBDC. B.J.C. is supported by a Tier II Canada Research Chair.
Duality of Interest. No potential conflicts of interest relevant to this article were reported.
Author Contributions. A.A. and A.N. designed this study, analyzed the data, and wrote the manuscript. K.J.P., Y.L., M.Z., and F.F.D. helped to design the metabolite assay panel and provided valuable discussion concerning data interpretation. X.N. and B.J.C. performed data analysis and conducted the sample selection and matching. L.R.O. provided valuable discussion concerning data interpretation. E.P.G. designed this study, is the principal investigator of the SWIFT study, designed the study that collected all data and biospecimens used for this analysis, and contributed to the analytic approach and writing of the manuscript. M.B.W. designed this study and was the primary investigator of the metabolomics study. A.A., E.P.G., and M.B.W. are the guarantors of this work and, as such, had full access to all the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.