Metabolomic signatures of incident diabetes remain largely unclear for the U.S. Hispanic/Latino population, a group with high diabetes burden. We evaluated the associations of 624 known serum metabolites (measured by a global, untargeted approach) with incident diabetes in a subsample (n = 2,010) of the Hispanic Community Health Study/Study of Latinos who were free of diabetes and cardiovascular disease at baseline (2008–2011). Based on the metabolites significantly associated with incident diabetes, metabolite modules were detected using topological network analysis, and their associations with incident diabetes and longitudinal changes in cardiometabolic traits were further examined. There were 224 incident cases of diabetes after an average 6 years of follow-up. After adjustment for sociodemographic, behavioral, and clinical factors, 134 metabolites were associated with incident diabetes (false discovery rate–adjusted P < 0.05). We identified 10 metabolite modules, including modules comprising previously reported diabetes-related metabolites (e.g., sphingolipids, phospholipids, branched-chain and aromatic amino acids, glycine), and 2 reflecting potentially novel metabolite groups (e.g., threonate, N-methylproline, oxalate, and tartarate in a plant food metabolite module and androstenediol sulfates in an androgenic steroid metabolite module). The plant food metabolite module and its components were associated with higher diet quality (especially higher intakes of healthy plant-based foods), lower risk of diabetes, and favorable longitudinal changes in HOMA of insulin resistance. The androgenic steroid module and its component metabolites decreased with increasing age and were associated with a higher risk of diabetes and greater increases in 2-h glucose over time. We replicated the associations of both modules with incident diabetes in a U.S. cohort of non-Hispanic Black and White adults (n = 1,754). Among U.S. Hispanic/Latino adults, we identified metabolites across various biological pathways, including those reflecting androgenic steroids and plant-derived foods, associated with incident diabetes and changes in glycemic traits, highlighting the importance of hormones and dietary intake in the pathogenesis of diabetes.
Introduction
Hispanic/Latino individuals currently comprise 18.4% of the U.S. population, and the proportion is projected to be 30% by 2050 (1). Compared with other U.S. racial/ethnic groups, the Hispanic/Latino population has distinct socioeconomic, lifestyle, and genetic characteristics that may contribute to their disproportionately high burden of metabolic disorders, including diabetes (2–4).
Technical advances offer new opportunities for deciphering metabolomic signatures of diabetes, potentially improving our understanding of pathogenesis and discovery of novel targets to prevent or treat diabetes (5). Over the past decade, many epidemiological studies have identified a wide range of circulating metabolite biomarkers associated with incident diabetes (5–7), including various amino acids (e.g., branched-chain amino acids [BCAAs], aromatic amino acids [AAAs]) (8–10), multiple lipid species (e.g., phospholipids, triacylglycerols) (11–13), carbohydrates related to glucose/energy metabolism (e.g., mannose, trehalose) (9,14), and microbiota-related metabolites (e.g., bile acids, indolepropionate) (15,16). The Human Metabolome Database (17) reported >3,000 metabolites that have been detected and quantified in blood, but the latest studies on metabolomics and incident diabetes examined <300 serum or plasma metabolites (8–13). The majority of prior studies were conducted in European, U.S. non-Hispanic White, or a mixture of ethnically diverse individuals (5–7).
The Hispanic Community Health Study/Study of Latinos (HCHS/SOL) is a population-based cohort study representing individuals with diverse Hispanic/Latino backgrounds in the U.S. (18,19). In addition to its unique ancestral background, the study includes repeated archived blood samples, well-characterized phenotyping, longitudinal follow-up, and an untargeted and unbiased approach to serum metabolomic profiling of 1,136 metabolites, which could maximize the potential for discovering novel metabolite signatures associated with incident diabetes (20). These study features represent an ideal opportunity to explore metabolic dysfunctions that underlie the development of diabetes in this U.S. population group with high diabetes burden. Herein, we performed a metabolome-wide analysis of incident diabetes in U.S. Hispanic/Latino adults from HCHS/SOL. We also examined associations of diabetes-related metabolite signatures with longitudinal changes in cardiometabolic traits over ∼6 years. Furthermore, we validated potentially novel metabolite signatures associated with incident diabetes in a separate cohort of U.S. non-Hispanic Black and White adults from the Atherosclerosis Risk in Communities (ARIC) study.
Research Design and Methods
Study Design and Population
The HCHS/SOL is a prospective, population-based study of 16,415 Hispanic/Latino adults aged 18–74 years at recruitment who were living in four U.S. metropolitan areas (Bronx, NY; Chicago, IL; Miami, FL; and San Diego, CA). Participants were recruited by using a multistage probability sample design, as described previously (18,19). A comprehensive battery of surveys and a clinical assessment with fasting blood draw were conducted by trained, certified, and bilingual staff at in-person clinic visits from March 2008 to June 2011. The second visit, which included repeated phenotypic assessments and collection of blood samples, started in October 2014 and concluded in December 2017, capturing an ∼6-year follow-up period. The study was approved by the institutional review boards at all participating institutions, and all participants gave written informed consent.
Sample Collection and Metabolomic Profiling
At both visits 1 and 2, participants were asked to fast for at least 8 h before the examination, consume only water and necessary medications, and refrain from smoking or physical activity before undergoing the fasting examination procedures. Venous blood samples were collected, processed, and frozen (at −70°C) onsite toward the beginning of the visit. In total, 3,972 participants who were randomly selected from the visit 1 study population comprised the subsample for metabolomic profiling in HCHS/SOL. Based on the DiscoveryHD4 platform at Metabolon, Inc. (Durham, NC), identification of serum metabolites was achieved by using an untargeted liquid chromatography-mass spectrometry–based metabolomic quantification protocol. More detailed experimental information on mass spectrometry analysis, identification and classification of metabolites, and quality control processes is reported in the Supplementary Material. The platform captured information for a total of 1,136 metabolites, including 782 metabolites with known structural identities and 354 unknown metabolites. For quality control and more interpretable findings, 624 known metabolites with an undetectable rate (<20%) were included in the present analysis (Supplementary Fig. 1).
Measurements of Cardiometabolic Traits and Covariates
Measurements of cardiometabolic traits were performed at both the baseline and follow-up visits. Systolic blood pressure (SBP), diastolic blood pressure (DBP), and anthropometric measurements were performed according to standardized protocols (21). BMI was calculated as weight in kilograms divided by height in meters squared and waist-to-hip ratio (WHR) as the ratio of the circumference of the waist to that of the hips. Centralized laboratory tests were performed to determine plasma glucose, insulin, hemoglobin A1c (HbA1c), and serum lipids, including triglycerides and total, LDL, and HDL cholesterol. For participants without self-reported diabetes, 2-h plasma glucose levels were also measured following a standard 75 g oral glucose tolerance test (OGTT). HOMA of insulin resistance (HOMA-IR) was derived using a common equation based on fasting glucose and insulin (22).
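The text does not spell out the "common equation" for HOMA-IR. Assuming it is the widely used HOMA1 approximation (an assumption, not confirmed here), it can be written as follows, where $G_0$ and $I_0$ are illustrative symbols for fasting glucose and fasting insulin:

```latex
\[
\mathrm{HOMA\text{-}IR}
  = \frac{G_{0}\,[\mathrm{mmol/L}] \times I_{0}\,[\mu\mathrm{U/mL}]}{22.5}
  \approx \frac{G_{0}\,[\mathrm{mg/dL}] \times I_{0}\,[\mu\mathrm{U/mL}]}{405}
\]
```

The second form follows from the glucose unit conversion of roughly 18 mg/dL per mmol/L (22.5 × 18 = 405).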
Structured questionnaires were administered by trained and bilingual interviewers to collect information on demographic, socioeconomic, and behavioral characteristics; medication use; and disease histories at baseline. Habitual dietary intake was estimated using the National Cancer Institute methodology based on two 24 h dietary recalls and a food propensity questionnaire (23). Overall dietary quality for each participant was estimated by the Alternative Healthy Eating Index-2010 (AHEI-2010) score (24). Physical activity was measured using the Global Physical Activity Questionnaire (25).
Diabetes Ascertainment
Participants were classified as having diabetes if they reported use of antidiabetic medications or met the following American Diabetes Association criteria: 1) fasting plasma glucose ≥126 mg/dL, 2) 2-h OGTT plasma glucose ≥200 mg/dL, or 3) HbA1c ≥6.5%. On the basis of these criteria, participants free of diabetes at baseline (visit 1) who were identified as having diabetes at visit 2 were deemed to have incident diabetes (26).
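The classification rule above can be sketched in a few lines of Python. This is only an illustration: the function and parameter names are hypothetical, and treating a missing measurement as "criterion not met" is an assumption, since missing values are not addressed in this passage.

```python
from typing import Optional

# Thresholds as stated in the text (American Diabetes Association criteria).
FASTING_GLUCOSE_CUTOFF_MG_DL = 126.0
OGTT_2H_GLUCOSE_CUTOFF_MG_DL = 200.0
HBA1C_CUTOFF_PERCENT = 6.5


def has_diabetes(uses_antidiabetic_medication: bool,
                 fasting_glucose_mg_dl: Optional[float] = None,
                 ogtt_2h_glucose_mg_dl: Optional[float] = None,
                 hba1c_percent: Optional[float] = None) -> bool:
    """Return True if medication use or any one glycemic criterion is met."""
    if uses_antidiabetic_medication:
        return True
    checks = [
        (fasting_glucose_mg_dl, FASTING_GLUCOSE_CUTOFF_MG_DL),
        (ogtt_2h_glucose_mg_dl, OGTT_2H_GLUCOSE_CUTOFF_MG_DL),
        (hba1c_percent, HBA1C_CUTOFF_PERCENT),
    ]
    # Assumption: a missing value (None) cannot satisfy its criterion.
    return any(value is not None and value >= cutoff
               for value, cutoff in checks)


def is_incident_case(diabetes_at_visit1: bool, diabetes_at_visit2: bool) -> bool:
    """Incident diabetes: free of diabetes at visit 1, diabetes at visit 2."""
    return (not diabetes_at_visit1) and diabetes_at_visit2
```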
Statistical Analysis
After excluding participants who had prevalent diabetes (n = 804) or cardiovascular disease or cancer (n = 261) at baseline, those who died before the follow-up visit or refused or were unable to attend visit 2 (n = 895), and those missing information on Hispanic/Latino background (n = 2), 2,010 participants remained for the present analysis. For the 624 assessed metabolites, the median percentage of missing/undetectable values was 0.15%, and 521 metabolites had <5% missing/undetectable values. These missing/undetectable values of metabolites were imputed using one-half of the lowest detected value, and the values of individual metabolites were transformed using a rank-based inverse normal transformation to approximate a normal distribution (11). To account for oversampling of specific population subgroups and nonresponse to the follow-up visit, all analyses incorporated HCHS/SOL complex study design and sampling weights (18,19).
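The two preprocessing steps described above (half-minimum imputation followed by a rank-based inverse normal transformation) can be sketched as follows. This is a minimal standard-library illustration, not the study's code: the function names are hypothetical, and the rank offset `c` is an assumption because the exact variant of the transformation is not stated (`c = 0.5` gives (rank − 0.5)/n, `c = 3/8` gives the Blom formula).

```python
from statistics import NormalDist


def impute_half_min(values):
    """Replace missing values (None) with half of the lowest detected value."""
    detected = [v for v in values if v is not None]
    fill = min(detected) / 2.0
    return [fill if v is None else v for v in values]


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value is tied with this one.
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def rank_inverse_normal(values, c=0.5):
    """Map ranks to standard normal quantiles via (rank - c) / (n - 2c + 1)."""
    n = len(values)
    std_normal = NormalDist()  # mean 0, SD 1
    return [std_normal.inv_cdf((r - c) / (n - 2.0 * c + 1.0))
            for r in average_ranks(values)]


# Toy metabolite with one undetectable value.
raw = [4.0, None, 2.0, 8.0]
imputed = impute_half_min(raw)          # the None becomes 2.0 / 2 = 1.0
z_scores = rank_inverse_normal(imputed)
```

Because all imputed values are identical, they are tied at the bottom of the distribution and receive the same transformed score.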
Age-adjusted descriptive characteristics of the study population were presented according to incident diabetes status as means or percentages with 95% CIs. Associations between individual metabolites and risk of diabetes were examined using multivariable survey Poisson regression models after multivariable adjustment for demographic and socioeconomic characteristics, lifestyle behaviors, diet quality, medication use, family history of diabetes, and fasting hours for the blood samples (details listed in the footnotes/legends to relevant tables and figures). P values were corrected for the false discovery rate (FDR) using the Benjamini-Hochberg procedure (27). For identified metabolites associated with incident diabetes (FDR-adjusted P < 0.05), metabolite modules were detected by performing a network analysis using a topological overlap measure, which is a data reduction technique that allows for dependency between components and robustly measures interconnectedness based on shared network neighbors (28,29). Associations between metabolite modules and risk of diabetes were examined using multivariable survey Poisson regression models after the aforementioned multivariable adjustment, with or without further adjustment for other conventional cardiometabolic traits (i.e., BMI, WHR, HDL cholesterol, triglycerides, SBP, and DBP). Using multivariable survey linear regression models, we also examined the associations of metabolite modules with 6-year changes in glycemic traits (fasting glucose, HbA1c, 2-h glucose, fasting insulin, and HOMA-IR) and other cardiometabolic traits, further adjusting for baseline levels of the assessed traits and excluding users of antihypertensive, diabetes, or lipid-lowering medications that could influence the traits at either visit.
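For readers unfamiliar with the Benjamini-Hochberg step-up procedure mentioned above, the adjusted P values can be computed as in the following sketch. It is a generic standard-library illustration with a hypothetical function name and toy inputs, not the study's analysis code.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg FDR-adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, ascending P
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy example with four hypothetical raw P values.
raw_p = [0.01, 0.04, 0.03, 0.005]
fdr_p = benjamini_hochberg(raw_p)
significant = [p < 0.05 for p in fdr_p]
```

In the analysis described here, the same correction would be applied across the 624 metabolite-specific P values, with FDR-adjusted P < 0.05 as the threshold.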
Metabolite modules comprising potentially novel metabolites associated with incident diabetes were validated using data from the ARIC study, a cohort of U.S. non-Hispanic Black and White participants aged 45–64 years (9). Diabetes cases in the ARIC study were identified by an elevated blood glucose level at a study visit (fasting glucose ≥7.0 mmol/L or nonfasting glucose ≥11.1 mmol/L) or by self-reported physician-diagnosed diabetes or use of antidiabetic medications at a study visit or through an annual follow-up telephone interview (30). Incident diabetes was defined as new-onset diabetes from baseline through 31 December 2015. A total of 1,754 participants with fasting serum metabolomic data who were free of diabetes, major cardiovascular disease, and cancer at baseline were included, and 629 participants with incident diabetes were identified during ∼25 years of follow-up. Multivariable Cox proportional hazards regression models were used to estimate hazard ratios (HRs) and 95% CIs of incident diabetes associated with a per-SD increment in metabolites. Results from the HCHS/SOL and ARIC cohorts were combined using fixed-effects meta-analysis. All statistical analyses were performed using R version 3.3.2 (R Foundation for Statistical Computing) or Stata version 15.1 (StataCorp) software.
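The fixed-effects pooling step can be illustrated with a generic inverse-variance sketch on the log scale. The function names and the input estimates below are hypothetical placeholders (not results from HCHS/SOL or ARIC), and recovering the standard error from a reported 95% CI assumes a symmetric interval on the log scale.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% CI


def log_estimate_and_se(ratio, ci_low, ci_high):
    """Convert a ratio and its 95% CI to a log estimate and standard error."""
    log_ratio = math.log(ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * Z_95)
    return log_ratio, se


def fixed_effect_pool(estimates):
    """Inverse-variance pooled ratio and 95% CI from (ratio, low, high) tuples."""
    total_weight = 0.0
    weighted_sum = 0.0
    for ratio, ci_low, ci_high in estimates:
        log_ratio, se = log_estimate_and_se(ratio, ci_low, ci_high)
        weight = 1.0 / se ** 2          # precision of each study's estimate
        total_weight += weight
        weighted_sum += weight * log_ratio
    pooled_log = weighted_sum / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return (math.exp(pooled_log),
            math.exp(pooled_log - Z_95 * pooled_se),
            math.exp(pooled_log + Z_95 * pooled_se))


# Hypothetical per-SD estimates from two cohorts (illustrative values only).
pooled_ratio, pooled_low, pooled_high = fixed_effect_pool(
    [(0.80, 0.64, 1.00), (0.80, 0.64, 1.00)]
)
```

Pooling two identical estimates returns the same point estimate with a narrower interval, which is the expected behavior of a fixed-effects model.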
Data and Resource Availability
The data sets analyzed during this study are available from the corresponding author upon reasonable request, subject to a data and materials distribution agreement to protect the confidentiality and privacy of the participants and their families. Alternatively, deidentified data are publicly available at the Biologic Specimen and Data Repository Information Coordinating Center and the database of Genotypes and Phenotypes for the subset of study participants who authorized general use of their data at the time of informed consent.
Results
Participant Characteristics
During an average 6 years of follow-up, 224 incident cases of diabetes were identified among 2,010 participants in HCHS/SOL. Compared with individuals who did not develop diabetes, those with incident diabetes were older, had lower family income, and had higher BMI, WHR, DBP, and triglyceride levels but lower HDL cholesterol levels (Supplementary Table 1). Use of antihypertensive and lipid-lowering medications and family history of diabetes also were more frequent among participants with incident diabetes than among those without. Baseline characteristics and diabetes incidence over ∼6 years among participants included in the current analysis were generally similar to those of participants who were not included (data not shown).
Individual Metabolites and Risk of Diabetes
After multivariable adjustment for sociodemographic, behavioral, and clinical factors, 134 of the 624 known metabolites were significantly associated with incident diabetes (FDR-adjusted P < 0.05) (Fig. 1). Of these metabolites, 110 were positively associated and 24 metabolites inversely associated with risk of diabetes (Supplementary Table 2). These 134 significant metabolites largely included lipids (41.8%) and amino acids (AAs) (35.8%), and 1.5–6.0% were within six other metabolite pathways (Supplementary Fig. 1).
Metabolite Modules and Risk of Diabetes
Based on the 134 metabolites significantly associated with risk of diabetes, 10 metabolite modules were detected by the network analyses (Fig. 2). These included four lipid modules (8–14 metabolites), two AA modules (4 and 38 metabolites), one module of 9 metabolites involved in glucose/energy metabolism, one module of 6 plant food metabolites, one module of 6 androgenic steroid metabolites, and one module of 9 metabolites within a variety of metabolic pathways (Fig. 2, Table 1, and Supplementary Table 2). A visual examination of the module networks showed that the four lipid modules had evident overlaps with each other, while the other modules, except for the module of plant food metabolites, were close neighbors (Fig. 2).
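To make the module-detection idea concrete, the sketch below computes a topological overlap measure from a weighted adjacency matrix, using the commonly cited form in which two nodes overlap strongly when they are directly connected and share many neighbors. It is an assumption that this is the variant used in the present analysis; the function name and the toy network are hypothetical.

```python
def topological_overlap(adjacency):
    """Topological overlap for a symmetric adjacency matrix with entries in [0, 1].

    TOM_ij = (sum_u a_iu * a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij),
    where u runs over nodes other than i and j, and k_i is the
    connectivity of node i (self-connections excluded).
    """
    n = len(adjacency)
    connectivity = [sum(adjacency[i][u] for u in range(n) if u != i)
                    for i in range(n)]
    tom = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            shared = sum(adjacency[i][u] * adjacency[u][j]
                         for u in range(n) if u != i and u != j)
            a_ij = adjacency[i][j]
            denom = min(connectivity[i], connectivity[j]) + 1.0 - a_ij
            tom[i][j] = tom[j][i] = (shared + a_ij) / denom
    return tom


# Toy network: metabolite 1 is linked to metabolites 0 and 2,
# which are not directly linked to each other.
toy_adjacency = [
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 0.0],
]
toy_tom = topological_overlap(toy_adjacency)
```

In the toy network, metabolites 0 and 2 have no direct link but still receive an overlap of 0.5 through their shared neighbor. In practice, the adjacency matrix would typically be derived from pairwise metabolite correlations, and modules would be obtained by clustering on a dissimilarity such as 1 − TOM.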
Metabolite modules | Metabolites (n) | Model 1 RR (95% CI) | Model 1 P | Model 2 RR (95% CI) | Model 2 P |
---|---|---|---|---|---|
Lipid module 1 (sphingolipids) | 8 | 1.69 (1.42–2.02) | <0.001 | 1.40 (1.18–1.67) | <0.001 |
Lipid module 2 (PC, PI) | 14 | 1.62 (1.36–1.93) | <0.001 | 1.39 (1.12–1.73) | 0.003 |
Lipid module 3 (MAG, DAG) | 9 | 1.56 (1.33–1.82) | <0.001 | 1.31 (0.87–1.96) | 0.195 |
Lipid module 4 (PE, LPL) | 10 | 1.52 (1.27–1.81) | <0.001 | 1.22 (0.96–1.55) | 0.106 |
AA module 1 (BCAA, AAA, GGAA) | 38 | 2.02 (1.65–2.48) | <0.001 | 1.57 (1.26–1.95) | <0.001 |
AA module 2 (glycine) | 4 | 0.61 (0.51–0.74) | <0.001 | 0.75 (0.61–0.93) | 0.009 |
Glucose and energy metabolism module | 9 | 1.88 (1.61–2.19) | <0.001 | 1.59 (1.34–1.88) | <0.001 |
Plant food metabolite module | 6 | 0.66 (0.57–0.78) | <0.001 | 0.70 (0.59–0.83) | <0.001 |
Androgenic steroid module | 6 | 1.46 (1.18–1.80) | <0.001 | 1.29 (1.04–1.61) | 0.023 |
Mixed metabolite module | 9 | 1.61 (1.36–1.92) | <0.001 | 1.33 (1.11–1.60) | 0.002 |
Results are per SD increment of the module scores in multivariable survey Poisson regression models. Model 1 was adjusted for age, sex, study field center, Hispanic/Latino background, education, annual household income, smoking status, drinking status, total physical activity, AHEI-2010 score, lipid-lowering medication use, antihypertensive medication use, family history of diabetes, and fasting hours for the blood samples. Model 2 was further adjusted for BMI, WHR, HDL cholesterol, triglycerides, and SBP and DBP. PC, phosphatidylcholine; PI, phosphatidylinositol.
Correlations among the 10 metabolite modules were generally modest, except for a few moderate to substantial correlations among the 4 lipid modules (Supplementary Fig. 2). The four lipid modules showed the most evident positive correlations with triglycerides, total cholesterol, and/or LDL cholesterol (Supplementary Fig. 3). An AA module (including glycine and related metabolites) and the plant food metabolite module both had inverse correlations with glycemic traits, while other metabolite modules were positively correlated with glycemic traits, especially fasting insulin and HOMA-IR. As expected, the module of androgenic steroid metabolites was inversely associated with age (r = −0.34).
After multivariable adjustment, all 10 metabolite modules were significantly associated with risk of diabetes (P-trend < 0.001) (Table 1). Eight modules were positively associated with risk of diabetes, with relative risks (RRs) of diabetes per SD increment in the module score ranging from 1.52 (95% CI 1.27–1.81) for a lipid module comprising phosphatidylethanolamine/lysophospholipid (PE/LPL) metabolites to 2.02 (95% CI 1.65–2.48) for an AA module consisting of BCAA/AAA/γ-glutamyl amino acid (GGAA) metabolites. The glycine AA module (RR 0.61, 95% CI 0.51–0.74) and the plant food metabolite module (RR 0.66, 95% CI 0.57–0.78) were both inversely associated with risk of diabetes. These associations between metabolite modules and risk of diabetes were broadly consistent across subgroups of the population with different Hispanic/Latino heritages (Supplementary Fig. 4). After further adjustment for conventional cardiometabolic traits, including adiposity measures, blood lipids, and BP, most of the examined associations between metabolite modules and risk of diabetes were slightly attenuated (Table 1). Eight metabolite modules remained significantly associated with risk of diabetes, while the associations for the PE/LPL lipid module and another lipid module including nine monoacylglycerol/diacylglycerol (MAG/DAG) metabolites became nonsignificant. The attenuations of the associations appeared to be largely due to the adjustment for HDL cholesterol and triglycerides (for the PE/LPL lipid module) or SBP and DBP (for the other modules) (Supplementary Fig. 5).
Metabolite Modules and Changes in Glycemic Traits
We then examined relationships between baseline metabolite modules and longitudinal changes in cardiometabolic traits over 6 years. There were only a few significant associations between the metabolite modules and changes in serum lipids (e.g., a positive association of the sphingolipid module with changes in triglycerides) and no association between any metabolite module and changes in BP (Supplementary Table 3). However, the majority of the metabolite modules were associated with changes in various glycemic traits in directions consistent with those of the metabolite module-diabetes associations (Fig. 3 and Supplementary Table 3). For example, after multivariable adjustment, including adjustment for BMI and WHR, the BCAA/AAA/GGAA and androgenic steroid modules were both associated with greater increases in fasting glucose (P < 0.007) and 2-h glucose (P < 0.004), and the former was further associated with greater increases in HOMA-IR (P = 0.037), while the plant food metabolite module was associated with attenuated increases in HOMA-IR (P = 0.0008).
Further Analyses for the Plant Food and Androgenic Steroid Metabolite Modules
Compared with metabolites within other modules (especially lipid and AA modules), those in the androgenic steroid and plant food metabolite modules have been less well described in prior diabetes studies (5–7); thus, additional analyses of these metabolites and modules were performed. Among the 10 metabolite modules, the most evident correlations with dietary factors were the positive correlations of the plant food metabolite module with AHEI-2010 score and with intakes of fruits and vegetables (Supplementary Fig. 6). The six plant food metabolites, which were moderately or substantially correlated with each other (Supplementary Fig. 7), were also positively associated with AHEI-2010 score and with intakes of whole fruit and/or other healthy plant foods (Fig. 4A). The plant food metabolite module and its components were further examined for associations with incident diabetes in the ARIC study. Four of the six component metabolites were measured in that study, and a metabolite module comprising these four metabolites showed a significant inverse association with risk of diabetes in both the HCHS/SOL and the ARIC studies as well as in the combined meta-analysis (Fig. 4B).
As expected, the androgenic steroid module score was higher in men than in women, and decreased with increasing age in both sexes (Fig. 5A). The six androgenic steroid metabolites had moderate to substantial correlations with each other (r = 0.47–0.89) (Supplementary Fig. 8). Levels of these metabolites were higher in men than in women, and they were inversely associated with age (Supplementary Fig. 9). We also examined sex-specific associations of the androgenic steroid module score and its component metabolites with risk of diabetes and found that all significant associations were observed in men but not in women (Supplementary Fig. 10). RRs comparing the highest with the lowest quartile of the module score were 4.55 (95% CI 2.06–10.06) in men and 1.01 (95% CI 0.52–1.95) in women (P-interaction = 0.024). Three of the six component metabolites were measured in the ARIC study, and a metabolite module composed of these three metabolites was significantly and positively associated with risk of diabetes in both the HCHS/SOL and the ARIC studies as well as in the combined meta-analysis (Fig. 5B).
Discussion
In a population-based study of U.S. Hispanic/Latino adults using a global, nontargeted metabolomics platform, our analyses systematically assessed a wide range of serum metabolites for their associations with incident diabetes. A total of 134 metabolites covered by distinct biological pathways were significantly associated with risk of diabetes. In network analyses, these metabolomic signatures formed 10 metabolite modules, with some similar to previously reported metabolites indicating diabetes risk (e.g., various lipid and AA metabolite modules) and others suggesting potentially novel metabolites associated with incident diabetes (e.g., androgenic steroid metabolite and plant food metabolite modules). The potentially novel metabolites and modules were replicated in ARIC, a study of U.S. non-Hispanic Black and White individuals. In addition, the majority of the metabolite modules were associated with 6-year changes in various glycemic traits in directions consistent with those of associations for incident diabetes.
Many metabolites have been associated with risk of diabetes in previous studies (5–7), including multiple lipid species, AA metabolites, and metabolites related to glucose and energy metabolism. These associations were confirmed in our study of U.S. Hispanic/Latino adults. For example, AA metabolites have been widely assessed for their associations with risk of diabetes: BCAA and AAA metabolites showed consistent, positive associations with risk of diabetes in the current and previous prospective studies (8–10), and a Mendelian randomization analysis suggested a potential causal relationship between BCAAs and diabetes (31). We also found that the module comprising BCAA/AAA/GGAA metabolites was associated with greater increases in various glycemic traits over time, especially 2-h glucose and HOMA-IR, supporting the role of BCAAs in impaired insulin signaling and insulin resistance (32). An inverse association was observed between another AA module of glycine and related metabolites and risk of diabetes. This is in agreement with findings from observational studies (5) and a Mendelian randomization analysis (8) as well as with evidence from experimental studies supporting an important role of glycine in glucose homeostasis (33). The consistency of this evidence suggests that interventions that target alterations of these AA metabolites in blood may inform efforts toward diabetes prevention.
In addition to these confirmed findings among many known diabetes-associated metabolites, our study identified potentially novel associations of several androgenic steroid metabolites with risk of diabetes, and the results were replicated in the ARIC study with both (non-Hispanic) White and Black participants. These metabolites are sulfated androstenediol and androstanediol, which are inactive forms of androgenic steroids that serve for transport and storage in the circulation and are present at much higher concentrations than unsulfated steroids (34). Androstenediol is a precursor of testosterone, which can be converted to dihydrotestosterone, while androstanediol is a downstream metabolite of dihydrotestosterone and can also be converted back to dihydrotestosterone through an alternative pathway (35). Similar to testosterone, levels of these intermediate androgenic steroids were higher in men than in women and decreased with age, but they were positively associated with risk of diabetes in our study, especially in men, which contrasts with the previously reported inverse association between testosterone and risk of diabetes in men (36,37). Our further analyses indicated a significantly higher risk of diabetes only in men with very high levels of these androgenic steroids. The explanation for this unexpected positive association is unknown, as these intermediate androgenic steroids have rarely been studied for their association with diabetes and previous studies have largely focused on testosterone (36). Future studies systematically investigating these intermediate androgenic steroids along with testosterone and dihydrotestosterone may help to clarify the relationships between androgens and risk of diabetes. Nevertheless, our findings suggest a potentially harmful effect of excess levels of these intermediate androgenic steroids on metabolic health. If these results can be further validated, they might have important clinical implications, as evidence from randomized clinical trials for the beneficial role of testosterone treatment in diabetes prevention and glycemic control is still unclear (38–40).
We also found that several metabolites, including β-cryptoxanthin, isocitrate, threonate, N-methylproline, oxalate, and tartarate, as well as a module of these metabolites were positively associated with dietary intakes of healthy plant-based foods (e.g., fruits and vegetables) and inversely associated with risk of diabetes. These metabolites occur naturally in fruits and vegetables (41–45), and some of these metabolites (i.e., β-cryptoxanthin, threonate, N-methylproline) are considered to be biomarkers of several healthy dietary patterns (e.g., the Mediterranean diet) and their plant food components that are potentially protective against diabetes (46,47). Circulating levels of β-cryptoxanthin and isocitrate have been associated with a lower risk of diabetes in previous studies (8,48), while other metabolites have not been previously reported to be associated with risk of diabetes. We also found that a higher score of the plant food metabolite module was associated with attenuated increases in insulin resistance measures over time, which is in alignment with evidence from animal studies that some of these plant food metabolites (i.e., β-cryptoxanthin, isocitrate) might contribute to improved insulin sensitivity (49–51). However, it is still uncertain whether these metabolites mechanistically underlie the associations between healthy plant-based diets and a lower risk of diabetes (52) or whether they are simply serum biomarkers reflecting general health benefits of a healthy diet.
To our knowledge, this metabolome-wide study of incident diabetes is the first in a U.S. Hispanic/Latino population. Our study has several notable strengths, including an unbiased metabolomic approach covering many metabolites that have not been well studied (e.g., androgenic steroid metabolites), the investigation of dietary factors that are major determinants of circulating metabolites, the network analysis to interpret correlated metabolites and related biological pathways, and a diverse population sample of Hispanic/Latino groups with replication among U.S. non-Hispanic Black and White adults. Using repeated measures of glycemic traits over 6 years, we were able to investigate baseline diabetes-related metabolite profiles in relation to longitudinal changes in glucose and insulin resistance measures, which were rarely studied in previous metabolomic studies and may help to clarify the potential mechanisms underlying the metabolite-diabetes associations. Several limitations to our study also merit consideration. Because of the observational design, we were unable to make causal inferences. To capture a large number of metabolites, we used an untargeted metabolomic approach that was unable to measure absolute values of metabolites, although this did not impede our ability to estimate the associations between metabolites and risk of diabetes. Future studies of targeted metabolomics, especially studies that take into account changes in metabolites over time, are still needed to confirm our findings. The use of self-reported dietary recalls, with their inevitable measurement errors, was another limitation, but the identified diet-metabolite associations were supported by strong biological plausibility, as these metabolites are food compounds and/or derivatives (41–45). Although the identified associations between metabolomic signatures and risk of diabetes were generally consistent across various groups of Hispanic/Latino background and validated in a cohort of U.S. non-Hispanic Black and White individuals, generalizability to other diverse populations is unknown.
In summary, this study systematically profiled a wide range of serum metabolites that were associated with an altered risk of diabetes in U.S. Hispanic/Latino adults. These metabolomic signatures spanned multiple biological pathways through which they may be involved in the pathogenesis of diabetes. In particular, our findings contribute to the increasing body of metabolomic studies of incident diabetes by extending the metabolite-diabetes associations to a significantly understudied U.S. ethnic group comprising a rapidly growing population with a disproportionately high diabetes burden. Our findings also suggest the potential role of a panel of androgenic steroids or plant food–derived metabolites in the development of diabetes.
J.C.C. and G.-C.C. contributed equally to this work.
This article contains supplementary material online at https://doi.org/10.2337/figshare.19324199.
Article Information
Acknowledgments. The authors thank the staff and participants of HCHS/SOL and ARIC for their important contributions. A complete list of HCHS/SOL staff and investigators can be found in LaVange et al. (18) or at https://sites.cscc.unc.edu/hchs/.
Funding. The HCHS/SOL is a collaborative study supported by contracts from the National Heart, Lung, and Blood Institute (NHLBI) to the University of North Carolina (HHSN268201300001I/N01-HC-65233), University of Miami (HHSN268201300004I/N01-HC-65234), Albert Einstein College of Medicine (HHSN268201300002I/N01-HC-65235), University of Illinois at Chicago (HHSN268201300003I/N01-HC-65236 Northwestern University), and San Diego State University (HHSN268201300005I/N01-HC-65237). The following institutes, centers, and offices have contributed to HCHS/SOL through a transfer of funds to NHLBI: National Institute on Minority Health and Health Disparities, National Institute on Deafness and Other Communication Disorders, National Institute of Dental and Craniofacial Research, National Institute of Diabetes and Digestive and Kidney Diseases, National Institute of Neurological Disorders and Stroke, and National Institutes of Health (NIH) Office of Dietary Supplements. The ARIC study has been funded in whole or in part with federal funds from NHLBI, NIH, Department of Health and Human Services (contracts HHSN268201700001I, HHSN268201700002I, HHSN268201700003I, HHSN268201700004I, and HHSN268201700005I and R01HL087641, R01HL059367, and R01HL086694), National Human Genome Research Institute contract U01HG004402, and NIH contract HHSN268200625226C. Infrastructure was partly supported by grant UL1RR025005, a component of NIH and the NIH Roadmap for Medical Research. Metabolomics measurements were sponsored by the National Human Genome Research Institute (3U01HG004402-02S1). 
This work is supported by the National Institute of Diabetes and Digestive and Kidney Diseases grant R01DK119268, and other funding sources for this study include the National Human Genome Research Institute (UM1HG008898), NHLBI (K01HL129892, R01HL060712, R01HL140976, and R01HL136266), New York Regional Center for Diabetes Translation Research (R01DK120870), and National Institute of Diabetes and Digestive and Kidney Diseases (P30DK111022). E.S. was supported by NHLBI grant K24HL152440. C.M.R. was supported by the National Institute of Diabetes and Digestive and Kidney Diseases (K01DK107782 and R03DK128386) and NHLBI (R56HL153178).
Duality of Interest. No potential conflicts of interest relevant to this article were reported.
Author Contributions. J.C.C. and G.-C.C. prepared the tables and figures. J.C.C., G.-C.C., B.Y., and J.X. performed the statistical analyses. J.C.C., G.-C.C., B.Y., J.L., T.K., K.M.P., M.J.P., D.C.V., S.F.C., E.S., C.M.R., M.L.D., J.C., L.V.H., C.R.I., Q.S., M.H., X.X., E.B., R.C.K., and Q.Q. contributed to the interpretation of data and critically reviewed and revised the manuscript. All authors approved the final manuscript. J.C.C. and Q.Q. designed the research and developed the analytical plan. G.-C.C. and Q.Q. had primary responsibility for writing the manuscript. Q.Q. directed the study. Q.Q. is the guarantor of this work and, as such, had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Prior Presentation. Parts of this study were presented in abstract and oral form at the 79th Scientific Sessions of the American Diabetes Association, San Francisco, CA, 7–11 June 2019.