Metabolomics, with its wealth of data, offers a valuable avenue for improving risk prediction and decision-making in diabetes. This observational study aimed to leverage machine learning (ML) algorithms to predict the 4-year risk of developing type 2 diabetes mellitus (T2DM) from targeted quantitative metabolomic data. A cohort of 279 patients at cardiovascular risk, who underwent coronary angiography and were initially free of T2DM according to American Diabetes Association (ADA) criteria, was followed for up to 4 years. During this time, 11.5% newly developed T2DM. Targeted metabolomics (Biocrates, Austria) was performed at baseline using liquid chromatography-mass spectrometry (LC-MS) and flow injection analysis-mass spectrometry (FIA-MS). After preprocessing of the metabolomics data set, 362 variables were used for ML with the caret package (CRAN, R). The data set was divided into training and test sets (75:25 ratio), and an oversampling approach was used to address the imbalance in the outcome class (T2DM incidence). After size tuning, the multilayer perceptron (MLP) demonstrated the most promising predictive capabilities, with a sensitivity of 63%, a specificity of 79%, and an accuracy of 77%. The most important variables (top 20, figure) were ceramides, bile acids, and hexoses.
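The evaluation workflow described above (stratified 75:25 split, oversampling of the minority class, then sensitivity, specificity, and accuracy on the held-out set) can be illustrated with a minimal sketch. It is not the study's code: the study used the caret package in R, whereas this sketch uses only the Python standard library and trains no model. All function names are hypothetical. The 32 incident cases are an assumption derived from 11.5% of 279, applying the oversampling to the training set only is an assumption, and the confusion-matrix counts are hypothetical values chosen only so that the resulting rates land close to the reported figures; the abstract does not report them.

```python
import random
from collections import Counter


def stratified_split(labels, train_fraction=0.75, seed=0):
    """Split indices into train/test sets, class by class, to keep class ratios."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(len(idx) * train_fraction)
        train_idx.extend(idx[:cut])
        test_idx.extend(idx[cut:])
    return sorted(train_idx), sorted(test_idx)


def oversample_minority(indices, labels, seed=0):
    """Randomly duplicate minority-class rows until all classes are equally sized."""
    rng = random.Random(seed)
    by_class = {}
    for i in indices:
        by_class.setdefault(labels[i], []).append(i)
    target = max(len(idx) for idx in by_class.values())
    balanced = []
    for idx in by_class.values():
        balanced.extend(idx)
        # Draw with replacement; adds nothing for the majority class.
        balanced.extend(rng.choices(idx, k=target - len(idx)))
    return balanced


def classification_metrics(tp, fn, tn, fp):
    """Compute sensitivity, specificity, and accuracy from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Assumed cohort labels: 32 incident T2DM cases (1) among 279 patients (about 11.5%).
labels = [1] * 32 + [0] * 247

train_idx, test_idx = stratified_split(labels)
# Oversample the training set only, so the test set keeps the original class ratio.
balanced_train = oversample_minority(train_idx, labels)
train_counts = Counter(labels[i] for i in balanced_train)

# Hypothetical test-set confusion matrix (not reported in the abstract).
metrics = classification_metrics(tp=5, fn=3, tn=49, fp=13)

print(len(train_idx), len(test_idx), dict(train_counts))
print({name: f"{value:.3f}" for name, value in metrics.items()})
```

Keeping the oversampling restricted to the training indices means duplicated patients never appear in the test set, which would otherwise inflate the reported performance.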

In conclusion, ML analysis of large metabolomic data sets is a promising tool for identifying individuals at risk of developing T2DM and opens avenues for personalized and early intervention strategies.

Disclosure

A. Leiherer: None. A. Muendlein: None. C.H. Saely: None. T. Plattner: None. B. Larcher: None. A. Mader: None. A. Vonbank: None. R. Laaksonen: None. P. Fraunberger: None. H. Drexel: None.

Readers may use this article as long as the work is properly cited, the use is educational and not for profit, and the work is not altered. More information is available at http://www.diabetesjournals.org/content/license.