Aim: Current clinical classification paradigms for obesity are coarse, and a more precise classification is needed.

Methods: Unsupervised machine learning (ML) is used to cluster patients with obesity in 2 independent cohorts from 1 institution: an outpatient cohort (n=507) and an inpatient cohort (n=229). Clustering is performed separately on each cohort, based on 4 clinical variables selected by expert physicians (age, insulin area under the curve [AUC], follicle-stimulating hormone [FSH], and uric acid). Statistics from a lean cohort (n=702) serve as the control.
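The clustering step described above can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm, so k-means on z-scored variables is used here purely as a stand-in assumption; the function names (`standardize`, `kmeans`) and all patient values are hypothetical and invented for illustration only.

```python
import random
from statistics import mean, pstdev


def standardize(rows):
    """Z-score each column so variables on different scales weigh equally."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(x - m) / s for x, m, s in zip(row, mus, sds)] for row in rows]


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(rows, k, n_iter=100, seed=0):
    """Plain k-means (an assumed stand-in, not the abstract's stated method)."""
    rng = random.Random(seed)
    centers = [list(r) for r in rng.sample(rows, k)]  # k distinct starting points
    labels = None
    for _ in range(n_iter):
        # Assign each patient to the nearest center.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(row, centers[j])) for row in rows
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        # Move each center to the mean of its current members.
        for j in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == j]
            if members:
                centers[j] = [mean(col) for col in zip(*members)]
    return labels, centers


# Made-up rows: (age, insulin AUC, FSH, uric acid). Not study data.
patients = [
    [30, 100, 5.0, 300], [32, 110, 6.0, 310], [31, 105, 5.5, 305],
    [60, 300, 40.0, 500], [62, 310, 42.0, 510], [61, 305, 41.0, 505],
]
labels, centers = kmeans(standardize(patients), k=2)
print(labels)
```

In the study setting, the same procedure would be run separately on each cohort with 4 clusters; standardizing first matters because the 4 variables are measured in very different units.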

Results: The classification reveals 4 metabolically distinct obesity clusters in each cohort (Fig 1). The Jaccard similarity between the 2 cohorts’ clusters is 0.865. MHO shows a moderate basal metabolic rate (BMR) with relatively healthy hormone levels and glucose metabolism. LMO shows the lowest BMR and hormone levels, the oldest age, and the most severely impaired glucose metabolism. Both HMO-T1 and HMO-T2 show the highest BMR and hormone levels, the lowest glucose, and the highest incidence of hyperuricemia and female hyperandrogenemia. Moreover, higher uric acid and higher testosterone in females are observed in HMO-T1, whereas extremely high insulin secretion and a low incidence of diabetes are seen in HMO-T2.
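For readers unfamiliar with the Jaccard index reported above, the following is a minimal sketch of one way such a score can be computed. The abstract does not state how the two cohorts' clusters were compared, so this assumes a common setup: two labelings of the same set of patients (for example, labels produced by each cohort's clustering model), scored per cluster and averaged. The function names and the toy labels are hypothetical.

```python
def jaccard(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B| of two collections, treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 1.0  # convention chosen here: two empty sets count as identical
    return len(a & b) / len(union)


def mean_cluster_jaccard(labels_a, labels_b):
    """Average per-cluster Jaccard index between two labelings of the same patients."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same patients")
    scores = []
    for cluster in sorted(set(labels_a) | set(labels_b)):
        # Patient indices assigned to this cluster under each labeling.
        members_a = {i for i, lab in enumerate(labels_a) if lab == cluster}
        members_b = {i for i, lab in enumerate(labels_b) if lab == cluster}
        scores.append(jaccard(members_a, members_b))
    return sum(scores) / len(scores)


# Toy labels for six hypothetical patients; not study data.
labels_model_1 = ["MHO", "MHO", "LMO", "LMO", "HMO-T1", "HMO-T2"]
labels_model_2 = ["MHO", "LMO", "LMO", "LMO", "HMO-T1", "HMO-T2"]
print(round(mean_cluster_jaccard(labels_model_1, labels_model_2), 3))
```

A value of 1 means the two labelings agree on every cluster's membership, and a value near 0 means almost no overlap; this sketch assumes cluster names are already matched between the two labelings.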

Conclusion: Clinically separable subgroups of obesity can be identified automatically by ML, and the subgrouping generalizes across 2 independent cohorts. This may help interpret pathogenesis and enable more precise therapeutic decision-making for patients with obesity.


Z. Lin: None. Y. Hui: None. S. Wu: Other Relationship; Spouse/Partner; LifeVantage. S. Qu: None.


National Key R&D Program of China (2018YFC1314100); National Natural Science Foundation of China for Youth (81500687); Fundamental Research Funds for the Central Universities of Tongji University (22120190210)

Readers may use this article as long as the work is properly cited, the use is educational and not for profit, and the work is not altered. More information is available at