Type 2 diabetes (T2D) is a heterogeneous disease that varies across individuals in presentation, progression, and response to treatment. We developed a novel graph neural network (GNN)-based framework to identify distinct T2D progression pathways using electronic health record (EHR) data.

We identified T2D patients using 2015-2020 EHR data from the University of Florida Health System. We extracted (1) demographics, labs, medications, and comorbidities; and (2) social determinants of health, obtained via natural language processing of clinical notes. The framework comprised three steps: (1) modeling T2D progression graphs based on the encounters in longitudinal EHRs, (2) learning outcome-oriented latent T2D progression representations, where the outcome was glycemic control measured by HbA1c over time, and (3) identifying T2D progression subphenotypes using a hierarchical clustering algorithm.

Of the 2,415 T2D patients, the mean age was 59 (SD: 13) years, and 60% were female. We identified 5 distinct T2D progression subphenotypes (Figure, C1-C5), which exhibited distinct sociodemographic and clinical characteristics. C1 had the slowest progression of T2D, with the lowest burden of hypertension. Compared to C1, C5 had a faster progression and greater renal decline. C2 had the fastest progression, with the highest burden of obesity and neuropathy.

Our GNN-based T2D progression subphenotypes may support precision treatment and disease management.
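The hierarchical clustering step (3) could be sketched roughly as follows. The abstract does not state the linkage criterion, the distance metric, or the dimension of the learned representations, so average linkage, Euclidean distance, and the two-dimensional toy vectors are assumptions made for illustration only. The function names are hypothetical, and none of the values come from the study.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(c1, c2, points):
    """Mean pairwise distance between members of two clusters (assumed linkage)."""
    total = sum(euclidean(points[i], points[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def agglomerative_clusters(points, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain."""
    # Start with every point in its own cluster.
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        # combinations() yields index pairs with a < b, so deleting b is safe.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: average_linkage(
                clusters[pair[0]], clusters[pair[1]], points
            ),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    # Convert the cluster membership lists into one label per point.
    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


# Hypothetical stand-ins for per-patient latent progression representations.
toy_embeddings = [
    (0.0, 0.1), (0.1, 0.0),
    (5.0, 5.1), (5.1, 5.0),
    (10.0, 0.0), (10.1, 0.2),
]
labels = agglomerative_clusters(toy_embeddings, n_clusters=3)
print(labels)
```

In this toy setting, each pair of nearby vectors ends up sharing a label. Applied to the study's setting, the input would be the learned latent representations and `n_clusters` would be 5, although how the number of subphenotypes was chosen is not described in the abstract.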
Y. Huang: None. Z. Fan: None. W.T. Donahoo: None. J. Guo: None. J. Bian: None.