Introduction: Deep learning methods, commonly called artificial intelligence (AI), have been widely adopted in medical research. Reinforcement learning is a particularly promising AI tool that can generate optimal strategies based on non-optimized training data. The aim of the current study was to develop and validate a reinforcement learning algorithm for determining optimal glycemic targets for sepsis patients in the ICU.

Methods: We developed and validated a clinical decision support model that provided an individualized daily glycemic target for each critically ill patient, based on retrospective analysis of two independent ICU databases: the Medical Information Mart for Intensive Care III (MIMIC-III) and the eICU Collaborative Research Database (eICU-CRD). We used a Markov decision process (MDP) to formulate the sequential decision-making problem. Discrete states were generated by quantizing all patients' longitudinal health records with a k-means clustering algorithm. To evaluate the clinician policy, we used the temporal-difference (TD) learning method. To learn the AI policy, we started from a random action policy, updated the value function, and then used the updated value function to improve the policy.
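The pipeline above can be sketched in a few steps: cluster patient records into discrete states, evaluate the observed (clinician) behavior with TD(0), and learn an improved policy from the same off-policy data. This is a minimal illustrative sketch on synthetic data; the state/action counts, reward definition, and hyperparameters are assumptions, not the authors' actual implementation.

```python
# Illustrative sketch of the MDP pipeline described in Methods.
# All sizes, rewards, and hyperparameters are assumed for demonstration.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# 1. Discretize continuous patient records into states via k-means.
n_states, n_actions = 10, 5            # e.g. a few glycemic-target bands (assumed)
records = rng.normal(size=(500, 4))    # stand-in for longitudinal vitals/labs
states = KMeans(n_clusters=n_states, n_init=10,
                random_state=0).fit_predict(records)

# 2. Build synthetic trajectories of (state, action, reward, next_state).
def make_trajectory(length=8):
    traj, s = [], int(rng.integers(n_states))
    for _ in range(length):
        a = int(rng.integers(n_actions))   # clinician's observed action
        s2 = int(rng.integers(n_states))
        r = 1.0 if s2 == 0 else 0.0        # toy reward: reaching state 0
        traj.append((s, a, r, s2))
        s = s2
    return traj

trajectories = [make_trajectory() for _ in range(200)]

# 3. TD(0) evaluation of the clinician (behavior) policy.
gamma, alpha = 0.99, 0.1
V = np.zeros(n_states)
for traj in trajectories:
    for s, a, r, s2 in traj:
        V[s] += alpha * (r + gamma * V[s2] - V[s])

# 4. Learn an improved policy: estimate Q from the data, act greedily.
Q = np.zeros((n_states, n_actions))
counts = np.zeros((n_states, n_actions))
for traj in trajectories:
    for s, a, r, s2 in traj:
        counts[s, a] += 1
        # incremental, Q-learning-style update from off-policy data
        Q[s, a] += (1.0 / counts[s, a]) * (r + gamma * Q[s2].max() - Q[s, a])

ai_policy = Q.argmax(axis=1)  # greedy recommendation (action index) per state
```

In practice, step 4 alternates value-function updates with policy improvement, as the abstract describes; the greedy step over the learned Q-values plays the role of the policy-improvement step here.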

Results: The average return per patient was significantly higher in survivors than in non-survivors, and a significant inverse relationship was observed between the return under the AI policy and estimated 90-day mortality. Contrary to current guidelines, which recommend maintaining a glycemic range of 140-180 mg/dL during ICU admission, the AI most often recommended a glycemic range of 120-139 mg/dL for ICU patients. Analysis of mortality rate according to mean glucose level during ICU admission revealed that the greater the difference between the AI recommendation and the actual glycemic control range, the higher the mortality rate.
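The deviation-vs-mortality analysis could be reproduced along the following lines: compute each patient's absolute difference between the AI-recommended target and the observed mean glucose, then tabulate mortality by deviation bin. Everything below is synthetic and illustrative; the bin edges, target values, and mortality model are assumptions, not study data.

```python
# Illustrative sketch of the deviation-vs-mortality tabulation (synthetic data).
import numpy as np

rng = np.random.default_rng(1)
n = 1000
# Assumed midpoints of recommended glycemic bands (mg/dL), per patient.
recommended = rng.choice([110.0, 130.0, 150.0, 170.0], size=n)
# Stand-in for observed mean ICU glucose per patient (mg/dL).
observed = rng.normal(150.0, 25.0, size=n)
deviation = np.abs(observed - recommended)

# Toy outcome: mortality probability rises with deviation (synthetic only).
mortality = rng.random(n) < np.clip(0.1 + deviation / 300.0, 0.0, 1.0)

# Tabulate mortality rate by deviation bin (bin edges assumed).
bin_ids = np.digitize(deviation, [10.0, 20.0, 40.0])
rates = [float(mortality[bin_ids == b].mean()) for b in range(4)]
```

With real data, `recommended` would come from the learned policy's action for each patient's state and `observed` from the glucose records, with mortality taken from outcomes rather than simulated.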

Conclusion: These results suggest that tighter glycemic control, guided by an AI policy that provides daily individualized optimal glycemic targets, may help improve ICU patient survival.


J.Yun: None. G.Lee: None. J.Kim: None. S.Ko: None. Y.Ahn: None. D.Kim: n/a. K.Song: n/a.
