Despite technological advances, results from various clinical trials have repeatedly shown that many individuals with type 1 diabetes (T1D) do not achieve their glycemic goals. One of the major challenges in disease management is the administration of an accurate amount of insulin for each meal that will match the expected postprandial glycemic response (PPGR). The objective of this study was to develop a prediction model for PPGR in individuals with T1D.
We recruited individuals with T1D who were using continuous glucose monitoring and continuous subcutaneous insulin infusion devices simultaneously to a prospective cohort and profiled them for 2 weeks. Participants were asked to report real-time dietary intake using a designated mobile app. We measured their PPGRs and devised machine learning algorithms for PPGR prediction, which integrate glucose measurements, insulin dosages, dietary habits, blood parameters, anthropometrics, exercise, and gut microbiota. PPGR data from 900 healthy individuals, covering 41,371 meals, were also integrated into the model. The performance of the models was evaluated with 10-fold cross validation.
A total of 121 individuals with T1D, 75 adults and 46 children, were included in the study. PPGR to 6,377 meals was measured. Our PPGR prediction model substantially outperforms a baseline model with emulation of standard of care (Pearson correlation between predicted and observed PPGR of R = 0.59 vs. R = 0.40, respectively; P < 10⁻¹⁰). The model was robust across different subpopulations. Feature attribution analysis revealed that glucose levels at meal initiation, glucose trend 30 min prior to meal, meal carbohydrate content, and meal’s carbohydrate-to-fat ratio were the most influential features for the model.
Our model enables a more accurate prediction of PPGR and therefore may allow a better adjustment of the required insulin dosage for meals. It can be further implemented in closed loop systems and may lead to rationally designed nutritional interventions personally tailored for individuals with T1D on the basis of meals with expected low glycemic response.
Introduction
Type 1 diabetes (T1D) is one of the most common chronic diseases in children. Despite technological advances introduced in the past decades, such as continuous glucose monitoring (CGM) and continuous subcutaneous insulin infusion (CSII) devices (1), results from various large-scale clinical trials repeatedly show that the clinical management of T1D is challenging, especially in children and adolescents, with many patients not achieving the glycemic goals recommended by clinical guidelines (2–4). One of the greatest challenges in disease management is the administration of an accurate amount of insulin for each meal. While an excessive insulin dose might result in a life-threatening hypoglycemic event, an insufficient dose would result in high postprandial blood glucose levels and their associated short-term and long-term comorbidities (5). The Standards of Medical Care in Diabetes guidelines, published by the American Diabetes Association, state that patients should match prandial insulin to carbohydrate intake, premeal blood glucose, and anticipated physical activity (6). However, it was previously shown that conventional therapy resulted in suboptimal insulin counteraction of postprandial glycemic responses (PPGRs) (7). More sophisticated models exist, such as those that also include adjustment for the meal’s fat and protein content, but they have so far failed to provide a significant improvement in glycemic control, and their implementation in real practice is limited (8,9).
A previous study by Zeevi et al. (10) in healthy individuals revealed that carbohydrate content alone is not a good predictor of glycemic responses to meals and that a significant interindividual variability in PPGR exists. Moreover, environmental factors, including the gut microbiota, were associated with the glycemic response of healthy individuals to meals (10). We therefore hypothesized that by collecting data on the administered insulin dosages, along with additional clinical and microbial data that were previously shown to contribute to PPGR prediction in healthy individuals, we will be able to construct a prediction model for glycemic responses to meals specifically tailored for individuals with T1D. An accurate model may have an immense clinical value, as it will allow a better estimation of the required insulin dosage for meals and can guide personal nutritional interventions, which will be based on meals with expected low glycemic response.
Research Design and Methods
Study Design
To characterize the PPGRs of individuals with T1D, we conducted a prospective observational study. At study initiation, a physician authorized participation and acquired informed consent. Anthropometric measurements and vital signs were taken by the medical staff, and blood tests, including metabolic and lipid profile, HbA1c, and thyroid function, were drawn and analyzed in the laboratories of each of the three medical institutes. Health and lifestyle questionnaires were filled out by the participants or their legal guardians (Supplementary Fig. 1). During the 2 weeks of study participation, participants continued using their own personal CGM and CSII devices. Participants who did not have a CGM device in their possession were connected to a FreeStyle Libre Flash CGM (FSL-CGM) system for the duration of the study. Continuous glycemic and insulin profiles were then acquired in high resolution from the insulin pump and CGM records, with 5- to 15-min intervals between glucose measurements, depending on the CGM type.
Study participants, or their parents in the case of young children, used a proprietary smartphone app (www.personalnutrition.org) (Supplementary Fig. 2), already used by >900 people (10), to log, in real time, food intake, sleep times, physical activity, and medication intake with the exception of insulin, which was recorded in the CSII devices. Each food item within every meal was logged along with its weight by selecting it from a database of 6,401 foods with full nutritional values based on the Israeli Ministry of Health database with the expansion of additional items by certified dietitians. For increased compliance, participants were informed that accurate logging is crucial for them to receive an accurate analysis of their PPGRs to food. Participants were asked to follow their normal daily routine and dietary habits, with the exception of seven standardized meals.
Participant Recruitment
Enrollment and recruitment of participants were conducted by the medical teams in three medical centers in Israel: Sheba Medical Center, the largest hospital in Israel; Rambam Health Care Campus, Northern Israel’s primary hospital; and Shamir Medical Center, the fourth-largest government hospital in Israel. The locations of these clinics allowed us a nearly national coverage of the Israeli population. The inclusion criteria were age between 3 and 70 years, >1 year since T1D diagnosis, use of CGM and CSII devices simultaneously, and a capability to work with a mobile phone app on a daily basis for the recording of the dietary intake by participants or their parents in the case of young children. A minimal age of 3 years was chosen, since it was previously shown that in this age the development of the microbiome from infancy reaches a stable phase (11). A maximal age of 70 years was chosen due to the higher prevalence of additional comorbidities in this age-group. Participants were recruited at least 1 year following diagnosis for minimization of the possible effects of the “honeymoon period” of T1D in which a good glycemic control is achieved with reduced insulin requirements (9). Exclusion criteria included an active inflammatory or neoplastic disease, pregnancy, and antibiotic usage 3 months prior to participation in the study.
Meals Preprocessing
We applied the following consecutive filters on the 9,139 meals that were logged by the participants: First, we merged meals logged <30 min apart if the earliest meal contained >50 cal. The merged meal was assigned the summed values of all of its components and the time of the earliest component. Second, we removed 1) meals logged within 90 min of other meals to avoid the potential influence of adjacent meals and their preceding insulin administration, 2) meals with components weighing >1 kg, 3) meals with incomplete logging, 4) meals with <70 kcal and <15 g total weight, and 5) meals with carbohydrate content >200 g, baseline insulin level of 0, or baseline glucose level <50 mg/dL (2.8 mmol/L). Glucose level at baseline was considered as the lowest glucose value within ±15 min from self-reporting of the meal in the app. This process resulted in 6,377 meals that were logged during the study and were included in the analysis.
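The consecutive filters described above can be sketched as follows. This is a minimal illustration rather than the study's code: the `Meal` fields and function names are hypothetical, the incomplete-logging and baseline-insulin filters are left out because they depend on data not modeled here, and points the text leaves open (for example, that merging is measured from the earliest component, and that a gap of exactly 90 min is kept) are resolved by assumption.

```python
from dataclasses import dataclass


@dataclass
class Meal:
    minute: float            # logging time, minutes from study start (hypothetical field)
    kcal: float
    weight_g: float
    carbs_g: float
    max_component_g: float   # heaviest single food item in the meal
    baseline_glucose: float  # mg/dL at meal initiation


def merge_close_meals(meals, gap_min=30, min_kcal=50):
    """Merge a meal into the previous one when it was logged < gap_min
    after it and the earlier meal contained > min_kcal."""
    merged = []
    for meal in sorted(meals, key=lambda m: m.minute):
        prev = merged[-1] if merged else None
        if (prev is not None and meal.minute - prev.minute < gap_min
                and prev.kcal > min_kcal):
            # Summed values, time (and baseline) of the earliest component.
            merged[-1] = Meal(
                prev.minute,
                prev.kcal + meal.kcal,
                prev.weight_g + meal.weight_g,
                prev.carbs_g + meal.carbs_g,
                max(prev.max_component_g, meal.max_component_g),
                prev.baseline_glucose,
            )
        else:
            merged.append(meal)
    return merged


def keep(meal, all_meals, isolation_min=90):
    """Apply the removal criteria to one (already merged) meal."""
    isolated = all(abs(meal.minute - other.minute) >= isolation_min
                   for other in all_meals if other is not meal)
    too_small = meal.kcal < 70 and meal.weight_g < 15
    return (isolated
            and meal.max_component_g <= 1000
            and not too_small
            and meal.carbs_g <= 200
            and meal.baseline_glucose >= 50)


def preprocess(meals):
    merged = merge_close_meals(meals)
    return [m for m in merged if keep(m, merged)]
```

Evaluating the isolation filter against the merged list, not the filtered one, means that two adjacent meals remove each other, which matches the stated aim of avoiding the influence of neighboring meals and their insulin.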
Standardized Meals
Each subject received seven standardized breakfasts, provided by the study team, to consume throughout the 14 days of the study: 2 × 30 g glucose, 2 × 50 g glucose, 2 × 65 g bread, and 65 g bread + 20 g butter. Participants were instructed to consume these meals immediately after their night fast, not to modify the meal, and to refrain from eating or performing strenuous physical activity before and for 2 h following meal consumption. In addition, they were instructed to administer insulin calculated by their personal carbohydrate-to-insulin ratio, insulin sensitivity factor, and glycemic target. Participant adherence for consuming the standardized meals was partial, with 116 people consuming at least one test meal, 48 consuming 2 × 30 g glucose, 51 consuming 2 × 50 g glucose, 35 consuming 2 × 65 g bread, and only 3 consuming 65 g bread + 20 g butter twice. Overall, 688 standardized meals were consumed during the study. Following meal preprocessing described above, 118 meals were included in the analysis. Since insulin boluses are calculated by the carbohydrate content of the meal and by the baseline glucose level (12), we compared the within-individual glycemic response of two identical standardized meals only when the difference between glucose levels at meal initiation was <40 mg/dL (2.2 mmol/L). In addition, we only included meals consumed in the morning (5:00–10:00 a.m.).
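The pairing rule for the reproducibility analysis (morning meals only, baseline glucose within 40 mg/dL) can be expressed as a small predicate. This is a sketch under assumptions: the tuple layout and the name `comparable_pair` are hypothetical, and the 5:00–10:00 a.m. window is treated as inclusive at both ends.

```python
from datetime import datetime


def is_morning(ts):
    """True if the timestamp falls between 5:00 and 10:00 a.m. (inclusive)."""
    minutes = ts.hour * 60 + ts.minute
    return 5 * 60 <= minutes <= 10 * 60


def comparable_pair(meal_a, meal_b, max_baseline_diff=40.0):
    """Each meal is (consumption time, baseline glucose in mg/dL).

    Two identical standardized meals are compared only when both were
    eaten in the morning and their baseline glucose differs by < 40 mg/dL.
    """
    (time_a, glucose_a), (time_b, glucose_b) = meal_a, meal_b
    return (is_morning(time_a) and is_morning(time_b)
            and abs(glucose_a - glucose_b) < max_baseline_diff)
```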
Prediction of Prandial Glucose Response
To measure the glucose response to meals we used two metrics. First, the PPGR of each meal was calculated by combination of the reported meal time with CGM data and computation of the incremental area under the glucose curve in the 2 h after the meal (13). Of note, no significant differences between PPGRs extracted from CGMs and those obtained from blood tests were shown in healthy individuals in a previous study (14). Second, we calculated the difference between glucose level at meal initiation and maximal glucose level during the 2 h after the meal (Glumax) for the multiple real-life meals. This measure was chosen because it is less sensitive to inaccurate logging time by the participants. For predicting both indices, we constructed prediction models based on XGBoost 0.90; 3,000 estimators and a learning rate of 0.005 were used.
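The two response measures can be illustrated with a short sketch. It assumes one common definition of the incremental area under the curve, in which only the area above the baseline glucose level counts and the trapezoidal rule is used between CGM readings; the exact procedure of reference 13 may differ. The function names are hypothetical, and the baseline is passed in rather than derived from the ±15 min window.

```python
def incremental_auc(times_min, glucose, baseline):
    """Trapezoidal area above baseline, in (mg/dL) x min.

    Portions of the curve below baseline contribute zero; segments that
    cross the baseline contribute only their triangle above it.
    """
    area = 0.0
    for i in range(len(times_min) - 1):
        dt = times_min[i + 1] - times_min[i]
        d0 = glucose[i] - baseline
        d1 = glucose[i + 1] - baseline
        if d0 >= 0 and d1 >= 0:
            area += (d0 + d1) / 2 * dt
        elif d0 > 0 > d1:
            # Curve falls through baseline: keep the triangle before the crossing.
            area += d0 * (d0 / (d0 - d1)) * dt / 2
        elif d1 > 0 > d0:
            # Curve rises through baseline: keep the triangle after the crossing.
            area += d1 * (d1 / (d1 - d0)) * dt / 2
    return area


def glumax(glucose, baseline):
    """Maximal glucose rise over baseline within the window (mg/dL)."""
    return max(glucose) - baseline
```

For readings restricted to the 2 h after a meal, `incremental_auc` gives the PPGR-style measure and `glumax` the peak rise. The latter depends only on the highest reading, which is why a modest error in the logged meal time changes it less than it changes an area.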
A total of 296 features were included in the model as input, including features representing meal content and blood tests results, CGM and insulin pump-derived features, questionnaires, and microbiome features (Supplementary Table 1). We evaluated the models using several 10-fold cross-validation schemes. The results presented in the manuscript are from a per-meal validation scheme, in which the model was trained on 90% of the meals and the remaining 10% of the meals were used as validation. Additional cross-validation schemes were analyzed (Supplementary Fig. 4). Pearson R between predicted and observed PPGR and Glumax was calculated for each model. Explained variance was computed as the R² (coefficient of determination) regression score function in the Python scikit-learn library.
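The evaluation quantities named above can be sketched in plain Python. This is an illustration under assumptions, not the study's pipeline: the helper names are hypothetical, the fold assignment is one simple way to split meals into 10 groups, and `r2` re-implements the standard coefficient of determination (1 − SS_res/SS_tot) instead of calling scikit-learn.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def r2(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


def kfold_indices(n_items, k=10, seed=0):
    """Shuffle item indices and deal them into k disjoint validation folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]
```

The two scores answer different questions. Pearson R is unchanged by rescaling the predictions, whereas R² penalizes any systematic offset or scale error, so a model can correlate perfectly with the observations and still have a negative R².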
Integration of Data From Healthy Individuals
To analyze whether an integration of data of postprandial responses to meals will improve the prediction results, we integrated data of the PPGR values of 41,371 meals from 900 healthy individuals. The cohort characteristics are described in detail in the study by Zeevi et al. (10) from 2015. In brief, the healthy cohort included individuals age 18–70 years, not previously diagnosed with diabetes, who logged meals in real time in the same smartphone app used in the current study and were connected to a CGM device for 1 week. The clinical data collected from participants were similar with the exception of insulin dosage, which is not relevant for healthy individuals. We used the data from the healthy cohort in two schemes. First, we tested the performance of a model trained solely on data from the healthy individuals on the cohort of individuals with T1D. Second, we tried using the output of a prediction model trained on the data that originates from healthy individuals as an additional feature to our model, based on data of individuals with T1D. We evaluated the value of Pearson correlation obtained between predicted and measured PPGR following these two approaches.
Feature Attributions
We used the recently introduced SHapley Additive exPlanation (SHAP) methods (15–17) for model interpretability. SHAP values are calculated individually for every feature and represent the average change in the model’s output upon conditioning on that feature, as features are introduced one at a time. The additive property of SHAP values was used to analyze the impact of different groups of features on the model.
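The idea behind these attributions, and the additive property used for feature groups, can be shown on a toy example. The sketch below is an assumption-laden illustration, not the authors' implementation: it enumerates every feature ordering by brute force (feasible only for a handful of features), uses a single background sample to stand in for an "absent" feature, and `toy_model` with its feature names and coefficients is entirely hypothetical.

```python
from itertools import permutations


def shapley_values(model, sample, background):
    """Exact Shapley values for one sample.

    Features are switched from their background value to the sample value
    one at a time; each feature's attribution is its average marginal
    effect over all orderings.
    """
    names = list(sample)
    phi = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for ordering in orderings:
        current = dict(background)
        previous = model(current)
        for name in ordering:
            current[name] = sample[name]
            value = model(current)
            phi[name] += (value - previous) / len(orderings)
            previous = value
    return phi


def group_impact(phi, feature_names):
    """Additivity: the impact of a feature group is the sum of its members."""
    return sum(phi[name] for name in feature_names)


def toy_model(f):
    # Hypothetical response: linear terms plus a carbohydrate x fat interaction.
    return (0.5 * f["baseline_glucose"] + 2.0 * f["carbs_g"]
            - 0.05 * f["carbs_g"] * f["fat_g"])
```

By construction the attributions sum to the difference between the model output for the sample and for the background, which is what allows the contributions of related features (for example, all meal-content features) to be added into a single group-level impact.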
Stool Sample Collection and Genomic DNA Extraction
Participants were instructed to sample their stool once during the study period, following detailed printed instructions. Sampling was done using both a swab and an OMNIgene GUT (OMR-200; DNA Genotek) stool collection kit. Collected samples were immediately stored in a home freezer (−20°C), and transferred in a provided cooler to our facilities where they were stored at −80°C (−20°C for OMNIgene GUT kits) until DNA extraction. Gut microbiota profile was obtained only from the DNA Genotek samples by metagenomics sequencing.
Metagenomic DNA was purified with the DNeasy PowerMag Soil DNA extraction kit (QIAGEN), optimized for Tecan automated platform. Next-generation sequencing libraries were prepared with use of Nextera DNA Library Prep (Illumina) and sequenced on a NovaSeq sequencing platform (Illumina). Sequencing was performed with 100–base pair single-end reads with depth of 10 million reads per sample. We filtered metagenomic reads containing Illumina adapters, filtered low-quality reads, and trimmed low-quality read edges. We detected host DNA by mapping with Bowtie 2 (18) to the human genome with inclusive parameters and removed those reads. Bacterial relative abundance (RA) estimation was performed through mapping bacterial reads to species-level genome bins of representative genomes (19). We selected all species-level genome bin representatives with at least five genomes in a group and, for these representative genomes, kept only unique regions as a reference data set. Mapping was performed with Bowtie 2 (18), and we estimated abundance by calculating the mean coverage of unique genomic regions across the 50% most densely covered areas as previously described (20). Feature names include the lowest taxonomy level identified.
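The abundance estimate described above (mean coverage over the 50% most densely covered unique regions) can be sketched as follows. This is an illustration only: it assumes coverage has already been summarized per region after Bowtie 2 mapping, the function names are hypothetical, and normalizing the per-species estimates so that they sum to 1 is an assumption about how relative abundance is derived, not a detail stated in the text.

```python
def dense_half_mean_coverage(region_coverages):
    """Mean coverage across the most densely covered half of the regions."""
    ranked = sorted(region_coverages, reverse=True)
    top_half = ranked[:max(1, len(ranked) // 2)]
    return sum(top_half) / len(top_half)


def relative_abundance(coverage_by_species):
    """Turn per-species coverage estimates into fractions that sum to 1."""
    estimates = {species: dense_half_mean_coverage(cov)
                 for species, cov in coverage_by_species.items()}
    total = sum(estimates.values())
    if total == 0:
        return {species: 0.0 for species in estimates}
    return {species: value / total for species, value in estimates.items()}
```

Restricting the mean to the densest half makes the estimate less sensitive to regions that attract few or no reads, for example regions missing from the strain actually present in the sample.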
Ethics Approval
The study was approved by the Rambam Health Care Campus institutional review board (IRB), Tel Hashomer Hospital IRB, Shamir Medical Center IRB, and Weizmann Institute of Science IRB. All participants signed written informed consent forms. All identifying details of the participants were removed prior to the computational analysis. The trial was registered at ClinicalTrials.gov, clinical trial reg. no. NCT02919839.
Data and Resource Availability
Metagenomic sequencing data that support the findings of this study will be made available. Clinical data cannot be shared due to restrictions by virtue of the informed consent. The data set is available from https://data.mendeley.com/datasets/bcz47mhvc3/1. Analysis code is available from https://github.com/nastyagodneva/t1d_microbiome.
Results
Study Population
A total of 142 individuals with T1D were recruited into the study between March 2017 and April 2019. Five individuals (3.5%) dropped out, resulting in 137 participants, of whom 131 provided data from CGM and CSII devices. Of these, 121 individuals, 46 children (<18 years of age) and 75 adults, logged meals during the study period and were included in the analysis (Fig. 1A). The mean ± SD age was 24.7 ± 15 years (median 21 years, interquartile range [IQR] 14–32) (Fig. 2A), and average disease duration was 11.2 ± 10.1 years (median 8 years, IQR 4–15). Mean HbA1c level was 7.5% ± 1.1% (58.5 ± 12.1 mmol/mol) (Fig. 2B; see Supplementary Table 2 for the results of all blood tests obtained at baseline). Mean BMI of adults was 25.1 ± 3.9 kg/m2 (median 22.4 kg/m2, IQR 20.4–22.6), and mean BMI percentile of children when converted according to the Centers for Disease Control and Prevention reference percentiles was 58 ± 23 (median 59.4, IQR 38.3–83.5) (Supplementary Fig. 3). Of the 121 participants, 42 (34.7%) had at least one additional comorbidity. The most common comorbidities were hypothyroidism (17 participants [14%]), hyperlipidemia (13 participants [10.7%]), and celiac disease (7 participants [5.8%]). Thirty-three participants (27%) consumed additional medications apart from insulin during the study. The most common medications were levothyroxine, oral contraceptives, and antilipidemic drugs. (See Supplementary Table 3 for a full list of medical conditions and medications.) None of the participants reported using long-acting insulin during the study. Altogether, 1,775 days of a concurrent usage of both CGM (including 302,703 glucose measurements) and CSII devices simultaneously were included in the analysis. Of them, 1,597 days included at least one meal logging. A total of 110 individuals provided a stool sample including 7 participants who had celiac disease and were excluded from all the microbiome analyses. Cohort characteristics are presented in Table 1.
. | All (n = 121) . | Adults (n = 75) . | Children (n = 46) . |
---|---|---|---|
Age (years) | 24.7 ± 15 | 32.3 ± 14.1 | 12.1 ± 3.7 |
Diabetes duration (years) | 11.2 ± 10.1 | 15.1 ± 10.7 | 4.5 ± 2.7 |
Male sex | 40 | 40 | 40 |
Weight (kg) | 61.9 ± 19.5 | 71.4 ± 12.9 | 46.1 ± 18.3 |
BMI (kg/m2) | 23.1 ± 4.8 | 25.1 ± 3.9 | 19.8 ± 4.2 |
BMI percentile, for children | 58 ± 23 | ||
HbA1c (%, mmol/mol) | 7.5 ± 1.1, 58.5 ± 12.1 | 7.3 ± 1.0, 56.3 ± 10.9 | 7.8 ± 1.3, 61.7 ± 14.25 |
Microbiome samples (N) | 110 | 71 | 39 |
Individuals who logged physical activity (N) | 83 | 58 | 25 |
Real-time meal logging (per participant) | |||
Days of meals logging + CGM + CSII data | 13.2 ± 3.3 | 13.2 ± 3.1 | 13.4 ± 3.4 |
Meals logged | 73.9 ± 32.2 | 76.9 ± 27.3 | 69 ± 26.2 |
Meals included in the predictor | 52.7 ± 24.0 | 55.3 ± 25.6 | 48.5 ± 21 |
Total energy intake per day (kcal) | 1,516.7 ± 550.2 | 1,571.6 ± 523.4 | 1,428.4 ± 585 |
Nutrition components | |||
% total energy intake from lipids | 39.1 ± 8 | 40.3 ± 7.6 | 37.1 ± 8.5 |
% total energy intake from protein | 17.1 ± 4.3 | 17.5 ± 4.8 | 16.2 ± 3.3 |
% total energy intake from carbohydrates | 41.6 ± 10.2 | 39.6 ± 9.8 | 45.2 ± 10 |
Data are means ± SD or % unless otherwise indicated. Characteristics of individuals with T1D who were included in the prediction model are presented. BMI percentiles for children were converted to reference percentiles provided by the Centers for Disease Control and Prevention.
Nutritional Profiling
Overall, 9,139 meals (mean of 73.9 meals per participant, median 68 meals, IQR 51.75–93), a total of 2,383,181 kcal (mean of 19,695 kcal per participant, median 18,631 kcal, IQR 14,517–24,903), were logged by participants during the study. The distributions of the number of meals logged for children and adults are presented in Supplementary Fig. 3. To obtain a global view of the dietary habits of the participants, we first analyzed the fraction that each food group contributed to the cohort’s overall energy intake (Fig. 2D and E) and the distribution of macronutrient intake from the overall energy intake (Fig. 2C). The main food groups consumed by the participants were similar in children and adults. Average carbohydrate, fat, and protein consumption was 41.6 ± 10.2%, 39.1 ± 8%, and 17.1 ± 4.3% of total energy, respectively (Table 1 and Fig. 2). Overall, macronutrient distribution changed with age, with older individuals consuming fewer carbohydrates, as demonstrated by a positive association between age and the percentage of lipids (r = 0.27, P < 0.05, false discovery rate [FDR] corrected) and negative association of age with the percentage of carbohydrates from total energy intake (r = −0.30, P < 0.05, FDR corrected). After adjustment for age, the percentage of carbohydrates from the total energy intake consumed per day by the participants was positively correlated with blood levels of HbA1c at study initiation (r = 0.27, P < 0.05, FDR corrected), mean glucose value (r = 0.21, P < 0.05, FDR corrected), and glucose variability (r = 0.27, P < 0.05, FDR corrected) (Supplementary Fig. 4).
Glycemic Responses of the Same Person to the Same Meal Are Reproducible
We next analyzed intrapersonal variability in the PPGR of the same person to the same meal. To that end, we assessed whether the PPGRs of standardized meals that were given twice to each participant are reproducible. Overall, 688 standardized meals were consumed during the study by 116 participants. However, to limit cases in which an independent exposure leads to the variability, we performed this analysis only for standardized meals consumed in the morning, in which case the baseline glucose levels at meal initiation were similar (see research design and methods). Altogether, 118 test meals were included for 59 individuals. In these cases, there was a relatively high agreement for all duplicated test meals (R = 0.63), demonstrating that the PPGR to identical meals is correlated in the same participant when the baseline glucose level is similar (Supplementary Fig. 5).
Prediction of Glycemic Responses to Real-life Meals
We next examined our ability to predict 1) PPGR and 2) Glumax, which we defined as the maximal glucose rise during the 2 h after meal initiation (Fig. 1B). After meal preprocessing, from 9,139 logged meals, 6,377 meals (mean ± SD 52.7 ± 24.0 meals per participant [Supplementary Fig. 3]) were included in the analysis. (See research design and methods.) For both tasks, we constructed prediction models based on extreme gradient boosting (XGBoost) taking as input the following: 1) Only the meal’s carbohydrate content. 2) A baseline model representing standard of care for insulin administration. Since insulin boluses are calculated on the basis of the carbohydrate content of the meal and the baseline glucose level, the meal’s carbohydrate content and glucose at meal initiation were included in the model (12). Insulin bolus given 90 min prior to the meal was also included in the baseline model, for cases in which it was already administered for the meal, thereby influencing the glycemic response. And 3) all the information that was collected from the participant during the study period, with regard to features representing meal content (e.g., macronutrients), meal timing, daily activity (e.g., time from logged exercise), blood parameters (e.g., HbA1c), CGM-derived features, data on insulin dosage obtained from CSII devices, questionnaires, and microbiome features (e.g., metagenomic RA). (A full list of features can be found in Supplementary Table 1.) Several cross-validation schemes were used for validation of the model (see research design and methods and Supplementary Fig. 6).
For the prediction of PPGR, a model that is based solely on the meal’s carbohydrate content achieves a relatively low correlation with PPGRs (R = 0.16) and explains only ∼3% of the variance in glycemic response (Fig. 3A). A baseline model using carbohydrate content, glucose level at meal initiation, and insulin bolus performs better and explains ∼16% of the variance in glycemic response (R = 0.40, P < 10⁻¹⁰) (Fig. 3B). The full model, with integration of glucose measurements, insulin dosages, dietary habits, blood parameters, anthropometrics, exercise, and gut microbiota (Supplementary Table 1), achieves a substantially higher correlation for the held out PPGRs of meals and increases the explained variance to ∼35% (R = 0.59, P < 10⁻¹⁰) (Fig. 3C). Similarly, for Glumax prediction, a meal’s carbohydrate-based model achieves a relatively low correlation (R = 0.17) (Fig. 3D), a baseline model performs better (R = 0.43, P < 10⁻¹⁰) (Fig. 3E), and the full model predicts the held out values of individuals with a significantly higher correlation (R = 0.61, P < 10⁻¹⁰) (Fig. 3F).
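As a quick consistency check on the PPGR figures, squaring each reported Pearson correlation reproduces the reported explained variance. This treats explained variance as the squared correlation, which holds for a linear least-squares fit but is not guaranteed for a coefficient of determination in general; for the three PPGR models the two agree to the reported precision.

```latex
\begin{align*}
R = 0.16 &\;\Rightarrow\; R^2 = 0.0256 \approx 3\% \\
R = 0.40 &\;\Rightarrow\; R^2 = 0.1600 = 16\% \\
R = 0.59 &\;\Rightarrow\; R^2 = 0.3481 \approx 35\%
\end{align*}
```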
Although insulin doses substantially affect glucose levels in individuals with T1D, we next asked whether integration of data on glycemic responses to meals from healthy individuals may also contribute to our ability to predict the glycemic response to meals in T1D. To this end, we used an extensive data set of detailed clinical profiling and PPGR measurements of 41,371 meals from 900 healthy individuals (10) (see research design and methods). First, we tested a model trained solely on data from the healthy individuals cohort on the T1D cohort. This model achieved a correlation between predicted and measured PPGR of R = 0.39, similar to that achieved by our baseline model but lower than achieved by the full model trained specifically on individuals with T1D. Second, we tried using the output of a prediction model trained on the data that originate from healthy individuals as an additional feature to our model, which is based on data of individuals with T1D. This resulted in a correlation similar to that obtained by the original model (Supplementary Fig. 7).
Variability in the Prediction of Glycemic Responses
To further investigate the performance of the full model, we analyzed the variability in the correlation between predicted and observed PPGR between individuals in the cohort. While in most of the individuals, a correlation of >0.52 was obtained, a high variability was observed (Supplementary Fig. 3). To further analyze the factors underlying this variability and the overall robustness of the model, we divided the cohort into several subgroups by clinical parameters and calculated the correlation for each subgroup. Subgroup analysis revealed similar results for divisions by age, HbA1c, time spent in a state of hypoglycemia, and different types of CGM and CSII devices (Supplementary Table 4). We next compared the characteristics of the participants with a high correlation with the characteristics of those with a low correlation and found that the latter had a significantly higher glucose variability and logged significantly fewer meals per day (3.58 vs. 4.26, P < 0.05, FDR corrected) and throughout the study period (42.81 vs. 54.68, P < 0.05, FDR corrected) (Supplementary Table 5).
Factors Underlying the Prediction of Postprandial Glycemic Responses
To gain insight into factors affecting prediction, we performed feature attribution analysis using SHAP (15–17). SHAP values represent the average change in the model’s output upon conditioning on a specific feature (see research design and methods). The most influential features with the highest mean absolute SHAP value included glucose levels at meal initiation, glucose trend in the 30 min prior to meal, meal carbohydrate content, the ratio between carbohydrate and fat in the meal, and time of day, for which the impact of earlier hours in the day was toward a higher glycemic response (Fig. 3G). By adding up the SHAP values of related features, we could analyze the overall impact of groups of features to the model. These analyses revealed that glucose measurements were the most impactful in the prediction, followed by the meal composition, insulin dosage, and microbial composition (Fig. 3H). A companion article by Shilo et al. (21) focused exclusively on the novel associations between bacterial taxa and the glycemic control of the host.
Conclusions
In this study we used comprehensive clinical data from a 2-week profiling period to devise a prediction model for glycemic responses to meals in individuals with T1D. PPGR is an important contributor to the overall glycemic control (22), and decreasing it may improve the time spent in the target glycemic range, which is strongly associated with the risk of future microvascular complications in individuals with T1D (23). In recent years, the increased availability of sequential glucose measurements and insulin data, as a result of an increased usage of CGM and CSII devices, respectively, has paved the way for machine learning approaches targeted at the prediction of the dynamics of blood glucose in T1D (24). However, modeling the effect of different meals remains a challenge (24,25). To the best of our knowledge, our study is the first to investigate the contribution of adding data on PPGRs originating from healthy individuals and to incorporate microbial features to a prediction model for PPGR of individuals with T1D.
Here, we have shown a relatively high agreement (R = 0.63) for duplicated test meals of the same individual with T1D, demonstrating that the PPGR to identical meals is correlated within the same person when baseline glucose level is similar (Supplementary Fig. 5). However, it is worth noting that the same analysis previously performed on a larger cohort of healthy individuals achieved substantially better results (R = 0.77 for glucose and bread with butter test meals, R = 0.71 for bread) (10). This emphasizes the fact that prediction of glycemic responses to meals is more challenging in individuals with T1D than in healthy individuals, as those with T1D have a higher glucose variability; are treated with insulin, which can greatly impact glucose levels; and may engage in physical activity in proximity to the meal, which may also greatly impact their glucose levels.
Our prediction model for the glycemic response to real-life meals takes into account a comprehensive clinical and microbiome profile as input and aims at predicting two glycemic measures: PPGR and Glumax. In both glycemic measures, our model substantially outperforms a baseline model that emulates the current Standards of Care for insulin administration for meals in individuals with T1D and takes into account only the carbohydrate content of the meal and glucose level at meal initiation (6), as well as the dosage of bolus insulin already given prior to the meal (Fig. 4). Notably, a model based solely on the meal’s carbohydrate content achieved a relatively low correlation with PPGRs (R = 0.16) compared with the correlation previously reported for healthy individuals (R = 0.38) (10), possibly due to the fact that the insulin dosage given for the meal has already been calculated and adjusted for the carbohydrate content. Integration of data from 900 healthy individuals (10) into the model did not improve its performance.
While the correlation between predicted and observed glycemic responses to meals varied widely between participants, subgroup analysis showed that the model is robust overall and performs similarly across individuals in different age-groups and with different levels of glycemic control. Glucose variability was significantly lower in individuals who had a high correlation, reflecting the challenge of glucose prediction in patients with high variability in glucose values. Notably, the number of meals logged per day and throughout the study was significantly higher in individuals who had a high correlation. This may be due to better adherence to the study protocol, which resulted in more accurate and timely meal logging that enabled a better prediction. In addition, the fact that the model was trained on a larger number of meals logged by these participants may have also improved its performance. In either case, these findings emphasize the potential to achieve a better prediction with additional accurate and timely meal logging in individuals with insufficient data.
Analysis of the impact of different factors on the prediction model revealed insights into the drivers of glucose rise following meals in individuals with T1D. The most impactful features included glucose at meal baseline and the carbohydrate content of the meal, which are currently the main factors taken into consideration in the calculation of the insulin bolus (6); this agreement further validates our analyses. Other impactful factors, which are not taken into account in the Standards of Care for insulin administration today, are the glucose trend 30 min prior to the meal start and the ratio of carbohydrate to fat in the meal. Earlier hours of the day shifted the prediction toward a higher glycemic response, which is in line with previous studies on diurnal changes in insulin requirements showing that more insulin is needed per carbohydrate count (a lower carbohydrate-to-insulin ratio) at breakfast than at lunch and dinner (26,27). Analysis of the impact of groups of features on the model revealed that glucose measurements prior to the meal were the most impactful in the prediction, followed by meal composition, insulin dosage, and microbial composition (Fig. 3H).
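The relationship between the carbohydrate-to-insulin ratio and the meal dose can be illustrated with the commonly used bolus calculation, in which a carbohydrate dose and a correction dose are summed. The sketch below is a simplified illustration only: the function name, the target glucose, and all parameter values are hypothetical, insulin already on board is ignored, and it is not a dosing recommendation.

```python
def meal_bolus(carbs_g, glucose_mgdl, icr_g_per_unit, isf_mgdl_per_unit,
               target_mgdl=110.0):
    """Simplified meal bolus in insulin units.

    carb dose  = carbohydrates / carbohydrate-to-insulin ratio (g per unit)
    correction = (current glucose - target) / insulin sensitivity factor
    Negative totals are clipped to zero.
    """
    carb_dose = carbs_g / icr_g_per_unit
    correction = (glucose_mgdl - target_mgdl) / isf_mgdl_per_unit
    return max(0.0, carb_dose + correction)

# Same 60 g meal and same starting glucose; only the (hypothetical) ratio differs.
breakfast = meal_bolus(60, 150, icr_g_per_unit=8, isf_mgdl_per_unit=40)
dinner = meal_bolus(60, 150, icr_g_per_unit=12, isf_mgdl_per_unit=40)
print(breakfast, dinner)  # 8.5 6.0
```

With the lower ratio assumed for breakfast (8 g per unit instead of 12), the same meal calls for 8.5 rather than 6.0 units, which is the sense in which a lower carbohydrate-to-insulin ratio means more insulin per carbohydrate.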
In this cohort, individuals who consumed a diet with fewer carbohydrates had better glycemic control, as demonstrated by a higher percentage of time in range and a lower average glucose. The percentage of total energy intake that participants consumed as carbohydrates was positively correlated with blood levels of HbA1c at study initiation and with higher glucose variability. Although several studies reported statistically significant reductions in HbA1c in individuals with T1D consuming low-carbohydrate diets, others did not, and the overall effect of these diets is still debatable, partially due to the heterogeneity of studies in this field (28).
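For readers less familiar with these CGM-derived measures, the following minimal sketch shows one way they could be computed. The 70 to 180 mg/dL range is the widely used consensus target, and the coefficient of variation is one common measure of glucose variability; both are assumptions here, since the exact definitions used in the analysis are not restated in this section. The function names and the readings are hypothetical, and evenly spaced CGM samples are assumed.

```python
import numpy as np

def time_in_range(glucose_mgdl, low=70.0, high=180.0):
    """Percentage of readings within [low, high] mg/dL (evenly spaced samples assumed)."""
    g = np.asarray(glucose_mgdl, dtype=float)
    return float(np.mean((g >= low) & (g <= high)) * 100.0)

def coefficient_of_variation(glucose_mgdl):
    """Glucose variability as sample SD divided by mean, in percent."""
    g = np.asarray(glucose_mgdl, dtype=float)
    return float(np.std(g, ddof=1) / np.mean(g) * 100.0)

# Hypothetical CGM readings in mg/dL (not study data).
readings = [65, 90, 120, 150, 185, 200, 140, 110]
tir = time_in_range(readings)            # 5 of 8 readings are in range
cv = coefficient_of_variation(readings)
print(f"time in range = {tir:.1f}%, CV = {cv:.1f}%")
```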
This study has several limitations. First, low adherence to the consumption of standardized meals and inaccurate self-reports of meals by participants could have affected our ability to predict the glycemic response to meals, and we believe that better predictions can likely be achieved by training the model on a larger volume of high-quality clinical data. This is supported empirically by the superior model performance in participants who logged more meals and were more adherent to the study protocol (Supplementary Table 2). This limitation also presents a challenge for the real-life utility of the model, as accurate input of meals is required. Second, we did not have information on several additional factors that may influence glucose levels in individuals with T1D, such as the menstrual cycle. Third, although the sample size is large compared with that of other studies analyzing microbiome composition in individuals with T1D, it may still be insufficient for robust characterization of the bacterial composition and for associating it with PPGR. Finally, we could not evaluate the impact of hybrid closed loop systems on our models, as these systems were not part of the health basket provided by the Israeli Ministry of Health and were not used routinely by individuals with T1D in Israel during the study period.
In summary, we developed a model that can predict glycemic responses to meals in individuals with T1D, substantially outperforming a baseline that emulates standard of care. Our work has several potential clinical applications. Our model enables a more accurate prediction of the glycemic response to meals and therefore may allow a better adjustment of the required insulin dosage for meals. It can be further implemented in closed loop systems, personalized decision systems, and alarm systems for expected high and low blood glucose events in individuals with T1D. These applications have the potential to improve glycemic control, thereby delaying the onset of microvascular complications while decreasing disease burden. In addition, our model may lead to rationally designed nutritional interventions personally tailored for individuals with T1D based on meals with expected low glycemic response. Finally, our study uncovers potential microbial effectors of glycemic control in T1D and underscores the need for additional mechanistic studies that may identify the role of these bacteria and pave the way to novel therapeutic strategies. All of the above have the potential to improve glycemic control and disease management in individuals with T1D.
Clinical trial reg. no. NCT02919839, clinicaltrials.gov
See accompanying article, p. 555.
This article contains supplementary material online at https://doi.org/10.2337/figshare.16649266.
Article Information
Acknowledgments. The authors thank the Segal group members for fruitful discussions.
Funding. This work is supported by The Israel Science Foundation (ISF) (grant 3-14762). E.S., from the Weizmann Institute of Science, Israel, is supported by the Crown Human Genome Center, Larson Charitable Foundation New Scientist Fund, Else Kroener Fresenius Foundation, White Rose International Foundation, Ben B. and Joyce E. Eisenberg Foundation, Nissenbaum Family, Marcos Pinheiro de Andrade and Vanessa Buchheim, Lady Michelle Michels, Aliza Moussaieff, and grants funded by the Minerva Foundation with funding from the Federal German Ministry for Education and Research and by the European Research Council and the Israel Science Foundation.
These funding sources had no role in the design of this study or during its execution, analyses, interpretation of the data, or decision to submit results.
Duality of Interest. E.S. is a paid consultant for DayTwo. No other potential conflicts of interest relevant to this article were reported.
No pharmaceutical manufacturers or companies from the industry contributed to the planning, design, or conduct of the trial.
Author Contributions. S.S. conceived the project, designed and conducted the analyses, interpreted the results, and wrote the manuscript. A.G. designed and conducted the analyses, interpreted the results, and wrote the manuscript. M.R. provided data and interpreted the results. T.Ko. conceived the project, designed the analysis, and interpreted the results. D.K. and T.Ka. designed and conducted the analyses. M.C., N.Z.L., N.S., N.G., N.L., and S.K. provided data and interpreted the results. N.B. designed the analysis and interpreted the results. B.C.W. and Y.G.-G. coordinated and designed data collection. A.W. conceived the project and directed sample sequencing. O.P.-H. and E.S. conceived the project, designed and conducted the analyses, interpreted the results, and supervised the project and analyses. All authors reviewed and approved the manuscript and vouch for the accuracy and completeness of the data. E.S. is the guarantor of this work and, as such, had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.