Time in range is a key glycemic metric, and comparisons of management technologies for this outcome are critical to guide device selection.
We conducted a systematic review and network meta-analysis to compare and rank technologies for time in glycemic ranges.
We searched Evidence-Based Medicine Reviews, CINAHL, Embase, MEDLINE, MEDLINE In-Process & Other Non-Indexed Citations, PROSPERO, PsycInfo, PubMed, and Web of Science until 24 April 2019.
We included randomized controlled trials ≥2 weeks’ duration comparing technologies for management of type 1 diabetes in adults (≥18 years of age), excluding pregnant women.
Data were extracted using a predefined template. Outcomes were percent time with sensor glucose levels 3.9–10.0 mmol/L (70–180 mg/dL), >10.0 mmol/L (180 mg/dL), and <3.9 mmol/L (70 mg/dL).
We identified 16,772 publications, of which 14 eligible studies compared eight technologies comprising 1,043 participants. Closed-loop systems led to greater percent time in range than any other management strategy, and mean percent time in range was 17.85 (95% predictive interval 7.56–28.14) longer than with usual care of multiple daily injections with capillary glucose testing. Closed-loop systems ranked best for percent time in range or above range with use of Surface Under the Cumulative RAnking curve (SUCRA) (98.5% and 93.5%, respectively). Closed-loop systems also ranked highly for time below range (SUCRA 62.2%).
Overall risk of bias ratings were moderate for all outcomes. Certainty of evidence was very low.
In the first integrated comparison of multiple management strategies considering time in range, we found that the efficacy of closed-loop systems appeared better than all other approaches.
Introduction
Type 1 diabetes is challenging for people with the condition and those caring for them. Optimal management should be individualized and multidisciplinary and is dependent on effective insulin administration, glucose monitoring, and decision support (1,2). However, despite advances in management, there is still significant morbidity and mortality associated with this condition (1–3).
Recent years have witnessed increasing emphasis on glycemic metrics beyond glycosylated hemoglobin (HbA1c) (4–8). The value of HbA1c in managing people with type 1 diabetes stems from robust studies that demonstrate correlation between improved HbA1c levels and fewer diabetes complications (9–11). However, because HbA1c provides an indication of average glycemia over a period of months, additional glycemic metrics are required to provide real-time insights into glycemic variability and burden of hypoglycemia and hyperglycemia (4,7,8). In combination with increasing use of interstitial glucose monitoring (12–14), randomized controlled trials (RCTs) and clinical practice recommendations are now incorporating the outcome of percent time spent in glycemic ranges (6,15). Furthermore, it appears that increasing time in range also correlates with lower risk of developing complications of diabetes and with lower HbA1c values (5,16).
The categories of technology commonly used for the management of diabetes comprise insulin delivery, glucose measurement, and decision support for insulin dosing. Technologies for decision support include insulin dose advisors within pumps, freestanding glucometers, and applications for smart devices. Following syringes and pens for insulin delivery, pumps have evolved to provide preprogrammable basal insulin rates, in-built bolus calculators, and the capability to administer insulin bolus doses immediately, over hours, or with a manually determined combination of both. Glucose measurement has also evolved from urine testing to capillary blood glucose and, more recently, interstitial fluid with continuous glucose monitoring (CGM) and intermittently scanned devices. Integrating technologies with algorithms has also led to closed-loop systems, hybrid closed-loop systems, and systems that suspend basal insulin delivery to reduce the burden of hypoglycemia, all of which aim to improve outcomes and quality of life by automating aspects of glycemic control. However, the efficacy of newer technologies has not been compared with that of the full range of alternatives.
Network meta-analyses make it possible to estimate comparative efficacy within and between categories of technology and compare interventions where direct trial evidence is sparse (17–19). Therefore, the aim of the current study was to systematically review the literature regarding time in range and perform network meta-analysis to provide an integrated comparison of relative efficacy for the majority of clinically available technologies in the management of adults with type 1 diabetes.
Methods
Data Sources and Searches
Evidence-Based Medicine Reviews, CINAHL, Embase, MEDLINE, MEDLINE In-Process & Other Non-Indexed Citations, PROSPERO, PsycInfo, PubMed, and Web of Science were searched from database inception to 24 April 2019, limited to the English language. The electronic database searches were supplemented by manual searches for ongoing RCTs using the International Clinical Trials Registry Platform Search Portal (http://apps.who.int/trialsearch/) as well as published studies from the reference lists of review articles. Study authors and technology companies were contacted for missing data from published studies as appropriate. Only indexed full-text articles that were available at the time of article screening and data extraction were eligible for inclusion in the systematic review and network meta-analysis. The published protocol (20) for a broader review by the same authors was used for the search strategy as well as the approach to data extraction, quality appraisal, and statistical analyses as they relate to different intervention groups, outcomes, and study durations.
Study Selection
Inclusion criteria required RCTs of parallel and crossover study design that were ≥2 weeks’ duration overall (or each phase of a crossover study), presented ≥2 weeks of CGM results, and comprised men and women (≥18 years of age) with type 1 diabetes in the outpatient context. Pregnant women were excluded. Studies that compared technologies for insulin delivery, glucose monitoring, insulin dosing advice, or multiple daily injections (MDI) and self-monitoring of blood glucose via capillary testing (SMBG) were included. MDI was defined as three or more insulin bolus injections per day and at least one basal insulin injection per day. The combination of CGM and continuous subcutaneous insulin infusion (CSII) that facilitated automated adjustment of insulin delivery was considered to be a closed-loop system. This was distinct from low glucose suspend systems, which formed their own category. Unless otherwise reported by authors, it was assumed that the bolus calculator function on insulin pumps was activated for devices with that capability. Implanted devices were excluded due to limited availability and RCT data at the time of review. Systems that required telemedicine were also excluded due to potential systematic differences from other technologies implemented within the network. Studies that comprised adult and pediatric participants or multiple diabetes types were also excluded unless authors provided stratified results in the manuscript or additional correspondence. Because every individual with type 1 diabetes must have at least one method for insulin delivery and glucose monitoring, we considered eight intervention pairs based on the results of our searches. The outcomes comprised percent time in range (3.9–10.0 mmol/L [70–180 mg/dL]), percent time above range (>10.0 mmol/L [180 mg/dL]), and percent time below range (<3.9 mmol/L [70 mg/dL]) measured over a period of at least 14 days.
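As an illustration of how the three outcomes are defined, the following minimal Python sketch computes percent time below, in, and above range from a list of sensor glucose readings. It assumes equally spaced readings in mmol/L, so that the share of readings approximates the share of time. The function name and the example values are hypothetical and are not taken from the included trials.

```python
def percent_time_in_ranges(readings_mmol_l, low=3.9, high=10.0):
    """Return the percent of readings below, within, and above the target range.

    Assumes equally spaced sensor readings in mmol/L, so that counts of
    readings stand in for time. Thresholds default to 3.9 and 10.0 mmol/L.
    """
    n = len(readings_mmol_l)
    if n == 0:
        raise ValueError("at least one glucose reading is required")
    below = sum(1 for g in readings_mmol_l if g < low)
    above = sum(1 for g in readings_mmol_l if g > high)
    within = n - below - above  # readings from low to high, inclusive
    return {
        "below": 100.0 * below / n,
        "in_range": 100.0 * within / n,
        "above": 100.0 * above / n,
    }


# Hypothetical readings: one below range, three in range, one above range.
example = [3.0, 5.0, 7.0, 10.0, 12.0]
print(percent_time_in_ranges(example))
```

The three percentages sum to 100 because every reading falls into exactly one category.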
Data Extraction and Quality Assessment
The lead investigator (A.P.) selected articles, reviewed main reports and supplementary materials, assessed the risk of bias, and extracted relevant summary estimates from the studies for clinical outcomes using a predefined extraction template in Microsoft Excel. Independent duplication of article screening (C.L. [94% of total]), risk-of-bias assessment (V.K. [100%]), and data extraction (V.K. [100%]) were performed. Any discrepancies or uncertainties could be resolved by consensus or deferral to a third reviewer (S.Z.). Criteria for inclusion in network meta-analysis required published data or adequate information to calculate missing data for mean (SD) time in range, time above range, and time below range (21). Risk of bias for studies was assessed following the Cochrane Handbook for Systematic Reviews of Interventions, and the quality of evidence for the network meta-analysis was evaluated with the Grading of Recommendations Assessment, Development and Evaluation (GRADE) framework (22–24).
Data Synthesis and Analysis
The mean (SD) difference was estimated for the percent time in glycemic ranges at the completion of relevant studies. In the network meta-analysis, group-level data and a frequentist approach for statistical inference were used (23,25–30). The effect sizes were synthesized using a random-effects model with 95% CIs as well as 95% predictive intervals (PrI) for the expected treatment effects of future studies. Potential effect modifiers including age, diabetes duration, HbA1c, and sex were considered in the assessment of the statistical transitivity assumption (30). For quantification of inconsistency and heterogeneity, the network τ², I² (95% CI), loop-specific inconsistency, and side splitting were considered (25–29). If authors were not able to provide missing results in individual studies, the Cochrane Handbook for Systematic Reviews of Interventions recommendations for calculation were followed (18). Due to the reporting of results for crossover studies, data were extracted as though they were parallel if carryover effect was reported as absent. Where insufficient information was provided on period effect, carryover, or intraperson variation, we imputed the covariance with a range of correlation coefficients (0.1, 0.5, and 0.9) (18). Our results were reported with the conservative correlation coefficient of 0.5, and other results are presented in Supplementary Material as sensitivity analysis. If authors only reported median and interquartile range, a pragmatic approach was adopted to convert these results to means and SDs (21). Interventions were compared with PrI plots for percent time in glycemic ranges and were ranked for all outcomes with the Surface Under the Cumulative RAnking curve (SUCRA). The rankings for outcome pairs were evaluated with clustered ranking plots from SUCRA values for percent time in range and above range, as well as for time in range and below range.
If at least three studies for an outcome did not specify that all participants used the same modality of insulin delivery for studies investigating glucose measurement technology, a second network would be generated to analyze these studies separately.
Subgroup analyses were not performed due to inconsistent reporting across studies. Comparison-adjusted funnel plots were used to determine how results of studies differed by their precision. Statistical analyses were done using Stata (version 14.0) and R (version 3.4.4).
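The correlation-based imputation described for crossover studies can be sketched as follows. In line with the general approach of the Cochrane Handbook, the SD of within-participant differences is derived from the two period SDs and an assumed correlation coefficient. The function names and numeric inputs are hypothetical, and this is an illustration of the technique rather than the exact procedure used in the analysis.

```python
import math


def sd_of_paired_difference(sd_a, sd_b, corr):
    """SD of within-participant differences for an assumed correlation."""
    if not -1.0 <= corr <= 1.0:
        raise ValueError("correlation must lie between -1 and 1")
    variance = sd_a ** 2 + sd_b ** 2 - 2.0 * corr * sd_a * sd_b
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(0.0, variance))


def se_of_crossover_difference(sd_a, sd_b, corr, n_participants):
    """Standard error of the mean paired difference in a crossover trial."""
    if n_participants < 1:
        raise ValueError("n_participants must be at least 1")
    return sd_of_paired_difference(sd_a, sd_b, corr) / math.sqrt(n_participants)


# Hypothetical trial: SD of 10 percentage points in each period, 25 participants.
for corr in (0.1, 0.5, 0.9):
    se = se_of_crossover_difference(10.0, 10.0, corr, 25)
    print(f"correlation {corr}: SE = {se:.3f}")
```

A higher assumed correlation gives a smaller standard error, which is why repeating the analysis across 0.1, 0.5, and 0.9 serves as a sensitivity check.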
Results
We identified 16,772 publications, of which 120 potentially eligible publications were retrieved in full text (Fig. 1). Of these, 14 parallel or crossover RCTs met our inclusion criteria, involving 1,043 participants eligible for analysis. Respectively, there were 12, 13, and 14 studies included in network meta-analysis for the outcomes of percent time in range, above range, and below range. Characteristics of included studies are summarized in Supplementary Material (pp. 15–53). Incomplete trials were not included, and correspondence with authors did not yield additional data for analysis. There was 100% agreement between authors regarding article inclusion, risk-of-bias assessment, and data extraction.
Overall, the mean (SD) sample size of included studies was 55 (33), the mean duration of intervention was 5 (3) months, and 100% received industry funding or material support. Europe was the study location for six (43%) studies, and four (29%) were completed in the U.S., two (14%) were based solely in the U.K., and two (14%) had sites in the U.K. as well as Europe. The mean age of participants was 43.3 (7.0) years; HbA1c 7.7% (0.7%), 61 (7.7) mmol/mol, at baseline; and duration of diabetes 21.4 (5.7) years.
Figure 2 shows the network of eligible technology comparisons for percent time in range (3.9–10.0 mmol/L [70–180 mg/dL]), time above range (>10.0 mmol/L [180 mg/dL]), and time below range (<3.9 mmol/L [70 mg/dL]). Supplementary Material provides detailed results of both indirect and pairwise comparisons of diabetes management interventions in the network meta-analyses.
Network map of direct diabetes management comparisons for the outcome of percent time in range (A), percent time above range (B), and percent time below range (C). The size of each circle and the width of each line are proportional to the number of participants randomized to each intervention and the number of trials comparing each pair of treatments, respectively.
Figure 3 shows the network meta-analysis as interval plots. In terms of percent time in range (10 RCTs comprising 710 randomized participants), closed-loop systems led to greater mean percent time in range than all other technologies; mean percent time in range was 17.85 (95% PrI 7.56–28.14), 13.29 (95% PrI 2.01–24.57), 12.76 (95% PrI 3.25–22.26), 10.60 (95% PrI 5.30–15.90), and 8.77 (95% PrI 2.99–14.54) longer than that for MDI with SMBG, MDI with flash glucose monitoring (FGM), MDI with CGM, CSII with CGM/FGM/SMBG, and CSII with CGM, respectively. Limiting closed-loop systems to nocturnal use led to mean percent time in range 13.98 (95% PrI 4.38–23.57), 8.88 (95% PrI 0.20–17.57), and 4.89 (95% PrI 0.63–9.15) longer than that for MDI with SMBG, MDI with CGM, and CSII with CGM, respectively. Mean time in range was 5.09 (95% PrI 0.62–9.57) longer for MDI therapy with CGM compared with SMBG.
Interval plots with 95% CIs and 95% PrI for the percent time in range (A), percent time above range (B), and percent time below range (C) networks.
Ranking technologies for percent time in range favored closed-loop systems (SUCRA 98.5%), nocturnal closed-loop systems (SUCRA 83.9%), and nonintegrated CSII with CGM therapy (SUCRA 57.7%). See Supplementary Material (pp. 65–66) for rankograms and cumulative ranking curve plots.
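For readers unfamiliar with SUCRA, the sketch below shows one standard way to compute it from a treatment's rank probabilities: the cumulative probabilities of being among the best k treatments are averaged over k = 1 to a − 1, where a is the number of treatments in the network. The rank probabilities used here are invented for illustration and are not the study's results.

```python
from itertools import accumulate


def sucra(rank_probabilities):
    """Surface under the cumulative ranking curve for one treatment.

    rank_probabilities[k] is the probability that the treatment holds
    rank k + 1, where rank 1 is the best. Returns a value from 0 to 1.
    """
    a = len(rank_probabilities)
    if a < 2:
        raise ValueError("at least two treatments are required")
    if abs(sum(rank_probabilities) - 1.0) > 1e-6:
        raise ValueError("rank probabilities must sum to 1")
    cumulative = list(accumulate(rank_probabilities))
    # Average the cumulative probabilities over the first a - 1 ranks.
    return sum(cumulative[:-1]) / (a - 1)


# Hypothetical three-treatment network.
print(sucra([1.0, 0.0, 0.0]))  # always ranked best -> 1.0
print(sucra([0.0, 0.0, 1.0]))  # always ranked worst -> 0.0
```

Multiplying the result by 100 gives the percentage form reported in the text.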
Regarding the mean difference of percent time above range (10 RCTs comprising 705 participants), no intervention was significantly better than any other. Closed-loop systems led to percent time above range that was 7.97 (95% CI 0.82–15.11) shorter than for CSII with CGM therapy. This was not significant for the 95% PrI. Ranking technologies by the mean difference of the percent time above range favored closed-loop systems (SUCRA 93.5%), nocturnal closed-loop systems (SUCRA 76.5%), and CSII therapy combined with CGM/FGM/SMBG (SUCRA 52.7%). See Supplementary Material (pp. 76–77) for rankograms and cumulative ranking curve plots. Continuous and nocturnal closed-loop systems appeared to provide the best composite ranking and MDI with SMBG the worst composite ranking in cluster analysis of SUCRA values with simultaneous consideration of percent time within and above target ranges (Supplementary Material, p. 93).
In terms of the mean difference of percent time below range (12 RCTs comprising 872 participants), no intervention was significantly better than any other. MDI with CGM led to percent time below range 4.04 (95% CI 1.63–6.44) shorter than for MDI with SMBG therapy. This was not significant for the 95% PrI. Ranking of technologies by mean time below range was similar for MDI with CGM (SUCRA 67.7%), closed-loop therapy (SUCRA 62.2%), MDI with FGM (SUCRA 62.1%), low glucose suspend systems (SUCRA 61.8%), and nocturnal closed-loop systems (SUCRA 57.8%). See Supplementary Material (pp. 87–88) for rankograms and cumulative ranking curve plots. For the outcome of time below range, correlation coefficients were imputed for one crossover study (31). Continuous and nocturnal closed-loop systems appeared to provide the best composite ranking and MDI with SMBG the worst composite ranking in cluster analysis of SUCRA values with simultaneous consideration of percent time within and below target ranges (Supplementary Material, p. 94).
Network heterogeneity was substantial for the time-in-range (τ² = 0.81, I² = 87% [95% CI 73–96]) and time-above-range (τ² = 12.45, I² = 73% [95% CI 41–92]) networks. Heterogeneity was considerable for the time-below-range network (τ² = 4.43, I² = 97% [95% CI 94–99]). Statistical consistency for direct and indirect evidence was present in all but one triangular loop for time below range comprising MDI with CGM, MDI with FGM, and MDI with SMBG (P = 0.005). The certainty of evidence for treatment effects was very low (Supplementary Material, pp. 98–103). The distribution of potential effect modifiers including age, diabetes duration, HbA1c, and sex is presented graphically in the Supplementary Material (pp. 95–96).
Discussion
This is the first network meta-analysis that compares technologies for insulin delivery and glucose monitoring for the key outcomes of time in glycemic ranges. Direct and indirect evidence for eight diabetes management technologies facilitated up to 28 separate treatment comparisons for each outcome and showed that closed-loop systems lead to longer time within target glycemic range (3.9–10.0 mmol/L [70–180 mg/dL]) than any other system. This means not only that adults with type 1 diabetes using closed-loop systems may spend less time in symptomatic states of glycemic extremes but also that they may enjoy improved long-term outcomes.
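The figure of 28 comparisons follows from counting unordered pairs among eight technologies: 8 × 7 / 2 = 28. A short Python check, using placeholder labels rather than the actual technology names, illustrates the count:

```python
from itertools import combinations
from math import comb

# Placeholder labels; the eight actual technologies are described in the text.
technologies = [f"technology_{i}" for i in range(1, 9)]

# Every unordered pair of distinct technologies is one possible comparison.
pairs = list(combinations(technologies, 2))
print(len(pairs), comb(8, 2))  # 28 28
```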
Confirmatory prospective long-term studies will be needed to assess the impact of our findings on vascular complications, but an increase of 10% time in range has been correlated with an HbA1c reduction of ∼0.5%–0.8% (5.5–8.7 mmol/mol) (6,16,32). Applied to our results, for example, this would correspond to closed-loop systems lowering HbA1c by ∼0.9%–1.4% (9.7–15.6 mmol/mol) or 0.4%–0.7% (4.4–7.7 mmol/mol) compared with current standard care of MDI and SMBG or sensor-augmented pump therapy, respectively. It also suggests that reduction in HbA1c would not be achieved at the expense of more time below range, which is a concern with other approaches to intensive glycemic control.
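The arithmetic behind these HbA1c estimates can be reproduced with a short sketch. It assumes linear scaling of the cited correlation (0.5% to 0.8% HbA1c per 10 percentage points of time in range) and uses the point estimates reported in the Results for closed-loop systems versus MDI with SMBG (17.85) and versus CSII with CGM (8.77). Treating CSII with CGM as the sensor-augmented pump comparator is an assumption made here, and the function name is hypothetical.

```python
def hba1c_reduction_range(tir_gain_points, low_per_10=0.5, high_per_10=0.8):
    """Approximate HbA1c reduction (in % units) for a gain in time in range.

    Assumes the reduction scales linearly with the gain in time in range,
    expressed in percentage points.
    """
    scale = tir_gain_points / 10.0
    return scale * low_per_10, scale * high_per_10


# Point estimates for closed-loop systems taken from the Results section.
for label, gain in (("vs MDI with SMBG", 17.85), ("vs CSII with CGM", 8.77)):
    low, high = hba1c_reduction_range(gain)
    print(f"{label}: {low:.1f}% to {high:.1f}%")
# vs MDI with SMBG: 0.9% to 1.4%
# vs CSII with CGM: 0.4% to 0.7%
```

These are extrapolations from a correlation reported elsewhere, not measured HbA1c outcomes from the included trials.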
Extending analysis to percent time above range (>10.0 mmol/L [180 mg/dL]) also favored continuous and nocturnal closed-loop systems. The percent time below target (<3.9 mmol/L [70 mg/dL]) had limited precision at the network level, with the best five treatments having almost equivalent ranking values. Inadequate data were presented to perform network meta-analysis for time below range using 3.5 mmol/L (63 mg/dL) or 3.0 mmol/L (54 mg/dL) thresholds, but exploratory analysis for 2.8 mmol/L (50 mg/dL) generated similar results (data not shown). Cluster ranking provided further insight by simultaneously plotting SUCRA values for time in range as well as time below range. This may be interpreted as continuous or nocturnal closed-loop systems providing the best composite ranking with simultaneous consideration of both key outcomes. Furthermore, a consistent finding across all outcomes was that standard care with MDI and capillary blood glucose testing clearly ranked as the worst treatment strategy. This better informs clinical decisions to use diabetes management technology and has particular relevance to policy for countries in which technologies are not currently funded.
Because standard meta-analyses require direct evidence considering at most one intervention and one comparison, closed-loop systems have not previously been compared with each alternative treatment strategy. Others have compared closed-loop systems with one heterogeneous comparator arm comprising many different devices or have limited comparators to sensor-augmented pump therapy (33,34). While less familiar for many readers, and statistically more complex, the utilization of network meta-analysis has allowed us to better address the range of therapeutic choices by generating an integrated comparison of eight management technologies. Not only do we report that closed-loop systems performed best to achieve time in range, we also provide separate quantitative comparisons against any therapeutic alternative reported in the literature. Furthermore, previous reviews considered trials of any duration including those of 12 h and involved both adults and pediatric participants in the same analyses (33,34). The generalizability of reviews based on short trials and heterogeneous populations is unclear, and recently published recommendations subsequent to those reviews suggest that clinical decisions should be based on at least 14 days of CGM results (6). Limitation of our literature search to studies that presented this duration of CGM data also makes our review most firmly able to address technology comparisons within current recommendations for time in range.
In addition to presenting the first systematically integrated comparison of closed-loop systems against multiple separate devices, our network meta-analysis provides a unique framework for comparing 28 permutations of different technologies. The breadth of our integrated results has particular relevance to advocacy and guideline development, in addition to providing estimates of treatment effect to facilitate cost-effectiveness analyses. Although not a guideline itself, one clinical application of our results may also be illustrated through an approach to device selection. For example, among adults using standard care of MDI with SMBG, closed-loop systems appeared to offer the greatest potential for improved glycemia compared with alternative therapeutic options based on our novel analyses. However, our trial-based estimates of efficacy may overestimate the effects. Indeed, real-world observational data have recently suggested that 46% of people may discontinue hybrid closed-loop therapy over 1 year of follow-up due to human factors and practical concerns including alarms and sensor calibration (35). If only one device can be implemented, stand-alone CGM may be the preferred option, as it provided significantly more time in range than standard care and ranked higher than other standalone devices. If personal preference, cost, or other technology features favored FGM or CSII therapy, it is notable that either technology ranked better than standard care and that treatment effects comparing standalone devices with each other were not significantly different. Studies that compare the full range of devices and consider treatment effects in relation to user preference for particular devices, human factors, and engagement with technology and health services may further assist the choice between technologies.
Key strengths of the approach to our study were the variety of technologies considered as well as expansion of direct comparisons with indirect evidence. We followed current guidelines for the conduct and reporting of systematic reviews and network meta-analyses (36). Literature searches were extensive, though they were limited to the English language and gray literature was not systematically searched. We also acknowledge that most studies had small sample sizes contributing to network heterogeneity, and many were at unclear or high risk of bias due to information not being reported. All studies were considered at high risk of performance bias due to the infeasibility of blinding participants and clinicians to the treatments being compared. While the studies that we included, of at least 2 weeks’ duration with 2 weeks of CGM data, are longer than those of previous meta-analyses regarding time in range, generalizability to long-term implementation in clinical practice may be limited by factors such as time spent using the device or ceasing therapy. In the absence of participant-level data, network meta-regression to further evaluate transitivity was not possible. While the important potential effect modifiers of age, diabetes duration, HbA1c, and sex appeared similar for included studies, important aspects such as user preferences for technology types, standardization of implementation for technologies, and nutrition and physical activity could not be assessed. This also led to the conservative default downgrading of evidence quality for all comparisons. Many studies reported time in range as median values with interquartile ranges, requiring imputed results for SD values in network meta-analyses. Most crossover studies also reported results as parallel-group studies, but only one did not adequately report on carryover effect. 
Consistency of results at the extremes of our imputed correlation analysis, however, suggested that this did not significantly impact our results. Pooling treatment effects for all hybrid closed-loop systems prevented comparison of the various algorithms used. However, sensitivity analyses indicated that no algorithm was statistically superior to any other and all algorithms ranked higher than CSII and CGM followed by MDI combined with CGM, FGM, or SMBG. The issue of generalizability of our results to the general population is similar to that for all meta-analyses of RCTs. Participants in RCTs may be among the most motivated and actively engaged, and we note that the mean (SD) baseline HbA1c of 7.7% (0.7%), 61 (7.7) mmol/mol, from our network is not representative of all populations. The publication of individual participant-level data or subgroup data by baseline glycemia would greatly help future meta-analyses adjust efficacy results for baseline glycemia and improve generalizability. It is also important to highlight that the positive glycemic impact of technology likely occurs in the setting of multidisciplinary and individualized treatment of factors that were beyond the scope of this review.
Future trials should address clinically relevant technology comparisons for which there is limited or no direct trial evidence. Furthermore, as dual hormone systems and fully closed-loop systems without meal announcement are studied, these should provide at least 2 weeks of CGM results (6), preferably with participant-level data. Given the rapid pace of technology development, we also advocate for the adoption of living systematic reviews to assist with the rapid incorporation of evidence into clinical practice guidelines. Furthermore, cost-effectiveness analyses are also necessary to support the wider adoption and funding of advanced technologies such as closed-loop systems as part of holistic management for people with type 1 diabetes.
In conclusion, our systematic review and network meta-analysis has compared 28 different permutations of diabetes management technologies and suggests that closed-loop systems provide the best potential for maintaining glucose levels within target glycemic range. Ranking of the eight different technologies also provides a framework to assist clinicians, policy makers, and guidelines to make comparisons among a myriad of available technologies. Consensus and implementation of uniform reporting standards for CGM data and living systematic reviews are also required to assist the incorporation of evolving evidence into future clinical practice guidelines.
This article contains supplementary material online at https://doi.org/10.2337/figshare.12116691.
Article Information
Duality of Interest. Outside the submitted work, D.L. has received grants from AstraZeneca, Pfizer, AbbVie, and Bristol-Myers Squibb and has received personal fees from AstraZeneca, Astellas, and Bayer. S.Z. reports participation in advisory boards, expert committees, or educational meetings on behalf of Monash University for Boehringer Ingelheim, Sanofi, AstraZeneca, Novo Nordisk, Eli Lilly, and MSD Australia outside the submitted work. No other potential conflicts of interest relevant to this article were reported.
Author Contributions. A.P. and S.Z. conceived the study. A.P., A.E., D.L., and S.Z. designed the study. A.P. and C.L. selected the articles. A.P. and V.K. appraised articles and extracted data for the clinical review. A.P. analyzed the data under the supervision of A.E. and S.Z. A.P. wrote the first draft of the manuscript. A.P., A.E., C.L., D.L., S.Z., and V.K. interpreted the data and contributed to the writing of the final version of the manuscript. All authors agreed to the results and conclusions of the article.