This study analyzed the differences in continuous glucose monitoring (CGM)-derived metrics among three current-generation systems and evaluated their impact on therapeutic decision-making.
Twenty-three participants wore the FreeStyle Libre 3, Dexcom G7, and Medtronic Simplera CGM systems for 14 days in parallel. CGM metrics were calculated for each participant and CGM system separately.
The apparent glucose profile was influenced by the CGM system used, resulting in substantially different glycemic metrics among the three systems. Agreement between FreeStyle Libre 3 and Dexcom G7 was higher than with Medtronic Simplera, which showed lower glucose levels, on average. There were marked intraparticipant discrepancies that would have resulted in different therapeutic recommendations.
The CGM systems indicated discordant glycemic metrics, which should be considered in diabetes therapy. Different CGM systems should provide the same glucose readings and CGM-derived metrics when used by the same person.
Introduction
Metrics derived from continuous glucose monitoring (CGM) data are used to assess the glycemic impact of diabetes therapy. International consensus recommends specific targets for glycemic control metrics, such as time spent in certain glucose ranges, that should be achieved for optimal glucose control (1). In recent years, several studies have demonstrated that, when worn in parallel, different CGM systems can display discordant glucose profiles (2–5), likely caused by differences in CGM accuracy. Consequently, CGM-derived metrics can differ substantially depending on the CGM system used. Today, newer generations of these CGM systems are available, raising the question of whether this problem persists. Therefore, we analyzed data from a recent head-to-head CGM performance study in which participants wore three current-generation CGM systems of the principal manufacturers simultaneously. In that study, we found substantial differences in accuracy (6). The objective for the present article was to examine how these differences in accuracy affect the CGM-derived metrics and evaluate their possible impact on therapeutic decision-making.
Research Design and Methods
This was a prospective, interventional, monocentric, single-arm, open-label study performed between April and July 2024 at the Institut für Diabetes-Technologie, Forschungs- und Entwicklungsgesellschaft mbH an der Universität Ulm, in Ulm, Germany. The study was carried out under consideration of the Declaration of Helsinki and in compliance with the national regulations and provisions. It was approved by the responsible ethics committee and the competent authority. The study was registered in the German Clinical Trials Register (no. DRKS00033697).
Study Design and Investigational Devices
Adult participants with type 1 diabetes were included after we obtained their informed consent. All participants wore sensors of the FreeStyle Libre 3 (FL3; Abbott Diabetes Care, Alameda, CA), Dexcom G7 (DG7; Dexcom Inc., San Diego, CA), and Medtronic Simplera (MSP; Medtronic Minimed, Northridge, CA) CGM systems in parallel for 14 days. The FL3 and MSP sensors originated from one manufacturing lot; the DG7 sensors came from two different lots randomized across participants. Each participant received a sensor from each system on study day 1 and the sensors were removed on study day 15. However, due to the differing lifetimes (FL3: 14 days; DG7: 10 days; MSP: 7 days), one FL3 sensor per participant was used, whereas sensors of the DG7 and MSP were replaced on study days 5 and 8, respectively. No manual calibrations were performed, although DG7 and MSP allow optional calibration. The sensors were inserted on the upper arm and evenly distributed between left and right arm within CGM systems. The vast majority of the study duration was spent in a free-living setting, where participants followed their regular daily routine. However, three 7-h in-clinic sessions with deliberate glucose-level manipulation in the hypo- and hyperglycemic ranges were performed for the purpose of CGM accuracy assessment. Simultaneously, capillary glucose levels were measured every 15 min with the Contour Next blood glucose monitoring system (Ascensia Diabetes Care Holdings AG, Basel, Switzerland) (6).
Data Analysis and Statistical Analysis
To allow the direct comparison of CGM-derived metrics, only periods in which all three systems recorded data simultaneously were included in the analysis. Consequently, only participants for whom at least 70% of CGM readings (relative to the 14-day study period) were available were evaluated (1). CGM metrics were calculated for each participant and CGM system separately and included time below range level 2 (TBRL2; <54 mg/dL); time below range (TBR; <70 mg/dL); time in range (TIR; 70–180 mg/dL); time above range (TAR; >180 mg/dL); time above range level 2 (>250 mg/dL); mean glucose concentration; glucose management indicator (GMI); glycemic variability (1); time in tight range (70–140 mg/dL) (7); and glycemia risk index (8). Additionally, the number of hypoglycemic episodes <54 mg/dL and <70 mg/dL was calculated, as described previously (9). Differences in CGM-derived metrics between CGM systems were assessed within individuals and on a population level using nonparametric statistical tests.
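The calculation described above can be illustrated with a minimal sketch. It assumes each system's readings have already been resampled to a shared time grid (the source does not describe how the differing sampling intervals were reconciled), that the SD for %CV is the population SD, and that GMI follows the commonly used formula GMI (%) = 3.31 + 0.02392 × mean glucose (mg/dL). The function names `common_periods` and `cgm_metrics` are hypothetical and are not the authors' analysis code.

```python
from statistics import mean, pstdev


def common_periods(series_by_system):
    """Keep only timestamps at which every system reported a reading.

    series_by_system maps a system name to {timestamp: glucose_mg_dl}.
    Assumption: all series share one time grid (not stated in the source).
    """
    shared = sorted(set.intersection(*(set(s) for s in series_by_system.values())))
    return {name: [s[t] for t in shared] for name, s in series_by_system.items()}


def cgm_metrics(readings_mg_dl):
    """Summarize equally spaced CGM readings (mg/dL) into range-based metrics."""
    n = len(readings_mg_dl)
    if n == 0:
        raise ValueError("at least one reading is required")

    def pct(in_range):
        # Share of readings satisfying the range condition, in percent.
        return 100.0 * sum(1 for g in readings_mg_dl if in_range(g)) / n

    avg = mean(readings_mg_dl)
    return {
        "TBRL2": pct(lambda g: g < 54),
        "TBR": pct(lambda g: g < 70),
        "TITR": pct(lambda g: 70 <= g <= 140),
        "TIR": pct(lambda g: 70 <= g <= 180),
        "TAR": pct(lambda g: g > 180),
        "TARL2": pct(lambda g: g > 250),
        "mean": avg,
        "GMI": 3.31 + 0.02392 * avg,  # assumed GMI formula, mg/dL input
        "CV": 100.0 * pstdev(readings_mg_dl) / avg,  # assumed population SD
    }


if __name__ == "__main__":
    # Hypothetical readings keyed by minutes since study start.
    raw = {
        "FL3": {0: 150, 5: 160, 10: 185, 15: 120},
        "DG7": {5: 155, 10: 178, 15: 118, 20: 110},
    }
    for name, values in common_periods(raw).items():
        print(name, cgm_metrics(values))
```

Restricting every system to the shared timestamps before computing the metrics keeps the comparison paired: each system is summarized over exactly the same moments.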
Data and Resource Availability
The data sets generated during and/or analyzed in this study are available from the corresponding author upon reasonable request.
Results
After exclusion of one participant due to insufficient CGM data availability, mainly caused by the loss of an FL3 sensor, 23 participants, 17 of whom were male, were included in the analysis. The mean age of the participants was 52.7 years, mean BMI was 26.1 kg/m², and average diabetes duration was 26.7 years with mean HbA1c of 6.6% (49 mmol/mol). Five participants used multiple daily injections; 18 were undergoing continuous subcutaneous insulin infusion, of whom 15 were using an automated insulin delivery (AID) system. Median data availability was 97.8% (range 73.0–98.6%).
Summary statistics of CGM-derived metrics from each CGM system are shown in Table 1. Although there were few significant differences between DG7 and FL3, most metrics derived from MSP data differed from those derived from FL3 or DG7, resulting, on average, in a lower GMI, a higher TIR and TBR, and a lower TAR with MSP (Table 1, Fig. 1). Furthermore, CGM readings of FL3 and DG7 were on average 14.2% and 11.2%, respectively, above MSP (relative to MSP), whereas CGM readings of FL3 were on average 2.7% higher than DG7 (relative to DG7). These findings concur with the accuracy results, where FL3, DG7, and MSP showed relative biases compared with capillary glucose measurements of −1.1%, −2.5%, and −14.5%, respectively (6). Population-level TIR and GMI results for individual days are shown in Supplementary Fig. 1.
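The relative differences quoted above depend on which system serves as the denominator. The sketch below shows one way such a figure could be computed from paired readings. It assumes the average is taken over per-reading relative differences; the source does not state whether this or a ratio of means was used. The function name is hypothetical.

```python
def mean_relative_difference(test, reference):
    """Mean of (test - reference) / reference over paired readings, in percent.

    Assumption: per-reading averaging; the source does not specify the method.
    """
    if len(test) != len(reference) or not test:
        raise ValueError("paired, non-empty reading lists are required")
    return 100.0 * sum((t - r) / r for t, r in zip(test, reference)) / len(test)


if __name__ == "__main__":
    # Hypothetical paired readings in mg/dL, not study data.
    system_a = [110, 220, 165]
    system_b = [100, 200, 150]
    print(mean_relative_difference(system_a, system_b))  # A relative to B
    print(mean_relative_difference(system_b, system_a))  # B relative to A
```

The two calls do not return values of equal magnitude, which is why the text specifies the reference system for each percentage.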
Table 1. CGM-derived metrics from the three CGM systems
Metric | FL3 | DG7 | MSP | P < 0.05* |
---|---|---|---|---|
Time | ||||
<54 mg/dL (%) | 0.1 (0.0–4.3) | 0.7 (0.0–5.5) | 0.7 (0.1–5.2) | a,b |
Minutes | 1.8 (0.0–61.7) | 10.2 (0.0–78.8) | 9.5 (0.7–75.2) | |
<70 mg/dL (%) | 2.8 (0.1–18.0) | 4.2 (0.4–17.5) | 5.1 (0.7–15.6) | b |
Minutes | 40.3 (1.1–259.2) | 60.1 (5.5–251.6) | 73.9 (9.8–224.8) | |
70–140 mg/dL (%) | 54.7 (30.2–73.6) | 54.8 (27.0–72.6) | 67.5 (38.6–82.7) | b,c |
Hours | 13.1 (7.2–17.7) | 13.1 (6.5–17.4) | 16.2 (9.3–19.8) | |
70–180 mg/dL (%) | 76.3 (47.7–92.0) | 76.2 (56.7–92.0) | 84.0 (64.1–92.6) | b,c |
Hours | 18.3 (11.5–22.1) | 18.3 (13.6–22.1) | 20.2 (15.4–22.2) | |
>180 mg/dL (%) | 21.8 (6.4–49.5) | 19.6 (4.4–42.3) | 10.5 (2.2–31.6) | b,c |
Hours | 5.2 (1.5–11.9) | 4.7 (1.1–10.2) | 2.5 (0.5–7.6) | |
>250 mg/dL (%) | 4.3 (0.1–16.3) | 3.1 (0.0–13.6) | 1.3 (0.0–6.8) | b,c |
Minutes | 62.6 (1.1–235.0) | 44.4 (0.0–196.2) | 19.3 (0.0–98.5) | |
Glucose, mean (mg/dL) | 143 (109–179) | 140 (111–177) | 123 (105–154) | b,c |
GMI (%) | 6.7 (5.9–7.6) | 6.7 (6.0–7.5) | 6.3 (5.8–7.0) | b,c |
%CV (%) | 37.3 (24.2–43.1) | 37.0 (25.4–45.5) | 34.7 (21.8–43.2) | b,c |
Glycemia risk index | 28.9 (9.8–58.5) | 34.1 (9.4–57.8) | 23.7 (7.2–53.9) | c |
Events | ||||
<70 mg/dL | 9 (0–32) | 10 (3–33) | 15 (2–33) | a,b,c |
<54 mg/dL | 0 (0–9) | 3 (0–14) | 3 (0–15) | a,b |
Results are given as median (minimum–maximum). %CV, coefficient of variation.
*P values between CGM system pairs were calculated using the paired Wilcoxon test, as follows:
a, FL3 vs. DG7
b, FL3 vs. MSP
c, DG7 vs. MSP
Figure 1. Median percentage of time in different glucose ranges across all study participants (n = 23) according to the different CGM systems.
Figure 2 gives a more detailed, participant-specific analysis, including differences in glycemic metrics between pairs of CGM systems. Despite FL3 and DG7 showing, on average, comparable metrics, there were individual differences of up to 9.8% (142 min) in TBR, 11.9% (171 min) in TIR, and 13.4% (193 min) in TAR. TIR differed >5%, which is considered clinically significant (1,10,11), in five participants (22%) when comparing FL3 and DG7, in 17 participants (74%) when comparing FL3 with MSP, and in 12 participants (52%) when comparing DG7 and MSP. Similarly, GMI differences of >0.3%, considered clinically significant (12), were observed in five participants (22%) when comparing FL3 and DG7, in 18 participants (78%) when comparing FL3 with MSP, and in 14 participants (61%) when comparing DG7 and MSP.
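The participant counts above amount to a paired comparison against a fixed threshold. A minimal sketch follows, assuming one metric value per participant and system and treating a TIR difference as a difference in percentage points. The function name and all values are hypothetical.

```python
def count_exceeding(metric_a, metric_b, threshold):
    """Count participants whose absolute paired difference exceeds threshold."""
    if len(metric_a) != len(metric_b):
        raise ValueError("one value per participant is required for both systems")
    return sum(1 for a, b in zip(metric_a, metric_b) if abs(a - b) > threshold)


if __name__ == "__main__":
    # Hypothetical per-participant values, not study data.
    tir_system_a = [76.0, 80.0, 60.0, 90.0]
    tir_system_b = [84.0, 78.0, 66.0, 85.0]
    gmi_system_a = [6.7, 6.5, 7.2]
    gmi_system_b = [6.3, 6.4, 6.7]

    # Thresholds taken from the text: >5 for TIR (%), >0.3 for GMI (%).
    print(count_exceeding(tir_system_a, tir_system_b, 5.0))
    print(count_exceeding(gmi_system_a, gmi_system_b, 0.3))
```

The comparison is strict, matching the ">5%" and ">0.3%" wording. Values that sit exactly on a threshold are sensitive to floating-point rounding, so rounding the metrics to their reported precision first would be a reasonable precaution.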
Figure 2. A–D: CGM-derived metrics of each participant (dots; n = 23) according to different CGM systems. Identical participants are connected by lines. Red dashes show the medians. Dashed lines indicate therapy targets. E–H: Differences in CGM-derived metrics between pairs of CGM systems (indicated by the x-axis labels) within the same participant (dots). Dashed lines for TIR and GMI indicate clinically significant differences (1,10–12).
The maximum observed difference in TBR within the same participant was 12.9% (185 min) between FL3 and MSP. Consequently, the CGM system used influenced whether specific therapy targets were met or not. For example, the therapy target of TBR <4% (1) (Fig. 2A) was met by 16 participants (70%) based on FL3 data, 11 (48%) participants based on DG7, and 6 (26%) based on MSP data.
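Whether a target is met can be checked per system, and disagreement between systems can be checked per participant. The sketch below assumes one TBR value per participant and system, listed in the same participant order for each system. The function names and the data are hypothetical; only the TBR <4% target comes from the text.

```python
def target_met_counts(values_by_system, meets_target):
    """Number of participants meeting the target, per system."""
    return {
        name: sum(1 for v in values if meets_target(v))
        for name, values in values_by_system.items()
    }


def discordant_participants(values_by_system, meets_target):
    """Number of participants for whom the systems disagree on the target."""
    per_participant = zip(*values_by_system.values())
    return sum(
        1 for values in per_participant
        if len({meets_target(v) for v in values}) > 1
    )


if __name__ == "__main__":
    # Hypothetical TBR values (%) for three participants, not study data.
    tbr = {
        "FL3": [2.8, 5.0, 1.0],
        "DG7": [4.2, 6.0, 2.0],
        "MSP": [5.1, 3.0, 3.5],
    }

    def tbr_target(value):
        return value < 4.0  # TBR <4% target from the text

    print(target_met_counts(tbr, tbr_target))
    print(discordant_participants(tbr, tbr_target))
```

Passing the target as a predicate lets the same two functions be reused for other thresholds without modification.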
Apart from the differences in metrics derived over 14 days, the study procedures also allowed the examination of differences over shorter time spans by comparing glucose profiles between CGM systems during experimentally induced hyper- and hypoglycemia (Fig. 3). This figure also shows the average profile of capillary comparator measurements to provide a qualitative impression of CGM accuracy. For more detailed results of CGM performance, the reader is referred to our previously published article (6).
Figure 3. Mean time course of capillary comparator measurements and CGM glucose data from the three systems during experimentally induced hyperglycemia (A) and hypoglycemia (B). The individual profiles were synchronized according to the time of the first capillary measurement >250 mg/dL or <70 mg/dL, respectively, and averaged.
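The synchronization described in the caption can be sketched as follows. The sketch assumes that, for each participant, the capillary and CGM traces have been placed on the same regular time grid, which is a simplification because the capillary values were collected every 15 min and the source does not describe the alignment. Function names and data are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def first_crossing(capillary, threshold, above=True):
    """Index of the first capillary value beyond the threshold, or None."""
    for i, glucose in enumerate(capillary):
        crossed = glucose > threshold if above else glucose < threshold
        if crossed:
            return i
    return None


def synchronized_mean(profiles, threshold, above=True):
    """Average CGM traces after shifting each to its capillary crossing.

    profiles: list of (capillary, cgm) pairs of equal-length lists on a
    shared time grid (assumption). Returns {offset_in_steps: mean glucose},
    where offset 0 is the first capillary value beyond the threshold.
    """
    buckets = defaultdict(list)
    for capillary, cgm in profiles:
        anchor = first_crossing(capillary, threshold, above)
        if anchor is None:
            continue  # threshold never reached for this participant
        for i, glucose in enumerate(cgm):
            buckets[i - anchor].append(glucose)
    return {offset: mean(values) for offset, values in sorted(buckets.items())}


if __name__ == "__main__":
    # Hypothetical hyperglycemia sessions (mg/dL), not study data.
    sessions = [
        ([200, 260, 270], [190, 240, 265]),
        ([180, 230, 255], [170, 220, 250]),
    ]
    print(synchronized_mean(sessions, 250, above=True))
```

For the hypoglycemia panel, the same function would be called with a threshold of 70 and `above=False`. Offsets near the edges are averaged over fewer participants, so the ends of such a mean profile should be read with more caution than the center.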
Conclusions
Despite advances in CGM measurement accuracy, this study has shown that considerable differences remain between glucose values reported by different, current-generation CGM systems in the same person. This might lead to different therapy decisions and interventions.
This study had a short observation period, which was additionally decreased by only analyzing periods during which all systems were delivering data simultaneously, and a small population size, leading to a low overall number of sensors used. These limitations affect the interpretation of the differences in metrics between CGM systems on an individual level, because they are caused by two overlapping effects: the variations of individual sensors within the same CGM system and the systematic differences between CGM systems. To clearly distinguish between these two effects, multiple sensors from the same CGM system would have to be used. However, we assert that the study protocol (14-day collection of CGM data with one to two sensors per CGM system) complies with the consensus recommendation (1); therefore, we consider the results to reflect the experience in clinical practice. In contrast to the differences in metrics between CGM systems on an individual level, the averaged population-level results should be less affected by variability of individual sensors. Another limitation of this study was that the participant population had above-average glycemic control, meaning that the observed CGM-derived metrics likely covered a narrower range compared with the overall population of people with diabetes. Nevertheless, we argue that the findings of this study have two clinically relevant consequences for practical diabetes therapy.
First, CGM-derived data are used by people with diabetes and health care professionals to assess glucose control. Here, the study showed that the observed differences between CGM systems would have resulted in different individual therapy adjustments depending on which system was used. According to the international consensus statement (1), highest priority in therapy adjustment should be given to decreasing the TBR to <4%, followed by adjusting TAR and TIR. However, in almost half of participants (n = 11; 48%), the CGM systems did not agree on whether the TBR target was met, meaning that necessary therapy adjustments might have been missed or unnecessary adjustments might have been made depending on the CGM system used. Regarding the targets for TIR (>70%) and TAR (<25%), the corresponding number of participants for whom the therapeutic recommendations would have diverged was lower but still considerable with five (22%) and eight (35%), respectively. These results emphasize that glycemic metrics are influenced by the CGM system used and/or the individual sensor, which should be considered by people with diabetes and health care professionals when comparing data from different systems or switching from one system to another. Furthermore, the population-level differences between CGM systems indicate that results from trials comparing AID systems based on glycemic metrics measured with different CGM systems should be interpreted with caution (13,14).
The second practical and relevant consequence of our findings is that different CGM systems might affect everyday clinical decision-making of people with diabetes and the insulin delivery of AID systems. This is illustrated in Fig. 3, where it is demonstrated that CGM profiles of FL3 and DG7 were similar during hyperglycemic and hypoglycemic episodes. In contrast, the CGM profiles of MSP were lower, resulting in earlier detection of hypoglycemia but later detection of hyperglycemia. This would most likely affect the self-management behavior of people with diabetes, including both the timing and choice of treatment decisions, like insulin delivery or hypoglycemia interventions.
In our opinion, the main reason for the observed discrepancies between CGM systems is the heterogeneous procedures used in CGM performance studies, in particular during development. Therefore, standardization of study procedures, especially the collection of comparator data, will ensure that accuracy of CGM systems is judged against harmonized comparator data, ultimately resulting in better alignment of CGM systems (14). This is the goal of the working group on CGM of the International Federation of Clinical Chemistry and Laboratory Medicine.
Clinical trial reg. no. DRKS00033697, https://www.drks.de/
See accompanying article, p. 1161.
This article contains supplementary material online at https://doi.org/10.2337/figshare.28611818.
Article Information
Acknowledgments. The authors thank everybody who participated in the study, the staff at the Institut für Diabetes-Technologie, Forschungs- und Entwicklungsgesellschaft mbH an der Universität Ulm (IfDT), and the sponsors of this study.
Funding. Financial support to partially cover the costs of this study was provided by BIONIME Corporation, Diabetes Center Berne, i-SENS, Inc., and Roche Diabetes Care GmbH. Additionally, Ascensia Diabetes Care Holdings AG provided blood glucose monitoring systems and associated consumables free of charge. The remaining costs were carried by the IfDT. No funding was provided by any of the manufacturers of the examined CGM systems. None of the commercial entities had any influence on the study design, data analysis, or presentation or publication of results.
Duality of Interest. G.F. is the general manager and medical director of the IfDT, which carries out clinical studies on its own initiative and on behalf of various companies. G.F. and IfDT have received research support, speakers’ honoraria, or consulting fees in the last 3 years from Abbott, Ascensia, Berlin Chemie, Boydsense, Dexcom, Lilly Deutschland, Novo Nordisk, Perfood, Pharmasens, Roche, Sinocare, Terumo, and Ypsomed. D.W., S.W., M.E., S.P., M.L., N.J., S.Ö., and C.H. are employees of IfDT. No other potential conflicts of interest relevant to this article were reported.
Author Contributions. G.F., S.W., M.E., S.P., M.L., D.B., C.H., and D.W. were involved in the conception and design of the study. G.F., S.W., M.E., M.L., N.J., S.Ö., and D.W. were involved in the conduct of the study. M.E. and S.P. analyzed data and reviewed and edited the manuscript. D.W. and S.W. wrote the first draft of the manuscript. All authors were involved in the interpretation of data, reviewed and edited the manuscript, and approved the final version of the manuscript. D.W. is the guarantor of this work and, as such, has full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Prior Presentation. Part of the results reported here were presented at the 2024 Fall Meeting of the German Diabetes Society, Hannover, Germany, 23 November 2024.
Handling Editors. The journal editors responsible for overseeing the review of the manuscript were John B. Buse and Jeremy Pettus.