OBJECTIVE

To develop a noninvasive hypoglycemia detection approach using smartwatch data.

RESEARCH DESIGN AND METHODS

We prospectively collected data from two wrist-worn wearables (Garmin vivoactive 4S, Empatica E4) and continuous glucose monitoring values in adults with diabetes on insulin treatment. Using these data, we developed a machine learning (ML) approach to detect hypoglycemia (<3.9 mmol/L) noninvasively in unseen individuals and solely based on wearable data.

RESULTS

Twenty-two individuals were included in the final analysis (age 54.5 ± 15.2 years, HbA1c 6.9 ± 0.6%, 16 males). Hypoglycemia was detected with an area under the receiver operating characteristic curve of 0.76 ± 0.07 solely based on wearable data. Feature analysis revealed that the ML model associated increased heart rate, decreased heart rate variability, and increased tonic electrodermal activity with hypoglycemia.

CONCLUSIONS

Our approach may allow for noninvasive hypoglycemia detection using wearables in people with diabetes and thus complement existing methods for hypoglycemia detection and warning.

Hypoglycemia is a frequent and dangerous complication of diabetes (1), causing changes in physiological parameters (2,3). While wearable technology can measure these changes, machine learning (ML) may be capable of associating them with hypoglycemia. Our objective was to develop an ML approach to detect hypoglycemia noninvasively using data from consumer-grade wearable technology. Thus, we prospectively collected smartwatch and continuous glucose monitoring (CGM) data in adults with diabetes on insulin treatment.

Study Design and Population

This prospective, single-center study was conducted at the University Hospital of Bern from February 2021 to March 2022. To reflect the increased risk of hypoglycemia related to insulin treatment independently of the underlying diabetes entity, we included adults with diabetes on multiple daily injection, pump, or hybrid closed-loop insulin therapy but without cardiac arrhythmia, antiarrhythmic drugs, pacemaker, or implantable cardioverter defibrillator. The study was conducted in accordance with Good Clinical Practice principles and the Declaration of Helsinki after local ethics committee approval (2020-02721). Participants provided informed consent.

Study Procedures and Data Collection

Eligible participants received a smartwatch (Garmin vivoactive 4S) to collect heart rate variability, heart rate, motion, and time data, and a second wrist-worn wearable (Empatica E4) to collect electrodermal activity. Participants were fitted with a CGM system (Dexcom G6) and performed daily fasting calibrations to enhance the accuracy of ground truth. After 30 days of data collection, participants were scheduled for the second visit and were free to extend participation up to 90 days.

Outcome and Sample Size

The main outcome was the diagnostic accuracy of our ML approach in detecting hypoglycemia from wearable data, quantified as the area under the receiver operating characteristic curve (AUROC). The sample size considerations are described in Supplementary Methods 1.
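To make the outcome measure concrete, the following minimal sketch computes an AUROC from per-sequence labels and model scores using the rank-based (Mann-Whitney) identity: the AUROC equals the probability that a randomly chosen hypoglycemic sequence receives a higher score than a randomly chosen nonhypoglycemic one. The function name and the example values are hypothetical and are not taken from the study code.

```python
def auroc(labels, scores):
    """AUROC via pairwise comparison of positive and negative scores.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = hypoglycemia) and model scores for six sequences.
y_true = [0, 0, 1, 1, 0, 1]
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]
print(auroc(y_true, y_score))
```

An AUROC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two classes.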

Analysis and ML

Figure 1A displays our ML approach. We preprocessed heart rate, heart rate variability, and motion signals. The electrodermal activity signal was separated into its main components: fast reacting (phasic) and slow reacting (tonic). For feature engineering, we followed the conventions for time-series classification (e.g., as done by Bent et al. [4]). We cut all signals into nonoverlapping sequences of 60 s, then applied statistical aggregation functions (e.g., mean) on each sequence, generating a set of 36 interpretable features per sequence. For these steps, we developed a software package and made it publicly available to ensure full reproducibility (5). Time encoded as full hours from 0100 to 2400 h was added, resulting in a total of 37 input features (Supplementary Table 1). CGM values were linearly interpolated between two consecutive measurements. The binary output variable for the ML models was set based on CGM data (ground truth) for each sequence (to 1 for glucose <3.9 mmol/L for ≥15 min according to Battelino et al. [6] or to 0 otherwise).
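The windowing and labeling steps described above can be sketched as follows. This is an illustration under stated assumptions, not the published FLIRT implementation: the four aggregation functions stand in for the 36 features, the label is evaluated on a 1-min grid so that each minute corresponds to one 60-s sequence, and all function names and example values are hypothetical.

```python
from statistics import mean, pstdev

WINDOW_S = 60          # nonoverlapping sequence length (from the text)
HYPO_MMOL_L = 3.9      # hypoglycemia threshold (from the text)
MIN_DURATION_MIN = 15  # minimum duration below threshold (from the text)


def window_features(samples, window_s=WINDOW_S):
    """Aggregate (time_s, value) samples into nonoverlapping windows."""
    windows = {}
    for t, v in samples:
        windows.setdefault(int(t // window_s), []).append(v)
    return {
        idx: {"mean": mean(vals), "std": pstdev(vals),
              "min": min(vals), "max": max(vals)}
        for idx, vals in sorted(windows.items())
    }


def interpolate(cgm, t):
    """Linearly interpolate glucose at minute t between consecutive CGM readings."""
    for (t0, g0), (t1, g1) in zip(cgm, cgm[1:]):
        if t0 <= t <= t1:
            return g0 + (g1 - g0) * (t - t0) / (t1 - t0)
    raise ValueError("t lies outside the CGM recording")


def label_minutes(cgm):
    """Return {minute: 0 or 1}; 1 marks minutes inside a run of
    glucose < 3.9 mmol/L that spans at least 15 min."""
    minutes = list(range(int(cgm[0][0]), int(cgm[-1][0]) + 1))
    below = [interpolate(cgm, m) < HYPO_MMOL_L for m in minutes]
    labels = dict.fromkeys(minutes, 0)
    run = []
    # The trailing (None, False) sentinel closes a run that reaches the end.
    for minute, is_below in zip(minutes + [None], below + [False]):
        if is_below:
            run.append(minute)
            continue
        if run and run[-1] - run[0] >= MIN_DURATION_MIN:
            for m in run:
                labels[m] = 1
        run = []
    return labels


# Hypothetical heart rate samples (seconds, bpm) and CGM readings (minutes, mmol/L).
hr = [(0, 60), (30, 62), (59, 64), (60, 70), (119, 74)]
cgm = [(0, 5.0), (5, 4.0), (10, 3.5), (15, 3.4), (20, 3.3), (25, 3.6), (30, 4.5)]
features = window_features(hr)
labels = label_minutes(cgm)
```

A dip below the threshold that lasts only a few minutes receives the label 0 throughout, which mirrors the duration criterion in the text.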

Figure 1

Study overview and procedure for building and evaluating the ML model. A: Displayed is an overview of the study population and the procedure for building and evaluating the ML model. B: Shown are the included (blue boxes) and excluded (white boxes) individuals with reasons. CF, cardiac features; EDA, electrodermal activity; FLIRT, feature generation toolkit for wearable data; HR, heart rate; HRV, heart rate variability; MO, motion.

We used a decision tree–based ML model, that is, a model that reaches a decision by following a flowchart-like structure. For model building and evaluation, we followed the idea of De Caigny et al. (7) and refer to it as the pairing approach (see Supplementary Methods 2 for more details). This approach comprised three stages. In the training stage, we selected a test subject and trained an individual model for each remaining subject. In the validation stage, we paired the test subject with the training subject (i.e., the corresponding training model) that yielded the highest validation score. In the testing stage, we applied the selected training subject’s model to unseen data of the test subject. Given sufficient detection performance in the validation stage (mean validation score >0.7), the testing stage measured the generalizability of the ML model to the unseen subject.
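The three stages can be sketched in a few lines. This is a simplified illustration, not the code from the study repository: a one-split decision stump on a single feature stands in for the decision tree–based model, plain accuracy stands in for the validation score, and the split of the test subject's data into a validation part and an unseen test part is assumed. All names and data values are hypothetical.

```python
def accuracy(model, data):
    """Fraction of (feature, label) pairs that the model classifies correctly."""
    return sum(model(x) == y for x, y in data) / len(data)


def train_stump(data):
    """Fit a one-split decision stump: predict 1 when the feature exceeds a threshold."""
    best_thr, best_acc = None, -1.0
    for thr in sorted({x for x, _ in data}):
        acc = accuracy(lambda x, t=thr: int(x > t), data)
        if acc > best_acc:
            best_thr, best_acc = thr, acc
    return lambda x, t=best_thr: int(x > t)


def pair_and_test(subjects, test_id, min_validation_score=0.7):
    """Run the three stages for one test subject.

    Returns (paired_subject_id, test_score), or None if no training model
    exceeds the validation threshold.
    """
    validation = subjects[test_id]["validation"]
    unseen = subjects[test_id]["test"]
    # Training stage: one individual model per remaining subject.
    models = {sid: train_stump(s["train"])
              for sid, s in subjects.items() if sid != test_id}
    # Validation stage: pair the test subject with the best-scoring model.
    scores = {sid: accuracy(m, validation) for sid, m in models.items()}
    paired_id = max(scores, key=scores.get)
    if scores[paired_id] <= min_validation_score:
        return None
    # Testing stage: apply the paired model to unseen data of the test subject.
    return paired_id, accuracy(models[paired_id], unseen)


# Hypothetical (heart rate, label) pairs; label 1 = hypoglycemia.
subjects = {
    "A": {"train": [(60, 0), (65, 0), (80, 1), (85, 1)]},
    "B": {"train": [(60, 0), (90, 0), (100, 1), (110, 1)]},
    "C": {"validation": [(62, 0), (70, 0), (82, 1), (88, 1)],
          "test": [(64, 0), (84, 1), (66, 0), (90, 1)]},
}
print(pair_and_test(subjects, "C"))
```

Returning None when no model exceeds the threshold corresponds to a test subject who could not be paired and is therefore not evaluated in the testing stage.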

An event-based stratified cross-validation approach was used to compute the mean scores for the validation and testing stages. Cross-validation divides a data set into smaller parts and tests the model multiple times using different combinations of these parts. Because our data set was made up of hypoglycemic events, we refer to this approach as event-based.
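One plausible reading of an event-based stratified split is sketched below; the exact procedure is described in the supplementary material and may differ. The sketch assumes that every sequence carries an event identifier and a single label per event, keeps all sequences of an event in the same fold, and distributes events round-robin within each class so that folds have a similar class mix. The function name and identifiers are hypothetical.

```python
from collections import defaultdict


def event_based_folds(sequences, n_folds):
    """Assign (event_id, label) sequences to folds without splitting any event.

    Returns a list of folds, each a list of sequence indices.
    """
    events_by_label = defaultdict(list)  # label -> event ids in order of appearance
    for event_id, label in sequences:
        if event_id not in events_by_label[label]:
            events_by_label[label].append(event_id)

    fold_of_event = {}
    for event_ids in events_by_label.values():
        # Round-robin within each class keeps the class mix similar across folds.
        for i, event_id in enumerate(event_ids):
            fold_of_event[event_id] = i % n_folds

    folds = [[] for _ in range(n_folds)]
    for idx, (event_id, _) in enumerate(sequences):
        folds[fold_of_event[event_id]].append(idx)
    return folds


# Hypothetical sequences: two hypoglycemic events (h1, h2) and two other segments (n1, n2).
sequences = [("h1", 1), ("h1", 1), ("n1", 0), ("n1", 0),
             ("h2", 1), ("n2", 0), ("n2", 0), ("h2", 1)]
print(event_based_folds(sequences, n_folds=2))
```

Each fold then serves once as held-out data, and the scores are averaged across folds to obtain the mean score.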

Shapley Additive Explanations (SHAP) (8) values were analyzed to explain the model’s decisions and to validate the associations between wearable data and hypoglycemia learned by the ML model. SHAP values quantify how much each input contributes to a model’s predictions. To compute their relative importance, features were grouped by sensor modality as follows: time, cardiac features (including heart rate and heart rate variability features), electrodermal activity, and motion.
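A common way to turn per-feature SHAP values into group-level relative importance is to average the absolute SHAP values per feature, sum them within each group, and normalize to 100%. The sketch below assumes this convention, which the text does not spell out; the feature names and SHAP values are hypothetical.

```python
def relative_importance(shap_rows, feature_groups):
    """Percentage share of the total mean |SHAP| value per feature group.

    shap_rows: list of {feature: shap_value} dicts, one per sequence.
    feature_groups: {feature: group_name}.
    """
    n = len(shap_rows)
    mean_abs = {f: sum(abs(row[f]) for row in shap_rows) / n
                for f in feature_groups}
    total = sum(mean_abs.values())
    grouped = {}
    for feature, group in feature_groups.items():
        grouped[group] = grouped.get(group, 0.0) + mean_abs[feature]
    return {group: 100.0 * value / total for group, value in grouped.items()}


# Hypothetical features, groups, and SHAP values for two sequences.
groups = {
    "hour": "time",
    "hr_mean": "cardiac features",
    "hrv_rmssd": "cardiac features",
    "eda_tonic_p95": "electrodermal activity",
    "acc_std": "motion",
}
rows = [
    {"hour": 0.5, "hr_mean": -0.25, "hrv_rmssd": 0.25, "eda_tonic_p95": 0.5, "acc_std": 0.0},
    {"hour": -0.5, "hr_mean": 0.25, "hrv_rmssd": -0.25, "eda_tonic_p95": -0.25, "acc_std": 0.25},
]
print(relative_importance(rows, groups))
```

Taking absolute values matters here: positive and negative contributions of the same feature would otherwise cancel, although both influence the model output.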

Data and Resource Availability

The source code for preprocessing the smartwatch data has already been published (5). The code used for the pairing approach is available on GitHub (https://github.com/im-ethz/radar). The data sets generated and analyzed during the current study are available from the corresponding author upon reasonable request.

Of 31 included participants wearing both wearables, 22 had two or more hypoglycemic events covered by both wearables and were included in the final analysis (age 54.5 ± 15.2 years, HbA1c 6.9 ± 0.6%, 16 male) (Fig. 1B and Table 1). We gathered 95,632 CGM data points: 71.4% in euglycemia (3.9–10.0 mmol/L), 2.3% in hypoglycemia (<3.9 mmol/L), and 26.3% in hyperglycemia (>10.0 mmol/L). Twenty-one of the 22 included individuals participated for 30 days, and 1 participated for 90 days.
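The glycemic range percentages follow directly from the thresholds given above. As a minimal sketch, with a hypothetical function name and made-up glucose values, the distribution of CGM data points can be computed as:

```python
def glycemic_distribution(values_mmol_l):
    """Percentage of CGM values in hypoglycemia (<3.9 mmol/L),
    euglycemia (3.9-10.0 mmol/L), and hyperglycemia (>10.0 mmol/L)."""
    counts = {"hypoglycemia": 0, "euglycemia": 0, "hyperglycemia": 0}
    for g in values_mmol_l:
        if g < 3.9:
            counts["hypoglycemia"] += 1
        elif g <= 10.0:
            counts["euglycemia"] += 1
        else:
            counts["hyperglycemia"] += 1
    n = len(values_mmol_l)
    return {name: 100.0 * c / n for name, c in counts.items()}


# Made-up glucose values in mmol/L.
print(glycemic_distribution([3.5, 3.9, 7.0, 10.0, 12.0]))
```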

Table 1

Baseline characteristics of the participants included in the final analysis

Variable | Value
Participants, n | 22
Age (years), mean ± SD | 54.5 ± 15.2
Sex, n |
 Male | 16
 Female | 6
Diabetes type, n |
 1 | 14
 2 |
 Pancreatogenic |
Insulin treatment, n |
 MDI | 14
 CSII |
 PLGS |
 HCL |
Weight (kg), mean ± SD | 81.4 ± 15.1
BMI (kg/m²), mean ± SD | 27.4 ± 4.9
TDD (IU/day/kg), mean ± SD | 0.57 ± 0.21
HbA1c, mean ± SD |
 % | 6.9 ± 0.6
 mmol/mol | 51 ± 6
Diabetes duration (years), mean ± SD | 20.4 ± 14.3
Clarke score >3*, n |
Gold score >3*, n |
Diabetes complications, n |
 Peripheral neuropathy |
 Autonomic neuropathy |
 Nephropathy |
 Retinopathy |
Comorbidities, n |
 Cardiovascular disease |
 Cerebrovascular disease |
 Dyslipidemia | 14
 Hypertension |
 Liver cirrhosis |
 Malignancy |
Number of hypoglycemic events during the study period covered with data from both wearables, mean ± SD | 9.0 ± 7.7

CSII, continuous subcutaneous insulin infusion; HCL, hybrid closed-loop; MDI, multiple daily injection; PLGS, predictive low glucose suspend; TDD, total daily insulin dose.

*A Clarke and/or Gold score of greater than three points indicates impaired awareness of hypoglycemia.

Of 22 participants, 19 were paired during the validation process. Therefore, we report the ML performance, relative importance, and SHAP values for 19 individuals. For the detection of hypoglycemia (<3.9 mmol/L), the ML model achieved an AUROC of 0.76 ± 0.07 (Fig. 2A). Important feature categories for the ML model’s decisions were time, cardiac features (including heart rate and heart rate variability) and electrodermal activity with a relative importance of 33.2%, 32.8%, and 27.3%, respectively (Fig. 2B). Motion features were less decisive, with 6.7% relative importance on the model output.

Figure 2

Wearable-based detection of hypoglycemia using ML. A: Reported is the AUROC to detect hypoglycemia (<3.9 mmol/L) solely based on wearable data (n = 19 of 22). B: Displayed are the feature categories with their relative importance on the model output. Time includes the time of the day coded as full hours from 0100 to 2400 h. The cardiac features (CF) include heart rate (HR) and heart rate variability (HRV) features. Electrodermal activity (EDA) features include phasic and tonic components. C–E: SHAP values of the five most impactful features in the categories CF (C), EDA (D), and motion (MO) (E) are displayed. High to low feature values are coded using colors from yellow to violet, respectively. Feature values on the right side of the zero line drive the model toward hypoglycemia, whereas feature values on the left side drive the model toward nonhypoglycemia prediction. For example, the model tends to predict hypoglycemia when the 95th percentile (P95%) value for the tonic component of EDA is high (yellow values on the right side of the zero line). Conversely, a low tonic P95% value (violet values on the left side of the zero line) drives the model to predict nonhypoglycemia. AUC, area under the curve; IQR, interquartile range; P5%, 5th percentile; RMSSD, root mean square of differences between adjacent interbeat intervals; PNNI50, number of pairs of adjacent interbeat intervals differing by more than 50 ms in the entire recording divided by the total number of all interbeat intervals; ROC, receiver operating characteristic; STD, standard deviation.

Figure 2C–E shows the influence on the model output of the five most important features within the categories cardiac features, electrodermal activity, and motion. Increased heart rate and decreased heart rate variability were associated with hypoglycemia. For electrodermal activity, the model associated higher feature values, particularly of the tonic component, with hypoglycemia. Finally, the motion features were less impactful as indicated by SHAP values closer to 0, but motion decreased in hypoglycemia nonetheless.

The main findings of this pilot study in adults with diabetes on insulin treatment are twofold. First, our ML approach showed adequate performance in detecting hypoglycemia noninvasively from wearable data. Second, the ML model’s associations were consistent with known physiological changes in hypoglycemia.

Current methods to detect hypoglycemia include self-measurement of blood glucose and CGM. Wearable-based hypoglycemia detection could trigger self-measurement of blood glucose proactively in individuals not using CGM. In contrast to previously proposed approaches detecting hypoglycemia from wearables (9–11), our approach uses consumer-grade smartwatches without additional inputs (e.g., insulin, meals), increasing usability. In CGM and closed-loop users, ML-based approaches using wearable data could improve hypoglycemia prediction, as shown previously (12).

The interpretation of the SHAP values allowed for examining the model’s decisions. The model’s associations of increased heart rate, decreased heart rate variability, and an increased tonic component of electrodermal activity with hypoglycemia reflected the physiological stress response to hypoglycemia, such as tachycardia and sweating (2,13,14) (see Supplementary Table 1 for further explanation of the features). Time of the day constituted another relevant feature, potentially reflecting circadian patterns in physiological parameters and daily habits influencing glycemia. In our study, hypoglycemic events occurred most frequently between 2200 and 0200 h, corroborating the increased risk of hypoglycemia during this period as suggested by earlier reports (15–17) because of impaired awareness and, consequently, delayed or missing treatment of hypoglycemia overnight.

The strength of this study is its prospective design using wearable data collected in real-life conditions in individuals with different diabetes types. Our ML model is based on data retrieved from unobtrusive consumer-grade devices, rendering our approach scalable. The model’s associations are consistent with known physiological changes in hypoglycemia, corroborating its robustness. In this proof-of-concept study, we focused on the detection of hypoglycemia at the well-accepted glucose threshold of <3.9 mmol/L (6), since early detection is clinically most relevant in enabling prompt treatment and preventing progression into more severe hypoglycemia. On the other hand, additional research is required to determine the model’s performance and decision making at various levels of hypoglycemia. While we acknowledge the relatively low frequency of hypoglycemic events in our data set compared with other studies (2), the absolute number of hypoglycemic events exceeded the number prespecified in the sample size calculations (Supplementary Methods 1). Of note, the number of hypoglycemic events recorded in the current study despite the use of open-label CGM emphasizes the need for complementary hypoglycemia detection methods. Still, the restricted sample size, the single-center design, and the male predominance limit generalizability of our results to other populations. Finally, further studies are needed to assess the impact of different diabetes types, comorbidities (e.g., severe neuropathy), and hypoglycemia unawareness on detection performance.

To conclude, we provide pilot results showing that hypoglycemia can be detected noninvasively based on consumer-grade wearable data using ML. Our approach may complement existing methods to detect hypoglycemia, thereby improving self-management and care of people with diabetes.

Clinical trials reg. no. NCT04689685; clinicaltrials.gov

This article contains supplementary material online at https://doi.org/10.2337/figshare.21995150.

This article is featured in podcasts available at diabetesjournals.org/journals/pages/diabetes-core-update-podcasts.

V.L., S.F., and M.M. share first authorship.

T.Z., F.W., and C.S. share last authorship.

Acknowledgments. The authors thank all the study participants for their time and enthusiasm. The authors also thank Laura Goetschi (Department of Diabetes, Endocrinology, Nutritional Medicine and Metabolism, Inselspital, Bern, University Hospital, University of Bern) for providing administrative support.

Funding. This study was funded by a grant provided by the Swiss Innovation Agency - Innosuisse (46917.1 IP-LS).

Duality of Interest. No potential conflicts of interest relevant to this article were reported.

Author Contributions. V.L., S.F., M.K., T.Z., F.W., and C.S. interpreted the data. V.L., S.F., S.L., K.O., and C.A. collected the data. V.L., S.F., T.Z., F.W., and C.S. contributed to the conception and design of the study. V.L., S.F., and C.S. drafted the manuscript. V.L., S.L., and T.Z. recruited the participants. S.F., M.M., E.v.W., and M.K. analyzed the data. M.M. developed the software for wearable data collection. M.M., E.v.W., M.K., S.L., K.O., C.A., E.F., T.Z., and F.W. critically reviewed the manuscript. All authors approved the final version of the manuscript. C.S. is the guarantor of this work and, as such, had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

Prior Presentation. Parts of this study were presented in abstract form at the Annual Meeting of the Swiss Society of Endocrinology and Diabetes, Bern, Switzerland, 17–18 November 2022.

1. Seaquist ER, Anderson J, Childs B, et al. Hypoglycemia and diabetes: a report of a workgroup of the American Diabetes Association and the Endocrine Society. Diabetes Care 2013;36:1384–1395
2. Olde Bekkink M, Koeneman M, de Galan BE, Bredie SJ. Early detection of hypoglycemia in type 1 diabetes using heart rate variability measured by a wearable device. Diabetes Care 2019;42:689–692
3. Koeneman M, Olde Bekkink M, Meijel LV, Bredie S, de Galan B. Effect of hypoglycemia on heart rate variability in people with type 1 diabetes and impaired awareness of hypoglycemia. J Diabetes Sci Technol 2022;16:1144–1149
4. Bent B, Cho PJ, Henriquez M, et al. Engineering digital biomarkers of interstitial glucose from noninvasive smartwatches. NPJ Digit Med 2021;4:89
5. Föll S, Maritsch M, Spinola F, et al. FLIRT: A feature generation toolkit for wearable data. Comput Methods Programs Biomed 2021;212:106461
6. Battelino T, Danne T, Bergenstal RM, et al. Clinical targets for continuous glucose monitoring data interpretation: recommendations from the International Consensus on Time in Range. Diabetes Care 2019;42:1593–1603
7. De Caigny A, Coussement K, De Bock KW. A new hybrid classification algorithm for customer churn prediction based on logistic regression and decision trees. Eur J Oper Res 2018;269:760–772
8. Lundberg SM, Lee S-I. A unified approach to interpreting model predictions. In Proceedings of the 31st International Conference of Neural Information Processing. New York, Association for Computing Machinery, 2017, pp. 4768–4777
9. Ranvier J-E, Dubosson F, Calbimonte J-P, Aberer K. Detection of hypoglycemic events through wearable sensors. In Proceedings of the International Workshop on Semantic Web Technologies for Mobile and Pervasive Environments. Aachen, Germany, CEUR-WS, 2016
10. Dave D, Vyas K, Branan K, et al. Detection of hypoglycemia and hyperglycemia using noninvasive wearable sensors: ECG and accelerometry. J Diabetes Sci Technol. 4 August 2022 [Epub ahead of print]. DOI: 10.1177/19322968221116393
11. San PP, Ling SH, Nguyen HT. Deep learning framework for detection of hypoglycemic episodes in children with type 1 diabetes. Annu Int Conf IEEE Eng Med Biol Soc 2016:3503–3506
12. Cichosz SL, Frystyk J, Hejlesen OK, Tarnow L, Fleischer J. A novel algorithm for prediction and detection of hypoglycemia based on continuous glucose monitoring and heart rate variability in patients with type 1 diabetes. J Diabetes Sci Technol 2014;8:731–737
13. Hepburn DA, Deary IJ, Frier BM, Patrick AW, Quinn JD, Fisher BM. Symptoms of acute insulin-induced hypoglycemia in humans with and without IDDM. Factor-analysis approach. Diabetes Care 1991;14:949–957
14. Schwartz NS, Clutter WE, Shah SD, Cryer PE. Glycemic thresholds for activation of glucose counterregulatory systems are higher than the threshold for symptoms. J Clin Invest 1987;79:777–781
15. Bode BW, Schwartz S, Stubbs HA, Block JE. Glycemic characteristics in continuously monitored patients with type 1 and type 2 diabetes: normative values. Diabetes Care 2005;28:2361–2366
16. Yang C, Ma Y-l, Kang J, et al. Time and department distribution of hypoglycemia occurrences in hospitalized diabetic patients. Int J Nurs Sci 2015;2:263–267
17. Chico A, Vidal-Ríos P, Subirà M, Novials A. The continuous glucose monitoring system is useful for detecting unrecognized hypoglycemias in patients with type 1 and type 2 diabetes but is not better than frequent capillary glucose measurements for improving metabolic control. Diabetes Care 2003;26:1153–1157

Readers may use this article as long as the work is properly cited, the use is educational and not for profit, and the work is not altered. More information is available at https://www.diabetesjournals.org/journals/pages/license.