We derive and validate D-RISK, an electronic health record (EHR)-driven risk score to optimize and facilitate screening for undiagnosed dysglycemia (prediabetes plus diabetes) in clinical practice.
We used retrospective EHR data (derivation sample) and a prospective diabetes screening study (validation sample) to develop D-RISK. Logistic regression with backward selection was used to predict dysglycemia (HbA1c ≥5.7%) using diabetes risk factors consistently captured in structured EHR data. Model coefficients were converted to a points-based risk score. We report discrimination, sensitivity, and specificity and compare D-RISK to the American Diabetes Association (ADA) risk test and the ADA and United States Preventive Services Task Force (USPSTF) screening guidelines.
The derivation cohort included 11,387 patients (mean age 48 years; 65% female; 42% Hispanic; 36% non-Hispanic Black; mean BMI 32 kg/m2; 29% with hypertension). D-RISK included age, race, BMI, hypertension, and random glucose. The area under the curve (AUC) for the risk score was 0.75 (95% CI 0.74–0.76). In the validation screening study (n = 519), the AUC was 0.71 (95% CI 0.66–0.75), which was better than the ADA and USPSTF diabetes screening guidelines (AUC = 0.52 and AUC = 0.58, respectively; P < 0.001 for both). Discrimination was similar to the ADA risk test (AUC = 0.67) using patient-reported data to supplement EHR data, although D-RISK was more sensitive (75% vs. 61%) at the recommended screening thresholds.
Designed for use in EHR, D-RISK performs better than commonly used screening guidelines and risk scores and may help detect undiagnosed cases of dysglycemia in clinical practice.
Introduction
Despite well-established diabetes screening guidelines (1,2) and opportunistic screening in clinical practice, an estimated 8.7 million adults in the U.S. have undiagnosed type 2 diabetes, and an additional 80.6 million U.S. adults have undiagnosed prediabetes (3). Although individuals with limited access to and engagement with health care have higher rates of undiagnosed diabetes (4,5), targeted screening for high-risk individuals and overall screening rates remain suboptimal in clinical practice (6–9). The risk of microvascular and macrovascular complications increases as glucose rises in prediabetes, and nearly one-third of patients have microvascular or macrovascular complications at the time of type 2 diabetes diagnosis (10). Early diagnosis is critical for initiating evidence-based lifestyle interventions (11) to prevent or delay the development of diabetes and its complications (12). The Anglo-Danish-Dutch Study of Intensive Treatment in People with Screen Detected Diabetes in Primary Care (ADDITION)-Denmark study demonstrated cost savings per person with incident diabetes over a 5-year period such that screening program costs were offset by health system savings within 2 years (13). Systematic screening strategies are needed for clinical practice to move beyond visit-based opportunistic screening and help close screening gaps in high-risk patients to improve early detection of undiagnosed dysglycemia (prediabetes + diabetes) (14).
Diabetes screening guidelines use risk factors to identify individuals for testing, with a goal of identifying asymptomatic individuals likely to have diabetes. The American Diabetes Association (ADA) screening guideline (1) recommends testing for all adults over age 35 and earlier testing for those who have overweight or obesity with additional risk factors. The 2021 United States Preventive Services Task Force (USPSTF) guideline recommends screening for all adults ages 35–70 years who are overweight or obese (2). Both the USPSTF and ADA guidelines rely heavily on age and weight status to identify high-risk individuals and use a single screen/no-screen threshold. These population-based screening recommendations identify a large number of individuals for screening and prioritize sensitivity, which lowers overall performance of the guideline (15). Within health systems where most screening occurs, more efficient approaches that afford health systems flexibility to select screening thresholds based on underlying population risk, desired balance of sensitivity and specificity, and availability of resources are needed.
Diabetes risk scores provide an opportunity for health systems to select screening thresholds based upon the desired sensitivity and specificity for their target population. However, the usefulness of most risk scores in clinical practice is limited by requiring collection of prospective data by clinical staff, providers, or patient report, which poses a significant barrier to use (16). Patient-facing risk scores such as the ADA risk test can be effective at identifying high-risk patients for screening when administered via patient portals or during clinical visits (17,18); however, usability, scalability, and potential automation within the electronic health record (EHR) data are complicated by missing data such as physical activity and may require the use of alternate cut points or score modification (19).
Automated novel risk scores that use data routinely available within EHR, without requiring the collection of additional data, are needed to facilitate implementation in clinical practice and allow risk assessment at clinic and health system levels. Random blood glucose (RBG), which is routinely measured on laboratory panels in clinical practice, is commonly available, is associated with increased diabetes risk when elevated, and can identify cases of undiagnosed diabetes (20–22). In this study, we derive and validate a novel EHR-driven Dysglycemia Risk Score (D-RISK) to detect undiagnosed dysglycemia using RBG and other common, structured risk factors from EHR data collected in routine clinical practice.
Research Design and Methods
Study Overview and Population
We derived D-RISK using a retrospective EHR cohort and validated it using a prospective diabetes screening study (Supplementary Fig. 1). Both derivation and validation cohorts consisted of established primary care patients without a prior diagnosis of diabetes or prediabetes within Parkland Health (Parkland). Parkland is an integrated safety-net health system in Dallas County, TX, that provides comprehensive subsidized health care for the uninsured and underinsured residents of Dallas County. Parkland operates 12 free-standing outpatient primary care clinics across Dallas County that use a common EHR, Epic (Epic Systems Corporation, Verona, WI), with comprehensive encounter and laboratory data from inpatient, outpatient, and emergency room visits.
Derivation Cohort
D-RISK was derived from a retrospective cohort of established primary care patients ages 18–64 years without diagnosed dysglycemia (diabetes or prediabetes). Patients were eligible if they 1) had an index outpatient clinic visit between 1 June 2011 and 31 December 2014, defined by the first outpatient primary care clinic visit during this period; 2) had a resulted diabetes screening test (hemoglobin A1C [HbA1c]) in EHR after the index visit; and 3) had one or more RBG values in EHR in the 12 months before the HbA1c test. Only outpatient RBG values were included. Patients were excluded if they 1) had diagnosed dysglycemia, defined as HbA1c ≥5.7% or fasting glucose ≥100 mg/dL at any time prior to the index visit or documentation of a diabetes ICD 9/10 code in the problem list, medical history, or encounter diagnoses, or 2) had a resulted gold standard diabetes screening test (HbA1c, fasting glucose, or oral glucose tolerance test) in the 18 months prior to the index visit. Fasting glucose results were identified using the “fasting glucose” order in EHR. Although laboratory staff confirmed fasting status prior to the laboratory draw, this confirmation was not available in EHR. Women with pregnancy-related ICD 9/10 and Current Procedural Terminology codes between 2011 and 2014 were also excluded. Patients ages 65 years and older were excluded because Medicare eligibility expands health care access beyond Parkland and decreases the capture of comprehensive health services within the safety-net health system.
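The eligibility rules above can be sketched as a simple filter. This is an illustrative sketch only, not the study's actual EHR query: the `PatientRecord` fields and the function name are hypothetical, the two exclusion criteria are read as independent (either one excludes a patient), and "12 months" and "18 months" are approximated as 365 and 548 days.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import List, Optional

WINDOW_START = date(2011, 6, 1)    # index visit window from the text
WINDOW_END = date(2014, 12, 31)
RBG_LOOKBACK = timedelta(days=365)        # assumed approximation of 12 months
SCREEN_LOOKBACK = timedelta(days=548)     # assumed approximation of 18 months


@dataclass
class PatientRecord:
    """Hypothetical flattened EHR record used only for this sketch."""
    age_at_index: int
    index_visit: date
    first_hba1c_after_index: Optional[date]
    outpatient_rbg_dates: List[date] = field(default_factory=list)
    prior_dysglycemia: bool = False   # prior HbA1c >= 5.7%, fasting glucose >= 100 mg/dL, or diabetes ICD code
    last_screening_test_before_index: Optional[date] = None
    pregnancy_code_2011_2014: bool = False


def eligible_for_derivation(p: PatientRecord) -> bool:
    # Inclusion: age 18-64 years and an index visit inside the study window.
    if not 18 <= p.age_at_index <= 64:
        return False
    if not WINDOW_START <= p.index_visit <= WINDOW_END:
        return False
    # Inclusion: a resulted HbA1c after the index visit.
    hba1c = p.first_hba1c_after_index
    if hba1c is None or hba1c <= p.index_visit:
        return False
    # Inclusion: at least one outpatient RBG in the 12 months before the HbA1c
    # (same-day values are allowed here, which is an assumption).
    if not any(hba1c - RBG_LOOKBACK <= d <= hba1c for d in p.outpatient_rbg_dates):
        return False
    # Exclusions: known dysglycemia, recent screening test, pregnancy codes.
    if p.prior_dysglycemia or p.pregnancy_code_2011_2014:
        return False
    last = p.last_screening_test_before_index
    if last is not None and p.index_visit - SCREEN_LOOKBACK <= last < p.index_visit:
        return False
    return True
```

In practice the month-based windows would be computed with calendar arithmetic rather than fixed day counts.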
Validation Cohort
The validation cohort included participants in a prospective diabetes screening study (December 2015 through June 2017) at Parkland. The prospective validation cohort was recruited from the same clinical setting as the derivation sample. However, while patients in the retrospective derivation sample were screened in clinical practice, participants in the screening study were seen in clinical practice but not screened by their health care team. Eligible participants were primary care patients ages 18–64 years with a completed primary care encounter in the past 18 months and one or more RBG values resulted in EHR in the past 12 months. We excluded those with diagnosed dysglycemia using problem lists, medical history, ICD 9/10 encounter codes, and laboratory results (HbA1c ≥5.7% or fasting glucose ≥100 mg/dL). To create a cohort eligible for screening, we also excluded those with a resulted gold-standard diabetes screening test (HbA1c, fasting glucose, or oral glucose tolerance test) in the past 2 years, regardless of test results. Eligible participants were identified from clinical practice using a quarterly EHR data query and invited to participate in the screening study via a mailed opt-out invitation followed by a phone call from research staff within 2 weeks. All communication was in English or Spanish per patient preference. We confirmed eligibility criteria by phone and, using patient-reported data, applied additional exclusion criteria: diagnosis or treatment of cancer in the past 2 years, current use of steroid medication, pregnancy in the past 3 years, or current breastfeeding. These additional patient-reported exclusion criteria were included to address limitations of EHR data and create a cohort representative of patients who would likely be screened in clinical practice. After completing verbal consent via telephone, participants completed an HbA1c test at their primary care clinic. Participants received a $25 gift card after completing their laboratory visit. This study was approved by the University of Texas Southwestern Medical Center Institutional Review Board, Dallas, TX.
Outcomes
The primary outcome was dysglycemia (prediabetes + type 2 diabetes) defined by HbA1c ≥5.7%. We used HbA1c to define our primary outcome because it is the most ordered diabetes screening test in clinical practice (6) and has the lowest coefficient of variation of recommended screening tests (23), and we were unable to confirm fasting status in the retrospective EHR data. In the derivation cohort, glycemic status was classified using the first HbA1c after the index visit. In the validation cohort, HbA1c was collected within 2 weeks of enrollment in the screening study.
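The outcome definition can be written as a small classifier. The sketch below is illustrative: the 5.7% dysglycemia threshold comes from the text, while the 6.5% cut point separating prediabetes from diabetes is the standard ADA HbA1c diagnostic threshold and is assumed here, since this section does not state it. Function names are hypothetical.

```python
DYSGLYCEMIA_THRESHOLD = 5.7  # HbA1c %, from the outcome definition in the text
DIABETES_THRESHOLD = 6.5     # HbA1c %, standard ADA cut point (assumed, not stated here)


def classify_hba1c(hba1c_percent: float) -> str:
    """Return 'normal', 'prediabetes', or 'diabetes' for an HbA1c value in percent."""
    if hba1c_percent >= DIABETES_THRESHOLD:
        return "diabetes"
    if hba1c_percent >= DYSGLYCEMIA_THRESHOLD:
        return "prediabetes"
    return "normal"


def has_dysglycemia(hba1c_percent: float) -> bool:
    """Primary outcome: prediabetes or diabetes, that is, HbA1c >= 5.7%."""
    return hba1c_percent >= DYSGLYCEMIA_THRESHOLD
```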
Candidate Predictors
Candidate predictors were selected based on clinical relevance and consistent, reliable data capture in structured EHR fields. RBG, defined as a nonfasting serum or plasma glucose value, was included as a candidate predictor because it is strongly associated with undiagnosed diabetes (22). Demographics (age, sex, race), BMI, hypertension, and family history of diabetes were extracted from EHR data collected in routine clinical care. Hypertension was defined using encounter codes from the problem list, medical history, and billing codes in EHR. Family history was extracted from the structured family history table in EHR, and the absence of documentation was considered a negative family history.
Model Derivation and D-RISK Development
We constructed bivariable and multivariable logistic regression models to predict dysglycemia and diabetes. We assessed bivariate relationships between each candidate predictor and dysglycemia using a prespecified significance threshold of P < 0.20. Significant bivariate predictors were then entered into multivariable logistic regression models using stepwise backward selection with a prespecified significance threshold of P < 0.10. Model performance was examined using area under the curve (AUC), which denotes the probability that a randomly selected individual with the outcome will have a higher predicted probability than a randomly selected individual without the outcome (24).
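The pairwise interpretation of AUC given above can be checked directly on a small sample. The sketch below is illustrative only: the function name `auc_pairwise` and the toy data are assumptions, and counting ties as one-half is a common convention that the text does not specify.

```python
from typing import Sequence


def auc_pairwise(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the share of (case, non-case) pairs in which the case scores higher.

    labels: 1 = has the outcome, 0 = does not. Ties count as 0.5 (assumed convention).
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for n in controls:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Toy example: three cases (scores 12, 9, 5) and two non-cases (scores 9, 7).
example_auc = auc_pairwise([12, 9, 9, 7, 5], [1, 1, 0, 0, 1])
```

This quadratic loop is fine for illustration; rank-based formulas give the same value more efficiently on large cohorts.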
To facilitate use in clinical practice and promote automation of the model in EHR, we derived D-RISK from the final multivariable model predicting dysglycemia. Using the Framingham approach (25), we created clinically meaningful categories for each predictor, set the lowest category as the reference for each risk factor, and converted the regression coefficients to integer points. Specifically, for each variable, the reference category is assigned a score of zero. Each risk point represents a 0.3 increase in log odds (or a 1.35-fold change in odds) compared with the reference category, calculated at the representative values and rounded to the nearest integer. Additional details are presented in Table 2.
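The conversion can be summarized in a formula. The notation is introduced here for illustration and is not the authors' own: β is the regression coefficient of a continuous predictor, x_k is the representative value of category k, and x_ref is the representative value of the reference category.

```latex
\[
\text{points}_k
  = \operatorname{round}\!\left(\frac{\beta\,\bigl(x_k - x_{\mathrm{ref}}\bigr)}{0.3}\right),
\qquad
e^{0.3} \approx 1.35 .
\]
```

For a categorical predictor with its own coefficient β_k relative to the reference level, the same rule reduces to round(β_k / 0.3).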
Model Validation
We temporally validated (26) D-RISK using data from the prospective diabetes screening study. We report overall model discrimination using AUC and examine sensitivity, specificity, and positive and negative predictive values across a range of clinically relevant point values. We examined calibration by comparing observed to predicted probabilities of dysglycemia by quintiles of predicted risk and using the Hosmer-Lemeshow goodness-of-fit test.
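As an illustration of the threshold-based metrics listed above, the following sketch computes sensitivity, specificity, and predictive values when a score at or above a cut point is treated as a positive screen. The function name and the toy data are hypothetical, and returning `nan` for an undefined ratio is a design choice made for this sketch.

```python
from typing import Dict, Sequence


def diagnostic_metrics(scores: Sequence[float], labels: Sequence[int],
                       cut_point: float) -> Dict[str, float]:
    """Metrics for the rule 'screen if score >= cut_point'; labels: 1 = dysglycemia."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        flagged = score >= cut_point
        if flagged and label == 1:
            tp += 1
        elif flagged and label == 0:
            fp += 1
        elif not flagged and label == 0:
            tn += 1
        else:
            fn += 1

    def ratio(num: int, den: int) -> float:
        # Undefined ratios (empty denominator) are reported as nan.
        return num / den if den else float("nan")

    return {
        "screening_recommended": ratio(tp + fp, tp + fp + tn + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Toy data: six patients scored on a points scale, cut point of 9.
toy = diagnostic_metrics([10, 9, 8, 12, 5, 9], [1, 1, 0, 1, 0, 0], cut_point=9)
```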
Comparison With Other Risk Assessment Strategies Used in Screening
To understand the performance of D-RISK in the context of national screening guidelines recommended for clinical practice and established risk scores, we compared D-RISK performance with the 2022 ADA screening guideline (1), the 2021 USPSTF screening guideline (2), and the ADA risk test (27,28) in the validation sample. Structured EHR data were used to assess USPSTF screening criteria. For the ADA guideline and the ADA risk test, structured EHR data were combined with patient-reported information on risk factors collected in the screening study that were not routinely captured in structured EHR data (physical activity, first-degree family history, and gestational diabetes). Analyses were conducted with SAS Version 9.4 and R Version 4.4.0 statistical software.
Results
The retrospective EHR derivation sample contained 11,387 patients without diagnosed dysglycemia. The mean age was 48.0 years; 65% were female, 42% Hispanic, 36% non-Hispanic Black, 16% non-Hispanic White, 5% Asian, and 1% other race. The mean BMI was 32 kg/m2, with 83% being overweight or obese. Nearly 80% were uninsured and covered by Dallas County’s indigent health care program. During a mean follow-up period of 26.4 months, 42.5% of patients had a test result in the dysglycemia range (34.3% prediabetes; 8.2% diabetes) according to HbA1c tests collected in routine clinical practice. Those with newly diagnosed dysglycemia were older, had higher BMIs, and had a greater burden of hypertension, hyperlipidemia, and first-degree family history of diabetes. Non-Hispanic Black patients were more likely to have newly diagnosed dysglycemia, while non-Hispanic White patients were less likely (Table 1).
Table 1. Patient demographics and diabetes risk factors in the derivation and validation cohorts
Characteristic | Retrospective derivation cohort (n = 11,387) | Prospective validation cohort (n = 519) |
---|---|---|
Mean (SD) age, years | 48.0 (10.6) | 47.6 (9.9) |
Female, % | 65.4 | 69.7 |
Race and ethnicity, % | ||
Non-Hispanic White | 15.7 | 8.3 |
Non-Hispanic Black | 36.1 | 22.5 |
Hispanic | 42.1 | 68.4 |
Asian | 4.7 | 0.8 |
Other | 1.3 | 0 |
Mean (SD) BMI, kg/m2 | 31.8 (7.7) | 30.4 (6.6) |
BMI categorical, % | ||
<25 kg/m2 | 16.8 | 16.8 |
25–29.9 kg/m2 | 30.5 | 37.6 |
≥30 kg/m2 | 52.7 | 45.7 |
Insurance, % | ||
Uninsured/charity care | 79.9 | 83.0 |
Commercial | 11.8 | 5.2 |
Medicaid/Medicare | 8.3 | 11.8 |
First-degree family history of diabetes, % | 32.9 | 37.4 |
Hypertension, % | 28.8 | 35.6 |
Hyperlipidemia, % | 21.8 | 24.9 |
Cardiovascular disease, % | 6.7 | 6.2 |
Mean (SD) most recent RBG, mg/dL | 101 (29.1) | 101 (18.0) |
Model Derivation and D-RISK
Candidate predictors with P < 0.20 (age, race and ethnicity, BMI, diagnosed hypertension, and RBG) in bivariate models predicting dysglycemia were retained in the multivariable regression model (Table 2). Sex did not meet prespecified criteria (P = 0.88) and was not included in the multivariable model. First-degree family history of diabetes (P < 0.001) met criteria for inclusion in the multivariable model. However, it was excluded from the final version of D-RISK because its contribution to the risk score was negligible (0.37 points, which rounds to zero). AUC for detection of dysglycemia in the final regression model was 0.75 (95% CI 0.74–0.76). Age, BMI, and RBG required multiple categories to capture the gradient of risk and were assigned point scores as shown in Table 2.
Table 2. Multivariable risk model and corresponding D-RISK points in the derivation cohort (n = 11,387)
Risk factor | Representative value | β-Coefficient | Points assigned |
---|---|---|---|
Age, years | | | |
18–24 | 21.5 | 0.043 | 0 |
25–34 | 30 | | 1 |
35–44 | 40 | | 3 |
45–54 | 50 | | 4 |
55–65 | 60 | | 6 |
Race and ethnicity | | | |
Non-Hispanic White | | Reference | 0 |
Hispanic | | 0.673 | 2 |
Non-Hispanic Black | | 0.986 | 3 |
Asian | | 0.969 | 3 |
Other/unknown | | 0.806 | 3 |
BMI, kg/m2 | | | |
<24.9 | 21.8 | 0.059 | 0 |
25–29.9 | 27.5 | | 1 |
30–34.9 | 32.5 | | 2 |
≥35 | 45.6 | | 5 |
Hypertension | | | |
No | | Reference | 0 |
Yes | | 0.285 | 1 |
Most recent random glucose value, mg/dL | | | |
<100 | 85 | 0.030 | 0 |
100–109 | 105 | | 2 |
110–119 | 115 | | 3 |
120–129 | 125 | | 4 |
130–139 | 135 | | 5 |
140–149 | 145 | | 6 |
150–159 | 155 | | 7 |
160–169 | 165 | | 8 |
170–179 | 175 | | 9 |
180–189 | 185 | | 10 |
190–199 | 195 | | 11 |
≥200 | 214 | | 13 |
First-degree family history of diabetes | | | |
No | | Reference | 0 |
Yes | | 0.112 | 0 |
Representative values: we use the median value within each category as its representative value when assessing the difference in risk between categories. β-Coefficient: regression coefficients (log odds ratio) in the logistic model where age, BMI, and glucose were included as continuous variables.
Points assigned: for each variable, the reference category is assigned a score of zero. Each point represents a 0.3 increase in log odds (or a 1.35-fold change in odds) compared with the reference category, calculated at the representative values and subjected to rounding. For example, the “one” point assigned to age category “25–34” is obtained by rounding [(30−21.5)*0.043/0.3].
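The point assignments in Table 2 can be reproduced from the representative values and β-coefficients using the rule described in the footnote. The sketch below is a check of that arithmetic, not the authors' code: the helper names are hypothetical, and half-up rounding is assumed (none of the values here falls exactly on a half, so the rounding convention does not affect the result).

```python
import math
from typing import Dict, List

POINT_UNIT = 0.3  # one point = 0.3 increase in log odds, as stated in the footnote


def round_half_up(x: float) -> int:
    # Assumed rounding convention; avoids Python's round-half-to-even behavior.
    return math.floor(x + 0.5)


def continuous_points(beta: float, reference: float, values: List[float]) -> List[int]:
    """Points for each category of a continuous predictor, relative to the reference."""
    return [round_half_up(beta * (v - reference) / POINT_UNIT) for v in values]


def categorical_points(betas: Dict[str, float]) -> Dict[str, int]:
    """Points for each non-reference level of a categorical predictor."""
    return {level: round_half_up(b / POINT_UNIT) for level, b in betas.items()}


# Representative values and coefficients copied from Table 2.
age_points = continuous_points(0.043, 21.5, [21.5, 30, 40, 50, 60])
bmi_points = continuous_points(0.059, 21.8, [21.8, 27.5, 32.5, 45.6])
rbg_points = continuous_points(
    0.030, 85, [85, 105, 115, 125, 135, 145, 155, 165, 175, 185, 195, 214]
)
other_points = categorical_points({
    "Hispanic": 0.673,
    "Non-Hispanic Black": 0.986,
    "Asian": 0.969,
    "Other/unknown": 0.806,
    "Hypertension": 0.285,
    "Family history": 0.112,
})
```

The family history coefficient corresponds to 0.112/0.3 ≈ 0.37 points, which rounds to zero.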
D-RISK Validation and Performance
D-RISK was validated in a prospective screening study of patients who were recruited from the same clinics as the derivation cohort but had not been screened for diabetes by their health care team in the past 2 years. Of 895 patients enrolled in the screening study, 58% (n = 519) completed the diabetes screening test, and there were no significant differences between those who did and did not complete the screening test (Supplementary Table 1). The prevalence of undiagnosed dysglycemia in the validation screening cohort was 33% (30% prediabetes; 3% diabetes), which was lower than the dysglycemia prevalence (42.5%) in the derivation sample of patients screened in clinical practice over a 26.4-month follow-up period. By design, the validation sample was a “screening eligible” cohort of patients seen in clinical practice but not previously screened for diabetes. Overall, the derivation and validation samples were similar except that the validation cohort had a higher proportion of Hispanic patients (Table 1). In the validation sample, the AUC for RBG alone to detect cases of undiagnosed dysglycemia was 0.60 (95% CI 0.55–0.65). An RBG testing threshold of ≥100 mg/dL would recommend screening for 46% of the sample. Combining nonglycemic risk factors available in EHR with RBG into D-RISK improved model discrimination to detect unrecognized dysglycemia (AUC 0.71; 95% CI 0.66–0.75). Discrimination of D-RISK for detection of unrecognized dysglycemia was similar in the prospective validation cohort (0.71; 95% CI 0.66–0.75) and the retrospective derivation cohort (0.74; 95% CI 0.74–0.76) (Table 3). For detection of undiagnosed dysglycemia in the validation cohort, the D-RISK calibration plot demonstrated good discrimination between the lowest and highest quintiles of risk and was well calibrated at the lower three quintiles of risk but overestimated risk in the top two quintiles (Fig. 1). To detect undiagnosed type 2 diabetes, the AUC for D-RISK was 0.87 (95% CI 0.85–0.88).
Table 3. Performance of D-RISK, ADA diabetes screening guideline, USPSTF diabetes screening guideline, and ADA risk test
Model | Screening recommended, % | Sensitivity | Specificity | PPV | NPV | AUC (95% CI) |
---|---|---|---|---|---|---|
Derivation data set: retrospective EHR data (n = 11,387) | ||||||
D-RISK points | ||||||
7 | 82 | 0.94 | 0.26 | 0.48 | 0.85 | 0.74 (0.74–0.76) |
8 | 72 | 0.87 | 0.40 | 0.52 | 0.81 | |
9 | 61 | 0.80 | 0.53 | 0.56 | 0.78 | |
10 | 50 | 0.70 | 0.64 | 0.59 | 0.74 | |
11 | 40 | 0.58 | 0.74 | 0.62 | 0.71 | |
Validation data set: prospective diabetes screening study (n = 519) | ||||||
D-RISK points | ||||||
7 | 80 | 0.91 | 0.26 | 0.38 | 0.86 | 0.71 (0.66–0.75) |
8 | 68 | 0.85 | 0.40 | 0.41 | 0.84 | |
9 | 56 | 0.75 | 0.53 | 0.44 | 0.81 | |
10 | 49 | 0.65 | 0.59 | 0.44 | 0.77 | |
11 | 40 | 0.57 | 0.69 | 0.48 | 0.76 | |
RBG ≥100 mg/dL | 46 | 0.54 | 0.59 | 0.40 | 0.72 | 0.60 (0.55–0.65) |
2022 ADA guideline | 97 | 0.99 | 0.04 | 0.34 | 0.88 | 0.52 (0.50–0.53) |
2021 USPSTF guideline | 76 | 0.87 | 0.30 | 0.38 | 0.58 | 0.58 (0.55–0.62) |
ADA risk test | ||||||
4 | 64 | 0.79 | 0.43 | 0.41 | 0.81 | 0.67 (0.62–0.72) |
5 | 44 | 0.61 | 0.65 | 0.47 | 0.77 | |
6 | 26 | 0.40 | 0.82 | 0.53 | 0.73 |
Diagnostic performance of D-RISK across a range of clinically relevant cut points (7–11 points) in the derivation and validation data sets is shown in Table 3. The proportion of patients identified as high risk and the sensitivity and specificity of D-RISK at different screening thresholds were similar in the derivation and validation cohorts (Table 3). We selected a cut point of nine or more points to identify individuals as high risk for having undiagnosed dysglycemia based on trade-offs in the sensitivity, specificity, and proportion of patients identified for screening. At a cut point of nine or more points, D-RISK recommended screening for 56% of the validation sample and had a sensitivity of 75%, specificity of 53%, positive predictive value (PPV) of 44%, and negative predictive value (NPV) of 81% (Table 3).
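For illustration, the point assignments in Table 2 and the nine-point cut point can be expressed as a small scoring function. This is a minimal sketch, not the authors' implementation: the function and argument names are hypothetical, the race and ethnicity labels are assumed strings, and the boundary handling (treating BMI below 25 as the lowest category and rejecting ages outside 18–65) is an assumption.

```python
RACE_ETHNICITY_POINTS = {
    "non_hispanic_white": 0,
    "hispanic": 2,
    "non_hispanic_black": 3,
    "asian": 3,
    "other_unknown": 3,
}
CUT_POINT = 9  # cut point selected in the text


def d_risk_points(age_years: float, race_ethnicity: str, bmi: float,
                  hypertension: bool, rbg_mg_dl: float) -> int:
    """Sum D-RISK points from Table 2 for one patient."""
    if not 18 <= age_years <= 65:
        raise ValueError("age outside the 18-65 range covered by Table 2")
    if age_years < 25:
        age_pts = 0
    elif age_years < 35:
        age_pts = 1
    elif age_years < 45:
        age_pts = 3
    elif age_years < 55:
        age_pts = 4
    else:
        age_pts = 6

    if bmi < 25:
        bmi_pts = 0
    elif bmi < 30:
        bmi_pts = 1
    elif bmi < 35:
        bmi_pts = 2
    else:
        bmi_pts = 5

    if rbg_mg_dl < 100:
        rbg_pts = 0
    elif rbg_mg_dl >= 200:
        rbg_pts = 13
    else:
        # 100-109 -> 2, 110-119 -> 3, ..., 190-199 -> 11
        rbg_pts = int(rbg_mg_dl // 10) - 8

    return (age_pts + RACE_ETHNICITY_POINTS[race_ethnicity] + bmi_pts
            + (1 if hypertension else 0) + rbg_pts)


def screening_recommended(points: int, cut_point: int = CUT_POINT) -> bool:
    return points >= cut_point


# Hypothetical patient: age 52, Hispanic, BMI 31, hypertension, most recent RBG 104 mg/dL.
example_points = d_risk_points(52, "hispanic", 31.0, True, 104)  # 4 + 2 + 2 + 1 + 2
```

Keeping the cut point as a parameter mirrors the intent that health systems can choose a threshold suited to their population and resources.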
Comparison of D-RISK With ADA and USPSTF Screening Guidelines
D-RISK (AUC = 0.71) performed better at detecting undiagnosed dysglycemia than the 2021 USPSTF guideline (AUC = 0.58) and the 2022 ADA screening guideline (AUC = 0.52) (P < 0.001 for both). At the selected cut point of nine points, D-RISK identified 56% of the sample for screening, which was less than the 2022 ADA (97%) and 2021 USPSTF (76%) guidelines, while maintaining adequate sensitivity (75%) and much higher specificity (53%) (Table 3). D-RISK and the ADA risk test had similar discrimination to detect undiagnosed dysglycemia (0.71 [0.66–0.75] vs. 0.67 [0.62–0.72], respectively). However, compared with a D-RISK cut point of nine points, the ADA risk test, at its recommended cut point of five points, recommended screening fewer individuals (44% vs. 56%) and had a lower sensitivity for detecting dysglycemia (61% vs. 75%) (Table 3).
Conclusions
We derived and validated D-RISK, a pragmatic EHR-driven risk score for use by clinicians and health systems to detect patients with prevalent, undiagnosed dysglycemia using EHR data collected in routine clinical practice. D-RISK has better discrimination than the ADA and USPSTF diabetes screening guidelines and allows users to select screening thresholds based on population risk and resource availability. D-RISK performed similarly to the ADA risk test, but by using commonly available structured EHR data, D-RISK is easily automated within EHR without requiring additional data collection from patients, clinicians, or clinic staff. D-RISK may be useful for facilitating system-level risk assessment and help close screening gaps by supplementing opportunistic screening in clinical practice to improve early identification and treatment of patients with unrecognized prediabetes and type 2 diabetes.
D-RISK harnesses nondiagnostic RBG values, which are opportunistically obtained in clinical practice and routinely available in EHR (29). Although RBG is commonly included in laboratory panels, it is often overlooked because guidance for the interpretation of random glucose values is lacking except for diagnosing type 2 diabetes when an RBG of ≥200 mg/dL is associated with hyperglycemic symptoms (30). RBG is strongly associated with diabetes risk (22,31); provides a robust case-finding strategy to detect prevalent, undiagnosed type 2 diabetes (32); and is a stronger predictor of future type 2 diabetes than demographics and cardiovascular risk factors (33). While using only RBG to detect dysglycemia underperforms in both the general U.S. population (32) and our validation cohort, we found that combining RBG with nonglycemic risk factors routinely available in EHR significantly improves detection of unrecognized dysglycemia.
Overall, D-RISK performed similarly to the ADA risk test (28) in detecting prevalent, undiagnosed dysglycemia in our validation cohort. However, at the recommended cut point, the ADA risk test was less sensitive at detecting undiagnosed dysglycemia than D-RISK. We used gold standard, patient-reported family history and physical activity data collected in our screening study to calculate the ADA risk test, because of missingness in the EHR data. The ADA risk test, which is designed as a patient-facing instrument, requires information such as physical activity and first-degree family history that is often missing, lacks sufficient detail, or is not captured in structured data fields in EHR. Thus, the ability to automate the ADA risk test in EHR and operationalize it in clinical practice without additional data collection by clinic staff, clinicians, or patient self-report is limited. Our D-RISK tool addresses these barriers by using routinely available, structured EHR data, and may perform better in real-world settings where data needed for the ADA risk test are not consistently available.
D-RISK performed better than the ADA and USPSTF diabetes screening guidelines at identifying undiagnosed dysglycemia in our health care system. Although the ADA and USPSTF screening guidelines are commonly used to identify patients at high risk for dysglycemia, their application in clinical practice to efficiently target screening to high-risk patients is suboptimal (6). A key strength of the risk score compared with screening guidelines is that it allows users to select a screening threshold based on the underlying dysglycemia risk in the target population and acceptable trade-offs in sensitivity and specificity within the clinical setting. For example, the ADA and USPSTF diabetes screening guidelines are highly sensitive and recommend screening 97% and 76% of the validation cohort, respectively. In contrast, D-RISK at a testing threshold of nine points retains a sensitivity of 75% while screening only 56% of the validation cohort. In resource-limited settings, the ability to select testing thresholds based upon available resources and the underlying prevalence in the target population may be particularly important and help optimize screening practices.
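The dependence of screening yield on underlying prevalence can be made concrete with Bayes' rule. The sketch below is illustrative (the function name is hypothetical): it takes the sensitivity and specificity reported for the nine-point cut point in the validation cohort and the 33% prevalence, and recovers the proportion screened and the predictive values reported in Table 3.

```python
from typing import Dict


def predictive_values(sensitivity: float, specificity: float,
                      prevalence: float) -> Dict[str, float]:
    """Expected proportion screened, PPV, and NPV for a given prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    return {
        "screened": true_pos + false_pos,
        "ppv": true_pos / (true_pos + false_pos),
        "npv": true_neg / (true_neg + false_neg),
    }


# D-RISK at nine or more points in the validation cohort (Table 3).
at_33_percent = predictive_values(sensitivity=0.75, specificity=0.53, prevalence=0.33)

# Same test characteristics in a hypothetical lower-prevalence population.
at_20_percent = predictive_values(sensitivity=0.75, specificity=0.53, prevalence=0.20)
```

With the same sensitivity and specificity, a lower-prevalence population yields a lower PPV and a higher NPV, which is one reason a health system might choose a different cut point.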
D-RISK was designed for automation and scale such that it can be embedded in EHR to facilitate both clinic-based practice and population health risk assessment within health systems. To support practicing clinicians, the D-RISK tool can be automated using dot phrases in EHR, included in note templates, or programmed into clinical decision support to drive visit-based best practice alerts to increase risk awareness (34). Additionally, integration of D-RISK into patient portals may prompt patients to request diabetes screening from their clinical team. At a population health level, D-RISK can be automated in clinical decision support to populate EHR registries with high-risk patients to facilitate outreach and screening to patients outside of clinical encounters (34). By designing D-RISK to be automated in EHR, our approach helps address the challenge of operationalizing existing risk scores in clinical practice (16). The reach of D-RISK is enhanced by relying on RBG rather than fasting glucose or HbA1c, which are frequently included in similar models (25,35,36). Additionally, by focusing on detecting dysglycemia instead of diabetes alone, our risk score is particularly salient given the growing epidemic of prediabetes (3), the opportunity to deliver evidence-based lifestyle change programs (12,37), and the emergence of GLP1 receptor agonists to promote weight loss and reduce cardiometabolic risk (38). These features position D-RISK for future evaluation, impact, and implementation studies to surveil diabetes risk and promote early detection of dysglycemia.
Our study has several notable strengths. First, we designed D-RISK using commonly available, structured EHR data that maximize its usability in clinical practice and for population health. Second, we temporally validated D-RISK in a prospective diabetes screening study of active primary care patients unscreened for diabetes to demonstrate its validity and impact in clinical practice. D-RISK demonstrated robust performance in the prospective validation cohort, which reflects the true target population for D-RISK in clinical practice. Third, we used HbA1c as the diagnostic standard because it is the most frequently used diabetes screening and diagnostic test in clinical practice. Our study is not without limitations. First, we were unable to identify all fasting glucose values because fasting status is not reliably captured in EHR data. However, random glucose reflects early signs of glycemic dysregulation, with postprandial glucose rising before fasting glucose in those with declining β-cell function (39). Second, we used different look-back periods in the retrospective derivation cohort and the prospective screening study to determine screening eligibility. Because our goal was to capture an “active” patient population, we used a shorter look-back period in the retrospective data than in the prospective data because of our inability to confirm patient engagement with the health system in the retrospective data. Importantly, the model performed similarly in both the derivation and validation cohorts, which is an indication of its robustness. Third, the external validity and generalizability of D-RISK to non–safety-net populations representative of the broader population of patients engaged in health care and populations with race and ethnicity compositions different from the validation sample are unknown. Further studies examining external validation and generalizability are needed.
D-RISK provides a novel scalable, actionable approach to diabetes risk assessment using commonly available EHR data and is designed for use in clinical practice and by health systems to detect individuals at risk for undiagnosed dysglycemia. Although improvements in screening have led to decreasing rates of undiagnosed diabetes in the U.S. (40), screening gaps remain in clinical practice. With systematic screening programs demonstrating cost savings to health systems, novel scalable approaches to diabetes screening are needed (13). As a strategy to close diabetes screening gaps, D-RISK may provide a useful adjunct to routine screening in clinical practice by supporting population health screening strategies to improve detection of undiagnosed dysglycemia within health systems.
Article Information
Acknowledgments. The authors acknowledge and thank additional contributors to this study, including Emily Marks, Bryan Elwood, Adam Loewen, Joanne Sanders, Claudia Chavez, Molly F. McGuire, and Michael Dang, all from the University of Texas Southwestern Medical Center, Dallas, TX. The authors thank the Global Diabetes Program, Ambulatory and Population Medicine at Parkland Health, and Medical Informatics programs at Parkland Health for their partnership in this work.
Funding. Research reported in this publication was supported by the National Institute of Diabetes and Digestive and Kidney Diseases at the National Institutes of Health under award number K23DK104065.
The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Duality of Interest. No potential conflicts of interest relevant to this article were reported.
Author Contributions. I.L., L.M., B.M., N.O.S., S.Z., and E.A.H. were involved in the acquisition of funding, conception, study design, and interpretation of data. All authors have reviewed, contributed to, and issued final approval of the version of the work submitted. M.E.B. is the guarantor of this work and, as such, had full access to all the data in the study and takes responsibility for the integrity of the data and accuracy of the data analysis.
Prior Presentation. Preliminary versions of this work were presented at the 79th Scientific Sessions of the American Diabetes Association, San Francisco, CA, 7–11 June 2019.
Handling Editors. The journal editors responsible for overseeing the review of the manuscript were Cheryl A.M. Anderson and Meghana D. Gadgil.
This article contains supplementary material online at https://doi.org/10.2337/figshare.28082933.
This article is featured in a podcast available at diabetesjournals.org/care/pages/diabetes_care_on_air.
References
See accompanying article, p. 682.