In Japan, two-thirds of the patients who screened positive for diabetes reportedly failed to attend a follow-up visit for diabetes care. We aimed to develop a machine-learning model for predicting failure to attend a follow-up visit.
We conducted a retrospective cohort study of adults with newly screened diabetes in a national screening program using a large Japanese insurance claims database (JMDC, Tokyo, Japan). We defined failure to attend a follow-up visit for diabetes care as no physician consultation during the 6 months after the screening. The candidate predictors were patient demographics, comorbidities, and medication history. In the training set (a randomly selected 80% of the sample), we developed two models (a previously reported logistic regression model and a Lasso regression model). In the test set (the remaining 20%), prediction performance was examined.
We identified 10,645 patients, including 5,450 patients who failed to attend follow-up visits for diabetes care. The Lasso regression model using four predictors had a better discrimination ability than the previously reported logistic regression model using 13 predictors (C-statistic: 0.71 [95% CI 0.69–0.73] vs. 0.67 [0.65–0.69]; P < 0.001). The four selected predictors in the Lasso regression model were lower frequency of physician visits in the previous year, lower HbA1c levels, and negative history of antidyslipidemic or antihypertensive treatment.
The developed machine-learning model using four predictors had a good predictive ability to identify patients who failed to attend a follow-up visit for diabetes care after a screening program.
Introduction
Diabetes poses a burden on health care and the economic, social, and communal lives of patients, and patients need community-based support to live normally (1). Medical intervention immediately after the diagnosis of diabetes is important to achieve a good patient prognosis (2). However, individuals who are suspected of having diabetes at medical checkups frequently fail to attend follow-up visits for diabetes care. Indeed, in Japan, 65% of patients with a positive fasting blood glucose test or glycated hemoglobin (HbA1c) ≥6.5% (HbA1c ≥48 mmol/mol) failed to attend a follow-up visit for diabetes care (3). In addition, as reported in the U.K., 30% of patients invited for screening failed to undergo the first screening test, and 10% failed to undergo the second screening test (4).
Although factors for discontinuation after the initiation of diabetes care have been studied (5), little is known about the factors related to failure to attend a follow-up medical visit for newly screened diabetes after a checkup. A previous study using data from 2005 to 2010 with a moderate sample size (n = 3,878) showed that younger age and lower blood glucose levels were associated with the failure to attend a follow-up visit in patients with newly screened hyperglycemia (3). Because the diagnostic criteria for diabetes changed in 2010 (6), investigating the factors related to failure to attend a follow-up visit based on the recommendations at checkups is important to develop an optimal strategy for patients with newly screened diabetes. Accumulated evidence has emphasized the importance of strict interventions in patients with diabetes (7). Therefore, the development of a predictive model for failure to attend a follow-up visit for newly screened diabetes at checkups should attract the attention of policymakers and payers because of the expectation that interventions targeting this population may be important in reducing the diabetes-related health care burden.
In this context, we aimed to explore the factors associated with failure to attend a follow-up visit, and to develop a machine-learning model for predicting it, among patients who screened positive for diabetes at annual checkups. We used a claims database that also contains laboratory values and lifestyle questionnaire responses from the checkups initiated by the Japanese government for the prevention of lifestyle-related diseases.
Research Design and Methods
Study Design and Data Source
This was a retrospective cohort study using a commercially available database, the JMDC claims database (Tokyo, Japan), the details of which have been previously described (8). Briefly, the JMDC database contains claims data of ∼10 million insured individuals as of 2020, most of whom are company employees and their families. Diagnoses were recorded using the ICD-10, and drug specifications were based on World Health Organization Anatomical Therapeutic Chemical Classification System (WHO-ATC) codes.
The database additionally contains information on annual checkups for lifestyle-related diseases (9), which covers 4.7 million insured individuals. Employers have been legally required to provide full-time employees with annual checkups, and in 2008 the Japanese government initiated universal lifestyle-related disease screening throughout Japan, which focuses on metabolic syndrome to prevent lifestyle-related diseases. Under this scheme, insurers are obliged to provide universal lifestyle-related disease screening. When individuals are found to have abnormalities at the time of screening, they are encouraged to receive advice from a public health nurse or consult a physician for disease management. Individuals in a hyperglycemic state during a checkup, where the threshold is set as fasting blood glucose ≥126 mg/dL or HbA1c ≥6.5% (≥48 mmol/mol), are advised to consult a doctor (9). The JMDC database contains the following results of the checkups: blood pressure recordings; clinical laboratory tests, such as complete blood count, blood glucose, biochemistry, and urinary dipstick test; and questionnaires on lifestyle factors, such as smoking and alcohol habits.
Inclusion and Exclusion Criteria
Based on previous studies (3,4), we included individuals who met both of the following criteria: 1) age ≥40 years and 2) fasting blood glucose ≥126 mg/dL and HbA1c ≥6.5% (≥48 mmol/mol) at the baseline checkup between April 2007 and October 2019, with no missing values for any of these variables. The combination of the two glycemic marker levels satisfied the diagnostic criteria for diabetes (10). For each patient, we set the baseline period as the time between 1 year prior to the checkup and the day before the checkup. We required that there be no diabetes-related claims during the year prior to the checkup, as described previously (3). Therefore, all included individuals were considered to be “patients with newly diagnosed diabetes.” Follow-up visits for diabetes were defined as having any claims related to diabetes, including HbA1c or glycated albumin tests and prescriptions for diabetes. We excluded individuals who were censored, that is, those who did not have insurance coverage throughout the observation period.
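The inclusion and exclusion criteria above can be sketched as a single eligibility check. This is a minimal illustration only: the `Checkup` record and its field names are hypothetical and do not reflect the actual JMDC schema or the authors' Stata code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Checkup:
    """One baseline checkup record; all field names are hypothetical."""
    age_years: int
    fasting_glucose_mg_dl: Optional[float]  # None when the value is missing
    hba1c_percent: Optional[float]          # None when the value is missing
    diabetes_claims_prior_year: int         # diabetes-related claims in the baseline year
    insured_through_follow_up: bool         # False if censored during the observation period


def is_eligible(c: Checkup) -> bool:
    """Apply the inclusion and exclusion criteria described in the text."""
    if c.fasting_glucose_mg_dl is None or c.hba1c_percent is None:
        return False  # both glycemic markers must be recorded
    return (
        c.age_years >= 40
        and c.fasting_glucose_mg_dl >= 126
        and c.hba1c_percent >= 6.5
        and c.diabetes_claims_prior_year == 0
        and c.insured_through_follow_up
    )
```

Note that both glycemic thresholds must be met, which is stricter than the either-or threshold used to recommend a physician consultation at the checkup.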
The study protocol was approved by the Institutional Review Board of the Graduate School of Medicine of The University of Tokyo. The requirement for informed consent was waived due to the anonymous nature of the data.
Variables and Candidate Predictors
The following information was obtained from the database: sex, age, BMI, waist circumference, blood pressure, serum γ-glutamyl transferase, creatinine, uric acid levels, HbA1c levels, and urinary protein/glucosuria dipstick results. We also obtained results from the checkup of the previous year for glycemic markers, such as fasting blood glucose or HbA1c levels. We classified BMI into four categories as follows: underweight (<18.5 kg/m²), normal weight (18.5–24.9 kg/m²), overweight (25.0–29.9 kg/m²), and obesity (≥30.0 kg/m²). Binary variables were used to indicate medication prescriptions for dyslipidemia (WHO-ATC codes starting with C10), hypertension (WHO-ATC codes starting with C02, C03, C04, C07, C08, or C09), hyperuricemia (WHO-ATC codes starting with M04), and depression (WHO-ATC codes starting with N06A or N06BA) in the claims data during the previous 12 months. We also set a binary variable indicating whether the individual was an insured employee. Blood pressure was classified into four categories based on the definition of the Japanese Society of Hypertension: normal or grade 1–3 hypertension (11). We also obtained responses to questionnaires on lifestyle factors (smoking and alcohol habits), medical history of stroke or ischemic heart disease, health behavior changes based on the transtheoretical model, and willingness to receive health instruction from a public health nurse.
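The BMI categories and the ATC-based medication indicators described above could be derived roughly as follows. The function and dictionary names are hypothetical, and treating the BMI bands as half-open intervals (so that a value such as 24.95 falls into "normal weight") is an assumption, since the text lists the bands only to one decimal place.

```python
# ATC code prefixes taken from the text; the dictionary name is hypothetical.
ATC_PREFIXES = {
    "dyslipidemia": ("C10",),
    "hypertension": ("C02", "C03", "C04", "C07", "C08", "C09"),
    "hyperuricemia": ("M04",),
    "depression": ("N06A", "N06BA"),
}


def bmi_category(bmi: float) -> str:
    """Map BMI (kg/m^2) to the four categories used in the study."""
    if bmi < 18.5:
        return "underweight"
    if bmi < 25.0:
        return "normal weight"
    if bmi < 30.0:
        return "overweight"
    return "obesity"


def medication_flags(atc_codes: list[str]) -> dict[str, bool]:
    """Return one binary indicator per drug class for a patient's prescriptions."""
    return {
        name: any(code.startswith(prefixes) for code in atc_codes)
        for name, prefixes in ATC_PREFIXES.items()
    }
```

For example, a patient whose claims in the previous 12 months include the ATC codes `C10AA05` and `N06AB06` would be flagged for dyslipidemia and depression treatment only.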
Candidate risk factors were selected based on previous studies (3,4) and clinical relevance. Specifically, the candidate risk factors included patient age; sex; BMI; smoking habits or alcohol intake; prescription histories of antihypertensive, antidyslipidemic, antihyperuricemic, or antidepressive agents in the previous 12 months; glycosuria (negative or positive) or urinary protein (negative or trace/+1 or worse) results from a urinary dipstick test; hemoglobin values; and whether the patient was insured or a dependent (3,4). We also included the following variables that were clinically important in a previous study that examined the discontinuation of follow-up visits in patients with diabetes: the number of months in which an individual used a medical facility in the 12 months prior to the checkup, health behavior changes based on the transtheoretical model, and willingness to receive health instruction from a public health nurse (12). In addition, we added the following items related to severity or comorbidity: estimated glomerular filtration rate, fatty liver index (13), and whether a patient was recommended to consult a doctor for hyperglycemia (fasting blood glucose ≥126 mg/dL or HbA1c ≥6.5% [≥48 mmol/mol]) in the previous year.
Study Outcomes
The primary outcome was the failure to attend a follow-up visit based on recommendations at the checkups, which was defined as a person failing to attend a follow-up visit within 6 months after undergoing a checkup in which the laboratory data satisfied the diagnosis of diabetes, despite being recommended to consult a physician. We defined a visit to a physician during the 6 months as any of the following: a recorded physician consultation with a registered diagnosis of diabetes (ICD-10 codes: E10, E11, E12, E13, or E14); measurement of glycemic markers, such as HbA1c or glycated albumin; or prescription of antihyperglycemics.
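The outcome definition above can be expressed as a small function over a patient's claims. This is a sketch under stated assumptions: the claim dictionary keys (`date`, `icd10`, `glycemic_test`, `antihyperglycemic_rx`) are hypothetical, and the 6-month window is approximated as 183 days because the text does not say how months were counted.

```python
from datetime import date, timedelta

DIABETES_ICD10_PREFIXES = ("E10", "E11", "E12", "E13", "E14")
FOLLOW_UP_WINDOW = timedelta(days=183)  # assumption: "6 months" taken as 183 days


def is_diabetes_claim(claim: dict) -> bool:
    """True if the claim meets any of the three criteria for a diabetes-related visit."""
    return (
        claim.get("icd10", "").startswith(DIABETES_ICD10_PREFIXES)
        or claim.get("glycemic_test", False)          # HbA1c or glycated albumin measured
        or claim.get("antihyperglycemic_rx", False)   # antihyperglycemic prescribed
    )


def failed_to_attend(checkup_date: date, claims: list[dict]) -> bool:
    """True if no diabetes-related claim falls within the window after the checkup."""
    deadline = checkup_date + FOLLOW_UP_WINDOW
    return not any(
        checkup_date < claim["date"] <= deadline and is_diabetes_claim(claim)
        for claim in claims
    )
```

Changing `FOLLOW_UP_WINDOW` to roughly 3 or 9 months would correspond to the alternative observation periods used in the sensitivity analyses.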
Statistical Analysis
We summarized the background characteristics of the eligible patients according to the failure to attend a follow-up visit. Categorical variables were compared using the χ² test, and continuous variables were compared using the Wilcoxon rank-sum test.
Identifying Risk Factors and Developing a Prediction Model
We classified the study patients into a training set consisting of randomly selected 80% of the total participants and a test set comprising the remaining participants (20%).
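A random 80/20 split of this kind can be sketched as follows. The function name and the fixed seed are illustrative assumptions; the text does not report how the random selection was seeded or implemented.

```python
import random


def split_train_test(ids, train_fraction=0.8, seed=0):
    """Randomly partition patient identifiers into training and test sets."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]
```

With the 10,645 eligible patients reported in this study, such a split would yield 8,516 patients for training and 2,129 for testing.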
In the training set, we developed a reference model and a machine-learning model to predict the probability of failing to attend a follow-up visit. For the reference model, we fitted a previously reported logistic regression model (without penalization), including all 13 candidate predictors used in the previous study (3). We calculated odds ratios (ORs) and 95% CIs for failure to attend a follow-up visit using all of these variables, in which the BMI was treated as a continuous variable. For the machine-learning model, we fitted a logistic regression with Lasso regularization (Lasso regression) among all of the candidate variables (14). Lasso regression is an efficient model that both shrinks and selects regression coefficients (14,15) and allows for better interpretability of the model. The optimal value for the hyperparameter λ was selected to maximize the percentage of correctly identified case/control subjects with 10-fold cross-validation on the training set; the remaining 20% of the data were held out to test the predictive performance of the model. We applied the largest λ whose cross-validation error was within 1 SE of the minimum (the 1-SE rule). Lasso regression with the 1-SE rule can yield a more parsimonious model than the model using the λ that minimizes the cross-validation error (16,17). Based on the Lasso regression model, we identified factors related to failure to attend a follow-up visit. In addition, we fitted a logistic regression model using the identified factors for postestimation (i.e., to calculate the OR and 95% CI of each factor for the failure to attend a follow-up visit).
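The 1-SE rule for choosing λ can be illustrated with a short sketch. It assumes that cross-validation has already produced, for each candidate λ, a mean error and its SE; the function name and the input format are hypothetical, and the error is treated generically as a quantity to be minimized.

```python
def select_lambda_one_se(cv_results):
    """Pick lambda by the 1-SE rule.

    cv_results: list of (lam, mean_cv_error, se) tuples, one per candidate lambda.
    Returns (lambda_cv, lambda_se): the error-minimizing lambda and the largest
    lambda whose mean error is within 1 SE of that minimum.
    """
    best_lam, best_err, best_se = min(cv_results, key=lambda r: r[1])
    threshold = best_err + best_se
    candidates = [lam for lam, err, _ in cv_results if err <= threshold]
    return best_lam, max(candidates)


# Illustrative grid: the error and SE values are invented for this example.
cv_grid = [
    (0.001, 0.230, 0.004),
    (0.0025, 0.228, 0.004),
    (0.01, 0.229, 0.004),
    (0.028, 0.2315, 0.004),
    (0.05, 0.240, 0.004),
]
lambda_cv, lambda_se = select_lambda_one_se(cv_grid)
```

Because a larger λ penalizes coefficients more heavily, the λ chosen by the 1-SE rule retains fewer predictors than the error-minimizing λ, which is the source of the parsimony described above.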
In the test set (the remaining 20% of the sample), we measured the predictive performance using C-statistics (area under the receiver operating characteristic [ROC] curve). The C-statistics of the models were compared using the DeLong test. We also obtained the values of the Bayesian information criterion for each model.
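For readers unfamiliar with the C-statistic, the following sketch computes it directly from its definition: the proportion of case-control pairs in which the case receives the higher predicted probability, with ties counted as one half. It is a plain pairwise implementation for illustration, not the routine used in the study.

```python
def c_statistic(y_true, scores):
    """Area under the ROC curve via pairwise comparison of cases and controls."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0   # case ranked above control
            elif p == n:
                concordant += 0.5   # tie counts as half
    return concordant / (len(pos) * len(neg))
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of those who did and did not attend a follow-up visit.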
We performed an analysis stratified by the severity of diabetes because severity may affect the likelihood of failing to attend a follow-up visit, as in a previous study (3). We set the HbA1c cutoff at 8.0% (64 mmol/mol) because this value is a less stringent target for most patients with diabetes, as defined in the guidelines (10,18). We also performed another analysis stratified by whether the patient received antihypertensive, antidyslipidemic, or antihyperuricemic agents in the previous 12 months.
As sensitivity analyses, we also explored factors related to the failure to attend a follow-up visit based on recommendations in checkups within 3 and 9 months after the checkup.
Statistical significance was set at P < 0.05. All analyses were performed using Stata version 17 software (StataCorp, College Station, TX).
Results
Study Samples
Among the patients whose HbA1c and fasting blood glucose levels were available from checkups from April 2007 to October 2019, we identified 11,023 patients who tested positive for diabetes at the checkups but had no history of treatment or diagnosis in the previous year. We subsequently excluded 378 patients who were censored during the first 6 months after the checkups, and the remaining 10,645 patients were eligible for the analysis.
Overall, 5,450 (51.2%) patients failed to attend a follow-up visit based on recommendations at the checkups within 6 months (Table 1). Briefly, those who failed to attend a follow-up visit for newly screened diabetes were more likely to be male, younger, smokers, frequent drinkers, and insured people. The results of their checkup showed a higher proportion of people with negative results for glucosuria or urinary protein and lower HbA1c values. In addition, they were less likely to have a prescription history of antihypertensive, antidyslipidemic, antihyperuricemic, or antidepressive agents and to be frequent visitors of medical facilities in the previous 12 months.
Table 1. Characteristics of eligible patients who attended and failed to attend a follow-up visit for diabetes care
Variable and category | Those attending a follow-up visit (N = 5,195) | Those not attending a follow-up visit (N = 5,450) | P value |
---|---|---|---|
Sex, male | 4,125 (79.4) | 4,452 (81.7) | <0.001 |
Age (years), median (IQR) | 53.0 (48.0, 58.0) | 51.0 (46.0, 57.0) | <0.001 |
BMI (kg/m²) | | | 0.092 |
<18.5 | 51 (1.0) | 35 (0.6) | |
18.5–24.9 | 1,763 (33.9) | 1,851 (34.0) | |
25.0–29.9 | 2,282 (43.9) | 2,310 (42.4) | |
≥30.0 | 1,099 (21.2) | 1,254 (23.0) | |
Waist circumference (cm) | | | 0.94 |
M: <85, F: <90 | 1,460 (28.1) | 1,535 (28.2) | |
M: ≥85, F: ≥90 | 3,735 (71.9) | 3,915 (71.8) | |
Blood pressure | | | 0.12 |
Normal | 3,258 (62.7) | 3,338 (61.2) | |
Grade 1 hypertension | 1,354 (26.1) | 1,420 (26.1) | |
Grade 2 hypertension | 435 (8.4) | 512 (9.4) | |
Grade 3 hypertension | 148 (2.8) | 180 (3.3) | |
Smoking history | | | <0.001 |
Nonsmoker | 3,433 (66.1) | 3,218 (59.0) | |
Smoker | 1,762 (33.9) | 2,232 (41.0) | |
Alcohol intake frequency | | | 0.002 |
Rarely | 2,179 (41.9) | 2,103 (38.6) | |
Occasionally | 1,639 (31.5) | 1,798 (33.0) | |
Regularly | 1,377 (26.5) | 1,549 (28.4) | |
Positive history of cardiovascular disease | 224 (4.3) | 112 (2.1) | <0.001 |
Willingness to receive health instruction from public health nurses | | | 0.005 |
Yes | 1,767 (34.0) | 1,716 (31.5) | |
Stage in change model for lifestyle modifications | | | <0.001 |
Precontemplation | 812 (15.6) | 959 (17.6) | |
Contemplation | 1,986 (38.2) | 2,166 (39.7) | |
Preparation | 941 (18.1) | 910 (16.7) | |
Action | 545 (10.5) | 582 (10.7) | |
Maintenance | 911 (17.5) | 833 (15.3) | |
Restfulness from sleep | | | 0.84 |
No | 3,023 (58.2) | 3,161 (58.0) | |
Yes | 2,172 (41.8) | 2,289 (42.0) | |
Positive proteinuria | 672 (12.9) | 598 (11.0) | 0.002 |
Positive glucosuria | 1,630 (31.4) | 1,439 (26.4) | <0.001 |
Fasting blood glucose (mg/dL), median (IQR) | 145.0 (134.0, 176.0) | 141.0 (132.0, 164.0) | <0.001 |
HbA1c (%), median (IQR) | 7.3 (6.8, 8.5) | 7.1 (6.7, 8.0) | <0.001 |
HbA1c (mmol/mol), median (IQR) | 56 (50, 69) | 54 (49, 63) | <0.001 |
Triglycerides (mg/dL) | | | 0.51 |
<150 | 2,675 (51.5) | 2,745 (50.4) | |
150–299 | 1,862 (35.8) | 1,998 (36.7) | |
≥300 | 658 (12.7) | 707 (13.0) | |
LDL-cholesterol (mg/dL) | | | <0.001 |
<120 | 1,620 (31.2) | 1,449 (26.6) | |
120–139 | 1,224 (23.6) | 1,207 (22.1) | |
≥140 | 2,351 (45.3) | 2,794 (51.3) | |
HDL-cholesterol (mg/dL) | | | 0.27 |
<40 | 741 (14.3) | 819 (15.0) | |
≥40 | 4,454 (85.7) | 4,631 (85.0) | |
Hemoglobin (g/dL), median (IQR) | 15.4 (14.6, 16.2) | 15.5 (14.7, 16.3) | <0.001 |
Serum uric acid (mg/dL), median (IQR) | 5.6 (4.7, 6.6) | 5.8 (4.9, 6.7) | <0.001 |
Estimated glomerular filtration rate (mL/min/1.73 m²), median (IQR) | 78.7 (68.9, 90.0) | 79.4 (70.1, 90.7) | <0.001 |
Metabolic syndrome | 3,697 (71.2) | 3,868 (71.0) | 0.83 |
Fatty liver index, median (IQR) | 91.7 (10.3, 99.9) | 93.3 (13.3, 99.9) | 0.15 |
Insured person | | | <0.001 |
Identical person | 4,060 (78.2) | 4,460 (81.8) | |
Dependent | 1,135 (21.8) | 990 (18.2) | |
Antihypertension prescription | 1,578 (30.4) | 667 (12.2) | <0.001 |
Antidyslipidemic prescription | 1,018 (19.6) | 335 (6.1) | <0.001 |
Antihyperuricemic prescription | 417 (8.0) | 178 (3.3) | <0.001 |
Antidepressive prescription | 188 (3.6) | 107 (2.0) | <0.001 |
Frequency of physician visits in the previous year (months/year), median (IQR) | 5.0 (1.0, 10.0) | 2.0 (0.0, 5.0) | <0.001 |
Recommended to consult a doctor for hyperglycemia at the last checkup | 4,526 (87.1) | 4,737 (86.9) | 0.75 |
Data are N (%) unless otherwise stated.
F, female; IQR, interquartile range; M, male.
Factors Associated With the Failure to Attend a Follow-up Visit for Diabetes Care
The process of variable selection using the hyperparameter λ and the corresponding values of the mean squared error with each SE bar is shown in Fig. 1A. We first obtained λCV, with which the cross-validation function was minimized. Subsequently, we obtained λSE, the largest λ within 1 SE of the lowest mean squared error. Using λSE, we constructed the Lasso regression model for the failure to attend a follow-up visit, and four variables were selected (Fig. 1B). The most important factor for the failure to attend a follow-up visit for diabetes care was a lower frequency of physician visits in the previous year (OR 1.15 per 1-month decrease in months with a physician visit during the previous 12 months [95% CI 1.13–1.16]), followed by lower HbA1c levels (OR 1.24 per 1% [11 mmol/mol] decrease [95% CI 1.20–1.27]), negative history of receiving an antidyslipidemic prescription (OR 1.86 [95% CI 1.58–2.19]), and negative history of receiving an antihypertensive prescription (OR 1.37 [95% CI 1.19–1.57]) (Fig. 2). For the previously reported logistic regression model using 13 variables, the coefficients of the multivariable regression are shown in Supplementary Table 1.
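Because the ORs for the continuous predictors are reported per unit, the following sketch shows how a per-unit OR scales to a larger change. It assumes the log-odds are linear in the predictor, as in a standard logistic regression; the function name is hypothetical, and the resulting values are illustrative arithmetic rather than estimates reported in the study.

```python
import math


def odds_ratio_for_change(or_per_unit: float, units: float) -> float:
    """Scale a per-unit odds ratio to a change of `units`, assuming log-linearity."""
    return math.exp(units * math.log(or_per_unit))


# Per-unit ORs taken from the text.
or_hba1c_2pct_lower = odds_ratio_for_change(1.24, 2)      # HbA1c lower by 2 percentage points
or_visits_3_fewer = odds_ratio_for_change(1.15, 3)        # 3 fewer months with a physician visit
```

Under this assumption, an HbA1c level that is 2 percentage points lower corresponds to an OR of about 1.54, and 3 fewer months with a physician visit to an OR of about 1.52.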
Figure 1. Cross-validation plot and variable selection process in the Lasso regression model. A: Cross-validation plot of mean squared error corresponding to smoothing parameter λ with SEs. B: The paths of the coefficients along with smoothing parameter λ and standardized coefficients of variables. λCV, λ for which the cross-validation function is minimum (λ = 0.0025); λSE, the largest λ for which the cross-validation function is within 1 SE of the minimum of the cross-validation function (λ = 0.028).
Figure 2. Variable importance derived from machine-learning prediction models for failure to attend a follow-up visit for diabetes care and ORs of failure to attend a follow-up visit for diabetes care. A: The variable importance is a measure scaled to have a maximum value of 100. B: We calculated ORs of failure to attend a follow-up visit for diabetes care. Points and error bars indicate the OR and 95% CI of variables, respectively.
Predictive Ability of Developed Models
The Lasso regression model using four predictors had a better discrimination ability than the previously reported logistic regression model using 13 predictors (C-statistic: 0.71 [95% CI 0.69–0.73] vs. 0.67 [95% CI 0.65–0.69], respectively; P < 0.001) (Fig. 3).
Figure 3. Predictive performances of the Lasso regression and previously reported logistic regression models for failure to attend a follow-up visit for diabetes care within 6 months after being recommended to consult a physician. The ROC curves for predicting failure to attend a follow-up visit are shown. The Lasso regression model had a better discrimination ability (C-statistic: 0.71, Lasso regression model, vs. 0.67, previously reported logistic regression model).
Stratified Analyses
When stratified by HbA1c values (<8.0% and ≥8.0% [<64 mmol/mol and ≥64 mmol/mol]), the results were similar to those of the primary analysis across the groups. Among those with HbA1c <8.0% (<64 mmol/mol) (N = 7,485), 4,035 patients (53.9%) failed to attend a follow-up visit. Among those with HbA1c ≥8.0% (≥64 mmol/mol) (N = 3,160), 1,415 patients (44.8%) failed to attend a follow-up visit. Additionally, the stratified analyses yielded a similar pattern of variable importance in the Lasso regression model (Supplementary Fig. 1). The Lasso regression model identified three factors in patients with HbA1c of 6.5–7.9% (48–63 mmol/mol) and four factors in patients with HbA1c ≥8.0% (≥64 mmol/mol). HbA1c level was selected as an important variable in patients with HbA1c ≥8.0% (≥64 mmol/mol) but not in patients with HbA1c <8.0% (<64 mmol/mol). The three variables common to both stratified groups were a lower frequency of physician visits in the previous year, negative history of receiving an antihypertensive prescription, and negative history of receiving an antidyslipidemic prescription. Supplementary Fig. 2 shows the ROC curves for each group, and the results were similar to those of the main analysis.
Among those receiving antihypertensive, antidyslipidemic, or antihyperuricemic agents (N = 2,896), 871 patients (30.1%) failed to attend a follow-up visit. Among those who did not receive any of these drugs (N = 7,749), 4,579 (59.1%) failed to attend a follow-up visit. These stratified analyses yielded a lower ability to predict failure (Supplementary Fig. 3). Among those receiving any of these drugs, the Lasso regression model using one variable (frequency of physician visits in the previous year) had a comparable discrimination ability with the previously reported logistic regression model using the 13 predictors (C-statistic: 0.60 [95% CI 0.55–0.65] vs. 0.61 [95% CI 0.57–0.66], respectively; P = 0.71). Among those who did not receive these drugs, the Lasso regression model using two variables (frequency of physician visits in the previous year and HbA1c values) had a better discrimination ability than the previously reported logistic regression model using the 13 predictors (C-statistic: 0.66 [95% CI 0.63–0.68] vs. 0.61 [95% CI 0.58–0.64]; P < 0.001).
Sensitivity Analyses
Supplementary Fig. 4 shows the importance of variables in the failure to attend a follow-up visit during observation periods of 3 or 9 months after the checkup. In the models with observation periods of 3 months and 9 months, 64.3% and 45.0% failed to attend a follow-up visit, respectively. In both settings, the Lasso regression model selected the same four variables as the main analysis: lower frequency of physician visits in the previous year, lower HbA1c levels, negative history of receiving an antidyslipidemic prescription, and negative history of receiving an antihypertensive prescription. The ROC curves comparing the predictive ability of the developed models in both settings were also comparable to those of the main analysis (Supplementary Fig. 5).
Conclusions
In this analysis of 10,645 patients with newly diagnosed diabetes in an administrative database, the Lasso regression model using only four variables had a better discrimination ability to predict failure to attend a follow-up visit based on recommendations from a screening program, compared with the previously reported logistic regression model that used 13 variables. The important factors associated with the failure to attend a follow-up visit were lower frequency of physician visits in the previous year, followed by lower HbA1c levels, negative history of receiving an antidyslipidemic prescription, and negative history of receiving an antihypertensive prescription.
A lower frequency of physician visits in the previous year and a negative history of receiving antihypertensive or antidyslipidemic agents were associated with the failure to attend a follow-up visit. One possible reason is that patients treated for other diseases, especially diseases necessitating frequent follow-ups, are likely to have more opportunities to undergo an intervention based on the results of the checkups. Furthermore, patients treated for hypertension or dyslipidemia may already be aware that it is necessary to treat lifestyle-related diseases, or their primary physicians may help them understand the necessity. Another possible reason is that physicians who prescribe antidyslipidemic medication may be primary care physicians, diabetologists, or endocrinologists, who are specialized in diabetes care.
An implication of this study is that people who rarely use medical facilities are at a high risk of not initiating diabetes care. This was supported by the stratified analyses, in which none of the models had a good predictive ability among those receiving antihypertensive, antidyslipidemic, or antihyperuricemic agents. To prevent failure to attend a follow-up visit, whether screened individuals have recently used medical services should be considered. Patients with newly screened diabetes and no history of recent medical facility visits should be nudged to undergo a follow-up visit. Although nudging has been recognized as important (19), no studies have examined effective methods of nudging such patients; however, current evidence suggests that a multicomponent approach is promising (19). To improve public health and social welfare, we should also consider supporting or giving incentives to employers who encourage or nudge their employees with newly screened diabetes to consult a physician.
We found a high proportion of patients who failed to attend a follow-up visit for diabetes care after checkup (53.2%), which was smaller than that in a previous study (65.2%) (3). The differences may be attributable to the different inclusion criteria between studies or the difference in recognition of the annual checkup system. While the previous study included patients with either elevated fasting blood glucose or elevated HbA1c values, our study included patients who had both elevated fasting blood glucose and elevated HbA1c values, which are diagnostic criteria for diabetes. Given that the severity of diabetes affects the initiation of diabetes care (20), it is understandable that patients satisfying both were likely to consult a doctor.
Our results should be interpreted in the context of the public health and health care system of Japan. The American Diabetes Association recommends community screening only after establishment of an appropriate referral system before testing (21), and the Japanese legally obliged checkup system complies with this requirement; the system provides employees with consultation to public health nurses or physicians based on checkup results (3). However, considering that nearly half of those recommended to see a physician failed to attend a follow-up visit in this study, some measures are warranted to improve the current situation. For example, in addition to nudging people with newly screened diabetes who rarely use a medical facility, it is necessary to consider implementing advocacy-oriented policy against diabetes in workplaces or among employers.
Our study has certain strengths and implications. First, we provided clinical information on a nationwide level for predicting failure to attend a follow-up visit immediately after the diagnosis of diabetes, which is a clinically and socially challenging problem to be resolved. Second, our results could facilitate the implementation of community-based policymaking because policymakers can focus on a smaller number of factors. Third, focusing on a limited number of variables was made possible by the use of Lasso regression with the 1-SE rule. At the same time, the model using this small number of variables maintained the ability to detect patients failing to attend a follow-up visit for diabetes care, with discrimination no worse than that of the conventional logistic regression model using 13 variables. Thus, we found a useful application of this parsimonious method in clinical decision-making.
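The 1-SE rule referred to above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this study: the function name `one_se_lambda`, the penalty grid, and the per-fold cross-validation errors are all hypothetical. Among the candidate penalty values, the rule selects the largest one (i.e., the sparsest model) whose mean cross-validation error is within one standard error of the minimum.

```python
from math import sqrt
from statistics import mean, stdev


def one_se_lambda(lambdas, fold_errors):
    """Select a Lasso penalty using the 1-SE rule.

    lambdas     : candidate penalty values
    fold_errors : one list of per-fold CV errors for each penalty value
    Returns (lambda_min, lambda_1se).
    """
    means = [mean(errs) for errs in fold_errors]
    # Standard error of the mean CV error across folds.
    ses = [stdev(errs) / sqrt(len(errs)) for errs in fold_errors]

    # Penalty with the lowest mean CV error.
    best = min(range(len(lambdas)), key=lambda i: means[i])
    threshold = means[best] + ses[best]

    # Largest penalty whose mean error does not exceed the threshold.
    eligible = [i for i in range(len(lambdas)) if means[i] <= threshold]
    chosen = max(eligible, key=lambda i: lambdas[i])
    return lambdas[best], lambdas[chosen]


# Hypothetical penalty grid and 5-fold CV errors (e.g., deviance).
lambdas = [0.001, 0.01, 0.05, 0.1]
fold_errors = [
    [0.60, 0.62, 0.61, 0.63, 0.59],       # mean 0.610
    [0.59, 0.61, 0.60, 0.62, 0.58],       # mean 0.600 (minimum)
    [0.595, 0.615, 0.605, 0.625, 0.585],  # mean 0.605, within 1 SE
    [0.64, 0.66, 0.65, 0.67, 0.63],       # mean 0.650, too high
]

lambda_min, lambda_1se = one_se_lambda(lambdas, fold_errors)
print(lambda_min, lambda_1se)  # 0.01 0.05
```

Because a larger penalty shrinks more coefficients to zero, choosing the penalty at one standard error rather than at the minimum trades a small, statistically indistinguishable loss in fit for a model with fewer predictors.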
We acknowledge four potential limitations of this study. First, while we selected patients with newly diagnosed diabetes by excluding patients who had received any diabetes-related care in the previous 12 months, some of the included patients might have been diagnosed with diabetes before. Second, although we identified 39 candidate factors to be included in the models, there may also be other factors affecting the failure to attend a follow-up visit. For instance, we did not obtain information on education level, occupation, or socioeconomic status, which were not recorded in the database. Third, the effectiveness of screening for diabetes in terms of short-term health outcomes is not obvious (22). Lastly, our findings may not be generalizable to different settings. With a different medical insurance system or different cultural backgrounds, the predictive model in our study may not be completely applicable to patients urged to consult a physician for diabetes care. However, the variables used in this study are commonly available in other countries, and the predictive model may therefore be applicable to some extent. A previous study in the U.K. showed that lower blood glucose levels and negative prescription histories were predictors of failure to attend a follow-up visit, as in our study (4).
In conclusion, this retrospective cohort study using a large-scale administrative claims database revealed that a lower frequency of physician visits in the previous year, negative history of receiving antihypertensive or antidyslipidemic agents, and lower HbA1c levels are important factors for failure to attend a follow-up visit for diabetes care. In particular, the information on medical facility utilization is important in predicting failure. The predictive model using only four factors may be beneficial for predicting failure to attend a follow-up visit for diabetes care. This finding may be useful in policymaking against lifestyle-related diseases.
Y.H. and T.G. contributed equally to this article as co-second authors.
This article contains supplementary material online at https://doi.org/10.2337/figshare.19333490.
Article Information
Funding. This work was supported by grants from the Ministry of Health, Labour and Welfare, Japan (21AA2007) and the Ministry of Education, Culture, Sports, Science and Technology, Japan (20K18957, 20H03907, and 21H03159). This work was also supported by a junior scientist development grant from the Japan Diabetes Society to A.O.
Duality of Interest. A.O., S.Y., K.I.K., and T.K. are members of the Department of Prevention of Diabetes and Lifestyle-Related Diseases, which is a cooperative program between the University of Tokyo and Asahi Mutual Life Insurance Company. K.I.K. was employed by the Asahi Mutual Life Insurance Company. T.G. is employed by TXP Medical Co., Ltd. S.O. is a member of the Department of Eat-loss Medicine, which is a cooperative program between the University of Tokyo and ITO EN Ltd. No other potential conflicts of interest relevant to this article were reported.
Author Contributions. A.O., Y.H., T.G., S.Y., H.Y., and T.K. conceived and designed the study, performed the statistical analysis, and wrote, edited, and reviewed the manuscript. S.O., K.I.K., M.N., and T.Y. contributed to the discussion and interpretation of the data and edited and reviewed the manuscript. All authors have approved the final manuscript for publication. T.K. is the guarantor of this work and, as such, had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.