OBJECTIVE

Several diabetes clinical practice guidelines suggest that treatment goals may be modified in older adults on the basis of comorbidities, complications, and life expectancy. The long-term benefits of treatment intensification may not outweigh short-term risks for patients with limited life expectancy. Because of the uncertainty of determining life expectancy for individual patients, we sought to develop and validate prognostic indices for mortality in older adults with diabetes.

RESEARCH DESIGN AND METHODS

We used a prevalence sample of veterans with diabetes who were aged ≥65 years on 1 January 2006 (N = 275,190). Administrative data were queried for potential predictors that included patient demographics, comorbidities, procedure codes, laboratory values and anthropometric measurements, medication history, and previous health service utilization. Logistic least absolute shrinkage and selection operator regressions were used to identify variables independently associated with mortality. The resulting odds ratios were then weighted to create prognostic indices of mortality over 5 and 10 years.

RESULTS

Thirty-seven predictors of mortality were identified: 4 demographic variables, prescriptions for insulin or sulfonylureas or blood pressure medications, 6 biomarkers, previous outpatient and inpatient utilization, and 22 comorbidities/procedures. The prognostic indices showed good discrimination, with C-statistics of 0.74 and 0.76 for 5- and 10-year mortality, respectively. The indices also demonstrated excellent agreement between observed outcome and predictions, with calibration slopes of 1.01 for both 5- and 10-year mortality.

CONCLUSIONS

Prognostic indices obtained from administrative data can predict 5- and 10-year mortality in older adults with diabetes. Such a tool may enable clinicians and patients to develop individualized treatment goals that balance risks and benefits of treatment intensification.

Diabetes increases the risk of macrovascular and microvascular complications and mortality. Lowering hemoglobin A1c (A1C) reduces microvascular disease risks over time but has unclear effects on cardiovascular disease and mortality, particularly in older adults (1). There is uncertainty about when it is appropriate to pursue more intensive glucose control, particularly in patients with comorbidities and/or complications that shorten life expectancy (2,3). Indeed, treatment intensification increases near-term risks such as hypoglycemia (4,5), which may independently lead to increased mortality risk (6,7). The potential benefits of targeting a lower A1C may take several years to accrue (8,9), so intensification may carry burdens without benefit for those with limited life expectancy.

Several clinical practice guidelines suggest that A1C targets should account for a patient’s comorbid illnesses, diabetes complications, and life expectancy and that clinicians can propose lower A1C goals for patients with longer life expectancy and higher targets for those with shorter life expectancy. The Veterans Health Administration (VA)/Department of Defense Clinical Practice Guideline (10) proposes several A1C target ranges on the basis of comorbidities, complications, and life expectancy (e.g., <5 years, 5–10 years, and >10 years). However, these and other guidelines provide limited direction on how to classify patients by life expectancy groups. This issue is compounded by a scarcity of easily accessible tools to help clinicians to assess life expectancy in patients with diabetes and incorporate this information into shared decision making and goal setting with patients.

A systematic review by Yourman et al. (11) identified six prognostic indices for mortality in community-dwelling older adults that included variables for age, sex, and major comorbidities. Some indices incorporate additional variables for clinical measures (e.g., BMI, blood pressure, A1C), medication use, and survey measures for self-reported health and functional status. However, the utility of this prior work is limited by follow-up times as short as 1 year (12), low predictive validity (13), or requirements for extensive surveys that are impractical for point-of-care use (14,15). The objective of the current study was to develop and validate prognostic indices for 5- and 10-year mortality in older adults with diabetes that 1) use only administrative data that are routinely collected in most electronic health records, 2) leverage machine learning techniques that produce outputs that are interpretable to the lay reader, and 3) provide an adaptable and transparent methodology that can be deployed in a wide variety of clinical settings.

Sample Selection

The study was reviewed and approved by the institutional review board of the VA Boston Healthcare System. Our data sources included VA and Medicare administrative data from 2004 to 2015. We used a nationwide prevalence sample of patients with diabetes who were aged ≥65 years and enrolled in VA care on 1 January 2006. This date was selected to provide 2 years preceding the observation period to develop risk prediction models and 10 years of follow-up to assess mortality. Patients were classified as having diabetes if they met at least one of the following criteria during 2004–2005: two ICD-9 diagnosis codes for diabetes from outpatient visits within a 12-month period, one ICD-9 diagnosis code for diabetes during an inpatient stay, or a prescription for a medication to treat diabetes (excluding metformin only) (16). Patients were also required to have at least one outpatient primary care visit and at least one record each for three routine clinical parameters (blood pressure, BMI, A1C) within the previous 24 months. Our objective was to identify the VA-reliant population to ensure fairly complete predictors for each patient. Medicare provides health insurance coverage for older adults (≥65 years) in the U.S., and veterans often use services provided by both VA and Medicare. The sample was further limited to patients enrolled in traditional fee-for-service Medicare rather than Medicare Advantage to ensure more complete utilization and diagnosis data. Medicare Advantage is a private plan alternative to traditional Medicare; however, only limited claims data are available for its enrollees. Consequently, complete predictors cannot be obtained for Medicare Advantage enrollees. The final sample comprised 275,190 unique patients. A sample selection flowchart is presented in Supplementary Fig. 1.
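The diabetes classification rule described above can be sketched as a small predicate. This is an illustration only: the `PatientRecord` layout and its field names are hypothetical and do not reflect the VA data schema, the 12-month window is approximated as 365 days, and "excluding metformin only" is read as meaning that metformin alone does not qualify.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class PatientRecord:
    # Hypothetical layout; all dates are assumed to fall in the 2004-2005 window.
    outpatient_dm_dx_dates: List[date] = field(default_factory=list)
    inpatient_dm_dx_dates: List[date] = field(default_factory=list)
    # Diabetes drug classes prescribed, e.g. "insulin" or "metformin".
    dm_drug_classes: List[str] = field(default_factory=list)


def two_codes_within_12_months(dates: List[date]) -> bool:
    """True if any two outpatient diagnosis dates are at most 365 days apart."""
    ordered = sorted(dates)
    # If any pair is close enough, some adjacent pair in sorted order is too.
    return any((b - a).days <= 365 for a, b in zip(ordered, ordered[1:]))


def has_diabetes(rec: PatientRecord) -> bool:
    """Apply the three alternative criteria; meeting any one is sufficient."""
    outpatient = two_codes_within_12_months(rec.outpatient_dm_dx_dates)
    inpatient = len(rec.inpatient_dm_dx_dates) >= 1
    # A prescription counts unless metformin is the only class prescribed.
    classes = set(rec.dm_drug_classes)
    medication = bool(classes) and classes != {"metformin"}
    return outpatient or inpatient or medication
```

Whether two codes recorded on the same day should count as two outpatient visits is not stated in the text; the sketch accepts them.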

Initial Variable Identification

We included a wide variety of predictors that have been linked to mortality in patients with diabetes or older adults. These included demographics (age, sex, marital status), VA priority groups (which reflect disability related to military service or economic hardship) (17), major comorbidities (18), smoking (19), prescriptions for different classes of diabetes medications (20), frailty (14,21), previous inpatient and outpatient health services utilization (15), and biomarkers such as A1C, BMI, systolic and diastolic blood pressure, HDL and LDL cholesterol, triglycerides, serum creatinine, urine microalbumin/creatinine ratio, and serum albumin (20,22–24).

Age was categorized into six groups: 65–69, 70–74, 75–79, 80–84, 85–89, and 90+ years. Marital status included single (never married or divorced), married, or widowed. The VA classifies veterans into eight priority groups. Groups 1 and 4 constitute those with serious service-related disabilities (≥50% disability or housebound); groups 2, 3, and 6 are those with low or moderate disabilities; group 5 comprises those with economic hardships; and groups 7 and 8 have no service-related disabilities and household incomes above certain thresholds (25). Sex was considered binary (male/female), while race was categorized into white, black, or other.

Binary variables for the following diabetes medication classes were included: sulfonylureas, meglitinides, metformin, thiazolidinediones, α-glucosidase inhibitors, and insulin. Models also included a binary variable indicating blood pressure treatment (e.g., β-blockers, calcium channel blockers, antihypertensive combinations). Comorbidities included the 31 Quan-Elixhauser comorbidities (26) as well as end-stage liver disease, major depression, coronary artery disease, acute myocardial infarction, percutaneous coronary interventions (27), retinopathy, hyperglycemia, lower-limb amputation, and diabetic foot infections (28). We used Quan-Elixhauser comorbidities rather than other comorbidity indices because of their stronger association with mortality (26). We also identified screenings for retinopathy and ankle-brachial indices since these may signify additional diabetes complications (29). A frailty index ranging from 0 to 1 was also created using 30 variables identified from ICD-9 or Current Procedural Terminology (CPT) codes related to morbidity (e.g., arthritis), functional status (e.g., need for durable medical equipment), cognition and mood (e.g., dementia), sensory impairment (e.g., hearing impairment), or other conditions (e.g., incontinence) (21). The index was then coded as binary indicators for five categories of frailty. We had complete data for demographics, utilization, and diagnosis codes.

Our models also included a BMI slope variable capturing the change in BMI for patients who had two or more measurements during the baseline period that were at least 30 days apart. These were rescaled to BMI change per 30 days and then split into two positive and two negative categories. An additional category was added indicating that the patient lacked two BMI measurements. Other specific biomarkers beyond BMI, blood pressure, and A1C may be unavailable for a given patient during the baseline period. Biomarkers were thus categorized into groups (e.g., low, normal, high) following established clinical criteria, with an additional grouping if the biomarker was missing. Groups were aggregated separately for 5- and 10-year models if the resulting odds ratios and point scores were equivalent. The complete list of potential predictor variables and their coding (including the ICD-9 and CPT codes used to create comorbidities) are contained in Supplementary Table 1.
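The handling of missing biomarkers and the BMI slope can be illustrated with a minimal sketch. The creatinine cut points (1.5 and 3.0 mg/dL) follow the groups reported in Table 1; treating values below 1.5 mg/dL as the reference group, computing the slope from the first and last measurement, and the function names themselves are assumptions made for illustration.

```python
from datetime import date
from statistics import mean
from typing import List, Optional, Sequence, Tuple


def categorize_creatinine(values_mg_dl: Sequence[float]) -> str:
    """Map baseline serum creatinine measurements to a categorical group."""
    if not values_mg_dl:
        return "missing"  # explicit group, so the patient is not dropped
    avg = mean(values_mg_dl)  # multiple measurements are averaged
    if avg < 1.5:
        return "<1.5"  # assumed reference group
    if avg <= 3.0:
        return "1.5-3.0"
    return ">3.0"


def bmi_change_per_30_days(
    measurements: List[Tuple[date, float]]
) -> Optional[float]:
    """BMI change rescaled to 30 days, or None without a qualifying pair.

    Assumption: the slope is taken between the first and last measurement,
    which must be at least 30 days apart.
    """
    ordered = sorted(measurements)
    if len(ordered) < 2:
        return None
    (first_day, first_bmi), (last_day, last_bmi) = ordered[0], ordered[-1]
    days = (last_day - first_day).days
    if days < 30:
        return None
    return (last_bmi - first_bmi) / days * 30
```

A `None` result corresponds to the additional category for patients who lacked two BMI measurements.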

We used a 2-year lookback period (2004–2005) to collect the study predictors. All comorbidities and screenings were modeled as binary variables indicating whether the patient had a relevant ICD-9 or CPT code within the previous 24 months. For demographics and VA priority status, we used the most recent information recorded during the lookback period. When patients had multiple measurements for the same biomarker, we used mean values. Outcomes included whether the patient was alive at the end of 5 years (31 December 2010) or 10 years (31 December 2015), using data provided by the National Vital Statistics System.

Statistical Analyses

The first step was to randomly allocate our data into training (75%, n = 206,392) and test sets (25%, n = 68,798). The training set was used for model development, while the test set was used to evaluate the models’ predictive validity. We used logistic least absolute shrinkage and selection operator (LASSO) regressions, using the training set to select variables independently associated with mortality risk. While a full explanation of LASSO is outside the scope of this article, the technique has been described in detail elsewhere (30,31).
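A random 75%/25% allocation of this kind can be sketched as follows. The function name, the fixed seed, and the use of integer patient identifiers are illustrative assumptions; the source does not describe how the random allocation was implemented.

```python
import random
from typing import Iterable, List, Tuple


def split_train_test(
    patient_ids: Iterable[int], train_fraction: float = 0.75, seed: int = 0
) -> Tuple[List[int], List[int]]:
    """Shuffle patient identifiers and cut them into training and test sets."""
    ids = list(patient_ids)
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    rng.shuffle(ids)
    n_train = int(len(ids) * train_fraction)  # fractional patients round down
    return ids[:n_train], ids[n_train:]


train_ids, test_ids = split_train_test(range(275_190))
```

With 275,190 patients this cut yields 206,392 training and 68,798 test patients, matching the set sizes reported above.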

LASSO is a common machine learning algorithm and a form of penalized regression that penalizes the sum of the absolute values of the model coefficients, with the strength of the penalty set by a tuning parameter λ. LASSO can be used to objectively select a subset of variables that minimizes prediction error and removes any potential collinearity. LASSO shrinks the β-coefficients that are least associated with the outcome toward 0 before the more strongly associated β-coefficients. If two variables are highly correlated, the LASSO algorithm shrinks the less strongly associated one until it drops out of the model because it does not contribute to predictive accuracy. Alternative methodologies for variable screening, such as the use of correlation coefficients, usually treat every effect as additive, while automatic techniques, such as stepwise regression, often lead to unstable results (32). The LASSO technique also trades off unbiasedness for lower variance, which is advantageous in predictive research applications (30,33). Lastly, LASSO produces familiar-looking odds ratios, albeit without P values, which are untrustworthy for methods that select or shrink predictor variables adaptively (31). The resulting odds ratios may then be used to develop prognostic indices for mortality risk. We estimated separate regressions using the training set with 5- and 10-year mortality as outcomes, using 10-fold cross-validation within the training set to automatically select the value of λ that maximizes cross-validated predictive accuracy. Cross-validation is a resampling procedure commonly used to reduce overfitting and prediction error and to evaluate the predictive accuracy of machine learning models on unseen data (34,35).
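For reference, the penalized form of logistic LASSO as it is commonly written in textbooks is shown below. The notation is ours rather than the source's: y_i equals 1 if patient i died within the follow-up window, x_i is that patient's vector of p predictors, and n is the number of training patients. The source does not state which software parameterization was used, so the scaling of the likelihood term is an assumption.

```latex
\[
(\hat{\beta}_0, \hat{\beta}) \;=\; \arg\min_{\beta_0,\,\beta}
\left\{
  -\frac{1}{n} \sum_{i=1}^{n}
  \Bigl[ y_i \,(\beta_0 + x_i^{\top}\beta)
         - \log\bigl(1 + e^{\beta_0 + x_i^{\top}\beta}\bigr) \Bigr]
  \;+\; \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert
\right\},
\qquad
\mathrm{OR}_j = e^{\hat{\beta}_j}
\]
```

Larger values of λ set more coefficients exactly to 0, which corresponds to an odds ratio of 1 and removes the variable from the index.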

We then developed a point-based risk scoring system on the basis of the absolute size of the resulting odds ratios (14,15). Points were assigned by subtracting 1 from the odds ratio, dividing by 0.2 increments, and rounding (e.g., an odds ratio of 1.2 would be assigned 1 point, 1.4 would be 2 points, and 1.65 would be 3 points). Since the choice of increment is by nature arbitrary, we also tested increments of 0.1 and 0.3 in sensitivity analyses, which gave qualitatively similar results. Lastly, we calculated mortality rates by point score and several measures of predictive validity in both the training and the test sets.
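The point assignment rule can be written as a short function. This is a sketch under stated assumptions: the source says only that values were rounded, so rounding halves upward is our choice, and the function and dictionary key names are hypothetical. Decimal arithmetic is used so that results are not affected by binary floating-point error.

```python
from decimal import Decimal, ROUND_HALF_UP
from typing import Dict, Iterable


def points_from_odds_ratio(odds_ratio: float, increment: float = 0.2) -> int:
    """Compute (OR - 1) / increment, rounded to the nearest whole point."""
    excess = Decimal(str(odds_ratio)) - Decimal("1")
    raw = excess / Decimal(str(increment))
    # Half-up rounding is an assumption; the source only says "rounding".
    return int(raw.quantize(Decimal("1"), rounding=ROUND_HALF_UP))


def risk_score(present_factors: Iterable[str], odds_ratios: Dict[str, float]) -> int:
    """Sum the points of every risk factor recorded for a patient."""
    return sum(points_from_odds_ratio(odds_ratios[f]) for f in present_factors)


# Worked examples taken from the text.
examples = [points_from_odds_ratio(x) for x in (1.2, 1.4, 1.65)]
```

Passing `increment=0.1` or `increment=0.3` reproduces the alternative divisors used in the sensitivity analyses.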

Measures of Predictive Validity

Since each individual measure has strengths and drawbacks, we calculated a variety of measures following Steyerberg et al. (36). Balanced accuracy is a measure of overall predictive performance and is calculated as the average of two proportions: the proportion correctly predicted for those who experienced the mortality outcome and the proportion correctly predicted for those who did not. A score of 1 indicates a perfect model, and 0.5 indicates that the model is no better than chance. Balanced accuracy addresses the well-known phenomenon that binary classifiers tend to be biased toward the more frequent class, yielding an overly optimistic estimate of accuracy (37).
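Balanced accuracy as defined above can be computed with a few lines of code. The sketch assumes binary labels (1 = died, 0 = alive) and already-dichotomized predictions; the probability threshold used to dichotomize predictions is not given in the source.

```python
from typing import Sequence


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Average of the proportions correctly predicted in each outcome class."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    true_neg = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    n_died = sum(1 for t in y_true if t == 1)
    n_alive = len(y_true) - n_died
    if n_died == 0 or n_alive == 0:
        raise ValueError("both outcome classes must be present")
    return 0.5 * (true_pos / n_died + true_neg / n_alive)


# A classifier that always predicts survival in a sample with 20% mortality
# has 80% raw accuracy but a balanced accuracy no better than chance.
always_alive = balanced_accuracy([1, 1] + [0] * 8, [0] * 10)
```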

We also calculated additional measures to assess the models’ discrimination and calibration. These measures may range from 0 to 1, with higher values indicating better model performance. Discrimination refers to how well a prediction model differentiates between patients with and without the outcome. The concordance or C-statistic gives the probability that a randomly selected patient who died had a higher predicted mortality risk than a patient who was alive at the end of the outcome period. The discrimination slope is calculated as the average difference in predicted probability between the mortality and no-mortality groups. Calibration refers to the agreement between predicted and observed outcomes. The calibration slope is the regression slope of the linear predictor for observed and expected mortality, with values closer to 1 indicating better model agreement. Calibration belts were calculated to evaluate the goodness of fit for model predictions of binary outcomes (38), plotting the 95% CIs for expected versus observed mortality risk.
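The two discrimination measures can be sketched directly from their definitions. The pairwise loop below is written for clarity and is quadratic in sample size, so it is an illustration rather than something suited to a test set of tens of thousands of patients; counting tied predictions as one half is a common convention that we assume here.

```python
from statistics import mean
from typing import List, Sequence, Tuple


def _split_by_outcome(
    y_true: Sequence[int], risk: Sequence[float]
) -> Tuple[List[float], List[float]]:
    """Separate predicted risks for patients who died (1) and survived (0)."""
    died = [r for y, r in zip(y_true, risk) if y == 1]
    alive = [r for y, r in zip(y_true, risk) if y == 0]
    if not died or not alive:
        raise ValueError("both outcome classes must be present")
    return died, alive


def c_statistic(y_true: Sequence[int], risk: Sequence[float]) -> float:
    """Share of (died, alive) pairs in which the decedent had the higher risk."""
    died, alive = _split_by_outcome(y_true, risk)
    concordant = 0.0
    for d in died:
        for a in alive:
            if d > a:
                concordant += 1.0
            elif d == a:
                concordant += 0.5  # ties count as half (assumed convention)
    return concordant / (len(died) * len(alive))


def discrimination_slope(y_true: Sequence[int], risk: Sequence[float]) -> float:
    """Mean predicted risk among decedents minus that among survivors."""
    died, alive = _split_by_outcome(y_true, risk)
    return mean(died) - mean(alive)
```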

Sensitivity Analyses

We also conducted several sensitivity analyses to check the robustness of our models’ results. First, since age is the primary risk factor for mortality, we assessed how well our prognostic indices predicted mortality after excluding age. Second, because of the relatively small proportion of women in the sample, we also conducted a sensitivity analysis limiting the sample to men. Third, the base LASSO regression models included only the main effects of potential predictors. We also conducted a sensitivity analysis using a “saturated” model, including all possible two-way interactions between predictors. Lastly, we tested the robustness of the results to changes in the point-based risk scoring system by changing the incremental divisor to 0.1 or 0.3. Analyses were conducted using R version 3.5.3 statistical software.

Predictors of Mortality

The study sample included 275,190 patients (Supplementary Table 2). There were 65,171 deaths during the 5-year follow-up period (24% of the sample) and 157,620 deaths during the 10-year follow-up period (57% of the sample). Of 67 possible predictors of mortality, the logistic LASSO results for the training set indicated that 30 were associated with an increased 5-year mortality risk and that 36 were associated with a 10-year mortality risk (Table 1). For the biomarkers, we selected whichever group had the lowest risk of mortality as the baseline. This decision is by definition arbitrary, although the choice of reference group does not affect predictive accuracy. The strongest predictors of 5-year mortality were age, serum creatinine, BMI, and comorbidities for congestive heart failure, metastatic cancer, and end-stage liver disease. These variables were also the strongest predictors in the 10-year model, although elevated urine albumin-to-creatinine ratio also had a strong association.

Table 1

Demographic and clinical characteristics associated with mortality in patients with diabetes

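| Risk factor | 5-year mortality: adjusted OR | 5-year mortality: points | 10-year mortality: adjusted OR | 10-year mortality: points |
|---|---|---|---|---|
| Demographics | | | | |
| Male sex | 1.15 | | 1.25 | |
| Marital status, single | — | — | 1.16 | |
| Age category (years): 70–74 | 1.12 | | 1.40 | |
| Age category (years): 75–79 | 1.62 | | 2.45 | |
| Age category (years): 80–84 | 2.44 | | 4.94 | 20 |
| Age category (years): 85–89 | 4.06 | 15 | 10.87 | 49 |
| Age category (years): 90+ | 6.61 | 28 | 22.29 | 106 |
| VA priority: Group 4 | 1.63 | | 1.97 | |
| VA priority: Group 5 | 1.17 | | 1.21 | |
| Biomeasures (a) | | | | |
| Serum creatinine (mg/dL): 1.5–3.0 | 1.37 | | 1.52 | |
| Serum creatinine (mg/dL): >3.0 | 2.42 | | 3.54 | 13 |
| Systolic blood pressure >180 mmHg | 1.49 | | 1.42 | |
| Serum albumin (g/dL) <3.5 | 1.49 | | 1.57 | |
| Urine albumin-to-creatinine ratio: 30 to <60 | — | — | 1.23 | |
| Urine albumin-to-creatinine ratio: 60 to <90 | — | — | 1.39 | |
| Urine albumin-to-creatinine ratio: 90 to <300 | — | — | 1.74 | |
| Urine albumin-to-creatinine ratio: 30 to <300 | 1.22 | | — | — |
| Urine albumin-to-creatinine ratio: 300+ | 1.52 | | 2.53 | |
| Urine albumin-to-creatinine ratio: Missing | — | — | 1.18 | |
| A1C (%): 8–9 | 1.13 | | 1.18 | |
| A1C (%): >9 | 1.30 | | 1.39 | |
| BMI (kg/m²): <18.5 | 1.90 | | 1.64 | |
| BMI (kg/m²): 18.5–24.9 | 1.32 | | 1.26 | |
| BMI (kg/m²): 40–49.9 | — | — | 1.27 | |
| BMI (kg/m²): ≥50 | 1.57 | | 1.82 | |
| Utilization (b) | | | | |
| Outpatient visits ≥30 | 1.21 | | 1.11 | |
| Inpatient days >5 | 1.13 | | — | — |
| Medications (c) | | | | |
| Insulin | 1.49 | | 1.51 | |
| Sulfonylureas | 1.24 | | 1.27 | |
| Blood pressure | 1.13 | | 1.19 | |
| Comorbidities (c) | | | | |
| Congestive heart failure | 1.73 | | 1.88 | |
| Cardiac arrhythmia | 1.16 | | 1.19 | |
| Valvular disease | — | — | 1.10 | |
| Peripheral vascular disorders | 1.26 | | 1.38 | |
| Paralysis | 1.16 | | 1.32 | |
| Other neurological disorders | 1.41 | | 1.66 | |
| Chronic pulmonary disease | 1.36 | | 1.46 | |
| Renal failure | 1.14 | | 1.19 | |
| Lymphoma | 1.38 | | 1.52 | |
| Metastatic cancer | 1.77 | | 1.77 | |
| Solid tumor without metastasis | 1.15 | | 1.13 | |
| Coagulopathy | — | — | 1.19 | |
| Fluid and electrolyte disorders | — | — | 1.12 | |
| Alcohol abuse | — | — | 1.19 | |
| Psychoses | 1.50 | | 1.67 | |
| Depression | — | — | 1.14 | |
| Coronary artery disease | — | — | 1.13 | |
| End-stage liver disease | 1.72 | | 1.71 | |
| Lower-limb amputation | 1.31 | | 1.47 | |
| Diabetic foot infection | 1.12 | | 1.18 | |
| Weight loss | 1.13 | | 1.13 | |
| Smoking | 1.19 | | 1.32 | |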

OR, odds ratio.

(a) Average of measurements taken in previous 2 years.

(b) Total utilization in previous 2 years.

(c) Indicates a CPT code, ICD-9 code, or prescription (as applicable) during previous 2 years. For a full list of drug classes, ICD-9 codes, and CPT codes used, see Supplementary Table 1.

Mortality was associated with other demographic characteristics, such as male sex, single marital status, and economic hardships. Diabetes treatment (e.g., insulin, sulfonylurea), blood pressure treatment, several additional biomarkers (e.g., low serum albumin, elevated A1C), and a greater number of comorbidities were associated with increased mortality risk. Sixteen comorbidities were associated with mortality in the 5-year model, and 22 were associated with mortality in the 10-year model. BMI showed a U-shaped relationship with mortality, with BMI 25–40 kg/m² having the lowest risk. The number of inpatient days and outpatient visits was associated with mortality but only for patients with high health care utilization (>5 inpatient days or ≥30 outpatient visits). Patient frailty was not associated with mortality after controlling for other risk factors.

Possible patient risk scores (in points) ranged from 0 to 87 in the 5-year model and from 0 to 192 in the 10-year model. Observed patient risk scores ranged from 0 to 59 in the 5-year model and from 0 to 143 in the 10-year model. Overall, the models predicted a >50% probability of 5-year mortality for patients with risk scores >22 and 10-year mortality for those with risk scores >14.

Model Performance

Table 2 contains several statistics for model performance, using either the raw predicted probabilities or the prognostic indices. Balanced accuracy ranged from 0.77 to 0.78 in the 5-year model and was consistently 0.70 in the 10-year model, suggesting that the models are substantially more accurate at predicting mortality than chance alone. The predictive models also showed good discrimination, with C-statistics ranging from 0.74 to 0.77 for 5- and 10-year mortality in the test cohort. Discrimination slopes ranged from 0.36 to 0.40, indicating that the average predicted probability of death was 0.36–0.40 higher for patients who died than for those who survived.

Table 2

Measures of model performance

| Metric | 5-year all-cause mortality model: probability | 5-year all-cause mortality model: score | 10-year all-cause mortality model: probability | 10-year all-cause mortality model: score |
|---|---|---|---|---|
| Overall performance | | | | |
| Balanced accuracy (a) | 0.78 | 0.77 | 0.70 | 0.70 |
| Discrimination (b) | | | | |
| C-statistic | 0.75 | 0.74 | 0.77 | 0.76 |
| Discrimination slope | 0.37 | 0.36 | 0.40 | 0.39 |
| Calibration | | | | |
| Calibration slope (c) | 1.01 | — | 1.01 | — |

The table contains several measures of the models’ predictive validity using test set data that are based on either regression-predicted probabilities or the risk-based point scoring system. See the text for an explanation of these measures.

(a) A score of 1 indicates a perfect model, and 0.5 indicates that the model is no better than chance.

(b) These measures may range from 0 to 1, with higher values indicating better model performance.

(c) Values closer to unity indicate better model agreement.

Figure 1 contains calibration belt plots for predicted versus observed mortality in both 5- and 10-year models. The calibration slope was 1.01 for both the 5-year and the 10-year models, indicating that predicted risk tracks very closely with actual risk. The 5-year model slightly underpredicted mortality for those at highest risk.

Figure 1

Calibration belts for the goodness-of-fit between observed and predicted mortality. The calibration belt shows the 95% CIs for observed mortality at predicted levels of risk. The red 45° line represents a perfect prediction.


Sensitivity Analysis Results

Our prognostic indices had satisfactory to good accuracy when predicting mortality after excluding age from the models. Figure 2 shows mortality within age-groups (65–69, 70–74, 75–79, 80+ years) by risk score; the indices discriminated well in all age-subgroups, with strata-specific C-statistics ranging from a low of 0.66 for 5-year mortality in the 80+ group to a high of 0.71 for 5-year mortality in the 65–69 group.

Figure 2

Observed mortality by risk score in varying age-groups. This graph shows mortality within age-groups (65–69, 70–74, 75–79, 80+ years) by risk score calculated using prognostic indexes but excluding the points contributed by age. A horizontal line is drawn at 50% to note that such risk scores predict when mortality is more likely than not.


Model results for variables other than sex were unchanged when limiting our sample to men. The saturated models, including all possible two-way interactions between predictors, showed only minor improvements in predictive accuracy compared with main-effects regressions (e.g., increase of <0.01 in C-statistics for the 5- and 10-year models). Changes in the incremental divisor used for the point-based risk scoring affected model parsimony; for instance, the 10-year LASSO model using 0.1 increments assigned 1 point each to seven comorbidities that were not included in the base model. The 5-year LASSO model using 0.3 increments assigned no points to several variables that were awarded points in the base model. The overall predictive accuracy was generally unchanged during these sensitivity analyses (Supplementary Table 3). There was little benefit to adding additional variables, and the results suggest that even larger increments could be used before predictive accuracy deteriorates.

The overall goal of diabetes treatment in older adults is to minimize or eliminate symptoms and reduce risk for longer-term complications. In the U.S., average life expectancy for a 65-year-old male and female is ∼18 and 20 years, respectively (39), but this will vary markedly among individuals and may be shorter in patients with diabetes. For those with longer life expectancy, A1C targets may be set at lower levels to reduce the risk of microvascular complications. Higher A1C targets are appropriate on the basis of the presence of major comorbidities, complications, and limited life expectancy. In each case, treatment goals can be individualized on the basis of a patient’s unique circumstances and goals of care.

Clinicians are notoriously inaccurate in predicting life expectancy, with studies frequently showing both over- and underestimation (40,41). We generated models with high predictive validity of future mortality in a large sample of older veterans with diabetes. The models showed good discrimination in determining both 5- and 10-year mortality. Our predicted mortality probabilities and prognostic indices used only administrative data and, for implementation purposes, could be calculated and embedded in an electronic health record to inform clinical decision making. Data visualizations (e.g., Fig. 2) could be used to place patients in specific mortality risk categories (e.g., expected mortality within 5 years, 5–10 years, 10+ years). Such a tool may be deployed at the point of care to assist in shared decision making with patients when setting diabetes treatment goals.

Our results highlight the importance of accounting for a wide variety of risk factors, including demographics, social determinants of health, health service utilization, medication history, biomarkers, and comorbidities, when predicting mortality. A few specific findings warrant comment. First, we showed that a BMI of 18.5–24.9 kg/m² is associated with increased mortality. This finding comports with other work showing an obesity paradox, where higher BMI is associated with lower mortality (42). However, the biomarker variables used represent measurements at a discrete point in time. A lower BMI may indicate changes in weight toward the end of life, which has been previously associated with mortality (43). Our initial variable list included a BMI slope for veterans who had multiple BMI measurements, but this was not retained by the LASSO regressions. Alternatively, comorbidities tend to be diagnosed earlier among patients with chronic conditions, which could bias the observed effects of BMI on life expectancy (44). Second, the 10-year mortality risk model slightly outperformed the 5-year model on several measures of predictive validity. This suggests that such data may have greater utility for predicting higher-risk events, such as longer-term mortality, and less utility for more proximal outcomes, such as short-term mortality. Third, certain individual comorbidities were retained in the 10-year model but not in the 5-year model. While it is possible that the effects of these comorbidities are cumulative over time, we can only state that these variables are important for longer-term mortality predictions. Lastly, we found that serious service-related disabilities and/or economic hardship were associated with higher mortality, and this is consistent with other published studies (45,46). These results do not imply causation but do illustrate the potential importance of social determinants of health as factors that affect mortality in older adults with diabetes.

Our approach has several strengths and provides advantages over previously developed prognostic indices for mortality. First, we used a machine learning approach (logistic LASSO) to search through a large number of variables with putative links to mortality and selected only those that aid in mortality risk prediction. This approach is adaptable to a wide variety of institutional settings with different types and numbers of variables. Second, the robust data sources allowed us to examine longer-term effects on mortality, a critically important feature since the benefits of improved glycemic control may take several years to accrue. Third, we developed models using only data that are routinely collected in the electronic health record, eliminating data requirements that are not easily applied in clinical settings or suitable for operational use. Many guidelines and guidance statements on diabetes care propose moderating A1C treatment goals in patients with limited life expectancy (10,47,48), and our findings may help to operationalize such recommendations.

Our study also has several important limitations. First, the logistic LASSO results should not be interpreted as causal; this methodology allows for bias in the resulting odds ratios if doing so leads to better predictive accuracy. Our results therefore do not necessarily imply that treating or modifying these risk factors attenuates mortality risk. Second, the VA patient population is disproportionately white and male compared with the general U.S. population. Third, we restricted our analyses to patients who were not enrolled in Medicare Advantage and who had routine biomarker measurements. Elderly veterans who primarily rely on Medicare for their care tend to be sicker, wealthier, and less likely to be black than veterans who rely on the VA for their care (49). Compared with traditional fee-for-service Medicare enrollees, Medicare Advantage enrollees tend to be poorer, more racially/ethnically diverse, and more likely to reside in urban areas (50). The specific results here may not generalize to other settings, although our methodology is adaptable and could be applied to other populations. Future research should validate these results in nonveteran patient populations. Finally, we focused our analyses on older adults, and the results may not generalize to younger age-groups.

In summary, we developed prognostic mortality indices to predict 5- and 10-year mortality in older adults with diabetes. These indices have high predictive validity and demonstrate the importance of several individual and condition-specific characteristics that may inform clinicians and patients about life expectancy as they set A1C targets that balance treatment benefits and risks. Such models may also provide a basis for testing life expectancy predictions in other clinical outcomes and in practice.

This article contains supplementary material online at https://doi.org/10.2337/figshare.12298781.

Funding. This material is based on work supported by the Department of Veterans Affairs, VA Health Services Research and Development (IIR 15-116), and the National Institutes of Health, National Institute of Diabetes and Digestive and Kidney Diseases (R01 DK114098).

The views expressed in this article are those of the authors and do not necessarily reflect the position or policy of the Department of Veterans Affairs or the U.S. government.

Duality of Interest. No potential conflicts of interest relevant to this article were reported.

Author Contributions. K.N.G. contributed to the study design, analysis, and drafting and editing of the manuscript. J.C.P. and P.R.C. contributed to the study design and drafting and editing of the manuscript. D.C.M. contributed to the drafting and editing of the manuscript. P.R.C. is the guarantor of this work and, as such, had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

1. Reaven PD, Emanuele NV, Wiitala WL, et al.; VADT Investigators. Intensive glucose control in patients with type 2 diabetes - 15-year follow-up. N Engl J Med 2019;380:2215–2224
2. Esposito K, Gentile S, Candido R, et al.; Associazione Medici Diabetologi. Management of hyperglycemia in type 2 diabetes: evidence and uncertainty. Cardiovasc Diabetol 2013;12:81
3. Riddle MC. Counterpoint: intensive glucose control and mortality in ACCORD--still looking for clues. Diabetes Care 2010;33:2722–2724
4. Boussageon R, Bejan-Angoulvant T, Saadatian-Elahi M, et al. Effect of intensive glucose lowering treatment on all cause mortality, cardiovascular death, and microvascular events in type 2 diabetes: meta-analysis of randomised controlled trials. BMJ 2011;343:d4169
5. Nathan DM, Buse JB, Davidson MB, et al.; American Diabetes Association; European Association for Study of Diabetes. Medical management of hyperglycemia in type 2 diabetes: a consensus algorithm for the initiation and adjustment of therapy: a consensus statement of the American Diabetes Association and the European Association for the Study of Diabetes. Diabetes Care 2009;32:193–203
6. The Action to Control Cardiovascular Risk in Diabetes Study Group; Gerstein HC, Miller ME, Byington RP, et al. Effects of intensive glucose lowering in type 2 diabetes. N Engl J Med 2008;358:2545–2559
7. Finfer S, Chittock DR, Su SY, et al.; NICE-SUGAR Study Investigators. Intensive versus conventional glucose control in critically ill patients. N Engl J Med 2009;360:1283–1297
8. Libby P, Plutzky J. Diabetic macrovascular disease: the glucose paradox? Circulation 2002;106:2760–2763
9. Stettler C, Allemann S, Jüni P, et al. Glycemic control and macrovascular disease in types 1 and 2 diabetes mellitus: meta-analysis of randomized trials. Am Heart J 2006;152:27–38
10. The Management of Type 2 Diabetes Mellitus in Primary Care Working Group. VA/DoD Clinical Practice Guideline for the management of type 2 diabetes mellitus in primary care [Internet], 2017. Available from www.healthquality.va.gov/guidelines/CD/diabetes. Accessed 28 June 2019
11. Yourman LC, Lee SJ, Schonberg MA, Widera EW, Smith AK. Prognostic indices for older adults: a systematic review. JAMA 2012;307:182–192
12. Gagne JJ, Glynn RJ, Avorn J, Levin R, Schneeweiss S. A combined comorbidity score predicted mortality in elderly patients better than existing scores. J Clin Epidemiol 2011;64:749–759
13. Carey EC, Covinsky KE, Lui L-Y, Eng C, Sands LP, Walter LC. Prediction of mortality in community-living frail elderly people with long-term care needs. J Am Geriatr Soc 2008;56:68–75
14. Lee SJ, Lindquist K, Segal MR, Covinsky KE. Development and validation of a prognostic index for 4-year mortality in older adults. JAMA 2006;295:801–808
15. Schonberg MA, Davis RB, McCarthy EP, Marcantonio ER. Index to predict 5-year mortality of community-dwelling adults aged 65 and older using data from the National Health Interview Survey. J Gen Intern Med 2009;24:1115–1122
16. Miller DR, Safford MM, Pogach LM. Who has diabetes? Best estimates of diabetes prevalence in the Department of Veterans Affairs based on computerized patient data. Diabetes Care 2004;27(Suppl. 2):B10–B21
17. Woodard LD, Landrum CR, Urech TH, Profit J, Virani SS, Petersen LA. Treating chronically ill people with diabetes mellitus with limited life expectancy: implications for performance measurement. J Am Geriatr Soc 2012;60:193–201
18. Piette JD, Kerr EA. The impact of comorbid chronic conditions on diabetes care. Diabetes Care 2006;29:725–731
19. McEwen LN, Kim C, Karter AJ, et al. Risk factors for mortality among patients with diabetes: the Translating Research Into Action for Diabetes (TRIAD) Study. Diabetes Care 2007;30:1736–1741
20. Wells BJ, Jain A, Arrigain S, Yu C, Rosenkrans WA Jr., Kattan MW. Predicting 6-year mortality risk in patients with type 2 diabetes. Diabetes Care 2008;31:2301–2306
21. Orkaby AR, Nussbaum L, Ho Y, et al. The burden of frailty among U.S. veterans and its association with mortality, 2002–2012. J Gerontol A Biol Sci Med Sci 2018;74:1257–1264
22. Leal J, Gray AM, Clarke PM. Development of life-expectancy tables for people with type 2 diabetes. Eur Heart J 2009;30:834–839
23. Skaaby T, Husemoen LL, Ahluwalia TS, et al. Cause-specific mortality according to urine albumin creatinine ratio in the general population. PLoS One 2014;9:e93212
24. Sung K, Ryu S, Lee J, et al. Urine albumin/creatinine ratio below 30 mg/g is a predictor of incident hypertension and cardiovascular mortality. J Am Heart Assoc 2016;5:e003245
25. U.S. Department of Veterans Affairs. Enrollment priority groups: IB 10-441.
26. Quan H, Sundararajan V, Halfon P, et al. Coding algorithms for defining comorbidities in ICD-9-CM and ICD-10 administrative data. Med Care 2005;43:1130–1139
27. Bittl JA. Percutaneous coronary interventions in the diabetic patient: where do we stand? Circ Cardiovasc Interv 2015;8:e001944
28. Young BA, Lin E, Von Korff M, et al. Diabetes complications severity index and risk of mortality, hospitalization, and healthcare utilization. Am J Manag Care 2008;14:15–23
29. Chen S, Hsiao P, Huang J, Lin K, Hsu W. Abnormally low or high ankle-brachial index is associated with proliferative diabetic retinopathy in type 2 diabetic mellitus patients. PLoS One 2015;10:e0134718
30. Athey S, Imbens G. The state of applied econometrics: causality and policy evaluation. J Econ Perspect 2017;31:3–32
31. McNeish DM. Using Lasso for predictor selection and to assuage overfitting: a method long overlooked in behavioral sciences. Multivariate Behav Res 2015;50:471–484
32. Desboulets LDD. A review on variable selection in regression analysis. Econometrics 2018;6:1–27
33. Tong L, Erdmann C, Daldalian M, Li J, Esposito T. Comparison of predictive modeling approaches for 30-day all-cause non-elective readmission risk. BMC Med Res Methodol 2016;16:26
34. Rose S. Machine learning for prediction in electronic health data. JAMA Netw Open 2018;1:e181404
35. Brownlee J. A gentle introduction to k-fold cross-validation [Internet]. Victoria, Australia, Machine Learning Mastery, 2018.
36. Steyerberg EW, Vickers AJ, Cook NR, et al. Assessing the performance of prediction models: a framework for traditional and novel measures. Epidemiology 2010;21:128–138
37. Brodersen KH, Ong CS, Stephan KE, Buhmann JM. The balanced accuracy and its posterior distribution. In Proceedings of the 20th International Conference on Pattern Recognition, 2010. Buffalo, NY, International Association for Pattern Recognition, p. 3125–3128
38. Finazzi S, Poole D, Luciani D, Cogo PE, Bertolini G. Calibration belt for quality-of-care assessment based on dichotomous outcomes. PLoS One 2011;6:e16110
39. Office of the Chief Actuary SSA. Actuarial life table [Internet], 2016. Available from https://www.ssa.gov/oact/STATS/table4c6.html. Accessed 15 July 2019
40. Lambden J, Zhang B, Friedlander R, Prigerson HG. Accuracy of oncologists’ life-expectancy estimates recalled by their advanced cancer patients: correlates and outcomes. J Palliat Med 2016;19:1296–1303
41. Clarke MG, Ewings P, Hanna T, Dunn L, Girling T, Widdison AL. How accurate are doctors, nurses and medical students at predicting life expectancy? Eur J Intern Med 2009;20:640–644
42. Hainer V, Aldhoon-Hainerová I. Obesity paradox does exist. Diabetes Care 2013;36(Suppl. 2):S276–S281
43. Strandberg TE, Strandberg AY, Salomaa VV, et al. Explaining the obesity paradox: cardiovascular risk, weight change, and mortality during long-term follow-up in men. Eur Heart J 2009;30:1720–1727
44. Khan SS, Ning H, Wilkins JT, et al. Association of body mass index with lifetime risk of cardiovascular disease and compression of morbidity. JAMA Cardiol 2018;3:280–287
45. Majer IM, Nusselder WJ, Mackenbach JP, Klijs B, van Baal PHM. Mortality risk associated with disability: a population-based record linkage study. Am J Public Health 2011;101:e9–e15
46. Tucker-Seeley RD, Li Y, Subramanian SV, Sorensen G. Financial hardship and mortality among older adults using the 1996-2004 Health and Retirement Study. Ann Epidemiol 2009;19:850–857
47. American Diabetes Association. 6. Glycemic targets: Standards of Medical Care in Diabetes—2018. Diabetes Care 2018;41(Suppl. 1):S55–S64
48. Qaseem A, Wilt TJ, Kansagara D, Horwitch C, Barry MJ, Forciea MA; Clinical Guidelines Committee of the American College of Physicians. Hemoglobin A1c targets for glycemic control with pharmacologic therapy for nonpregnant adults with type 2 diabetes mellitus: a guidance statement update from the American College of Physicians. Ann Intern Med 2018;168:569–576
49. Hynes DM, Koelling K, Stroupe K, et al. Veterans’ access to and use of Medicare and Veterans Affairs health care. Med Care 2007;45:214–223
50. America’s Health Insurance Plans. Medicare Advantage [Internet], 2019. Available from http://www.ahip.org/issues/medicare-advantage. Accessed 28 June 2019
Readers may use this article as long as the work is properly cited, the use is educational and not for profit, and the work is not altered. More information is available at https://www.diabetesjournals.org/content/license.