To develop and validate a parsimonious model for predicting short-term all-cause mortality in patients with type 2 diabetes mellitus (T2DM).
Two cohorts of patients with T2DM were investigated. The Gargano Mortality Study (GMS, n = 679 patients) was the training set and the Foggia Mortality Study (FMS, n = 936 patients) represented the validation sample. GMS and FMS cohorts were prospectively followed up for 7.40 ±2.15 and 4.51 ±1.69 years, respectively, and all-cause mortality was registered. A new forward variable selection within a multivariate Cox regression was implemented. Starting from the empty model, each step selected the predictor that, once included into the multivariate Cox model, yielded the maximum continuous net reclassification improvement (cNRI). The selection procedure stopped when no further statistically significant cNRI increase was detected.
Nine variables (age, BMI, diastolic blood pressure, LDL cholesterol, triglycerides, HDL cholesterol, urine albumin-to-creatinine ratio, and antihypertensive and insulin therapy) were included in the final predictive model with a C statistic of 0.88 (95% CI 0.82–0.94) in the GMS and 0.82 (0.76–0.87) in the FMS. Finally, we used a recursive partition and amalgamation algorithm to identify patients at intermediate and high mortality risk (hazard ratio 7.0 and 24.4, respectively, as compared with those at low risk). A web-based risk calculator was also developed.
We developed and validated a parsimonious all-cause mortality equation in T2DM, providing also a user-friendly web-based risk calculator. Our model may help prioritize the use of available resources for targeting aggressive preventive and treatment strategies in a subset of very high-risk individuals.
The mortality rate of diabetic patients is about twice as much as that of nondiabetic individuals of a similar age (1); this makes diabetes a leading risk factor for mortality, which accounts for 2.9 million global events yearly (2). This scenario is expected to further deteriorate, given that diabetes prevalence is increasing worldwide (3). Thus, predicting mortality in diabetic patients is definitively needed in order to target aggressive prevention strategies to very high-risk patients.
Only a few models for predicting all-cause mortality in diabetic patients have been proposed thus far (4–6). Unfortunately, none of them have been validated in an external cohort. In addition, these models have used standard approaches that are not specifically suited for prediction model building. Instead, new attractive statistical methods based on the use of the reclassification measures (7–13) have become recently available when prediction model building is at issue; among these, the continuous net reclassification improvement (cNRI) has been mathematically proven to be a measure of effect size.
The aim of our study was to develop a parsimonious model for predicting short-term all-cause mortality in patients with type 2 diabetes mellitus (T2DM) based on the use of a forward variable selection driven by the cNRI approach and to validate it in a second independent sample. For this purpose, two prospective studies on patients with T2DM, the Gargano Mortality Study (GMS) and the Foggia Mortality Study (FMS), were analyzed.
RESEARCH DESIGN AND METHODS
Two cohorts of patients with T2DM were investigated: the GMS and the FMS.
Whites (n = 1,028) from Italy with T2DM (according to American Diabetes Association [ADA] 2003 criteria) were consecutively recruited at the Scientific Institute “Casa Sollievo della Sofferenza” in San Giovanni Rotondo (Apulia, central-southern Italy) from 1 November 2000 to 30 September 2005 for a study aimed at unraveling predictors of incident all-cause mortality. The only exclusion criterion was the presence of poor life expectancy due to malignancies. To date, this cohort has been followed up for 7.40 ± 2.15 years (range 0.04–9.83), with the last information on vital status obtained on 30 November 2010. After exclusion of patients 1) whose information on vital status at follow-up was not available (n = 190) and 2) who had some missing data for one or more baseline features (n = 159), 679 patients with T2DM (66.0% of the initial cohort) constituted the eligible sample for the present analysis. Observed missing data rates were as follows: BMI 1.1%, diabetes duration 0.2%, HbA1c 2.4%, systolic blood pressure (SBP) 0.9%, diastolic blood pressure (DBP) 0.9%, LDL cholesterol 1.6%, HDL cholesterol 3.2%, triglycerides 1.0%, albumin-to-creatinine ratio (ACR) 4.5%, estimated glomerular filtration rate (eGFR) 2.5%, lipid-lowering therapy 0.5%, insulin therapy 3.1%, and antihypertensive therapy 8.7%.
Whites (n = 1,153) from Italy with T2DM (according to ADA 2003 criteria) were consecutively recruited at the Endocrine Unit of the University of Foggia (Apulia, central-southern Italy) from 7 January 2002 to 30 September 2008 for a study aimed at unraveling predictors of incident all-cause mortality. Also in this case, the only exclusion criterion was the presence of poor life expectancy due to malignancies. To date, this cohort has been followed up for 4.51 ± 1.69 years (range 0.03–9.42), with the last information on vital status obtained on 31 March 2011.
After exclusion of patients 1) whose information on vital status at follow-up was not available (n = 91) and 2) who had missing data at baseline (n = 126), 936 patients with T2DM (81.2% of the initial cohort) constituted the eligible sample for the present analysis.
Observed missing data rates were as follows: age 5.9%, smoking habits 8.2%, BMI 6.8%, diabetes duration 4.0%, HbA1c 6.1%, SBP 6.0%, DBP 6.0%, LDL cholesterol 7.5%, HDL cholesterol 6.4%, triglycerides 6.2%, ACR 7.6%, eGFR 6.0%, lipid-lowering therapy 5.8%, insulin therapy 5.9%, and antihypertensive therapy 5.7%.
At baseline, all patients were interviewed regarding age at diabetes diagnosis and ongoing antidiabetic, hypolipidemic, and antihypertensive treatments. Duration of diabetes was calculated from the calendar year of data collection minus the calendar year of diabetes diagnosis. All subjects enrolled in the study underwent physical examination, including measurements of height, weight, BMI, and blood pressure (i.e., two measurements rounded to the nearest 2 mmHg in the sitting position after at least 5 min rest, using an appropriate-sized cuff; DBP was recorded at the disappearance of Korotokoff sound, phase V). Fasting venous blood was sampled from an antecubital vein from all patients for the measurement of standardized serum creatinine by using the modified kinetic Jaffè reaction (Hitachi 737 Autoanalyzer), total serum cholesterol (enzymatic method, Cobas; Roche Diagnostics, Welwin Garden City, U.K.), HDL cholesterol, serum triglycerides (enzymatic method, Cobas), and HbA1c (HPLC Diamat Analyzer; Bio-Rad, Richmond, CA). Urinary albumin and creatinine concentration were determined the same morning of the clinical examination from an early-morning first void sterile urine sample by the nephelometric method (Behring Nephelometer Analyzer; Behring, Marburg, Germany) and the Jaffè reaction-rate method, respectively. The urinary ACR was then calculated. eGFR was calculated with the abbreviated Modification of Diet in Renal Disease (MDRD) formula (13).
Study end point.
All-cause mortality was the only end point of this study. At follow-up, the vital status of study patients was ascertained by two authors for each study either by telephone interview with the patient or his/her relatives or by queries to the registry office of cities of residence. Vital status and date of death were checked and recorded from 1–30 November 2010 for GMS and from 1–31 March 2011 for FMS.
Patient baseline characteristics were reported as frequency (percentage) and mean ± SD or median along with lower and upper quartiles. Geometric mean and SD were reported for logarithm-transformed variables. Overall and age-adjusted death incidence rates for 100 person-years were also reported and compared using a Poisson model.
Time-to-death analyses were performed using multivariate Cox proportional hazards regression models, and risks were reported as hazard ratios (HRs) along with their 95% CIs. The assumption of proportionality of the hazards was tested by using scaled Schoenfeld residuals. The overall survival was defined as the time between enrollment and death. For subjects who did not experience the end point, survival time was censored at the time of the last available follow-up visit. Survival curves and survival probabilities were reported according to the Kaplan-Meier method. Discriminatory power was assessed by estimating survival C indices, along with 95% CI, and comparison between C indices was carried out according to the Pencina and D’Agostino approach (7). The survival-based Hosmer-Lemeshow measure of calibration (8) was also assessed. To control the accuracy of predictions and to increase the reliability of all statistical analyses, the prediction model was built on a “training set,” represented by the GMS, and eventually tested on a “validation set,” represented by the FMS. We chose GMS as a training set because of the longer follow-up as compared with FMS, which allowed a more stable estimation of the prediction model.
Continuous variables, specifically BMI, SBP, DBP, LDL, triglycerides, HDL, eGFR, and ACR, suspected to violate the multiplicative model linearity assumption were log transformed.
To build the final prediction model, we used a new forward variable selection within a multivariate Cox regression. Starting from the empty model, each step selected the predictor that, once included into the multivariate Cox model, yielded the maximum cNRI (9–13,14). The selection procedure, which analyzed all available variables, stopped when no further statistically significant cNRI was detected. cNRI was preferred to the classical Wald P value because our purpose was to build a prediction model rather than to assess statistical associations. Furthermore, cNRI can be directly interpreted as a measure of effect size (12). Since we were interested in predicting short-term all-cause mortality, we decided to set the time horizon for risk prediction at 2 years. In addition, we reasoned that since >90% of the sample has a follow-up of >2 years, very reliable estimates of the prediction ability measures would have been obtained. Results for prediction at 4 years were also reported.
A multivariate Cox proportional hazards regression model with robust standard errors was also used in the combined analysis on the pooled samples, accounting for potential clustering due to study effect (15). Regression coefficients from the fitted Cox model were also used to develop an interactive web-based tool that calculates 2-year mortality risk predictions. Indeed, a normalized overall mortality risk score (i.e., ranging from 0 to 1) was computed for each patient as the sum of the selected nine predictors weighted according to Cox regression coefficients. Furthermore, the recursive partitioning and amalgamation (RECPAM) algorithm (16,17) was used to identify subgroups of patients at different mortality risks according to the mortality risk score. The free web-based calculator is available at http://www.operapadrepio.it/rcalc/rcalc.php. The strategy of combining samples provides more stable and precise risk scores, as widely reported in the literature (18–22). All statistical analyses were performed using SAS Software Release 9.1.3 (SAS Institute, Cary, NC). RECPAM analysis was carried out with a macro written in SAS by one of the authors (F.P.).
GMS as a training set
Baseline clinical features of the 679 study patients are shown in Table 1. During follow-up (7.32 ± 2.12 years), 133 (19.6%) patients died with an annual incidence rate of 2.7%.
We used the cNRI-driven forward variable selection (see STATISTICAL METHODS for details) to build the 2-year mortality risk prediction model, including each predictor one by one. As recently demonstrated (12), cNRI is the appropriate statistic to build a predictive nested model instead of the C statistic or the Wald association P values. The model building stopped at step nine as no statistically significant values of cNRI were further observed (Supplementary Table 1). Therefore, the final model comprised patient age, BMI, DBP, LDL, triglyceride, HDL, and ACR levels and antihypertensive and insulin therapy. This parsimonious 2-year mortality prediction model yielded a C statistic = 0.88 (95% CI 0.82–0.94) and was calibrated (P value = 0.95). The same model also performed well in the 4-year mortality prediction with a C statistic = 0.81 (0.75–0.86) and was also calibrated (P value = 0.73). In addition, the survival C statistic was not different across sex strata, being 0.905 (0.846–0.965) and 0.891 (0.821–0.960) in men and women, respectively.
FMS as a validation sample
The FMS was used as an external sample to validate our prediction model generated in the GMS. Baseline clinical features of the 936 study patients are reported in Table 1. During follow-up (4.52 ± 1.71 years), 169 (18.0%) patients died with an annual incidence rate of 4.0%. The age-adjusted annual incidence rate in FMS was 2.4% and was not different from that observed in GMS (i.e., 2.1%, P = 0.30).
Using the model built in the GMS, the prediction of 2-year mortality in FMS yielded a C statistic = 0.82 (95% CI 0.76–0.87) and was calibrated (P value = 0.66). Similar results were obtained for prediction of 4-year mortality (i.e., C statistic = 0.80 [0.76–0.84]; P value for calibration = 0.71).
Multivariate HRs for 2-year mortality of the nine variables comprised in our predictive model were estimated on pooled samples and are reported in Supplementary Table 2. The prediction model had, in the pooled samples, a C statistic = 0.84 (95% CI 0.79–0.88) and was calibrated (P value = 0.99). The overall mortality risk score, built using the estimated Cox regression coefficients, normalized to a range from 0 to 1, had mean = 0.53 and SD = 0.16 (its distribution is reported in Supplementary Fig. 1). A Cox-based RECPAM analysis of 2-year mortality partitioned the pooled sample into three risk categories according to different levels of the overall mortality risk score. Survival curves, stratified into low (i.e., risk score ≤0.67), intermediate (risk score ranging from 0.68 to 0.79), and high (risk score ≥0.80) risk, are reported in Fig. 1. As compared with individuals with a low risk score, those with intermediate and high risk had HR 7.0 (95% CI 4.2–11.6) and 24.4 (14.4–41.5), respectively.
Despite T2DM being a leading cause of overall mortality, so far, few models have been developed for predicting diabetes-related mortality (4–6). To the best of our knowledge, this is the first study developing a well-performing model for this event that was validated in a second independent sample. In addition, our model is parsimonious, with few simple-to-measure variables needed to make it very informative.
Although risk factors for cardiovascular mortality have been widely described among patients with T2DM, only three studies have attempted to develop a prediction model of all-cause mortality to be used in the clinical set. Wells et al. (5) created a tool for predicting 6-year all-cause mortality risk in a large retrospective analysis of patients with T2DM from the Cleveland Clinic electronic health record. The authors developed a nomogram risk score using up to 20 predictive variables, which attained an acceptable C statistic of 0.752. However, only patients that were initially treated with a single oral hypoglycemic agent were included, thus making questionable the generalizability of the proposed score. In addition, as the authors also recognize, there was a significant amount of missing data, from 42.3 to 52.5%, for some predictor variables (i.e., BMI and LDL, triglyceride, and HDL levels), which were then statistically imputed. Although several and efficient statistical methods have been developed for missing data imputation, predictors with a high rate of missing variables should be omitted in order to keep the variability low about the estimates of interest (23). Moreover, the imputation techniques all rely on the missing at random assumption, a condition that cannot be easily verified (23). Finally, and most importantly, a second independent sample to be used as a validation set was not available (5).
Yang et al. (4) have instead used the Hong Kong diabetes registry, a prospective cohort of patients with T2DM, to develop a risk score based on nine predictors for all-cause mortality, which attained a C statistic = 0.845. Although they randomly divided the original sample into training (n = 3,775) and validation (n = 3,808) samples, a second external and independent sample for testing the validity of the proposed score was missing (4).
Finally, very recently, the Translating Research Into Action for Diabetes (TRIAD) study (n = 8,334) developed prediction equations for all-cause, cardiovascular, and noncardiovascular mortality in patients with T2DM. Missing values (<15%) have been imputed in this study too. The most parsimonious model was a logistic model built on 5,982 patients by using up to 15 predictive variables, which attained a C statistic = 0.815, although a Cox model would have been more appropriate in a survival context. In addition, and most importantly in this case, a second validation cohort was missing (6).
When comparing variables chosen in our models versus other models, we found that age, BMI, antihypertensive therapy, insulin therapy, and LDL cholesterol also significantly predicted all-cause mortality in the TRIAD study (6). Age, BMI, increased urinary albumin excretion, and insulin therapy were related to all-cause mortality in the Hong Kong Diabetes Registry study (4). Finally, age, BMI, DBP, insulin therapy, triglyceride levels, and LDL and HDL cholesterol related to all-cause mortality in the study by Well et al. (5).
Thus, although our well-performing, parsimonious, and validated model has not been compared with previous models, it represents an alternative for predicting all-cause mortality in a subset of high-risk patients, as those with T2DM certainly have to be considered. Thanks to the risk engine we have created, our model can now be tested in additional samples comprising diabetic patients from different geographical regions and/or different ethnicities in order to obtain deeper insights about its generalizability.
An additional important strength of our study is represented by an innovative variable selection strategy, based on the use of reclassification measures (i.e., cNRI in our case) instead of the classical association tests. The reason for such a methodological choice relies on the fact that we were interested in building a predictive model that, by definition, is expected of significantly capturing the contribution of predictors that correctly reclassify subjects according to their future outcome; it is of note that this is exactly what cNRI is able to guarantee. In addition, as recently demonstrated, the cNRI is a direct measure of the predictor’s effect size, allowing a direct interpretation of its estimate. Furthermore, simulation studies (13) have suggested that a cNRI approach reaches a higher level of statistical power, capturing those good predictors with a small effect size and a role that would not be detected by a Wald test or by an improvement in the C statistic.
We recognize that a limitation of our study is the lack of information on death-specific causes, especially those of cardiovascular origin. A second possible weakness is the lack of information on previous cardiovascular events, which represent a major risk factor for all-cause deaths (24). A third limitation of our study may be viewed in the short-term (i.e., 2 years) predictability we have addressed. However, the 4-year horizon tested as exploratory analysis performed well and was also calibrated in both the training and the validation samples, thus minimizing the risk that our model cannot be extended to larger time horizons of prediction. Further limitations are the relatively small sample size we investigated as compared with previous studies (4–6) and that the training and validation sets are both of Italian ancestry, thus not allowing us to address the generalizability of our model. Finally, missing data from both GMS and FMS also have to be recognized as limitations of our present study.
We have observed in our set a low percentage of patients treated with lipid-lowering drugs, which is particularly relevant in a population with a high percentage of subjects treated with insulin and poor glycemic control. We believe that a subcultured environment is the cause of such an unusual scenario even though it has been changing in the last few years.
Although addressing the role of variables as considered one by one was not our aim, some of them, as derived from the pooled analysis, may deserve some comments. Not unexpectedly, although elevated age (6), ACR (25,26), and both antihypertensive (6,27) and insulin therapy (6,28) predicted higher risk, elevated HDL (29), BMI (30), and DBP (31) predicted lower mortality risk.
In conclusion, we have developed an all-cause mortality equation in T2DM with good accuracy in calibration and discrimination in the training data set. We have also validated this model in a second external sample of similar patients. Finally, we have provided a user-friendly web-based risk engine (http://www.operapadrepio.it/rcalc/rcalc.php) that may serve the important function of testing our model in different sets of diabetic individuals in order to address its generalizability. Under this circumstance, the implementation of our model may help prioritize the use of available resources for targeting aggressive preventive and treatment strategies in a subset of very high-risk individuals.
This work was supported by Italian Ministry of Health grants RF2009 (S.D.C.) and RC2010 (V.T. and F.P.) and the Fondazione Roma “Sostegno alla ricerca scientifica biomedica 2008” (V.T.).
No potential conflicts of interest relevant to this article were reported.
S.D.C. and V.T. supervised the GMS, designed this particular study, and wrote the manuscript. M.C. and F.P. designed this particular study, analyzed data, contributed to discussion, and reviewed and edited the manuscript. O.L. supervised the FMS, researched data, contributed to discussion, and reviewed and edited the manuscript. A.F. analyzed data, contributed to discussion, and reviewed and edited the manuscript. M.M., A.Pac., S.F., A.Pal., and R.V. researched data and reviewed and edited the manuscript. E.M. and A.R. researched data, organized the database, and reviewed and edited the manuscript. R.D.P. and C.M. researched data, contributed to discussion, and reviewed and edited the manuscript. M.C. supervised the FMS, contributed to discussion, and reviewed and edited the manuscript. S.D.C. is the guarantor of this work and, as such, had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
The authors thank Dr. Francesco Giuliani (IRCCS Casa Sollievo della Sofferenza) for his kind help in developing the risk engine software.