To develop a model for the prediction of type 2 diabetes mellitus (T2DM) risk on the basis of a multivariate logistic model and 1-h plasma glucose concentration (1-h PG).
The model was developed in a cohort of 1,562 nondiabetic subjects from the San Antonio Heart Study (SAHS) and validated in 2,395 nondiabetic subjects in the Botnia Study. A risk score on the basis of anthropometric parameters, plasma glucose and lipid profile, and blood pressure was computed for each subject. Subjects with a risk score above a certain cut point were considered to represent high-risk individuals, and their 1-h PG concentration during the oral glucose tolerance test was used to further refine their future T2DM risk.
We used the San Antonio Diabetes Prediction Model (SADPM) to generate the initial risk score. A risk-score value of 0.065 was found to be an optimal cut point for initial screening and selection of high-risk individuals. A 1-h PG concentration >140 mg/dL in high-risk individuals (whose risk score was >0.065) was the optimal cut point for identification of subjects at increased risk. Together, the two cut points had a sensitivity, specificity, and positive predictive value of 77.8, 77.4, and 44.8%, respectively, in the SAHS and 75.8, 71.6, and 11.9%, respectively, in the Botnia Study.
A two-step model, based on the combination of the SADPM and 1-h PG, is a useful tool for the identification of high-risk Mexican-American and Caucasian individuals.
The prevalence of type 2 diabetes mellitus (T2DM) has increased in recent decades to epidemic proportions (1). Because of the chronic course of T2DM, and the significant morbidity and mortality associated with the vascular complications of the disease, T2DM has become not only a serious public health threat but also a heavy economic burden on every health care system (2). Recent clinical trials have demonstrated that the incidence of T2DM can be reduced with lifestyle intervention (3,4) and pharmacotherapy (4,5) in subjects with impaired glucose tolerance (IGT). These results indicate that primary prevention of T2DM is a promising strategy to restrain the epidemic increase in disease prevalence and control the economic burden that it poses on health care expenditure.
Accurate identification of subjects at increased risk of future T2DM is essential for every prevention program. It minimizes the number of subjects in the intervention program and improves its efficacy and cost-effectiveness. All previous intervention trials that have tested the efficacy of various prevention strategies have recruited subjects with IGT (3–5) and/or impaired fasting glucose (IFG). Although subjects with IGT are at increased risk for future T2DM compared with individuals with normal glucose tolerance (NGT), only 35–50% of subjects with IGT convert to T2DM after 5–10 years (6–8), and, even after 20 years of follow-up, only ~50% of subjects with IGT convert to T2DM (8). Furthermore, ~30–40% of subjects who develop T2DM in prospective studies have NGT at baseline (7,8), suggesting that the future risk for T2DM is not similar among all subjects with IGT or NGT. Thus, by solely relying on IGT for the identification of subjects at increased T2DM risk, a large group of high-risk subjects with NGT remains unidentified (9).
These limitations associated with the use of IGT to identify high-risk individuals have led to the development of prediction models based on multivariate logistic models using risk factors for T2DM (e.g., age, sex, BMI, fasting plasma glucose [FPG], lipid profile, and blood pressure) (10–18). These predictive models have been shown to perform as well as IGT in predicting future T2DM risk. Because all of the measurements required for these models are taken during the fasting state, these models have been advocated to replace IGT for the identification of subjects at increased risk for future T2DM without the need to perform an oral glucose tolerance test (OGTT).
Although multivariate prediction models have better sensitivity compared with IGT in identifying subjects at increased T2DM risk, they have relatively low specificity and positive predictive value (PPV). We have shown previously (19–21) that a plasma glucose concentration >155 mg/dL at 1 h during the OGTT identifies subjects at increased T2DM risk with relatively high sensitivity and specificity. We also demonstrated that the 1-h plasma glucose concentration (1-h PG) outperforms both IGT and multivariate prediction models in the identification of high-risk individuals. Furthermore, we demonstrated that the addition of the 1-h PG to multivariate prediction models significantly improves their predictive power, indicating that the 1-h PG contains additional information about future T2DM risk compared with all known diabetes risk factors. However, similar to IGT, the 1-h PG requires a glucose load.
In this study, we used data from the San Antonio Heart Study (SAHS) to develop a two-step model for the prediction of future T2DM risk. This model involves screening all nondiabetic subjects using the San Antonio Diabetes Prediction Model (SADPM) (14,19) and administering an oral glucose load to obtain the 1-h PG value, only in high-risk individuals, to further refine their future T2DM risk. We demonstrate that this two-step model decreases the number of subjects who require an oral glucose load and has high sensitivity, specificity, and PPV in identifying subjects at increased T2DM risk.
RESEARCH DESIGN AND METHODS
Subjects were participants of the SAHS (n = 1,562) (22) and the Botnia Study (n = 2,395) (23) who were free of diabetes at baseline. The two studies are prospective longitudinal studies in which nondiabetic subjects (Caucasians and Mexican Americans in the SAHS and Caucasians in the Botnia Study) were recruited and followed for 7–8 years. Detailed descriptions of the Botnia Study and SAHS previously have been published (22,23). Only subjects with 2-h plasma glucose concentrations (2-h PG) <200 mg/dL and FPG <125 mg/dL at baseline were included in this study. Table 1 presents the baseline patient characteristics. All subjects completed a 7- to 8-year follow-up examination and had their diabetes outcome determined with a repeat OGTT.
Table 1. Baseline characteristics of subjects who progressed to T2DM and nonprogressors in the SAHS and the Botnia Study
Characteristic | SAHS nonprogressors | SAHS progressors | P (SAHS) | Botnia nonprogressors | Botnia progressors | P (Botnia) |
---|---|---|---|---|---|---|
n | 1,388 | 174 | | 2,271 | 124 | |
Age (years) | 43 ± 1 | 48 ± 1 | <0.0001 | 46 ± 1 | 53 ± 1 | <0.0001 |
Sex (% male) | 43.2 | 39 | | 45.9 | 43.2 | |
FPG (mg/dL) | 85 ± 1 | 95 ± 1 | <0.0001 | 89 ± 1 | 95 ± 1 | <0.0001 |
1-h PG (mg/dL) | 127 ± 1 | 179 ± 3 | <0.0001 | 124 ± 1 | 168 ± 3 | <0.0001 |
2-h PG (mg/dL) | 101 ± 1 | 137 ± 3 | <0.0001 | 99 ± 1 | 119 ± 3 | <0.0001 |
HDL cholesterol (mg/dL) | 48 ± 1 | 42 ± 1 | <0.0001 | 54.3 ± 1 | 49.5 ± 1 | <0.0001 |
BMI (kg/m²) | 27.2 ± 0.2 | 31.2 ± 0.4 | <0.0001 | 25.6 ± 0.1 | 28.9 ± 0.4 | <0.0001 |
Systolic blood pressure (mmHg) | 117 ± 1 | 124 ± 1 | <0.0001 | 129 ± 1 | 140 ± 2 | <0.0001 |
Ethnicity (% white) | 35 | 22 | | 100 | 100 | |
8-Year diabetes incidence rate (%) | 0 | 100 | | 0 | 100 | |
SADPM risk score [median (range)] | 0.068 (0.001–0.94) | 0.28 (0.007–0.96) | <0.0001 | 0.185 (0.001–0.973) | 0.465 (0.013–0.959) | <0.0001 |
Data are means ± SEM, unless otherwise indicated. SADPM risk score refers to the median (range) risk score measured by the SADPM (according to ref.14) at baseline.
Study design
During the baseline studies, clinical and anthropometric parameters (age, sex, BMI, and ethnicity) were collected. Blood pressure and plasma lipid concentrations were measured. In addition, all subjects received a 75-g OGTT following a 12-h overnight fast. Plasma glucose and serum insulin concentrations were measured at 0, 30, 60, and 120 min. After 7–8 years of follow-up, a repeat OGTT was performed and the diagnosis of diabetes was made on the basis of American Diabetes Association criteria (24) (2-h plasma glucose concentrations ≥200 mg/dL or FPG ≥126 mg/dL).
Analytical methods
Plasma glucose was measured at bedside with the glucose oxidase method using a Beckman Glucose Analyzer in both studies (Beckman Instruments, Fullerton, CA).
Data analysis and statistical methods
The two-step model is based on the concept of screening the population using the SADPM (14) and performing a 1-h PG only in high-risk individuals. The SADPM (14) is a multivariate logistic regression based on age, sex, BMI, ethnicity, FPG, HDL cholesterol, and blood pressure to compute a risk score for future T2DM. We used the dataset of the SAHS to develop the two-step model and the Botnia Study dataset to validate it.
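The decision logic of the two-step model can be sketched in a few lines of Python. This is an illustration only, not the authors' implementation: the function name `two_step_high_risk` is hypothetical, the cut points are the ones reported in the Results, and treating a score exactly equal to the cut point as low risk is an assumption, because the text defines only the ">" and "<" cases.

```python
RISK_SCORE_CUT = 0.065   # SADPM risk-score cut point reported in the text
ONE_HOUR_PG_CUT = 140.0  # 1-h PG cut point (mg/dL) reported in the text


def two_step_high_risk(risk_score, one_hour_pg=None):
    """Return True when a subject is flagged by both steps of the model.

    risk_score: SADPM risk score (computed elsewhere, as published).
    one_hour_pg: 1-h plasma glucose in mg/dL, or None when no glucose
        load was given (only acceptable for low-risk subjects).
    """
    # Step 1: subjects at or below the risk-score cut point are low risk
    # and need no glucose load (handling of equality is an assumption).
    if risk_score <= RISK_SCORE_CUT:
        return False
    # Step 2: the 1-h PG is needed only for subjects flagged in step 1.
    if one_hour_pg is None:
        raise ValueError("1-h PG is required when the risk score exceeds the cut point")
    return one_hour_pg > ONE_HOUR_PG_CUT
```

A subject with a score of 0.03 is classified as low risk without any glucose measurement, whereas a subject with a score of 0.10 is flagged only if the 1-h PG exceeds 140 mg/dL.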
In the first step, the SADPM, as published (14), was used to compute a score of T2DM risk for each participant. To obtain the optimal cut point above which subjects were considered at high risk, we constructed a receiver-operating characteristic curve by plotting the sensitivity against the false-positive rate, and the score value with the maximal sum of sensitivity and specificity was chosen to represent the optimal cut point. Likewise, the optimal cut point for the 1-h PG was determined by constructing a receiver-operating characteristic curve and defining the 1-h PG value with the maximal sum of sensitivity and specificity.
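The cut point selection described above, choosing the value with the maximal sum of sensitivity and specificity, can be sketched as follows. This is a minimal pure-Python illustration rather than the SPSS procedure used in the study; the function name `optimal_cut_point` and the sample data are hypothetical, and a subject is assumed to test positive when the score is strictly above the cut point.

```python
def optimal_cut_point(scores, outcomes):
    """Return (cut, sensitivity, specificity) maximizing sensitivity + specificity.

    scores: risk scores or 1-h PG values, one per subject.
    outcomes: booleans, True if the subject progressed to T2DM.
    A subject is called positive when score > cut (assumed convention).
    """
    positives = sum(1 for y in outcomes if y)
    negatives = len(outcomes) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one progressor and one nonprogressor")

    best = None
    # Every observed value is a candidate cut point on the ROC curve.
    for cut in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, outcomes) if y and s > cut)
        tn = sum(1 for s, y in zip(scores, outcomes) if not y and s <= cut)
        sens = tp / positives
        spec = tn / negatives
        # Keep the first cut point with the largest sum.
        if best is None or sens + spec > best[1] + best[2]:
            best = (cut, sens, spec)
    return best


# Hypothetical toy data, not study data.
toy_scores = [0.01, 0.03, 0.05, 0.08, 0.20, 0.40]
toy_outcomes = [False, False, False, True, False, True]
cut, sens, spec = optimal_cut_point(toy_scores, toy_outcomes)
```

The quadratic loop is adequate for cohorts of a few thousand subjects; a sorted single pass would be preferable for much larger datasets.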
The predictive power of the two cut points (>0.065 for the SADPM risk score and >140 mg/dL for the 1-h PG) was tested in the Botnia Study dataset. The predictive power of the model was assessed by computing the sensitivity, specificity, and PPV in the Botnia Study and SAHS, and the results were compared with those of other prediction models.
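For reference, the three performance measures can be computed from predicted and observed outcomes as in the sketch below. The function name `screening_metrics` and the example vectors are hypothetical and are not study data.

```python
def screening_metrics(predicted, observed):
    """Compute sensitivity, specificity, and PPV.

    predicted: booleans, True if the model flags the subject as high risk.
    observed: booleans, True if the subject progressed to T2DM.
    """
    tp = fp = tn = fn = 0
    for p, o in zip(predicted, observed):
        if p and o:
            tp += 1      # flagged and progressed
        elif p and not o:
            fp += 1      # flagged but did not progress
        elif not p and o:
            fn += 1      # missed progressor
        else:
            tn += 1      # correctly not flagged
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
    }


# Hypothetical example: 8 subjects, 3 of whom progressed.
predicted = [True, True, True, False, False, False, False, False]
observed = [True, True, False, True, False, False, False, False]
metrics = screening_metrics(predicted, observed)
```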
Variables are presented as means ± SD. The significance of the mean differences was tested using ANOVA. Differences between categorical variables were tested using the χ2 test. Statistical significance was considered at the level of P < 0.05. Statistical analysis was performed using the SPSS software package version 17.
RESULTS
Model development
Nondiabetic subjects in the SAHS who had their plasma glucose concentration measured at 0, 30, 60, and 120 min during the baseline OGTT and completed the 7–8 years of follow-up were used for model development. The cut point of the risk score that had the maximal sum of sensitivity and specificity was 0.065. A total of 739 of 1,562 (47%) study participants had a risk score below this value and were considered to be low-risk individuals. A total of 823 subjects (53%) had a risk score >0.065 and were considered to be at high risk. The optimal cut point of 1-h PG that had the maximal sum of sensitivity and specificity for the prediction of T2DM risk in the 823 subjects who had a risk score >0.065 was 140 mg/dL. Of 823 subjects with an elevated risk score, 452 subjects also had 1-h PG >140 mg/dL and 371 subjects had a 1-h PG <140 mg/dL. Thus, the 0.065 cut point for the risk score and 140 mg/dL for the 1-h PG had a sensitivity, specificity, and PPV of 77.8, 77.4, and 44.8%, respectively, for identifying subjects at increased future T2DM risk in the SAHS.
Table 2 demonstrates that subjects with a risk score >0.065 and 1-h PG >140 mg/dL had a 36-fold increase in their T2DM risk compared with subjects with a risk score <0.065 and 1-h PG <140 mg/dL.
Table 2. T2DM risk associated with increased SADPM score and 1-h PG
Group (SADPM risk score, 1-h PG in mg/dL) | SAHS odds ratio (95% CI) | P (SAHS) | Botnia odds ratio (95% CI) | P (Botnia) |
---|---|---|---|---|
Risk score <0.065, 1-h PG <140 | 1 | | 1 | |
Risk score <0.065, 1-h PG >140 | 7.10 (2.78–18.1) | <0.0001 | 4.22 (0.37–47.5) | 0.289 |
Risk score >0.065, 1-h PG <140 | 4.64 (1.94–11.1) | <0.0001 | 3.69 (0.86–15.70) | 0.062 |
Risk score >0.065, 1-h PG >140 | 36.53 (16.9–79.0) | <0.0001 | 25.27 (6.18–103.2) | <0.0001 |
Model validation
Model validation was performed in the Botnia Study in 2,395 nondiabetic subjects who had their plasma glucose concentrations measured at 0, 30, 60, and 120 min during the baseline OGTT and completed a 7- to 8-year follow-up. A total of 419 of 2,395 (18%) subjects had a risk score <0.065 and, therefore, represent low-risk individuals. Of 1,976 subjects with a risk score >0.065, 734 had 1-h PG concentrations >140 mg/dL and represented the target population for intervention. The cut point of 0.065 for the SADPM and 140 mg/dL for the 1-h PG concentration had 76, 72, and 12% sensitivity, specificity, and PPV, respectively, in the Botnia Study.
Subjects with a risk score >0.065 and 1-h PG >140 mg/dL had a 25-fold increase in T2DM risk compared with subjects with a risk score <0.065 and 1-h PG <140 mg/dL.
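An odds ratio of this kind, with its 95% CI, can be obtained from a 2 × 2 table as sketched below. The text does not state how the reported odds ratios were estimated, so the Wald interval on the log scale used here is an assumption, and the function name `odds_ratio_ci` and the counts are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2 x 2 table.

    a: progressors in the group of interest
    b: nonprogressors in the group of interest
    c: progressors in the reference group
    d: nonprogressors in the reference group
    z: normal quantile (1.96 for a 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Wald approximation.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Hypothetical counts, not study data.
or_value, ci_low, ci_high = odds_ratio_ci(20, 80, 5, 95)
```

Small cell counts widen the interval considerably, which is consistent with the wide CIs seen for groups with few progressors.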
CONCLUSIONS
Accurate identification of subjects at increased future T2DM risk is pivotal to all intervention programs that aim to prevent and/or delay the onset of T2DM. Currently, two methods are available to identify subjects at increased risk of future T2DM: 1) the 2-h 75-g OGTT to identify subjects with IGT (3–5) and 2) risk-score models computed with multivariate regression models based on diabetes risk factors (age, sex, BMI, FPG, lipid profile, and blood pressure) (10–18). Although the 2-h OGTT has been used in all previous clinical trials to identify high-risk subjects with IGT for intervention, it is inconvenient in routine clinical practice. Moreover, by solely relying on IGT to identify at-risk subjects, many subjects with NGT who are at increased risk of T2DM remain unidentified (9). On the other hand, risk-score models, such as the SADPM and other models, which require measurements taken only during the fasting state, are easy to use and also may identify high-risk individuals with NGT. It is important to note that all variables required to obtain the risk score with these prediction models are collected in routine clinical practice and are readily available in the subject’s medical record. Thus, a risk score for future T2DM risk can be computed by the physician in the absence of the subject and without the need for an office visit. Despite the simplicity of the risk-score models and their good sensitivity, they have several limitations. Their performance is weaker in populations other than those in which they were developed, and, as a result, the parameters of each model need to be reoptimized in each population. Another limitation of risk-score models is that, despite their good sensitivity in identifying high-risk individuals, they have relatively low specificity (40–60%) and PPV compared with IGT (documented with OGTT).
In the current study, the specificity and PPV of the SADPM were approximately one-half of those obtained with the 1-h PG and the two-step model in both the Botnia Study and SAHS. Moreover, the PPV of risk-score models in the populations in which they were developed failed to exceed 10% (13–18). It is likely that these multivariate regression models would have even lower specificity and PPV when used in populations other than the ones in which they were developed. For example, in the current study, the sensitivity of the SADPM was 88.8 and 97.4% in the SAHS and Botnia Study, respectively. However, the specificity and PPV were 52 and 19%, respectively, in the SAHS and 18 and 6%, respectively, in the Botnia Study (Table 3). Thus, by relying on multivariate models to identify high-risk individuals for prevention programs, many false-positive cases would be identified as high-risk individuals and would be invited to participate in the intervention program. This would increase the cost of the intervention program and reduce its efficacy and cost-effectiveness. Thus, despite the simplicity, convenience, and relatively low cost of using risk-score models to identify high-risk individuals, the relatively low PPV of these models will reduce the overall cost-effectiveness of any prediction-prevention program that relies on a risk-score model for diabetes prediction. The two-step prediction model presented in the current study provides a balance between both approaches and has several advantages: 1) Because the variables required to establish the risk score (BMI, blood pressure, FPG, and lipid profile) are collected in routine clinical practice, they are readily available in medical records. Thus, the initial risk score using the SADPM can be computed without the need for an office visit by the patient. 2) By refining the risk for future diabetes with the 1-h PG, the specificity and PPV of the model are significantly improved (Table 3).
Because of the weak specificity and PPV of the SADPM, >50% of subjects classified as high-risk individuals (>0.065) are false-positively identified, and these false-positive individuals can be excluded from the intervention program with the 1-h PG. Furthermore, use of the 1-h PG reduces the time required to perform the glucose load by 50% compared with establishing the diagnosis of IGT. Because only subjects with an increased risk score (>0.065) are required to perform the glucose load, an office visit is required only in a subgroup of the population to identify individuals who truly are at increased future T2DM risk. Thus, similar to IGT, this approach has the advantage of high specificity and PPV compared with risk-score models, while avoiding some of the limitations of IGT (e.g., reducing the time required for the OGTT and decreasing the number of subjects required to perform a glucose load). 3) The improved specificity and PPV obtained with the 1-h PG lead to a smaller number of individuals who are targeted for the intervention program, and this would substantially reduce the cost and increase the efficacy and cost-effectiveness. Although obtaining 1-h PG concentrations will increase the cost of screening, we believe that the decrease in intervention costs that follows from the improved specificity and PPV of the two-step approach will outweigh the cost of performing the glucose load, so that using the two-step model to identify high-risk individuals for the prevention program increases the overall cost-effectiveness of the prediction-prevention program. 4) We have validated the two-step prediction model in a completely independent dataset with a different ethnicity (Swedish Caucasians versus Mexican Americans), suggesting that this model has the potential to perform well in other ethnic groups.
Table 3. Sensitivity, specificity, and PPV of various diabetes prediction models
Model | SAHS sensitivity (%) | SAHS specificity (%) | SAHS sensitivity + specificity | SAHS PPV (%) | Botnia sensitivity (%) | Botnia specificity (%) | Botnia sensitivity + specificity | Botnia PPV (%) |
---|---|---|---|---|---|---|---|---|
SADPM | 88.8 | 52.0 | 140.8 | 19.4 | 97.4 | 18.2 | 115.6 | 5.7 |
IFG and/or IGT | 64.4 | 86.9 | 151.3 | 39.0 | 77.5 | 46.4 | 123.9 | 6.8 |
IFG | 31.6 | 91.5 | 123.1 | 41.2 | 68.5 | 51.2 | 119.7 | 6.9 |
IGT | 45.6 | 91.2 | 136.8 | 39.1 | 39.2 | 85.6 | 124.8 | 12.8 |
1-h PG >155 mg/dL | 75.0 | 78.7 | 153.7 | 45.9 | 62.0 | 81.3 | 143.3 | 14.5 |
Two-step model | 77.7 | 77.4 | 155.1 | 44.8 | 75.8 | 71.6 | 147.4 | 11.9 |
The SADPM score was calculated according to ref.14 (see text for more details), and a 0.065 cut point value was used to calculate the sensitivity and specificity for this model. IFG and IGT were defined according to the American Diabetes Association criteria (24). The two-step model was based on a 0.065 cut point in the SADPM in the first step and a 1-h PG >140 mg/dL during the OGTT.
In the current study, we favored the SADPM over other risk-score models for several reasons: 1) the SADPM has been validated in other populations and 2) studies that have compared the sensitivity and specificity of the SADPM with those of other risk scores have reported similar or greater specificity for the SADPM (e.g., relative to the Framingham Study and Atherosclerosis Risk in Communities Study risk scores). Although the specificity and PPV of the SADPM were slightly lower in other populations compared with Mexican Americans, the addition of 1-h PG to the SADPM improved the specificity and PPV of the two-step model.
Although the two-step model provides a balanced approach for identifying subjects at increased future T2DM risk, it has some limitations. It requires a glucose load in a subgroup of the population, which is inconvenient and expensive and requires a special office visit, compared with relying only on the FPG and HbA1c. However, the information generated with the two-step model may outweigh the additional cost and inconvenience associated with the glucose load. Moreover, a glucose load may not be required in very-high-risk individuals (e.g., HbA1c >6% and FPG >115 mg/dL). In the Botnia Study, this subgroup (FPG >115 mg/dL and HbA1c >6.0%) had a 32% risk of T2DM over an 8-year period. However, this cut point had very low sensitivity (8%). Of note, all subjects with FPG >115 mg/dL and HbA1c >6.0% had a score in the SADPM >0.065 and 1-h PG >140 mg/dL. One could argue that in such a very-high-risk group, a glucose load may not be necessary.
Tools to ascertain the risk for future diabetes are valuable to the extent one believes that it is important to detect diabetes as soon as it exists. Although there are no trials to inform us as to the urgency to detect diabetes, it is clear that hyperglycemia is the principal cause of microvascular complications (25), so any test or algorithm that might help to reduce the time spent with undetected hyperglycemia should intuitively be of benefit. Future studies should address the value of early detection and, by implication, the value of sensitive and specific risk-assessment tools for future diabetes, such as the one described here.
The low incidence rate of T2DM in the Botnia Study compared with the SAHS (Table 1) most likely explains the low PPV of the present model in this population. Of note, the 1-h PG resulted in a twofold increase in PPV compared with the SADPM in both populations, suggesting that the contribution of the 1-h PG improves the accuracy of the prediction model and is independent of the incidence rate of T2DM. The decreased T2DM incidence and decreased specificity and PPV in the Botnia Study result in a larger number of subjects required to perform an OGTT (82 vs. 53% in the SAHS).
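The dependence of PPV on incidence follows directly from Bayes' rule and can be illustrated with a short sketch. The function name `ppv_from_prevalence` and the input values are hypothetical; they are round numbers chosen for illustration and are not the study estimates.

```python
def ppv_from_prevalence(sensitivity, specificity, prevalence):
    """Positive predictive value implied by Bayes' rule.

    All three arguments are proportions between 0 and 1.
    """
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Same hypothetical test (80% sensitivity, 80% specificity) applied to
# two populations that differ only in disease incidence.
ppv_low_incidence = ppv_from_prevalence(0.8, 0.8, 0.05)
ppv_high_incidence = ppv_from_prevalence(0.8, 0.8, 0.10)
```

With identical sensitivity and specificity, halving the incidence markedly lowers the PPV, which is the effect invoked above for the Botnia Study.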
In summary, we have demonstrated that a two-step approach based on a multivariate logistic model and a 1-h PG concentration during an OGTT identifies subjects at increased risk of T2DM with high sensitivity, specificity, and PPV compared with risk-score models or IFG and/or IGT. Although this approach requires the measurement of 1-h PG, the glucose load needs to be administered only to a subgroup of the population. This model could provide a useful tool for the identification of subjects at increased risk of T2DM in the community.
Acknowledgments
No potential conflicts of interest relevant to this article were reported.
M.A.A.-G. and T.A.-G. performed the analysis and wrote the manuscript. M.P.S. reviewed the manuscript and contributed to generating data. J.K. prepared the dataset for analysis. T.T., I.B., and L.G. contributed to generating data. R.A.D. reviewed the manuscript.