OBJECTIVE—Our objective was to compare the performance of oral glucose tolerance tests (OGTTs) and multivariate models incorporating commonly available clinical variables in their ability to predict future cardiovascular disease (CVD).
RESEARCH DESIGN AND METHODS—We randomly selected 2,662 Mexican-Americans and 1,595 non-Hispanic whites, 25–64 years of age, who were free of both CVD and known diabetes at baseline from several San Antonio census tracts. Medical history, cigarette smoking history, BMI, blood pressure, fasting and 2-h plasma glucose and serum insulin levels, triglyceride level, and fasting serum total, LDL, and HDL cholesterol levels were obtained at baseline. CVD developed in 88 Mexican-Americans and 71 non-Hispanic whites after 7–8 years of follow-up. Stepwise multiple logistic regression models were developed to predict incident CVD. The areas under receiver operator characteristic (ROC) curves were used to assess the predictive power of these models.
RESULTS—The area under the 2-h glucose ROC curve was modestly but not significantly greater than under the fasting glucose curve, but both were relatively weak predictors of CVD. The areas under the ROC curves for the multivariate models incorporating readily available clinical variables other than 2-h glucose were substantially and significantly greater than under the glucose ROC curves. Addition of 2-h glucose to these models did not improve their predicting power.
CONCLUSIONS—Better identification of individuals at high risk for CVD can be achieved with simple predicting models than with OGTTs, and the addition of the latter adds little if anything to the predictive power of the model.
A principal reason that is typically given for screening large segments of the population with 2-h oral glucose tolerance tests (OGTTs) is to identify individuals with impaired glucose tolerance (IGT), because such individuals are at increased risk for diabetes. We have previously shown that individuals at high risk for diabetes can be more efficiently identified using multivariate models that do not require OGTT (1). A second reason that is typically given for screening the population for IGT is that individuals with this condition are also at increased risk for cardiovascular disease (CVD) (2,3). In this study, we examine the possibility that as good or superior identification of individuals at high risk for CVD can be achieved using readily available clinical measurements that, again, do not require OGTT.
As in our previous report (1), we evaluated the performance of various tests using receiver operator characteristic (ROC) curves in which the sensitivity of a test is plotted against the corresponding false-positive rate. In the present context, sensitivity refers to the percentage of individuals whose initial values were above a given cutpoint among those who later developed CVD and false-positive rate refers to the percentage of these individuals among those who nevertheless remained free of CVD. The area under the ROC curve measures how well a continuous variable predicts the outcome of interest: if the sensitivity increases steeply as the threshold for diagnosis is relaxed with only a relatively slow accumulation of false positives, the area under the ROC curve will be large; conversely, if the sensitivity increases slowly as the threshold for diagnosis is relaxed with a rapid accumulation of false positives, the area under the ROC curve will be correspondingly smaller. The differences in areas may be tested to determine whether they are statistically significant. We have used this approach to compare the fasting glucose value and the value 2 h after an oral glucose load with various multivariate models for predicting future CVD.
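The construction of an ROC curve and its area can be illustrated with a minimal sketch. The data below are hypothetical and are not drawn from the San Antonio Heart Study; the function names `roc_points` and `auc` are ours, and the sketch treats a value at or above the cutpoint as a positive test, which is an assumption about how ties at the cutpoint are handled.

```python
def roc_points(values, outcomes):
    """Return (false_positive_rate, sensitivity) pairs, one per cutpoint.

    values:   baseline measurement for each person (higher = more suspicious)
    outcomes: 1 if the person later developed the outcome, else 0
    """
    n_pos = sum(outcomes)
    n_neg = len(outcomes) - n_pos
    points = [(0.0, 0.0)]
    # Relax the threshold step by step, from the highest observed value down.
    for cut in sorted(set(values), reverse=True):
        tp = sum(1 for v, y in zip(values, outcomes) if v >= cut and y == 1)
        fp = sum(1 for v, y in zip(values, outcomes) if v >= cut and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical toy data (illustration only).
glucose = [90, 95, 100, 105, 110, 120, 130, 140]
cvd = [0, 0, 0, 1, 0, 1, 0, 1]

curve = roc_points(glucose, cvd)
print(curve)
print(round(auc(curve), 3))
```

In this toy example, 12 of the 15 possible pairs of one affected and one unaffected individual are ranked correctly by the measurement, which corresponds to an area of 0.8.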
RESEARCH DESIGN AND METHODS
The data presented in this report were collected as part of the San Antonio Heart Study, the methods of which have been previously described (4–6). Briefly, households were randomly sampled from three types of neighborhoods: low, middle, and high income. Individuals residing in these households were eligible for the study if they were between the ages of 25 and 64 years and, if women, were not pregnant. There were no other exclusions, except that only Mexican-Americans were recruited from the low-income neighborhoods, there being a negligible number of non-Mexican-American individuals residing in these neighborhoods. Stratified random sampling was used in the middle- and high-income neighborhoods to recruit an approximately equal number of Mexican-Americans and non-Hispanic whites into the samples from these neighborhoods.
The baseline data were collected in two phases: from 1979 to 1982 and from 1984 to 1988. A total of 5,158 participants were enrolled in these two phases, representing a response rate of 65.3% of all study-eligible individuals residing in the selected households. Follow-up data were collected from 1987 to 1991 and from 1992 to 1997. The median follow-up period was ∼7.5 years. Height, weight, systolic and diastolic blood pressure, fasting and 2-h plasma glucose concentrations, fasting and 2-h serum insulin concentrations, triglyceride concentrations, and fasting serum total, LDL, and HDL cholesterol levels were measured by identical methods at both the baseline and follow-up examinations as previously described (4,5). The protocol was approved by the institutional review board of the University of Texas Health Science Center at San Antonio, and all participants gave informed consent.
CVD outcomes, defined as self-reported heart attack, stroke, coronary revascularization procedure, or cardiovascular death (International Classification of Diseases, 9th Revision [ICD-9] codes of 390–459 on the death certificate) were available on 4,839 individuals. A total of 3,736 of these individuals were either reexamined or determined to have died of a cardiovascular cause before their scheduled follow-up examination, and 1,103 of these individuals provided outcome data by telephone or home interview. Of the 4,839 individuals for whom CVD outcome data were available, 392 were excluded because of prevalent CVD at baseline, defined as self-reported heart attack, stroke, coronary revascularization procedure, or angina (by Rose Angina questionnaire) (7), and 190 were excluded because they self-reported previous diagnosis of diabetes before their baseline clinic examination; therefore, 4,257 individuals were available for analysis. Because clinical diabetes is an established risk factor for CVD (8), by excluding these individuals we reduced the likelihood of finding plasma glucose to be a significant risk factor for CVD. Nevertheless, we believed it was appropriate to exclude them because subjects in whom diabetes had been diagnosed would not ordinarily be screened for cardiovascular risk with an OGTT. A total of 159 of the 4,257 individuals experienced a first CVD event over the follow-up period for a crude incidence of 3.7% (159 of 4,257). These 159 events were distributed as follows: 22 deaths due to CVD; 45 coronary revascularization procedures; 56 self-reported heart attacks; and 36 self-reported strokes. The number of individuals with no missing values for the variables that entered the multivariate models was 3,902, of whom 145 experienced a first CVD event.
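The derivation of the analysis cohort and the crude incidence can be checked with a few lines of arithmetic. The sketch below uses only the counts quoted above; the variable names are ours.

```python
# Counts quoted in the text.
with_outcome_data = 4839
reexamined_or_cvd_death = 3736
telephone_or_home_interview = 1103
excluded_prevalent_cvd = 392
excluded_known_diabetes = 190

# Individuals remaining after both exclusions.
analysis_cohort = with_outcome_data - excluded_prevalent_cvd - excluded_known_diabetes

# First CVD events by type.
events = {
    "CVD death": 22,
    "coronary revascularization": 45,
    "self-reported heart attack": 56,
    "self-reported stroke": 36,
}
total_events = sum(events.values())
crude_incidence_pct = 100.0 * total_events / analysis_cohort

print(analysis_cohort)                 # 4257
print(total_events)                    # 159
print(round(crude_incidence_pct, 1))   # 3.7
```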
Using multiple logistic regression analysis, univariate odds ratios for CVD were computed for each potential CVD risk factor for men and women separately and for both sexes combined. For continuous risk factors, the odds ratios are presented for a 1-SD increment. A multivariate predicting model with both sexes combined was then developed using a stepwise logistic regression procedure in which the variables that had shown statistically significant odds ratios when examined individually were allowed to enter the model. Age, sex, and ethnicity were forced into this model. (Serum insulin concentration was not allowed to enter the model even though it had a significant odds ratio when examined individually, because we aimed to limit the eventual predicting models to variables that would be readily available in a clinical setting.) The significance criteria to enter and remain in the model were 5 and 10%, respectively. The multivariate model was developed on individuals with no missing values for any of the variables that entered (of 2,427 Mexican-Americans and 1,475 non-Hispanic whites, CVD developed in 81 Mexican-Americans and 64 non-Hispanic whites; of 1,693 men and 2,209 women, CVD developed in 95 men and 50 women). To determine whether 2-h glucose made an additional contribution to predicting CVD once the other variables were considered, we created a new model by forcing 2-h glucose into the previous model. Finally, the models developed for the two sexes combined were fit to the data for men and women separately.
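As an illustration of how an odds ratio per 1-SD increment and an individual predicted risk follow from a fitted logistic model, consider the sketch below. The coefficient and SD are hypothetical values chosen for illustration, not estimates from this study, and the function names are ours.

```python
import math


def odds_ratio_per_sd(beta_per_unit, sd):
    """Odds ratio for a 1-SD increase, given a logistic coefficient per unit."""
    return math.exp(beta_per_unit * sd)


def predicted_risk(intercept, betas, values):
    """Predicted probability of the outcome from a logistic regression model."""
    linear = intercept + sum(b * x for b, x in zip(betas, values))
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical example: coefficient of 0.02 per mmHg of systolic blood
# pressure and an SD of 18 mmHg (both assumed for illustration only).
print(round(odds_ratio_per_sd(0.02, 18.0), 2))
```

Expressing each continuous risk factor per 1-SD increment puts variables measured in different units on a common scale, so that their odds ratios can be compared directly.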
ROC curves were calculated for fasting and 2-h glucose concentrations and for the multivariate models using SAS PROC LOGISTIC software (9) and plotted for each 1% increment of the false-positive rate. To evaluate possible nonlinear associations between glucose levels and CVD outcomes, quintiles of fasting and 2-h glucose concentration were created and used as categorical variables in the ROC analysis. The Hosmer-Lemeshow test was used to assess the fit of the models (9). The CVD risks predicted for each individual by the various logistic regression models were used to construct ROC curves for these models. The statistical significance of differences in areas under the ROC curves was estimated using the approach and SAS software developed by DeLong et al. (10). ROC CIs were calculated using the “roccomp” command in Stata (11).
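The general form of the Hosmer-Lemeshow statistic can be sketched as follows. This is an illustrative implementation under stated assumptions (equal-sized groups formed by rank of predicted risk, and no group with a mean predicted risk of exactly 0 or 1); it is not the SAS procedure used in this study, whose grouping rules may differ, and it returns only the statistic and degrees of freedom, not a P value.

```python
def hosmer_lemeshow(pred, outcomes, groups=10):
    """Return (chi-square statistic, degrees of freedom).

    pred:     predicted probability of the outcome for each individual
    outcomes: 1 if the outcome occurred, else 0
    groups:   number of risk groups (10 gives the usual deciles of risk)
    """
    # Rank individuals by predicted risk, lowest first.
    order = sorted(range(len(pred)), key=lambda i: pred[i])
    n = len(order)
    stat = 0.0
    used = 0
    for g in range(groups):
        idx = order[g * n // groups:(g + 1) * n // groups]
        if not idx:
            continue
        n_g = len(idx)
        observed = sum(outcomes[i] for i in idx)   # observed events in group
        expected = sum(pred[i] for i in idx)       # expected events in group
        mean_risk = expected / n_g
        # Assumes 0 < mean_risk < 1, otherwise the denominator is zero.
        stat += (observed - expected) ** 2 / (n_g * mean_risk * (1.0 - mean_risk))
        used += 1
    return stat, used - 2


# Hypothetical toy data: two risk groups, the first of them miscalibrated.
pred = [0.2] * 10 + [0.8] * 10
outcomes = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0] + [1] * 8 + [0] * 2
print(hosmer_lemeshow(pred, outcomes, groups=2))
```

A large statistic relative to a chi-square distribution with the returned degrees of freedom indicates that observed and predicted event counts disagree, that is, that the fit of the model is rejected.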
RESULTS
The univariate odds ratios and 95% CIs for selected cardiovascular risk factors are shown in Table 1. For continuous variables, the odds ratios are presented for a 1-SD increment in the risk factor. All of the risk factors were statistically significant predictors of CVD, except for ethnicity and family history of diabetes in a first-degree relative in both sexes combined and in men and women separately, HDL cholesterol and current cigarette smoking in men, and fasting insulin in women. The univariate odds ratios for the individual end points (CVD death and self-reported revascularization procedures, myocardial infarction, and stroke) were similar to those shown in Table 1, with a tendency for lipids to be stronger risk factors for myocardial infarction and for glucose and blood pressure to be stronger for stroke (data not shown).
The multivariate odds ratios and 95% CIs for the risk factors that entered the stepwise logistic regression models for both sexes combined and for men and women separately are shown in Table 2. Age and ethnicity were forced into all models, and sex was forced into the model with both sexes combined. Models are shown both with and without 2-h glucose forced into the model. Although both fasting and 2-h glucose were significant predictors of CVD in univariate analyses (Table 1), neither entered the stepwise multivariate models and the odds ratios for 2-h glucose were not statistically significant when this variable was forced into the multivariate models (Table 2).
ROC curves for both sexes combined are shown in Fig. 1. The curves display the performance of each of the following models as predictors of CVD: fasting glucose concentration; 2-h glucose concentration; and multivariate models containing the variables selected by the stepwise regression procedure with or without 2-h glucose concentration included. The performance of the 2-h glucose curve seemed to be slightly better than the fasting glucose curve, but both showed relatively weak predicting power. A multiple logistic regression model incorporating both fasting and 2-h glucose levels, but no other risk factors, was not superior to the 2-h glucose concentration alone at predicting CVD (data not shown). By contrast, both multivariate curves substantially outperformed the two glucose curves. There was no evidence that the addition of 2-h glucose to the multivariate model improved its ability to predict CVD.
The areas under the ROC curves for both sexes combined and for men and women separately are shown in Table 3. The results of statistical tests comparing these areas are presented in the lower half of the table, and the parameter estimates for the multivariate models are presented in the footnotes. For only 1 of the 12 models shown in Table 3 was the fit rejected by the Hosmer-Lemeshow test, specifically the 2-h glucose-only model for men (P = 0.033). The data in Table 3 confirm the impression obtained from Fig. 1. The areas under the 2-h glucose curves (59.9–64.7%) were modestly higher than the areas under the fasting glucose curves (57.4–59.4%), but these differences were not statistically significant. When quintiles of glucose concentration were used in the analyses, the areas under the ROC curves were slightly higher than when glucose was analyzed as a continuous variable (60.5 vs. 59.4% for fasting glucose and 63.6 vs. 62.4% for 2-h glucose). The areas under the multivariate curves (78.0–83.1%) were all substantially greater than the areas under the glucose-only curves, and in particular, the superiority of the multivariate curves that did not include 2-h glucose over the 2-h glucose-only curves was highly statistically significant, both with the sexes combined and for each sex separately (P ≤ 0.0003). Adding 2-h glucose to the multivariate curves improved their performance minimally, but the improvement was not statistically significant.
CONCLUSIONS
The results of this study indicate that although fasting and 2-h glucose values are both statistically significant predictors of CVD, better detection of individuals at high risk for CVD can be achieved using a multivariate model that incorporates readily available clinical variables, specifically fasting lipids, blood pressure, BMI, smoking history, and family history of CVD. The addition of OGTT results to this panel of risk factors does not lead to a significant improvement in the ability to predict CVD. It should also be noted that the risk factors that enter into our multivariate predicting model strongly overlap with those the National Cholesterol Education Program recommends to screen the entire U.S. population aged >25 years for cardiovascular risk (12).
Our multivariate results differ from those of several other studies (13,14), including the DECODE Study (3), in which data from a number of prospective European studies were pooled. In these other studies, unlike in our study, 2-h glucose made a statistically significant contribution to predicting cardiovascular mortality, even after accounting for conventional cardiovascular risk factors. These differences could relate to the fact that CVD mortality was the only end point considered in these other studies, whereas we also considered nonfatal outcomes. Also, these other studies included older subjects than were included in our study. It should also be noted that in none of the other studies were the data analyzed using ROC curves. It is possible for a risk factor to be an independently significant predictor of an outcome even though it makes only a minor and statistically nonsignificant contribution to the area under the ROC curve. Before concluding that our results are truly at odds with these other studies and particularly before making policy decisions about the potential utility of OGTTs in detecting individuals at high risk of CVD, it would be helpful if the DECODE data and data from the other two studies were analyzed using ROC curves.
Although the 2-h glucose value may add little to identifying individuals at risk for either diabetes (1) or CVD, one could still argue that OGTTs are needed to diagnose prevalent cases of diabetes in those whose only manifestation of the disease is an abnormal 2-h value. Although the American Diabetes Association took the position that identifying such cases was not mandatory (15), the World Health Organization did not concur with this viewpoint (16). Although a full discussion of this matter is beyond the scope of the present study, it should be noted that there has never been a clinical trial that has focused specifically on the benefits of treating such diabetic patients as opposed to waiting until fasting hyperglycemia is manifest. Moreover, patients in whom diabetes was diagnosed exclusively on the basis of an abnormal 2-h glucose value have a high rate of reversion to normal on follow-up and may, in fact, represent false-positive diagnoses. We have previously reported, for example, that such cases were almost five times more likely to revert to nondiabetic status after 7–8 years of follow-up than those meeting conventional fasting or clinical diagnostic criteria (17).
Using the screening criteria recommended by the American Diabetes Association (summarized in Table 6 in reference 15) and projections from the U.S. Bureau of the Census for the year 2000 (18), 112,267,000 individuals aged 25–64 years, i.e., in the working age range, would be eligible for screening for IGT (calculation available from the authors upon request). (Although the American Diabetes Association does not recommend the 2-h OGTT as a screening test for undiagnosed diabetes (15), identification of subjects with IGT as a risk factor for CVD would require this test.) Such a screening effort would entail 224,534,000 man-hours. Valuing each individual’s time at the U.S. average wage of $13.70 results in an estimate of $3.08 billion for the indirect cost of such a screening effort. Valuing the time of individuals aged >65 years yields even higher cost estimates. Moreover, the marginal benefit of widespread glucose tolerance screening would accrue, not to all who screened positive, but only to those not detected by a competing screening strategy that does not require a 2-h test. Although the preceding obviously does not constitute a formal cost-benefit analysis and, in fact, is limited to a consideration of the indirect cost of an individual’s time, it does at least highlight the need for those who advocate widespread oral glucose tolerance testing to quantitate the benefits of such tests.
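The indirect cost estimate can be reproduced with the short calculation below. The figure of 2 hours per person is our assumption; it is implied by the ratio of the man-hours to the number of eligible individuals and matches the duration of a 2-h OGTT. The variable names are ours.

```python
eligible_individuals = 112_267_000   # aged 25-64 years, from the text
hours_per_test = 2                   # assumption: one 2-h OGTT per person
average_hourly_wage = 13.70          # U.S. dollars, from the text

man_hours = eligible_individuals * hours_per_test
indirect_cost = man_hours * average_hourly_wage

print(man_hours)                          # 224534000
print(round(indirect_cost / 1e9, 2))      # 3.08 (billions of dollars)
```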
Based on the results of this paper and our previous publication (1), it would seem that, among young and middle-aged adults, OGTTs add little if anything to the detection of individuals at high risk for either diabetes or CVD. Moreover, the test is inconvenient, not often performed in clinical practice (19), and would be costly to implement in a nationwide screening program. Before making a final judgment, however, it is necessary that our results be replicated in other populations, and we would advocate that the results in those populations be analyzed using ROC curves.
This work was supported by National Heart, Lung, and Blood Institute Grants R01-HL-24799 and R01-HL-36820.
Address correspondence and reprint requests to Michael P. Stern, MD, Division of Clinical Epidemiology, Department of Medicine, University of Texas Health Science Center at San Antonio, 7703 Floyd Curl Dr., San Antonio, TX 78229-3900. E-mail: firstname.lastname@example.org.
Received for publication 22 January 2002 and accepted in revised form 29 June 2002.
A table elsewhere in this issue shows conventional and Système International (SI) units and conversion factors for many substances.
See Point-Counterpoint, p. 1879.