To test if knowledge of type 2 diabetes genetic variants improves disease prediction.
We tested 40 single nucleotide polymorphisms (SNPs) associated with diabetes in 3,471 Framingham Offspring Study subjects followed over 34 years using pooled logistic regression models stratified by age (<50 years, diabetes cases = 144; or ≥50 years, diabetes cases = 302). Models included clinical risk factors and a 40-SNP weighted genetic risk score.
In people <50 years of age, the clinical risk factors model C-statistic was 0.908; the 40-SNP score increased it to 0.911 (P = 0.3; net reclassification improvement (NRI): 10.2%, P = 0.001). In people ≥50 years of age, the C-statistics without and with the score were 0.883 and 0.884 (P = 0.2; NRI: 0.4%). The risk per risk allele was higher in people <50 than ≥50 years of age (24 vs. 11%; P value for age interaction = 0.02).
Knowledge of common genetic variation appropriately reclassifies younger people for type 2 diabetes risk beyond clinical risk factors but not older people.
A genetic risk score built with 18 type 2 diabetes genetic loci predicted new diabetes cases (1), though it did not add to common diabetes clinical risk factors that usually appear during adulthood (1–3). In recent years, the number of genetic loci convincingly associated with diabetes has doubled (4–10). Here, we test two hypotheses: an updated genetic risk score incorporating a larger number of common diabetes-associated single nucleotide polymorphisms (SNPs) improves ∼8-year risk prediction of diabetes beyond common clinical diabetes risk factors; and the predictive ability is better in younger subjects in whom early preventive strategies could delay diabetes onset (11).
RESEARCH DESIGN AND METHODS
We have previously described the methods (1). We pooled data of the Framingham Offspring Study (12) into four time periods (exams 1 and 2, 2 to 4, 4 to 6, and 6 to 8) (3), extending follow-up 6 years beyond our previous report (1). We generated 11,358 person-observations for 3,471 subjects with available genetic data. We excluded prevalent diabetes at the baseline of each period. Diabetes was defined as fasting plasma glucose >7.0 mmol/l (>125 mg/dl) or use of antidiabetic therapy.
We genotyped or imputed 40 autosomal diabetes-SNPs reported in European-origin populations (4–10), thus adding 23 new SNPs and excluding INS from our previous 18-SNP analysis (1). Genotypes were obtained from Affymetrix array data available in the Framingham Heart Study SNP Health Associate Resource dataset (13) or from de novo genotyping on the iPLEX (Sequenom) platform. Minimum call rates were 97% for Affymetrix and 96.9% for iPLEX SNPs. All SNPs were in Hardy-Weinberg equilibrium. Median variance ratio for the imputed SNPs was 0.94; the variance ratio was <0.3 only for rs725210 at HNF1B (0.2).
We modeled the 40 SNPs by constructing a 40-SNP weighted genetic risk score based on the published β coefficients (8,10) (see footnote, Table 1) and alternatively by entering one term per SNP in an additive model using the expected or observed number of minor alleles plus terms for sex or clinical variables. A general nonadditive genetic model was also fit for each SNP, but inclusion of a nonadditive term did not improve the fit (P > 0.043 for all SNPs). We also performed bootstrap resampling with replacement to assess the degree of statistical overestimation.
Table 1. Odds ratios (ORs) and risk for incident type 2 diabetes associated with 40 individual SNPs, a weighted 40-SNP genetic risk score, and a weighted 17-SNP genetic risk score in the Framingham Offspring Study, stratified by age (<50 years and ≥50 years old), in the simple clinical variables–adjusted model†

| Subjects <50 years old (n = 144 diabetes cases) | Model without genetic information | Model using 40 individual SNPs | Model using 40-SNP weighted risk score | Model using prior 17-SNP weighted risk score |
|---|---|---|---|---|
| Men (vs. women) | 0.45 (0.30–0.68) | 0.43 (0.28–0.67) | 0.46 (0.30–0.70) | 0.46 (0.30–0.70) |
| Family history of diabetes vs. not | 2.26 (1.55–3.30) | 2.22 (1.49–3.29) | 2.20 (1.50–3.22) | 2.18 (1.49–3.19) |
| BMI (kg/m²) | 1.10 (1.06–1.14) | 1.11 (1.07–1.15) | 1.11 (1.07–1.15) | 1.11 (1.08–1.15) |
| Fasting plasma glucose (mg/dl) | 1.14 (1.11–1.16) | 1.13 (1.11–1.16) | 1.13 (1.11–1.16) | 1.13 (1.11–1.16) |
| Systolic blood pressure (mmHg) | 1.02 (1.01–1.03) | 1.03 (1.01–1.04) | 1.02 (1.01–1.03) | 1.02 (1.01–1.03) |
| HDL cholesterol (mg/dl) | 0.96 (0.95–0.98) | 0.96 (0.95–0.98) | 0.96 (0.95–0.98) | 0.96 (0.95–0.98) |
| Fasting triglycerides (mg/dl) | 1.00 (1.00–1.00) | 1.00 (1.00–1.00) | 1.00 (1.00–1.02) | 1.00 (1.00–1.00) |
| Genetic risk score | — | — | 1.24 (1.13–1.36) | 1.39 (1.22–1.59) |
| C-statistic (95% CI) | 0.908 (0.884–0.932) | 0.920 (0.898–0.941) | 0.911 (0.887–0.935) | 0.909 (0.884–0.933) |
| P value for difference in C-statistic | | 0.02 | 0.3 | 0.89 |
| Calibration χ² (P value) | | 4.37 (0.8) | 6.60 (0.6) | 9.78 (0.28) |
| NRI (%) | | 11.4 | 10.2 | 7.5 |
| P value for NRI | | 0.002 | 0.001 | 0.01 |


| Subjects ≥50 years old (n = 302 diabetes cases) | Model without genetic information | Model using 40 individual SNPs | Model using 40-SNP weighted risk score | Model using prior 17-SNP weighted risk score |
|---|---|---|---|---|
| Men (vs. women) | 1.03 (0.76–1.38) | 1.04 (0.76–1.41) | 1.05 (0.78–1.41) | 1.05 (0.78–1.41) |
| Family history of diabetes vs. not | 2.09 (1.54–2.85) | 2.18 (1.58–3.00) | 2.11 (1.55–2.88) | 2.12 (1.56–2.88) |
| BMI (kg/m²) | 1.08 (1.05–1.11) | 1.09 (1.06–1.12) | 1.09 (1.06–1.12) | 1.09 (1.06–1.12) |
| Fasting plasma glucose (mg/dl) | 1.14 (1.13–1.16) | 1.14 (1.12–1.16) | 1.14 (1.12–1.16) | 1.14 (1.12–1.16) |
| Systolic blood pressure (mmHg) | 1.01 (1.00–1.02) | 1.01 (1.00–1.02) | 1.01 (1.01–1.02) | 1.01 (1.01–1.02) |
| HDL cholesterol (mg/dl) | 0.98 (0.97–0.99) | 0.98 (0.97–0.99) | 0.98 (0.97–0.99) | 0.98 (0.97–0.99) |
| Fasting triglycerides (mg/dl) | 1.00 (1.00–1.00) | 1.00 (1.00–1.00) | 1.00 (1.00–1.00) | 1.00 (1.00–1.00) |
| Genetic risk score | — | — | 1.11 (1.03–1.19) | 1.13 (1.02–1.25) |
| C-statistic (95% CI) | 0.883 (0.863–0.903) | 0.888 (0.869–0.908) | 0.884 (0.865–0.904) | 0.884 (0.865–0.904) |
| P value for difference in C-statistic | | 0.02 | 0.2 | 0.18 |
| Calibration χ² (P value) | | 10.97 (0.2) | 15.01 (0.06) | 8.46 (0.39) |
| NRI (%) | | 5.7 | 0.4 | 0.02 |
| P value for NRI | | 0.001 | 0.7 | 0.98 |

Data are OR (95% CI) unless otherwise indicated. Data in bold represent statistical significance.
†The simple clinical variables–adjusted model included sex, family history of diabetes (self-report that one or both parents had diabetes), BMI, fasting glucose level, systolic blood pressure, HDL cholesterol, and fasting triglycerides levels (3). No age adjustment was done in the age-stratified models.
To evaluate the individual contribution of each SNP, we entered one term per SNP (total 40 terms plus terms for sex or clinical variables) in the logistic regression models.
We constructed a weighted genetic risk score using 40 SNPs currently associated with type 2 diabetes and a weighted genetic risk score using 17 SNPs that we used in our previous report (1). rs689 at INS on chromosome 11, previously included in our 18-SNP genetic risk score (1), was not replicated in subsequent meta-analyses and is therefore not included in the current 17-SNP or 40-SNP analyses. Moreover, rs5945326 at DUSP9 on chromosome X (10) is not included in the analysis because no genotyping or imputation data are available for this SNP in the Framingham Offspring Study.
For the construction of the weighted risk scores, we counted risk alleles (0, 1, 2) for each genotyped SNP—or its dosage when imputed—(actual distribution ranging from 28 to 53) and multiplied each SNP genotype by its published β coefficient for diabetes risk (10). We added up the product of that multiplication at each SNP, divided the sum by twice the sum of the β coefficients, and multiplied the result by the number of SNPs.
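The score construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `weighted_genetic_risk_score` and the example β values are hypothetical and are not taken from the study or from reference 10.

```python
def weighted_genetic_risk_score(allele_dosages, betas):
    """Weighted genetic risk score rescaled to the number of SNPs.

    allele_dosages: risk-allele count (0, 1, 2) per genotyped SNP, or the
                    expected dosage (a value between 0 and 2) per imputed SNP.
    betas:          published per-allele beta coefficient for each SNP,
                    in the same order as allele_dosages.
    """
    if len(allele_dosages) != len(betas):
        raise ValueError("need one beta coefficient per SNP")
    # Sum over SNPs of (beta coefficient x number of risk alleles).
    weighted_sum = sum(beta * dosage for dosage, beta in zip(allele_dosages, betas))
    # Divide by twice the sum of the betas, then multiply by the number of SNPs.
    return weighted_sum / (2 * sum(betas)) * len(betas)


# Hypothetical three-SNP example; these betas are made up for illustration.
example_betas = [0.10, 0.20, 0.30]
example_dosages = [1, 2, 0.5]  # the third SNP is imputed, so its dosage is fractional
example_score = weighted_genetic_risk_score(example_dosages, example_betas)
```

With this rescaling, a person carrying two risk alleles at every SNP scores exactly the number of SNPs, and a person carrying none scores 0, so the weighted score stays on a scale comparable to the number of SNPs.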
ORs, 95% CIs, and C-statistics for the 144 cases of diabetes in 6,763 person-observations in subjects <50 years old and for the 302 cases of diabetes in 4,595 person-observations in subjects ≥50 years old were calculated using pooled logistic regression with generalized estimating equations. Mean age at diabetes onset was 49.30 years for subjects <50 years old at baseline and 66.07 years for subjects ≥50 years old at baseline. We took 50 years as the age cutoff point because of the low incidence rate of diabetes in younger subjects when lower values were chosen. Sensitivity analyses using a cutoff age of 45 years (84 cases in 5,095 person-observations) showed a lower NRI in younger subjects (3.59%; P = 0.2), though this result should be taken with caution because of the low number of cases.
For NRI evaluation, we established three risk categories (low, intermediate, and high). The categories are based on the distribution of the cumulative incidence of diabetes across our population: cumulative incidence was low for a predicted risk <2%, intermediate for predicted risks ≥2% and ≤8%, and high when predicted risk was >8% (this categorization is an a priori requirement for the NRI calculation) (15). The NRI is higher when more people who develop diabetes are reclassified as higher risk when the genotype score is added to the model, and more people who remain free of diabetes are reclassified as lower risk. The NRI is penalized for misreclassification, for instance, when many people who develop diabetes are reclassified as lower risk after the genetic risk score is added to the model.
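As a rough illustration of the category-based NRI described in this footnote, the following Python sketch uses the three risk categories given above (<2%, 2–8%, >8%). The function names are hypothetical, and the sketch ignores the pooling of repeated person-observations and the family relatedness handled in the actual analysis.

```python
def risk_category(predicted_risk):
    """Map a predicted risk (as a proportion) to 0 = low, 1 = intermediate, 2 = high."""
    if predicted_risk < 0.02:
        return 0
    if predicted_risk <= 0.08:
        return 1
    return 2


def net_reclassification_improvement(old_risks, new_risks, events):
    """Category-based NRI comparing a new model against an old one.

    old_risks, new_risks: predicted risks from the model without and with
                          the genetic risk score, one pair per person.
    events:               True (or 1) if the person developed diabetes.
    """
    up_cases = down_cases = up_noncases = down_noncases = 0
    n_cases = n_noncases = 0
    for old, new, event in zip(old_risks, new_risks, events):
        shift = risk_category(new) - risk_category(old)
        if event:
            n_cases += 1
            up_cases += shift > 0      # appropriate: case moved to a higher category
            down_cases += shift < 0    # penalized: case moved to a lower category
        else:
            n_noncases += 1
            down_noncases += shift < 0  # appropriate: non-case moved lower
            up_noncases += shift > 0    # penalized: non-case moved higher
    if n_cases == 0 or n_noncases == 0:
        raise ValueError("need at least one case and one non-case")
    return ((up_cases - down_cases) / n_cases
            + (down_noncases - up_noncases) / n_noncases)
```

The returned value is a proportion; multiplying by 100 gives a percentage on the scale reported in Table 1. Moving a case to a lower category or a non-case to a higher category reduces the result, which is the penalty for misreclassification mentioned above.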
Data for the sex-adjusted model in age-stratified analyses are shown in supplementary Table A3. Complete data for the population overall are shown in supplementary Table A4.
Association tests were done after age-stratification (<50 and ≥50 years) and in the sample overall. We compared the mean genetic risk score for persons who did develop diabetes with those who did not using mixed-effects linear models to account for family relatedness. Likewise, we used generalized estimating equations in pooled logistic-regression models (14) to test associations of the genetic risk scores with diabetes onset in sex- and simple clinical diabetes risk factors–adjusted models, which included sex, family history of diabetes (self-report that any parent had diabetes), BMI, fasting glucose and triglyceride levels, systolic blood pressure, and HDL cholesterol (3).
We evaluated model discrimination using C-statistics and net reclassification improvement (NRI) (15) (see footnote, Table 1). A two-tailed P value <0.05 indicated statistical significance. The institutional review board at Boston University approved the study, and all participants gave written informed consent.
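For readers unfamiliar with the C-statistic used to evaluate discrimination, a minimal Python sketch follows. It computes the proportion of case/non-case pairs in which the case received the higher predicted risk, counting ties as one half. The function name is hypothetical, and this plain pairwise version does not account for the repeated person-observations or family relatedness handled in the study's models.

```python
def c_statistic(predicted_risks, events):
    """Concordance (C) statistic for a binary outcome.

    predicted_risks: model-predicted probability of diabetes per observation.
    events:          True (or 1) if diabetes developed, False (or 0) otherwise.
    """
    case_risks = [p for p, e in zip(predicted_risks, events) if e]
    noncase_risks = [p for p, e in zip(predicted_risks, events) if not e]
    if not case_risks or not noncase_risks:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for case_risk in case_risks:
        for noncase_risk in noncase_risks:
            if case_risk > noncase_risk:
                concordant += 1.0   # pair ranked correctly
            elif case_risk == noncase_risk:
                concordant += 0.5   # tie counts as half
    return concordant / (len(case_risks) * len(noncase_risks))
```

A value of 0.5 corresponds to ranking no better than chance and 1.0 to perfect separation of cases from non-cases.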
RESULTS
Mean age was 36 ± 9 years at the first exam; nearly half the subjects were men, and BMI increased over follow-up (supplementary Table A1 in the online appendix available at http://care.diabetesjournals.org/cgi/content/full/dc10-1265/DC1). Over 11,358 person-observations we diagnosed 446 cases of diabetes. Few individual SNPs were significantly associated with diabetes in our sample, but for most SNPs the effects were in the same direction as in the original reports and of expected effect sizes (1.05–1.3) (supplementary Table A2). Individuals who developed diabetes had higher genetic risk scores than those who did not (20.4 vs. 19.7; P = 1.7 × 10⁻¹⁰).
The 40-SNP genetic risk score significantly reclassified subjects <50 years of age in the simple clinical variables model (NRI: 10.2%; P = 0.001), although it did not improve model discrimination (P = 0.3) (Table 1). In subjects ≥50 years, the 40-SNP score improved neither model discrimination (P = 0.2) nor risk reclassification (NRI: 0.4%; P = 0.7). The relative risk per risk allele was higher in subjects <50 years of age (24%) than in those ≥50 years of age (11%) (P = 0.02 for age-interaction effect). Results for the sex-adjusted model are shown in supplementary Table A3.
We also tested a weighted genetic risk score constructed with the originally modeled 17 SNPs (1), whereby fewer subjects were appropriately reclassified for diabetes risk (Table 1).
In the population overall, the 40-SNP genetic risk score marginally improved risk prediction (C-statistics: 0.903 and 0.906, without and with the score; P = 0.04), whereas the 17-SNP score did not (P = 0.11) (supplementary Table A4). In the whole population, NRI with the score was lower than in subjects <50 years of age (at most, 1.8%).
The individual incorporation of 40 SNPs improved model discrimination beyond the 40-SNP score (C-statistics: 0.908 and 0.920 without and with individual SNPs; P = 0.02), but after bootstrap resampling, median C-statistic values dropped to 0.905 and 0.907, respectively, thus lowering optimism about the effect of modeling individual SNPs.
CONCLUSIONS
We found that 40 SNPs selected based on the latest genetic association data improved diabetes risk reclassification after accounting for common diabetes clinical risk predictive factors.
The 40 SNPs contributing individually had the highest discrimination ability, but this model was probably overfit. The increased prediction performance of 40 as opposed to 17 SNPs appeared to be due to additional, more comprehensively modeled genetic information rather than to longer follow-up or greater number of diabetes cases as compared to our earlier report.
Limitations include that the Framingham Offspring Study subjects are mostly white and of European ancestry. Although we did not find sufficient evidence for departure from an additive model, we cannot definitively rule out that nonadditive models are operating. We only analyzed common genetic variants; eventual incorporation of rare variants might enhance prediction. Lastly, criticism has been raised about the somewhat arbitrary assumptions needed to estimate NRI.
In summary, diabetes risk prediction improved with 40 diabetes-associated SNPs, especially in people <50 years of age. More subjects were appropriately reclassified for diabetes risk. Genetic prediction could be useful in younger people. Nonetheless, the clinical usefulness of common genetic variants for diabetes risk prediction should be further confirmed in other samples and in randomized controlled trials.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Acknowledgments
This study was supported by the National Heart, Lung, and Blood Institute's Framingham Heart Study (contract no. N01-HC-25195), the National Institute for Diabetes and Digestive and Kidney Diseases (NIDDK) grants R01 DK078616 and K24 DK080140 (to J.B.M.), NIDDK Research Career Award K23 DK65978 (to J.C.F.), NIDDK Grant R21 DK084527 (to R.W.G.), “Bolsa de Ampliación de Estudios” from the “Instituto de Salud Carlos III”, Madrid, Spain (2009/90071) (to J.M.D.M.Y.), and the Boston University Linux Cluster for Genetic Analysis (LinGA) funded by the National Institutes of Health National Center for Research Resources Shared Instrumentation Grant (1S10RR163736-01A1).
J.B.M. has a consulting agreement with Interleukin Genetics, Inc. No other potential conflicts of interest relevant to this article were reported.
J.M.D.M.Y. researched data and wrote the manuscript. P.S. researched data and contributed to discussion. M.J.P., J.D., R.B.D., and L.A.C. researched data, contributed to discussion, and reviewed the manuscript. C.S.F. and A.K.M. researched data and reviewed the manuscript. R.W.G. and J.C.F. contributed to discussion and reviewed the manuscript. J.B.M. contributed to discussion and wrote the manuscript.
Parts of this study were presented in poster form at the 70th Scientific Sessions of the American Diabetes Association, Orlando, Florida, 25–29 June 2010.