Using a nationally representative sample of the civilian noninstitutionalized U.S. population, we estimated trends in diabetes prevalence across cohorts born 1910–1989 and provided the first estimates of age-specific diabetes incidence using nationally representative, measured data.
Data were from 40,130 nonpregnant individuals aged 20–79 years who participated in the third National Health and Nutrition Examination Survey (NHANES III), 1988–1994, and the continuous 1999–2010 NHANES. We defined diabetes as HbA1c ≥6.5% (48 mmol/mol) or taking diabetes medication. We estimated age-specific diabetes prevalence for the 5-year age-groups 20–24 through 75–79 for cohorts born 1910–1919 through 1980–1989 and calendar periods 1988–1994, 1999–2002, 2003–2006, and 2007–2010. We modeled diabetes prevalence as a function of age, calendar year, and birth cohort, and used our cohort model to estimate age-specific diabetes incidence.
Age-adjusted diabetes prevalence rose by a factor of 4.9 between the birth cohorts of 1910–1919 and 1980–1989. Diabetes prevalence rose with age within each birth cohort. Models based on birth cohorts show a steeper age pattern of diabetes prevalence than those based on calendar years. Diabetes incidence peaks at 55–64 years of age.
Diabetes prevalence has risen across cohorts born through the 20th century. Changes across birth cohorts explain the majority of observed increases in prevalence over time. Incidence peaks between 55 and 64 years of age and then declines at older ages.
Diabetes is a leading cause of death in the U.S. (1). A recent meta-analysis estimates that people with diabetes have a 50–80% increased risk of disability, including impaired mobility, activities of daily living, and instrumental activities of daily living, compared with people without diabetes (2). The prevalence of diabetes among adults is ∼12%, corresponding to ∼26.1 million adults with diabetes in 2005–2010 (3).
The incidence and prevalence of type 2 diabetes, which accounts for >90% of diabetes cases (4), are clearly related to factors in an individual’s past. In particular, individuals’ own histories of obesity and smoking (5,6) have been shown to affect the risk of developing diabetes. Of these risk factors, the relationship between obesity history and diabetes incidence has been studied more extensively. One study found a steep gradient in the lifetime risk of diabetes based on BMI (measured in kilograms per square meter) at 18 years of age. Males in the optimal BMI range of 18.5–25 kg/m² at 18 years of age had a 19.8% lifetime risk of diabetes, whereas males with a BMI in the obese range of 30–35 kg/m² at 18 years of age had a 57.0% lifetime risk of diabetes (7). A European cohort study found that the earlier in life that subjects gained weight, the more likely they were to develop diabetes (8). Among subjects in the Framingham Heart Study, each additional 2 years of obesity was associated with ∼12% higher odds of developing diabetes (9). In the National Longitudinal Study of Adolescent Health, persistent obesity was associated with twice the risk of diabetes compared with adult-onset obesity (10). In the Coronary Artery Risk Development in Young Adults (CARDIA) study, each additional year a person was obese increased their odds of developing diabetes by 4% (11). These and other studies indicate that obesity over the life course is an important predictor of diabetes incidence.
In this article, we investigate the rise in diabetes in the U.S. through the lens of birth cohorts. Previous studies examining changes in diabetes prevalence over time have compared one calendar-year period to another (3,12). However, like other chronic diseases, type 2 diabetes is the result of cumulative processes that develop over a lifetime. A full understanding of the prevalence of diabetes at a moment in time requires reference to the past, a past that is embodied in the birth cohorts alive during that period. Because histories in a birth cohort are persistent—characteristics of a birth cohort established at 25 years of age remain the age 25 characteristics of that cohort as it ages—we expect to find “cohort effects” that differentiate one birth cohort from another as they age.
Birth cohorts not only embody a history of exposures, they are also the appropriate vehicle for calculating disease incidence. We take advantage of this opportunity to present new estimates of the age pattern of diabetes incidence in the U.S. These are the first estimates of incidence that use measured data in a nationally representative sample. Previous national estimates of diabetes incidence used retrospective reports of individuals rather than biological indicators and provided little age detail (13,14).
Research Design and Methods
Population and Data Collection
In order to investigate the dynamics of diabetes in the U.S., we used data from the National Health and Nutrition Examination Survey (NHANES). We used data from NHANES III, conducted in two phases, 1988–1991 and 1991–1994, and from the Continuous NHANES that began in 1999, for which data are released in 2-year cycles. We pooled adjacent data release cycles of Continuous NHANES to obtain three observation periods from Continuous NHANES: 1999–2002, 2003–2006, and 2007–2010. NHANES is a complex, multistage probability sample of the U.S. civilian noninstitutionalized population. Participants complete a home interview and are then examined in a mobile examination center, which includes sampling participant blood for laboratory tests. Participants are randomized into morning or afternoon examinations, and the morning examinees are asked to fast for at least 9 h prior to the examination. Whenever possible, NHANES uses consistent laboratory procedures over time to facilitate the analysis of trends in population health. The National Center for Health Statistics (NCHS) provides extensive documentation of NHANES survey, examination, and laboratory procedures on its Web site (15). The characteristics of the NHANES study sample are reported elsewhere (3,12).
There were 88,224 individuals examined during our study periods. We excluded individuals <20 years of age (n = 40,899), >80 years of age (n = 3,558), or who were pregnant (n = 1,510). We also excluded individuals who were exactly 20 years of age when surveyed in 2010 (n = 105) because these individuals would not comprise a complete birth cohort, as described below. We also excluded subjects with missing HbA1c values (n = 2,022). The final analytic sample for HbA1c-based measures consisted of 40,130 observations, with 7,011 observations from phase 1 of NHANES III, 7,427 from phase 2 of NHANES III, 7,778 from NHANES 1999–2002, 7,755 from NHANES 2003–2006, and 10,159 from NHANES 2007–2010.
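The sample flow described above can be verified with simple arithmetic. The following sketch uses only the counts reported in the text; the dictionary keys are illustrative labels, not NHANES variable names.

```python
# Counts as reported in the text; key names are illustrative labels only.
examined = 88_224
exclusions = {
    "younger_than_20": 40_899,
    "older_than_80": 3_558,
    "pregnant": 1_510,
    "aged_20_when_surveyed_in_2010": 105,
    "missing_hba1c": 2_022,
}
analytic_sample = examined - sum(exclusions.values())

# Observations contributed by each survey phase or pooled period.
by_period = {
    "NHANES III phase 1": 7_011,
    "NHANES III phase 2": 7_427,
    "NHANES 1999-2002": 7_778,
    "NHANES 2003-2006": 7_755,
    "NHANES 2007-2010": 10_159,
}

print(analytic_sample, sum(by_period.values()))
```

Both totals agree with the reported analytic sample of 40,130 observations.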
Definition of Diabetes
We relied on laboratory results, rather than self-reported diagnoses, because the latter fail to capture the considerable number of individuals in the U.S. population with undiagnosed diabetes. A 2010 study estimated that 3.9 million individuals >20 years of age had undiagnosed diabetes, representing 19% of the diabetic population (16). Furthermore, intertemporal comparisons based on self-reported diagnosis are complicated by the fact that the criteria for diagnosing diabetes in the clinical setting have changed (17,18).
Our primary definition of diabetes is based on HbA1c, which was first measured in NHANES III. This measure reflects average glycemia over a prolonged period and thus has more intrasubject stability than the leading alternative, a measure of fasting plasma glucose (FPG) (19). Furthermore, HbA1c-based measures of diabetes are more strongly associated with cardiovascular disease and death than are FPG-based measures (20). Finally, only 54% as many observations of diabetes status are available in NHANES using FPG as using HbA1c.
Several changes in laboratory measurement of HbA1c occurred over the course of Continuous NHANES (detailed elsewhere), but we followed the NCHS recommendation and the methods of recent studies and used HbA1c data without any corrections or adjustments (3,12). Individuals were considered diabetic if they had HbA1c ≥6.5% (48 mmol/mol) (4). Because diabetes medication is expected to reduce glycemia, the HbA1c values of medicated individuals might not capture their diabetes status correctly; therefore, all individuals who reported taking diabetes medication were considered diabetic. In our sample, there were 4,678 individuals who met our definition of having diabetes. There were 896 individuals, or 19.2% of the group with diabetes, who reported taking diabetes medication and who had HbA1c <6.5%.
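The case definition above reduces to a two-condition rule. The sketch below is a minimal illustration; the function and argument names are hypothetical and do not correspond to NHANES variable names.

```python
HBA1C_THRESHOLD_PERCENT = 6.5  # equivalent to 48 mmol/mol, as stated in the text


def has_diabetes(hba1c_percent: float, takes_diabetes_medication: bool) -> bool:
    """Return True if HbA1c is at or above the threshold or medication is reported."""
    return takes_diabetes_medication or hba1c_percent >= HBA1C_THRESHOLD_PERCENT


# Share of cases classified by medication alone, from the counts in the text.
medicated_below_threshold = 896
total_with_diabetes = 4_678
share_percent = 100.0 * medicated_below_threshold / total_with_diabetes
print(round(share_percent, 1))
```

The computed share reproduces the 19.2% reported above.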
Birth cohorts must be constructed from repeated cross-sections because NHANES does not repeatedly sample the same individuals over time. We calculated each individual’s birth year using the following equation: birth cohort = period − age. For the purpose of calculating birth cohorts, period is defined as the midpoint of the NHANES wave or phase: 21 April 1990 for phase 1 of NHANES III, 23 April 1992 for phase 2 of NHANES III, and January 1 of the 2nd year of each data release cycle of Continuous NHANES. In a recent study of cohort obesity patterns that used NHANES data and the same procedure for calculating birth years, results were robust to alternative specifications of period (21). Age is the age of the individual, in completed years, at the time of the survey. To ensure large enough age/cohort cells, we analyzed cohorts born in 10-year-wide intervals (1910–1919, 1920–1929, etc.). Using this approach, we obtained a total of eight 10-year birth cohorts between 1910–1919 and 1980–1989. This method involves assuming that upon reaching 20 years of age, diabetes prevalence is not affected by migration. We tested the sensitivity of our results to this assumption by excluding foreign-born individuals from the sample.
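As an illustration of the rule birth cohort = period − age, the sketch below assigns a respondent to a 10-year birth cohort. The function name is hypothetical, and the decimal-year value used for 21 April 1990 (1990.3) is an approximation introduced here, not a figure from the study.

```python
import math


def birth_cohort_decade(period: float, age_completed_years: int) -> tuple:
    """Return (first_year, last_year) of the 10-year birth cohort."""
    birth_year = period - age_completed_years
    start = int(math.floor(birth_year / 10.0)) * 10
    return (start, start + 9)


# Continuous NHANES 1999-2000 cycle: period is 1 January 2000, i.e. 2000.0.
print(birth_cohort_decade(2000.0, 45))
# NHANES III phase 1: midpoint 21 April 1990, approximated here as 1990.3.
print(birth_cohort_decade(1990.3, 79))
# A 20-year-old in the 2009-2010 cycle (period 2010.0) falls outside 1980-1989.
print(birth_cohort_decade(2010.0, 20))
```

Under this sketch, a 20-year-old assigned to period 2010.0 has a computed birth year of 1990, which appears consistent with the exclusion of individuals who were exactly 20 years of age when surveyed in 2010.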
Prevalence was calculated as the proportion of individuals in the given age-period or age-cohort cell with diabetes as defined above. Calculations were adjusted for complex survey design using strata and primary sampling units provided by the NCHS, along with survey weights. For HbA1c, we used the final examination weight provided by NCHS; because we pooled adjacent data release cycles of Continuous NHANES, we divided the examination weights in Continuous NHANES by two, as recommended by NCHS (22).
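The point estimate of prevalence in a cell is a weighted proportion. The sketch below illustrates this with synthetic records and the halving of examination weights for pooled 2-year cycles; it does not reproduce the design-based variance estimation (strata and primary sampling units), and all names and values are hypothetical.

```python
def pooled_weight(exam_weight: float, cycles_pooled: int = 2) -> float:
    """Rescale a 2-year examination weight when adjacent cycles are pooled."""
    return exam_weight / cycles_pooled


def weighted_prevalence(records) -> float:
    """records: iterable of (weight, has_diabetes) pairs for one cell."""
    records = list(records)
    total_weight = sum(weight for weight, _ in records)
    case_weight = sum(weight for weight, case in records if case)
    return case_weight / total_weight


# Synthetic cell of three respondents (weights are made up for illustration).
cell = [
    (pooled_weight(2000.0), True),
    (pooled_weight(6000.0), False),
    (pooled_weight(2000.0), False),
]
prevalence = weighted_prevalence(cell)
print(prevalence)
```

Within a single pooled period, dividing every weight by the same constant leaves the proportion unchanged; the rescaling matters when weighted totals are computed or when periods with different numbers of cycles are combined.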
We then used ordinary least squares regression to model the age, cohort, and period patterns of diabetes prevalence in the U.S. population. We regressed the log of the prevalence estimate on a series of age and cohort or age and period indicators, with each prevalence estimate weighted by the number of observations that gave rise to it. Then, in an age/period/cohort model, we regressed the log of the prevalence estimate on age and period indicators, plus a continuous variable equal to the prevalence of obesity at 25 years of age in the corresponding birth cohort. We used 25 years of age because NHANES inquired about weight at that specific age. Obesity at 25 years of age serves as a measure of a cohort’s history of obesity. The use of a continuous variable to represent birth cohort influences avoids the identification problem that any two of age, cohort, and period indicators can be linearly combined to produce the third (23).
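A weighted regression of log prevalence on age and cohort indicators can be sketched as follows. This is a minimal illustration on synthetic, noise-free cells, not the Stata analysis used in the study; the solver, the cell counts, and the "true" effects are all assumptions made for the example.

```python
import math


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def weighted_least_squares(rows, y, weights):
    """Minimise sum_i w_i * (y_i - rows_i . beta)^2 via the normal equations."""
    p = len(rows[0])
    xtwx = [[sum(w * r[j] * r[k] for r, w in zip(rows, weights))
             for k in range(p)] for j in range(p)]
    xtwy = [sum(w * r[j] * yi for r, yi, w in zip(rows, y, weights))
            for j in range(p)]
    return solve_linear_system(xtwx, xtwy)


# Synthetic cells: (age-group index, cohort index, number of observations).
cells = [(0, 0, 500), (1, 0, 800), (2, 0, 650),
         (0, 1, 900), (1, 1, 700), (2, 1, 400)]
TRUE_INTERCEPT = math.log(0.02)   # hypothetical baseline prevalence of 2%
TRUE_AGE = [0.0, 0.8, 1.4]        # hypothetical age effects (log scale)
TRUE_COHORT = [0.0, 0.5]          # hypothetical cohort effects (log scale)

rows, log_prev, n_obs = [], [], []
for age, cohort, n in cells:
    # Indicator coding with the first age group and first cohort as reference.
    rows.append([1.0, float(age == 1), float(age == 2), float(cohort == 1)])
    log_prev.append(TRUE_INTERCEPT + TRUE_AGE[age] + TRUE_COHORT[cohort])
    n_obs.append(float(n))

intercept, age_1, age_2, cohort_1 = weighted_least_squares(rows, log_prev, n_obs)
cohort_prevalence_ratio = math.exp(cohort_1)
print(age_1, age_2, cohort_1, cohort_prevalence_ratio)
```

Because the outcome is on the log scale, exponentiating a cohort coefficient gives the prevalence ratio of that cohort relative to the reference cohort at any age.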
Birth-cohort obesity prevalence was estimated using the age 25 years weight and height recall data in Continuous NHANES waves 1999–2008. Height recall was only asked of participants 50 years of age and over; for younger individuals, we used self-reported current height. We identified birth cohorts by subtracting age from survey year, using the beginning of the 2nd year of each of the waves (e.g., 2000.0 for 1999–2000) and aggregated them into 5-year-wide intervals. The earliest and most recent birth cohorts for whom cohort obesity was calculated were the 1920–1924 and 1975–1979 birth cohorts, respectively. Thus, the age/period/cohort model excludes prevalence estimates that drew exclusively from the oldest or youngest birth cohorts (born 1910–1919 and 1980–1989). Supplementary Appendix 1 shows a table of the obesity prevalence values used in this study.
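Obesity status at 25 years of age follows from recalled weight and height through the standard BMI formula. The sketch below assumes inputs already converted to kilograms and meters and uses the conventional obesity cutoff of BMI ≥30 kg/m²; the function names and example values are hypothetical.

```python
OBESITY_BMI_THRESHOLD = 30.0  # conventional cutoff in kg/m^2


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index in kilograms per square meter."""
    return weight_kg / (height_m ** 2)


def was_obese_at_25(recalled_weight_kg: float, height_m: float) -> bool:
    """Classify obesity at age 25 from recalled weight and (recalled or current) height."""
    return bmi(recalled_weight_kg, height_m) >= OBESITY_BMI_THRESHOLD


print(round(bmi(90.0, 1.75), 1), was_obese_at_25(90.0, 1.75))
print(round(bmi(95.0, 1.75), 1), was_obese_at_25(95.0, 1.75))
```

Cohort-level obesity prevalence would then be the survey-weighted share of respondents in each 5-year birth cohort for whom this indicator is true.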
The examination of diabetes prevalence within birth cohorts allowed us to estimate the age-specific incidence of diabetes. In essence, this estimate was made by dividing the prevalence of nondiabetes in a birth cohort at one age interval (e.g., 50–54 years) by the prevalence of nondiabetes in the same birth cohort in the adjacent, younger age interval (e.g., 45–49 years) and adjusting for the fact that people without diabetes die at lower rates than the general population. The prevalence estimates used in this calculation were based upon the age coefficients estimated from the age/cohort model, presented in Fig. 3B. These summarized the age pattern of prevalence revealed within eight birth cohorts, adjusting for cohort-specific effects. Life tables for individuals without diabetes and for the general population were estimated using pooled data from NHANES III and Continuous NHANES (1999–2004 waves) cohorts linked to deaths in the National Death Index through 2006 (24). A discrete hazards model on a person-month file was used to generate the underlying risks for predicting mortality rates. The model was implemented on baseline ages 20–74 years. There were 2,903 deaths among 25,971 respondents.
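The logic of this calculation can be sketched in simplified form. The code below is not the formula of Supplementary Appendix 2; it is an illustrative approximation that assumes no remission, no migration, and 5-year survival probabilities supplied for the general population and for people without diabetes. All names and numerical inputs are hypothetical.

```python
def five_year_incidence(prev_younger: float, prev_older: float,
                        survival_all: float, survival_nondiabetic: float) -> float:
    """Probability that a diabetes-free person develops diabetes over the interval.

    Uses the identity
        (1 - P_older) * S_all = (1 - P_younger) * S_nondiabetic * (1 - incidence),
    where P is prevalence and S is the probability of surviving the interval.
    """
    nondiabetic_ratio = (1.0 - prev_older) / (1.0 - prev_younger)
    survival_adjustment = survival_all / survival_nondiabetic
    return 1.0 - nondiabetic_ratio * survival_adjustment


def annualized(probability: float, years: int = 5) -> float:
    """Convert an interval probability to a constant annual probability."""
    return 1.0 - (1.0 - probability) ** (1.0 / years)


# Hypothetical inputs: prevalence rises from 10.0% to 14.5% within a cohort.
unadjusted = five_year_incidence(0.10, 0.145, 1.0, 1.0)
adjusted = five_year_incidence(0.10, 0.145, 0.95, 0.96)
print(unadjusted, adjusted, annualized(adjusted))
```

Because people without diabetes survive at higher rates than the general population, ignoring the mortality adjustment understates incidence in this sketch: the adjusted estimate exceeds the unadjusted one.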
Derivation of the formula for estimating incidence is shown in Supplementary Appendix 2. In deriving the formula, we assumed that the prevalence of diabetes is not affected by migration beyond 20 years of age. Furthermore, we assumed that, once one becomes diabetic, diabetes is never cured. To smooth the incidence series, we used a three-term moving average. The use of a moving average to infer incidence was appropriate because of the likelihood of offsetting errors in adjacent age intervals (see Supplementary Appendix 2).
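A three-term moving average can be sketched as follows. The treatment of the first and last values (left unsmoothed here) is an assumption made for illustration; the text does not specify how endpoints were handled.

```python
def three_term_moving_average(values):
    """Replace each interior value with the mean of itself and its two neighbors."""
    smoothed = list(values)
    for i in range(1, len(values) - 1):
        smoothed[i] = (values[i - 1] + values[i] + values[i + 1]) / 3.0
    return smoothed


# Synthetic series with offsetting errors in the two middle intervals.
raw = [1.0, 2.0, 6.0, 4.0]
print(three_term_moving_average(raw))
```

When an overestimate in one age interval is offset by an underestimate in the adjacent interval, averaging across neighbors cancels part of the error, which is the rationale given above for smoothing.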
All statistical analysis was performed using Stata version 11 (StataCorp, College Station, TX). Standard errors were estimated using Taylor series linearization.
Prevalence Estimates and Modeled Age and Cohort Patterns
Figure 1A plots estimates of age-specific diabetes prevalence during the four observation periods under study. The underlying values and their standard errors are reported in Supplementary Tables 3A and B. As reported elsewhere (3), there is a general upward trend in prevalence at each age.
Figure 1A shows a pattern in which the prevalence of diabetes declines at some set of ages >60–64 years in each of the four periods. Such a decline could be produced by higher mortality rates among those with diabetes than among those without. However, we show below that this pattern of decline with age is not present when prevalence rates are arrayed by birth cohort. In other words, the declines in prevalence with age in Fig. 1A result from the increasing prevalence of diabetes among later-born cohorts.
Figure 1B presents estimates of diabetes prevalence among birth cohorts. It is clear that prevalence is rising from one birth cohort to the next, even at younger ages where prevalence is low. Furthermore, prevalence continues to rise even at the oldest ages, which is consistent with a continued positive incidence of diabetes as cohorts age. Declining prevalence with age, a pattern suggested by period data, is not observed among real birth cohorts as they age.
The age pattern of diabetes, as well as changes in diabetes prevalence from birth cohort to birth cohort, is summarized by our statistical model. Figure 2 plots the coefficients for each birth cohort in the age/cohort regression model. That the coefficients are monotonically increasing shows that more recent birth cohorts have higher diabetes prevalence than older cohorts. The increase is exceptionally rapid among cohorts born after 1950–1959. The implication of the cohort coefficients is that the prevalence of diabetes at any age for the cohort born in 1980–1989 will be nearly triple that of the cohort born in 1950–1959 and 4.9 times that of the cohort born in 1910–1919 (derived from Supplementary Table 4A).
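Because the model is fitted to the log of prevalence, a ratio of prevalences between two cohorts corresponds to the exponentiated difference of their coefficients. The sketch below shows the conversion; the function name is hypothetical, and the coefficient gap is backed out from the reported factor of 4.9 rather than taken from Supplementary Table 4A.

```python
import math


def prevalence_ratio(coef_cohort: float, coef_reference: float) -> float:
    """Prevalence ratio implied by two coefficients from a log-prevalence model."""
    return math.exp(coef_cohort - coef_reference)


# Log-scale gap implied by the reported 4.9-fold difference between the
# 1980-1989 and 1910-1919 cohorts.
implied_log_gap = math.log(4.9)
print(round(implied_log_gap, 2))
print(prevalence_ratio(implied_log_gap, 0.0))
```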
Just as the age/cohort model produces rapidly increasing cohort effects, the age/period model produces rapidly rising period effects. This nearly straight-line increase in prevalence across periods is shown in Fig. 3A (see Supplementary Table 4B for actual values). Taken by themselves, Fig. 2 and Fig. 3A give no indication of which model is preferred: both models produce R² values >0.94. But when we add a cohort variable, the prevalence of obesity at 25 years of age, to the age/period model, the period effects nearly disappear, as shown in Fig. 3A (Supplementary Table 4C). They also become statistically insignificant.
Figure 3B compares the age patterns of diabetes prevalence that are produced by the age/cohort model, the age/period model, and the age/period/cohort model. By far, the most level age pattern is produced by the age/period model. As argued earlier, that age pattern is misleading because it fails to account for the rise in diabetes prevalence from one birth cohort to the next. As was suggested by a comparison of Fig. 1A and B, the age pattern of diabetes prevalence in a birth cohort is steeper than that in a period. The age pattern in the age/period model becomes much steeper when birth-cohort obesity is introduced, as shown in Fig. 3B. The age pattern identified in the age/period/cohort model is very similar to that in the age/cohort model.
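The reason a period cross-section understates the age gradient can be illustrated with a toy log-linear model. The slopes below are hypothetical values chosen only to show the mechanism; they are not estimates from this study.

```python
import math

AGE_SLOPE = 0.5     # hypothetical rise in log prevalence per 10 years of age
COHORT_SLOPE = 0.3  # hypothetical rise in log prevalence per 10 years of birth year


def log_prevalence(age: float, birth_year: float) -> float:
    """Toy age/cohort model with additive effects on the log scale."""
    return (math.log(0.01)
            + AGE_SLOPE * (age - 20) / 10.0
            + COHORT_SLOPE * (birth_year - 1910) / 10.0)


# Within one cohort (born 1940): compare age 60 with age 50.
cohort_ratio = math.exp(log_prevalence(60, 1940) - log_prevalence(50, 1940))

# Within one period (year 2000): the 60-year-olds were born in 1940,
# the 50-year-olds in 1950, so the older group belongs to an earlier cohort.
period_ratio = math.exp(log_prevalence(60, 1940) - log_prevalence(50, 1950))

print(cohort_ratio, period_ratio)
```

In this toy setting the cross-sectional ratio reflects the age slope minus the cohort slope, so it is flatter than the within-cohort ratio; it would turn negative on the log scale if the cohort slope exceeded the age slope.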
Based on the formula presented in Supplementary Appendix 2, Fig. 4 shows the age pattern of diabetes incidence that is implied by the age pattern of prevalence that we have uncovered. The values on the graph apply to the cohort born in 1950–1959, but the shape of the curve is nearly identical for all birth cohorts. The age pattern of incidence rises to a peak in the age interval 55–64 years (centered at age 60 years) and then declines slowly. At its peak from ages 55 to 64 years, for the cohort born in 1950–1959, ∼1.1% of the diabetes-free population will develop diabetes each year. Supplementary Appendix 5 presents numerical details of our incidence estimates.
To examine the sensitivity of results to the choice of the HbA1c threshold, we adopted a threshold of HbA1c levels ≥6.0%. Recent guidelines from the American Diabetes Association consider individuals at this level to be at “very high risk” of incident diabetes (4). See Supplementary Appendix 6 for a discussion of this choice of threshold. Using this lower threshold, we estimated the prevalence of being “at least high risk” of diabetes over time and across birth cohorts, as shown in Supplementary Fig. 6A and B. In our sample, 7,370 individuals met the more inclusive criterion. A comparison of Fig. 1B to Supplementary Fig. 6B shows that the increase across birth cohorts in age-specific prevalence of “at least high risk” is even more striking than that using the higher cutoff. In particular, the higher prevalence observed in more recent birth cohorts appears at earlier ages in “at least high risk” than it does in diabetes itself.
We also estimated age/period, age/cohort, and age/period/cohort models of “at least high risk” prevalence. The patterns described above were largely replicated using the lower cutoff. Consistent with the higher level of prevalence, the rise in prevalence across ages and birth cohorts is greater when HbA1c ≥6.0% is used. However, the introduction of obesity at 25 years of age into the age/period model has much the same effect as when HbA1c ≥6.5% is used; it steepens the age effects and reduces the period effects, although a significant period effect remains in the most recent period (see Supplementary Fig. 6C–E and Fig. 3B). Once again, this result places the spotlight on birth cohort influences in the rise of diabetes in the U.S. Supplementary Tables 6A–C present numerical details of the results of our modeling of the prevalence of HbA1c ≥6.0%.
Birth cohorts are an attractive vehicle for investigating changes in the prevalence of diabetes because prevalence at any age is a cumulative product of influences in the past. These influences manifest themselves over the lifetime of birth cohorts, creating close associations in the prevalence of diabetes across age within a cohort.
We show that the prevalence of diabetes in the U.S. is rapidly increasing from one birth cohort to the next. We demonstrate this increase graphically and by means of an age/cohort model. The increase is especially rapid across cohorts born after 1950–1959.
Our results also reveal that the increase with age in the prevalence of diabetes is considerably steeper within a birth cohort than it is across ages in a particular period. The increase with age during any particular period is too mild, or even negative, because it does not account for the higher levels of diabetes evident among more recent birth cohorts.
An additional suggestion of the importance of birth cohort influences on diabetes prevalence is supplied by our age/period/cohort model. Although an age/period model shows sharply increasing period effects, the addition of a term measuring birth cohort obesity at 25 years of age renders the period effects small and insignificant. This result indicates that birth cohort influences, in particular, birth cohort obesity levels, are important determinants of diabetes prevalence.
An innovation of our approach is that we convert estimates of birth cohort diabetes prevalence to estimates of incidence. Such estimates cannot be made using period data alone without the extreme assumption that no population rates are changing (25). This assumption is clearly not warranted in the case of diabetes, as shown in Fig. 1. But such calculations of incidence can be made by comparing prevalence at different ages for the same birth cohort, since any changes in prevalence within a birth cohort must be attributed to some combination of new cases (incidence), differential mortality by diabetes status, and recovery (if any). To estimate incidence, we use the age effect coefficients from the age/cohort model, which is based on observations across eight birth cohorts. We demonstrate that the incidence of diabetes among diabetes-free people rises steadily to a peak at 55–64 years of age and then declines slowly.
To the best of our knowledge, these are the first estimates of the age pattern of diabetes incidence that are based on measured data in a nationally representative sample. Other estimates of age patterns of diabetes incidence are few and inconsistent. Age patterns of diabetes incidence that peak and then decline are found in some populations (26–29). Other studies find that incidence continues to rise with age (30,31) or levels off at older ages (13,32). Annual estimates of incidence in the U.S. from the Centers for Disease Control and Prevention, which are based on retrospective self-reports, show a peak in the age interval 45–64 in some years and at age 65–79 in other years (33). Experimental evidence suggests a biological mechanism for increasing incidence with age at the individual level (34). One possible explanation for the peak and decline in diabetes incidence in a birth cohort is population heterogeneity in vulnerability to diabetes, with the most vulnerable individuals being successively selected out of the diabetes-free population as birth cohorts age.
Our study has several limitations. We assume that migration does not affect the prevalence of diabetes in birth cohorts. When we removed foreign-born respondents from the sample, however, the pattern of our results for both prevalence and incidence was essentially unchanged (results available upon request). We also assume no age/cohort interactions. We tested this assumption by including interactions between a continuous variable for age and indicators for the three birth cohorts that provided the most prevalence estimates; coefficients on these interaction terms were not statistically significant (P > 0.15 in all cases).
The small sample sizes in NHANES required us to use 10-year-wide birth cohorts and assume homogeneity within those birth cohorts. As a specification check, we divided the birth cohorts into different 10-year intervals than reported in this article (1915–1924, 1925–1934, etc.). Resulting patterns of prevalence were similar to the results presented here (results available upon request).
The NHANES data do not permit distinguishing between type 1 and type 2 diabetes. However, because type 2 diabetes accounts for ∼90–95% of all diabetes cases (4), this is not a serious limitation.
We categorized as diabetic individuals below the 6.5% HbA1c threshold who reported taking medication for diabetes. On the other hand, we did not categorize as diabetic individuals below the 6.5% threshold with self-reported diabetes because we assumed that the large majority of this group was assessed using alternative diagnostic criteria, such as FPG or oral glucose tolerance test. Prior research indicates that relative to these measures, the HbA1c test identifies as diabetic a smaller group of high-risk individuals (16). For this reason, we did not assume that individuals with self-reported diabetes were ever above the HbA1c threshold for diabetes.
Finally, our method for estimating diabetes incidence assumes that mortality differences between people with and without diabetes have been constant and that remission rates are zero. The literature on the former is unresolved (17,35,36), and assuming zero remission is standard in projection models of diabetes prevalence (37,38). Supplementary Appendix 2 provides more information on remission rates.
Two recent studies of individuals in NHANES found that secular changes in time-of-survey BMI explained some, but not all, of the secular increase in the prevalence of diabetes and prediabetes (3,12). Our findings also implicate the rise in obesity in the increase in diabetes, but we use aggregate data on birth cohorts and a historical rather than contemporary indicator of obesity. That both current and past levels of obesity affect an individual’s risk of developing diabetes has been demonstrated in prior research (9). Thus, our results are consistent with other analyses that identify increases in the prevalence of obesity as an important factor in the rise in diabetes.
The prevalence of obesity has increased dramatically across recent U.S. birth cohorts. We have shown that birth-cohort prevalence of diabetes is associated with birth-cohort levels of obesity at 25 years of age. Because birth cohort effects persist as birth cohorts age, our results suggest that diabetes prevalence is likely to continue increasing despite an apparent plateauing of obesity in recent years (39). Additional analyses should investigate the implications of the birth cohort trends identified here for future diabetes prevalence in the U.S.
Acknowledgments. The authors acknowledge helpful comments from Irma Elo and Douglas Ewbank (Population Studies Center, University of Pennsylvania) and Neil Mehta (Emory University Rollins School of Public Health, Atlanta, GA). The authors thank the anonymous reviewers for this journal for their helpful comments.
Funding. This research has been supported by grant R01-AG-040212 from the National Institute on Aging.
Duality of Interest. No potential conflicts of interest relevant to this article were reported.
Author Contributions. E.I.F. managed and analyzed the data and wrote the first draft of the manuscript. A.S. managed and analyzed the data. S.H.P. conceived of the analysis and oversaw the research. All authors edited the manuscript. E.I.F., A.S., and S.H.P. are the guarantors of this work and, as such, had full access to all the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.
Prior Presentation. This study was accepted for presentation at the Annual Meeting of the Population Association of America, Boston, MA, 1–3 May 2014.