OBJECTIVE—This study compares alternative methods for attributing hospital utilization and costs to diabetes. Findings from five “numerator” methods, found in the literature and based on presence of certain diagnoses or combinations of diagnoses in the billing records, were compared to benchmark findings derived from attributable risk calculations.
RESEARCH DESIGN AND METHODS—Estimates of non-HMO, short-term, nonspecialized hospital stays, hospital days, and costs attributable to diabetes in Texas were derived from the 1995 Medicare inpatient database (MEDPAR) for persons aged at least 65 years at the end of 1994. Attributable risk calculations applied age-, sex-, and ethnicity-specific estimates of diabetes prevalence, based on the combined 1987–1994 National Health Interview Surveys, to 1995 Medicare non-HMO, Part A (hospital insurance) enrollment among the Texas elderly. Alternative prevalence estimates were based on the 1994–1996 Texas Behavioral Risk Factor Surveillance System.
RESULTS—The five numerator methods yielded cost estimates that were 10, 10, 75, 144, and 172% of the benchmark estimate.
CONCLUSIONS—This study documents great variation in diabetes cost estimates that might result from alternative methods for selecting diagnoses or combinations of diagnoses as criteria for attributing costs to diabetes. Whereas no method that ignores population prevalence yielded an accurate cost estimate, I suggest that further empirical study may be helpful in selecting those combinations of diagnoses that might, on average, reasonably estimate diabetes costs in situations where population denominators are unavailable or prevalence is unknown.
This study used a single inpatient billing database to compare findings from application of alternative methods for estimating the extent to which hospitalizations and associated costs were attributable to diabetes.
Over the past decade, U.S. national estimates of direct medical costs of diabetes have varied from $15 billion to $86 billion (1–5), with each estimate based on different methods. Because costing methods have varied, both in the definition of persons with diabetes and in methods for attributing costs to the disease, it is not clear whether differing cost estimates reflect growth in the size of the elderly population (6), increased incidence and prevalence due to factors other than aging (7), improved survival (8), increased propensity to diagnose, changes in record-keeping or in value of money over time, greater use of services or use of higher quality services, or differences in research methods (9).
A major issue in estimating costs of diabetes is how to deal with the many nonspecific complications of diabetes. Persons with diabetes have a high risk of developing chronic complications including neurological, cardiovascular, cerebrovascular, peripheral vascular, renal, and ophthalmic diseases (10). Diabetes is the most common cause of end-stage renal disease (11), and diabetes accounts for almost half of nontraumatic lower-extremity amputations (12,13). Estimates of the costs of such complications suggest that they are formidable (14).
The attribution problem stems from the fact that most diabetic complications are not specific to diabetes. When patients have both diabetes and nonspecific complications, it is not clear whether, and to what extent, the costs of treating the complications should be attributed to diabetes. Also, it is unclear whether such cases are recorded in the billing records with diabetes as principal (first-listed) diagnosis, or with diabetes among the various secondary diagnoses. Among hospitalizations of persons known to have diabetes, 40% of records did not mention diabetes among the discharge diagnoses (15,16). Failure to mention diabetes is especially a problem among the elderly and those with multiple comorbidities (17).
When population denominators are available and diabetes prevalence is known, cost estimates can be derived from billing records using calculations for attributable risk among the exposed (1,4,18). Although details have varied, the basic method considers the difference in per capita costs between diabetic and nondiabetic populations. When clear population denominators are not available or prevalence is unknown, researchers are obliged to select medical records for attribution to diabetes using one of a handful of routines for sorting records on the basis of diagnostic codes. Five different methods for sorting records are described in the literature on costs of diabetes. Among the traditional approaches are selection of records having a “principal diagnosis” of diabetes (3,19–22), selection of records having a “principal or secondary diagnosis” of diabetes (20,21), and selection of “all care for persons with diabetes” without regard to information on any particular record (2) (see appendix for details). The three approaches yield minimum, intermediate, and maximum cost estimates, respectively.
Researchers in Texas developed an alternative approach based on expert review of the International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) codebook. The method identifies persons with diabetes and, after locating all of their records, searches for diagnoses or combinations of diagnoses and codes for diagnosis-related groups (DRGs) that are viewed as “clearly attributable to diabetes” or “probably attributable to diabetes,” given that the patient is known to have diabetes. The two categories combined are described as “clearly or probably attributable to diabetes,” and the aggregated costs for those hospital stays are suggested as an alternative intermediate estimate of hospital costs attributable to diabetes (23,24) (see appendix).
These five “numerator” methods sort medical records without regard to population denominators, and they have not often been examined to see how findings might compare to findings from a benchmark denominator method (see appendix). Thus, I sought to determine whether utilization and cost estimates from any of the alternative numerator methods for attributing medical records to diabetes reasonably approximate estimates based on attributable risk calculations. Close attention was given to the two intermediate estimators. Specifically, I expected that the “principal or secondary diagnosis” method would produce estimates too high to be useful and that the “clearly or probably attributable” method would produce estimates too low. Thus, I examined which of the two intermediate methods best approximates findings from the benchmark denominator method, and how numerator methods might be improved to make them more useful when population denominators are unavailable or diabetes prevalence is unclear.
RESEARCH DESIGN AND METHODS
The National Health Interview Surveys (NHIS) have asked if respondents or their family members have diabetes. Diabetes prevalence for the elderly in Texas was estimated by compiling data from the NHIS for the years 1987–1994 by sex, age group (65–74, 75–84, and ≥85 years), and ethnicity (non-Cuban and non–Puerto Rican Hispanic, African American, and non-Hispanic white/other). Stratified national estimates were applied to person-months of non-HMO, Part A (hospital insurance) Medicare enrollment in Texas in 1995 for the comparable elderly populations (Hispanic/Native American, African American, and non-Hispanic white/other) aged at least 65 years at the end of 1994.
To account for the effect of increasing numbers of people with diabetes, prevalence estimates were increased by half of incident cases (0.43% aged 65–74, 0.23% aged ≥75) based on national findings (25). The Medicare enrollment files did not consistently identify Hispanic enrollees, a population known to be at high risk for diabetes. Thus, to better estimate the extent of Hispanic Medicare enrollment, the number of person-months of Hispanic enrollment was scaled to census estimates of the size of that population based on the ratio of African American Medicare coverage in Texas relative to census estimates of the number of elderly African Americans in Texas in 1995, with the corresponding deduction made from the white/other count of person-months of enrollment. This step assured that the higher prevalence estimates for the Hispanic population would be applied to an appropriately sized estimate of person-months of Hispanic Medicare enrollment in Texas. Further adjustment was made to account for straight-line projection of diabetes prevalence for the elderly within the NHIS for the years 1987–1994 to the 1995 study year.
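As an illustration only, the stratified step described above can be sketched in Python. The strata, person-months, and rates below are hypothetical placeholders, not study data, and the names `Stratum`, `diabetic_person_months`, and `overall_prevalence` are invented for the example. Because the text does not settle whether the quoted percentages are the full incidence or the half that is added, the sketch takes the incidence adjustment as a single additive term supplied by the caller.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stratum:
    label: str                   # e.g. a sex/age/ethnicity cell
    person_months: float         # non-HMO, Part A enrollment months
    prevalence: float            # survey prevalence, as a fraction
    incidence_adjustment: float  # additive term for new cases, as a fraction


def diabetic_person_months(strata):
    """Person-months expected to belong to enrollees with diabetes."""
    return sum(
        s.person_months * (s.prevalence + s.incidence_adjustment)
        for s in strata
    )


def overall_prevalence(strata):
    """Enrollment-weighted prevalence across all strata."""
    total_months = sum(s.person_months for s in strata)
    return diabetic_person_months(strata) / total_months


if __name__ == "__main__":
    # Hypothetical values chosen only to exercise the functions.
    example = [
        Stratum("stratum A", 1000.0, 0.10, 0.0),
        Stratum("stratum B", 3000.0, 0.20, 0.01),
    ]
    print(diabetic_person_months(example), overall_prevalence(example))
```

Weighting by person-months rather than by head counts keeps the prevalence estimate on the same footing as the monthly cost averages used later in the attributable risk calculation.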
A statistical confidence interval for Texas diabetes prevalence was calculated from the NHIS, and the calculations incorporated weights to the Texas demographic structure. However, the reader is advised that the NHIS is a complex national survey and application of sampling weights to Texas demography is, at best, a risky procedure. Thus, stated confidence intervals are best viewed as approximations. As a check on the prevalence estimate derived from the NHIS, I also considered an estimate of diabetes prevalence among the Texas elderly derived from the Texas Behavioral Risk Factor Surveillance System (BRFSS) for 1994–1996 (26). Because relatively few elderly persons participated in that survey, the calculated confidence interval was relatively wide.
Estimates of inpatient utilization and costs for the study population were calculated from Medicare’s inpatient billing database (MEDPAR) for Texas in 1995. Attribution of utilization and costs to diabetes using the benchmark denominator method employed standard calculations for attributable risk among the exposed (27). Whereas estimates based on population attributable risk and attributable risk among the exposed can differ in certain situations (5,9), the distinction is not relevant for this study and the reader may regard the calculation as the difference in average monthly costs for persons with and without diabetes times total months of enrollment for persons with diabetes. The calculation yields the total excess cost for the population with diabetes. Methodological details were adapted from prior study of national diabetes costs conducted on behalf of the American Diabetes Association (1).
Each record in the MEDPAR database represented one hospital discharge in 1995. With up to 10 discharge diagnoses available for each record, a person with diabetes was defined by presence of the ICD-9-CM codes for diabetes (250) or hypoglycemia (251.0 or 251.2) in any position in any record for that person. A record mentioning diabetes was selected using the same criteria applied to that record. The number of hospital days was calculated by subtracting date of admission from date of discharge (same-day discharges were counted as 1 day of stay). Costs included amounts paid by Medicare plus copayments, deductibles, and third-party payments.
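The record-level definitions above can be expressed as a short Python sketch. It assumes that diagnosis codes are stored as dotted strings (for example "250.60"), that each record is a dictionary with hypothetical keys `person_id` and `diagnoses`, and that matching on the leading characters of a code is an acceptable reading of "codes for diabetes (250)"; none of these details come from the MEDPAR file layout.

```python
from datetime import date

# ICD-9-CM code stems named in the text: diabetes (250) and
# hypoglycemia (251.0, 251.2).
DIABETES_STEMS = ("250", "251.0", "251.2")


def is_diabetes_code(code):
    """True if a dotted ICD-9-CM code string begins with a diabetes stem."""
    return code.strip().startswith(DIABETES_STEMS)


def record_mentions_diabetes(diagnoses):
    """True if any diagnosis position on one record carries a diabetes code."""
    return any(is_diabetes_code(code) for code in diagnoses)


def persons_with_diabetes(records):
    """IDs of persons with a diabetes code on any of their records."""
    return {
        r["person_id"]
        for r in records
        if record_mentions_diabetes(r["diagnoses"])
    }


def days_of_stay(admitted, discharged):
    """Discharge date minus admission date; same-day stays count as 1 day."""
    return max((discharged - admitted).days, 1)


if __name__ == "__main__":
    print(days_of_stay(date(1995, 3, 1), date(1995, 3, 1)))
```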
Utilization and cost estimates that employed alternative numerator methods included two minimum estimates. The simpler of these was to select for records listing diabetes as “principal diagnosis.” A more complicated minimum estimate used methods previously employed in Texas to describe hospitalizations “clearly attributable to diabetes” (see appendix). The diagnoses employed for this method describe medical complications specific to diabetes. A maximum estimate included “all care for persons with diabetes” defined as all records for persons having any record that mentioned diabetes.
Of particular interest were two intermediate methods. The simpler method (principal or secondary diagnosis) selected records that mentioned diabetes in any position in the medical record. The more complicated method (clearly or probably attributable) selected for certain principal diagnoses or combinations of diagnoses and DRGs among records of persons known to have diabetes. The method expands on the “clearly attributable” method by including hospitalizations for many of the nonspecific complications of diabetes (see appendix).
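A minimal sketch of the three traditional numerator selections follows, assuming each record is a dictionary with hypothetical keys `person_id`, `diagnoses` (principal diagnosis first), and `cost`. The toy records are invented for illustration, and the prefix test for diabetes codes is an assumption about code formatting.

```python
DIABETES_STEMS = ("250", "251.0", "251.2")


def mentions_diabetes(code):
    return code.strip().startswith(DIABETES_STEMS)


def principal_diagnosis_cost(records):
    """Minimum estimate: records whose first-listed diagnosis is diabetes."""
    return sum(
        r["cost"] for r in records
        if r["diagnoses"] and mentions_diabetes(r["diagnoses"][0])
    )


def principal_or_secondary_cost(records):
    """Intermediate estimate: diabetes in any position on the record."""
    return sum(
        r["cost"] for r in records
        if any(mentions_diabetes(c) for c in r["diagnoses"])
    )


def all_care_cost(records):
    """Maximum estimate: every record of any person with diabetes."""
    diabetic_ids = {
        r["person_id"] for r in records
        if any(mentions_diabetes(c) for c in r["diagnoses"])
    }
    return sum(r["cost"] for r in records if r["person_id"] in diabetic_ids)


if __name__ == "__main__":
    # Invented records; costs are arbitrary.
    toy = [
        {"person_id": 1, "diagnoses": ["250.00", "401.9"], "cost": 100.0},
        {"person_id": 1, "diagnoses": ["410.9"], "cost": 200.0},
        {"person_id": 2, "diagnoses": ["428.0", "250.00"], "cost": 50.0},
        {"person_id": 3, "diagnoses": ["486"], "cost": 70.0},
    ]
    print(principal_diagnosis_cost(toy),
          principal_or_secondary_cost(toy),
          all_care_cost(toy))
```

By construction each selection contains the previous one, which is why the three totals are ordered as minimum, intermediate, and maximum.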
RESULTS
The 1995 Medicare enrollment database for Texas included 1.72 million individuals who were aged at least 65 years at the end of the prior year. Together, they had almost 20 million person-months of non-HMO, Part A enrollment. The 1995 inpatient database included 553,556 non-HMO hospital stays for those individuals, with 3.70 million days of stay and a total cost of $3.8 billion.
Diabetes prevalence for 1995 among the elderly in Texas was estimated at 11.7% (95% CI 11.2–12.2) from the NHIS. Prevalence was estimated at 12.6% (9.7–15.4) using the Texas BRFSS. Using the NHIS prevalence estimate, ∼76,200 excess hospitalizations were attributed to diabetes, with ∼568,000 excess days of stay and an excess cost of $536 million (Table 1). Average monthly excess cost for a person with diabetes was $232. Using the BRFSS, 71,400 hospitalizations were attributed to diabetes, with ∼536,000 days of stay and a cost of $502 million. Thus, the two cost estimates suggest that ∼13–14% of total inpatient costs were attributable to diabetes. The 95% CI for the cost estimate based on the NHIS was narrow and ranged from $517 million to $554 million. The broader interval based on the BRFSS ranged from $390 million to $607 million.
Among the respective numerator methods for attributing costs to diabetes, the two minimum estimates differed little: $53.3 million using the “principal diagnosis” method and $53.4 million using the “clearly attributable” method. This finding was expected, as the “clearly attributable” method differed little from the “principal diagnosis” method. It included all cases with diabetes as principal diagnosis and added only a handful of cases where diabetes itself either was not listed or was listed among secondary diagnoses. The maximum estimate of $919 million for “all care” implies that persons with diabetes accounted for ∼24% of total costs.
When costs were attributed on the basis of diabetes as “principal or secondary diagnosis,” estimates were much higher ($773 million) than could reasonably be attributed on the basis of the benchmark method. Findings from the “clearly or probably attributable” method were comparatively low when the benchmark was based on prevalence data from the NHIS. However, when the benchmark was based on the BRFSS, with the much broader confidence interval, findings from the “clearly or probably attributable” method were ambiguous. The numbers of attributed hospitalizations and attributed costs, while low, were within the specified confidence interval. The estimate for number of attributable patient days, on the other hand, was outside the confidence interval.
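These comparisons can be reproduced from the Table 1 figures with a few lines of Python. The dictionaries simply transcribe the table (stays and days in thousands, cost in $ millions); the function names are illustrative and not part of the study.

```python
# Benchmark (denominator method, NHIS prevalence) and intervals from Table 1.
BENCHMARK_NHIS = {"stays": 76.2, "days": 567.8, "cost": 535.7}
NHIS_CI = {"stays": (73.5, 78.9), "days": (549.9, 585.4), "cost": (516.9, 554.2)}
BRFSS_CI = {"stays": (55.1, 86.6), "days": (428.8, 635.9), "cost": (390.0, 607.3)}

# Numerator-method findings from Table 1.
NUMERATOR = {
    "principal diagnosis": {"stays": 8.9, "days": 60.9, "cost": 53.3},
    "clearly attributable": {"stays": 8.9, "days": 61.0, "cost": 53.4},
    "clearly or probably attributable": {"stays": 59.3, "days": 381.4, "cost": 402.7},
    "principal or secondary diagnosis": {"stays": 112.3, "days": 780.6, "cost": 773.4},
    "all care": {"stays": 132.9, "days": 933.0, "cost": 918.9},
}


def percent_of_benchmark(value, benchmark):
    """A numerator estimate expressed as a percentage of the benchmark."""
    return 100.0 * value / benchmark


def within(value, interval):
    """True if value lies inside the closed interval (low, high)."""
    low, high = interval
    return low <= value <= high


if __name__ == "__main__":
    for method, found in NUMERATOR.items():
        pct = percent_of_benchmark(found["cost"], BENCHMARK_NHIS["cost"])
        inside = {m: within(found[m], BRFSS_CI[m]) for m in found}
        print(f"{method}: {pct:.0f}% of benchmark cost; within BRFSS CI: {inside}")
```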
The distinction between findings from the two intermediate numerator methods can be clarified by reversing the calculations; that is, if we accept the utilization and cost estimates derived from each of the two methods, then we can calculate the implications for prevalence and look to see if resulting prevalence estimates are reasonable (Table 1). For example, if the cost estimate based on the “principal or secondary diagnosis” method were accepted, then it would imply that diabetes prevalence among the elderly was <5%. Thus, we can reject findings from that method as entirely unreasonable. Similar calculations using the “clearly or probably attributable” method suggest that diabetes prevalence was ∼14.7%–16.6%, depending on the measure. Although these prevalence estimates are a bit high, they are not unreasonable when comparison is made with findings from the BRFSS. This analysis suggests that utilization and cost estimates derived from the “clearly or probably attributable” method, while low, were closer to the benchmarks than were estimates from the “principal or secondary diagnosis” method.
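The text does not spell out the algebra behind the reversed calculation, so the following Python sketch shows one plausible reading. It assumes a single unstratified prevalence p, takes "all care" cost as the total cost of persons with diabetes, and uses the rounded total of $3.8 billion from the text. Under the attributable risk formula, excess cost equals C_d - [p / (1 - p)] (T - C_d), where C_d is the cost of all care for persons with diabetes and T is total cost; solving for p gives the function below. The name `implied_prevalence` is hypothetical.

```python
def implied_prevalence(attributed_cost, cost_with_diabetes, total_cost):
    """Prevalence at which the attributable risk formula would return
    attributed_cost as the excess cost.

    From excess = C_d - odds * (T - C_d), with odds = p / (1 - p):
        odds = (C_d - excess) / (T - C_d)
        p    = odds / (1 + odds)
    """
    odds = (cost_with_diabetes - attributed_cost) / (total_cost - cost_with_diabetes)
    return odds / (1.0 + odds)


if __name__ == "__main__":
    TOTAL_COST = 3800.0  # $ millions, rounded figure from the text
    ALL_CARE = 918.9     # $ millions, Table 1
    for label, cost in [
        ("principal or secondary diagnosis", 773.4),
        ("clearly or probably attributable", 402.7),
    ]:
        p = implied_prevalence(cost, ALL_CARE, TOTAL_COST)
        print(f"{label}: implied prevalence {100 * p:.1f}%")
```

Because the inputs are rounded and the study applied stratified prevalence, results from this sketch should be expected to agree with the cost column of Table 1 only approximately.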
By way of checking reliability in the coding of diagnostic data, I reviewed all hospital stays defined as “clearly attributable” to diabetes to check whether such records included a diabetes diagnosis. Of 8,895 hospital stays selected by that method, all but four mentioned diabetes. Similarly, records selected as “probably attributable” (excluding those that were “clearly attributable”) were reviewed for mention of diabetes. Of 50,367 records in that category, 7,043 had no mention of diabetes. A third review searched for records of persons known to have diabetes that were not selected as “clearly or probably attributable.” Of 72,720 such stays, 60,130 mentioned diabetes and the rest did not. Finally, we should note that hospital stays selected by the “clearly or probably attributable” method were, for the most part, a subset of those selected by the “principal or secondary diagnosis” method. As noted, 60,130 stays rejected by the “clearly or probably attributable” method mentioned diabetes, and were thus selected by the “principal or secondary diagnosis” method. Conversely, as described, 7,043 records that did not mention diabetes were selected as “clearly or probably attributable” to diabetes.
CONCLUSIONS
In the absence of clear population denominators or when diabetes prevalence is uncertain, researchers have no clear method for attributing costs to diabetes. As a crude approximation, researchers might be tempted to simply deduct ∼31% from findings from the “principal or secondary diagnosis” method or, alternatively, add 33% to findings from the “clearly or probably attributable” method. However, the reader is reminded of the many limitations to this study, principally that it was limited to elderly Medicare enrollees in Texas. The nature of diabetes and associated treatments likely differ for the elderly and nonelderly (28). Also, prevalence estimates reflect only persons with diagnosed diabetes and exclude undiagnosed cases. Given that about one-third of cases are undiagnosed (8), this study assumed, perhaps incorrectly, that preclinical cases did not substantially influence diabetes costs. There is question regarding the accuracy of the NHIS because prevalence of diagnosed diabetes was either self-reported (7,29) or based on secondary reports on the status of other family members (30), and because institutional residents, many with diabetes, were not covered by the survey (29). Researchers have reported excellent agreement between self-reports and medical records concerning diabetes status (25), and evaluation of the NHIS found that, on the whole, the survey accurately captured diagnosed diabetes (31). It is not clear whether the NHIS can be reasonably applied to the Texas population, even when applied across demographic strata, because we do not know whether prevalence within the respective demographic groups in Texas equals national prevalence for those groups. Similarly, propensity to diagnose diabetes may differ for Texas in comparison to the nation. Finally, calculation of confidence intervals from NHIS data lacks validity when applied to the Texas population.
This study assumed that diabetes prevalence and impact did not substantially differ for Medicare HMO and non-HMO enrollees, and that presence or absence of diabetes did not differentially influence propensity to enroll in HMOs. Whereas national data suggest little difference between persons with and without diabetes in terms of health insurance coverage (32), at least one study of elderly Mexican Americans reported that 95% of those with diabetes had Medicare coverage versus 91% of those without diabetes (33).
Incorrect estimation of diabetes prevalence among the elderly would substantially influence cost findings. For example, in this study, the relatively small difference in prevalence estimates from the NHIS and the BRFSS resulted in cost estimates that differed by $34 million. As an alternative, diabetes prevalence could be based on estimates of 22% prevalence among both elderly Hispanics (34) and African Americans (35). These figures, combined with an approximate 9% prevalence among the remaining elderly population and applied to Texas population estimates for 1995, suggest an overall prevalence of 11.6% among the elderly, a figure comparable to that derived from the NHIS.
Within the medical care system, there is little consistency or completeness in the identification or coding of diseases (36), and omission of diabetes is common. If an individual with diabetes was hospitalized during the year, but had no diagnosis of diabetes noted on any record, then that individual’s care would be counted among those without diabetes. In such cases, the record would be counted as nondiabetic and would result in a downward bias for estimates. Also, findings may be influenced by practices within the medical care system. Physician or institutional responses to a person with diabetes may differ from those for a person without diabetes, even when cases do not substantially differ. For example, observed presence of diabetes may increase propensity to hospitalize or may result in greater intensity of care (36). Finally, this study does not control for independent factors, such as obesity, that can influence both diabetes and many of the nonspecific complications of diabetes. In such situations, added costs would be incorrectly attributed to diabetes, and the calculation for attributable risk would not factor out the independent effects (5).
Because the health care system has few of the attributes of a free market system that economists would like to see (28), the reader is encouraged to view the term “costs” as simply reflecting expenditures rather than the economic costs of diabetes. Also, the DRG system averages prices across groups of cases, resulting in a loss of information on individual cases (37). This is especially a concern for a study of the elderly, who are more subject to comorbidities (28).
The “clearly or probably attributable” method could be improved. For example, the report describing the method set aside certain medical procedure codes for future study (23). Also, the 60,000 records selected by the “principal or secondary diagnosis” method, but omitted by the “clearly or probably attributable” method, could be reviewed for their relevance to the problem at hand. This could be accomplished either by expert opinion or by empirical analysis of the relative risks for those principal diagnoses among persons with diabetes in comparison to persons without diabetes. The finding of a small number of “clearly attributable” records that did not mention diabetes suggests that those diagnoses might be used to expand the definition of diabetes. At least two other studies have included additional diagnostic criteria to help identify a larger pool of persons with diabetes (2,21), and adoption of more inclusive criteria might help to offset the problems resulting from failure to mention diabetes among medical records of persons known to have the disease.
APPENDIX: Alternative methods for attributing hospital records and costs to diabetes
Diabetes is defined by ICD-9-CM codes 250, 251.0, or 251.2. A person with diabetes is a person with a code for diabetes in any position in any hospital record. Numerator methods select hospital records according to various diagnostic criteria. The benchmark denominator method calculates excess per capita hospitalization and cost for the population with diabetes.
Traditional numerator methods
Principal diagnosis (minimum estimate).
Medical records with a first-listed diagnosis of diabetes.
Principal or secondary diagnosis (intermediate estimate).
Medical records with a diagnosis of diabetes in any position in the record.
All care for persons with diabetes (maximum estimate).
All records for persons with diabetes, even records that do not mention diabetes.
Experimental numerator methods
Clearly attributable to diabetes (minimum estimate).
Identify all persons in the database with diabetes, locate all of their records, and select records with a principal diagnosis of ICD-9-CM 250, 251.0, 251.2, 357.2, 362.0, 364.4, 648.0, or 790.2; 337.1, 358.1, or 713.5 if 250.6 is secondary; 731.8 if 250.8 is secondary; 443.81 or 785.4 if 250.7 is secondary; 581.81 or 583.81 if 250.4 is secondary; or 366.41 if 250.5 is secondary. These are codes for diabetes or complications specific to diabetes.
Clearly or probably attributable to diabetes (intermediate estimate).
In addition to records selected as clearly attributable to diabetes, select from among records of persons with diabetes those with a principal diagnosis of ICD-9-CM 112.1-.3, 272.0-.4, 276.7, 352.9, 354–5, 362.1-.5, 362.8-.9, 365.0-.1, 365.5-.6, 366.0-.1, 366.3-.4, 366.8-.9, 368.1-.4, 368.8–369, 377.1, 377.4-.6, 380.1, 401–5, 410–4, 425.4, 425.9, 426–8, 429.1, 429.3, 430–6, 437.0, 437.1, 437.7–438, 440–2, 443.1, 443.8-.9, 444, 447.0-.2, 447.9, 458.0, 459.0, 558.9, 567.2, 567.8, 581.8-.9, 583.8-.9, 585–8, 590, 593.1, 593.6, 593.8, 595.0, 595.3, 595.9, 596.4-.5, 596.9, 599.0, 607.8, 707.1, 707.8-.9, 709.3, 716.9, 729.2, 730.1, 791.0, 791.5-.6, or 896; 337.1, 358.1, or 713.5 if 250.6 is not secondary; 731.8 if 250.8 is not secondary; 785.4 if 250.7 is not secondary; or 885–7, 895, or 897 with DRG 108, 110–114, 130–131, or 285. In some instances, criteria are more stringent when fourth or fifth digits are available. These are codes for many of the nonspecific complications of diabetes, and they are viewed as “probably attributable” to diabetes, given that the patient is known to have diabetes.
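The “clearly attributable” rule lends itself to a small lookup structure, sketched below in Python. The sketch makes several assumptions that are not in the source: codes are dotted strings, listed codes are matched by prefix so that longer fourth- or fifth-digit codes also match, and the check is applied to one record at a time, leaving aside the prior step of restricting to persons with diabetes. The “probably attributable” list could be handled the same way with a longer table, plus the DRG condition.

```python
# Principal diagnoses that qualify on their own.
UNCONDITIONAL = ("250", "251.0", "251.2", "357.2", "362.0", "364.4",
                 "648.0", "790.2")

# Principal diagnoses that qualify only when a given diabetes
# subcode appears among the secondary diagnoses.
CONDITIONAL = {
    "250.6": ("337.1", "358.1", "713.5"),
    "250.8": ("731.8",),
    "250.7": ("443.81", "785.4"),
    "250.4": ("581.81", "583.81"),
    "250.5": ("366.41",),
}


def clearly_attributable(diagnoses):
    """Apply the rule to one record; diagnoses[0] is the principal diagnosis."""
    if not diagnoses:
        return False
    principal, secondary = diagnoses[0].strip(), diagnoses[1:]
    if principal.startswith(UNCONDITIONAL):
        return True
    for required_secondary, principals in CONDITIONAL.items():
        if principal.startswith(principals) and any(
            code.strip().startswith(required_secondary) for code in secondary
        ):
            return True
    return False


if __name__ == "__main__":
    print(clearly_attributable(["337.1", "250.60"]))
    print(clearly_attributable(["337.1", "401.9"]))
```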
Benchmark denominator method.
Employs calculation for attributable risk among the exposed. Identify persons with diabetes and locate all of their records. Sum the costs associated with those records and divide by the number of months of enrollment for the underlying population with diabetes, as estimated from population prevalence data. Subtract the monthly per capita cost for the underlying population without diabetes. Multiply by the total months of enrollment for the underlying population with diabetes. The result is the excess cost for the population with diabetes.
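The steps above can be sketched in Python as follows. This is an unstratified simplification: it uses one overall prevalence figure, whereas the study applied age-, sex-, and ethnicity-specific prevalence, and the inputs in the usage example are the rounded totals quoted in the text (about 20 million person-months and $3.8 billion). The function name and parameters are hypothetical.

```python
def excess_cost(cost_with_diabetes, total_cost, total_person_months, prevalence):
    """Attributable risk among the exposed, expressed as total excess cost.

    cost_with_diabetes  -- summed cost of all records for persons with diabetes
    total_cost          -- summed cost of all records in the population
    total_person_months -- months of enrollment for the whole population
    prevalence          -- diabetes prevalence as a fraction (0 < p < 1)
    """
    months_with = prevalence * total_person_months
    months_without = total_person_months - months_with

    monthly_with = cost_with_diabetes / months_with
    monthly_without = (total_cost - cost_with_diabetes) / months_without

    # Excess monthly cost per person with diabetes, scaled to the
    # diabetic population's total months of enrollment.
    return (monthly_with - monthly_without) * months_with


if __name__ == "__main__":
    # Rounded inputs from the text; the result is only roughly
    # comparable to the stratified benchmark in Table 1.
    estimate = excess_cost(
        cost_with_diabetes=918.9e6,
        total_cost=3.8e9,
        total_person_months=20e6,
        prevalence=0.117,
    )
    print(f"Excess cost: ${estimate / 1e6:.0f} million")
```

Note that the person-months term cancels algebraically, so the result depends on prevalence only through the ratio p / (1 - p); this is why small errors in prevalence carry through directly to the cost estimate.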
Table 1. Alternative estimates of Medicare non-HMO hospital stays, days, and costs attributable to diabetes among the Texas elderly, 1995
| | Hospital stays (thousands) | Hospital days (thousands) | Cost ($ millions) |
|---|---|---|---|
| Findings from denominator method, prevalence estimated from: | | | |
| NHIS (1987–1994) | 76.2 | 567.8 | 535.7 |
| 95% CI | 73.5–78.9 | 549.9–585.4 | 516.9–554.2 |
| Texas BRFSS (1994–1996) | 71.4 | 535.8 | 502.2 |
| 95% CI | 55.1–86.6 | 428.8–635.9 | 390.0–607.3 |
| Findings from numerator methods: | | | |
| Principal diagnosis of diabetes | 8.9 | 60.9 | 53.3 |
| Clearly attributable to diabetes | 8.9 | 61.0 | 53.4 |
| Clearly or probably attributable to diabetes | 59.3 | 381.4 | 402.7 |
| Principal or secondary diagnosis of diabetes | 112.3 | 780.6 | 773.4 |
| All care for persons with diabetes | 132.9 | 933.0 | 918.9 |
| Prevalence implications of: | | | |
| Principal or secondary diagnosis method (%) | 4.4 | 5.2 | 4.8 |
| Clearly or probably attributable method (%) | 14.7 | 16.6 | 15.1 |
| Texas elderly diabetes prevalence (%) estimated from: | | | |
| NHIS (1987–1994) | 11.7 | | |
| 95% CI | 11.2–12.2 | | |
| Texas BRFSS (1994–1996) | 12.6 | | |
| 95% CI | 9.7–15.4 | | |
Article Information
This project was funded by Grant #30-P-90725/6-01 from the Office of Research and Demonstrations, Health Care Financing Administration.
The author thanks David C. Warner, PhD, of the Lyndon B. Johnson School of Public Affairs, University of Texas at Austin; and Jacqueline A. Pugh, MD, John E. Cornell, PhD, and Louis A. DeNino, PhD, of the Department of Medicine, the University of Texas Health Science Center at San Antonio, for their work on costs of diabetes in Texas and for development of methods evaluated by this study.
Address correspondence and reprint requests to Roy R. McCandless, DrPH, Department of Family and Community Medicine, University of California, San Francisco, 3333 California St., Suite 365, San Francisco, CA 94118.
Received for publication 27 February 2002 and accepted in revised form 29 July 2002.
A table elsewhere in this issue shows conventional and Système International (SI) units and conversion factors for many substances.