Using real-world data (RWD) from three U.S. claims data sets, we aimed to predict the findings of the CARdiovascular Outcome Trial of LINAgliptin Versus Glimepiride in Type 2 Diabetes (CAROLINA), which compares linagliptin with glimepiride in patients with type 2 diabetes (T2D) at increased cardiovascular risk, using a novel framework that requires passing prespecified validity checks before analyzing the primary outcome.
Within Medicare and two commercial claims data sets (May 2011–September 2015), we identified a 1:1 propensity score–matched (PSM) cohort of T2D patients 40–85 years old at increased cardiovascular risk who initiated linagliptin or glimepiride by adapting eligibility criteria from CAROLINA. PSM was used to balance >120 confounders. Validity checks included the evaluation of expected power, covariate balance, and two control outcomes for which we expected a positive association and a null finding. We registered the protocol (NCT03648424, ClinicalTrials.gov) before evaluating the composite cardiovascular outcome based on CAROLINA’s primary end point. Hazard ratios (HR) and 95% CIs were estimated in each data source and pooled with a fixed-effects meta-analysis.
We identified 24,131 PSM pairs of linagliptin and glimepiride initiators with sufficient power for noninferiority (>98%). Exposure groups achieved excellent covariate balance, including key laboratory results, and expected associations between glimepiride and hypoglycemia (HR 2.38 [95% CI 1.79–3.13]) and between linagliptin and end-stage renal disease (HR 1.08 [0.66–1.79]) were replicated. Linagliptin was associated with a 9% decreased risk in the composite cardiovascular outcome with a CI including the null (HR 0.91 [0.79–1.05]), in line with noninferiority.
In a nonrandomized RWD study, we found that linagliptin has noninferior risk of a composite cardiovascular outcome compared with glimepiride.
The 21st Century Cures Act mandates that the U.S. Food and Drug Administration (FDA) establish a program to evaluate the potential use of real-world evidence (RWE) to support a new indication for a drug or postapproval study requirements (1). In the framework for FDA’s Real World Evidence Program, real-world data (RWD) includes longitudinal electronic health records, medical claims and billing data, and patient-generated data (2). Nonrandomized database studies are noninterventional clinical study designs in which the study identifies the population and determines the exposure/treatment from data generated before the initiation of the study (2). Health insurance claims data contain structured diagnosis and procedure information, capture patient experiences across the care continuum, and are frequently used by the FDA to evaluate drug safety (3).
Questions remain on whether RWD analyses of claims data can be reliably used for regulatory decision making to evaluate not only the safety of medications but also their effectiveness (4). Frequent criticisms include the lack of prespecified protocols, avoidable design and analytic flaws, data quality issues, and confounding biases that may threaten the validity of findings. Some investigators have designed such studies to match published randomized controlled trials (RCTs) and have successfully replicated (5) or predicted (6–8) trial findings, suggesting that principled methodology can generate accurate information based on RWD in certain cases. Several threats to validity can be assessed before registering the protocol and implementing a nonrandomized database study, so that the decision to move forward with a given study and the analytic plan are well documented. Such a process will instill greater confidence in the ability of nonrandomized database studies to reproduce findings of a comparable RCT.
The CARdiovascular Outcome Trial of LINAgliptin Versus Glimepiride in Type 2 Diabetes (CAROLINA) study (9) is an ongoing RCT designed to assess whether linagliptin, a dipeptidyl peptidase 4 (DPP-4) inhibitor, is noninferior and, if so, superior compared with the sulfonylurea glimepiride with respect to cardiovascular events in adults with type 2 diabetes (T2D) at increased risk of cardiovascular events. Given that medications of both classes are frequently used as second-line therapy after metformin, and because sulfonylureas have been associated with concerns regarding their cardiovascular safety (10), the results of this trial could have a significant impact on clinical practice if practitioners conclude there is a difference in cardiovascular safety between the drugs. Trial recruitment started in 2010, and results are expected in 2019.
Using RWD, we aimed to predict CAROLINA’s findings within a framework that requires passing prespecified validity checks before analyzing the primary end point (11). This study is part of a series that aims to predict the findings of ongoing trials before their completion.
Research Design and Methods
This study included data from two commercial U.S. health insurance claims data sets (Optum Clinformatics and IBM MarketScan) and fee-for-service Medicare claims data. For each insured individual, the three data sets contain demographic information, health plan enrollment status, longitudinal patient-level information on all reimbursed medical services, inpatient and outpatient diagnoses and procedures, and pharmacy dispensing records, including information on medication start and refill, strength, quantity, and days’ supply. Optum and MarketScan are both linked to laboratory test results provided by two national laboratory test provider chains. Through this linkage, results for outpatient laboratory tests are available for a subset of beneficiaries.
Within the three U.S. RWD sources, we identified a cohort of T2D patients 40–85 years old at increased cardiovascular risk who initiated linagliptin or glimepiride from May 2011 (in accordance with the approval of linagliptin in the U.S.) to September 2015, adapting eligibility criteria from CAROLINA (Fig. 1 and Supplementary Table 1). Cohort entry date was the day of the first filled prescription of linagliptin or glimepiride among patients with at least 6 months of continuous enrollment before drug initiation. We used 1:1 propensity score matching (PSM) to control for >120 potential confounders, which were measured during the 6 months before cohort entry and included demographics, calendar time, comorbidities, diabetes-specific complications, use of diabetes and other medications, and indicators of health care utilization as proxy for overall disease state, care intensity, and surveillance. Laboratory test results, which were available in a 7% subset of the population, were also measured at baseline, although they were not included in the claims-based PS model.
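The matching step can be illustrated with a minimal sketch. The text states only that 1:1 PSM was used; the greedy nearest-neighbor algorithm, the caliper of 0.05 on the probability scale, the function name `greedy_match`, and all patient identifiers and scores below are assumptions made for illustration. The propensity scores are taken as already estimated.

```python
def greedy_match(treated, comparators, caliper=0.05):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    treated, comparators: dicts mapping a patient id to an estimated
    propensity score. The caliper (maximum allowed score distance) is an
    illustrative assumption, not a value taken from the study.
    """
    available = dict(comparators)  # comparators not yet matched
    pairs = []
    for t_id, t_ps in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining comparator on the propensity score
        c_id = min(available, key=lambda c: abs(available[c] - t_ps))
        if abs(available[c_id] - t_ps) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # each comparator is used at most once
    return pairs


# Hypothetical scores for three linagliptin and four glimepiride initiators
linagliptin = {"L1": 0.30, "L2": 0.62, "L3": 0.95}
glimepiride = {"G1": 0.31, "G2": 0.60, "G3": 0.305, "G4": 0.10}
matched = greedy_match(linagliptin, glimepiride)
print(matched)
```

In this toy example, patient L3 stays unmatched because no comparator falls within the caliper, which is how matching excludes initiators without a comparable counterpart.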
The primary outcome was a composite cardiovascular outcome of hospitalization for myocardial infarction, stroke, or death (Supplementary Table 2), adapted from the CAROLINA study’s primary end point of three-point major adverse cardiovascular event (3-P MACE) composite, comprising nonfatal myocardial infarction, nonfatal stroke, or cardiovascular death. The availability of mortality information varied by database. Medicare fee-for-service included complete information on all-cause mortality, MarketScan included information on in-hospital death, and no information was available in Clinformatics. Individual components of the composite cardiovascular outcome were also analyzed as secondary outcomes.
Follow-up started on the day after cohort entry and continued in an “as-treated” approach until treatment discontinuation or switch to a comparator, occurrence of an event of interest, nursing home admission, plan disenrollment, or end of the study period, whichever came first. In case of treatment interruption or discontinuation, we extended the exposure effect window until 30 days after the end of the last prescription’s supply. In line with CAROLINA, we allowed study participants who initiated linagliptin or glimepiride to be exposed to nonglimepiride sulfonylureas before cohort entry. Because in the trial patients in both arms could be exposed to nonglimepiride sulfonylureas, we censored patients who added a nonlinagliptin DPP-4 inhibitor but did not censor patients who added a nonglimepiride sulfonylurea.
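The as-treated censoring rules above can be sketched as a small function that returns the earliest qualifying date for one patient. The function name `follow_up`, its argument names, the example dates, and the rule that a same-day tie counts as an event are assumptions for illustration. Only the 30-day exposure effect window and the list of censoring reasons come from the text.

```python
from datetime import date, timedelta

GRACE_DAYS = 30  # exposure effect window after the last supply ends (from the text)


def follow_up(cohort_entry, supply_end, study_end, outcome=None,
              comparator_switch=None, nursing_home=None, disenrollment=None):
    """Return (start, end, reason) for one patient's as-treated follow-up."""
    start = cohort_entry + timedelta(days=1)  # follow-up starts the day after entry
    # "outcome" is listed first so that a same-day tie counts as an event
    # (a tie-breaking assumption made for this sketch).
    candidates = {
        "outcome": outcome,
        "comparator_switch": comparator_switch,
        "end_of_exposure": supply_end + timedelta(days=GRACE_DAYS),
        "nursing_home": nursing_home,
        "disenrollment": disenrollment,
        "end_of_study": study_end,
    }
    observed = {k: d for k, d in candidates.items() if d is not None}
    reason = min(observed, key=observed.get)  # earliest date ends follow-up
    return start, observed[reason], reason


# Hypothetical patient: the outcome occurs before the exposure window closes
start, end, reason = follow_up(
    cohort_entry=date(2012, 3, 1),
    supply_end=date(2012, 5, 29),
    study_end=date(2015, 9, 30),
    outcome=date(2012, 6, 10),
)
print(start, end, reason)
```

A patient with no outcome and the same dispensing history would instead be censored 30 days after the last supply ended.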
We purposefully chose an as-treated analysis rather than an intention-to-treat analysis to address the high rate of treatment discontinuation in routine care. The as-treated analysis avoids the substantial exposure misclassification that often occurs when intention-to-treat analyses are applied in RWE studies and that typically biases findings toward the null. Hazard ratios (HRs) and 95% CIs were estimated in the PSM cohort using unstratified Cox regression models. Analyses were conducted in each data source separately and then pooled across data sources using a fixed-effects meta-analysis.
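Fixed-effects pooling can be sketched with inverse-variance weights on the log HR scale. The inputs are the database-specific estimates for the primary outcome reported in the results (Medicare, MarketScan, Clinformatics). Recovering each standard error from the published 95% CI, and the function names used here, are assumptions of this sketch rather than a description of the software actually used.

```python
import math

Z95 = 1.959964  # standard normal quantile for a two-sided 95% CI


def log_hr_and_se(hr, lo, hi):
    """Back-calculate log(HR) and its standard error from a 95% CI."""
    return math.log(hr), (math.log(hi) - math.log(lo)) / (2 * Z95)


def pool_fixed(estimates):
    """Inverse-variance fixed-effects pooling of (HR, lower, upper) tuples."""
    weights, weighted_logs = [], []
    for hr, lo, hi in estimates:
        y, se = log_hr_and_se(hr, lo, hi)
        w = 1.0 / se ** 2  # more precise estimates get more weight
        weights.append(w)
        weighted_logs.append(w * y)
    pooled = sum(weighted_logs) / sum(weights)
    se_pooled = math.sqrt(1.0 / sum(weights))
    return (math.exp(pooled),
            math.exp(pooled - Z95 * se_pooled),
            math.exp(pooled + Z95 * se_pooled))


# Database-specific HRs (95% CI) for the composite cardiovascular outcome
by_database = [
    (0.96, 0.83, 1.12),  # Medicare
    (0.76, 0.47, 1.22),  # MarketScan
    (0.44, 0.23, 0.87),  # Clinformatics
]
hr, lo, hi = pool_fixed(by_database)
print(f"pooled HR {hr:.2f} ({lo:.2f}-{hi:.2f})")
```

Because the inputs are rounded published values, the result should land close to, but need not exactly equal, the reported pooled estimate of HR 0.91 (0.79–1.05). Medicare carries most of the weight because its CI is much narrower.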
Several prospective validity checks were conducted. First, we calculated the expected power to establish noninferiority for the primary end point at an α level of 0.05, that is, the power for the upper bound of the 95% CI for the HR to exclude the margin of 1.3 specified in CAROLINA (9) and mandated by the FDA for cardiovascular outcome trials evaluating new therapies for T2D (12).
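A rough version of this power calculation can be sketched with the common approximation that, under 1:1 matching, the variance of the log HR is about 4 divided by the number of events. The event counts below, the assumed true HR of 1.0, the function name, and the reading of the α level as a two-sided 95% CI are all assumptions for illustration. The study's actual event counts and power method are not given in this section.

```python
from math import log, sqrt
from statistics import NormalDist


def noninferiority_power(events, margin=1.3, true_hr=1.0, alpha_two_sided=0.05):
    """Approximate power to keep the upper 95% CI bound below the margin.

    Assumes 1:1 allocation, so Var(log HR) is roughly 4 / events.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha_two_sided / 2)
    se = 2 / sqrt(events)
    return NormalDist().cdf((log(margin) - log(true_hr)) / se - z_alpha)


# Hypothetical event counts, for illustration only
for events in (200, 400, 800):
    print(events, round(noninferiority_power(events), 3))
```

Under these assumptions power depends on the number of events rather than the number of patients, which is why a large cohort can be adequately powered despite short follow-up.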
Then, we assessed the postmatching balance of potential confounders between exposure groups by calculating standardized differences, with meaningful imbalances set at values >0.1, and postmatching C-statistic, which is expected to be close to 0.5 when balance is present (13). The potential for residual confounding by unmeasured factors not included in the PS model was evaluated by inspecting the balance in key baseline laboratory results in the population subset with this information available.
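The balance diagnostic can be sketched with the usual standardized difference formulas for binary and continuous covariates. The covariate values below are hypothetical. Only the 0.1 threshold comes from the text.

```python
from math import sqrt


def std_diff_binary(p_treated, p_comparator):
    """Standardized difference for a binary covariate (prevalences in 0..1)."""
    pooled_var = (p_treated * (1 - p_treated)
                  + p_comparator * (1 - p_comparator)) / 2
    return (p_treated - p_comparator) / sqrt(pooled_var)


def std_diff_continuous(mean_t, sd_t, mean_c, sd_c):
    """Standardized difference for a continuous covariate."""
    return (mean_t - mean_c) / sqrt((sd_t ** 2 + sd_c ** 2) / 2)


def imbalanced(d, threshold=0.1):
    """Flag a meaningful imbalance using the threshold stated in the text."""
    return abs(d) > threshold


# Hypothetical covariates: a comorbidity prevalence before matching,
# and mean age (SD) after matching
d_before = std_diff_binary(0.30, 0.25)
d_after = std_diff_continuous(70.0, 9.0, 70.5, 9.0)
print(round(d_before, 3), imbalanced(d_before))
print(round(d_after, 3), imbalanced(d_after))
```

Unlike a P value, the standardized difference does not depend on sample size, so the same threshold can be applied across data sources of very different sizes.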
Finally, we evaluated two control outcomes with the aim to replicate known associations. Specifically, we assessed the risk of severe hypoglycemia, defined as an emergency department visit or a hospitalization for hypoglycemia, for which we expected an increased risk associated with the initiation of glimepiride (14), and the risk of incident end-stage renal disease (ESRD), for which we expected a null finding (15) (see Supplementary Table 2 for outcome definitions). ESRD was specifically chosen as a control outcome because of the known preferential prescribing of linagliptin toward patients with chronic kidney disease (16), which may lead to an apparent elevation in ESRD risk associated with the use of linagliptin.
All analyses were performed using Aetion platform version 3.11 with R version 3.4.2, which has previously been scientifically validated by accurately repeating a range of previously published studies (17) and by replicating (18) or predicting clinical trial findings (6). After all validity checks were met and confidence for accurately predicting results from CAROLINA was achieved, the protocol was registered (NCT03648424, ClinicalTrials.gov), and the primary end point analysis conducted. All individual data were deidentified, the study was approved by the Brigham and Women’s Hospital Institutional Review Board, and signed data license agreements were in place for all data sources.
Results
The final eligible study cohort included 164,176 patients with T2D, of whom 24,842 initiated linagliptin and 139,334 initiated glimepiride. Before PSM, linagliptin initiators tended to be younger but had a greater burden of comorbidities such as hypertension, hyperlipidemia, and chronic kidney disease (Table 1 and Supplementary Table 3). After PSM, we identified 24,131 patient pairs of linagliptin versus glimepiride initiators, and exposure groups achieved excellent covariate balance, with standardized differences for all covariates of <0.1 (Table 1 and Supplementary Table 4). Compared with the CAROLINA participants, the patients included in our study population were older (mean age 70 vs. 64 years) and more frequently women; however, they had a similar burden of comorbidities, such as prior cardiovascular events and renal dysfunction, and patterns of medication use (Table 1). The estimated power exceeded 98% for noninferiority, and the postmatching C-statistic was 0.53 (Table 2). Key baseline laboratory results (HbA1c, lipid levels, estimated glomerular filtration rate, and urinary albumin-to-creatinine ratio [UACR]), which were available in a subset of the population and therefore were not included in the PS adjustment, were equally well balanced (Table 2 and Supplementary Table 4).
The mean (SD) and median (interquartile range) follow-up were 222 (221) and 131 (62, 294) days, respectively. The known association between glimepiride and hypoglycemia (HR 2.38 [95% CI 1.79–3.13]) and the null association between linagliptin and ESRD (HR 1.08 [0.66–1.79]) were correctly estimated, confirming the study’s ability to replicate known findings for control outcomes (Table 2 and Supplementary Table 5). Linagliptin was associated with a nonsignificant decrease in the risk of the primary cardiovascular outcome (HR 0.91 [0.79–1.05]) compared with glimepiride (Table 3), in line with the noninferiority hypothesis of the CAROLINA trial. Within the individual databases, the cardiovascular effect varied from HR 0.96 (95% CI 0.83–1.12) in Medicare (mean age 73 years) to HR 0.76 (95% CI 0.47–1.22) in MarketScan (mean age 66 years) and HR 0.44 (95% CI 0.23–0.87) in Clinformatics (mean age 63 years) (Supplementary Table 6). Analysis of the individual components of the composite cardiovascular outcome produced results consistent with a nonsignificant decreased risk for myocardial infarction (HR 0.87 [95% CI 0.68–1.12]) and stroke (HR 0.84 [0.64–1.11]) associated with the use of linagliptin and with a null association with all-cause mortality (HR 0.96 [0.79–1.17]), although the availability of mortality information varied by data source (Supplementary Table 5). Results were largely consistent across individual databases and in a sensitivity analysis based on random effects pooling (HR 0.76 [0.51–1.13]).
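The random-effects sensitivity analysis can be sketched from the database-specific estimates above. The text does not name the estimator, so the DerSimonian-Laird method used here is an assumption, as are the function name and the back-calculation of variances from the published 95% CIs.

```python
import math

Z95 = 1.959964  # standard normal quantile for a two-sided 95% CI


def pool_random_dl(estimates):
    """DerSimonian-Laird random-effects pooling of (HR, lower, upper) tuples.

    Returns (HR, lower, upper, Q, tau2), where Q is Cochran's heterogeneity
    statistic and tau2 is the estimated between-database variance.
    """
    ys = [math.log(hr) for hr, _, _ in estimates]
    vs = [((math.log(hi) - math.log(lo)) / (2 * Z95)) ** 2
          for _, lo, hi in estimates]
    w = [1.0 / v for v in vs]
    fixed = sum(wi * yi for wi, yi in zip(w, ys)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, ys))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(ys) - 1)) / c)
    w_star = [1.0 / (v + tau2) for v in vs]  # weights widened by tau2
    pooled = sum(wi * yi for wi, yi in zip(w_star, ys)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return (math.exp(pooled), math.exp(pooled - Z95 * se),
            math.exp(pooled + Z95 * se), q, tau2)


by_database = [
    (0.96, 0.83, 1.12),  # Medicare
    (0.76, 0.47, 1.22),  # MarketScan
    (0.44, 0.23, 0.87),  # Clinformatics
]
hr, lo, hi, q, tau2 = pool_random_dl(by_database)
print(f"random-effects HR {hr:.2f} ({lo:.2f}-{hi:.2f}), Q={q:.2f}, tau2={tau2:.3f}")
```

With these rounded inputs the result should come out close to the reported HR 0.76 (0.51–1.13). The shift away from the fixed-effects estimate arises because the between-database variance flattens the weights, so the smaller commercial databases count for relatively more.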
Conclusions
In this cohort study using electronic claims data, we found evidence of adequate statistical power, solid confounding control, and the ability to replicate known associations for two control outcomes, which suggests high confidence for accurately predicting results from the CAROLINA trial before the release of its findings. Linagliptin was noninferior to glimepiride and was associated with a nonsignificant 9% decrease in the risk of the primary cardiovascular end point.
The results are consistent with a prior small trial (19) and with noninterventional studies (20,21) that showed a decreased risk of cardiovascular events associated with DPP-4 inhibitors compared with sulfonylureas in younger patients with lower cardiovascular risk, but not in patients with higher risk (20). Because participants in the CAROLINA trial were on average younger compared with our pooled study population (64 vs. 70 years), it is possible that an age-dependent cardiovascular benefit of linagliptin may be slightly larger in the trial. This article was submitted to this journal on 10 January 2019. On 14 February 2019, Boehringer Ingelheim and Eli Lilly announced CAROLINA met its primary end point for 3P-MACE, defined as noninferiority for linagliptin versus glimepiride in adults with T2D with cardiovascular risk (22). Although this press release points toward an alignment with our findings, full results of CAROLINA, expected to be presented in June at the American Diabetes Association’s 79th Scientific Sessions, will reveal whether and to what extent our RWD analysis succeeded in reproducing the CAROLINA findings.
Although many questions on the effectiveness of medications can only be reliably assessed in the setting of an RCT, some questions could potentially be reliably answered using RWE generated from nonrandomized database studies, even in the absence of traditional RCT evidence (23). Developing a process for prospectively identifying such settings and generating RWE that instills high confidence in its validity is essential if nonrandomized studies using RWD are to have a role in regulatory decision making (2,11).
This study has limitations. First, residual confounding by unmeasured characteristics cannot be entirely ruled out, although it is likely to be minor. Prespecified validity checks confirmed that the new user–active comparator design, combined with information-rich propensity score adjustment for >120 variables, substantially reduced the potential for confounding by unmeasured covariates. This is shown by the balance in selected laboratory test results, even though they were not included in the PS model (because they were not recorded for most patients). This multipronged approach to minimizing confounding has previously succeeded in balancing unmeasured characteristics, including socioeconomic status, in studies of oral antidiabetic medications (24,25). Prespecified validity checks also confirmed the study’s ability to replicate known causal associations for two control outcomes.
Second, information on mortality varied by database, with complete information only in the Medicare fee-for-service database.
Third, because our study reflected the use of linagliptin or glimepiride in routine care, the median follow-up was shorter than in most cardiovascular outcome trials, which have substantial adherence-improvement measures built in. Trials generally require long follow-up to accumulate enough events for adequately powered analyses. The size of our study population (∼50,000 patients) allowed us to achieve adequate power even with a shorter duration of follow-up. Assuming no time-varying hazards, our shorter-term findings will be generalizable to longer-term trial findings.
Fourth, heterogeneity was observed in the point estimates across the three databases. This is expected, because the databases include different populations with different baseline risks for the primary outcome, similar to the heterogeneity that is observed in the context of RCTs reporting stratified analyses by baseline risk. The small number of events observed in the commercial databases, particularly in Optum, and the resulting imprecise point estimates also contributed to the observed heterogeneity across databases.
Fifth, this is only the first in a series of studies that aim to predict the findings of ongoing trials before their completion, and thus it is not intended to provide conclusive evidence on whether RWD analyses can replicate ongoing RCTs.
In conclusion, this large cohort study specifically designed to predict the findings of the ongoing CAROLINA trial before its completion found that linagliptin was noninferior to glimepiride regarding combined cardiovascular events. The observed treatment effect, a 9% decrease in the primary cardiovascular end point associated with the initiation of linagliptin, was compatible with a null finding in a population slightly older than CAROLINA. Although a press release from CAROLINA pointed toward an alignment with our study findings, full results from the trial, expected in a few months, will reveal whether and to what extent our RWD analysis succeeded in predicting the CAROLINA findings.
See accompanying article, p. 2161.
Acknowledgments. The authors thank Dr. Dorothee B. Bartels (BI X GmbH, Ingelheim, Germany) for her involvement in an earlier phase of this study.
Funding. This study was funded by U.S. Food and Drug Administration contract numbers HHSF223201710186C and HHSF223201810146C through the Department of Health and Human Services (HHS) and by the Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA. E.P. was supported by a career development grant (K08-AG-055670) from the National Institute on Aging.
Duality of Interest. E.P. is co-investigator of investigator-initiated grants to the Brigham and Women’s Hospital from GlaxoSmithKline and Boehringer Ingelheim, not directly related to the topic of the submitted work. S.S. is the principal investigator of investigator-initiated grants to the Brigham and Women’s Hospital from Bayer, Vertex, and Boehringer Ingelheim unrelated to the topic of this study and is a consultant to WHISCON and to Aetion, a software manufacturer of which he owns equity. His interests were declared, reviewed, and approved by the Brigham and Women’s Hospital and Partners HealthCare System in accordance with their institutional compliance policies. No other potential conflicts of interest relevant to this article were reported.
Author Contributions. E.P., S.S., and J.M.F. developed the study protocol. E.P. and C.G. wrote the manuscript and conducted the data analysis. E.P., S.S., C.G., D.M., and J.M.F. contributed to discussion and reviewed and edited the manuscript. E.P. is the guarantor of this work and, as such, had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.