The U.S. Food and Drug Administration (FDA) issued guidance on requirements to assess cardiovascular disease (CVD) risk with drugs being developed for approval for clinical use. The guidance was triggered by a meta-analysis published by Nissen and Wolski that suggested an increased risk for myocardial infarction with the use of rosiglitazone. This article discusses controversies around CVD trials in diabetes beginning with the University Group Diabetes Program. This is followed by a brief description of the FDA guidance for evaluating CVD risk with glucose-lowering medications. Limitations of meta-analyses of data from phase 2 and 3 (phase 2/3) trials to inform CVD risk are highlighted. These include the differences between patient characteristics in phase 2/3 trials and those in cardiovascular outcome trials (CVOTs) and the relatively short exposure time in phase 2/3 trials. The differences may partly explain the observed disparity between phase 2/3 meta-analyses and the results of completed CVOTs. Approaches to understanding CVD risk with a new medication should get to the answer about risk as efficiently as possible to minimize any potential harm to patients. In that context, we discuss options for clinical trial design and an alternative approach for statistical analyses.
Introduction
Controversies about trial results for glucose-lowering medications and cardiovascular disease (CVD) risk have existed for several decades. The University Group Diabetes Program (UGDP) was a source of controversy about whether a sulfonylurea (SU) (tolbutamide) was associated with increased CVD risk (1–3). Nearly 40 years later, rosiglitazone-related controversy was discussed in the public press, U.S. Congress, the U.S. Food and Drug Administration (FDA), and medical literature (4–6).
This article addresses the efforts to understand CVD risks of glucose-lowering medications in type 2 diabetes (T2D) and provides information on the 2008 FDA guidance for assessing cardiovascular (CV) risk in glucose-lowering medications (6). General reviews of the FDA guidance have previously been published by Menon and Lincoff (7). The Cardiac Safety Research Consortium and other organizations have expressed the need for ongoing dialog about the guidance (8). Current approaches to CV outcome trials (CVOTs) to assess CVD risk with glucose-lowering medications will be reviewed and alternative solutions proposed.
Relationship of Glucose-Lowering, Glucose-Lowering Medications, and CVD: 40 Years of Clinical Trials
The UGDP was the first randomized trial (5 arms and ∼1,000 patients) to address whether differences in glycemic control or the use of specific glucose-lowering medications was associated with CV outcomes (Table 1). Although designed to evaluate the effects of several different diabetes therapies (SU, phenformin, two different insulin arms, and an SU placebo) (1), the UGDP was underpowered to answer the CVD risk question. The observation of a putative increased risk of CVD with SU (tolbutamide vs. placebo) was widely criticized for statistical reasons and was the source of controversy for many years (2). The controversies related to the UK Prospective Diabetes Study (UKPDS) included a marginal P value for a possible beneficial effect on myocardial infarction (MI) in the intensive policy versus conventional policy, demonstration of CV benefit of metformin with a small number of obese patients, and putative concerns about the combined use of SU and metformin (9–11). Confirmation of favorable effects on MI risk of the intensive policy and metformin use was provided during the observational study following the trial. This favorable effect was labeled as the “legacy effect” (12). An explanation for why the SU-metformin combination group appeared to be at increased risk could be attributed to an abnormally low CV event rate in the comparator groups (13).
Table 1. Clinical trials with glucose-lowering medications and CVD outcomes
Topic | Pros | Cons |
---|---|---|
Key controversial topics | ||
Relationship of glucose lowering, glucose-lowering medication, and CVD from completed diabetes trials | • Wide variety of trial designs and patient cohorts • Extensive data sets • Trials have assessed glycemic targets and glycemic strategies • Many trials with several years of intervention and some with posttreatment long-term follow-up • Unexpected safety signals have been identified | • Inconsistency of results across trials on benefits of glycemic control, benefits and risks for specific drug classes, and wide variation in patient cohorts |
FDA guidance and meta-analysis issues | ||
Meta-analysis for discharging 1.8 CVD event threshold | • Preliminary evidence that the drug does not increase CV harm to an unacceptable level at the time of submission | • Differences in patient characteristics in phase 2/3 trials compared with patient populations in CVOTs: patients at low risk for CVD in phase 3 trial, while FDA suggests inclusion of patients at high risk for CVD in CVOT • Phase 3 trials are typically too short to affect CVD events • Disparate results between meta-analysis of phase 2/3 trials compared with CVOT (evidence from two recently completed DPP-4 inhibitor CVOTs) |
Currently implemented trial design | ||
Starting CVOT during phase 3 development | • MACE during the CVOT can be used to discharge 1.8 (therefore, sponsor does not have to rely only on MACE from phase 2/3 trials) • In a 2-CVOT approach, the first CVOT would start during phase 3 and would only enroll enough patients needed to discharge 1.8 and would also contribute to discharging 1.3. Fewer patients are exposed until it is known whether the drug is safe and effective | • Unblinded results of the MACE from the ongoing CVOT would need to be shared with regulators at the time of submission. CVOT integrity may be compromised if interim results are reported unless preventative measures can be taken to protect integrity of the study • Additional planning, coordination, and discussion with regulators need to occur, particularly for the 2-CVOT approach |
Group sequential designs | • Allows for the opportunity to end a clinical trial early for noninferiority or futility • Patient exposure is reduced to shortest amount necessary to meet objective of trial | • Study conclusions may be questioned if too few events (e.g., <100) were observed when the study is stopped for noninferiority; therefore, timing of interim analysis is important |
Alternative CVOT design options | ||
Use of historical control data | • Reduce sample size of control arm by using information already learned from completed CVOTs • This approach is used in other therapeutic areas | • Similar patient populations and adjudication processes between historical studies and planned CVOT are needed • Requires predefined statistical approaches to determine how much borrowing occurs (e.g., “downweighted” historical data) |
Platform designs | • Same control arm used for comparison with multiple compounds within the same trial creates efficiency by reducing patient numbers | • Blinding of study drug (e.g., one drug is injectable and another is oral) • Maintaining enrollment rates when new treatment arms added • If additional safety measures are needed for one treatment arm, then they would need to be performed for all patients • Decisions on rescue medications may be different among treatment arms • Operational issues |
Increasing sample size while the trial is ongoing | • Maintain desired power if trial assumptions about event rates or enrollment rates are not accurate | • Accurate estimates of event rate likely to occur later in the study, when enrollment is already completed. Problematic operationally if investigative sites stop enrolling and then need to start up again |
Alternative statistical approaches | ||
HR | • Well-known approach in the literature • When the proportional hazards assumptions are met, it provides a valid estimate of the difference between survival curves | • When the proportional hazards model assumption is not met, the result is not a simple average of HRs over time. Hence, the results cannot be interpreted • The inference procedure based on the HR may not have power to detect the safety signal, especially when the two hazard functions cross during the study follow-up • In a noninferiority trial setting, the precision of the HR estimate depends on the number of observed events but not on the number of patients or their exposure time involved in the study. This may lead to an impractically large study for assessing the noninferiority of the new therapy |
RMET difference | • The mean estimate is generally more stable than the median estimate • Uses information from patients who do not have an event • Simple clinical interpretation • No model is needed, and proportional hazards assumption is not needed • No need for large studies to assess the noninferiority claim if the patient’s exposure time is sufficiently long for safety evaluation | • Requires a prespecified time point of interest for which the restricted mean event time is evaluated • The choice of the study population for safety evaluation is critical so that enough events are observed • Existing margins of 1.3 and 1.8 cannot be applied, and relevant margins will need to be set |
DPP-4, dipeptidyl peptidase-4; HR, hazard ratio; RMET, restricted mean event time.
More recently, the results of several large studies with a variety of glucose-lowering strategies in T2D have been published. Two basic approaches have been used. Some trials have compared a drug with placebo, usually on top of standard of care (SOC) (pharmaceutical CVOT safety studies use this approach as a form of blinding, e.g., Prospective Pioglitazone Clinical Trial in Macrovascular Events [PROactive] [14]), or with active comparators (e.g., Rosiglitazone Evaluated for Cardiovascular Outcomes in Oral Agent Combination Therapy for Type 2 Diabetes [RECORD] [15]). Three trials have used a glucose-lowering target approach (Action to Control Cardiovascular Risk in Diabetes [ACCORD] [16], Action in Diabetes and Vascular Disease: Preterax and Diamicron MR Controlled Evaluation [ADVANCE] [17], and the Veterans Affairs Diabetes Trial [VADT] [18]). These three trials had different glycemic targets, and none showed statistically significant lowering of CVD events. ACCORD had an adverse mortality signal in the intensively treated group (16). Meta-analyses of CVD events from these three trials and the UKPDS demonstrated a statistically significant reduction in CVD with lower achieved HbA1c (19).
Seven pharmaceutical industry–sponsored CVOTs with major adverse cardiac events (MACE) (CV death, MI, or stroke) or MACE+ (MACE plus unstable angina with hospitalization) as primary endpoints have reported results. PROactive, initiated prior to the 2008 FDA guidance (14), reported that pioglitazone versus placebo, on top of SOC, did not show significant reduction in the broad primary composite endpoint (death from any cause, nonfatal MI, stroke, acute coronary syndrome, leg amputation, coronary revascularization, or revascularization of the leg) but showed a reduction in the “main secondary endpoint” composed of all-cause mortality, MI, and stroke. The authors were criticized for reporting the more restricted endpoint when the primary endpoint was not significant (20). Saxagliptin Assessment of Vascular Outcomes Recorded in Patients with Diabetes Mellitus–Thrombolysis in Myocardial Infarction (SAVOR-TIMI 53) (saxagliptin [21]) and Examination of Cardiovascular Outcomes with Alogliptin versus Standard of Care (EXAMINE) (alogliptin [22]) each had sufficient numbers of participants to be powered for superiority; yet, each trial showed no difference in the primary endpoint (MACE: CV death, nonfatal MI, or stroke) between active drug and placebo. SAVOR detected an unexpected heart failure signal (21). The Trial Evaluating Cardiovascular Outcomes with Sitagliptin (TECOS) (23) and Evaluation of Cardiovascular Outcomes in Patients With Type 2 Diabetes After Acute Coronary Syndrome During Treatment With Lixisenatide (ELIXA) (24) also had sufficient patients to demonstrate superiority, but no differences in CVD outcomes were demonstrated.
The BI 10773 (Empagliflozin) Cardiovascular Outcome Event Trial in Type 2 Diabetes Mellitus Patients (EMPA-REG OUTCOME) (25) demonstrated superiority for the primary endpoint (MACE: CV death, nonfatal MI, or stroke), and patients in the empagliflozin group had significantly lower risks of death from any cause and for hospitalization due to heart failure than the placebo group. AleCardio (aleglitazar [26]) showed no reduction in CV events in spite of favorable lipid effects, and there were safety signals for heart failure, fractures, and bleeding. An important consideration for any of these trials is the potential uncertainty of “standard of care” contributions to CVD risk. Decline in β-cell function is well-known in T2D; thus, additional glucose-lowering medications (usually more in the placebo group) may be added during the course of the trial. These studies provide some clarity on the influence of glucose-lowering medications on CVD risk. However, generalizability of the results to the target patients for the drug (lower risk than study patients) and the impact of SOC medications on CVD outcomes make interpretation of results complex.
FDA Guidance: Background, Approach, and Clarifying Comments
The FDA guidance and interpretation of subsequent trial results have raised controversial comments and questions (Table 1). The meta-analysis published by Nissen and Wolski (4), in which they reported that rosiglitazone was associated with an increased risk for MI, triggered extensive discussion in the public press (5,6,27). The publication was criticized by the statistical community for the fact that 19 of 42 studies were excluded from the meta-analysis because they had reported zero CV events (28–30). The meta-analysis of Diamond, Bax, and Kaul using statistical methods that included all the studies did not confirm harm with rosiglitazone use (28). Their approach aligns with recommendations from other statisticians (31,32). Nevertheless, in response to the meta-analysis of Nissen and Wolski, the FDA released a guidance in 2008 requiring pharmaceutical industry sponsors to demonstrate no unacceptable increase in CVD risk for any new glucose-lowering drug or biologics before the submission of the license application (6). The guidance specifies that a meta-analysis of adjudicated CVD events from phase 2 and 3 (phase 2/3) trials should be performed to inform CVD risk. The studies should be designed to obtain “sufficient endpoints to allow a meaningful estimate of risk . . . [and] should include patients at higher risk of cardiovascular events. . . .” The upper bound of a two-sided 95% CI for the estimated risk ratio should be <1.8 with a reassuring point estimate, and effects in subpopulations should be explored. Further, it may be necessary to run an additional large safety trial in parallel with the phase 3 studies to generate sufficient events to satisfy the 1.8 margin prior to submission. A postmarketing trial will be required to show that the upper bound of the 95% CI of the risk ratio is <1.3 if not demonstrated at the time of submission. Note that in the rest of this article, the use of “1.3” and “1.8” will refer to the upper bound of the 95% CI.
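The two-step logic of the guidance can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the standard error uses the common large-sample approximation for a 1:1 randomized comparison (SE of the log ratio ≈ √(4/events)), and the sketch ignores the guidance's additional requirement of a reassuring point estimate.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def upper_bound(point_estimate: float, events: int, level: float = 0.95) -> float:
    """Approximate upper bound of a two-sided CI for a risk (hazard) ratio.

    Assumption (not from the guidance): 1:1 randomization, so that
    SE(log ratio) is roughly sqrt(4 / events).
    """
    if point_estimate <= 0 or events <= 0:
        raise ValueError("point_estimate and events must be positive")
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return exp(log(point_estimate) + z * sqrt(4 / events))


def guidance_status(upper: float) -> str:
    """Map an upper 95% CI bound onto the 1.8 and 1.3 thresholds."""
    if upper < 1.3:
        return "1.3 discharged"
    if upper < 1.8:
        return "1.8 discharged; 1.3 still to be shown postmarketing"
    return "1.8 not discharged"


# Hypothetical event counts, all with an observed ratio of 1.0
for n_events in (30, 122, 611):
    print(n_events, guidance_status(upper_bound(1.0, n_events)))
```

With a neutral point estimate, the status depends only on how many events have accrued, which is why the number of adjudicated events at submission matters so much.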
The rationale for the choice of 1.8 and 1.3 is informed by dialogue from the FDA meetings. During the Endocrinologic and Metabolic Drugs Advisory Committee meeting, the role of CV assessment in the premarketing and postmarketing settings was discussed. The publicly available meeting minutes report, “The majority of the committee members felt that the hazard ratio of 1.2 to 1.4 is reasonable, given the benefits of lowering HbA1c and decreasing microvascular complications are well-known.” The minutes state that while a majority of the committee members agreed it would be beneficial to perform a meta-analysis of all safety data at the time of submission, there was no reference to a 1.8 margin, except for the pragmatic conclusion that “it’s mainly just to look and see if we should be concerned. . . .” (33,34). Since the FDA guidance was triggered by a meta-analysis of rosiglitazone trials, the RECORD results generated further controversy. The final RECORD results did not support the conclusion of increased MI or overall CVD risk with rosiglitazone (15). The FDA then required a readjudication of the CVD endpoints; these results also did not suggest increased risk for CVD events with rosiglitazone (35). During the discussion of the readjudication results, the need to continue the FDA guidance was questioned (36). FDA representative Dr. Curtis Rosebraugh indicated that the question was interesting, but not the “topic at hand,” and stated, “So we’re not going to go down that alley right now” (36).
Potential Limitations of Phase 2/3 Meta-analyses to Determine CVD Risk
Disparity Between Patient Profiles in Phase 2/3 Efficacy Trials Versus CVOTs: Study Population
Phase 2/3 trials are typically 6–24 months long and include patients at relatively low risk for CVD. Pooled data from phase 2/3 trials may not be appropriate to address potential CVD risk with a new therapy. The FDA guidance to recruit patients at high risk for CVD, including older patients and those with renal disease, is often not feasible in a phase 2/3 development program designed to assess glycemic efficacy and safety of an investigational drug. Patients in phase 2/3 trials tend to be younger, with a short duration of diabetes, and treated by diet and exercise or with oral medications. Background therapy with metformin precludes enrollment of patients with diminished renal function. Patients in CVOTs tend to be older with multiple comorbidities and with long-standing diabetes often treated with injectable medications. Thus, phase 2/3 diabetes trials add little to the overall understanding of diabetes therapies on CVD risk in high-risk patients.
Disparity Between Study Drug Exposure Time in Phase 2/3 Efficacy Trials Versus CVOTs: Trial Duration
Randomized clinical trials with lipid-lowering or antihypertensive agents (vs. placebo) typically do not show differences in CVD events until 1–2 years after drug exposure (37–39). Studies of intensive glucose-lowering therapy in type 1 diabetes show that, for both microvascular complications and CVD, glucose-lowering effects in a “low-risk” cohort may not become evident until at least 17 years of exposure (40,41). Mortality risk associated with intensive glycemic control in ACCORD (composed of patients at high risk of CVD) was not evident until 2 years of exposure; a suggestion of a benefit for the primary CVD endpoint with the intensive glycemic arm was not evident until >4 years of exposure to intensive therapy (16). The Sibutramine Cardiovascular Outcomes Trial (SCOUT), designed to assess the benefit and risk of sibutramine, did not show harm until after 1 year of exposure (42). These observations suggest that meta-analyses of trials with a mean follow-up of ≤1 year are not likely to meaningfully inform CVD risk and raise questions about the value of meta-analyses of phase 2/3 trials to assess CVD risks.
Disparity Between Meta-analysis of Phase 2/3 Efficacy Trials Versus CVOT Results: Examples
The phase 2/3 meta-analysis of MACE with saxagliptin versus placebo in 4,607 patients showed a hazard ratio (HR) of 0.45 (95% CI 0.24–0.83) (43), while the prospective trial (SAVOR) of 16,492 patients with a median follow-up of 2.1 years (maximum 2.9 years) reported a MACE HR of 1.00 (95% CI 0.89–1.12) (21). The phase 2/3 meta-analysis of MACE with alogliptin versus placebo in 4,702 patients showed an HR of 0.61 (95% CI 0.24–1.56) (44), while EXAMINE, with 5,380 patients and a median follow-up of 18 months, reported a MACE HR of 0.96 (upper bound of the 95% CI 1.16) (22). These disparities suggest that meta-analyses of phase 2/3 trials may not predict CVOT results. In summary, the requisite patient demographics in a phase 3 program, the disparity between phase 2/3 trial durations and exposure to study drug in large CVOTs, and the discrepancies between meta-analysis and randomized clinical trial results are all reasons that the use of meta-analysis for discharging CV harm is controversial (8).
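One way to see why the phase 2/3 estimates are so much less precise is to back out the approximate number of events implied by each reported CI. The sketch below assumes a Wald interval on the log scale and roughly 1:1 allocation (so SE ≈ √(4/events)); both are simplifying assumptions, the helper name is hypothetical, and the implied counts are rough approximations, not the event counts reported in the cited publications.

```python
from math import log
from statistics import NormalDist


def implied_events(lower: float, upper: float, level: float = 0.95) -> float:
    """Approximate total events implied by a CI for an HR.

    Assumptions: Wald interval on the log scale and about 1:1 allocation,
    so that SE(log HR) ~ sqrt(4 / events).
    """
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    se = (log(upper) - log(lower)) / (2 * z)
    return 4 / se**2


# CIs quoted in the text above
saxagliptin_meta = implied_events(0.24, 0.83)  # phase 2/3 meta-analysis
savor = implied_events(0.89, 1.12)             # SAVOR CVOT

print(f"meta-analysis: ~{saxagliptin_meta:.0f} events; CVOT: ~{savor:.0f} events")
```

Under these assumptions the meta-analysis interval corresponds to a few dozen events, whereas the CVOT interval corresponds to more than a thousand, consistent with the point that the precision of the HR is driven by the number of events observed.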
Considerations for Overall Clinical Program and Trial Designs to Discharge CV Risk
Currently Implemented Trial Designs
Pharmaceutical companies have developed different approaches to discharge CVD risk during the development of a glucose-lowering medication to comply with the FDA regulatory requirements. The goal is to understand CVD risk as soon as possible during drug development to ensure patient safety and meet the FDA guidance. Here, we discuss approaches already implemented in the pharmaceutical industry including the following: starting a CVOT during phase 3 development, combining the results of two CVOTs to discharge 1.3, and using interim analyses in a CVOT to stop the trial early if the outcome is known with high probability (Table 1).
Some sponsors have relied only on a meta-analysis of CVD events from their phase 2/3 efficacy trials to discharge the presubmission CVD threshold of 1.8. This has been followed by a single CVOT designed specifically to discharge 1.3 after approval of the drug for clinical use. Other sponsors have started a CVOT during phase 3 development with the possibility of combining CVD events from the ongoing CVOT with those from the phase 2/3 trials in a meta-analysis to discharge 1.8. This approach ensures enough CVD events to discharge 1.8 with reasonable power (e.g., 90% power) at the time of submission (7,8).
Another approach to discharge CVD risk during development without exposing more patients than necessary to an investigational drug prior to understanding safety is to initiate two CVOTs and perform a combined analysis on CVD events from two studies. The first CVOT starts during phase 3, and MACE plus hospitalization for unstable angina events from the first trial contribute to discharging 1.8 along with the CVD events from the phase 2/3 trials. This CVOT continues postapproval. The second CVOT starts at the time of approval and enrolls enough patients so that enough MACE are observed to discharge 1.3 when combined with MACE from the first CVOT. A development plan that used this approach was CANagliflozin cardioVascular Assessment Study (CANVAS), Janssen’s canagliflozin CVOT (45).
Interim Analysis Concerns
If interim data from an ongoing CVOT are used to discharge 1.8, consideration must be given to the disclosure to regulators to protect the integrity of the trial. In 2014, the FDA held a public hearing to discuss how and whether interim results of CVD data from an ongoing CVOT can be disclosed during regulatory review (46). Speakers agreed that the interim results should not be disclosed publicly because results may influence the behavior of patients and investigators participating in the ongoing CVOT, thus potentially compromising the ability to complete the trial (47). An acceptable approach was that the FDA could simply state whether the 1.8 CV threshold had been met during their review process. This process was successfully implemented with the approval of alogliptin (46). Whether other regulatory agencies will agree to keeping interim results confidential remains uncertain.
Interim analyses have been used to discharge the CV thresholds of 1.8 and 1.3 before occurrence of the planned total number of CVD events, thereby minimizing the time patients need to participate in a CVOT. A statistical penalty, referred to as spending alpha, must be paid for taking multiple looks at the data to guard against falsely stopping the trial early and declaring that the CV threshold has been met. Group sequential designs (GSDs) account for the multiple looks, and different alpha spending approaches used in GSDs have been published (48).
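To make the idea of spending alpha concrete, the following sketch evaluates two widely used Lan-DeMets spending functions at a few information fractions (the fraction of planned events observed). The choice of these two functions, the one-sided alpha of 0.025, and the interim timings are illustrative assumptions; the text does not state which spending approach any particular CVOT used.

```python
from math import e, log, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def obrien_fleming_type(t: float, alpha: float = 0.025) -> float:
    """Cumulative alpha spent at information fraction t (Lan-DeMets, O'Brien-Fleming type)."""
    if not 0 < t <= 1:
        raise ValueError("information fraction must be in (0, 1]")
    z = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    return 2 * (1 - _STD_NORMAL.cdf(z / sqrt(t)))


def pocock_type(t: float, alpha: float = 0.025) -> float:
    """Cumulative alpha spent at information fraction t (Lan-DeMets, Pocock type)."""
    if not 0 < t <= 1:
        raise ValueError("information fraction must be in (0, 1]")
    return alpha * log(1 + (e - 1) * t)


# Hypothetical interim looks at 25%, 50%, 75%, and 100% of planned events
for fraction in (0.25, 0.5, 0.75, 1.0):
    print(f"{fraction:.2f}  OBF-type {obrien_fleming_type(fraction):.5f}  "
          f"Pocock-type {pocock_type(fraction):.5f}")
```

Both functions spend the full alpha by the final analysis. The O'Brien-Fleming type spends very little early, so an early stop requires strong evidence, which bears on the concern about stopping with too few events.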
A key consideration for a GSD is the number of CVD events needed to assess CV safety. A study without an interim would typically have 611 events to discharge 1.3 (7), i.e., the number of events needed assuming an HR of 1.0 and 90% power to achieve an upper bound of the 95% CI of 1.3 (7,48). If a study is stopped early because the 1.3 threshold was met statistically, then the conclusions may be questioned if the follow-up was too short or too few events (e.g., <100) were observed to assess safety in subgroups of patients, to assess the individual components of the composite MACE endpoint, or to perform other exploratory analyses. These assessments can be misleading even in studies with 611 events. Rather than requiring a larger study to enable qualitative assessments that can be misleading, an alternative approach is to design the study sufficiently large, but not larger than needed, to address the key preidentified questions. These challenges can be overcome but will require a committed effort by the scientific community in order for the benefits of this approach to be realized.
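The figure of 611 events can be reproduced with the standard large-sample formula for a noninferiority comparison of hazards under 1:1 randomization, events = 4(z₁₋α/₂ + z_power)² / (ln margin − ln HR)². The sketch below is a minimal illustration; the function name is hypothetical and the 1:1 allocation is an assumption not stated in the text.

```python
from math import ceil, log
from statistics import NormalDist


def required_events(margin: float, true_hr: float = 1.0,
                    alpha_two_sided: float = 0.05, power: float = 0.90) -> int:
    """Events needed for the upper CI bound to fall below `margin`.

    Assumes 1:1 randomization and a normal approximation for log(HR).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha_two_sided / 2)
    z_power = NormalDist().inv_cdf(power)
    return ceil(4 * (z_alpha + z_power) ** 2 / (log(margin) - log(true_hr)) ** 2)


print(required_events(1.3))  # 611
print(required_events(1.8))  # 122
```

Under the same assumptions, discharging 1.8 needs roughly one-fifth of the events needed for 1.3, which is why the premarketing threshold can be met much earlier in development.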
Alternative CVOT Design Options
Use of Historical Control Data
If completed CVOTs use similar patient populations, event rates from the control arms of randomized completed CVOTs could be used formally in the analyses of future trials, thereby reducing the number of patients needed in the control arm without loss of precision of the HR estimate. Use of control arm data from other trials has previously been applied to oncology and other therapeutic areas (49). The acceptability of incorporating previous studies’ control information depends on several factors including patient population (inclusion/exclusion criteria and allowed concomitant medications), geography, adjudication process, and timeliness of the study so that event rates are similar (Table 1).
Use of All Historical Data
If CVOTs have been completed on glucose-lowering medications in the same class, then the information can be used in the CVOT design or in the overall clinical plan for a new medication in the same class. Not only comparator arm information from completed CVOTs can be used in the trial design but also the information on the active treatment arm being tested. For example, if CVOTs have been completed in the same class of glucose-lowering medication (e.g., dipeptidyl peptidase-4 [DPP-4] inhibitors), then the estimated HR from those studies can be used for a future diabetes CVOT in the same class. The design would use the estimated HR from the completed CVOTs and the HR estimated from the phase 2/3 trials (in spite of limitations of data noted above) of the new medication in a Bayesian analysis (50,51). The contribution of these HRs from these previous CVOTs could be “downweighted” in the prior distribution so that the final results are not dominated by this prior information. The benefit of this approach is to significantly reduce the number of events needed in the CVOT for a new glucose-lowering medication of the same class. The phase 3 studies could be designed such that a sufficient number of CVD events are observed in a relevant high-risk population in a 2-year trial so that when combined with data from previously completed CVOTs in the same class, the CVD threshold of 1.3 can be discharged without a CVOT for the new medication (Table 1).
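A minimal sketch of the downweighting idea, using a normal approximation on the log HR scale: historical information enters as a prior whose precision is multiplied by a weight between 0 and 1 (a simple power-prior form), and is then combined with the new trial's estimate. The function name and all numerical inputs are hypothetical, and this conjugate normal model is only one of several ways such borrowing could be implemented; it is not the specific method of references 50 and 51.

```python
from math import exp, log, sqrt


def posterior_hr_upper(prior_hr: float, prior_se: float,
                       new_hr: float, new_se: float,
                       weight: float, z: float = 1.96) -> float:
    """Upper 97.5% posterior bound for the HR with a downweighted historical prior.

    weight = 1 borrows the historical information fully; a weight near 0
    borrows almost nothing. Standard errors are on the log HR scale.
    """
    if not 0 < weight <= 1:
        raise ValueError("weight must be in (0, 1]")
    prior_precision = weight / prior_se**2   # downweighted historical precision
    new_precision = 1 / new_se**2            # precision from the new medication's data
    total = prior_precision + new_precision
    mean = (prior_precision * log(prior_hr) + new_precision * log(new_hr)) / total
    return exp(mean + z * sqrt(1 / total))


# Hypothetical inputs: same-class CVOTs give HR 1.0 (SE 0.06);
# the new drug's phase 3 data give HR 1.0 (SE 0.20, about 100 events)
no_borrowing = exp(log(1.0) + 1.96 * 0.20)
with_borrowing = posterior_hr_upper(1.0, 0.06, 1.0, 0.20, weight=0.25)
print(f"upper bound without borrowing: {no_borrowing:.2f}")
print(f"upper bound with 25% weight:   {with_borrowing:.2f}")
```

The weight controls how strongly the completed CVOTs influence the conclusion, so it would need to be prespecified and justified by how similar the populations and adjudication processes are.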
Platform Designs
Another design approach with economic and public health advantages is referred to by various names: platform design, shared control design, master protocol, or umbrella design. This approach enables the inclusion of two or more different compounds in the same study with a common control arm. The advantages of this approach include a reduction in the number of patients allocated to the control arm when several therapies are tested and, depending on the design, the option for different therapies to enter the study at any time; however, patients continue to be enrolled in the control arm throughout the study (52,53). There are several statistical approaches to analyzing data from platform trials. In one approach, only patients enrolled in the control arm during the same enrollment period as the treatment arm are used for the assessment of that therapy. This reduces bias in the results that may be induced by changes in patient characteristics over time. However, this mitigation strategy comes at the cost of ignoring information. More powerful approaches exist that allow borrowing of nonconcurrent information from patients in the control arm.
While this approach has many advantages, it has yet to be used in CVOTs due to many operational difficulties. The most challenging of these is maintaining the blind. As therapies meet the 1.3 threshold, there is a need to share the results publicly while still maintaining the blind for the therapies still ongoing in the study. Another difficulty is gaining agreement on common design features that allow minimizing the number of patients assigned to the control arm (Table 1).
Increasing the Sample Size While in an Ongoing Trial
Assumptions about the annual MACE rate and the accrual rate made when designing a CVOT may not be realized during the actual conduct of the CVOT. If the rates were overestimated, the duration of the trial will be longer than anticipated. Enrolling additional patients can help ensure that the study will be completed in a reasonable time frame. Because the statistical properties are based on the total number of events observed and not the total number of patients enrolled, no statistical penalty is required if the data remain blinded. A blinded sample size re-estimation based on the observed MACE and accrual rates can be performed at any time during the conduct of the trial. It is desirable to perform the assessment before the trial is fully enrolled, since it is difficult for investigators to pause enrollment. If the trial enrolls quickly (e.g., within 1–2 years) there may not be sufficient time to accurately estimate the event rate. In this case, it may be more efficient to assume a larger sample size initially and use a GSD to reduce the exposure of patients in the trial (Table 1).
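The event-driven logic above can be sketched as follows. The first function uses the standard approximation for the number of events needed to rule out an HR margin with 1:1 allocation; the second is a deliberately simple projection that assumes exponentially distributed event times and a common mean follow-up for all patients, ignoring staggered accrual and dropout. The function names, the assumed 90% power, and the rates used in the example are illustrative assumptions.

```python
import math
from statistics import NormalDist

def required_events(hr_margin=1.3, alpha_one_sided=0.025, power=0.90, true_hr=1.0):
    """Approximate total events needed to exclude hr_margin (1:1 allocation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha_one_sided)
    z_beta = z.inv_cdf(power)
    effect = math.log(hr_margin) - math.log(true_hr)
    return math.ceil(4 * (z_alpha + z_beta) ** 2 / effect ** 2)

def pooled_annual_rate(events, patient_years):
    """Blinded (pooled across arms) event rate per patient-year."""
    return events / patient_years

def patients_needed(target_events, annual_rate, mean_followup_years):
    """Patients needed so that the expected number of events reaches the target."""
    p_event = 1 - math.exp(-annual_rate * mean_followup_years)
    return math.ceil(target_events / p_event)

target = required_events()

# Hypothetical design assumption: 2% annual MACE rate, 4 years mean follow-up.
n_design = patients_needed(target, 0.02, 4)

# Hypothetical blinded interim look: 90 events in 6,000 patient-years.
observed_rate = pooled_annual_rate(90, 6000)
n_reestimated = patients_needed(target, observed_rate, 4)

print(f"Target events: {target}")
print(f"Planned patients: {n_design}, re-estimated patients: {n_reestimated}")
```

Because only the pooled rate is used, the calculation never reveals the treatment effect, which is why the re-estimation carries no statistical penalty while the data remain blinded.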
Considerations for Alternative Statistical Approach in CVOTs
The quantification of the treatment difference for CVOTs is routinely based on the HR estimated from a Cox regression model. This measurement may not be ideal for these trials. First, the HR is not easy to interpret clinically without knowing the event rate for the control group. For example, if the HR is 1.3, then a 30% increased hazard has different clinical importance depending on whether the control group has a 5% or a 50% event rate. Second, the key assumption of the proportional hazards method, a constant HR over time, is not always met. Third, the proportional hazards model may not be a good model for rare events. Specifically, Cox proportional hazards models depend only on the number of events, ignoring the duration of drug exposure and the total sample size. In a noninferiority safety study, the patients’ exposure time without an event might be clinically more important than the observed number of events. When the event rates are low for both groups, as is common in diabetes CVOTs, the resulting CI for the HR estimate can be quite wide, suggesting that there is not enough information to properly assess the drug safety profile. This conclusion can be misleading if a large number of patients have been followed for a long time with few observed events (54) (Table 1).
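To make the first point concrete, the sketch below converts an HR into an absolute event rate for a given control event rate. It assumes proportional hazards, under which the event-free probability in the treated group equals the control event-free probability raised to the power of the HR. The function name is illustrative.

```python
def treated_event_rate(control_event_rate, hr):
    """Event rate in the treated group implied by the HR under proportional hazards."""
    control_event_free = 1 - control_event_rate
    return 1 - control_event_free ** hr

for control_rate in (0.05, 0.50):
    treated_rate = treated_event_rate(control_rate, 1.3)
    print(f"control {control_rate:.1%} -> treated {treated_rate:.1%} "
          f"(absolute increase {treated_rate - control_rate:.1%})")
```

Under this assumption, an HR of 1.3 moves a 5% control event rate to roughly 6.5% but a 50% control event rate to roughly 59%, so the same HR corresponds to very different absolute increases in risk.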
Restricted Mean Event Time as an Alternative to Proportional Hazards Model
Difficulties related to a nonconstant HR over time and small numbers of events may be overcome by using the restricted mean event time (RMET) measurement (54). RMET is the average number of days prior to the occurrence of an event within a certain period of time and is therefore clinically meaningful for a safety trial. To illustrate the interpretation of the method, SAVOR was analyzed using the RMET method (21,54) (Fig. 1). The analysis shows that, over up to 900 days of follow-up, the average time to a first MACE is 860 days for both saxagliptin and placebo. That is, if we treat future patients from the study population and follow them for 900 days, the average time spent event free would be ∼860 days for both groups. The 95% CI estimate for the difference in RMETs is −5 to 4 days. For interpretation of this in a noninferiority context, future patients treated with saxagliptin for 900 days will on average have a MACE event between 4 days sooner or 5 days later than placebo patients at a 95% confidence level. Since the results are presented in calendar days instead of as an HR, the RMET might be easier to understand. This method does not depend on the proportional hazards assumption that is needed in the Cox model. Most importantly, RMET takes the exposure and sample size (not just events, as the HR does) into account, which can significantly reduce the study sample size for CVOTs. The RMET approach does not align with the FDA guidance at this time (Table 1).
A and B: Kaplan-Meier curves from SAVOR for saxagliptin and placebo over 900 days, respectively.
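The RMET described above corresponds to the area under the Kaplan-Meier curve up to the chosen time horizon. The following is a minimal sketch of that calculation on toy data. The function name and the example follow-up times are hypothetical, the data are not from SAVOR, and the sketch returns only point estimates (a CI for the difference would additionally require a variance estimate, which is not shown).

```python
def restricted_mean_event_time(times, events, tau):
    """Area under the Kaplan-Meier curve from 0 to tau.

    times:  follow-up time for each patient (e.g., days)
    events: 1 if the patient had the event at that time, 0 if censored
    tau:    restriction time (e.g., 900 days)
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0      # current Kaplan-Meier event-free probability
    area = 0.0      # accumulated area under the step function
    last_t = 0.0
    i = 0
    while i < len(data) and data[i][0] <= tau:
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        area += surv * (t - last_t)
        if n_events > 0:
            surv *= 1 - n_events / n_at_risk
        n_at_risk -= n_leaving
        last_t = t
    area += surv * (tau - last_t)
    return area

# Toy data for two arms, restricted to 900 days of follow-up.
arm_a_times = [120, 400, 900, 900, 900, 900]
arm_a_events = [1, 0, 0, 0, 0, 0]
arm_b_times = [300, 650, 900, 900, 900, 900]
arm_b_events = [1, 0, 0, 0, 0, 0]

rmet_a = restricted_mean_event_time(arm_a_times, arm_a_events, 900)
rmet_b = restricted_mean_event_time(arm_b_times, arm_b_events, 900)
print(f"RMET arm A: {rmet_a:.0f} days, arm B: {rmet_b:.0f} days, "
      f"difference: {rmet_a - rmet_b:.0f} days")
```

Every patient contributes follow-up time to the area, whether or not an event occurs, which is how the measure reflects exposure and sample size rather than events alone.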
Conclusions
The need for CVOTs was driven by one meta-analysis of publicly available data of phase 2/3 trials suggesting that rosiglitazone was associated with an increased risk for MI. The statistical approach was criticized. Later, primary data and readjudication of the data related to rosiglitazone use in RECORD did not confirm increased CVD risk except for heart failure. Meta-analyses of CVD data from phase 2/3 trials to understand CVD drug–related risk are controversial for the following reasons: 1) the need to study patients at low risk for CVD to meet registration needs, 2) study durations of 6–18 months are likely too short to demonstrate beneficial or harmful effects on CVD, and 3) major disparities between the meta-analyses and CVOT results have been shown for two drugs. We propose considering more efficient and easier-to-interpret approaches to answer the safety questions and potentially reduce patient exposure time to a drug while its effects are being evaluated. These include alternative trial designs, Bayesian methods, and an alternative statistical approach (RMET). Thus, it may be valuable to rethink appropriate methods to better understand CVD risk for future glucose-lowering agents to treat T2D.
This publication is based on the presentations at the 5th World Congress on Controversies to Consensus in Diabetes, Obesity and Hypertension (CODHy). The Congress and the publication of this supplement were made possible in part by unrestricted educational grants from AstraZeneca.
Article Information
Acknowledgments. The authors thank Sharon Myers, Eli Lilly and Company, for thoughtful input into the FDA regulations; Theressa Wright and Linda Shurzinske, Eli Lilly and Company, for review and critical comments on the manuscript; and Holly Martin and Chrisanthi Karanikas, Eli Lilly and Company, for assistance in preparing the manuscript.
Duality of Interest. B.J.H., D.H.M., H.F., E.M., B.L.G., and R.J.H. are employees and shareholders of Eli Lilly and Company. No other potential conflicts of interest relevant to this article were reported.
The opinions expressed in this article are those of the authors and do not necessarily reflect the views of Eli Lilly and Company or any of its Alliance partners.