The results of the Prospective Pioglitazone Clinical Trial in Macrovascular Events (PROactive) study1 have stimulated much discussion in both the diabetes and cardiovascular communities. The official commentary of the American Diabetes Association (ADA) was published in Diabetes Care2 and is reprinted in this issue of Clinical Diabetes (p. 66). Previously, I offered a different view of PROactive in another ADA publication, DOC News,3 from which this commentary is adapted. My earlier comments generated a flood of e-mails, a few of which claimed I was a naysayer or curmudgeon, but the majority of which expressed agreement with my views. Other commentaries have been published by Yki-Järvinen4 as an accompanying article in the same issue of Lancet in which the PROactive study appeared, Freemantle5 in the British Medical Journal, and Ceriello6 in Diabetic Medicine. These commentaries focus on different points, but there are a number of commonalities among them.
PROactive was a much-anticipated study because it was the first large study to be reported that was designed to determine whether the potential theoretical benefits of peroxisome proliferator-activated receptor-γ (PPAR-γ) agonists (in this case pioglitazone) on endothelial function and cardiovascular risk markers might indeed result in fewer macrovascular disease (atherosclerosis) events in patients with type 2 diabetes. That anticipation was driven by a considerable amount of hype and by the 2004 publication of the study's design and baseline data in Diabetes Care.7 An entire hour was allocated for the PROactive presentation on 12 September 2005 during the annual meeting of the European Association for the Study of Diabetes (EASD) in Athens, Greece, and the presentation was webcast worldwide.8 The publication of the study's results in Lancet occurred < 1 month later.1
PROactive researchers had enrolled 5,238 patients at 321 sites in 19 European countries, with almost all subjects having significant preexisting macrovascular disease at baseline. In contrast to many studies for which enrollment is slower than projected, the investigators achieved their enrollment target ahead of schedule. They also had superb follow-up, with end point data being available on all but two subjects. Thus, they did not have a problem of differential results based on an “intention to treat” analysis versus a “completer” analysis. But somewhere along the way, something else went astray: the definition of the efficacy end points.
In the study design article, the PROactive investigators stated that
“a composite cardiovascular disease end point is used because the aim of the study is to evaluate the overall effects on macrovascular disease. The primary end point variable is the time from randomization to the first occurrence of any of the events in the following composite: all-cause mortality; nonfatal [myocardial infarction]; acute coronary syndrome; cardiac intervention, including coronary artery bypass graft, or percutaneous coronary intervention; stroke; major leg amputation (above the ankle); bypass surgery; or revascularization in the leg. The end points are adjudicated by an independent panel. Secondary end points include the individual components of the primary end point and cardiovascular mortality.”7
The primary end point, so defined, failed to reach statistical significance. Thus, PROactive was a negative study and should have been reported as such. However, the investigators—and the press—discarded the primary end point, apparently because it did not show the effect they had hoped it would.
Suddenly, there appeared on the scene a new “principal secondary end point,” but there was no mention of this in the study design article7 published the year before. This new end point included only all-cause mortality, nonfatal myocardial infarction, and stroke. And guess what? It reached statistical significance. Where did it come from? Why wasn't it defined a priori? Why was it allowed to replace the primary end point?
At the EASD presentation—but not in their published article—the investigators asserted that they inserted this measure into the statistical plan before breaking the code and analyzing the data. I hesitate to suggest that this is disingenuous. However, PROactive did (appropriately) have a data safety and monitoring board (DSMB). The usual procedure is for such a DSMB to have access to the emerging data to assess risk versus benefit. Thus, the statisticians at the study's coordinating center, the DSMB, and perhaps the study chairman had earlier access to end point data. Under such circumstances, one has to wonder who might have suspected or learned of the potential of a negative primary end point and whether such information could have led to the last-minute creation of a principal secondary end point that conveniently created a positive study outcome. Yet, even when secondary end points are stated a priori, they should only be considered meaningful when the primary end point is positive. Otherwise, in the face of a negative primary end point, secondary end points should be considered exploratory or hypothesis-generating.
Championing their new principal secondary end point and trashing their original primary end point, the PROactive authors asserted that a composite end point encompassing only all-cause mortality, nonfatal myocardial infarction, and stroke avoids those components of the primary end point that may be “in part determined by a decision to intervene based on local surgical or medical practice.”1 I concur that the combination of all-cause mortality, nonfatal myocardial infarction, and stroke indeed may have been a better primary end point. So why wasn't it selected as the primary end point in the first place or at least included as a predefined principal secondary end point?
One explanation may be that the investigators wanted to be sure there were enough events in the relatively short time frame in which they hoped to complete the study, and, to accumulate that number of events, they included all possible components in the primary end point. Unfortunately, the inclusion of extra components in the primary end point diluted the potential impact of the study, resulting in a negative overall study. That type of dilution could only occur if there were more events (among the extra components) in the experimental group than in the control group. That raises the question of why there might be differential effects on different outcome measures. As noted by Freemantle,5 “had the effects of treatment been real and substantial, we could have expected consistent results across all important cardiovascular outcomes.”
In any case, all of the argument and discussion about the importance of the principal secondary end point cannot negate the fact that the primary end point failed to reach significance. The correct statistical interpretation of the study is that it is negative. Period.
The investigators (and the press release issued about their study) asserted that the beneficial effects (on the new secondary end point) were because of pioglitazone per se. However, the hemoglobin A1c (A1C) at study end was 7.0% in the pioglitazone group versus 7.6% in the placebo group (P < 0.0001). Thus, any result seen could be a consequence of improved glycemic control rather than a unique effect of pioglitazone. To test whether there is a unique effect of pioglitazone not attributable to glycemic control, this effect must be evident when there is equivalent glycemic control in the comparison group, that control being achieved by other glucose-lowering agents. In fact, the earlier study design article stated that “the investigators are encouraged to maintain glycemia within the limits outlined in the International Diabetes Federation (IDF) Europe Guidelines (< 6.5%), which was highlighted and circulated to all investigators.”7 And in the results article, investigators again stated that they “drew particular attention to the need to reach an [A1C] concentration below the recommended target (< 6.5%).”1 Not only was there a difference in glycemic control, there also was greater improvement in lipids in the pioglitazone group than in the placebo group.
In addition, in a subgroup analysis—which also is probably not warranted in a negative study—that was included in the EASD presentation8 but not in the Lancet report,1 the alleged potential beneficial effects were seen in patients not using statins and were not present in those who used statins. Because the use of statins in patients with type 2 diabetes is highly desirable, their use may obviate the need for adding pioglitazone even if one accepts the flawed notion that a benefit of pioglitazone has been demonstrated.
Unfortunately, the PROactive investigators minimized the adverse events, particularly heart failure. In her commentary, Yki-Järvinen4 noted that although there were 58 fewer primary end points (57 fewer principal secondary end points) with pioglitazone, the subjects treated with pioglitazone had 115 more episodes of heart failure and 221 more episodes of edema than the placebo subjects, and weight gain was 4 kg (8.8 lb) greater in the pioglitazone group than in the placebo group. She pondered the overall impact of pioglitazone on health and asked whether, “from the patient's perspective, is it better to have healthy arteries in the heart than a failing heart?” Ceriello,6 too, observed that “to have a higher incidence of heart failure using a compound which is claimed to reduce cardiovascular disability and mortality is certainly not intuitively logical.”
The ADA statement, although accepting the secondary end point as indicative of a beneficial effect of pioglitazone (a point I dispute), does raise a number of caveats that should be considered before generalizing any conclusions to other types of patients with diabetes and appropriately calls for more studies.2 In contrast, sadly, the PROactive investigators concluded that “in summary, in patients with type 2 diabetes who are at high cardiovascular risk, pioglitazone improves cardiovascular outcome and reduces the need to add insulin to glucose-lowering regimens compared with placebo” and, worse still, that “we believe our results are generalisable to all patients with type 2 diabetes.”1
The past few decades have seen a growing awareness of the appropriate design and interpretation of randomized controlled clinical trials, the statistical methods used to analyze such trials, and the assessment of the strength of evidence that underpins clinical decision making. The PROactive study was, as noted in the ADA commentary,2 “a carefully designed and well-executed clinical trial” with “only two subjects... lost to follow-up,... a testimony to the dedication and skill of the investigators.” Unfortunately, after conducting the trial so well, the investigators' inappropriate analysis and unjustified interpretation not only negated their hard work but also made a mockery of our system of evidence-based medicine.
professor in the Division of Endocrinology, Diabetes, & Metabolism and associate director of the Diabetes Research Institute at the University of Miami Miller School of Medicine in Florida.