People with diabetes are at higher risk of developing atherosclerosis and have more than twofold higher prevalence of peripheral artery disease (PAD) compared with the general population (1). Those with PAD and diabetes are at high risk of lower-extremity amputation (2), an outcome linked to decreased quality of life and increased cardiovascular disease (CVD) morbidity and mortality (3,4). Since the publication of the CANVAS (Canagliflozin Cardiovascular Assessment Study) (5) and CANVAS-R (Canagliflozin and Cardiovascular and Renal Events in Type 2 Diabetes) (6) trials, the effect of sodium–glucose cotransporter 2 inhibitors (SGLT2is) on the risk of amputations has generated interest. Due to their benefits in CVD outcomes, SGLT2is are preferentially prescribed to patients with diabetes with or at high risk of CVD (7,8). Concerns about a possible increased risk of amputation with canagliflozin, and whether this is a drug or class effect (9,10), create uncertainty about the benefit-harm balance of these drugs. Thus, more evidence is needed to evaluate the use of SGLT2is in real-world populations with elevated risk of PAD.

In the current issue, Griffin et al. (11) report results from a retrospective cohort study of older U.S. veterans with diabetes using the Veterans Health Administration (VHA) national database. They found that the initiation of an SGLT2i (mostly empagliflozin) by patients on various background antihyperglycemic treatments (metformin, sulfonylureas, or insulin) was associated with an 18% higher event rate of a composite PAD surgical outcome (lower-extremity stent placement, vascular surgery, or amputation) compared with the initiation of a dipeptidyl peptidase 4 inhibitor (DPP-4i), a drug class considered to have neutral effects on CVD outcomes. The authors assembled a large, nationally representative veteran cohort with diabetes and linked VHA data with Medicare and Medicaid data sources. Their broad outcome definition of PAD events, which included peripheral revascularization and amputation procedures, was designed to capture events occurring earlier in the PAD disease process that could ultimately lead to amputation (12).

Although the authors used a state-of-the-art active comparator (DPP-4i) new user design and employed a comprehensive list of baseline covariates and propensity scores to account for differences in measured baseline characteristics between SGLT2i and DPP-4i users, we should be cautious when interpreting the study results as demonstrating a causal effect of SGLT2is on risk of PAD outcomes and ultimately amputations.

As mentioned above, SGLT2is are preferentially prescribed to patients with diabetes at highest risk for CVD. Table 1 in Griffin et al. (11) provides strong empirical evidence for this statement, as virtually all measured CVD-related variables are more prevalent in SGLT2i initiators than in DPP-4i initiators. For example, the prevalence of congestive heart failure (CHF) is 18.1% in SGLT2i initiators and 9.6% in DPP-4i initiators. The crude incidence rate of the composite PAD outcome is 12.7 per 1,000 person-years in the SGLT2i cohort and 8.6 per 1,000 person-years in the DPP-4i cohort, for a ∼50% higher risk, not controlling for confounding by indication. SGLT2i (vs. DPP-4i) initiation is clearly associated with an increased risk of PAD and thus could be used as a strong marker for future PAD complications. This estimate, however, is not what we are interested in. Rather, we want to know what potential causal effect initiating SGLT2i (vs. DPP-4i) has on developing the PAD outcome by comparing patients at the same baseline risk for the outcome.
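The crude comparison above is simple arithmetic on the two quoted incidence rates. A minimal Python sketch reproduces it; the variable names are hypothetical and chosen only for illustration.

```python
# Crude incidence rates of the composite PAD outcome quoted in the text,
# expressed per 1,000 person-years.
rate_sglt2i = 12.7
rate_dpp4i = 8.6

# Relative and absolute crude contrasts (no adjustment for confounding).
crude_rate_ratio = rate_sglt2i / rate_dpp4i
crude_rate_difference = rate_sglt2i - rate_dpp4i

print(f"crude rate ratio: {crude_rate_ratio:.2f}")
print(f"crude rate difference: {crude_rate_difference:.1f} per 1,000 person-years")
```

The ratio comes out at about 1.48, which is the ∼50% higher crude rate described above.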

To do so, Griffin et al. (11) used propensity score methods to remove these imbalances in measured baseline risk factors for PAD, resulting in an adjusted estimate of an 18% increase in risk of the PAD outcome that they interpret causally, even while using “association” language. Balancing measured risk factors is, however, not sufficient to balance actual risk. The great majority of covariates the authors balanced were dichotomous measurements (e.g., CHF yes vs. no) that do not classify disease severity. Balancing only the dichotomy (CHF yes vs. no) captures only part of the overall confounding by CHF, leaving residual confounding (Fig. 1). Similar logic would apply to other CVD risk factors. Given the change in the estimated rate ratio from roughly 1.48 (crude) to 1.18 (adjusted) after controlling for the underlying burden of comorbidities measured without specifying severity, it is very plausible that the true estimate would be closer to no effect, or even no effect at all, if all confounding were well controlled for. Note that confounding does not bias toward the null but in a specific direction, here toward SGLT2i initiators having a higher risk of developing the PAD outcome irrespective of any treatment effect.
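The residual confounding argument can be made concrete with a small numerical sketch. The cohort below is entirely hypothetical: the four CHF severity levels, their shares, the prescribing probabilities, and the outcome risks are illustrative assumptions, not values from Griffin et al. (11). By construction, treatment has no effect on the outcome, so any ratio above 1 reflects confounding alone. Stratification with a Mantel-Haenszel risk ratio stands in here for the propensity score methods used in the study.

```python
# Hypothetical expected cohort; every number below is an illustrative assumption.
SEVERITY_SHARE = {0: 0.70, 1: 0.10, 2: 0.10, 3: 0.10}  # 0 = no CHF ... 3 = severe CHF
P_TREATED = {0: 0.30, 1: 0.45, 2: 0.60, 3: 0.75}       # P(SGLT2i | severity)
P_OUTCOME = {0: 0.01, 1: 0.02, 2: 0.03, 3: 0.04}       # P(outcome | severity), same in both arms


def expected_cells(severity):
    """Expected cohort fractions: (treated, treated cases, untreated, untreated cases)."""
    share = SEVERITY_SHARE[severity]
    treated = share * P_TREATED[severity]
    untreated = share - treated
    risk = P_OUTCOME[severity]  # identical in both arms: no true treatment effect
    return treated, treated * risk, untreated, untreated * risk


def mantel_haenszel_rr(strata):
    """Mantel-Haenszel risk ratio; each stratum is a list of pooled severity levels."""
    numerator = 0.0
    denominator = 0.0
    for levels in strata:
        n_treated = cases_treated = n_untreated = cases_untreated = 0.0
        for severity in levels:
            t, t_cases, u, u_cases = expected_cells(severity)
            n_treated += t
            cases_treated += t_cases
            n_untreated += u
            cases_untreated += u_cases
        stratum_total = n_treated + n_untreated
        numerator += cases_treated * n_untreated / stratum_total
        denominator += cases_untreated * n_treated / stratum_total
    return numerator / denominator


crude_rr = mantel_haenszel_rr([[0, 1, 2, 3]])       # single stratum: no adjustment
proxy_rr = mantel_haenszel_rr([[0], [1, 2, 3]])     # adjust for CHF yes vs. no only
full_rr = mantel_haenszel_rr([[0], [1], [2], [3]])  # adjust for actual severity

print(f"crude: {crude_rr:.2f}, proxy-adjusted: {proxy_rr:.2f}, severity-adjusted: {full_rr:.2f}")
```

Under these assumed inputs, the crude ratio is about 1.49, adjusting for the yes/no proxy brings it down only to about 1.09, and adjusting for actual severity recovers the null value of 1.00.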

Figure 1

Directed acyclic graph illustrating residual confounding when controlling for a proxy (C*) of the actual confounder (C). C, severity of cardiovascular disease; C*, dichotomous measure of cardiovascular disease (yes vs. no); T, sodium–glucose cotransporter 2 inhibitor initiation (vs. DPP-4i initiation); U, severity of cardiovascular disease (unmeasured); Y, PAD surgical outcome. A: Controlling C (the true confounder) depicted by the box around C is sufficient to allow for causal inference. If C is measured incorrectly, however, then the effect of T on Y based on controlling for a proxy (C*) will generally be biased. B: Measurement error in confounders can be thought of as unmeasured confounding. Controlling for C* is better than not controlling for C*, as it is an (imperfect) proxy for C, but it is insufficient to block the backdoor path from T to Y.

A causal interpretation of the results depends on an assumption of how much actual confounding was removed by adjusting for the measured covariates. If we assume it was 90%, then there might be a true effect of SGLT2i initiation on risk of developing the PAD outcome. If we assume 70%, which is probably more realistic, there would be no effect of SGLT2i on PAD. The authors attempted to address unmeasured confounding by presenting an E-value, a measure of “exceptional confounding, the control of which would completely wipe out apparent risk increases despite researchers’ best efforts at confounder control” (13), stating that “an E-value of 1.37 indicates that a moderate confounder would render the study findings inconclusive.” However, an E-value is for a single unmeasured confounder (14) and therefore does not capture the effect of multiple confounders all acting in the same direction. The latter is, however, exactly what we would expect with confounding by indication (15).
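Both ideas in this paragraph can be sketched numerically. The first function below is one possible way to formalize "fraction of confounding removed": it assumes confounding is measured on the log rate-ratio scale and extrapolates from the crude (1.48) and adjusted (1.18) ratios. That scale is an assumption made here for illustration; the text does not state how its 90% and 70% scenarios were calculated, and other definitions give somewhat different numbers. The second function is the commonly used E-value formula for a ratio above 1 (VanderWeele and Ding). Function names are hypothetical.

```python
import math


def implied_true_rr(crude_rr, adjusted_rr, fraction_removed):
    """Ratio implied if adjustment removed only `fraction_removed` of the total
    confounding, with confounding measured on the log scale (an assumption)."""
    removed = math.log(crude_rr) - math.log(adjusted_rr)
    total_confounding = removed / fraction_removed
    return math.exp(math.log(crude_rr) - total_confounding)


def e_value(rr):
    """E-value for a ratio estimate greater than 1: RR + sqrt(RR * (RR - 1))."""
    return rr + math.sqrt(rr * (rr - 1.0))


CRUDE_RR = 1.48     # crude rate ratio quoted in the text
ADJUSTED_RR = 1.18  # adjusted estimate quoted in the text

for fraction in (1.0, 0.9, 0.7):
    implied = implied_true_rr(CRUDE_RR, ADJUSTED_RR, fraction)
    print(f"fraction removed {fraction:.0%}: implied ratio {implied:.2f}")

print(f"E-value for the point estimate: {e_value(ADJUSTED_RR):.2f}")
```

Under this particular simplification, the implied ratio is about 1.15 at 90% and about 1.07 at 70%, that is, progressively closer to the null as the assumed fraction falls. The E-value for the point estimate of 1.18 is about 1.64. The 1.37 quoted from the study is smaller, which would be expected if it refers to a confidence limit nearer the null, although that is an inference and is not stated in the text.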

While residual confounding threatens the causal interpretation of the study reported by Griffin et al. (11), the study could have been improved had the authors chosen an active comparator that is an actual clinical alternative for patients at high CVD risk (e.g., glucagon-like peptide 1 receptor agonists [GLP-1RAs], a drug class also indicated for CVD risk reduction) or restricted the eligible population to those without CVD at baseline, in whom the risk for CVD could be assumed to be more evenly distributed than in patients with preexisting CVD. Without the variation in severity among patients with baseline CVD, the potential for residual (unmeasured) confounding by indication would be reduced, providing more robust estimates of the effect of SGLT2is on the PAD outcome.

For clinicians and patients, treatment decisions should be based on an individual evaluation of the balance between potential benefit and potential harm. This evaluation can only be made on the absolute risk scale, not a relative scale. The 18% increase in the relative risk of the PAD outcome seen in the study by Griffin et al. (11) (assuming a true causal effect) translates into a very small increase in absolute risk of ∼1–2 events per 1,000 people treated for 1 year (kudos to the authors for presenting an absolute measure, i.e., a rate difference of 1.8/1,000 person-years) or a number needed to treat (16) of between 500 and 1,000 for one additional person to experience the adverse outcome. Given our argument about residual confounding, this number is likely higher or even moot (no effect) and needs to be weighed against the proven benefits of SGLT2is with respect to CVD outcomes. In EMPA-REG (BI 10773 [Empagliflozin] Cardiovascular Outcome Event Trial in Type 2 Diabetes Mellitus Patients), which enrolled patients with diabetes and high CVD risk, the number needed to treat for empagliflozin was 63 to prevent one major adverse cardiovascular outcome and 29 to prevent one CVD death (17). Thus, the potential harm that an SGLT2i may cause, if true, is small compared with the potential great benefit.
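The absolute-scale comparison is a matter of inverting rate differences. The sketch below, with a hypothetical function name, converts the quoted rate difference and the ∼1–2 per 1,000 range into the number of people treated for 1 year per additional PAD event and sets the result beside the EMPA-REG figures cited above. The time horizon of the trial figures is not given in the text, so the comparison is only indicative.

```python
def people_per_additional_event(rate_difference_per_1000_py):
    """People treated for 1 year per one additional event, given a rate
    difference expressed per 1,000 person-years."""
    return 1000.0 / rate_difference_per_1000_py


# Rate difference reported by Griffin et al., as quoted in the text.
per_event_reported = people_per_additional_event(1.8)

# Bounds corresponding to the ~1-2 events per 1,000 people treated for 1 year.
per_event_low = people_per_additional_event(2.0)
per_event_high = people_per_additional_event(1.0)

# Numbers needed to treat for benefit quoted from EMPA-REG (17).
NNT_MACE = 63
NNT_CVD_DEATH = 29

print(f"rate difference 1.8/1,000 PY -> about {per_event_reported:.0f} treated per extra PAD event")
print(f"range for 1-2 per 1,000 -> {per_event_low:.0f} to {per_event_high:.0f}")
print(f"ratio to NNT for MACE: {per_event_reported / NNT_MACE:.1f}")
print(f"ratio to NNT for CVD death: {per_event_reported / NNT_CVD_DEATH:.1f}")
```

Taking the reported rate difference at face value gives roughly 556 people treated for 1 year per additional PAD event, within the 500 to 1,000 range stated above and far larger than the numbers needed to treat for benefit.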

Considering the controversies over the effects of SGLT2is on amputation risk and the heterogeneity of drug effects across different populations, further research is needed to identify the mechanisms underlying drug-induced adverse events and to fully quantify the risk associated with each SGLT2i in different populations and settings. With real-world data, the robustness of the findings and their causal interpretation will depend on the study design, particularly on designs that fully measure baseline risk or that minimize confounding by indication by choosing the most appropriate active comparator. We would argue that the guilty verdict for SGLT2is implied by the study by Griffin et al. (11) may not have met the threshold of beyond a reasonable doubt.

See accompanying article, p. 361.

Funding. T.S. receives investigator-initiated research funding and support as principal investigator (R01AG056479) from the National Institute on Aging and as a co-investigator (R01CA277756) from the National Cancer Institute. He also receives salary support as Director of Comparative Effectiveness Research (CER), North Carolina Translational and Clinical Sciences Institute, University of North Carolina Clinical and Translational Science Award (UM1TR004406), codirector of the Human Studies Consultation Core, North Carolina Diabetes Research Center (P30DK124723), National Institute of Diabetes and Digestive and Kidney Diseases, the Center for Pharmacoepidemiology (current members: GlaxoSmithKline, UCB BioSciences, Takeda, AbbVie, Boehringer Ingelheim, Astellas, and Sarepta), and from a generous contribution from Dr. Nancy A. Dreyer to the Department of Epidemiology, University of North Carolina at Chapel Hill.

Duality of Interest. T.S. owns stock in Novartis, Roche, and Novo Nordisk. No other potential conflicts of interest relevant to this article were reported.

Handling Editors. The journal editors responsible for overseeing the review of the manuscript were Elizabeth Selvin and M. Sue Kirkman.

1. Soyoye DO, Abiodun OO, Ikem RT, Kolawole BA, Akintomide AO. Diabetes and peripheral artery disease: a review. World J Diabetes 2021;12:827–838
2. Barnes JA, Eid MA, Creager MA, Goodney PP. Epidemiology and risk of amputation in patients with diabetes mellitus and peripheral artery disease. Arterioscler Thromb Vasc Biol 2020;40:1808–1817
3. Grzebień A, Chabowski M, Malinowski M, Uchmanowicz I, Milan M, Janczak D. Analysis of selected factors determining quality of life in patients after lower limb amputation-a review article. Pol Przegl Chir 2017;89:57–61
4. Hoffstad O, Mitra N, Walsh J, Margolis DJ. Diabetes, lower-extremity amputation, and death. Diabetes Care 2015;38:1852–1857
5. Neal B, Perkovic V, de Zeeuw D, et al. Rationale, design, and baseline characteristics of the Canagliflozin Cardiovascular Assessment Study (CANVAS)–a randomized placebo-controlled trial. Am Heart J 2013;166:217–223.e11
6. Neal B, Perkovic V, Mahaffey KW, et al.; CANVAS Program Collaborative Group. Canagliflozin and cardiovascular and renal events in type 2 diabetes. N Engl J Med 2017;377:644–657
7. Tuttle KR, Brosius FC, Cavender MA, et al. SGLT2 inhibition for CKD and cardiovascular disease in type 2 diabetes: report of a scientific workshop sponsored by the National Kidney Foundation. Diabetes 2021;70:1–16
8. D’Andrea E, Wexler DJ, Kim SC, Paik JM, Alt E, Patorno E. Comparing effectiveness and safety of SGLT2 inhibitors vs DPP-4 inhibitors in patients with type 2 diabetes and varying baseline HbA1c levels. JAMA Intern Med 2023;183:242–254
9. Katsiki N, Dimitriadis G, Hahalis G, et al. Sodium-glucose co-transporter-2 inhibitors (SGLT2i) use and risk of amputation: an expert panel overview of the evidence. Metabolism 2019;96:92–100
10. Scheen AJ. Does lower limb amputation concern all SGLT2 inhibitors? Nat Rev Endocrinol 2018;14:326–328
11. Griffin KE, Snyder K, Javid AH, et al. Use of SGLT2i versus DPP-4i as an add-on therapy and the risk of PAD-related surgical events (amputation, stent placement, or vascular surgery): a cohort study in veterans with diabetes. Diabetes Care 2025;48:361–370
12. Gul F, Janzer SF. Peripheral vascular disease. StatPearls Publishing; 2024. Accessed 15 November 2024. Available from https://www.ncbi.nlm.nih.gov/books/NBK557482/
13. Poole C. Commentary: continuing the E-value's post-publication peer review. Int J Epidemiol 2020;49:1497–1500
14. Chung WT, Chung KC. The use of the E-value for sensitivity analysis. J Clin Epidemiol 2023;163:92–94
15. Sendor R, Stürmer T. Core concepts in pharmacoepidemiology: confounding by indication and the role of active comparators. Pharmacoepidemiol Drug Saf 2022;31:261–269
16. Stang A, Poole C, Bender R. Common problems related to the use of number needed to treat. J Clin Epidemiol 2010;63:820–825
17. Verma S, Mazer CD, Al-Omran M, et al. Cardiovascular outcomes and safety of empagliflozin in patients with type 2 diabetes mellitus and peripheral artery disease: a subanalysis of EMPA-REG OUTCOME. Circulation 2018;137:405–407

Readers may use this article as long as the work is properly cited, the use is educational and not for profit, and the work is not altered. More information is available at https://www.diabetesjournals.org/journals/pages/license.