Diabetic kidney disease (DKD) is the leading cause of end-stage kidney disease (ESKD). Prognostic biomarkers reflective of underlying molecular mechanisms are critically needed for effective management of DKD. A three-marker panel was derived from a proteomics analysis of plasma samples by an unbiased machine learning approach from participants (N = 58) in the Clinical Phenotyping and Resource Biobank study. In combination with standard clinical parameters, this panel improved prediction of the composite outcome of ESKD or a 40% decline in glomerular filtration rate. The panel was validated in an independent group (N = 68), who also had kidney transcriptomic profiles. One marker, plasma angiopoietin 2 (ANGPT2), was significantly associated with outcomes in cohorts from the Cardiovascular Health Study (N = 3,183) and the Chinese Cohort Study of Chronic Kidney Disease (N = 210). Glomerular transcriptional angiopoietin/Tie (ANG-TIE) pathway scores, derived from the expression of 154 ANG-TIE signaling mediators, correlated positively with plasma ANGPT2 levels and kidney outcomes. Higher receptor expression in glomeruli and higher ANG-TIE pathway scores in endothelial cells corroborated potential functional effects in the kidney from elevated plasma ANGPT2 levels. Our work suggests that ANGPT2 is a promising prognostic endothelial biomarker with likely functional impact on glomerular pathogenesis in DKD.
Introduction
Diabetes is the most common cause of chronic kidney disease (CKD) in the world (1). Early identification of patients at risk for diabetic kidney disease (DKD) progression is critical for effective management to prevent or delay kidney damage (2). Biomarker testing combined with effective treatments has been shown to be a cost-effective way to improve patient outcomes in patients with, or at risk for, DKD (3). However, currently recommended biomarkers, which include estimated glomerular filtration rate (eGFR) and albuminuria, have limited sensitivity and specificity for disease prognosis (4). Over the past decade several additional prognostic biomarker candidates, identified with transcriptomic (5,6), proteomic (4), and metabolomic/lipidomic (7–9) analyses, have been found to improve the prognostic accuracy of current biomarkers. Plasma proteins have become attractive candidates as clinical biomarkers because they are easy to process and measure using routine clinical workflows (10–12).
For putative biomarkers that do not originate specifically from disease-affected tissues, such as those measured in body fluids like plasma or serum, it is important to understand the mechanistic connections between the biomarker and the pathophysiology of the disease process. For example, circulating TNFR1 and -2, which are among the most powerful prognostic biomarkers discovered for DKD progression, are unlikely to originate from the kidney (13). However, TNF signaling plays a clear role in DKD pathogenesis. Therefore, investigations into how these circulating biomarkers participate in the disease process can provide novel insights into the underlying molecular mechanism of DKD progression and reveal potential therapeutic targets.
Single-cell RNA sequencing (scRNAseq) allows identification of cell type–specific genes, which can help establish pathogenic signaling connections between circulating biomarkers and potential cell type–specific receptors or other mediators of signaling (14). scRNAseq of kidney biopsy tissue can provide valuable insights into these cell type–specific molecular pathways previously underappreciated due to cellular heterogeneity (15,16). Importantly, investigation of plasma protein ligands, their corresponding receptors in the kidney at single-cell level, and the downstream signaling cascade offers opportunities to unravel the link between plasma biomarkers and pathogenesis of end-organ damage like DKD.
In this study, by integrating plasma proteomics and kidney bulk and single cell transcriptomic data across three independent cohorts, we aimed to identify specific circulating markers and delineate the intrarenal signaling cascade that may mediate the association between the circulating biomarker and DKD progression.
Research Design and Methods
The overall workflow of the study is depicted in Fig. 1.
Study Populations
The Clinical Phenotyping and Resource Biobank Core (C-PROBE) includes a multiethnic, prospective observational cohort of patients with CKD stages 1–4, for whom clinical information and biospecimens were collected at enrollment and yearly thereafter at six sites in the U.S. (17). The C-PROBE discovery group (A) included adult patients with CKD and diabetes who had three or more eGFR measurements after enrollment and had plasma samples available from the time of enrollment. For this group, eGFR was estimated with the Chronic Kidney Disease Epidemiology Collaboration (CKD-EPI) equation (18) and was >30 mL/min/1.73 m2. eGFR slopes were calculated using a linear mixed-effects model (17). Patients with eGFR slopes <−3% per year and >3% per year were matched for major confounders including age, sex, race, and baseline eGFR. An independent C-PROBE validation group (B) included participants with available kidney biopsy transcriptomic profiles.
The Cardiovascular Health Study (CHS) is a community-based, longitudinal study of adults aged ≥65 years designed to identify risk factors for the onset and progression of coronary heart disease and stroke (19). Recruitment began in 1990, and study data were collected annually via clinic visits and biannually via telephone visits. SomaLogic SomaScan v1.3 plasma protein measurements were conducted in a subset of 3,183 CHS participants, without end-stage kidney disease (ESKD), assessed during the 1992–1993 study visit (hereafter referred to as “baseline”). Serum creatinine and cystatin C measurements were conducted on samples taken at baseline and after 4 years, from which eGFR was calculated with the 2021 CKD-EPI equation (20). Incident ESKD events were assessed within 4 years and over the duration of follow-up (mean 13 [SD 6.4] years).
The Chinese Cohort Study of Chronic Kidney Disease (C-STRIDE) includes a multicenter prospective cohort of CKD patients (21). Participants were included who had 1) a diagnosis of diabetes or fasting blood glucose ≥7 mmol/L, 2) eGFR <60 and >30 mL/min/1.73 m2 or urinary albumin-to-creatinine ratio (uACR) >30 mg/g for at least 3 months, 3) two or more available eGFR measurements during follow-up, and 4) plasma samples from the baseline visit. eGFR was calculated with the MDRD equation (22).
Kidney Outcomes and Covariates
The primary outcome was defined as ESKD or >40% reduction from baseline eGFR—whichever occurred first. Participants who reached this outcome were defined as progressors. Covariates included age, sex, race, eGFR, and uACR documented at study enrollment. Participants missing key covariates of eGFR or uACR were not included except when otherwise indicated. In CHS, the primary outcome was defined as an eGFR decline >30% or incident ESKD over 4 years of follow-up.
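As an illustration of the composite outcome definition above, the following minimal sketch derives a time-to-event pair from a participant's eGFR series. It is not the study's code: the function name, argument names, and the choice to censor at the last eGFR visit are assumptions made for the example. The decline threshold is a parameter so that the 30% definition used in CHS can be expressed as well.

```python
from typing import Optional, Sequence, Tuple


def time_to_composite_outcome(
    baseline_egfr: float,
    visits: Sequence[Tuple[float, float]],  # (years since baseline, eGFR)
    eskd_time: Optional[float] = None,      # years to ESKD, if it occurred
    decline_threshold: float = 0.40,        # fractional decline from baseline
) -> Tuple[float, bool]:
    """Return (time, event) for ESKD or eGFR decline, whichever comes first.

    Participants with no event are censored at their last recorded visit
    (an assumption of this sketch).
    """
    decline_time = None
    for t, egfr in sorted(visits):
        # Strictly greater than the threshold, matching ">40% reduction".
        if (baseline_egfr - egfr) / baseline_egfr > decline_threshold:
            decline_time = t
            break

    event_times = [t for t in (decline_time, eskd_time) if t is not None]
    if event_times:
        return min(event_times), True

    last_visit = max((t for t, _ in visits), default=0.0)
    return last_visit, False
```

For example, a participant with a baseline eGFR of 50 and follow-up values of 45 and 29 at years 1 and 2 would be classified as a progressor at year 2, because the decline at that visit is 42%.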
SomaScan Measurement
SOMAmer-based proteomic assays using the SomaScan 1.3K Plasma kits per standard experimental and data analysis protocols (23) were performed on plasma samples collected at study enrollment at the Genome Technology Access Center at Washington University School of Medicine for C-PROBE and the CardioVascular Institute at the Beth Israel Deaconess Medical Center for CHS. Aptamer levels were log2 transformed and then standardized to have a mean of 0 and SD of 1. Detailed protocol and analysis are included in Supplementary Material.
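The transformation described above (log2 followed by standardization to mean 0 and SD 1) can be sketched as follows for a single aptamer. This is illustrative only: the function name is hypothetical, and the use of the sample SD (n − 1 denominator) is an assumption, since the text does not state which SD estimator was used.

```python
import math
from statistics import mean, stdev


def standardize_log2(values):
    """Log2-transform raw aptamer levels, then scale to mean 0 and SD 1."""
    logged = [math.log2(v) for v in values]
    mu = mean(logged)
    sd = stdev(logged)  # sample SD (n - 1); assumed for this sketch
    return [(x - mu) / sd for x in logged]


# Toy values for one aptamer across five participants (not study data).
z = standardize_log2([1.0, 2.0, 4.0, 8.0, 16.0])
```

After this step, coefficients from downstream models are expressed per unit of the standardized log2 level, which makes effect sizes comparable across aptamers with very different raw signal ranges.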
Measurement of ANGPT2 Using ELISA
Human Angiopoietin-2 Quantikine ELISA (DANG20; R&D Systems) was used according to the manufacturer’s instructions. Details can be found in Supplementary Material. The quality control measures of % recovery mean and % coefficient of variation were within acceptable ranges, and the interassay coefficient of variation was <4.0%.
Statistical Analysis
A Cox proportional hazards model was used to determine the association of baseline levels of each putative biomarker with outcomes. Biomarkers that were statistically significant (P < 0.05) in a univariate model were used as input for the machine learning approach. The Lasso Cox method (24) was used for feature reduction (details in Supplementary Material). A multivariate Cox model based on the markers selected by Lasso was then constructed. Model performance was evaluated with the C statistic and time-dependent area under the curve (AUC). In survival models, the C statistic summarizes the overall predictive performance of the model, while time-dependent AUC summarizes the predictive accuracy at a specific time point. The performance of two competing models was compared using the ΔC statistic based on a perturbation-resampling method (1,000 iterations) (25) and the likelihood ratio (LR) test, which was included for its sensitivity in capturing differences in prediction ability. R package glmnet (26,27) was used to construct the Lasso Cox model. Cox proportional hazards models were constructed using R package survival (28). C statistic and ΔC statistic were calculated with R package survC1 (25), and time-dependent AUC was calculated with R package timeROC (29). All analyses were completed with R, version 3.6.2 (R Core Team, 2019).
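To make the meaning of the C statistic concrete, the sketch below computes a concordance index for right-censored data in plain Python. Note that this is Harrell's pairwise estimator, shown because it is the simplest to read; the analyses described above used the survC1 package, which implements a different (inverse-probability-weighted) estimator, so the numbers would not be identical. The function name and toy inputs are hypothetical.

```python
def harrell_c(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    A pair (i, j) is comparable when participant i had an observed event
    strictly before participant j's follow-up time. The pair is concordant
    when i also has the higher predicted risk; ties in risk count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored participants cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: risk scores perfectly ordered against event times.
c = harrell_c([1, 2, 3, 4], [1, 1, 0, 1], [4.0, 3.0, 2.0, 1.0])
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which the model comparisons in this study are reported.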
In CHS, the association of baseline plasma ANGPT2 concentration with outcomes in the combined cohort was tested using a logistic regression model adjusted for age, sex, race (White vs. non-White), baseline eGFR, and diabetes history. Associations of plasma ANGPT2 and the primary outcome were also tested in diabetes strata. Additionally, the association of plasma ANGPT2 with incident ESKD over the entire follow-up duration was tested with a Cox proportional hazards model, with adjustment for the aforementioned covariates.
Gene Expression Data Analysis
Gene expression profiling from microdissected kidney biopsies was performed as previously described (30) using Affymetrix 2.1 ST chips (Affymetrix, Santa Clara, CA). The CEL files can be accessed through the Gene Expression Omnibus (GEO), GSE180395.
RNA-sequencing (RNAseq) fastq files for data set GSE142025 were downloaded from GEO (31). Gene-level expression quantification was performed with HTSeq, and the resultant count data were normalized with voom. Principal components analysis and hierarchical clustering were used to identify and remove samples with abnormal expression profiles due to technical issues, and the mapping statistics were obtained from STAR. scRNAseq data were downloaded from the Kidney Precision Medicine Project (KPMP) Kidney Tissue Atlas repository (accessed November 2020) and processed according to the KPMP single cell protocol (Supplementary Material), as previously described (15,32,33). In the dot plot analysis, the size of the dots represents the percentage of cells expressing the TEK gene. TEK (+) cells were defined as those with normalized TEK expression greater than zero (TEK > 0), and the remaining cells were considered TEK (−). TEK gene expression was extracted from the Nephroseq database (https://www.nephroseq.org/) using the normal kidney tissue panel of Lindenmeyer et al. (34).
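The TEK (+) rule and the dot-size quantity described above can be illustrated with a short sketch that computes, per cell cluster, the percentage of cells with normalized expression greater than zero. The input layout (a list of cluster label and expression dictionary pairs), the function name, and the toy cluster labels are assumptions for this example and do not reflect the KPMP processing pipeline.

```python
from collections import defaultdict


def percent_expressing(cells, gene="TEK"):
    """Percentage of cells per cluster with normalized expression > 0.

    `cells` is an iterable of (cluster_label, {gene: normalized_expression}).
    A gene absent from a cell's dictionary is treated as zero expression.
    """
    totals = defaultdict(int)
    positives = defaultdict(int)
    for cluster, expression in cells:
        totals[cluster] += 1
        if expression.get(gene, 0.0) > 0:
            positives[cluster] += 1
    return {c: 100.0 * positives[c] / totals[c] for c in totals}


# Toy cells (not study data); "POD" is a hypothetical non-endothelial cluster.
toy_cells = [
    ("EC1", {"TEK": 1.2}),
    ("EC1", {"TEK": 0.0}),
    ("EC2", {"TEK": 0.4}),
    ("POD", {}),
]
fractions = percent_expressing(toy_cells)
```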
Gene Set Curation, Network Visualization, and Functional Module Detection
The union of ANG-TIE signaling pathway genes curated from three databases, the Pathway Interaction Database (PID) (35), the Reactome database (36), and the NetPath database (37), provided a set of 154 genes (Supplementary Fig. 1). Cytoscape was used to visualize the genes and their database sources. Community clustering in HumanBase (38) was applied to the ANG-TIE signatures within the kidney functional network to identify cohesive gene modules.
Creation of Gene Set–Based Pathway Activation Score
Log2-transformed gene expression was used to compute z scores, and the average of z scores of the 154 pathway gene signature was used to generate an ANG-TIE pathway activation score for each participant. The AddModuleScore function in R package Seurat was used to generate ANG-TIE pathway scores at the single-cell cluster level.
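The per-participant score described above (the mean of gene-wise z scores over the pathway gene set) can be sketched as follows. This is a simplified illustration, not the study's code: the data layout, the function name, the use of the sample SD, and the decision to skip genes with zero variance are all assumptions. It does not reproduce Seurat's AddModuleScore, which additionally subtracts the expression of control gene sets.

```python
from statistics import mean, stdev


def pathway_activation_scores(log2_expr, gene_set):
    """Average of per-gene z scores across a gene set, one score per participant.

    `log2_expr` maps gene symbol -> list of log2 expression values, with the
    same participant order for every gene.
    """
    z_by_gene = []
    for gene in gene_set:
        values = log2_expr.get(gene)
        if values is None:
            continue  # gene not measured on this platform
        mu, sd = mean(values), stdev(values)
        if sd == 0:
            continue  # a constant gene carries no information for a z score
        z_by_gene.append([(v - mu) / sd for v in values])
    # Transpose gene-by-participant rows into per-participant columns.
    return [mean(column) for column in zip(*z_by_gene)]


# Toy data for three participants (not study data).
toy_expr = {"TEK": [1.0, 2.0, 3.0], "ANGPT2": [2.0, 4.0, 6.0], "OTHER": [9.0, 9.0, 1.0]}
scores = pathway_activation_scores(toy_expr, ["TEK", "ANGPT2"])
```

Because each gene is scaled before averaging, highly expressed genes do not dominate the score; every gene in the set contributes equally.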
Data and Resource Availability
The C-PROBE transcriptomic data files can be accessed through GEO, GSE180395. External validation RNAseq data files were downloaded from GEO, GSE142025. scRNAseq data were downloaded from the KPMP Kidney Tissue Atlas. SomaScan and other clinical data can be accessed after necessary approval processes with respective study review committees.
Results
Circulating Biomarkers Associated With DKD Progression
Baseline and follow-up characteristics of the 58 participants included in the discovery group (C-PROBE group A) are summarized in Table 1. Mean age at enrollment was 59.90 (SD 11.70) years. Participants had impaired baseline kidney function, with mean eGFR 49.54 (SD 13.74) mL/min/1.73 m2 and median uACR 167.4 mg/g (interquartile range [IQR] 683.09). After 4.4 years (IQR 3.13) of follow-up, 28 of 58 patients reached the composite outcome of ESKD or 40% loss of baseline kidney function.
Characteristic | C-PROBE discovery group A (N = 58) | C-PROBE validation group B (N = 68) | CHS (N = 3,183) | C-STRIDE (N = 210) |
---|---|---|---|---|
At baseline | | | | |
Age (years) | 59.90 (11.70) | 43.57 (15.35) | 74.39 (4.91) | 58.52 (10.61) |
Female, n (%) | 33 (56.9) | 41 (60.3) | 1,934 (60.8) | 80 (47.3) |
Black race, n (%) | 28 (48.3) | 13 (19.1) | 505 (15.9) | NA |
eGFR (mL/min/1.73 m2) | 49.54 (13.74) | 65.61 (34.43) | 74.3 (17.1) | 57.22 (30.62) |
uACR (mg/g) | 167.4 [683.09] | 816.2 [1,680.8] | NA | 595.0 [1,364.9] |
During follow-up | | | | |
Follow-up (years) | 4.4 (3.0) | 3.6 (2.8) | 2.6 (1.0) | 1.9 (2.3) |
No. reaching outcome (%) | 28 (48.3) | 29 (49.2) | 74 (2.3)* | 74 (35.2) |
Data are means (SD) or median [IQR] unless otherwise indicated. Outcome was defined as time to composite event of developing ESKD or loss >40% of baseline eGFR—whichever occurs first. Data were complete except uACR missing for one, four, and seven individuals in C-PROBE group A, C-PROBE group B, and C-STRIDE, respectively. Baseline uACR was not measured in CHS. Time of follow-up is missing for five people in C-PROBE group B. eGFR decline could not be calculated for 815 (26%) CHS participants due to missing creatinine data. NA, not applicable.
*For the CHS cohort, outcome was defined as a composite of eGFR decline >30% or incident ESKD over 4 years of follow-up from baseline.
Of the 1,301 plasma proteins measured on the SomaScan platform, 84 were univariately associated with time to reaching the outcome (P < 0.05) (Fig. 2A and Supplementary Table 1). Epidermal growth factor receptor (EGFR), apolipoprotein A1 (APOA1), cathepsin V (CTSV), C-type lectin domain family 4 member M (CLEC4M), and TNF receptor superfamily member 1A (TNFRSF1A) were among the markers that showed the strongest associations with outcome. Many of the 84 proteins were intercorrelated, with Pearson correlation coefficients ranging from −0.63 to 0.96 (Supplementary Fig. 2A). The correlation matrix of these proteins, ordered by hierarchical clustering, showed distinct correlation patterns (Supplementary Fig. 2B), potentially representing different aspects of the pathogenesis of DKD progression. The intercorrelated expression pattern also necessitated a feature selection approach to reduce excessive multicollinearity.
Predictive Biomarker Panel for DKD Progression
To reduce dimensionality and construct an actionable marker panel that could predict DKD progression, a feature selection procedure was applied to the 84 proteins (cross-validation curve in Supplementary Fig. 2C) resulting in a panel of three biomarkers (EGFR, CLEC4M, and ANGPT2). The markers were evaluated together with clinical variables (Fig. 2B), with CLEC4M (hazard ratio [HR] 0.076, P = 0.005) and ANGPT2 (HR 3.39, P = 0.02) remaining statistically significant in the multivariate Cox proportional hazards model with the reduction of Akaike information criterion (AIC) values from 168.2 in model 0 to 160.6 in model 2. Addition of the biomarker panel significantly improved the prediction of DKD progression (LR test P = 0.003) over clinical variables alone (model 0 [Table 2]), with C statistic improving from 0.728 to 0.791 for the joint model including clinical variables and the selected biomarker panel (model 2 [Table 2]). Time-dependent receiver operating characteristic curve for the Cox model (truncated at 5 years) also demonstrated that the Lasso-selected biomarker panel could improve the prediction when added to clinical variables, with an increase in AUC from 0.704 to 0.806 (Supplementary Fig. 3).
Model | C statistic (CI) | ΔC statistic (CI)ᵃ | P (LR test)ᵃ |
---|---|---|---|
Group A (discovery) | | | |
Model 0 | 0.728 (0.583–0.873) | | |
Model 1 | 0.773 (0.671–0.874) | | |
Model 2 | 0.791 (0.662–0.920) | 0.063 (−0.031 to 0.157) | 0.003 |
Group B (validation) | | | |
Model 0 | 0.671 (0.531–0.811) | | |
Model 1 | 0.703 (0.596–0.810) | | |
Model 2 | 0.787 (0.666–0.907) | 0.116 (−0.002 to 0.234) | 0.0004 |
Model 0 covariates: age, sex, race, eGFR, and uACR. Model 1 covariates: three biomarkers, EGFR, CLEC4M, and ANGPT2. Model 2 covariates: age, sex, race, eGFR, and uACR and three biomarkers, EGFR, CLEC4M, and ANGPT2.
ᵃΔC statistic and P value are the results of comparing model 2 and model 0.
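The LR test result reported for the discovery group can be approximately cross-checked against the AIC values quoted in the text. Because AIC equals −2 × log-likelihood plus twice the number of parameters, the LR statistic for nested models equals the AIC difference plus twice the number of added parameters (three here, one per biomarker). The sketch below is a reader-side consistency check under the assumption that each biomarker enters as a single term; it is not the authors' code, the function names are hypothetical, and the tail probability is a closed form that holds only for 3 degrees of freedom.

```python
import math


def lr_statistic_from_aic(aic_reduced, aic_full, extra_params):
    """LR statistic for nested models recovered from their AIC values.

    AIC = -2*logLik + 2*k, so
    -2*(logLik_reduced - logLik_full) = (aic_reduced - aic_full) + 2*extra_params.
    """
    return (aic_reduced - aic_full) + 2.0 * extra_params


def chi2_sf_3df(x):
    """Upper-tail probability of a chi-square distribution with 3 df."""
    if x <= 0:
        return 1.0
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)


# Discovery group: AIC 168.2 (model 0) vs. 160.6 (model 2), three added markers.
stat = lr_statistic_from_aic(168.2, 160.6, extra_params=3)
p_value = chi2_sf_3df(stat)
```

With these inputs the statistic is about 13.6 and the P value is roughly 0.0035, in line with the reported LR test P value of 0.003 given that the published AIC values are rounded.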
Validation of the Additive Value of the Biomarker Panel in an Independent Group of Patients
For evaluation of the performance of the three-marker panel in an independent group of patients, concentrations of the identified biomarkers were measured by SomaScan assay in plasma samples of C-PROBE validation group B (N = 68) (Table 1). Group B included C-PROBE participants with transcriptomic data derived from kidney biopsies. In comparison with group A, group B participants were younger and had higher eGFR and uACR at baseline and fewer were Black. Twenty-nine participants reached the outcome after a median follow-up of 3.6 years.
The three-marker panel significantly improved prediction of progression to the outcome in this validation group, with the C statistic increasing from 0.671 for model 0 to 0.787 for model 2 (the joint clinical and biomarker model) and an LR test P value of 0.0004 (Table 2). The AIC also decreased markedly, from 202.1 for model 0 to 189.9 for model 2. Of the three markers, ANGPT2 remained statistically significant, with an HR of 3.59; that is, a one-unit increase in ANGPT2 (on the log2 scale) was associated with a 259% higher risk of progression with the other covariates held constant (P = 0.006) (Fig. 2C).
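The conversion from a hazard ratio to a percent change in risk used above is simple arithmetic, sketched here for clarity. The function name is hypothetical; the `units` argument relies on the standard property of the Cox model that the hazard ratio for a k-unit change in a predictor is the per-unit hazard ratio raised to the power k.

```python
def percent_risk_change(hazard_ratio, units=1.0):
    """Percent change in hazard for a `units`-unit increase in the predictor."""
    return (hazard_ratio ** units - 1.0) * 100.0


# HR of 3.59 per one-unit increase in ANGPT2, as reported for group B.
change = percent_risk_change(3.59)
```

For an HR of 3.59 this gives (3.59 − 1) × 100 = 259%, the figure quoted in the text; an HR below 1, such as 0.5, would correspond to a 50% lower hazard.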
Validation of ANGPT2 Concentration Using ELISA
Among the markers in the panel demonstrating significant association with outcome in both the discovery (C-PROBE group A) and validation (C-PROBE group B) groups, ANGPT2 stood out as the only consistently significant predictor after adjustment for age, sex, race, eGFR, uACR, and the other two biomarkers (Fig. 2B and C) and was selected for further clinical and mechanistic studies. For confirmation of concentrations measured by SomaScan, ANGPT2 in plasma samples from C-PROBE group A was also measured with ELISA. The ANGPT2 concentrations measured using these two technologies were strongly correlated, with a correlation coefficient of 0.91 (P = 2.58e−23) (Fig. 3A). The association of ANGPT2 with outcome was also replicated (HR 2.11 [95% CI 1.30–3.41]). With stratification by quartiles of ELISA-measured ANGPT2, the risk of reaching the outcome was 8.04-fold higher (95% CI 2.13–30.38, P = 0.002) for patients in quartile 4 than for those in quartile 1 (Supplementary Table 2).
Validation of the Association of ANGPT2 With Outcome in External Cohorts
The association of ANGPT2 with outcome was further validated in two independent, external, multicenter cohort studies, CHS and C-STRIDE. Compared with those of C-PROBE and C-STRIDE, CHS participants (N = 3,183) were older and had higher mean eGFR at baseline (Table 1). In CHS, a higher baseline plasma concentration of ANGPT2, measured using SomaScan, was associated with the composite outcome of eGFR decline >30% or ESKD over 4 years of follow-up (odds ratio [OR] 1.50 [95% CI 1.17–1.92], P = 0.001) (Fig. 3B). The AIC values showed a marked reduction, from 602.6 to 594.5. The association of ANGPT2 with the composite outcome persisted when the analysis was restricted to people with diabetes (OR 2.25 [95% CI 1.41–3.58]) (Supplementary Table 3). In participants without diabetes, ANGPT2 showed a trend toward a positive association with the composite outcome (OR 1.28 [95% CI 0.95–1.74]) that did not reach statistical significance (P = 0.107), likely due to the low event rate (Supplementary Table 3). In time-to-event analyses, baseline ANGPT2 was significantly associated with incident ESKD in the overall cohort (HR 1.53 [95% CI 1.11–2.10], P = 0.009) (Supplementary Table 4). The above associations were all adjusted for available common risk factors of CKD, including age, sex, and eGFR.
Baseline characteristics of the C-STRIDE participants (N = 210) (Table 1) were similar to those of C-PROBE group A (age) or B (eGFR and albumin-to-creatinine ratio). Length of follow-up was shorter in C-STRIDE. Higher baseline concentrations of ANGPT2, measured using ELISA, were associated with increased risk of the outcome, and the estimated HR was 2.09 (95% CI 1.40–3.12), with adjustment for age, sex, eGFR, and albumin-to-creatinine ratio (Fig. 3C). The AIC values also showed a marked reduction, from 373.6 to 366.9.
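The odds ratios, confidence intervals, and P values reported for CHS above can be checked for internal consistency with a Wald approximation: on the log scale, the width of a 95% CI is about 2 × 1.96 standard errors, and the ratio of the log estimate to that standard error is approximately standard normal. The sketch below is a reader-side check, not the authors' code; the function name is hypothetical, and small discrepancies are expected because the published limits are rounded.

```python
import math


def wald_p_from_ci(estimate, lower, upper, z_crit=1.959964):
    """Approximate two-sided P value from a ratio estimate and its 95% CI.

    Works on the log scale, so it applies to odds ratios and hazard ratios.
    """
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(estimate) / se
    # Two-sided normal tail: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# CHS overall cohort: OR 1.50 (95% CI 1.17-1.92).
z_all, p_all = wald_p_from_ci(1.50, 1.17, 1.92)
# CHS participants without diabetes: OR 1.28 (95% CI 0.95-1.74).
z_nodm, p_nodm = wald_p_from_ci(1.28, 0.95, 1.74)
```

Under this approximation the overall-cohort P value comes out near 0.001 and the value for participants without diabetes near 0.11, both close to the reported figures.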
ANGPT2-Associated Gene Network in the Kidney
The mechanism of action of circulating ANGPT2 on kidney disease progression was investigated through analyzing transcriptomic data from C-PROBE group B participants. To understand the ANGPT2 downstream signaling cascade in the kidney and its association with outcome, we generated an ANG-TIE signaling network gene list with 154 genes (Supplementary Table 5 and Fig. 4A). Community clustering of the 154 genes through HumanBase kidney functional network analysis resulted in three gene cluster modules (Supplementary Fig. 4) that contained key processes closely related to transmembrane receptor protein tyrosine kinase signaling (in both M1 and M2), regulation of cellular response to insulin stimulus and ephrin receptor signaling pathway (M1), superoxide metabolic process and positive regulation of cell motility (M2), and cellular response to oxidative stress and regulation of apoptotic process (M3).
Association of ANG-TIE Activation Score With Kidney Outcome
ANG-TIE pathway activation scores were calculated for every C-PROBE group B participant through aggregating the expression levels of the 154 pathway genes, in the glomerular and tubulointerstitial compartments separately. The association of ANG-TIE pathway activation scores with plasma ANGPT2 levels was evaluated in a compartment-specific manner. For patients with both glomerular transcriptomic data and plasma SomaScan data (N = 32), significant positive correlation between plasma ANGPT2 levels and glomerular ANG-TIE activation scores (r = 0.43, P = 0.01) (Fig. 4B) was observed. No significant correlations were observed between tubulointerstitial ANG-TIE pathway activation scores and plasma ANGPT2 levels (N = 25, r = 0.16, P = 0.82) (Supplementary Fig. 5A).
The ANG-TIE pathway activation score was significantly higher in glomeruli of progressors (those who reached outcome during follow-up) compared with nonprogressors (the rest of C-PROBE group B) (P = 0.02) (Fig. 4C); no significant differences were observed in tubulointerstitial ANG-TIE scores between these two groups (P = 0.51) (Supplementary Fig. 5B).
To test the generalizability of these findings, we downloaded the kidney cortex transcriptomic data from 28 patients with biopsy-proven DKD (31). Twenty-seven samples passed the quality control criteria and were divided into two groups (31), early (n = 6) and advanced (n = 21) DKD. Early DKD was defined as uACR between 30 and 300 mg/g and eGFR >90 mL/min/1.73 m2, whereas advanced DKD was defined as uACR >300 mg/g and eGFR <90 mL/min/1.73 m2. Significantly higher ANG-TIE pathway activation scores were observed in the advanced compared with the early DKD group (P = 9.98e−7) (Fig. 4D) in this independent, external data set.
Compartment- and Cell Cluster–Specific ANG-TIE Signaling Pathway Activation in the Kidney
To uncover the link among plasma ANGPT2, glomerular ANG-TIE scores, and disease outcome, we next focused our investigation on angiopoietin receptor. We evaluated mRNA expression of the tyrosine protein kinase receptor TIE2, encoded by the TEK gene, in microdissected glomeruli and tubulointerstitia from biopsies of six healthy donors using Nephroseq data from normal kidney tissue panel of Lindenmeyer et al. (34). TEK expression was significantly higher in glomeruli compared with tubulointerstitia (Fig. 4E).
scRNAseq data from kidney biopsies of 10 participants diagnosed with DKD were accessed through the KPMP Kidney Tissue Atlas (33) (Supplementary Table 6). Previously published scRNAseq data from 18 living kidney donors served as controls (39). A combined analysis of 56,906 cells from these two data sets demonstrated that TEK was specifically expressed in endothelial cell (EC) clusters (Fig. 5A and Supplementary Fig. 6A) and that the average intensity of TEK was significantly higher in those with DKD compared with living donors (Fig. 5B and Supplementary Fig. 6B and C).
ANG-TIE pathway scores were calculated at the scRNAseq level. Comparison of the pathway scores grouped by cells from glomerular and tubulointerstitial compartments indicated higher ANG-TIE signaling score in glomeruli compared with tubulointerstitia (P < 2.2e−16) (Fig. 5C), consistent with a glomerulus-enriched TEK expression pattern (Fig. 4E).
Approximately one-half of the cells in the EC clusters (EC1 and EC2) in the DKD samples were TEK (+) cells. The level of ANG-TIE pathway score was significantly higher in TEK (+) compared with TEK (−) ECs in the DKD scRNAseq data set (Fig. 5D).
Discussion
Multiple circulating protein biomarkers have been previously reported to be associated with DKD progression (4,10,11,13). However, the mechanistic links to kidney pathophysiology that explain these associations remain largely unclear. In this study we used an unbiased proteomics approach to identify and validate a plasma marker panel, including ANGPT2, that improved prediction of the composite kidney outcome. We then validated ANGPT2 as a strong independent prognostic biomarker in the C-PROBE, CHS, and C-STRIDE cohorts. In addition, through multiscalar data integration, we found that glomeruli, and specifically glomerular ECs, appear to be the predominant site of enhanced human ANG-TIE activation in DKD. These data are consistent with ECs being the pathogenic link of circulating ANGPT2 to DKD progression within the glomerulus in humans.
ANGPT2 is a vascular growth factor that signals through integrin activation and competitive inhibition of the binding of ANGPT1 to the TIE2 receptor tyrosine kinase (40). ANGPT2 is stored in and, on stimulation, rapidly released from EC Weibel-Palade bodies (41). Release of ANGPT2 into the bloodstream can be triggered by hypoxia, inflammation, and high glucose levels (42). ANGPT2 is crucial to the induction of inflammation and sensitizes ECs to TNF-α (43). Increased plasma levels of ANGPT2 also have been associated with cardiovascular disorders (40,44,45), diabetic retinopathy (46) and CKD stages 3–5 (47) and with a rapid decline in kidney function and ESKD (48). However, the signaling pathways that link circulating ANGPT2 to adverse kidney outcomes have remained poorly defined. In vitro and in vivo studies have suggested that EC damage, associated with increased ANGPT2, could accelerate DKD progression (49). With use of a hypothesis-free proteomics approach in multiple DKD cohorts of diverse ethnicities, our study confirmed and extended the role of ANGPT2 in the pathogenesis of DKD. Our integrative analysis of plasma proteomics data and kidney compartment-specific transcriptomic data suggested that elevated circulating ANGPT2 levels enhance signaling in glomerular ECs through the TIE2 receptor, activating a downstream proinflammatory and profibrotic signaling cascade. The ANG-TIE signaling network in the kidney was also enriched in members of other pathways that regulate cellular responses to insulin and oxidative stress, ephrin receptor signaling, and superoxide metabolic processes, among others (Supplementary Fig. 4), suggesting that ANGPT2 interacts with other functional processes important in DKD progression.
ANGPT2 expression has been found to be elevated in a CKD model in kidney glomerular, tubular, and interstitial cells (50). Also, increased expression of ANGPT2 specifically in podocytes led to increased albuminuria and EC apoptosis (51), as has been observed in DKD. The findings presented here suggest that both circulating and local ANGPT2 are important in pathogenesis and progression in people with DKD. In addition to absolute elevation of ANGPT2 it appears that even a relative increase of ANGPT2 compared with ANGPT1 can accelerate glomerular disease in animal models (52) and in CKD patients with acute kidney injury (53). Conversely, elevated plasma ANGPT1 has been associated with reduced risk of DKD progression (54), suggesting that ANGPT1 is protective. In our analysis of the C-PROBE cohort, neither plasma ANGPT1 nor ANGPT2-to-ANGPT1 ratio was associated with outcomes.
ANGPT2 targeting and dual ANGPT2-VEGF targeting have been investigated mainly in human neovascular eye diseases or treatment of cancer (55). Potential therapeutics such as faricimab that reduce ANGPT2 signaling have provided promising results in diabetic macular edema and retinopathy (56,57). Despite promising results from preclinical models, clinical trials targeting ANGPT2 to prevent kidney disease progression in patients with DKD have yet to be conducted. Findings from this study provide additional support for considering ANGPT2 or the ANG-TIE signaling pathway as therapeutic targets for DKD and might even provide a rationale for considering circulating plasma ANGPT2 levels in the selection of patients for interventional trials in DKD.
Beyond ANGPT2, previously identified plasma proteins associated with DKD progression such as β2-microglobulin (B2M); TNF receptor superfamily members TNFRSF1A, TNFRSF1B, TNFRSF9, and TNFRSF4; CD27 antigen (CD27); and CCL2 were detected with this methodology, and their prognostic value was confirmed in the C-PROBE cohort, bolstering support for an unbiased, data-driven approach for identifying prognostic markers.
The key strengths of this study include multiple validation steps integrated within the study design: technical replication of SomaScan results with ELISA; validation of outcome association in two large, independent, diverse cohorts, CHS and C-STRIDE; and confirmation of the ANG-TIE pathway score in external bulk and single-cell transcriptomic data sets. Moreover, the transcriptomic score generated in this study can potentially be used to assess total ANG-TIE pathway activation in models of kidney disease and in clinical studies using human kidney samples. Finally, the 154 genes that make up the ANG-TIE score, including ANGPT2, could be tested as predictive and pharmacodynamic biomarkers for ANG-TIE–associated therapeutics.
Due to the limited sample size of discovery group A and the large number of proteins measured, the association of circulating proteins with DKD progression was significant only at the nominal P value threshold, with statistical significance lost after correction for multiple testing. This is a frequently encountered challenge in translational and preclinical research with small sample sizes and high-dimensional data analysis (58). However, these statistical concerns were mitigated by the use of a multiple validation design. The verification of the initial discovery in independent and external validation cohorts using different technologies reinforced the reliability of the discovery and reduced the likelihood of false-positive findings. Further limitations include missing clinical data in one or more of the three cohorts. For example, baseline albuminuria was not captured in the CHS cohort. Finally, the definitions of outcomes differed slightly between cohorts. However, the association of plasma ANGPT2 with outcomes remained robust across all three.
In summary, with use of an unbiased proteomic approach, circulating ANGPT2 was identified as a putative DKD prognostic biomarker, cross-validated in multiple cohorts. Integrative bioinformatics analysis revealed an underlying mechanistic framework that connects plasma ANGPT2 to glomerular ANG-TIE signaling and disease progression in patients with DKD. These data further support the conclusion that glomerular EC ANG-TIE signaling is critical for DKD progression and that ANG-TIE signaling–targeted therapies, currently being tested in other conditions, could prevent or ameliorate DKD progression.
This article contains supplementary material online at https://doi.org/10.2337/figshare.20736553.
J.L. and V.N. are co–first authors.
Members of the Kidney Precision Medicine Project and Michigan Translational Core C-PROBE Investigator Group can be found in supplementary material.
Article Information
Acknowledgments. The authors thank Drs. Yuee Wang and Virginia Vega-Warner for their technical support. A full list of principal CHS investigators and institutions can be found at CHS-NHLBI.org. Lists of contributors to KPMP and C-PROBE are included in the supplementary material.
Funding. This work was supported by a grant from the University of Michigan Health System and Peking University Health Sciences Center Joint Institute for Translational and Clinical Research (BMU2017JI001). J.L. was supported by the China Scholarship Council (201906370288) while visiting the University of Michigan. M.A.V. is supported through funds from the National Institutes of Health (5UH3DK114870-05). This study was also supported, in part, by funding from the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) through the George M. O’Brien Michigan Kidney Translational Core Center, grant 2P30-DK-081943, the Integrated Systems Biology Approach to Diabetic Microvascular Complications grant, R24DK082841, P30DK89503, and JDRF Center for Excellence (5-COE-2019-861-S-B). The KPMP, UH3-DK-114907, is a multi-year project funded by NIDDK with the purpose of understanding and finding new ways to treat CKD and acute kidney injury. (See Supplementary Acknowledgment for consortium details.) C-STRIDE is supported by a grant from China International Medical Foundation–Renal Anemia Fund and grants from the National Natural Science Foundation of China (nos. 82070748, 82090020, and 82090021). CHS is supported by National Heart, Lung, and Blood Institute contracts HHSN268201200036C, HHSN268200800007C, HHSN268201800001C, N01HC55222, N01HC85079, N01HC85080, N01HC85081, N01HC85082, N01HC85083, N01HC85086, and 75N92021D00006 and grants U01HL080295 and U01HL130114, with additional contribution from the National Institute of Neurological Disorders and Stroke. Additional support was provided by the National Institute on Aging (grant R01AG023629). M.K. reports grants from National Institutes of Health (NIH)/National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) in support of this manuscript. The authors report grants and contracts outside the submitted work through the University of Michigan with NIH and JDRF. S.E.R. reports grants from NIH/NIDDK in support of this manuscript.
Duality of Interest. M.K. reports grants and contracts outside the submitted work through the University of Michigan with Chan Zuckerberg Initiative, AstraZeneca, Novo Nordisk, Eli Lilly, Gilead, Goldfinch Bio, Janssen, Boehringer Ingelheim, Moderna, European Union Innovative Medicine Initiative, Certa, Chinook, amfAR, Angion, RenalytixAI, Travere Therapeutics, Regeneron, and Ionis Pharmaceuticals and consulting fees through the University of Michigan from Astellas Pharma, Poxel, Janssen, and UCB. In addition, M.A.V., W.J., and V.N. have a patent (PCT/EP2014/073413 “Biomarkers and methods for progression prediction for chronic kidney disease”) licensed. S.E.R. reports grants and funding through the Joslin Diabetes Center from Bayer and AstraZeneca. No other potential conflicts of interest relevant to this article were reported.
Author Contributions. J.L. and V.N. led the manuscript development and analysis, generated figures, and wrote the manuscript. Y.-y.Z. and D.-y.C. analyzed C-STRIDE data. C.L. and N.B. analyzed CHS data. D.F., F.E., and E.C.T. were involved in data generation and quality control. K.B., S.S., Z.B., and J.J.H. recruited participants for the C-PROBE cohort. L.S. wrote and edited the manuscript. S.E.R., J.R.S., M.A.V., and S.S.W. contributed to the generation of and access to KPMP data. M.B., S.P., and F.C.B. led the C-PROBE study, enabling generation of and access to C-PROBE data, and revised the manuscript. I.D.B., M.C., M.K., and W.J. provided study oversight and guidance on study design. V.N. and W.J. led the analytical design. W.J. led the data generation, quality control, and study design and wrote and revised the manuscript. All of the authors reviewed and edited the manuscript. M.K. and W.J. are the guarantors of this work and, as such, had full access to all the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.