There is an increasing need for new biomarkers to improve prediction of chronic kidney disease (CKD) in individuals with type 2 diabetes (T2D). We aimed to identify blood-based epigenetic biomarkers associated with incident CKD and develop a methylation risk score (MRS) predicting CKD in individuals with newly diagnosed T2D. DNA methylation was analyzed epigenome wide in blood from 487 individuals with newly diagnosed T2D, of whom 88 developed CKD during an 11.5-year follow-up. Weighted Cox regression was used to associate methylation with incident CKD. Weighted logistic models and cross-validation (k = 5) were performed to test whether the MRS could predict CKD. Methylation at 37 sites was associated with CKD development based on a false discovery rate of <5% and absolute methylation differences of ≥5% between individuals with incident CKD and those free of CKD during follow-up. Notably, 15 genes annotated to these sites, e.g., TGFBI, SHISA3, and SLC43A2 (encoding LAT4), have been linked to CKD or related risk factors, including blood pressure, BMI, and estimated glomerular filtration rate. Using an MRS including 37 sites and cross-validation for prediction of CKD, we generated receiver operating characteristic (ROC) curves with an area under the curve (AUC) of 0.82 for the MRS and AUC of 0.87 for the combination of MRS and clinical factors. Importantly, ROC curves including the MRS had significantly better AUCs versus the one only including clinical factors (AUC = 0.72). The combined epigenetic biomarker had high accuracy in identifying individuals free of future CKD (negative predictive value of 94.6%). We discovered a high-performance epigenetic biomarker for predicting CKD, encouraging its potential role in precision medicine, risk stratification, and targeted prevention in T2D.
There is an increasing need for new biomarkers to improve the prediction and prevention of chronic kidney disease (CKD) in individuals with type 2 diabetes (T2D), a leading cause of morbidity and mortality in this population.
We investigated whether new blood-based epigenetic biomarkers predict incident CKD in individuals with newly diagnosed T2D.
We discovered a novel blood-based epigenetic biomarker, composed of a combination of a methylation risk score and clinical factors, capable of predicting CKD during an 11.5-year follow-up (area under the curve of 0.87, negative predictive value of 94.6%) in individuals with newly diagnosed T2D.
The epigenetic biomarker could provide a valuable tool for early risk stratification and prevention of CKD in individuals with newly diagnosed T2D, supporting its future use for precision medicine.
Introduction
Chronic kidney disease (CKD) is the leading cause of end-stage kidney failure worldwide (1). Its incidence is accelerating, and between 25 and 50% of patients diagnosed with diabetes are at risk for developing CKD (2). CKD contributes to the excess mortality and health care costs associated with diabetes, and it is expected to be the fifth leading cause of global mortality by 2040 (3). CKD is defined as reduced kidney function based on a glomerular filtration rate (GFR) <60 mL/min/1.73 m2 or the presence of kidney damage markers, including albuminuria (urinary albumin-to-creatinine ratio [UACR] >30 mg/g), tubular disorders, or other abnormalities detected by histology and imaging persisting for >3 months (1,4). Although chronic hyperglycemia has been established as a driving cause of diabetes complications, including CKD, individuals with similar glycemic control do not equally progress to CKD (1). In fact, CKD pathophysiology also involves glucose-independent mechanisms. Apart from known independent risk factors of CKD (e.g., hypertension, obesity), the causal biological pathways implicated in CKD are complex and not fully elucidated given the high heterogeneity of this disease (5). Moreover, the natural progression of CKD is insidious and may start at different blood glucose levels, including prior to diabetes onset (1,2). Renal function decline can also be asymptomatic and may follow different patterns, with or without a preceding hyperfiltration stage (2). Conventional biomarkers (i.e., estimated GFR [eGFR], UACR) are indexes of an already ongoing stage of renal function decline (6,7). There are a few proposed models for renal risk stratification of advanced CKD, kidney function decline, and CKD progression in people with diabetes (8–11) but not for the prevention and early stratification of CKD risk at diabetes diagnosis.
The unpredictable and heterogeneous characteristics of CKD onset in diabetes emphasize the need to identify clinically useful new biomarkers with excellent predictive capacity to identify individuals at high risk of developing CKD already at diabetes diagnosis. These biomarkers may also elucidate novel biological pathways implicated in CKD pathogenesis, which could become potential targets for treatment.
The importance of epigenetics in the pathogenesis of complex metabolic diseases, such as type 2 diabetes (T2D) and its complications, is well established (12). Some studies showed that epigenetics, including DNA methylation (DNAm) and methylation risk scores (MRSs), may be promising biomarkers for the prediction of T2D, vascular complications, and response to pharmacotherapy (13–17). The use of blood epigenetic biomarkers offers practical advantages, being noninvasive, fast, and cost-effective. Additionally, the availability of Food and Drug Administration–approved diagnostic DNAm tests in cancer further supports their clinical feasibility (18,19).
Previous studies have associated DNAm with kidney function in the general population (20–22), in individuals with type 1 diabetes (16,23,24), and in individuals with T2D (25–28), but these either were case-control studies or were predicting renal function decline by including individuals already diagnosed with CKD. Thus, our aim was to identify novel blood-based DNAm biomarkers that could predict the development of CKD in individuals with newly diagnosed T2D, alone or together with clinical factors, thereby offering a valuable new tool for CKD risk stratification and prevention.
Research Design and Methods
Study Population
A longitudinal cohort of individuals with newly diagnosed T2D was selected within the All New Diabetics in Scania (ANDIS) and All New Diabetics in Uppsala County (ANDiU) cohorts. ANDIS aims to recruit all individuals with newly diagnosed T2D in Scania, Sweden (15,29). Individuals are registered with ANDIS at their diagnosis of diabetes, and blood samples for DNA extraction and clinical variables are collected. Information from disease onset and prospective outcome measurements are available from the hospital clinical chemistry and regional health care databases. Medication data are available through the Regional Drug Registry. ANDIS was approved by Lund University’s ethical review board (584/2006, 2016/529, 2024_08012). ANDiU (https://www.andiu.se/) includes individuals with newly diagnosed diabetes who reside in the county of Uppsala, Sweden. ANDiU was approved by Uppsala University’s ethical review board (2011/155). Blood samples were collected at registration, and clinical data are available from the national diabetes registry and drug registry linked to ANDiU. Written informed consent was obtained from all participants.
Prospective Cohort for CKD Prediction in T2D
In this study, CKD was defined as persisting (>3 months) renal function decline based on two consecutive eGFR measurements <60 mL/min/1.73 m2, taking the second as time of CKD diagnosis (1) in line with previous genetic studies (30,31). The eGFR was calculated using the MDRD-4 equation (32), and further analyses were performed with the Chronic Kidney Disease Epidemiology Collaboration (CKD-EPI) equation (Supplementary Material) (4). Overall, 487 individuals with newly diagnosed T2D from ANDIS and ANDiU with available blood DNAm data taken at registration were included in The Prospective Cohort for CKD Prediction in T2D (Fig. 1A). Exclusion criteria were CKD diagnosis at registration; lack of baseline age, sex, height, glycated hemoglobin (HbA1c), or eGFR data or of at least two follow-up eGFR measurements; acute kidney failure defined by ICD-10 code N17 at baseline or follow-up; and age <46 years, since the youngest participant who developed CKD during follow-up was 46 years old at baseline. Among the 487 individuals with newly diagnosed T2D fulfilling the above criteria, 88 developed CKD and 399 did not over a mean follow-up period of 6.5 years (range 0.8–11.5 years) (Fig. 1B). The time to event was defined as the interval between blood collection for DNAm analysis and the second of two consecutive eGFR measurements <60 mL/min/1.73 m2 taken >3 months apart (range 0.8–9.8 years). Total follow-up time was calculated for individuals free of CKD from the time of blood collection for DNAm analysis until the last eGFR measure available. Table 1 presents baseline clinical and biochemical characteristics of The Prospective Cohort for CKD Prediction in T2D. Information on age, sex, BMI, HbA1c, plasma glucose, insulin, creatinine, urinary creatinine, urinary albumin, triglycerides, cholesterol, and systolic and diastolic blood pressure was obtained at registration in ANDIS or ANDiU or within a maximum period of 6 months from blood sampling for DNAm analysis.
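As an illustration of the outcome definition above, the following minimal sketch computes eGFR with the 4-variable MDRD equation and finds the CKD diagnosis date. It assumes the IDMS-traceable form of MDRD (coefficient 175) and creatinine in µmol/L; the study's exact implementation (reference 32) may differ. The function names and the reading of "persisting (>3 months)" as a gap of more than 90 days between consecutive measurements are illustrative assumptions.

```python
from datetime import date


def egfr_mdrd4(creatinine_umol_l, age_years, female, black=False):
    """4-variable MDRD eGFR in mL/min/1.73 m^2.

    Assumption: IDMS-traceable coefficient (175); the study may have
    used a different variant of the equation.
    """
    scr_mg_dl = creatinine_umol_l / 88.4  # convert umol/L to mg/dL
    egfr = 175.0 * scr_mg_dl ** -1.154 * age_years ** -0.203
    if female:
        egfr *= 0.742
    if black:
        egfr *= 1.212
    return egfr


def ckd_onset(measurements, threshold=60.0, min_gap_days=90):
    """Return the date of the second of two consecutive eGFR values below
    `threshold` taken more than `min_gap_days` apart, or None.

    `measurements` is an iterable of (date, egfr) pairs.
    """
    ordered = sorted(measurements)
    for (d1, e1), (d2, e2) in zip(ordered, ordered[1:]):
        if e1 < threshold and e2 < threshold and (d2 - d1).days > min_gap_days:
            return d2
    return None


# Hypothetical follow-up series for one participant.
series = [
    (date(2015, 1, 1), 72.0),
    (date(2016, 1, 1), 55.0),
    (date(2016, 6, 1), 52.0),
]
onset = ckd_onset(series)
```

Taking the second measurement as the diagnosis date matches the definition in the text; the time to event is then the interval from the DNAm blood sample to that date.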
HOMA2-IR, as a measure of insulin resistance, was calculated using the HOMA Calculator (https://www2.dtu.ox.ac.uk/homacalculator/). Medication information was extracted from the drug registry using ATC codes C10 (lipid-lowering medication) and C02–C03 and C08–C09 (antihypertensive medications) if individuals retrieved these medications from the pharmacy within 6 months from DNAm analysis blood sample collection.
Study design and statistical analyses for assessing the association between DNAm in blood and iCKD in The Prospective Cohort for CKD Prediction in T2D (n = 487). A: The overall selection flowchart of The Prospective Cohort for CKD Prediction in T2D from the individuals with newly diagnosed T2D from the ANDIS and the ANDiU cohorts (n = 487). The figure presents the number of participants excluded in each step due to different required criteria and available data. Among 796 individuals with DNAm data available, we excluded 95 based on lack of baseline age, sex, height, HbA1c, and eGFR measurements; 47 based on age <46 years (since the youngest participant who developed CKD during follow-up was 46 years old); 30 because of a lack of eGFR measurements at follow-up; 24 who failed DNA quality control (QC) analysis; 103 with a CKD diagnosis at registration in ANDIS or ANDiU; and 10 based on acute kidney failure (defined by ICD-10 code N17) at baseline or at follow-up. B: Clinical and biochemical follow-up. C: Eight weighted Cox regression models were run using epigenetic data from the EPIC array, covering ∼850,000 sites, and different covariates to identify DNAm sites associated with iCKD in The Prospective Cohort for CKD Prediction in T2D. The covariates included in each model are presented, and the number of DNAm sites based on P < 0.05 and q < 0.05 (FDR <5%) are shown for each model and expressed in boldface. The sample size used for each specific model is indicated. Model 2 is the same as model 1 but it is also adjusted for reference-based cell type composition. Created using BioRender.com. HOMA-IR, HOMA of insulin resistance; NK, natural killer.
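The exclusion counts reported in the flowchart caption can be checked with simple arithmetic. The short sketch below uses only the numbers stated in the caption; the dictionary labels are paraphrases.

```python
# Participants with DNAm data available before exclusions (from the caption).
with_dnam = 796

# Exclusion counts as reported in the caption.
exclusions = {
    "missing baseline age, sex, height, HbA1c, or eGFR": 95,
    "age < 46 years": 47,
    "no eGFR measurements at follow-up": 30,
    "failed DNA quality control": 24,
    "CKD diagnosis at registration": 103,
    "acute kidney failure (ICD-10 N17)": 10,
}

remaining = with_dnam - sum(exclusions.values())
print(remaining)
```

The result agrees with the reported cohort size of 487.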
Baseline clinical and biochemical characteristics of The Prospective Cohort for CKD Prediction in T2D (n = 487)
Variable | Individuals free of CKD (n = 399), prevalence (%) or mean (SD) | Individuals with iCKD (n = 88), prevalence (%) or mean (SD) | P | AUC |
---|---|---|---|---|
Time to event (years) | — | 4.3 (2.3) | — | — |
Follow-up (years) | 6.5 (2.2) | — | — | — |
Sex (male) | 237 (59.4) | 51 (57.9) | 0.804 | 0.51 |
Age (years) | 60.38 (8.9) | 67.69 (8.5) | 7.06E-11 | 0.72 |
BMI (kg/m2) | 31.41 (5.4) | 30.49 (4.4) | 0.128 | 0.54 |
Duration of diabetes at baseline (days)* | 52.35 (118.8) | 62.98 (93.3) | 0.015 | 0.58 |
HbA1c (mmol/mol) | 62.47 (17.6) | 60.40 (15.9) | 0.392 | 0.53 |
HbA1c (% NGSP) | 7.9 (1.61) | 7.7 (1.45) | ||
Baseline eGFR (mL/min/1.73 m2) | 96.79 (20.6) | 80.33 (16.4) | 2.59E-13 | 0.74 |
First eGFR <60 mL/min/1.73 m2 | — | 53.5 (6.5) | — | — |
Second eGFR <60 (mL/min/1.73 m2)** | — | 51.25 (9.3) | — | — |
UACR (mg/mmol) (n = 461) | 1.8 (5.2) | 5.4 (22.2) | 0.424 | 0.53 |
Creatinine (µmol/L) | 67.87 (13.9) | 77.49 (14.6) | 1.08E-07 | 0.68 |
Urinary creatinine (mmol/L) | 11 (5.4) | 9.9 (5.1) | 0.055 | 0.57 |
Urinary albumin (mg/L) | 18.66 (73.9) | 37.78 (128.1) | 0.584 | 0.52 |
HOMA2 of insulin resistance (n = 476) | 3.9 (2.9) | 3.6 (1.3) | 0.868 | 0.49 |
Triglycerides (mmol/L) (n = 472) | 2.4 (3.5) | 2.1 (1.2) | 0.993 | 0.50 |
Total cholesterol (mmol/L) (n = 477) | 5.3 (1.2) | 5.1 (1.3) | 0.152 | 0.55 |
HDL cholesterol (mmol/L) (n = 477) | 1.2 (0.3) | 1.1 (0.3) | 0.321 | 0.53 |
LDL cholesterol (mmol/L) (n = 477) | 3.4 (1.05) | 3.3 (1.1) | 0.067 | 0.56 |
Systolic blood pressure (mmHg) (n = 355) | 137.8 (17.2) | 142.5 (14.7) | 0.040 | 0.60 |
Diastolic blood pressure (mmHg) (n = 355) | 81.1 (10.5) | 81.1 (10.0) | 0.098 | 0.52 |
Antihypertensives at baseline*** | 222 (55.6) | 65 (73.9) | 0.002 | 0.59 |
Statins at baseline*** | 172 (43.1) | 42 (47.7) | 0.430 | 0.53 |
Previous CVD^ | 32 (8) | 14 (15.9) | 0.022 | 0.55 |
Smoking§ | 0.77 (0.1) | 0.77 (0.09) | 0.566 | 0.52 |
The Prospective Cohort for CKD Prediction in T2D was selected from individuals with newly diagnosed T2D from the ANDIS and the ANDiU studies (n = 487 unless otherwise indicated). P values were calculated using the Mann-Whitney U test for continuous variables and Pearson χ2 test for binary variables. P < 0.05 indicates a statistically significant difference between groups (individuals with iCKD [n = 88] vs. individuals free of CKD [n = 399]). Significant P values are shown in boldface. All variables refer to baseline status (i.e., within 6 months from the collection of the DNA sample for DNAm), except for the first and second eGFR measurements, which were collected during the 11.5-year follow-up.
*Days from the diabetes diagnosis to the collection of the DNA sample for DNAm analysis.
**Second altered eGFR measured with an interval of >3 months from the first altered measure found.
***Considering individuals who bought the medication at least once within 6 months before or after the baseline.
^Represents all participants with reported CVD (i.e., stroke, coronary artery disease) before or at baseline according to ICD-10 codes.
§Mean methylation value for the cg05575921 site on the AHRR gene.
Presence of cardiovascular disease (CVD) before or at registration in ANDIS or ANDiU was defined as having had a first coronary event (defined by ICD-10 codes I20, I21, I24, and I25) or stroke (defined by ICD-10 codes I60–I64). Although ANDIS and ANDiU include individuals with newly diagnosed T2D, we defined a duration-of-diabetes variable because a short time may have passed between T2D diagnosis and blood collection for DNAm analysis.
DNAm Analysis
DNA was extracted from blood using a Gentra Puregene Blood Kit (QIAGEN, Hilden, Germany). DNA concentration and purity were determined using NanoDrop (NanoDrop Technologies, Wilmington, DE). Bisulfite conversion of DNA (500 or 1,000 ng) was performed using the EZ DNA Methylation Kit (Zymo Research Corporation, Irvine, CA). Next, samples were randomized across chips and hybridized to BeadChips following the Infinium HD Assay (Illumina, San Diego, CA) protocol. The DNAm analysis was performed at Lund University using the Infinium MethylationEPIC BeadChip array (Illumina), covering 853,307 sites. The bioinformatics pipeline is described in the Supplementary Material. Since blood contains various cell types, a reference-based method was applied to correct for potential effects of cellular heterogeneity (33). This deconvolution technique allows one to estimate relative proportions of blood cell types using blood-derived DNAm signatures of CD8T, CD4T, natural killer cells, B cells, monocytes, and neutrophils, which were subsequently included as covariates in the respective regression model. According to these estimations, small differences in blood cell type composition between individuals free of CKD and individuals with incident CKD (iCKD) were present (Supplementary Table 1). DNAm of 816,888 probes remained after filtering, and for easier biological interpretation, β-values are presented in tables and figures.
Statistical Analysis
R 4.2.2 software was used for data analyses. Differences in clinical characteristics between individuals with and without iCKD were assessed using Mann-Whitney U tests and Pearson χ2 tests for continuous and categorical variables, respectively. Data are presented as mean (SD) or counts and percentages unless stated otherwise. Binary logistic regression models were performed to assess associations between clinical variables and iCKD. Receiver operating characteristic (ROC) curves were plotted, and the area under the curve (AUC) was calculated for the clinical variables presented in Table 1.
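To make the link between the Mann-Whitney U test and the AUC concrete, the sketch below computes the AUC of a single continuous variable as the probability that a randomly chosen case has a higher value than a randomly chosen control, with ties counted as one-half. It is a pure-Python illustration with hypothetical values, not the R code used in the study.

```python
def auc_single_variable(case_values, control_values):
    """AUC = P(case > control) + 0.5 * P(case == control).

    Equivalent to the Mann-Whitney U statistic divided by the number of
    case-control pairs.
    """
    wins = 0.0
    for c in case_values:
        for k in control_values:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))


# Hypothetical baseline ages (years) for illustration only.
cases = [70, 65, 58]
controls = [60, 55, 50, 66]
auc = auc_single_variable(cases, controls)
```

For a variable that is lower in cases (for example, baseline eGFR), the same calculation gives a value below 0.5, and discrimination is usually reported as one minus that value.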
To account for any bias potentially introduced by sampling or censoring, sampling weights (Wsamp) and censoring weights (Wcens) were calculated using inverse probability, as further described in the Supplementary Material. The final weights, calculated as the product of Wcens × Wsamp, were included in all logistic regression models.
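The weighting scheme can be sketched as follows. The probabilities of being sampled and of remaining uncensored are taken here as given inputs; how they were estimated is described in the Supplementary Material and is not reproduced, so the function names and example values are hypothetical.

```python
def inverse_probability_weight(probability):
    """Weight an observation by the inverse of its probability."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return 1.0 / probability


def final_weight(p_sampled, p_uncensored):
    """Final weight = Wcens x Wsamp, as described in the text."""
    w_samp = inverse_probability_weight(p_sampled)
    w_cens = inverse_probability_weight(p_uncensored)
    return w_cens * w_samp


# Hypothetical participant: 50% chance of being sampled,
# 80% chance of remaining uncensored.
w = final_weight(p_sampled=0.5, p_uncensored=0.8)
```

Participants who were unlikely to be sampled or to remain under observation receive larger weights, so they stand in for similar individuals who are missing from the analysis.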
Association Between DNAm and iCKD in T2D
To test whether DNAm in blood was associated with iCKD in The Prospective Cohort for CKD Prediction in T2D, eight different weighted Cox regression models adjusted for various confounders and cell type composition were run (Fig. 1C). Model 1 was adjusted for age, sex, BMI, and eGFR at baseline. Since whole blood contains multiple cell types, model 2 was further adjusted for potential effects of cellular heterogeneity (33). To increase the sensitivity of our study, six additional weighted Cox models were run, adjusting for additional covariates including UACR, HbA1c, lipid-lowering and antihypertensive drugs, smoking based on a known epigenetic biomarker (methylation of cg05575921 in AHRR) (34), total cholesterol, HDL and LDL cholesterol, triglycerides, HOMA2-IR, diabetes duration, and CVD before or at registration (Fig. 1C). No missing data were reported for models 1–4 and 6 (n = 487). The resulting P values were adjusted for multiple testing using a false discovery rate (FDR) <5% (i.e., q < 0.05, Benjamini-Hochberg method). Moreover, we tested whether DNAm sites associated with iCKD in model 1 were also identified in models 2–8. Hazard ratios (HR) with 95% CIs and P values are presented. Gene Ontology analysis was performed to discover enriched biological processes of DNAm associated with iCKD, as described in the Supplementary Material.
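The Benjamini-Hochberg adjustment mentioned above can be illustrated with a short pure-Python sketch. The study used R; this version is only meant to show how q values are derived from raw P values, and the input values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values (q values)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value to the smallest so that each q value
    # is no larger than the q values of less significant tests.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


# Hypothetical P values for four DNAm sites.
p = [0.01, 0.04, 0.03, 0.20]
q = benjamini_hochberg(p)
significant = [qi < 0.05 for qi in q]
```

A site is declared significant at FDR <5% when its q value is below 0.05.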
Prediction of iCKD Using MRS
A weighted MRS was generated using selected DNAm sites based on the following criteria: DNAm sites associated with iCKD in weighted Cox regression models (q < 0.05 in model 1 and P < 0.05 in all other models) and with absolute differences in DNAm of ≥5% between individuals with and without iCKD. This arbitrary cutoff was used to select methylation sites that may be more robust and more likely to be validated (35). Additionally, the MethyltoSNP R package was used to exclude single nucleotide polymorphism (SNP)–like patterns among the methylation sites with DNAm absolute differences ≥5%. The MRS was calculated as the sum of the methylation levels (M) at each selected DNAm site multiplied by its effect size (i.e., lnHR based on model 1a) as follows: $\mathrm{MRS} = (M_{\mathrm{CpG}_1} \times \ln \mathrm{HR}_{\mathrm{CpG}_1}) + (M_{\mathrm{CpG}_2} \times \ln \mathrm{HR}_{\mathrm{CpG}_2}) + \dots + (M_{\mathrm{CpG}_n} \times \ln \mathrm{HR}_{\mathrm{CpG}_n})$, with n being the total number of DNAm sites included in the MRS (14,15,36,37).
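The score definition above translates directly into a weighted sum. The sketch below is a minimal illustration; the site identifiers, methylation levels, and hazard ratios are hypothetical placeholders, not values from the study.

```python
import math


def methylation_risk_score(methylation, hazard_ratios):
    """Sum of methylation level x ln(HR) over the selected DNAm sites.

    `methylation` maps site ID -> methylation level for one individual;
    `hazard_ratios` maps site ID -> HR from the Cox model.
    """
    return sum(methylation[site] * math.log(hr)
               for site, hr in hazard_ratios.items())


# Hypothetical example with three made-up sites.
hr_by_site = {"site_a": 1.20, "site_b": 0.85, "site_c": 1.05}
levels = {"site_a": 0.62, "site_b": 0.30, "site_c": 0.91}
mrs = methylation_risk_score(levels, hr_by_site)
```

Sites with HR > 1 contribute positively and sites with HR < 1 contribute negatively, so a higher score reflects a methylation profile closer to that of individuals who later developed CKD.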
To calculate the predicted probability of iCKD using the MRS, we used weighted logistic regression (weights = Wcens × Wsamp) and 5-fold cross-validation. Data were split into five parts stratifying on sex and iCKD status. Iterating through the five folds, we used 80% of the data for training and the remaining 20% for prediction. We fitted weighted logistic models using data in the training set with iCKD by the end of the follow-up as the outcome and the MRS as the main predictor, adjusting for clinical factors. The following model was used: iCKD ∼ MRS + (age, sex, BMI, eGFR, diabetes duration, HbA1c, and lipid-lowering and antihypertensive medications). This fitted model was then used to predict the probability of iCKD using data in the test set. The procedure was repeated for each of the five folds. Additional models were used and are reported in the Supplementary Material. Predicted risks were obtained for each participant. Next, ROC curves were generated using iCKD as the outcome and predictions evaluated based on AUCs using the pROC R package. Net reclassification index (NRI) and integrated discrimination index (IDI) information were extracted using the jsmodule R package.
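The data-splitting step of this procedure can be sketched as below. This is an illustrative pure-Python version of stratified 5-fold splitting on sex and iCKD status; the record layout, function names, and random seed are assumptions, and the weighted logistic model fit on each training set is not shown.

```python
import random
from collections import defaultdict


def stratified_folds(records, k=5, seed=0):
    """Assign record indices to k folds, balancing (sex, ickd) strata."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for idx, rec in enumerate(records):
        strata[(rec["sex"], rec["ickd"])].append(idx)
    folds = [[] for _ in range(k)]
    for members in strata.values():
        rng.shuffle(members)
        # Deal the members of each stratum round-robin across the folds.
        for pos, idx in enumerate(members):
            folds[pos % k].append(idx)
    return folds


def train_test_splits(folds):
    """Yield (train, test) index lists, using each fold once as the test set."""
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Hypothetical cohort of 100 records: sex alternates, every fifth is a case.
records = [{"sex": i % 2, "ickd": int(i % 5 == 0)} for i in range(100)]
folds = stratified_folds(records)
splits = list(train_test_splits(folds))
```

Because every participant appears in exactly one test fold, each receives one out-of-sample predicted risk, and these predictions are what the ROC curves are built from.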
Data and Resource Availability
The human blood DNAm data generated and used in this study (EPIC DNAm data, accession no. LUDC2023.12.1) are available upon request through the Lund University Diabetes Centre repository portal (https://www.ludc.lu.se/resources/ludc-repository). Individual-level data and resources from ANDIS and ANDiU are not publicly available due to ethical and legal restrictions related to the Personal Data Act and the European Union’s General Data Protection Regulation and Data Protection Act.
Results
DNAm Associated With iCKD in Individuals With Newly Diagnosed T2D
We explored whether DNAm in blood was associated with iCKD using The Prospective Cohort for CKD Prediction in T2D, including 487 individuals with newly diagnosed T2D free of CKD at baseline, of whom 88 developed CKD during follow-up. At baseline, individuals who developed CKD were older, had a longer duration of diabetes, lower eGFR values, higher creatinine levels, and higher systolic blood pressure; used more antihypertensives; and had a higher prevalence of CVD compared with individuals free of CKD at follow-up (P < 0.05) (Table 1). The mean follow-up time was 6.5 years for individuals free of CKD, and the mean time to event was 4.3 years for individuals with iCKD (Table 1). Age and baseline eGFR were associated with iCKD, with AUCs >0.70, while the other clinical phenotypes generated poorer AUCs (0.51–0.68) (Table 1).
To assess whether DNAm in blood from individuals with newly diagnosed T2D was associated with future risk of developing CKD, we performed eight weighted Cox regression models adjusted for potential confounders (Fig. 1C). We found 56,411 DNAm sites associated with iCKD after adjusting for age, sex, BMI, and baseline eGFR (model 1, q < 0.05). Among those, 35,948 DNAm sites were also associated with iCKD in all the other models (P = 3.75e-34–3.45e-3) (Supplementary Table 2 and Fig. 1C). The large number of DNAm sites identified in these regression models suggested statistical inflation (λ = 1.52). Therefore, we proceeded by selecting 2,965 of the 35,948 sites based on an arbitrary cutoff of absolute differences in methylation ≥2% between individuals with and without iCKD to focus on DNAm sites with larger effect sizes, as they may be more robust and less influenced by inflation (Supplementary Table 2). Notably, the majority (78%) of these 2,965 sites were hypermethylated in individuals with iCKD versus individuals free of CKD (Supplementary Table 2 and Fig. 2A). A Gene Ontology analysis of these 2,965 DNAm sites identified 11 enriched biological processes (FDR <5%), including regulation of transcription by RNA polymerase II, cell adhesion, and cell-cell signaling (Supplementary Table 3).
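The inflation factor λ referred to above is commonly computed as the median of the observed 1-df chi-square statistics divided by the median expected under the null (about 0.455). The sketch below follows that standard definition; whether the study computed λ in exactly this way is an assumption.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(p_values):
    """Genomic inflation factor from two-sided P values (each 0 < p <= 1)."""
    nd = NormalDist()
    # A two-sided P value corresponds to z = inv_cdf(p / 2); z**2 is the
    # equivalent 1-df chi-square statistic.
    chi2 = [nd.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN


# Uniformly spread P values (no inflation) should give lambda close to 1.
null_p = [(i + 0.5) / 1000 for i in range(1000)]
lam_null = genomic_inflation(null_p)

# Shifting P values toward zero inflates lambda above 1.
lam_inflated = genomic_inflation([p ** 2 for p in null_p])
```

Values well above 1, such as the reported 1.52, indicate that test statistics are systematically larger than expected by chance, which motivates the additional effect-size filter.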
DNAm is associated with incident CKD in The Prospective Cohort for CKD Prediction in T2D (n = 487) during an 11.5-year follow-up. A: Volcano plot for the associations between DNAm at baseline and incident CKD at 11.5 years of follow-up in The Prospective Cohort for CKD Prediction in T2D (n = 487) adjusted for age, sex, BMI, and baseline eGFR (model 1), showing 56,411 associated methylation sites with q < 0.05. Red dashed horizontal lines indicate methylation sites with q < 0.05 (FDR <5%), and red dashed vertical lines indicate the cutoff effect size (i.e., absolute difference in DNAm) ≥2%. To the left of the first vertical red line are the methylation sites that are hypomethylated in individuals with iCKD, and to the right of the second vertical red line are the methylation sites that are hypermethylated in individuals with iCKD compared with those free of CKD at follow-up. The 2,965 methylation sites with effect size ≥2% and q < 0.05 are marked in purple. Additionally, the 37 methylation sites with effect size ≥5% DNAm and q < 0.05 are marked in red. Certain CpG sites may show larger effect sizes due to biological reasons but also have high interindividual variability, which could reduce the statistical significance. B: Bar plot with the 37 DNAm sites associated with iCKD and with absolute differences in DNAm ≥5% between individuals with iCKD (in red) and free of CKD at follow-up (in green) (q < 0.05). Error bars indicate ±2 SE. *P < 0.01, **P < 0.001 in model 1. C: Box plot with the MRS distribution in individuals with iCKD and individuals free of CKD at follow-up. The MRS was generated using data from the 37 DNAm sites associated with iCKD and with absolute differences in DNAm ≥5% between individuals with and without iCKD. The P value was determined using independent t tests of the MRS distribution across the categories of CKD.
Prediction of iCKD Using an MRS in Individuals With T2D
Next, we examined whether we could develop a blood-based epigenetic biomarker predicting the development of CKD in individuals with newly diagnosed T2D using DNAm data in The Prospective Cohort for CKD Prediction in T2D. To select the most robust DNAm sites associated with iCKD, we selected 37 sites showing absolute differences in methylation ≥5% between individuals with iCKD and those free of CKD at follow-up and q < 0.05 (Fig. 2A and B and Supplementary Table 4). We then generated a weighted MRS including these 37 methylation sites, which was higher in individuals with iCKD compared with those free of CKD (P < 4.6e-24) (Fig. 2C). The MRS was associated with iCKD, with an HR of 1.10 (95% CI 1.08–1.12), suggesting that for each MRS unit increase, the hazard of developing CKD increases by ∼10% after adjusting for clinical factors at baseline (i.e., age, sex, BMI, eGFR, diabetes duration, HbA1c, lipid-lowering and antihypertensive medications).
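Because hazard ratios act multiplicatively, the per-unit HR of 1.10 can be scaled to larger differences in the MRS. The sketch below illustrates this under the proportional hazards assumption of the Cox model; the 5-unit difference is a hypothetical example, not a value from the study.

```python
def hazard_multiplier(hr_per_unit, delta_units):
    """Relative hazard for a difference of `delta_units` in the predictor."""
    return hr_per_unit ** delta_units


per_unit = hazard_multiplier(1.10, 1)    # about a 10% higher hazard
five_units = hazard_multiplier(1.10, 5)  # hypothetical 5-unit difference
```

Under this model, a 5-unit difference in the MRS would correspond to a hazard roughly 1.6 times higher, not 1.5 times, since the increments compound.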
To further assess whether the MRS could discriminate between individuals with iCKD and those free of CKD at follow-up in The Prospective Cohort for CKD Prediction in T2D, we ran 5-fold cross-validation of logistic models fitted with censoring and sampling weights. We fitted three models, with the following independent variables: the MRS alone; clinical factors at baseline alone (age, sex, BMI, eGFR, diabetes duration, HbA1c, and lipid-lowering and antihypertensive medications); and clinical factors in combination with the MRS (the combined epigenetic biomarker). This allowed us to test how the combined epigenetic biomarker predicts iCKD compared with the MRS alone.
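The cross-validation procedure can be illustrated with a small sketch: each individual's predicted risk comes from a weighted logistic model fitted on the other folds only. This is an illustration under stated assumptions, not the analysis code of the study: it uses a single predictor, plain gradient descent, toy data, and unit observation weights in place of the censoring and sampling weights.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_weighted_logistic(xs, ys, ws, lr=0.5, epochs=2000):
    """Fit intercept and slope by gradient descent on the weighted log-loss."""
    b0 = b1 = 0.0
    total = sum(ws)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for x, y, w in zip(xs, ys, ws):
            err = w * (sigmoid(b0 + b1 * x) - y)
            g0 += err
            g1 += err * x
        b0 -= lr * g0 / total
        b1 -= lr * g1 / total
    return b0, b1


def cross_validated_risks(xs, ys, ws, k=5, seed=0):
    """Out-of-fold predicted risks: every individual is scored by a model
    that never saw that individual during fitting."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    risks = [None] * len(xs)
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        b0, b1 = fit_weighted_logistic(
            [xs[i] for i in train], [ys[i] for i in train], [ws[i] for i in train]
        )
        for i in held_out:
            risks[i] = sigmoid(b0 + b1 * xs[i])
    return risks


# Toy data: a score that tends to be higher in cases (1) than non-cases (0).
xs = [i / 10 for i in range(20)]
ys = [0] * 10 + [1] * 10
ys[8], ys[11] = 1, 0  # avoid perfect separation
ws = [1.0] * 20       # placeholder for censoring and sampling weights
risks = cross_validated_risks(xs, ys, ws, k=5)
```

The out-of-fold risks returned here play the role of the "predicted risks of each individual" that feed the ROC analysis; fitting and scoring on the same individuals would instead give optimistic estimates.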
Predicted risks of each individual were obtained, and ROC curves were generated with iCKD as the observed outcome. The ROC curves showed AUCs of 0.82 (95% CI 0.78–0.87) for the MRS, 0.72 (95% CI 0.66–0.77) for clinical factors only, and 0.87 (95% CI 0.83–0.92) for the combination of the MRS and clinical factors (Fig. 3A). Importantly, when comparing these three ROC curves, the MRS predicted iCKD in individuals with newly diagnosed T2D significantly better than clinical factors alone (AUC 0.82 vs. 0.72, P = 0.003). Notably, the combined epigenetic biomarker was significantly better at predicting iCKD than both the clinical factors alone (AUC 0.87 vs. 0.72, P = 5.4e-10) and the MRS alone (AUC 0.87 vs. 0.82, P = 0.002) (Fig. 3A). The predictive capacity of our combined epigenetic biomarker did not change when adding smoking status and CVD at baseline to the clinical factors considered or when using the CKD-EPI equation instead of the MDRD-4 equation to calculate eGFR (Supplementary Material). The combined epigenetic biomarker significantly improved risk discrimination and reclassification compared with the clinical factors alone (NRI 1.17 [P < 0.001], IDI 0.28 [P < 0.001]), as further described in the Supplementary Material. We then generated precision-recall curves, which are commonly used for unbalanced data, given the 18.1% prevalence of iCKD in our cohort (100 × [88/487]). Here, both precision (true positives/[true positives + false positives]) and recall (or sensitivity, i.e., true positives/[true positives + false negatives]) were significantly better for the combined epigenetic biomarker compared with the MRS or clinical factors alone (P = 0.0001 and 2.7e-05, respectively) (Fig. 3B).
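For readers less familiar with these metrics, the sketch below shows one common way to compute them from predicted risks and observed outcomes. The AUC is computed as the proportion of case/non-case pairs in which the case has the higher predicted risk (ties count one half), and precision and recall follow the definitions given above. The labels, scores, and threshold are toy values, and the function names are hypothetical; the sketch does not reproduce the confidence intervals or the tests used to compare curves.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random case outranks a random non-case."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


def precision_recall(labels, scores, threshold):
    """Precision and recall when risks >= threshold are called positive."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    precision = tp / (tp + fp) if tp + fp else float("nan")
    recall = tp / (tp + fn) if tp + fn else float("nan")
    return precision, recall


# Toy outcomes (1 = iCKD) and predicted risks.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.6, 0.7, 0.4, 0.3, 0.1]

auc = roc_auc(labels, scores)
precision, recall = precision_recall(labels, scores, threshold=0.5)
```

A precision-recall curve is obtained by repeating the second function over a grid of thresholds; unlike the ROC curve, it depends on the proportion of cases, which is why it is informative for unbalanced outcomes.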
Prediction of iCKD in The Prospective Cohort for CKD Prediction in T2D (n = 487) during an 11.5-year follow-up using a combined epigenetic biomarker. A: ROC curves were generated with iCKD as the outcome using the predicted risks of each individual obtained from cross-validation (k = 5) for the MRS, for the baseline clinical factors considered in model 5 (age, sex, BMI, eGFR, diabetes duration, HbA1c, and lipid-lowering and antihypertensive medications), and for the combination of both MRS and clinical factors (combined epigenetic biomarker). Based on the AUC of the ROC curve, the MRS (AUC 0.82) could better predict iCKD in individuals with T2D than clinical factors alone (AUC 0.72, P = 0.003). Additionally, the combined MRS and clinical factors (AUC 0.87) was better at predicting iCKD compared with clinical factors alone (AUC 0.72, P = 5.4e-10) and MRS alone (AUC 0.82, P = 0.002). Adding smoking status and CVD at baseline to the clinical factors or using the CKD-EPI equation to calculate the eGFR did not significantly change the prediction of our combined epigenetic biomarker (Supplementary Material). B: Precision-recall curves (PRC) showing that MRS and the combined epigenetic biomarker were better compared with the clinical factors alone (P = 0.0001 and 2.7e-05, respectively). PRCs are commonly used for unbalanced data, considering the prevalence of CKD in our sample of 18.1%. C: The figure displays the optimal cutoff point (i.e., 0.113) that has been calculated using the ROC curve for the combined epigenetic biomarker (AUC 0.87), with the Youden index giving a sensitivity (Se) and a specificity (Sp) of 0.784 and 0.842, respectively. D: The data were split into two groups based on the optimal cutoff point (0.113): those with low predicted values of the combined epigenetic biomarker (n = 355) and those with high values (n = 132). 
Kaplan-Meier CKD-free survival curves for the combination of the MRS and clinical factors show that the survival proportion was significantly higher for individuals with low values compared with those with high values (log-rank P < 2e-16). E: A phenotype wheel of the 21 protein-coding genes annotated to the 37 methylation sites associated with iCKD and included in the MRS according to the following PubMed search: gene AND chronic kidney disease, gene AND eGFR, gene AND blood pressure, gene AND T2D, gene AND BMI. Thirteen of the 21 genes (62%) were previously associated with at least one of the phenotypes considered, and 8 (38%) were not previously associated with these phenotypes. F: GWAS Catalog trait wheel of the 21 protein-coding genes annotated to the 37 methylation sites. The figure shows the CKD-related traits to which SNPs, annotated to the same 21 genes, were associated based on the GWAS Catalog. Main traits considered were serum creatinine levels, eGFR, BMI, blood pressure, and T2D (detailed description of the traits in Supplementary Table 6).
Since the combination of the MRS and clinical factors had the best predictive capacity based on the highest AUC (0.87), we further explored the use of this combined epigenetic biomarker by defining a cutoff point using the Youden index (the point on the ROC curve that has the largest vertical distance from the diagonal or chance line). Here, the optimal cutoff point calculated was 0.113, with a sensitivity of 0.784 and specificity of 0.842 (Fig. 3C). When further considering the prevalence of CKD in our cohort, this epigenetic biomarker demonstrated high accuracy in identifying individuals free of CKD among those with newly diagnosed T2D, with a negative predictive value of 94.6%. However, the positive predictive value was moderate (52.3%), meaning that nearly half of the positive results were false positives and suggesting that a positive result may not be fully reliable in discriminating people with T2D who will develop CKD.
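The predictive values above follow from the reported sensitivity, specificity, and cohort prevalence through Bayes' rule, and the short sketch below reproduces that arithmetic. Only the three input numbers come from the text; the function names are hypothetical.

```python
def youden_index(sensitivity, specificity):
    """Vertical distance of the ROC point from the chance line."""
    return sensitivity + specificity - 1.0


def predictive_values(sensitivity, specificity, prevalence):
    """Positive and negative predictive values via Bayes' rule."""
    tp = sensitivity * prevalence
    fp = (1.0 - specificity) * (1.0 - prevalence)
    tn = specificity * (1.0 - prevalence)
    fn = (1.0 - sensitivity) * prevalence
    return tp / (tp + fp), tn / (tn + fn)


prevalence = 88 / 487  # iCKD cases / cohort size, about 18.1%
j = youden_index(0.784, 0.842)
ppv, npv = predictive_values(0.784, 0.842, prevalence)
```

With these inputs the sketch returns a Youden index of 0.626, a positive predictive value of about 52.3%, and a negative predictive value of about 94.6%, matching the values reported. Because both predictive values depend on prevalence, they would differ in a population with a different CKD incidence.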
Finally, we used the defined cutoff point of 0.113 to split the cohort into two groups: one with individuals characterized by low values of the combined epigenetic biomarker (n = 355) and one with individuals with higher values (n = 132). Using Kaplan-Meier survival analysis for future development of CKD in individuals with T2D, we found that the survival proportion was significantly higher for individuals with low values of the biomarker versus those with high values (P < 2e-16) (Fig. 3D).
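The Kaplan-Meier estimate used for this comparison can be sketched in a few lines: at each time with at least one event, the running survival proportion is multiplied by the fraction of at-risk individuals who remained event free. The follow-up times and event indicators below are toy values, the function name is hypothetical, and the log-rank test used to compare the two groups is not reproduced.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct event time.

    times  : follow-up time for each individual
    events : 1 if CKD occurred at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group everyone who leaves the risk set at time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


# Toy follow-up (years) for five individuals; 0 marks censoring.
curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
```

Applying the function separately to the low-value and high-value groups would give the two curves that are compared in a survival plot.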
Biological Relevance of DNAm Sites Included in the MRS
Finally, we dissected the biological relevance of the DNAm sites included in the MRS and the combined epigenetic biomarker by performing a systematic PubMed search and using the Genome-Wide Association Study (GWAS) Catalog. These 37 sites are annotated to 26 unique genes, of which 21 are protein coding. Notably, four DNAm sites were annotated to SLC43A2, encoding LAT4, a neutral amino acid transporter highly expressed in tubule epithelia (38). According to the PubMed search, using gene symbols together with the search terms CKD, eGFR, blood pressure, T2D, or BMI, 13 of the 21 protein-coding genes (62%) were previously associated with at least 1 of these phenotypes, while 8 (38%) were not previously associated with these phenotypes. For example, TGFBI, CCR6, TEC, and CDCP1 have been associated with eGFR or CKD; TGFBI, KCNH6, and RPH3AL with T2D; ADAMTS16, ZNF385D, and SHISA3 with blood pressure; and CCR6, TGFBI, and TOX2 with BMI (Supplementary Table 5 and Fig. 3E). We then used the GWAS Catalog to test whether SNPs annotated to these 21 genes have been associated with any traits related to CKD, such as BMI, T2D, blood pressure, eGFR, or serum creatinine levels. We found that SNPs annotated to 15 of the 21 genes (71%) have been associated with these CKD-related traits (Fig. 3F). For example, SNPs annotated to SHISA3, DZIP1, SLC43A2, SLC9A4, and SKA2 have been associated with eGFR or creatinine levels in GWAS (Supplementary Table 6 and Fig. 3F).
Discussion
We discovered a promising blood-based epigenetic biomarker for the prediction of CKD development in individuals with newly diagnosed T2D. The combination of the MRS, which includes 37 DNAm sites, and common clinical factors measured at baseline (age, sex, BMI, eGFR, diabetes duration, HbA1c, and lipid-lowering and antihypertensive medications) demonstrated very good prediction performance, with an AUC of 0.87 after cross-validation, and significantly improved iCKD risk discrimination and reclassification (NRI = 1.17, IDI = 0.28) compared with clinical risk factors alone.
Our novel combined epigenetic biomarker had a negative predictive value of 94.6% when considering the prevalence of iCKD in the cohort and a cutoff of 0.113. Consequently, already at diabetes diagnosis, this epigenetic biomarker can discriminate with high accuracy individuals who will not develop CKD during an 11.5-year follow-up. This is useful information in a primary care setting, making it possible to better tailor clinical follow-up and treatment, saving costs for the health care system. Conversely, the positive predictive value was moderate (52.3%), so a positive result is not fully reliable in identifying individuals who will develop CKD. Therefore, additional evaluations could be necessary to characterize the effective risk of iCKD in individuals with a positive test result at T2D diagnosis. Together, these data encourage the clinical utility of blood-based epigenetic biomarkers together with clinical factors for prediction of CKD development in individuals with newly diagnosed T2D.
Currently, there is still a need for validated risk scores with high predictive performance of iCKD at T2D onset (4). Available proposed predictive models developed in the T2D population use different end points for CKD to predict the progression of CKD or kidney failure (4), including a wide heterogeneity of variables, making comparisons between studies complex (8–11,39). However, the most commonly included risk factors in the proposed models were age, sex, eGFR, and HbA1c (39), which are easy to obtain in primary care settings and were included in our combined epigenetic biomarker. As strategies to reduce development and progression of CKD are available, including the use of ACE inhibitors, angiotensin II receptor blockers, and sodium–glucose cotransporter 2 inhibitors (40), incorporation of the epigenetic biomarker as a precision medicine tool in the management of CKD in individuals with newly diagnosed T2D can help identify high-risk patients and guide personalized treatment interventions, potentially improving patient outcomes and reducing CKD burden.
Among the DNAm sites included in the MRS, four were annotated to SLC43A2, encoding LAT4, which transports neutral amino acids in the kidney (38). SLC43A2 has also been associated with eGFR in GWAS (41). Interestingly, kidney tubule–specific LAT4 knockout mice had aminoaciduria due to reabsorption defects of amino acids (38). Additionally, SLC43A2 was downregulated in a murine model of ischemia reperfusion acute kidney injury, providing insights into the dysregulation of proximal tubular amino acid homeostasis response during kidney damage (42). Other DNAm sites included in the MRS have also been annotated to genes previously associated with CKD or renal function, e.g., TGFBI, CCR6, TEC, CDCP1, TOX2, SHISA3, DZIP1, SKA2, and SLC9A4. Of note, TGFBI encodes a protein that acts in the extracellular matrix where it impacts cell adhesion and migration, and it was upregulated in human proximal tubular epithelial cells exposed to high glucose and in kidneys of diabetic rats, supporting a role in diabetic CKD (43). Moreover, the Chronic Renal Insufficiency Cohort (CRIC) study found differential TGFBI DNAm in individuals with CKD and stable kidney function versus those with rapid disease progression (44). Together, these findings show that many of the DNAm sites included in our epigenetic biomarker are annotated to genes clearly linked to CKD pathogenesis. Although epigenetic modifications regulate cell-specific expression and differentiation, previous studies suggest that blood DNAm can reflect DNAm changes in tissues related to specific phenotypes (14,15,45,46). Some studies showed that DNAm of some sites in blood mirrors the methylation pattern in target tissues, including the kidney (20,21,47). However, large-scale comparative studies with blood and kidney samples are needed to further clarify this association.
Strengths and Limitations
The strengths of our study are the prospective design and the follow-up time together with the unique characteristics of our cohort, including well-characterized individuals with newly diagnosed T2D with data available from national registries. To our knowledge, this is the first epigenome-wide study of iCKD in individuals with newly diagnosed T2D, leading to important insights for the development of new biomarkers that could predict CKD risk at diabetes diagnosis. This allowed us to investigate epigenetic modifications that occur early in the disease course rather than capturing methylation changes that may arise as a consequence of disease-driven pathways.
This study also has some limitations, including inflation, which we have partly mitigated with the selection of robust sites, and the use of one definition of CKD (eGFR <60 mL/min/1.73 m² for >3 months) without considering other markers of kidney damage, phenotype heterogeneity, or differences in the progression of CKD in individuals with T2D (48). We chose this definition of CKD also based on the presence of missing values of UACR in our cohort. It is also worth acknowledging the lack of diagnostic criteria for CKD that demonstrate a histopathological pattern of renal injury and the lack of independent validation cohorts needed for validating our combined epigenetic biomarker in populations with different ethnicities and demographic characteristics. Nevertheless, cross-validation, by reducing the risk of overfitting (49), mitigates this limitation. Differences in study design and cohort selection, e.g., markers associated with prevalent CKD versus iCKD or cohorts including general population versus individuals with newly diagnosed T2D, are likely to account for the lack of overlap between our epigenetic biomarker and previous epigenome-wide association study findings (22,28). Further studies should test whether this blood-based epigenetic biomarker could be generalizable to other patient populations at risk for developing CKD and how these markers may change over time according to the disease progression.
Conclusions
This study found that DNAm at baseline was associated with iCKD and that a blood-based epigenetic biomarker with 37 methylation sites and common clinical factors could predict, with very good performance, the risk of CKD in individuals with newly diagnosed T2D. This epigenetic biomarker should be further validated for development into a valuable precision medicine screening tool for CKD risk stratification. Early identification of individuals at higher risk of developing CKD will allow personalized prevention and treatment strategies to reduce the development and progression of CKD in individuals with T2D.
This article contains supplementary material online at https://doi.org/10.2337/figshare.28062917.
Article Information
Acknowledgments. The authors thank the participants in the ANDIS and ANDiU studies, Professor Leif Groop, and Maria Sterner, all from Department of Clinical Sciences Malmö, Lund University, Malmö, Sweden, for valuable support, as well as the Swegene Center for Integrative Biology at Lund University genomics facility for technical support with the DNAm analysis.
Funding. This study was supported by Vetenskapsrådet (the Swedish Research Council) grants 2018-02567 and 2021-00628 to C.L., 2015-02523 and 2019-01260, 2020-02191 to E.A./ANDIS, and 2018-02837 to M.F.G.; Swedish governmental funding of clinical research/Region Skåne (ALF, C.L., and E.A./ANDIS); Skåne University Hospital funds; Strategic Research Area Exodiab (Lund University Diabetes Centre Industrial Research Centre [LUDC-IRC] grant 2009-1039), Novo Nordisk Foundation grants NNF19OC0057415 to C.L. and NNF21OC0070457 to E.A.; the Swedish Foundation for Strategic Research grant IRC15-0067; the Swedish Diabetes Foundation (to C.L. and E.A.), Hjärt-Lungfonden (Swedish Heart and Lung Foundation) grants 20160602 to C.L., 20220606 to E.A., and 20190470 to M.F.G., and H2020-Marie-Curie grant 706081 (EpiHope). ANDIS was also funded by the Faculty of Medicine at Lund University and Vinnova Swelife. S.G.-C. was supported by a postdoctoral fellowship (Juan de la Cierva-Incorporación grant IJC2019-040796-I). A.M. was supported by a Fondo Gianesini Emma fellowship. M.F.G. has received funding from the European Union’s Research and Innovation Programme under grant 101095146 (Personalized Drug Response: Implementation and Evaluation in Chronic Kidney Disease [PRIME-CKD]) and from the Innovative Medicines Initiative 2 Joint Undertaking (JU) under grant 115974 (Biomarker Enterprise to Attack Diabetic Kidney Disease [BEAt-DKD]). The JU receives support from the European Union’s Horizon 2020 Research and Innovation Programme and the European Federation of Pharmaceutical Industries and Associations and JDRF.
Any dissemination of results reflects only the authors’ view; the JU is not responsible for any use that may be made of the information it contains.
Duality of Interest. M.F.G. has received financial and nonfinancial (in kind) support from Boehringer Ingelheim Pharma GmbH, JDRF International, Eli Lilly, AbbVie, Sanofi-Aventis, Astellas Pharma, Novo Nordisk A/S, and Bayer AG within European Union grant H2020-JTI-IMI2-2015-05 (agreement no. 115974, BEAt-DKD); financial and in kind support from Novo Nordisk, Pfizer, Follicum, Coegin Pharma, Abcentra, Probi, and Johnson & Johnson within a project funded by the Swedish Foundation for Strategic Research on precision medicine in diabetes (LUDC-IRC grant 15-0067); and personal consultancy fees from Eli Lilly and Tribune Therapeutics AB. No other potential conflicts of interest relevant to this article were reported.
Author Contributions. M.Marc., A.M., A.P., M.Maz., and S.G.-C. performed statistical and bioinformatics analyses. M.Marc., A.M., S.G.-C., and C.L. drafted the manuscript. M.Mart. and E.A. initiated clinical studies and data collection. M.F.G., E.A., S.G.-C., and C.L. participated in the design of the experiments and/or in the analysis and interpretation of the data. C.L. initiated the project. All authors read and edited the manuscript. C.L. and S.G.-C. are the guarantors of this work and, as such, had full access to all the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.