The aim of this work was to describe the DNA methylation signature and identify genes associated with neuropathic pain in type 2 diabetes mellitus. We analyzed two independent cohorts: PROPGER, which comprised 72 patients with painful and 67 with painless diabetic neuropathy recruited at the German Diabetes Center in Düsseldorf, and PROPENG, which comprised 27 patients with painful and 65 with painless diabetic neuropathy recruited at the University of Manchester in the U.K. Genome-wide methylation data were generated using the Illumina Infinium MethylationEPIC v1.0 BeadChip. We used four different selection criteria to identify promising pain-related genes. Our findings revealed significant differences in methylation patterns between painful and painless diabetic neuropathy and identified a set of individual CpG sites of unique candidate genes associated with the painful phenotype. Several of these genes, including GCH1, MYT1L, and MED16, have been previously linked to pain-related phenotypes or diabetes. Through pathway enrichment analysis, we demonstrated that specific epigenetic signatures could contribute to the complex phenotype of diabetic neuropathy, and cluster analyses highlighted significant epigenetic dissimilarities between painful and painless phenotypes. Our results uncovered epigenetic differences between patients with painful and painless diabetic neuropathy and identified targeted genes linked to neuropathic pain through DNA methylation mechanisms. This approach holds promise for investigating other chronic pain conditions, such as secondary chronic pain from cancer treatment, thoracic surgery, and various transplant settings.
Approximately one out of two patients with diabetes develops diabetic neuropathy; of these, 20% experience neuropathic pain. Risk factors for neuropathic pain are largely unknown; however, DNA methylation was recently associated with neuropathies and degeneration of nerve fibers.
The aim of this work was to describe the DNA methylation signature and identify genes associated with neuropathic pain in type 2 diabetes mellitus (T2DM).
We discovered distinct DNA methylation signatures that differentiate painful and painless neuropathy phenotypes associated with T2DM and identified genes with potential as therapeutic targets for neuropathic pain, such as GCH1, MYT1L, and MED16.
This work can serve as reference hallmark for future studies on painful diabetic neuropathy and other chronic pain conditions.
Introduction
Epigenetics represents a powerful mechanism capable of controlling and altering gene expression without modifying the DNA sequence. Through various molecular mechanisms, epigenetics responds to external stimuli and is involved in the onset of disease. DNA methylation regulates transcription through the addition of a methyl group to the DNA sequence, mostly in cytosine followed by guanine motifs (CpG). This modification is reversible. In the genome, we can find regions significantly enriched in CpG sites that are called CpG islands. Such regions are often in the proximity of genes and represent elements for gene expression control.
Genome-wide DNA methylation experiments generate large data sets rich in epigenetic information, which require multidirectional and multilayer analytic strategies. Approaches to support data interpretation include differential methylation analysis (DMA), which identifies genomic regions with different methylation levels between groups; pathway enrichment analysis (PEA), which provides functional insight into gene sets under epigenetic regulation; and cluster analysis, which aids in identifying subgroups of individuals, uncovering epigenetic similarities.
In type 2 diabetes mellitus (T2DM), leukocytic genomic DNA has been shown to be hypomethylated, which correlates with the severity of diabetic peripheral neuropathy (DPN), the duration of diabetes, and BMI (1). In a genome-wide DNA methylation study in sural nerve biopsies of patients with DPN, 3,460 differentially methylated positions (DMPs) (55% hypomethylated and 45% hypermethylated) that mapped to 2,835 unique genes involved in nervous system development, axon guidance, glycerophospholipid metabolism, and mitogen-activated protein kinase signaling were shown to differ between patients with a significant reduction (degeneration) or increase (regeneration) in myelinated nerve fiber density over 52 weeks (2). Elsewhere, the methylome of T2DM with DPN was investigated, and individuals with the highest (n = 21) (more severe condition) versus the lowest (n = 32) HbA1c levels were compared (3). Of the 2,066 DMPs (mapping to 1,489 unique genes) detected, 57% were hypomethylated and 43% hypermethylated in those with a high HbA1c level. Moreover, 71 genes were simultaneously differentially methylated and expressed, and they were involved in immune response pathways that are closely related to neuropathies (4). We have previously shown differential methylation of genes involved in the migration process during monocyte differentiation into osteoclasts and inflammatory pathways in patients with severe diabetic neuropathy and acute Charcot foot compared with patients with severe diabetic neuropathy and patients without diabetic neuropathy (5).
The aim of this work was to describe the DNA methylation signature and identify genes associated with neuropathic pain (NP) in T2DM. To our knowledge, this is the first genome-wide DNA methylation study to explore the epigenetic signature of NP in patients with T2DM.
Research Design and Methods
Cohorts
DNA methylation was assessed in two independent cohorts: PROPGER (72 patients with painful diabetic neuropathy [PDN] and 67 with painless diabetic neuropathy [PLDN]) from the German Diabetes Center in Düsseldorf and PROPENG (27 patients with PDN and 65 with PLDN) from the University of Manchester in the U.K. The Lombardy Region Ethics Committee Section of the Fondazione IRCCS Istituto Neurologico “Carlo Besta” (no. 56, 7 November 2018) and Ethics Committee of Maastricht University (NL36128.06S.11/METC 11-2-030) approved the study, and all individuals gave their written informed consent.
Phenotype
Diabetic neuropathy was diagnosed according to established guidelines (6,7). Punch skin biopsies obtained from the distal site of the leg were evaluated through the morphometric quantification of the linear density of intraepidermal nerve fibers (IENFs) per length of section (in mm). IENF loss is an early indicator of diabetic neuropathy and progresses with increasing neuropathic severity. Early intervention may support rescue of IENF density.
PDN was defined based on the criteria of Treede et al. (8). Patients were assessed for NP symptoms, commonly described as prickling, deep aching, electric shock–like, and burning sensations often worsening at night and examined for hyperalgesia or allodynia. The frequency and severity of painful symptoms were assessed using the pain intensity numerical rating scale (PI-NRS) at the screening visit. The diagnosis was made if the duration of the NP experience lasted >1 year and the PI-NRS scores were ≥4 and other causes of NP besides T2DM were excluded. PLDN was defined when patients did not report pain (PI-NRS of 0) or based on the criteria of Treede et al. if NP lasted <1 year or scored <4 on the PI-NRS without the use of analgesic drugs (9) or if there were alternative causes of NP unrelated to T2DM. Applying these criteria, participants were diagnosed as having either PDN or PLDN.
Sample Selection
DNA Extraction and Methylation Experiments
Whole-blood samples were collected and DNA extracted using the Puregene blood kit (QIAGEN) according to manufacturer’s protocol and quantified using Qubit Fluorometer with dsDNA Broad Range Assay kit (Thermo Fisher Scientific). Normalized DNA samples were bisulfite converted using the EZ DNA Methylation Kit (Zymo Research), following a standard protocol. Samples were run on the Infinium MethylationEPIC v1.0 BeadChip (Illumina) to assess whole-genome DNA methylation with randomization of the samples and phenotypic groups.
Preprocessing of Raw Data
Epigenome-wide DNA methylation data were preprocessed using a bioinformatics pipeline implemented in R 3.6.3, and it reproduced the workflow recommended by Maksimovic et al. (14). Probes that failed in at least one of the samples (i.e., presented detection P > 0.01), those located on sex chromosomes, and those mapping to loci with single nucleotide polymorphisms (SNPs) were eliminated. Additional filtering steps were implemented as previously described (15). Only CpG sites that did not present bimodal or trimodal distribution among any sex-specific painless groups were considered. Eventually, for all probes that successfully passed the filtering steps, β-values expressing DNA methylation levels as a ratio of methylated to unmethylated allele intensities (with 0 corresponding to totally unmethylated and 1 to totally methylated states) were retrieved.
Differential Methylation Analysis and Identification of Candidate Genes
The fundamental principle of DMA is the identification of genomic sites and/or regions where the DNA methylation levels differ between groups. In this study, CpG sites that were differentially methylated between patients with PDN and PLDN were identified by creating multiple linear models with robust regression fitting (limma package in R). Briefly, the built models were corrected for chronological age, sex, DNA methylation–based estimates of blood cell counts (naive CD8+ T cells, CD4+ T cells, exhausted cytotoxic CD8+, CD28−, and CD45R− T cells, natural killer cells, granulocytes, and plasma blasts), and cohort of provenance. All calculated P values were corrected for multiple tests using the Benjamini-Hochberg (BH) procedure. DMRs were detected with the comb-p method. These two approaches were first applied to separate PROPGER and PROPENG cohorts and eventually applied to an integrated study population encompassing both sets.
To identify the most robust DMP, we applied the following selection criteria (Fig. 1): criterion A, significant BH-corrected P value (statistical significance level of 0.05) in the PROPGER cohort or in the integrated analysis of the studied population (combining PROPGER and PROPENG); criterion B, significant nominal P value (statistical significance level of 0.01) and absolute value of methylation difference between the PDN and PLDN groups >5% in the PROPGER cohort and an additionally concordant direction of methylation change between the PDN and PLDN groups in both the PROPGER and PROPENG cohorts; criterion C, significant nominal P value (statistical significance level of 0.01) and absolute value of methylation difference between the PDN and PLDN groups >5% in the integrated analysis of the combined PROPGER and PROPENG cohorts; and criterion D, the lowest nominal P value in significant DMRs from a region-oriented analysis of the PROPGER cohort or the integrated analysis of the combined PROPGER and PROPENG cohorts. Since one of the aims of this work was to identify genes associated with NP in T2DM, after applying selection criteria, significant probes were filtered to include exclusively genic CpGs.
Summary of the design of the analyses performed in this study. First, differential methylation analysis (yellow section) was performed to identify CpG sites associated with NP in T2DM by applying four different criteria (A–D). A more restricted subset of DMPs was determined and used in subsequent PEAs (green modules) after mapping those CpG sites to genes. Finally, with the same subset of DMPs, a cluster analysis (field in blue) was performed. The study involved two cohorts, PROPGER and PROPENG, and green checkmarks specify the type of analysis and sample subset (gray bars) on which the analysis was performed.
Summary of the design of the analyses performed in this study. First, differential methylation analysis (yellow section) was performed to identify CpG sites associated with NP in T2DM by applying four different criteria (A–D). A more restricted subset of DMPs was determined and used in subsequent PEAs (green modules) after mapping those CpG sites to genes. Finally, with the same subset of DMPs, a cluster analysis (field in blue) was performed. The study involved two cohorts, PROPGER and PROPENG, and green checkmarks specify the type of analysis and sample subset (gray bars) on which the analysis was performed.
PEA
The concept of this approach is to identify a subset of biological pathways enriched in genes from a larger predefined collection that are plausibly associated with an observed phenotype. In this study, to determine the biological pathways differentially involved in PDN and PLDN, a specific set of methylation signals was identified. Particularly, we selected genic CpG sites in which the nominal P value from the site-wise DMA was <0.0002 and the absolute value of the methylation difference between PDN and PLDN groups was >1%. PEA was performed in parallel in the separate PROPGER cohort and in the combined PROPGER-PROPENG data set.
The revealed PEA gene lists were independently uploaded to the Enrichr web-based tool (16) and annotated with frequently occurring pathways using the Kyoto Encyclopedia of Genes and Genomes (KEGG) database (17) and Online Mendelian Inheritance in Man (OMIM) Disease database (18). To further explore the epigenetically affected genes and confirm whether they were previously associated with the painful phenotype, we consulted the Comparative Toxicogenomics Database (CTD) based on manual curation of up-to-date literature. To this aim, using the CTD home page (19), we performed a search using pain as a disease keyword query and selected exclusively gene data to retrieve the list of genes associated with pain-related phenotypes and confronted it with PEA gene lists.
Cluster Analysis
Clustering is a common technique used to group different items, with an aim of allocating the most similar ones within the same cluster. As a result, those within the same class are more similar to each other than to any other component of the cluster.
Here, we performed a cluster analysis to examine the dissimilarity between PDN and PLDN and to assess whether the identified CpG sites uncover a discriminating epigenetic signal that could help to classify patients according to the clinical phenotypes or indicate even more homogeneous phenotypic subgroups. To this aim, we used a reduced data set comprising β-values exclusively of CpGs selected previously for PEA (nominal P value from site-wise DMA < 0.0002 and |Δβ(PLDN − PDN)| > 0.01). We implemented principal component analysis (PCA) (20) and plotted two main principal components that explained the greatest percentage of the total variability in a phenotype. Moreover, we performed multidimensional scaling (MDS) (21) with nonhierarchical K-means clustering (22) to explore further natural and nonobvious patterns in methylation data. We calculated the Euclidean distances between samples, which approximately quantify the dissimilarities, and we completed the clustering using the algorithm of Hartigan and Wong (23).
Data and Resource Availability
The data sets generated and analyzed during the current study are available in the Gene Expression Omnibus repository under accession number GSE286347. No applicable resources were generated or analyzed during the current study.
Results
Study Cohorts
The PROPGER cohort included 72 patients with PDN and 67 patients with PLDN (Table 1). The groups did not significantly differ in mean age (P = 0.554). Similarly, T2DM duration did not differ between the PDN and PLDN groups, ranging between 0 and 46 and 0 and 44 years, respectively (overall mean 13.6 years). Patients with PDN had a diagnosis of neuropathy for ∼6 years, while those with PLDN had a shorter duration, with a difference of 1.4 years that did not reach a level of statistical significance (two-sided P = 0.202 by t test).
Characterization of PROPGER cohort
Characteristic . | PDN . | PLDN . |
---|---|---|
Samples, n (%) | 72 (52) | 67 (48) |
Sex (female/male), n | 15/57 | 14/53 |
Age (years) | ||
Mean ± SD | 68.8 ± 9.0 | 69.7 ± 9.1 |
Range | 48–83 | 49–84 |
T2DM duration, mean ± SD (years) | 13.5 ± 10.3 | 13.6 ± 9.2 |
Neuropathy duration, mean ± SD (years) | 6.9 ± 6.1 | 5.5 ± 6.5 |
Characteristic . | PDN . | PLDN . |
---|---|---|
Samples, n (%) | 72 (52) | 67 (48) |
Sex (female/male), n | 15/57 | 14/53 |
Age (years) | ||
Mean ± SD | 68.8 ± 9.0 | 69.7 ± 9.1 |
Range | 48–83 | 49–84 |
T2DM duration, mean ± SD (years) | 13.5 ± 10.3 | 13.6 ± 9.2 |
Neuropathy duration, mean ± SD (years) | 6.9 ± 6.1 | 5.5 ± 6.5 |
The PROPENG cohort included 27 patients with PDN and 65 with PLDN (Table 2). Mean age did not differ between the groups (P = 0.169). T2DM duration was 13.4 years in patients with PDN and 9 years in those with PLDN, which was significantly shorter (P = 0.041).
Characterization of the PROPENG cohort
Characteristic . | PDN . | PLDN . |
---|---|---|
Samples, n (%) | 27 (15) | 65 (37) |
Sex (female/male), n | 9/18 | 19/46 |
Age (years) | ||
Mean ± SD (years) | 62.1 ± 9.1 | 65.0 ± 9.0 |
Range (years) | 46–78 | 41–78 |
T2DM duration, mean ± SD (years) | 13.4 ± 9.8 | 9.0 ± 7.0 |
Characteristic . | PDN . | PLDN . |
---|---|---|
Samples, n (%) | 27 (15) | 65 (37) |
Sex (female/male), n | 9/18 | 19/46 |
Age (years) | ||
Mean ± SD (years) | 62.1 ± 9.1 | 65.0 ± 9.0 |
Range (years) | 46–78 | 41–78 |
T2DM duration, mean ± SD (years) | 13.4 ± 9.8 | 9.0 ± 7.0 |
Traces of Differential Methylation
The identification of potential candidates of NP-related genes followed a four-step selection process. The first set of CpGs was determined according to criterion A (significant P < 0.05) after BH adjustment for multiple testing in the PROPGER cohort or in integrated analysis of the combined PROPGER and PROPENG cohorts). Four genic sites were identified (Table 3). Three DMPs, i.e., cg00973677, cg23672395, and cg03533123, were found within the regions of CpG islands and showed hypermethylated patterns in patients with PLDN compared with PDN. These positions were in the CHRM2, PSMF1, and GCH1 genes. One of the DMPs, cg08167727, was in the region of the SLC5A1 gene, not rich in CpGs (i.e., distant from island), and instead showed hypomethylation in the PROPGER PLDN group.
Genic CpGs that emerged from DMA after applying selection criterion A*
ID_REF . | P . | Adjusted P . | Δβ(PLDN − PDN) . | Genomic position (hg19) . | Island name . | Relation to island . | Gene . | Cohort analysis . |
---|---|---|---|---|---|---|---|---|
cg00973677 | 0.000 | 0.029 | 0.006 | chr7:136553595 | chr7:136553854–136556194 | N_Shore | CHRM2 | PROPGER |
cg08167727 | 0.000 | 0.029 | −0.043 | chr22:32475054 | OpenSea | SLC5A1 | PROPGER | |
cg23672395 | 0.000 | 0.029 | 0.004 | chr20:1099079 | chr20:1099078–1099561 | Island | PSMF1 | PROPGER |
cg03533123 | 0.000 | 0.026 | 0.012 | chr14:55368924 | chr14:55368547–55369913 | Island | GCH1 | PROPGER-PROPENG |
ID_REF . | P . | Adjusted P . | Δβ(PLDN − PDN) . | Genomic position (hg19) . | Island name . | Relation to island . | Gene . | Cohort analysis . |
---|---|---|---|---|---|---|---|---|
cg00973677 | 0.000 | 0.029 | 0.006 | chr7:136553595 | chr7:136553854–136556194 | N_Shore | CHRM2 | PROPGER |
cg08167727 | 0.000 | 0.029 | −0.043 | chr22:32475054 | OpenSea | SLC5A1 | PROPGER | |
cg23672395 | 0.000 | 0.029 | 0.004 | chr20:1099079 | chr20:1099078–1099561 | Island | PSMF1 | PROPGER |
cg03533123 | 0.000 | 0.026 | 0.012 | chr14:55368924 | chr14:55368547–55369913 | Island | GCH1 | PROPGER-PROPENG |
*Adjusted P < 0.05 in the PROPGER cohort or in the integrated analysis of the combined PROPGER-PROPENG cohort.
Following DMP selection criterion B (nominal P < 0.01 and absolute value of methylation difference between the PDN and PLDN group >5% in PROPGER cohort and an additionally concordant direction of methylation change between the PDN and PLDN groups in both the PROPGER and PROPENG cohorts), we first identified 23 significant sites, which were further confined to concordant CpGs. Eventually, there were eight DMPs mapping to five unique genes that passed the pruning step (Table 4). There were three significant probes within the MYT1L gene: cg10075506, cg06201514, and cg21862353. They all presented increased methylation in PLDN compared with PDN. Another candidate gene was MED16 with two neighboring DMPs, cg09670971 and cg15765251, in the shelf zone of its CpG island. Both CpG sites were hypermethylated in PLDN. From further analysis, another three genes emerged: HYDIN, DMTN, and PLCXD3, which were pointed out by the cg27261665, cg02794906, and cg25950625 probes, respectively. All three sites are in island regions rich in CpGs, and the cg02794906 probe maps to a single nucleotide variant identified as rs75936484-presenting G > A alleles. While DMPs within the HYDIN and DMTN genes were hypomethylated in PLDN, PLCXD3 showed an opposite trend, with an increment of epigenetic signal in the PLDN compared with PDN group.
Genic CpGs that emerged from DMA after applying selection criterion B*
ID_REF . | P . | Adjusted P . | Δβ(PLDN − PDN) . | Genomic position (hg19) . | Island name . | Relation to island . | Gene . | Cohort analysis . |
---|---|---|---|---|---|---|---|---|
cg27261665 | 0.000 | 0.077 | −0.066 | chr16:71265020 | chr16:71264360–71264659 | S_Shore | HYDIN | PROPGER |
cg10075506 | 0.000 | 0.484 | 0.109 | chr2:1817351 | OpenSea | MYT1L | PROPGER | |
cg06201514 | 0.000 | 0.697 | 0.074 | chr2:1817409 | OpenSea | MYT1L | PROPGER | |
cg21862353 | 0.002 | 0.905 | 0.063 | chr2:1801628 | chr2:1801618–1802060 | Island | MYT1L | PROPGER |
cg02794906 | 0.004 | 0.968 | −0.054 | chr8:21911045 | chr8:21914275–21914525 | N_Shelf | DMTN | PROPGER |
cg09670971 | 0.005 | 0.968 | 0.080 | chr19:888612 | chr19:891531–892150 | N_Shelf | MED16 | PROPGER |
cg15765251 | 0.007 | 0.968 | 0.057 | chr19:888814 | chr19:891531–892150 | N_Shelf | MED16 | PROPGER |
cg25950625 | 0.007 | 0.968 | 0.052 | chr5:41510116 | chr5:41509783–41510166 | Island | PLCXD3 | PROPGER |
ID_REF . | P . | Adjusted P . | Δβ(PLDN − PDN) . | Genomic position (hg19) . | Island name . | Relation to island . | Gene . | Cohort analysis . |
---|---|---|---|---|---|---|---|---|
cg27261665 | 0.000 | 0.077 | −0.066 | chr16:71265020 | chr16:71264360–71264659 | S_Shore | HYDIN | PROPGER |
cg10075506 | 0.000 | 0.484 | 0.109 | chr2:1817351 | OpenSea | MYT1L | PROPGER | |
cg06201514 | 0.000 | 0.697 | 0.074 | chr2:1817409 | OpenSea | MYT1L | PROPGER | |
cg21862353 | 0.002 | 0.905 | 0.063 | chr2:1801628 | chr2:1801618–1802060 | Island | MYT1L | PROPGER |
cg02794906 | 0.004 | 0.968 | −0.054 | chr8:21911045 | chr8:21914275–21914525 | N_Shelf | DMTN | PROPGER |
cg09670971 | 0.005 | 0.968 | 0.080 | chr19:888612 | chr19:891531–892150 | N_Shelf | MED16 | PROPGER |
cg15765251 | 0.007 | 0.968 | 0.057 | chr19:888814 | chr19:891531–892150 | N_Shelf | MED16 | PROPGER |
cg25950625 | 0.007 | 0.968 | 0.052 | chr5:41510116 | chr5:41509783–41510166 | Island | PLCXD3 | PROPGER |
*Nominal P < 0.01 and absolute value of methylation difference between PDN and PLDN groups >5% in PROPGER cohort and additionally concordant direction of methylation change between the PDN and PLDN groups in both the PROPGER and PROPENG cohorts.
After applying selection criterion C (nominal P < 0.01 and absolute value of methylation difference between the PDN and PLDN groups >5% in the integrated analysis of the combined PROPGER and PROPENG cohorts), 10 CpGs emerged. Seven of these probes were genic and mapped to six unique genes (Supplementary Table 1). Results confirmed the epigenetic involvement of two genes previously revealed with selection criterion B: MYT1L with the cg10075506 and cg06201514 probes (Supplementary Fig. 1), and MED16 with the cg09670971 probe (Supplementary Fig. 2). Respective significant sites were verified as hypermethylated in PLDN compared with PDN. From the analysis, four new genes were identified: TMPO, IREB2, PHLPP1, and F13A1, inferred with cg10121367, cg23671448, cg11816703, and cg16759545 probes, respectively.
With selection criterion D (P < 0.1 in the significant DMR from region-oriented analysis of the PROPGER cohort [i] or of the combined PROPGER and PROPENG cohorts [ii]), 17 sites were identified (Supplementary Table 2). In the first case (i), we confirmed the four loci that had already emerged previously from CpG-wise analysis with criterion A or B: cg00973677, cg08167727, cg23672395, and cg27261665. Analysis of the combined cohorts (ii) allowed the identification of 16 significant DMRs, of which 1 had already been found with criterion A, with cg03533123 mapping to the GCH1 gene (Fig. 2), 3 nongenic, and the remaining 12 newly discovered.
Methylation loci plot of the cg03533123 CpG site (marked with red asterisks) within the GCH1 gene, which emerged from DMA after applying selection criteria A and D in the PROPGER-PROPENG cohort analysis. The plot also includes four adjacent CpG sites (two before and two after significant DMP) located in the closest proximity to cg03533123. The x-axis represents genomic position on chromosome 14 (hg19 assembly), while the y-axis shows the DNA methylation B values.
Methylation loci plot of the cg03533123 CpG site (marked with red asterisks) within the GCH1 gene, which emerged from DMA after applying selection criteria A and D in the PROPGER-PROPENG cohort analysis. The plot also includes four adjacent CpG sites (two before and two after significant DMP) located in the closest proximity to cg03533123. The x-axis represents genomic position on chromosome 14 (hg19 assembly), while the y-axis shows the DNA methylation B values.
PEA
Because of the importance of the quality and purity of the signal in the KEGG and OMIM analyses, we decided to focus particularly on the outcomes obtained with the richer cohort and, thus, report the results produced with combined analysis of the PROPGER-PROPENG data set (PROPGER PEA results presented in Supplementary Tables 3 and 4). Eventually, there were 178 CpGs that passed the PEA selection criteria, resulting in 137 unique genes. From the KEGG analysis (Supplementary Table 5), 10 terms emerged that reached a combined score >5, and among them, there were several pathways related to altered metabolism that contribute to the onset and progression of diabetes pathology as vasopressin-regulated water reabsorption (24), purine metabolism (25), transforming growth factor-β (TGF-β) signaling pathway (26), pancreatic secretion (27), maturity-onset diabetes of the young, and circadian rhythm (28). Instead, OMIM PEA (Table 5) of the queried list of genes returned 10 diseases with a combined score >5, with neuropathy as the most significant association (P = 0.024). Enrichment analysis based on the CTD database confirmed that 85 of 97 (88%) PROPGER PEA genes and 114 of 137 (83%) PROPGER-PROPENG PEA genes were already associated with pain as the parent term (see also Supplementary Fig. 3 and Supplementary Tables 6 and 7).
Results of PEA based on OMIM disease database in the PROPGER-PROPENG data set
Term . | Overlap* . | P† . | Adjusted P‡ . | Combined score§ . | Genes . |
---|---|---|---|---|---|
Neuropathy | 2/35 | 0.024 | 0.210 | 33.235 | DCTN1, CCT5 |
Hypothyroidism | 1/11 | 0.073 | 0.210 | 38.239 | PAX8 |
Arrhythmogenic right ventricular dysplasia | 1/13 | 0.086 | 0.210 | 29.913 | RYR2 |
Ovarian cancer | 1/13 | 0.086 | 0.210 | 29.913 | OPCML |
Corneal dystrophy | 1/14 | 0.092 | 0.210 | 26.816 | PAX6 |
Dystonia | 1/18 | 0.116 | 0.210 | 18.460 | GCH1 |
Lateral sclerosis | 1/19 | 0.122 | 0.210 | 17.022 | DCTN1 |
Stature | 1/24 | 0.152 | 0.228 | 11.942 | GHR |
Spastic paraplegia | 1/32 | 0.198 | 0.263 | 7.628 | CCT5 |
Colorectal cancer | 1/40 | 0.241 | 0.289 | 5.325 | BUB1 |
Term . | Overlap* . | P† . | Adjusted P‡ . | Combined score§ . | Genes . |
---|---|---|---|---|---|
Neuropathy | 2/35 | 0.024 | 0.210 | 33.235 | DCTN1, CCT5 |
Hypothyroidism | 1/11 | 0.073 | 0.210 | 38.239 | PAX8 |
Arrhythmogenic right ventricular dysplasia | 1/13 | 0.086 | 0.210 | 29.913 | RYR2 |
Ovarian cancer | 1/13 | 0.086 | 0.210 | 29.913 | OPCML |
Corneal dystrophy | 1/14 | 0.092 | 0.210 | 26.816 | PAX6 |
Dystonia | 1/18 | 0.116 | 0.210 | 18.460 | GCH1 |
Lateral sclerosis | 1/19 | 0.122 | 0.210 | 17.022 | DCTN1 |
Stature | 1/24 | 0.152 | 0.228 | 11.942 | GHR |
Spastic paraplegia | 1/32 | 0.198 | 0.263 | 7.628 | CCT5 |
Colorectal cancer | 1/40 | 0.241 | 0.289 | 5.325 | BUB1 |
In the analysis, 137 unique genes emerged from 178 CpGs (with nominal P values from site-wise DMA <0.0002 and with the absolute value of the methylation difference between PDN and PLDN >1%) were used. Pathway terms with a combined score >5 are listed according to increasing P value.
*Overlap indicates the ratio between number of genes provided as the input list and present in a particular pathway and the total number of genes constituting that pathway.
†P value by Fisher exact test.
‡Adjusted P value by BH correction of Fisher P value.
§The combined score is a logarithm of the P value from Fisher exact test multiplied by the z score of the deviation from the expected rank. (Rank-based ranking comes from running the Fisher exact test for many random gene sets to compute a mean rank and SD from the expected rank for each term in the gene set library and finally calculating a z score to assess the deviation from the expected rank.)
Cluster Analysis
PCA and MDS analyses were performed exclusively in the combined PROPGER and PROPENG cohort to provide the largest possible data set. Methylation data were limited to a subset of 178 CpGs selected previously for PEA. Two principal components identified using PCA explained 16% and 8.1% of the variability present in the data, respectively. A certain degree of separation between PDN and PLDN samples was observed (Fig. 3A), even though there was no precise cutoff between the two groups. MDS analysis further confirmed the discriminative potential of the identified set of CpGs (Fig. 3B). Thus, the proposed set of 178 CpG sites cannot be considered as a predictive epigenetic signature but may be considered indicative.
Visualization of cluster analysis results in the combined PROPGER-PROPENG cohort using methylation data from 178 CpG subsets selected by applying the following thresholds: nominal P < 0.0002 and |Δβ(PLDN − PDN)| > 1%. A: PCA results. Two main principal components (PCs) are plotted on the x- and y-axes, with the first PCA component (PC1) explaining 16% of the total variability in the data and the second (PC2) contributing 8.1% of total variability explained. PC1 visibly shifts PLDN samples to the right and PDN to the left. B: Results of two-dimensional scaling with K-means clustering. Two MDS dimensions (Dim1 and Dim2) are plotted on the x- and y-axes. Considering the Euclidean distance, two clusters of the most similar samples were created. PLDN was predominantly co-located in the Cluster1 grouping, representing 77% of PLDN samples. A higher percentage of PDN was assigned to Cluster2 than to Cluster1, but the difference in the distribution between the two classes was even, reaching 56% and 44% of PDN samples, respectively.
Visualization of cluster analysis results in the combined PROPGER-PROPENG cohort using methylation data from 178 CpG subsets selected by applying the following thresholds: nominal P < 0.0002 and |Δβ(PLDN − PDN)| > 1%. A: PCA results. Two main principal components (PCs) are plotted on the x- and y-axes, with the first PCA component (PC1) explaining 16% of the total variability in the data and the second (PC2) contributing 8.1% of total variability explained. PC1 visibly shifts PLDN samples to the right and PDN to the left. B: Results of two-dimensional scaling with K-means clustering. Two MDS dimensions (Dim1 and Dim2) are plotted on the x- and y-axes. Considering the Euclidean distance, two clusters of the most similar samples were created. PLDN was predominantly co-located in the Cluster1 grouping, representing 77% of PLDN samples. A higher percentage of PDN was assigned to Cluster2 than to Cluster1, but the difference in the distribution between the two classes was even, reaching 56% and 44% of PDN samples, respectively.
Discussion
To our knowledge, this study is the first to investigate whole-genome DNA methylation at high resolution (>850,000 CpGs) in a population of patients with T2DM with and without NP who were accurately diagnosed with diabetic neuropathy. Methylome analysis in two independent European cohorts revealed significantly different methylation patterns between PDN and PLDN based on a set of individual CpG sites and unique candidate genes. Notably, several of these genes were previously associated with pain-related phenotypes and T2DM as discussed in detail in Supplementary Table 8. Three of the identified genes, GCH1, MYT1L, and MED16, deserve particular attention.
GCH1, coding for the enzyme GTP cyclohydrolase 1, was significantly hypomethylated in PDN. It is a rate-limiting enzyme for tetrahydrobiopterin (BH4) biosynthesis, whose production within the dorsal root ganglion plays a critical role in pain signaling (29). A marked and long-lasting upregulation of GCH1 mRNA, protein, and activity, with an order of magnitude increase in intracellular BH4 levels, was observed in a rat nerve injury model of peripheral NP (30). GCH1 polymorphisms have been associated with pain-related phenotypes, allowing the recognition of a related pain-protective haplotype (30,31). Pharmacological or genetic engineering interventions via the GCH1/BH4 axis to alleviate pain experience, sensitivity, and persistence in inflammatory and NP have provided a potential therapeutic target independent of the opioid system (30,32) and potentially useful for cancer pain (33). Additionally, in the specific context of PDN, GCH1/BH4 pathway overexpression was found to protect against cardiovascular events (34).
We have found a link between the MYT1L gene, encoding myelin transcription factor 1-like, and NP due to three CpG sites being significantly hypomethylated in PDN. A genome-wide association study analysis of data from a Spanish population that included 300 case participants and 203 control participants identified the rs11127292 MYT1L SNP to be associated with fibromyalgia (35). A study in 2,057 Southern Chinese patients with T2DM revealed that the rs10199521 MYT1L SNP was related to proliferative diabetic retinopathy (36). Although the link between MYT1L and pain is limited, its role in neuronal pathways and in neuron differentiation (37), as well as its involvement in several neuropsychiatric disorders, make it a very interesting candidate gene. Large, karyotypically detectable abnormalities within MYT1L were observed in individuals with mental retardation, autistic-like conduct, and/or developmental delay (38). Two-phase analysis of copy number variants in a Dutch population determined a rare duplication in the MYT1L gene as an underlying cause of deficit schizophrenia (39). The findings of another study involving 1,139 patients and 1,140 control participants from the Chinese Han population showed an association of the rs3748989 MYT1L SNP with major depressive disorder (40). A case study of two male half-siblings diagnosed with autism spectrum disorder reported partial duplication of MYT1L (2p25.3) inherited from the healthy mother via germline mosaic transmission (41).
Another interesting gene that emerged from our study was MED16 with two CpGs that were significantly hypomethylated in PDN. Mediator complex (MED) is an evolutionarily conserved multiprotein family, and some of its subunits were previously implicated in a number of neurological diseases and in brain activity (42). There is no previous evidence for an association between MED16 and pain, neuropathy, or diabetes.
An association between epigenetic regulation of the TRPA1 gene and pain has been found in several independent studies (43–48), but this was not confirmed in our cohort (Supplementary Fig. 4). The discrepancies with the studies by Bell et al. (43), Gombert and colleagues (45,46), and Achenbach et al. (47) could be due to the different clinical phenotypes evaluated. Sukenaga et al. (44) and Takenaka et al. (48) examined 12 and 48 patients with NP, respectively, with an average age comparable to that of PROPGER and PROPENG cohorts. Although both used Illumina HumanMethylation450 BeadChip rather than the Infinium MethylationEPIC BeadChip, this is an unlikely source of discrepancies in the assessment of CpG sites within TRPA1. Notably, only one single methylation probe assessed by Sukenaga et al. was absent in the EPIC platform (chr8:72988608; hg19/GRCh37).
Although variations in preprocessing procedures, such as the exclusion of some probes used in the cited studies during the filtration of our PROPGER and PROPENG data sets, could partially explain the differences, they seem insufficient to fully account for the lack of replication. A more likely explanation lies in the differences in diagnostic tools and chronic pain etiologies. The cited studies used the Douleur Neuropathique 4 questionnaire or the short form McGill Pain questionnaire, whereas our study used histological evaluation through skin biopsy and the PI-NRS. Furthermore, the cited studies included conditions like chronic back pain or postherpetic neuralgia, while our focus was on T2DM-related neuropathies.
In the current work, PEA reflected involvement of the constructed epigenetic signature in the complex phenotype of diabetic neuropathy. Among emerged KEGG pathways, purine metabolism, circadian rhythm, and TGF-β signaling pathways were found to be associated with neurological features, such as myalgia, neuropathies, neuroinflammation, and NP (49,50). On the other hand, pathways such as maturity-onset diabetes of the young, vasopressin-regulated water reabsorption, pancreatic secretion, and folate biosynthesis were described as related to diabetes (51). Circadian rhythm and TGF-β signaling pathways instead may play a dual role in both T2DM and NP phenotypes (49,52). The complexity of the studied phenotype was reflected also in results of OMIM enrichment analysis, which returned terms representing features and manifestations as neuropathy, hypothyroidism (53), and dystonia (54). Overrepresentation analysis based on the CTD database provided ultimate verification that created methylation signature captured the epigenetic signal driving a pain-related phenotype. Our cluster analyses (both PCA and MDS) confirmed that the generated set of CpG sites differentiated patients with PDN and PLDN, highlighting the epigenetic dissimilarities between the two phenotypes.
The main strength of this study lies in the unique population of patients with T2DM who received an accurate diagnosis of diabetic neuropathy based also on the assessment of IENF by skin biopsy. An additional strength, particularly relevant in the context of epigenome-wide association studies, is the large sample size, especially in the PROPGER and PROPENG combined analysis. Moreover, the inclusion of two independent cohorts ensured that our results are robust against potential biases related to population genetics, environmental factors, and differences in clinical management.
A substantial body of NP methylation research has been conducted on animal models, yielding several promising therapeutic candidates. However, many of these candidates failed in clinical trials, as targets identified in animals cannot always be directly translated to humans. This is due to differences between species in cellular and molecular pain-modulation pathways and the complexity of the human pain phenotype, particularly its emotional component, which cannot be replicated in animal models.
Our study addressed these issues by reducing interspecies translation bias and proposing gene candidates identified directly from human models. For future research, we recognize the value of using a control group without NP, diabetic neuropathy, or diabetes to further enrich our understanding of this complex phenotype.
Although the identified epigenetic signature showed strong correlations with the PDN and PLDN phenotypes, these markers are not suitable yet for epigenetics-driven risk stratification. Longitudinal studies are needed to determine their prediction value. However, several strongly associated genes, particularly GCH1, have already been identified as targets for treating painful conditions. These findings suggest that the epigenetic analysis of whole blood, despite it not being the target tissue for painful neuropathy, can still effectively highlight genes and pathways mechanistically involved in the pathogenesis of the complication.
In conclusion, our findings demonstrated significant epigenetic differences between patients with PDN and PLDN, with genes linked to diabetic NP through DNA methylation mechanisms. Selected genes can serve as reference hallmarks for future studies on PDN and pain in general. Overall, these results provide evidence to support the effectiveness of using peripheral tissues for studying neurological conditions. This approach holds promise for investigating other chronic pain conditions, such as secondary chronic pain from cancer treatment, thoracic surgery, or various transplant settings.
This article contains supplementary material online at https://doi.org/10.2337/figshare.28132316.
Article Information
Acknowledgments. The authors thank the Ministry of University and Research, NextGenerationEU, and National Recovery and Resilience Plan, project MNESYS (PE0000006): “A multiscale integrated approach to the study of the nervous system in health and disease” (DN.1553 11.10.2022).
Funding. This work was supported by the European Union’s Horizon 2020 research and innovation program under Marie Skłodowska-Curie grant 721841 PainNET and the Italian Ministry of Health (RRC).
Duality of Interest. No potential conflicts of interest relevant to this article were reported.
Author Contributions. K.M.K. contributed to the conceptualization and design of the study, data acquisition and analysis, statistical and mathematical analysis, formal analysis, investigation, literature search, and drafting, review, and editing of the manuscript. P.G. contributed to the conceptualization and design of the study; formal analysis; investigation; literature search; drafting, review, and editing of the manuscript; funding acquisition; resources; and supervision. M.B. contributed resources. M.G.B. contributed to the data acquisition and analysis. C.S. and G.C. contributed to the statistical and mathematical analysis. D.G. and L.C. contributed to the data acquisition and analysis. D.Z. and M.M.G. contributed resources. C.G.F. contributed to the conceptualization and design of the study and resources. R.A.M. contributed to the drafting, review, and editing of the manuscript. M.M. contributed to the formal analysis, investigation, and literature search. E.S. contributed to the statistical and mathematical analysis. G.L. contributed to the conceptualization and design of the study; drafting, review, and editing of the manuscript; funding acquisition; resources; and supervision. C.P. contributed to the conceptualization and design of the study, data acquisition and analysis, formal analysis, investigation, literature search, original draft preparation of the manuscript, review and editing of the manuscript, resources, and supervision. K.M.K. is the guarantor of this work and, as such, had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.