Chromatin accessibility quantitative trait locus (caQTL) discovery is vital for linking epigenomic features to genetic variation in Type 2 diabetes (T2D). However, caQTL scans require clear epigenomic signals, which poses a challenge for rare cell types in single-nucleus (sn) studies. To address this, we propose using deep learning (DL) to upscale chromatin accessibility signals, such as snATAC-seq data from skeletal muscle (SM), in order to uncover new caQTLs in rare cell types.

We used snATAC-seq data from a recent study, comprising ~260K nuclei across 287 human SM biopsies, together with a DL model to upscale accessibility profiles. The study identified 13 cell populations and performed caQTL scans in the five most prevalent types: muscle fibers (Type 1, 2a, 2x), endothelial cells (ECs) and fibro-adipogenic progenitors (FAPs). Their abundance ranged from 29% of ATAC nuclei (Type 1) down to 12% (ECs) and 5% (FAPs). We trained the model on two Type 1 fiber read sets: a sparse set whose read count matched the median count across per-sample EC nuclei, and an abundant set 50x larger than the sparse set. The trained model was then used to upscale the EC and FAP snATAC signals for caQTL scans. caQTLs identified with upscaled profiles were compared against those identified without upscaling (“base”).
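The construction of the paired training sets can be illustrated with a minimal sketch. It assumes reads are held in a simple list, that the sparse set is drawn as a subset of the abundant set, and that one median EC read count is computed across samples; these choices, along with the function name `build_training_pair` and its parameters, are hypothetical and are not taken from the study's pipeline.

```python
import random
from statistics import median


def build_training_pair(reads, ec_counts_per_sample, fold=50, seed=0):
    """Return (sparse, abundant) subsets of Type 1 fiber reads.

    reads: list of read or fragment records from Type 1 fibers
    ec_counts_per_sample: EC read counts, one value per sample
    fold: size ratio of abundant to sparse set (50 in the text)
    """
    rng = random.Random(seed)
    # Sparse depth mimics a rare cell type: the median EC read count.
    sparse_n = int(median(ec_counts_per_sample))
    abundant_n = fold * sparse_n
    if abundant_n > len(reads):
        raise ValueError("not enough reads to build the abundant set")
    # Sample without replacement to get the high-depth target.
    abundant = rng.sample(reads, abundant_n)
    # Assumption: the sparse input is nested inside the abundant target.
    sparse = rng.sample(abundant, sparse_n)
    return sparse, abundant
```

A model trained on such pairs learns to map a low-depth profile to its high-depth counterpart, which is the mapping later applied to the EC and FAP profiles.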

With upscaled profiles, we found 4,753 and 8,722 significant caQTL peaks in ECs and FAPs, respectively, compared with 3,412 EC and 4,762 FAP significant caQTL peaks in the base scans (5% FDR). The upscaled and base scans were highly correlated in ECs and FAPs, respectively (Spearman ρ = 0.96 and 0.95 for signed -log10(p-value); Spearman ρ = 1 and 0.99 for effect sizes). Notably, DL upscaling discovered an additional 1,847 EC and 4,652 FAP caQTL peaks that were not detected in the base scans.
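The concordance statistics above can be sketched as follows. This is an illustrative stdlib-only implementation, assuming each scan yields one p-value and one effect size per peak; the helper names (`signed_neg_log10_p`, `average_ranks`, `spearman_rho`) are hypothetical, and the study's own tooling is not described in the text.

```python
import math


def signed_neg_log10_p(p_value, effect_size):
    """Return -log10(p) carrying the sign of the effect size."""
    return math.copysign(-math.log10(p_value), effect_size)


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(vx * vy)
```

Signing the -log10(p-value) by effect direction means that a peak whose allelic effect flips between the base and upscaled scans lowers the correlation, rather than counting as agreement on significance alone.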

DL upscaling is expected to help characterize the genetic regulatory landscapes of rare cell types. We are now testing for colocalization between caQTLs from the upscaled analysis and GWAS signals for T2D and related traits, to identify novel caQTLs that may share causal variants with these signals and to nominate rare effector cell types in SM.

Disclosure

H.T.H. Vu: None. A. Varshney: None. P. Orchard: None. M. Laakso: None. J. Tuomilehto: Stock/Shareholder; Orion Pharma, Aktivolabs, Digostics. T.A. Lakka: None. K.L. Mohlke: None. M. Boehnke: None. L. Scott: None. H.A. Koistinen: Other Relationship; AstraZeneca, Novo Nordisk. F.S. Collins: None. S.C. Parker: Research Support; Pfizer Inc.

Funding

NIH NIDDK (U24DK138515)

Readers may use this article as long as the work is properly cited, the use is educational and not for profit, and the work is not altered. More information is available at http://www.diabetesjournals.org/content/license.