Genome-wide association studies (GWAS) have identified many genetic locations harboring variation that increases susceptibility to type 2 diabetes (T2D) (1). However, to translate these exciting findings into rational, personalized treatment strategies for patients, one needs to understand these loci in much greater detail. To begin with, it is far from clear by what mechanism these genetic differences drive T2D risk; indeed, GWAS typically report variation that is not itself causal but rather “travels” closely down the generations with the culprit variant. Furthermore, it has proven challenging to identify the actual causal gene at each location. Studies of obesity genetics highlight this point. For some time, attention was focused on understanding FTO, as intronic variation within this gene was consistently implicated in obesity by GWAS (2,3). However, it was recently reported that these variants actually act at a distance to influence the expression of the neighboring gene, IRX3 (4). There is much interest, therefore, in experimental strategies that can elucidate the functional significance of T2D GWAS variants while avoiding misattribution of biological risk.
In this issue of Diabetes, Locke et al. (5) applied a logical molecular biology approach to tackle this issue. They sought to discover the regional effects of previously identified T2D risk loci resulting from multiple GWAS efforts, the largest and most recent being from the DIAbetes Genetics Replication And Meta-analysis (DIAGRAM) consortium (6). Specifically, they investigated a particular mechanism by which nucleotide changes could impact T2D risk, namely by changing the transcription of genes in the proximity of a given signal. Their approach, “targeted allelic expression profiling,” aimed to identify imbalances in gene expression related to T2D risk–associated alleles. The presence of such expression differences was hypothesized to tip the scales in favor of a transcriptional explanation for at least some of the GWAS results.
The authors’ strategy is illustrated in Fig. 1. Many genetic variants associated with increased T2D risk are single nucleotide polymorphisms (SNPs) that lie in regions of genes (introns) that are spliced out of the transcript and are therefore absent from mature messenger (m)RNA. As a result, the effect of these intronic SNPs on gene expression can be difficult to assess directly. For each “lead” intronic SNP (i.e., the variant that captures the association most strongly) identified in major GWAS reports of T2D, the investigators searched for “proxy” exonic SNPs (i.e., variants inherited together with the lead SNPs but located in an exon instead of an intron and thus much more amenable to expression analyses). For example, as shown in Fig. 1, lead SNP rs2007084 is located in the intron of the gene ANPEP but is in linkage disequilibrium (i.e., inherited together) with proxy SNP rs17240240, located in one of the exons of ANPEP. The quantity of mature mRNA carrying the C allele (acting as a proxy for the risk allele of the lead SNP) can then be measured and compared with the amount carrying the T allele (acting as a proxy for the nonrisk-conferring allele at the lead SNP). In this way, the transcriptional effects attributable to the risk allele can be isolated, with transcription from the other allele serving as a within-experiment control.
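The logic of comparing the two alleles within a single heterozygous sample can be sketched in a few lines of Python. This is a minimal illustration, not the assay or statistical procedure used by Locke et al.: it assumes that allele-specific expression has been reduced to counts of transcripts carrying each allele, and it applies an exact two-sided binomial test against a 50:50 expectation. The function name and the counts are hypothetical.

```python
from math import comb


def allelic_imbalance(risk_count, nonrisk_count):
    """Return (risk-allele fraction, two-sided exact binomial p-value).

    Assumption: the inputs are counts of mature transcripts carrying each
    allele of the proxy exonic SNP in one heterozygous donor.
    """
    total = risk_count + nonrisk_count
    if total == 0:
        raise ValueError("no allele-specific transcripts were counted")
    fraction = risk_count / total
    # Null hypothesis: balanced expression, so each transcript is equally
    # likely to come from either allele (binomial with p = 0.5).
    observed_weight = comb(total, risk_count)
    # Two-sided p-value: total probability of outcomes no more likely
    # than the one observed.
    as_extreme = sum(
        comb(total, k)
        for k in range(total + 1)
        if comb(total, k) <= observed_weight
    )
    return fraction, as_extreme / 2**total


# Hypothetical counts for illustration only (not data from the study).
fraction, p_value = allelic_imbalance(risk_count=70, nonrisk_count=30)
print(f"risk-allele fraction = {fraction:.2f}, p = {p_value:.2g}")
```

Because both alleles are measured in the same cells of the same donor, differences between donors in tissue quality or overall expression level affect both counts alike, which is what makes the nonrisk allele a useful internal control.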
A suitable proxy exonic SNP partner could not be found for every lead SNP. Indeed, of the 65 loci identified in the original GWAS, ultimately only 18 unique exonic SNPs could be leveraged. Samples of islet tissue from 36 deceased, white donors without diabetes were used for the gene expression studies. For the allelic expression profiling to be feasible for a given lead SNP, donors needed to be heterozygous for that SNP (i.e., have a copy of each allele, as illustrated in Fig. 1).
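The heterozygosity requirement limits how many of the 36 donors are informative for any one SNP. The sketch below illustrates this under stated assumptions: genotypes are represented as hypothetical two-letter strings, and the expected number of heterozygotes is computed under Hardy-Weinberg equilibrium (2pq multiplied by the number of donors). The allele frequency used is an arbitrary example, not a value reported in the study.

```python
def informative_donors(genotypes):
    """Return IDs of donors heterozygous at the assayed SNP.

    Assumption: `genotypes` maps a donor ID to a two-letter genotype
    string such as "CT"; this data layout is hypothetical.
    """
    return [donor for donor, gt in genotypes.items()
            if len(gt) == 2 and gt[0] != gt[1]]


def expected_heterozygotes(n_donors, allele_freq):
    """Expected heterozygote count under Hardy-Weinberg equilibrium (2pq * n)."""
    return n_donors * 2 * allele_freq * (1 - allele_freq)


# Hypothetical genotypes for illustration only (not data from the study).
example = {"donor_1": "CT", "donor_2": "CC", "donor_3": "TC", "donor_4": "TT"}
print(informative_donors(example))
print(expected_heterozygotes(36, 0.10))
```

With 36 donors and an allele frequency of 10%, for instance, only about six or seven donors would be expected to be heterozygous, so SNPs with rarer alleles leave few samples in which the two alleles can be compared.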
For five of the genes with available data, differential gene expression related to genotype at the proxy exonic SNP was identified and confirmed using other linked exonic SNPs. This short list includes genes with well-characterized function in islets. For example, KCNJ11 encodes an ATP-sensitive K+ channel that couples glucose-stimulated energy production to insulin secretion in the β-cell; mutations in KCNJ11 have been associated with neonatal diabetes (7). For other genes, there is a clear association with diabetes, and gene function is beginning to be better understood. For example, WFS1 is mutated in Wolfram syndrome, a complex multisystem disorder that includes diabetes precipitated by nonimmune-mediated pancreatic β-cell death. Mutant WFS1 may cause β-cell endoplasmic reticulum stress (8). In contrast, ANPEP (9), whose status as the causal gene was supported by additional expression quantitative trait loci experiments, encodes a transmembrane metalloprotease with a posited role in angiogenesis (10); its involvement in diabetes pathogenesis remains to be explored.
The choice to use pancreatic islet tissue for these proof-of-principle experiments is a logical one, as pancreatic β-cell failure is a clinical hallmark of T2D. In addition, many T2D risk variants appear to exert their effects by altering insulin processing and secretion (11). However, many of the neighboring genes are also widely expressed outside the pancreas, and evidence of potentially significant regulatory variation at important T2D risk loci (e.g., TCF7L2) in nonpancreatic tissues is accumulating (12–14). Studying these other tissues may yield a more complete picture. Indeed, Locke et al. (5) acknowledge that their experiments do not elucidate whether or how these differences in gene expression influence T2D risk. They point out that there is a precedent for even apparently small changes in expression affecting biology. For example, haploinsufficiency (i.e., carrying one mutated copy) of SLC30A8, a gene that encodes an islet zinc transporter, appears sufficient to substantially reduce risk for T2D (15). A risk allele in the 3′ untranslated region of SLC30A8 also produced allelic expression imbalance in their study.
Despite not being able to assess every locus due to a lack of an available exonic proxy and the limitation of a single tissue, these experiments demonstrate one promising strategy for identifying how GWAS loci tip the scales of gene expression. Allelic expression profiling therefore may be one incremental step in translating findings from GWAS into a better understanding of T2D pathogenesis.
See accompanying article, p. 1484.
Article Information
Acknowledgments. The authors would like to thank Angela Knott (Children's Hospital of Philadelphia) for assistance in generating Fig. 1.
Funding. The authors received funding from National Institutes of Health grant K12 DK094723-01 (S.E.M.), and from the Daniel B. Burke Endowed Chair for Diabetes Research (S.F.A.G.).
Duality of Interest. No potential conflicts of interest relevant to this article were reported.