Over the past ∼15 years there has been great progress in our understanding of the genetics of both type 1 diabetes and type 2 diabetes. This has been driven principally by genome-wide association studies (GWAS) in increasingly large sample sizes, where many distinct loci have now been reported for both traits. One of the loci that dominates these studies is the TCF7L2 locus for type 2 diabetes. This genetic signal has been leveraged to explore multiple aspects of disease risk, including developments in genetic risk scores, genetic commonalities with cancer, and insights into diabetes-related molecular pathways. Furthermore, the TCF7L2 locus has aided in providing insights into the genetics of both latent autoimmune diabetes in adults and various presentations of type 1 diabetes. This review outlines the knowledge gained to date and highlights how work with this locus leads the way in guiding how many other genetic loci could be similarly used to gain insights into the pathogenesis of diabetes.
Introduction
There is a general need to identify better drug targets with greater efficacy for common diseases, with gene discovery efforts presenting a specific possibility for novel therapeutic options. Given that a genetic component is well-known to exist for both type 1 diabetes and type 2 diabetes, genomics offers an opportunity to uncover new high-value targets for these diseases.
The evidence for a genetic component to type 2 diabetes is largely undisputed, given the combination of prevalence differences across different populations, familial clustering, and high concordance among monozygotic twins when contrasted with dizygotic twins; indeed, the sibling risk for type 2 diabetes is now estimated to be ∼3.5 (1), albeit varying somewhat between populations. In contrast to the relatively mild genetic component to type 2 diabetes, type 1 diabetes risk is far more clearly influenced by genetic factors, with it being most apparent within the MHC.
Genome-Wide Association Studies
Genome-wide association studies (GWAS) started to emerge in the literature around 2005, and such efforts have substantially advanced the field of diabetes genetics. Indeed, prior to GWAS, only a minuscule number of loci were established for type 1 and type 2 diabetes, including the MHC, INS, PTPN22, PPARG, and KCNJ11.
The resulting data from GWAS of complex traits are empirically interpreted through a search for signals breaching the specific P value threshold of 5 × 10⁻⁸. This bar is considered the gold standard for genome-wide significance and is derived from the required correction for a very large number of multiple comparisons due to assaying many hundreds of thousands of single nucleotide polymorphisms (SNPs) (Fig. 1). As a consequence, it is becoming increasingly clear that diseases like type 1 and type 2 diabetes are highly polygenic, with a plethora of additional signals achieving this level of significance as cohort sizes grow. GWAS have now revealed tens of loci for type 1 diabetes (2,3), while >240 loci have been reported for type 2 diabetes, corresponding to >400 independent association signals (4). This is a truly outstanding feat by the genetics field, where less than two decades ago type 2 diabetes in particular was seen as the “geneticist’s nightmare” and it was widely considered that finding genes for the disease was comparable to searching for the “holy grail.”
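As a rough illustration of where the 5 × 10⁻⁸ threshold comes from, the sketch below applies a Bonferroni correction to a family-wise error rate of 0.05. The figure of one million effectively independent tests is a conventional assumption rather than a number given in this review, and the function names and example P values are hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test P value cutoff that keeps the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def genome_wide_significant(p_values: dict[str, float], threshold: float) -> list[str]:
    """Return the variant IDs whose P value falls below the threshold."""
    return [variant for variant, p in p_values.items() if p < threshold]


# Assumption: ~1,000,000 effectively independent common-variant tests.
GWAS_THRESHOLD = bonferroni_threshold(alpha=0.05, n_tests=1_000_000)

# Hypothetical P values, made up purely to exercise the filter.
example_p_values = {"variant_a": 3e-12, "variant_b": 4e-7, "variant_c": 0.02}
hits = genome_wide_significant(example_p_values, GWAS_THRESHOLD)
```

Under these assumptions only the first hypothetical variant clears the bar; a signal at P = 4 × 10⁻⁷ would be treated as suggestive until larger cohorts provide more power.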
After extensive international efforts driving increasingly large sample sizes to characterize more and more of the genetic component to both diseases, the community has delivered not only a set of variants contributing to the susceptibility for each trait but, more importantly, the vast majority of them are readily replicable by independent research groups. The community can therefore move forward confidently by working with an agreed “truth” that can inform subsequent efforts to characterize how type 1 and type 2 diabetes loci operate.
Another strategy for seeking genetic variants associated with common complex traits is through the utilization of exome and whole-genome sequencing, which have also revealed a number of key variants. Although much rarer, these variants are relatively impactful, compared with individual GWAS signals, for the specific subjects harboring them. A notable discovery in recent years was the protein-truncating variants in SLC30A8 (Arg138* and Lys34Serfs*50), which are loss of function in nature but actually confer protection (5), although subsequent larger sequencing efforts initially did not identify additional coding variants (6) until reaching critical mass this year with the exome sequencing of >20,000 case and >24,000 control subjects to reveal many more findings (7). Furthermore, the TBC1D4 Arg684Ter nonsense variant confers very high risk of developing type 2 diabetes, but only specifically among Greenlanders (8). Despite these studies indisputably revealing exciting targets for pursuit, such genetic variants are generally extremely rare and/or specific to a very narrow geographic window, making their potential application to generalized genetic risk assessment somewhat limited currently.
Conversely, GWAS has uncovered variants that are common in the population. Although the risk these variants confer is relatively subtle, it is very statistically significant and much more readily detectable in population-based studies. Such results observed in GWAS have proved useful for investigating specific traits more closely.
The most statistically significant association signal for type 2 diabetes in nonisolated populations is at the TCF7L2 locus (4). Given it has been known since 2006 and that its relative strength is higher than most other GWAS-implicated common loci reported to date, this locus has proved a highly leveraged signal in follow-up studies. The signal was first reported by my colleagues and me (9) in Iceland and was subsequently replicated in additional cohorts from Denmark and the U.S. Initially detected within a linkage region for type 2 diabetes, i.e., a region of sharing across multiple Icelandic families with the disease, we followed up this region with localized association efforts using highly polymorphic microsatellite markers. The associated microsatellite was rapidly found to be in linkage disequilibrium with a series of SNPs, any of which could have been the much sought-after underlying causal variant.
It rapidly became clear from the first reported efforts in type 2 diabetes studies that the TCF7L2 locus was in fact the strongest GWAS signal (10), with subsequent independent studies readily replicating the association in populations of multiple ancestries. This finding remains the common locus most strongly associated with the disease (4).
Researchers went on to study, through association analyses, the influence of the TCF7L2 genetic signal on progress from prediabetes to diabetes, on drug response, and on possible association with other traits (see below), with varying degrees of positive outcomes. Interestingly, this variant has been reported to also be strongly associated with cystic fibrosis–related diabetes (11), the association appearing to be even stronger than that observed for type 2 diabetes itself.
Causal Variant
Large-scale genotyping approaches are excellent at interrogating the bulk of the common diversity in the genome, but the identification of the actual underlying causal variant “tagged” by each signal is not actually integral to the design of this method.
We refined the type 2 diabetes TCF7L2 association in West African cohorts down to a smaller region of the genome (12), concluding that from the variants tagging the association the T allele of intronic SNP rs7903146 was very likely to be the ancestral allele and served as the best proxy for the underlying mutation.
Subsequent transethnic analyses and functional efforts have further converged on this same SNP (4). Situated within the fourth intron of the TCF7L2 gene, it is the only variant that transcends ethnicity and captures the association with type 2 diabetes across various populations. It has similar allele frequencies around the world, except for East Asia, where the risk allele is relatively rare but still captures the disease association. Furthermore, the open chromatin landscape across the genome, which is an indication of the level of enhancer activity, has been shown to differ across rs7903146 depending on the allele studied (13). As such, this SNP has garnered general agreement as being the actual functional event at this locus—one of the first identified for a complex trait.
The TCF7L2 Locus and Type 2 Diabetes Therapy
There is continued excitement that novel type 2 diabetes genes offer new therapeutic opportunities, as has been seen previously with the peroxisomal proliferator–activated receptor γ (PPARG) gene in the context of thiazolidinediones and with the KCNJ11 gene (encoding Kir6.2) in relation to sulfonylurea treatment. However, glycemic control commonly involves a combination of oral agents, which in turn can lead to substantial side effects. In addition, the efficacy of existing drugs is regularly suboptimal and often ultimately requires insulin supplementation.
Efforts have gone on to attempt to determine where the TCF7L2 gene product, transcription factor 7-like 2, exerts its effect. It continues to be speculated (9) that it regulates the proglucagon gene (GCG) (14), which in turn yields a posttranslational product, glucagon-like peptide 1 (GLP-1). As this key incretin influences blood glucose homeostasis (an active area of treatment for type 2 diabetes), one theory is that the TCF7L2 locus exerts its influence on type 2 diabetes risk via the gut, a view supported by observations that incretin levels are associated with this genetic signal. Of course, the α-cell would be another key candidate cell type for incretin action, but given that TCF7L2 exonic mutations and gene fusions lead to colorectal cancer, there is some compelling evidence that at least one site of action is intestinal in nature.
But the type 2 diabetes community is very focused on the β-cell as a primary site, and despite very strong evidence for many of the other type 2 diabetes signals operating at this location, there is less compelling evidence for TCF7L2, which is delineated further below. This is further supported by site-specific humanized mice, which have revealed that there is indeed a role for this transcription factor in non–β-cell types (15). Adipose tissue has been strongly implicated, with multiple splice isoforms of TCF7L2 being implicated as playing a role in conferring risk for type 2 diabetes (16). Furthermore, a compelling study in murine liver strongly supported a possible hepatic role for the gene product (17).
With respect to therapeutic response, Pearson et al. (18) showed that carriers of the rs7903146 T allele present with a lower response rate to sulfonylureas compared with individuals carrying either homozygous genotype; however, they did not observe association between metformin response and the variant after adjustment for baseline HbA1c. Further studies are therefore warranted to confirm these observations.
TCF7L2 and Function
Understanding the actual mechanism by which the TCF7L2 locus confers its influence on type 2 diabetes risk holds great promise to gain insights into the disease. TCF7L2 has an intriguing connection to cancer and may offer up some clues on the underlying molecular mechanisms driving type 2 diabetes pathogenesis. Common variants in the TCF7L2 gene have now been reported in GWAS discovery efforts in the context of colorectal cancer, while a multicancer signal on chromosome 8q24 has been shown to drive the expression of MYC distally via a distant TCF7L2 regulatory element (19,20). Indeed, it is already known that exonic mutations within TCF7L2 (formerly TCF4) confer strong risk for colorectal cancer, while a recurrent VTI1A-TCF7L2 gene fusion is known to drive pathogenesis of colorectal adenocarcinoma (21).
The encoded transcription factor operates at the end of the Wnt signaling cascade. As highlighted above, there are a number of lines of evidence that TCF7L2 influences type 2 diabetes risk through the influence of GLP-1 production in the intestine (9,14). TCF7L2 knockout mice do not live longer than 1 day as a specific consequence of faults in the intestinal epithelium, with nothing observed in any other organ systems, suggesting that TCF7L2 plays a specific key role in the development of the intestinal epithelium (22).
However, a less direct correlation with insulin secretion via the TCF7L2 locus in another culprit cell type cannot be ruled out, given that early studies (23,24) suggest an influence on the insulinogenic index, implicating this locus in actually operating via insulin secretion mechanisms. Indeed, it has been shown that the risk variant is associated with increased TCF7L2 expression at the mRNA level along with a decrease in insulin secretion in the pancreatic β-cell, implicating this tissue type (25).
In contrast to the lack of expression of TCF7L2 in murine pancreatic islets (14), it has been reported that significant expression is observed in human purified pancreatic β-cells (26), further supporting the concept that this transcription factor is involved in the function of this cell type. In addition, Schäfer et al. (27) demonstrated that the TCF7L2 risk allele specifically impacts insulin secretion induced by GLP-1, suggesting disruption of GLP-1 signaling in pancreatic β-cells, as opposed to impacting secretion of GLP-1 itself.
The work of Lyssenko et al. (25) resonated with other reports of the TCF7L2 locus conferring its effect via pancreatic β-cells by showing an impact on insulin secretion. In addition to oral glucose, they also showed that islet tissue harboring the risk variant had a reduced response to external stimuli, such as intravenous glucose and arginine. They also observed a reduced β-cell response to incretins. Interestingly, TCF7L2 gene expression was actually five times higher in type 2 diabetes islets, contrary to what one might expect, while nondiabetic islets harboring the risk variant correlated with higher TCF7L2 gene expression. Furthermore, overexpression of TCF7L2 in human islets led to a decrease in the secretion of insulin.
One obvious issue for functional follow-up efforts is as follows: if the causal SNP, rs7903146, is in an intron, it is likely that it operates within a form of a yet to be defined enhancer element. To that end, we explored what protein factor bound across this region. Our oligo pulldown and subsequent mass spectrometry approach suggested that PARP-1 interacted with this putative element (28). Despite the fact that PARP-1 is a known promiscuous factor that binds to fragmented DNA and features in the “CRAPome” (29), we were intrigued by the finding. This was particularly the case given that PARP-1 is a cancer therapeutic target and the fact that type 2 diabetes and cancer genetics appear to show a degree of overlap; in addition, it is known that PARP-1 acts as an intermediate for driving diabetic microvascular complications. By bathing Caenorhabditis elegans in glucose, which is known to shorten the life span of this organism, along with a PARP-1 inhibitor we observed a rescue of this feature, which in turn was ablated when we knocked down the expression of the TCF7L2 homolog (30). Despite our intriguing results, there is a lack of consensus on what factors bind across this region, with FOXA2 and HMGB1 both being suggested to play a role. This still remains to be fully resolved.
In addition to exploring what binds to this locus, TCF7L2 is itself a transcription factor, so there is also interest in its binding pattern across the genome. Characterizing its occupancy would provide insight into its downstream targets, which in turn could present more therapeutically attractive avenues. However, binding predictions are hampered by the fact that the DNA binding motif for TCF7L2 is very degenerate. It is therefore clear that functional studies, such as with chromatin immunoprecipitation (ChIP), should be leveraged to physically characterize the portfolio of genes transcriptionally regulated by TCF7L2. Indeed, TCF7L2 has been recognized as a master regulator (31), and when we employed ChIP-seq to investigate its genome-wide occupancy (32), we found that the genes it bound were indeed in pathways related to diabetes and comorbidities; most interestingly, these genes were enriched at GWAS loci, highlighting that the binding of TCF7L2 was far from random.
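To make the point about motif degeneracy concrete, the sketch below estimates how often a short degenerate motif would match random sequence purely by chance. The motif string `WWCAAAG` is a commonly quoted TCF/LEF-style consensus used here only for illustration; it is not taken from this review, and the uniform base composition, function names, and test sequence are likewise assumptions.

```python
# IUPAC nucleotide codes mapped to the bases they permit.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "AT", "S": "CG", "R": "AG", "Y": "CT",
    "K": "GT", "M": "AC", "N": "ACGT",
}


def chance_match_probability(motif: str) -> float:
    """Probability that a random position matches, assuming equal base frequencies."""
    probability = 1.0
    for symbol in motif:
        probability *= len(IUPAC[symbol]) / 4
    return probability


def count_matches(sequence: str, motif: str) -> int:
    """Count start positions on the given strand where the motif matches."""
    width = len(motif)
    return sum(
        all(base in IUPAC[symbol]
            for base, symbol in zip(sequence[i:i + width], motif))
        for i in range(len(sequence) - width + 1)
    )


MOTIF = "WWCAAAG"  # illustrative consensus, an assumption (see lead-in)
p_match = chance_match_probability(MOTIF)
expected_hits_per_mb = p_match * 1_000_000  # single strand, random sequence
```

Under these simplifying assumptions a seven-base motif of this kind is expected to match hundreds of times per megabase on each strand, which is why sequence-based prediction alone cannot distinguish genuine occupancy from chance matches and why ChIP-based approaches are needed.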
Overall, the therapeutic opportunities that the TCF7L2 gene product potentially presents have still to be resolved, especially as it remains harbored within the most strongly associated common locus for type 2 diabetes.
Implicating Effector Genes at the TCF7L2 Locus
The knockout mouse model of the GWAS-implicated obesity-associated FTO gene (33) resulted in a clear BMI-related phenotype, despite the fact that the originally uncovered variant does not account for a large proportion of the predicted genetic component for obesity. So despite the magnitude of risk conferred by a given locus, it does appear that GWAS presents us with novel insights into the biology of complex traits and thus opens up novel therapeutic opportunities. However, the narrative on that association got more complicated when the GWAS signal, harbored within an FTO intron, was studied more carefully. Specifically, the signal is now known to influence the expression of nearby genes, IRX3 and IRX5, rather than principally the “host” gene itself (34). These discoveries suggest that the associated variants can be residing in one gene but actually influence the expression of other putatively causal effector genes some distance away.
Inspired by this FTO work, and given that the T allele of rs7903146 at the TCF7L2 locus is also within an intronic region and has been widely implicated as the causal variant, we similarly elected to implicate genes whose expression is influenced by the putative noncoding enhancer element coinciding with this key variant. To that end, we conducted a chromatin conformation capture campaign combined with sequencing in two carcinoma cell lines derived from human colon (motivated by the known biology of TCF7L2 with respect to the intestine), along with expression analyses (35). In addition to observed physical contacts and correlation with TCF7L2 gene expression, we also observed a strong relationship with the ACSL5 gene, which encodes acyl-CoA synthetase long chain family, member 5, and is known to influence fatty acid metabolism. Indeed, a recent study reported that the ACSL5 knockout mouse presented with increased insulin sensitivity (36). Subsequent genome editing in the immediate neighborhood around the presumed enhancer coinciding with the location of rs7903146 led to a very substantial reduction in ACSL5 protein expression. As such, we are working with the hypothesis that rs7903146 lies in a distal regulatory region for ACSL5 and is a mechanism that at least partially influences type 2 diabetes risk at this locus (35). However, although a recent international expression quantitative trait loci (eQTL) effort in pancreatic islets did support TCF7L2 as an effector gene at this locus, that study did not find any evidence for ACSL5 in that particular tissue type (37).
To add further complexity, there are predicted to be multiple splice isoforms of the gene transcript, so it is likely that this feature is confounding functional follow-up analyses of TCF7L2. Despite there being fewer isoforms in endocrine-relevant cell types when compared with cancerous ones, there is still reason to believe this could cloud possible associations with gene expression. For instance, given it has been relatively challenging to observe significant eQTL between this SNP and TCF7L2 gene expression up until recently, this splicing feature may be one of the contributing factors.
Despite the ongoing efforts to characterize the precise functional behavior of this locus, there is one undisputed fact: the type 2 diabetes genetic signal at the TCF7L2 locus is beyond doubt and can be leveraged in other types of studies. Indeed, with larger and larger GWAS being published (4), there are now more and more independent signals, albeit substantially weaker, being observed at this locus, highlighting how important this region is in conferring susceptibility to type 2 diabetes and potentially other diseases.
Index Event Bias
One possible mechanism by which the TCF7L2 locus confers its type 2 diabetes susceptibility effect is by influencing BMI and therefore insulin resistance. Interestingly, a very large GWAS of adult BMI implicated the TCF7L2 locus in influencing risk for higher BMI (38). However, what was intriguing was that it was the nonrisk C allele of rs7903146 that conferred higher BMI, which seemed at odds with the T allele conferring risk for type 2 diabetes. The authors of the study did outline the fact that there was a relatively high degree of heterogeneity between cohorts, so the observation could have been due to issues related to differential accounting for comorbidities. Furthermore, sometime later it was shown that the TCF7L2 locus was an example of “index event bias” in the context of obesity risk (39), where stratification can lead to a false positive association with BMI. As such, there is still debate as to whether the association with obesity is a true phenomenon.
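One common way in which index event bias arises can be sketched with a toy simulation: a risk allele and BMI are generated independently, both raise disease liability, and the analysis is then restricted to cases. All effect sizes, the allele frequency, the liability cutoff, and the function names below are made-up assumptions chosen only to make the effect visible; they are not estimates from the cited studies.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n=50_000, risk_allele_freq=0.3, seed=1):
    """Simulate a risk allele and BMI that are independent in the population."""
    rng = random.Random(seed)
    genotypes, bmi_z, is_case = [], [], []
    for _ in range(n):
        # Risk allele count (0, 1 or 2), drawn independently of BMI.
        g = sum(rng.random() < risk_allele_freq for _ in range(2))
        z = rng.gauss(0.0, 1.0)  # standardized BMI
        # Hypothetical liability: both the allele and BMI raise disease risk.
        liability = 0.5 * g + 0.8 * z + rng.gauss(0.0, 1.0)
        genotypes.append(g)
        bmi_z.append(z)
        is_case.append(liability > 1.5)
    return genotypes, bmi_z, is_case


genotypes, bmi_z, is_case = simulate()
corr_population = pearson(genotypes, bmi_z)  # close to zero by construction

# Restrict to cases, i.e., condition on the index event (disease).
case_genotypes = [g for g, c in zip(genotypes, is_case) if c]
case_bmi = [z for z, c in zip(bmi_z, is_case) if c]
corr_cases = pearson(case_genotypes, case_bmi)  # negative, induced by selection
```

In this toy setting the risk allele appears to lower BMI among cases even though no such effect was simulated, which mirrors the direction of the observation described above: carriers of the risk allele need less of a BMI contribution to cross the disease threshold.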
And the issue of confounding does not stop there; understanding the interplay between the various forms of diabetes is extremely important in such a context.
TCF7L2, Latent Autoimmune Diabetes in Adults, and Type 1 Diabetes
It has been widely thought that type 1 and type 2 diabetes have distinct underlying disease mechanisms with very little evidence for genetic overlaps between the two diseases. This is what has been seen generally with the TCF7L2 locus, where it does not obviously play an overall role in the pathogenesis of juvenile-onset type 1 diabetes (40,41).
However, we and others have observed that the TCF7L2 locus is associated with latent autoimmune diabetes in adults (LADA). This relatively common disease, which progresses to insulin dependency more slowly than type 1 diabetes, is considered by the World Health Organization as a distinct diabetes subgroup (42). There is very little consensus on the precise definition of the disease, with disagreement even existing on the typical age of onset, which ranges from 25 to 40 years old. Overall, the disease often masquerades as type 2 diabetes, as these adults have no need for insulin treatment for a minimum of 6 months following diagnosis of diabetes. As such, LADA presents with clinical features of both type 1 diabetes and type 2 diabetes, thus being often termed “type 1.5 diabetes.”
Such cases within a given apparent type 2 diabetes cohort could lead to spurious genetic associations, especially when one is pursuing increasingly modest signals in larger and larger cohorts. Indeed, in relatively recent GWAS efforts, type 2 diabetes loci involved in autoimmune traits are beginning to emerge; for instance, a GWAS of type 2 diabetes in African Americans revealed two novel signals that are both classic type 1 diabetes loci, namely, at the MHC and INS (43).
The roles of the type 1 diabetes MHC locus and the type 2 diabetes TCF7L2 locus have suggested that LADA sits at the “genetic meeting point” of these two traits (44,45). Indeed, one study suggested that the TCF7L2 locus could be used in young patients to distinguish autoimmune from nonautoimmune causes of diabetes; however, it was not applicable to older patients (46).
Such observations of course do not exclude the possibility that LADA just represents a mixture of poorly defined type 1 diabetes cases and type 2 diabetes cases, but given that the TCF7L2 locus has been reported to be associated with LADA in a number of cohorts, these findings do suggest an intriguing role for type 2 diabetes genetics in a disease with many features of type 1 diabetes pathogenesis.
As with LADA, type 1 diabetes cases present with varying repertoires of positivity for the four to five autoantibodies that are hallmarks of the trait. Maria Redondo and colleagues have spent much time leveraging the TCF7L2 locus to explore a possible role in the pathogenesis of type 1 diabetes, primarily as a first pass, given it is the strongest common type 2 diabetes locus and could guide the community on exploring further with other such loci. Intriguingly, although the TCF7L2 locus is not associated with type 1 diabetes overall, it does show a role when one focuses on type 1 diabetes cases only positive for one autoantibody (47). In this setting, the association is quite apparent and resonates with the LADA observations, with this particular subgroup residing in a gray zone between the two diseases. Furthermore, the same research group went on to show that autoantibody cases progress to full type 1 diabetes presentation less rapidly when the type 2 diabetes TCF7L2 risk allele is carried as well (48).
So, this presents us with an intriguing question: does the genetics of type 2 diabetes help “push” individuals over into diabetes presentation if they do not harbor the full repertoire of type 1 diabetes genetic risk factors, i.e., does the former act as a catalyst to the latter? In short, that question still remains to be answered, and further follow-up is certainly warranted. These observations could have fundamental implications for personalized approaches to treatment for what is increasingly looking like a spectrum of diabetes subtypes. The TCF7L2 locus remains a potent genetic factor to inform such studies going forward.
Summary
Although TCF7L2 is far from the only locus associated with type 2 diabetes, it does represent one of the first and strongest genetic signals for this trait. As such, it has proved useful for some time in exploration of the role of type 2 diabetes genetics in various settings, including revealing clues to a common genetic etiology with cancer, revealing a role in the presentation of type 1 diabetes and LADA, and providing lessons on how to best functionally follow up a GWAS locus. Given the observations made with this locus, it bodes well when one comparably leverages the other established type 2 diabetes loci either individually or combined in the form of genetic risk scores.
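For readers unfamiliar with how loci are combined into a genetic risk score, a minimal sketch of one widely used form follows: a sum of risk-allele counts weighted by the log of each variant's per-allele odds ratio. The odds ratios, the placeholder locus names, and the function name are hypothetical and serve only to show the arithmetic; they are not published estimates.

```python
import math


def weighted_grs(risk_allele_counts: dict[str, int],
                 odds_ratios: dict[str, float]) -> float:
    """Sum of risk-allele counts weighted by each variant's log odds ratio."""
    score = 0.0
    for variant, count in risk_allele_counts.items():
        if count not in (0, 1, 2):
            raise ValueError(f"{variant}: allele count must be 0, 1 or 2")
        score += count * math.log(odds_ratios[variant])
    return score


# Placeholder per-allele odds ratios, NOT published estimates.
illustrative_odds_ratios = {"rs7903146": 1.4, "locus_b": 1.1, "locus_c": 1.05}

# One hypothetical individual: number of risk alleles carried at each variant.
individual = {"rs7903146": 2, "locus_b": 1, "locus_c": 0}
score = weighted_grs(individual, illustrative_odds_ratios)
```

Because the weights are log odds ratios, a strongly associated locus such as TCF7L2 contributes more to the score per risk allele than the many loci with subtler effects.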
Article Information
Funding. S.F.A.G. is supported by the National Institutes of Health (R01 DK085212) and the Daniel B. Burke Endowed Chair for Diabetes Research.
Duality of Interest. No potential conflicts of interest relevant to this article were reported.