Given that susceptibility to type 2 diabetes appears to be due in large measure to genetic makeup, investigators have spent years trying to identify genes that influence type 2 diabetes risk. An important goal of gene identification is to improve our understanding of the pathophysiology of diabetes, leading to new measures of diagnosis, prevention, and treatment. A diabetes gene is considered identified when variants in that gene (more specifically, variation in DNA sequence between individuals) are found to be associated with type 2 diabetes and/or its pathophysiologic abnormalities, such as insulin resistance or impaired insulin secretion.
TCF7L2 (transcription factor 7-like 2) was identified as a gene for type 2 diabetes in 2006 (1). Notably, its effect on diabetes (relative risk 1.5–1.6) is substantially larger than that of previously established diabetes genes (e.g., peroxisome proliferator–activated receptor γ [PPARG] and β-cell inwardly rectifying K+ channel KIR6.2 [KCNJ11], relative risk ∼1.2 each). Most studies of TCF7L2 have focused on the genetic variants that were implicated in the original report (1), largely ignoring the remainder of the gene. In this issue of Diabetes, investigators took the alternative approach of examining variants across the entire gene, which allowed them to discover a novel variant in TCF7L2 that affects diabetes risk (2). We believe this approach to genetic association studies is of great merit, as described below.
In the report that first identified TCF7L2 as a diabetes gene, a microsatellite (DG10S478) was highly associated with type 2 diabetes (in three Caucasian cohorts), as were five single nucleotide polymorphisms (SNPs) in linkage disequilibrium (LD). Of those five SNPs, two (rs12255372 and rs7903146) were identified as most strongly associated with type 2 diabetes; subsequent reports determined SNP rs7903146 had the greatest effect (3,4). Deep resequencing of exons and the surrounding region has not identified any variants with stronger effect (4). Over 50 articles subsequently described the role of TCF7L2 in multiple cohorts, firmly establishing the gene's place among diabetes genes (5). Quantitative phenotypic associations (3,6,7) and its role in proglucagon gene expression (8) suggest that TCF7L2 modulates diabetes risk via impaired insulin secretion. Further support that TCF7L2 influences insulin secretion comes from an article recently published in Diabetes, wherein genetic variation in TCF7L2 was shown to influence efficacy of sulfonylureas (agents that promote insulin secretion) but not efficacy of metformin (insulin sensitizer) (9).
The majority of the subsequent studies followed the lead of the original study, genotyping either the one or two most associated SNPs, or all five, with or without the microsatellite DG10S478. These variants are in a haplotype block that encompasses DG10S478 and includes part of intron 3, all of exon 4, and part of intron 4 (1). In the few instances where more extensive SNP genotyping was performed, it was usually centered around DG10S478 (3,10,11). As a result, SNP rs7903146 (specifically the T-allele) has been firmly established as increasing risk of type 2 diabetes in multiple European populations and in Caucasians in general, as well as in diverse groups such as Mexicans, Amish, Indian Asians, and Moroccans (5). Some studies of individuals of African descent were not able to document association of TCF7L2 with diabetes (12,13), while others, including a study in this issue of Diabetes, were positive for rs7903146 (4,14). Unlike in Caucasians, the two most associated SNPs exhibit weak LD in Africans, which made it possible to determine that rs7903146 plays a greater role than rs12255372 in diabetes susceptibility. The functional role that this intronic SNP may play is still unknown. Furthermore, it does not explain the linkage signal that originally led investigators to chromosome 10q (1), raising the possibility that other variants in the region (in TCF7L2 or other genes) may predispose to type 2 diabetes. Other plausible candidate genes do exist on chromosome 10q (15,16).
The near-exclusive focus on rs7903146 and SNPs in LD with it has probably delayed the discovery of other variants in TCF7L2 that may affect risk of type 2 diabetes. In Asian populations, the frequencies of SNPs rs7903146 and rs12255372 are quite low. Nevertheless, association with these SNPs was identified in two large Japanese cohorts (>1,000 cases each), where the minor allele frequency (MAF) ranged from 3 to 5% (17,18). The large size of these cohorts provided adequate power for discovery despite the rarity of the variants examined. This was fortuitous; had the allele frequencies or sample sizes been slightly lower, false-negative studies would have been the likely result, even though the gene and those variants confer risk in these populations.
In the study of Chinese Han (760 case and 760 control subjects) reported in this issue (2), if the investigators had followed the usual strategy of looking only at the original associated SNPs, they would have found no associations. In Chinese, rs7903146 is so rare (MAF 2–2.5%) that 1,700 cases would be required to detect an association with an effect size similar to that in Caucasians (2). Fortunately, these investigators tested the entire TCF7L2 gene rather than only particular variants of prior interest. In addition to genotyping SNPs previously associated with type 2 diabetes, they used information from HapMap (19) to select a set of 13 SNPs to capture the majority of SNPs in the gene with MAF >20%. The “classic” SNPs showed no association; however, SNP rs290487 at the 3′ end of the gene, as well as haplotypes including this SNP, were associated with type 2 diabetes. The comprehensive approach of this study led to the identification of a TCF7L2 SNP not in LD with rs7903146 that may affect type 2 diabetes risk. This demonstrates the utility of the global gene approach.
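The dependence of power on allele frequency and cohort size can be illustrated with a rough calculation. The sketch below is not the method used by Chang et al. (2); it is a simplified two-proportion normal approximation for an allelic case-control test, and it assumes an allelic odds ratio of 1.5, a control MAF of 2.5%, a two-sided α of 0.05, a target power of 80%, and equal numbers of case and control subjects. The function names are illustrative only.

```python
from math import ceil, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def case_allele_freq(control_freq, odds_ratio):
    """Risk-allele frequency in cases implied by an allelic odds ratio."""
    odds = odds_ratio * control_freq / (1.0 - control_freq)
    return odds / (1.0 + odds)


def allelic_power(n_cases, n_controls, control_freq, odds_ratio, alpha=0.05):
    """Approximate power of a two-sided allelic test (normal approximation).

    Each subject contributes two alleles. The negligible probability of
    rejecting in the wrong direction is ignored.
    """
    p0 = control_freq
    p1 = case_allele_freq(p0, odds_ratio)
    se = sqrt(p1 * (1 - p1) / (2 * n_cases) + p0 * (1 - p0) / (2 * n_controls))
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    return _STD_NORMAL.cdf(abs(p1 - p0) / se - z_crit)


def cases_needed(control_freq, odds_ratio, alpha=0.05, power=0.80):
    """Cases required (with an equal number of controls) for the target power."""
    p0 = control_freq
    p1 = case_allele_freq(p0, odds_ratio)
    z_sum = _STD_NORMAL.inv_cdf(1 - alpha / 2) + _STD_NORMAL.inv_cdf(power)
    alleles_per_group = z_sum ** 2 * (p1 * (1 - p1) + p0 * (1 - p0)) / (p1 - p0) ** 2
    return ceil(alleles_per_group / 2)  # two alleles per subject


if __name__ == "__main__":
    # Assumed inputs: control MAF 2.5%, allelic odds ratio 1.5.
    print("Power with 760 cases and 760 controls:",
          round(allelic_power(760, 760, 0.025, 1.5), 2))
    print("Cases needed for 80% power:", cases_needed(0.025, 1.5))
```

Under these assumptions, a cohort of 760 cases and 760 controls falls well short of 80% power for a variant this rare, and the required number of cases is more than double the cohort size. The exact figures shift with the assumed odds ratio, allele frequency, and test, so they should be read as order-of-magnitude guidance rather than as a reproduction of the published estimate.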
Given the narrow focus of prior studies of TCF7L2, no other study to date has examined this recently identified 3′ variant. Those groups with large diabetic cohorts should go back and examine this variant and its neighbors for a role in type 2 diabetes, which is feasible given that rs290487 occurs with appreciable frequency in all HapMap populations except Yorubans. HapMap also reveals that this SNP is not well captured by other SNPs (at r² > 0.8), indicating that its effect is likely to be missed if the SNP itself is not genotyped. The only other study to take a global approach to TCF7L2 (challenging with this large [∼216 kb] gene) was carried out in Mexican Americans (20). This study did not genotype rs290487 but did genotype rs290483, also in the 3′ end, with negative results. Notably, one of the recent genome-wide association studies for type 2 diabetes found that rs290483 was associated with type 2 diabetes (a finding eclipsed by the stronger association of rs7903146 LD group SNPs) (21). This latter observation and the data of Chang et al. (2) suggest the potential functional importance of variation in the 3′ end of the gene, which may influence alternative splicing of TCF7L2, in which there exist many alternative splice sites (22). Clearly this end of the gene deserves the kind of attention that the upstream region of rs7903146 has received.
When follow-up gene marker studies focus only on the original associated variants, they amount only to replication studies that will not result in novel discoveries unless they examine a related but different phenotype, e.g., response to treatment (9), or extend the result to additional populations (14). To test the entire gene, comprehensive genotyping such as that carried out by Chang et al. (2) is necessary. This is particularly important if the replication cohort is not of the same ethnic group as the original cohort. Comprehensive global genotyping leads to a complete understanding of the SNP frequencies and haplotype structure, which may account for differences in association results. Given the availability of the HapMap database, and the falling price of genotyping, investigators now more readily have the option to test the entire gene (or region) rather than only a particular variant.