Diabetes, and particularly type 2 diabetes, is one of the most significant public health problems facing Western civilization. Every day, 4,100 people in the U.S. are diagnosed with diabetes. Among those with diabetes, each day 230 undergo leg amputation, 120 are newly placed on kidney dialysis, and 55 become blind. Overall, almost 7% of the U.S. population (∼21 million people) has diabetes, including ∼1 in 5 individuals over the age of 60 years. Diabetes is the sixth leading cause of death (and is likely underreported), and people with diabetes have twice the risk of death of people of the same age without diabetes. Death in individuals with diabetes results primarily not from diabetes itself but from the complications of the disease. In all, the cost of diabetes to the U.S. economy in lost productivity and direct medical care is thought to be nearly 132 billion USD (1). To predict who is at risk of diabetes, design efficacious behavioral and clinical interventions, and identify critical targets for pharmacologic therapy, a better understanding of the pathways and mechanisms leading to diabetes is required. It is this need for a more complete understanding of etiology that makes genetic studies fundamentally important.

There is extensive and consistent evidence that genetic factors play an important role in modifying an individual's risk for type 2 diabetes (2–6). A major research focus for several decades has been the identification of genes contributing to diabetes risk. Unlike the many forms of maturity-onset diabetes of the young (MODY), which are transmitted as single-gene defects and appear “Mendelian” in nature (7), typical type 2 diabetes is multifactorial in its transmission (8). Unlike type 1 diabetes, which is multifactorial but has a major genetic contribution from genes in the human major histocompatibility complex (MHC) (including the HLA and other genes [9]), type 2 diabetes has few genes of major effect contributing to genetic risk (10). Thus, the search for genes contributing to risk of type 2 diabetes has been difficult, and the genes themselves have been elusive.

The methods for gene discovery for complex human diseases such as type 2 diabetes have been evolving rapidly. Previous research focused either on evaluating families with multiple cases of diabetes to detect linkage to a gene in a hypothetical causal pathway or on testing functional variants in candidate genes using a case-control approach (11,12). Historically, these studies were limited both by a small population size, which reduced statistical power to detect all but major gene effects, and by low genomic coverage, in which only a few variants within the candidate gene were tested. A problem with these earlier studies was that the positive findings were seldom replicated, thereby generating confusion and concern about the application of genetic methods to diabetes risk assessment. With the advent of the International HapMap Project (13,14), the limitation of genomic coverage was effectively resolved. Reagents are now available to cover the human genome at a 5-kb resolution, and the structure of the individual candidate genes can now be characterized, making population size (and replication of novel findings) the primary requirement for gene discovery.

Recently, a series of genome-wide association scans for type 2 diabetes was published (15–25). These scans have taken an unbiased (agnostic) view of the genome in relation to type 2 diabetes genetic risk. Hundreds of thousands of single-nucleotide polymorphisms (SNPs) across the genome have been assayed in samples from populations of almost exclusively northern European ancestry, and novel genes (TCF7L2, SLC30A8, IDE-KIF11-HHEX, CDKAL1, CDKN2A-CDKN2B, IGF2BP2, FTO, etc. [26]) with uncertain function pertaining to risk of type 2 diabetes have been identified. Despite the increase in the number of genes from the few (PPARG, KCNJ11, CAPN10) to the many (now over a dozen), the contribution of these genes to both the overall and the genetic risk remains small (27). Hence, there are likely many more genes to be identified, any one of which could identify a key pathway involved in disease.

The article in this issue of Diabetes by Gaulton et al. (28) presents a variation on the candidate gene approach in the whole-genome era. Using the framework of FUSION (Finland-U.S. Investigation of Type 2 Diabetes Genetics) that was applied to the genome-wide association scans, the investigators have characterized a battery of 222 candidate genes for association with type 2 diabetes risk. Unrelated individuals were selected from the FUSION families, and these 1,161 case subjects with type 2 diabetes and 1,174 normoglycemic control subjects were assayed for 3,531 SNPs in the candidate genes. The candidate genes were selected using a number of strategies, including bioinformatics and text/vocabulary processing; given the distribution of these genes across the human genome, the approach can be considered a candidate-wide association scan (CWAS). Using additional HapMap data, the investigators were able to increase the number of SNPs used in the analyses by imputing genotypes of ∼7,500 additional SNPs within or near the candidate genes, thereby capturing nearly all of the variants in the candidates. Using this CWAS approach, the FUSION team replicated associations of numerous genes with type 2 diabetes risk and identified two additional genes (RAPGEF1 and TP53) worthy of further study. The authors suggest that RAPGEF1 represents a strong candidate because of its role in insulin signaling. In addition, the RAPGEF1 pathway may be involved in regulation of proglucagon gene expression in intestinal endocrine L-cells (29), providing another mechanism for its effect on risk of type 2 diabetes. TP53, although primarily used as a marker in breast cancer prognosis (30), is proposed here to be an indicator of apoptosis in the insulin-producing β-cells of the pancreas.
This study demonstrates that not only are there likely additional genes to be discovered that affect an individual's risk of diabetes, but there are multiple approaches beyond genome-wide association scans alone that can be used for gene discovery.

There are both strengths and weaknesses associated with the contribution by Gaulton et al. (28). The strengths include the two-stage CWAS approach, which uses novel text mining to identify candidate genes and pathways that could be associated with risk of type 2 diabetes. In addition, the sample of ∼1,000 case subjects and ∼1,000 control subjects, while not particularly large in the context of genome-wide coverage, has extensive genomic coverage of the candidates in a homogeneous population. Interestingly, this strength can also be viewed as a weakness. The sample contains only Finns, so there is little knowledge of whether these same genes/pathways would be observed in other ethnic groups. In addition, the large battery of candidate genes and SNPs raises concern over multiple testing of associations and over the power of the study. This potential limitation can be addressed, in part, both by conducting additional studies in different ethnic groups and by replication in other populations of the same ethnicity. There is also an issue of confounding by BMI: it remains uncertain whether the genes identified are related to diabetes risk or to obesity risk, and whether any of the significant associations with diabetes would be lost after adjustment for BMI in the analytical models. While candidate gene studies and coding region analyses have the advantage of being “hypothesis-driven,” they are also “hypothesis-limited”; not all novel pathways and molecular mechanisms can be identified and interrogated. Finally, the question remains from ages past: can negative results from the CWAS be ignored?
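
To give a rough sense of the multiple-testing burden noted above, the following Python sketch computes a Bonferroni threshold for the SNP counts reported in the study. The family-wise error rate of 0.05 and the treatment of every SNP as an independent test are assumptions made here for illustration only; SNPs in linkage disequilibrium are correlated, so the Bonferroni threshold is conservative, and this is not presented as the correction used by Gaulton et al.

```python
# Illustrative only: Bonferroni thresholds for the SNP counts quoted in the text.
# ASSUMPTIONS (not from the article): a family-wise error rate of 0.05, and
# every SNP treated as an independent test.

GENOTYPED_SNPS = 3531      # SNPs assayed in the 222 candidate genes
IMPUTED_SNPS = 7500        # approximate number of additional imputed SNPs
FAMILY_WISE_ALPHA = 0.05   # assumed, not stated in the article


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test threshold that keeps the family-wise error rate at or below alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


genotyped_only = bonferroni_threshold(FAMILY_WISE_ALPHA, GENOTYPED_SNPS)
with_imputed = bonferroni_threshold(FAMILY_WISE_ALPHA, GENOTYPED_SNPS + IMPUTED_SNPS)

print(f"Genotyped SNPs only: P < {genotyped_only:.2e}")
print(f"Genotyped + imputed: P < {with_imputed:.2e}")
```

Under these assumptions, adding the imputed SNPs roughly triples the number of tests and tightens the per-SNP threshold accordingly, which is one way to see why replication in independent samples matters.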

Within the context of studies with modest population sizes, the thresholds used in the literature to declare significance vary. In this study, the authors have used a separate threshold for “biologically relevant candidate genes” (P < 0.10) rather than other standard statistical corrections. Although this may protect against false-negative results, it may also carry more SNPs within candidate genes forward for follow-up. Other aspects of the study that could lead to false-negative results include low overall power for detection of associations, poor coverage of the candidate genes (SNP selection), gene-gene and gene-environment interaction (or correlation), and phenotype definition.
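
The trade-off between significance threshold and power can be sketched numerically. The Python example below approximates the power of a two-sided allelic test (a two-proportion z-test on allele counts) for the case and control numbers reported in the study. The control allele frequency (0.30), the odds ratio (1.2), the independence of alleles within individuals, and the two significance levels are all assumptions chosen for illustration; none is taken from the article.

```python
# Illustrative only: approximate power of a two-sided allelic association test.
# ASSUMPTIONS (not from the article): control allele frequency 0.30, allelic
# odds ratio 1.2, alleles treated as independent draws (Hardy-Weinberg
# equilibrium), and an unpooled normal approximation to the test statistic.
from math import sqrt
from statistics import NormalDist

N_CASES = 1161      # case subjects reported in the study
N_CONTROLS = 1174   # control subjects reported in the study


def case_allele_freq(control_freq: float, odds_ratio: float) -> float:
    """Allele frequency in cases implied by the control frequency and odds ratio."""
    odds = odds_ratio * control_freq / (1.0 - control_freq)
    return odds / (1.0 + odds)


def allelic_power(control_freq: float, odds_ratio: float, alpha: float,
                  n_cases: int = N_CASES, n_controls: int = N_CONTROLS) -> float:
    """Approximate power of a two-sided two-proportion z-test on allele counts."""
    p0 = control_freq
    p1 = case_allele_freq(p0, odds_ratio)
    n1, n0 = 2 * n_cases, 2 * n_controls   # two alleles per individual
    se = sqrt(p1 * (1.0 - p1) / n1 + p0 * (1.0 - p0) / n0)
    z_effect = abs(p1 - p0) / se
    norm = NormalDist()
    z_crit = norm.inv_cdf(1.0 - alpha / 2.0)
    return norm.cdf(z_effect - z_crit) + norm.cdf(-z_effect - z_crit)


power_nominal = allelic_power(0.30, 1.2, alpha=0.05)
power_strict = allelic_power(0.30, 1.2, alpha=0.05 / 3531)

print(f"Power at nominal alpha = 0.05:      {power_nominal:.2f}")
print(f"Power at Bonferroni alpha (3,531):  {power_strict:.2f}")
```

Under these assumed values, a variant of modest effect that is reasonably well powered at a nominal threshold becomes badly underpowered after strict correction, which illustrates why a more liberal threshold reduces false negatives at the cost of more SNPs to follow up.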

The genetic risk factors for both type 1 and type 2 diabetes are being identified, and the etiologic pathways are being dissected. Currently, at least 10 genes appear to influence risk of type 1 diabetes (31) and 18 appear to influence risk of type 2 diabetes (32), yet much needs to be done to further understand how, in a pathophysiological sense, variation in each of the genes modifies risk, how we can predict who is at risk, and how we can intervene to reduce the risk to an individual. Three specific areas of future research are easily identified. First, and as noted in the current article, the pathway from gene to clinical outcome (diabetes) goes through protein products that are “intermediate,” that is, quantitative phenotypes that are closer to the functional defect. Examination of these diabetes-related quantitative traits may provide important insight into the transition in disease risk from normal glucose tolerance to type 2 diabetes. Second, resequencing of coding regions can offer an efficient way to focus the search for causal variants. It is estimated that coding regions make up ∼1% of the genome sequence, yet they likely contain a much larger fraction of all causal variants. Studies that are limited to coding regions, however, will not identify regulatory variants that influence disease. Third, evolutionary conservation across relevant species can provide a means to identify functional sites in the human genome. Evolutionary analysis of sequence data suggests that altering sequence in regulatory regions may be as deleterious as altering sequence in coding regions. These future areas of research will need to be integrated with the ongoing epidemiologic studies that identify the important modifiable risk factors that interact with the host genotype.
With these current and future approaches to understanding the human genome, the potential to modify the current, almost inexorable, natural history from genetic risk, through quantitative trait (intermediate phenotype) abnormalities, to clinical diabetes and its complications is becoming more realistic. Understanding the pathophysiological mechanisms that these genes identify should lead to significant advances in the next decade. Thus, further identification of genes for diabetes and its complications will provide a better understanding of etiopathogenesis and a clearer delineation of targets for intervention.

Readers may use this article as long as the work is properly cited, the use is educational and not for profit, and the work is not altered. See http://creativecommons.org/licenses/by-nc-nd/3.0/ for details.

See accompanying original article, p. 3136.

1. NIDDK (National Institute of Diabetes and Digestive and Kidney Diseases): National Diabetes Statistics Fact Sheet: General Information and National Estimates on Diabetes in the United States, 2005. Bethesda, MD, U.S. Department of Health and Human Services, National Institutes of Health, 2005
2. Pincus G, White P: On the inheritance of diabetes mellitus. II. Further analysis of family histories. Am J Med Sci 188:159–169, 1934
3. Kobberling J, Tillil H: Empirical risk figures for first degree relatives of non-insulin-dependent diabetics. In The Genetics of Diabetes Mellitus. Kobberling J, Tattersall R, Eds. Academic Press, New York, 1982, p. 201–209
4. Kahn CR, Vicent D, Doria A: Genetics of non-insulin dependent (type II) diabetes mellitus. Annu Rev Med 47:509–531, 1996
5. Rich SS: Mapping genes in diabetes: a genetic epidemiologic perspective. Diabetes 39:1315–1319, 1990
6. Raffel LJ, Goodarzi MO, Rotter JI: Diabetes mellitus. In Principles and Practice of Medical Genetics. Rimoin DL, Connor JM, Pyeritz RE, Korf B, Eds. London, Churchill Livingstone, 2007, p. 1980–2022
7. Fajans SS, Bell GI, Polonsky KS: Molecular mechanisms and clinical pathophysiology of maturity-onset diabetes of the young. N Engl J Med 345:971–980, 2001
8. Owen KR, McCarthy MI: Genetics of type 2 diabetes. Curr Opin Genet Dev 17:239–244, 2007
9. Erlich H, Valdes AM, Noble J, Carlson JA, Varney M, Concannon P, Mychaleckyj JC, Todd JA, Bonella P, Fear AL, Lavant E, Louey A, Moonsamy P, Type 1 Diabetes Genetics Consortium: HLA DR-DQ haplotypes and genotypes and type 1 diabetes risk: analysis of the type 1 diabetes genetics consortium families. Diabetes 57:1084–1092, 2008
10. Lindgren CM, McCarthy MI: Mechanisms of disease: genetic insights into the etiology of type 2 diabetes and obesity. Nat Clin Prac Endocrin Metab 4:156–163, 2008
11. Rich SS: Genetics of diabetes and its complications: frontiers in nephrology. J Am Soc Nephrol 17:353–360, 2006
12. Sale MM, Rich SS: Genetic contributions to type 2 diabetes: recent insights. Expert Rev Mol Diag 7:207–217, 2007
13. International HapMap Consortium: A haplotype map of the human genome. Nature 437:1299–1320, 2005
14. International HapMap Consortium: A second generation human haplotype map of over 3.1 million SNPs. Nature 449:851–861, 2007
15. Sladek R, Rocheleau G, Rung J, Dina C, Shen L, Serre D, Boutin P, Vincent D, Belisle A, Hadjadj S, Balkau B, Heude B, Charpentier G, Hudson TJ, Montpetit A, Pshezhetsky AV, Prentki M, Posner BI, Balding DJ, Meyre D, Polychronakos C, Froguel P: A genome-wide association study identifies novel risk loci for type 2 diabetes. Nature 445:881–885, 2007
16. Steinthorsdottir V, Thorleifsson G, Reynisdottir I, Benediktsson R, Jonsdottir T, Walters GB, Styrkarsdottir U, Gretarsdottir S, Emilsson V, Ghosh S, Baker A, Snorradottir S, Bjarnason H, Ng MCY, Hansen T, Bagger Y, Wilensky RL, Reilly MP, Adeyemo A, Chen Y, Zhou J, Gudnason V, Chen G, Huang H, Lashley K, Doumatey A, So W-Y, Ma RCY, Andersen G, Borch-Johnsen K, Jorgensen T, van Vliet-Ostaptchouk JV, Hofker MH, Wijmenga C, Christiansen C, Rader DJ, Rotimi C, Gurney M, Chan JCN, Pedersen O, Sigurdsson G, Gulcher JR, Thorsteinsdottir U, Kong A, Stefansson K: A variant in CDKAL1 influences insulin response and risk of type 2 diabetes. Nat Genet 39:770–775, 2007
17. Wellcome Trust Case Control Consortium: Genomewide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature 447:661–678, 2007
18. Zeggini E, Weedon MN, Lindgren CM, Frayling TM, Elliott KS, Lango H, Timpson NJ, Perry JRB, Rayner NW, Freathy RM, Barrett JC, Shields B, Morris P, Ellard S, Groves CJ, Harries LW, Marchini JL, Owen KR, Knight B, Cardon LR, Walker M, Hitman GA, Morris AD, Doney ASF, The Wellcome Trust Case Control Consortium (WTCCC), McCarthy MI, Hattersley AT: Replication of genome-wide association signals in UK samples reveals risk loci for type 2 diabetes. Science 316:1336–1341, 2007
19. Diabetes Genetics Initiative of Broad Institute of Harvard and MIT, Lund University, and Novartis Institutes for BioMedical Research: Genome-wide association analysis identifies loci for type 2 diabetes and triglyceride levels. Science 316:1331–1336, 2007
20. Scott LJ, Mohlke KL, Bonnycastle LL, Willer CJ, Li Y, Duren WL, Erdos MR, Stringham HM, Chines PS, Jackson AU, Prokunina-Olsson L, Ding C-J, Swift AJ, Narisu N, Hu T, Pruim R, Xiao R, Li X-Y, Conneely KN, Riebow NL, Sprau AG, Tong M, White PP, Hetrick KN, Barnhart MW, Bark CW, Goldstein JL, Watkins L, Xiang F, Saramies J, Buchanan TA, Watanabe RM, Valle TT, Kinnunen L, Abecasis GR, Pugh EW, Doheny KF, Bergman RN, Tuomilehto J, Collins FS, Boehnke M: A genome-wide association study of type 2 diabetes in Finns detects multiple susceptibility variants. Science 316:1341–1345, 2007
21. Salonen JT, Uimari P, Juha-Matti A, Pirskanen M, Kaikkonen J, Todorova B, Hypponen J, Korhonen V-P, Asikainen J, Devine C, Tuomainen T-P, Luedemann J, Nauck M, Kerner W, Stephens RH, New JP, Ollier WE, Gibson JM, Payton A, Horan MA, Pendleton N, Mahoney W, Meyre D, Delplanque J, Froguel P, Luzzatto O, Yakir B, Darvasi A: Type 2 diabetes whole-genome association study in four populations: The DiaGen Consortium. Am J Hum Genet 81:338–345, 2007
22. Hayes MG, Pluzhnikov A, Miyake K, Sun Y, Ng MCY, Roe CA, Below JE, Nicolae RI, Konkashbaev A, Bell GI, Cox NJ, Hanis CL: Identification of type 2 diabetes genes in Mexican Americans through genome-wide association studies. Diabetes 56:3033–3044, 2007
23. Hanson RL, Bogardus C, Duggan D, Kobes S, Knowlton M, Infante AM, Marovich L, Benitez D, Baier LJ, Knowler WC: A search for variants associated with young-onset type 2 diabetes in American Indians in a 100K genotyping array. Diabetes 56:3045–3052, 2007
24. Rampersaud E, Damcott CM, Fu M, Shen H, McArdle P, Shi X, Shelton J, Yin J, Chang Y-PC, Ott SH, Zhang L, Zhao Y, Mitchell BD, O'Connell J, Shuldiner AR: Identification of novel candidate genes for type 2 diabetes from a genome-wide association scan in Old Order Amish: evidence for replication from diabetes-related quantitative traits and from independent populations. Diabetes 56:3053–3062, 2007
25. Florez JC, Manning AK, Dupuis J, McAteer J, Irenze K, Gianniny L, Mirel DL, Fox CS, Cupples LA, Meigs JB: A 100K genome-wide association scan for diabetes and related traits in the Framingham Heart Study: replication and integration with other genome-wide datasets. Diabetes 56:3063–3074, 2007
26. Taylor KD, Norris JM, Rotter JI: Genome-wide association: which do you want first: the good news, the bad news, or the good news? Diabetes 56:2844–2848, 2007
27. McCarthy MI, Zeggini E: Genome-wide association scans for type 2 diabetes: new insights into biology and therapy. Trends Pharmacol Sci 28:598–601, 2007
28. Gaulton KJ, Willer CJ, Li Y, Scott LJ, Conneely KN, Jackson AU, Duren WL, Chines PS, Narisu N, Bonnycastle LL, Luo J, Tong M, Sprau AG, Pugh EW, Doheny KF, Valle TT, Abecasis GR, Tuomilehto J, Bergman RN, Collins FS, Boehnke M, Mohlke KL: Comprehensive association study of type 2 diabetes and related quantitative traits with 222 candidate genes. Diabetes 57:3136–3144, 2008
29. Lotfi S, Li Z, Sun J, Zuo Y, Lam PP, Kang Y, Rahimi M, Islam D, Wang P, Gaisano HY, Jin T: Role of the exchange protein directly activated by cyclic adenosine 5′-monophosphate (Epac) pathway in regulating proglucagon gene expression in intestinal endocrine L cells. Endocrinology 147:3727–3736, 2006
30. Takahashi S, Moriya T, Ishida T, Shibata H, Sasano H, Ohuchi N, Ishioka C: Prediction of breast cancer prognosis by gene expression profile of TP53 status. Cancer Sci 99:324–332, 2008
31. Todd JA, Walker NM, Cooper JD, Smyth JD, Downes K, Plagnol V, Bailey R, Nejentsev S, Field SF, Payne F, Lowe CE, Szeszko JS, Hafler JP, Zeitels L, Yang JH, Vella A, Nutland S, Stevens HE, Schullenburg H, Coleman G, Maisuria M, Meadows W, Smink LJ, Healy B, Burren OS, Lam AA, Ovington NR, Allen J, Adlem E, Leung HT, Wallace C, Howson JMM, Guja C, Ionescu-Tirgoviste C, Genetics of Type 1 Diabetes in Finland, Simmonds MJ, Heward JM, Gough SCL, The Wellcome Trust Case Control Consortium, Dunger DB, Wicker LS, Clayton DG: Robust associations of four new chromosome regions from genome-wide analyses of type 1 diabetes. Nat Genet 39:857–864, 2007
32. Zeggini E, Scott LJ, Saxena R, Voight BF, for the Diabetes Genetics Replication and Meta-Analysis (DIAGRAM) Consortium: Meta-analysis of genome-wide association data and large-scale replication identifies additional susceptibility loci for type 2 diabetes. Nat Genet 40:638–645, 2008