A total of 896 individuals of Ashkenazi Jewish descent were ascertained in Israel from 267 multiplex families, including 472 sib-pairs affected with type 2 diabetes. A genome-wide scan with average marker spacing of 9.5 cM revealed five regions on four chromosomes (4q, 8q, 14q, 20p, and 20q) that exhibited nominal evidence for linkage (P < 0.05). The highest observed nonparametric linkage Z score was 2.41 (equivalent to a logarithm of odds score of 1.26) at marker D4S1501. On chromosome 20, a maximal signal with a Z score of 2.05 was observed on 20q near marker D20S195, and a second on 20p near marker D20S103 (Z 1.80). A single marker on chromosome 8 (D8S593) and two adjacent markers on chromosome 14 (D14S749 and D14S605) also attained nominal evidence of linkage. To explore the hypothesis that the signals on chromosomes 4 and 20 are differentially attributable to variation in BMI or age of onset, an ordered subset analysis was conducted. This analysis revealed that only when the families were ranked by BMI (in increasing order) did a subset attain nominal significance, and only for chromosome 4. The findings reported here lend credence to the hypothesis, now supported by four studies of Caucasian populations and most recently by a combined analysis of 1,852 pedigrees, that a type 2 diabetes susceptibility locus resides on chromosome 20q. This population, because of its unique genetic attributes, may facilitate identification of this and other genes contributing to type 2 diabetes.
A number of studies have implicated a genetic basis for type 2 diabetes (1). The discovery of monogenic forms of the disease underscored the phenotypic and genotypic heterogeneity, although monogenic forms account for only a few percent of the disease (1). Defining the genetic basis of the far more common polygenic form of the disease presents more difficulties (2,3). Nevertheless, some interesting results have recently emerged. A genome scan of Hispanic-American families (330 affected sib-pairs [ASPs]) found linkage to chromosome 2q37 (logarithm of odds [LOD] 4.15) (4), and the causative gene has been recently reported (5). A number of other genome scans in various racial groups have identified other putative susceptibility loci (6–8). The largest genome-wide scan for type 2 diabetes loci reported to date studied 477 Finnish families (716 ASPs) and found evidence for linkage to chromosome 20q12–13.1 (LOD 2.06 at D20S107) (9). Interestingly, similar results have been reported by at least three other groups (10–12).
In this study, we have attempted to improve our chances of identifying disease-linked loci by selecting a study population that has undergone genetic isolation. The Ashkenazi Jewish population is relatively young and homogeneous, having undergone several constrictions and expansions since its origin ca. 70 AD. Although the population history is largely conjecture, estimates of its age have been made from studies of DNA polymorphisms. Risch et al. (13) analyzed six microsatellite markers for linkage disequilibrium with idiopathic torsion dystonia, a rare disorder with elevated frequency in Ashkenazi Jews. It was estimated that the disease originated in the Eastern European Pale region from a single founder ∼350 years ago. Based on these data, it was concluded that present-day Ashkenazi descended from a founder population, perhaps as few as 10,000 individuals, who existed about 1500 AD. That number expanded to 5 million by 1900, a 500-fold increase in 400 years. Today there are ∼10 million Ashkenazi Jews, living predominantly in the U.S., Israel, and Russia. The conclusion that modern Ashkenazi represent a distinct, closely related genetic isolate is supported by recent studies using polymorphic Y chromosome markers and mitochondrial DNA polymorphisms (14,15). Studies of monogenic diseases have reached similar conclusions (16–19). Because the Ashkenazi Jews originated from the Mediterranean basin, as did the larger American and European Caucasian populations, it is likely that genetic factors identified in the Ashkenazi will also be important in other Caucasian populations.
In this study, we assumed that the type 2 diabetes phenotype is the result of a limited number of genes (20), each with moderate effect, and that the power to detect these genes is greater in an isolated population. Here we report the results of our initial whole-genome screen in the Ashkenazi population. Five regions were identified on four chromosomes (4q, 8q, 14q, 20p, and 20q) that gave nominally significant evidence for linkage (P < 0.05). These findings may provide the basis for future fine-mapping studies aimed at isolating specific type 2 diabetes–related mutations.
RESEARCH DESIGN AND METHODS
The study population was chosen in a manner designed to minimize genetic and clinical heterogeneity and thus maximize the chances of identifying disease-associated chromosomal regions. All patients were of Ashkenazi Jewish origin, defined as having all four grandparents born in Northern or Eastern Europe. Subjects with known or suspected Sephardic Jewish or non-Jewish ancestry were excluded. Type 2 diabetes was initially defined according to World Health Organization criteria (fasting glucose >140 mg/dl on two or more occasions, or random glucose >200 mg/dl on two or more occasions). Because the risk of type 2 diabetes markedly increases with age in the general population (21), the relative genetic risk is greater in families with young-onset disease. Therefore, at least one sib was required to have had initial diagnosis before the age of 60 years. The mean difference in age at diagnosis between affected sibs was 8.9 years. To avoid late-onset type 1 diabetes, patients who became insulin-dependent within 2 years of diagnosis were excluded. In this population, the incidence of type 1 diabetes is relatively low compared with that of the Finnish population (22); therefore, anti-GAD or anti–islet cell antibody titers were not routinely measured. To avoid bilineal inheritance, families in which both parents were known to have diabetes were excluded. For sib-pair analysis, at least two full sibs with type 2 diabetes were required. Whenever possible, parents, additional affected sibs, and unaffected sibs were recruited. To test unaffected status, all presumed unaffected sibs were asked to undergo a standard 75-g oral glucose tolerance test (OGTT); however, inclusion in the study was not dependent on their agreement to undergo this test. A total of 76 OGTTs were performed on the unaffected sibs. If the OGTT was not possible, then a fasting glucose level was obtained.
A total of 896 individuals from 267 multiplex families were genotyped. Table 1 displays the distribution of family sizes cross-classified by the number of available parental DNA samples. Because type 2 diabetes has a late age of onset, many multiplex sibships did not have living parents. Accordingly, an effort was made to obtain and genotype unaffected sibs. A total of 241 unaffected sibs from 122 different multiplex sibships were sampled.
The study was approved by the institutional review boards of Washington University, St. Louis, and Hadassah University Hospital, Jerusalem. Whenever possible, all assessments were conducted in the morning after an overnight fast. A total of 50 ml of blood was drawn. Aliquots of serum and plasma were taken for serum lipid, HbA1c, glucose, and insulin determinations. Additional aliquots of serum were frozen at –80°C for future use. Genomic DNA was extracted from whole blood using the Puregene Genomic DNA extraction kit (Gentra Systems, Minneapolis, MN). Aliquots of concentrated DNA were also frozen at –80°C for future analysis. In addition, anthropometric measurements were obtained at the time of initial assessment, including height, weight, hip and waist circumference, blood pressure, and pulse rates. Clinical histories were obtained from the patients and, when necessary, confirmed by review of the patients’ medical records or discussion with the treating physicians.
Genotyping.
Linkage analysis.
Before undertaking the linkage analysis, the alleged genetic relationship of the various relative configurations was verified using the computer program Relative (24). A number of monozygotic twins were identified, and one member of each pair, chosen at random, was deleted from the data file. Additionally, a few half-sibs were identified; in the relationships we were able to check, all proved to be maternal half-sibs. All half-sibs were retained for analyses, and the master data file was edited to reflect their correct genetic relationship. Because the mode of transmission of type 2 diabetes is unknown, we elected to perform linkage analysis with a robust model-free allele-sharing method. Genehunter-Plus (25) was used to compute multipoint nonparametric linkage (NPL) Z scores with the emendation by Kong and Cox (25) under the exponential model option and the “ALL” scoring function. One family was too large to be analyzed as a single pedigree, so it was broken into two subpedigrees, a strategy that increases the number of families from 267 to 268. To guard against increased type I errors that can result from misspecified marker allele frequencies, these were estimated from the data. Using MultiMap, marker maps were determined by analysis of pedigree data from the Centre d’Etude du Polymorphisme Humain (CEPH) (26); markers lacking CEPH data were placed based on the integrated marker map maintained by the Genome Database (http://www.gdb.org). Sex-averaged recombination values were used in all analyses.
Ordered subset analysis.
To evaluate the possibility that a measured covariate is responsible for some of the heterogeneity suspected for type 2 diabetes, we undertook an ordered subset analysis (27) for all chromosomes that attained an NPL Z score >2. This technique is especially useful for evaluating the effects of covariates that have a continuous distribution. Moreover, this approach removes some of the arbitrariness that accompanies the subdivision of a sample by quartiles or deciles. Two covariates are of particular interest in this genome scan: BMI and age of onset. Briefly, families are ranked by the mean value of the covariate in affected sibs, and the cumulative sum of the NPL Z score grid is evaluated as each family, in rank order, is consecutively added to the analysis. For each covariate, the linkage analysis is performed twice, first by ranking the families from highest to lowest, and then ranking them from lowest to highest, and the maximum cumulative NPL Z score (and the rank of the family [Ri] at which the maximum occurs) is noted. The technique is guaranteed to yield an NPL Z score that is at least as large as that estimated for the data set as a whole. To determine statistical significance, an empirical P value is obtained by a randomization test (i.e., the entire sample of families is randomly ordered, the cumulative NPL Z scores are calculated, and the maximum is recorded). The process is repeated a large number of times (we used 10,000), and the percentage of randomizations that yield an NPL Z score greater than that obtained from the ranked sample determines the empirical P value.
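The procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the study: it assumes each family contributes a per-position NPL score, and that the statistic for a subset of n families is the sum of those scores divided by the square root of n (the classic NPL combination, whereas the study used the Kong and Cox likelihood-based version). The function names and the toy data are hypothetical.

```python
import math
import random


def max_cumulative_z(family_scores, order):
    """Add families in the given order and track the largest subset statistic.

    family_scores[i][j] is family i's NPL score at grid position j.
    Assumption: the statistic for the first n families is sum / sqrt(n).
    Returns (Zmax, number of families at the maximum, grid position index).
    """
    n_pos = len(family_scores[0])
    running = [0.0] * n_pos
    best = (-math.inf, 0, 0)
    for n, fam in enumerate(order, start=1):
        for j in range(n_pos):
            running[j] += family_scores[fam][j]
            z = running[j] / math.sqrt(n)
            if z > best[0]:
                best = (z, n, j)
    return best


def ordered_subset_analysis(family_scores, covariate, ascending=True,
                            n_perm=10_000, seed=0):
    """Rank families by a covariate and test the maximum by randomization."""
    rng = random.Random(seed)
    families = list(range(len(family_scores)))
    order = sorted(families, key=lambda i: covariate[i], reverse=not ascending)
    z_obs, n_fam, pos = max_cumulative_z(family_scores, order)

    # Empirical P value: share of random orderings whose maximum exceeds
    # the maximum obtained from the covariate-ranked ordering.
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(families)
        if max_cumulative_z(family_scores, families)[0] > z_obs:
            exceed += 1
    return z_obs, n_fam, pos, exceed / n_perm


# Toy data (made up): four families, two grid positions, ranked by mean BMI.
scores = [[2.0, 0.0], [1.5, 0.0], [-1.0, 0.0], [-1.0, 0.0]]
mean_bmi = [20.0, 21.0, 30.0, 31.0]
print(ordered_subset_analysis(scores, mean_bmi, ascending=True, n_perm=200))
```

Because the full sample is always the last subset evaluated, the maximum returned can never be smaller than the whole-sample score, which is why significance has to be judged against random orderings rather than against the usual null distribution.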
RESULTS
The genome scan identified five regions on four chromosomes that gave nominally significant evidence for linkage (P < 0.05), though none of them meets genome-wide criteria for “significant” or “suggestive” linkage (2). The highest NPL Z scores were obtained over a sharply defined region on chromosome 4 (between D4S1539 and D4S2967). Eight markers were genotyped in this 5.74 cM interval. The peak NPL Z score of 2.41 (equivalent to a LOD score of 1.26) was obtained at D4S1501 (see online appendix at www.diabetes.org/diabetes/appendix.asp). The second highest cluster of nominally significant NPL Z scores occurred on the long arm of chromosome 20, extending over a broad 12.6 cM region (from D20S106 to D20S481). The maximum signal, with a Z score of 2.05, occurred at D20S195. A second, weaker signal was found on the short arm of chromosome 20 in the vicinity of D20S103 (Z 1.80) (see online appendix at www.diabetes.org/diabetes/appendix.asp).
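The LOD equivalent quoted for the peak Z score is consistent with the common approximation LOD ≈ Z²/(2 ln 10), which treats Z² as a chi-square statistic with one degree of freedom. The short sketch below assumes that conversion; the text does not state which formula was used, and the function name is illustrative.

```python
import math


def npl_z_to_lod(z: float) -> float:
    """Approximate LOD equivalent of an NPL Z score.

    Assumption: LOD = Z^2 / (2 ln 10); non-positive Z scores carry no
    evidence for linkage and are mapped to 0.
    """
    if z <= 0:
        return 0.0
    return z * z / (2.0 * math.log(10.0))


# Peak Z scores reported in the text for three of the signal regions.
for marker, z in [("D4S1501", 2.41), ("D20S195", 2.05), ("D20S103", 1.80)]:
    print(f"{marker}: Z = {z:.2f}, approximate LOD = {npl_z_to_lod(z):.2f}")
```

Under this assumption the chromosome 4 peak (Z = 2.41) rounds to the reported LOD of 1.26.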
A nominally significant NPL Z score was seen on two other chromosomes. A single marker on chromosome 8 (D8S593) attained a Z score of 1.66, and two adjacent markers on chromosome 14 (D14S749 and D14S605), defining an interval of 6.4 cM, attained Z scores of 1.69 and 1.90, respectively. An ordered subset analysis was carried out for chromosomes 4 and 20, the only chromosomes for which the NPL Z score curve exceeded 2.0 in the entire sample. Before ranking the families, we tested for possible sex differences for BMI and age of onset. The mean age of onset for men (47.6 years) did not differ from the mean for women (49.0 years) in this sample (P = 0.08). A significant difference, however, was observed between affected men and women with regard to their BMI scores (27.6 vs. 29.3 kg/m2, P < 0.0001). Accordingly, before calculating the mean BMI score for each multiplex sibship, the scores were standardized so that each sex had mean zero and unit variance. These standardized scores were then used to obtain the sibship averages, which were ranked for the ordered subset analysis. Table 2 reports the results of the ordered subset analysis.
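The sex-specific standardization and sibship averaging described above can be illustrated as follows. This is a sketch only: the record layout, the function name, and the use of the sample standard deviation are assumptions, and the BMI values are made up.

```python
from collections import defaultdict
from statistics import mean, stdev


def sibship_mean_standardized_bmi(records):
    """Standardize BMI within each sex, then average within each sibship.

    records: list of (family_id, sex, bmi) tuples for affected sibs.
    Assumption: the sample standard deviation is used for scaling, so each
    sex needs at least two observations.
    """
    by_sex = defaultdict(list)
    for _, sex, bmi in records:
        by_sex[sex].append(bmi)
    params = {sex: (mean(vals), stdev(vals)) for sex, vals in by_sex.items()}

    by_family = defaultdict(list)
    for family_id, sex, bmi in records:
        mu, sd = params[sex]
        by_family[family_id].append((bmi - mu) / sd)
    return {family_id: mean(z) for family_id, z in by_family.items()}


# Made-up example: two sibships, each with one affected man and one woman.
records = [
    ("F1", "M", 26.0), ("F1", "F", 28.0),
    ("F2", "M", 30.0), ("F2", "F", 32.0),
]
sibship_means = sibship_mean_standardized_bmi(records)
ranked_low_to_high = sorted(sibship_means, key=sibship_means.get)
print(sibship_means, ranked_low_to_high)
```

Standardizing within sex before averaging keeps a sibship from ranking as lean or heavy merely because of its sex composition.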
DISCUSSION
Despite the moderately large sample size of type 2 diabetes–affected sib-pairs reported here, the genome scan revealed only modest linkage signals. This underscores the almost certain genetic heterogeneity of this complex disease, even in a population that is comparatively homogeneous.
The strongest linkage signals were observed in a relatively narrow interval on chromosome 4q and in a broad interval on 20q. To explore the hypothesis that either of these signal regions is differentially attributable to variation in BMI or age of onset, we carried out an ordered subset analysis, with mixed results. With respect to the signal region on chromosome 4, the ordered subset analysis yielded nominally significant evidence for the existence of a small distinctive subset (P = 0.047), but only when the families were ranked by increasing mean BMI. This subset is composed of the leanest families, whose mean BMI (21.9 kg/m2) is substantially lower than that of the remaining families (mean 28.6 kg/m2, P < 0.0001). The ordered subset analysis for chromosome 20 did not suggest the existence of a more homogeneous subgroup.
As a complement to the ordered subset analysis, we also carried out a linkage analysis for both BMI and age of onset on chromosomes 4 and 20 using the nonparametric quantitative trait linkage procedure and the traditional Haseman-Elston procedure implemented in Genehunter. No evidence for linkage was found with either test. Finally, as a test of the hypothesis that a susceptibility locus in the chromosome 4 interval and a susceptibility locus in the chromosome 20 interval epistatically interact to increase risk of type 2 diabetes, we calculated the correlation in NPL Z scores among the eight markers on chromosome 4 and the five markers on chromosome 20. No evidence for epistasis was found.
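One way to read the correlation test described above is as a Pearson correlation, across families, between family-specific NPL scores at each chromosome 4 marker and each chromosome 20 marker. The sketch below follows that reading, which is an assumption; the function names are hypothetical and the scores are made up.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def cross_region_correlations(region_a, region_b):
    """Correlate per-family scores for every marker pair across two regions.

    Each argument maps a marker name to a list of per-family NPL scores,
    with families in the same order in every list.
    """
    return {
        (marker_a, marker_b): pearson(scores_a, scores_b)
        for marker_a, scores_a in region_a.items()
        for marker_b, scores_b in region_b.items()
    }


# Made-up per-family scores for three families.
chr4 = {"D4S1501": [1.0, 2.0, 3.0]}
chr20 = {"D20S195": [2.0, 4.0, 6.0], "D20S107": [3.0, 2.0, 1.0]}
for pair, r in cross_region_correlations(chr4, chr20).items():
    print(pair, round(r, 3))
```

With eight markers in one region and five in the other, this yields 40 pairwise correlations; consistently positive values would be the pattern expected if the two loci acted jointly.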
The participants were not routinely screened for anti-GAD antibodies to rule out the occurrence of type 1 diabetes. This could complicate the analysis, particularly because one of the implicated regions (chromosome 14) contains a putative type 1 diabetes locus. In the absence of screening, we do not know the number of type 2 diabetes sib-pairs incorrectly diagnosed, though we do not consider this to have had a major effect on the overall results. We reasoned that the prevalence of type 2 diabetes is 10–50 times greater than that of type 1 diabetes (28). Furthermore, we excluded those who became insulin-dependent within 2 years of diagnosis. Finally, the NPL Z score for chromosome 6p, the major locus for type 1 diabetes, was <1.5 in the Ashkenazi families.
How do the results of the current study compare with those in previous studies? On chromosome 4q, a study of quantitative traits in prediabetic Pima Indians (29) revealed modest evidence for linkage to acute insulin response to a glucose challenge (LOD 1.2). We are unaware of other reports linking diabetes or related quantitative traits to chromosome 8q. The linkage to chromosome 14 near marker D14S605 overlaps the type 1 diabetes locus IDDM11. Interestingly, IDDM11 was reported to contribute most to diabetes susceptibility in those sib-pairs known to lack high-risk HLA haplotypes (30). As seen in the appendix (available online at www.diabetes.org/diabetes/appendix.asp), we observed evidence for linkage to a broad region of chromosome 20q11–13, peaking at D20S195. Ghosh et al. (9) reported a genome scan of Finnish ASPs (n = 716), in which the highest LOD (2.06) occurred in the same region of 20q. HNF4α maps to this region, yet no mutations were found when the gene was sequenced. In a scan of chromosome 20 using 301 French Caucasian type 2 diabetes ASPs, Zouali et al. (10) found the greatest evidence for linkage to RPN2 (LOD 2.27), a locus that maps to the same interval. Two other groups examining Caucasian populations also reported positive LOD scores on 20q (11,12). Thus, the results of our investigation represent the fifth report in a Caucasian population of linkage of type 2 diabetes to chromosome 20q11–12.1. The International Type 2 Diabetes Linkage Analysis Consortium (http://www.sfbr.org/external/diabetes/index.html) has confirmed and extended these results in a combined analysis of 7,124 individuals genotyped from 1,852 Northern European and American Caucasian families (M. Boehnke and N. Cox, personal communication).
Although our results overlap those of other genome studies of type 2 diabetes at chromosomes 4q, 14q, and 20q, we observed no significant linkage at regions identified in other groups, such as that at 1q in North American Caucasian families (8) and at 2q37 in Hispanic-American sib-pairs (4). There are many possible explanations for the failure of the current study to replicate the previous results, including differences in study design (e.g., extended pedigrees and parametric analysis) (8), sample size, or racial composition (4).
Our results suggest that the principal genetic influences on type 2 diabetes susceptibility in the Ashkenazi population may be encoded by loci on chromosomes 4q and 20q. Further refinement of the indicated intervals will require study of molecular markers sharing a common evolutionary history. This process of linkage disequilibrium mapping, which is challenging under the best of circumstances (31), may be facilitated by the elevated levels of intermarker disequilibrium that are characteristic of the Ashkenazi and other recent homogeneous human populations (13).
Table 1. Distribution of family sizes, cross-classified by the number of parents with DNA samples

Number of affected sibs | 0 parents with DNA samples | 1 parent with DNA sample | 2 parents with DNA samples
---|---|---|---
2 | 184 (75) | 14 (9) | 2 (1)
3 | 45 (24) | 8 (4) | 1 (0)
4 | 10 (5) | — | —
5 | 1 (1) | 1 (1) | —
6 | — | 2 (2) | —
The numbers in parentheses are the number of families that contained genotyping on at least one unaffected sib.
Table 2. Results of the ordered subset analysis

Chromosome | Covariate | Rank order | Zmax | Marker | Distance (in cM) to global maximum | Number of sibships | P
---|---|---|---|---|---|---|---
4 | BMI | Low-to-high | 3.78 | D4S1554 | 12.11 | 8 | 0.047
4 | BMI | High-to-low | 3.59 | D4S2417 | 1.92 | 11 | 0.072
4 | Age of onset | Low-to-high | 2.91 | D4S2967 | 1.92 | 41 | 0.336
4 | Age of onset | High-to-low | 2.76 | GATA67A08 | 1.35 | 35 | 0.440
20 | BMI | Low-to-high | 2.81 | D20S195 | 0 | 186 | 0.388
20 | BMI | High-to-low | 2.88 | D20S181* | 47.30 | 73 | 0.331
20 | Age of onset | Low-to-high | 2.35 | D20S195 | 0 | 197 | 0.791
20 | Age of onset | High-to-low | 2.76 | D20S891 | 16.60 | 62 | 0.422
*Attains identical maximum at adjacent marker D20S473.
Article Information
This work was supported in part by National Institutes of Health grants DK16746 and DK49583 (M.A.P.), by Grant #93/00191/2, by NIH grant 31302, and by awards from CaPCURE and the Urologic Research Foundation (B.K.S.).
We are grateful to the patients and their families and physicians for participating in this study. We acknowledge the dedicated work of the nursing staff that recruited and ascertained all of the subjects: Ruth Leber, Miraim Ohayon, Miriam Slazki, Yaffa Zisk, and Orly Mizrahi. We also thank Gary Skolnick for help in preparation of the manuscript.
Address correspondence and reprint requests to M. Alan Permutt, MD, Metabolism Division, Washington University School of Medicine, 660 S. Euclid, Box 8127, St. Louis, MO 63110.
Received for publication 29 October 2000 and accepted in revised form 22 November 2000.
Additional information can be found in an online appendix at www.diabetes.org/diabetes/appendix.asp.
J.T. holds stock in Parke-Davis Pharmaceutical Research.