African Americans are at increased risk of type 2 diabetes and many diabetes complications. We have carried out a genome-wide scan for African American type 2 diabetes using 638 affected sibling pairs (ASPs) from 247 families ascertained through impaired renal function to identify type 2 diabetes loci in this high-risk population. Of the 638 ASPs, 210 were concordant for diabetes with impaired renal function. A total of 390 markers, at an average spacing of 9 cM, were genotyped by the Center for Inherited Disease Research (CIDR) as part of the International Type 2 Diabetes Linkage Analysis Consortium. Nonparametric linkage (NPL) analyses conducted using the exponential model implemented in Genehunter Plus provided suggestive evidence for linkage at 6q24-q27 (163.5 cM, logarithm of odds [LOD] 2.26). Multilocus NPL regression analysis identified the 6q locus (D6S1035, LOD 2.67) and two additional regions: 7p (LOD 1.06) and 18q (LOD 0.87) as important in this model. NPL regression-based interaction analyses and ordered subset analyses (OSAs) supported the presence of a locus at chromosome 7p (29–34 cM) in the pedigrees with the earliest mean age of diagnosis of type 2 diabetes (P = 0.009 for interaction, ΔP = 0.0034 for OSA) and lower mean BMI (P = 0.009 for interaction, ΔP = 0.070 for OSA). These results provide evidence that genes predisposing African-American individuals to type 2 diabetes are located in the 6q and 7p regions of the genome.
An African-American individual is twice as likely to develop type 2 diabetes as a Caucasian American peer (1,2), and it is estimated that 2.8 million African Americans (13% of this population) have type 2 diabetes (3). Further, African Americans have increased rates of many diabetes complications (4–7). Alarmingly, the prevalence of type 2 diabetes among African Americans aged 40–74 years doubled from 8.9% in 1976–1980 to 18.2% in 1988–1994 (1). Reports (8,9) also indicate an increasing prevalence of type 2 diabetes among African-American children.
Observed increased risk in African Americans is likely to result from a combination of shared environmental and genetic factors. Although there are few published studies specifically investigating familial aggregation of type 2 diabetes in African-American families, Rotimi et al. (10) found that relatives of African-American probands with type 2 diabetes had a 2.95-fold (95% CI 1.55–5.62) higher prevalence of diabetes when compared with relatives of unaffected individuals. In the GENNID (Genetics of Noninsulin Dependent Diabetes Mellitus) African-American families, the majority of first-degree relatives of African-American individuals with type 2 diabetes had abnormal glucose tolerance (11), with 27% found to have undiagnosed diabetes and 31% impaired fasting glucose and/or impaired glucose tolerance.
Given this evidence, it is surprising that genetic studies of type 2 diabetes focused on African Americans have been few in number and limited in scope. We have carried out the first large-scale genome scan for type 2 diabetes in African Americans. The results of this genome scan analysis suggest several priority regions for mapping genes that predispose to type 2 diabetes in African Americans.
RESEARCH DESIGN AND METHODS
Selection criteria and recruitment.
This study was conducted under Institutional Review Board approval from Wake Forest University School of Medicine and adhered to the tenets of the Declaration of Helsinki. DNA samples were collected from self-described African-American families with multiple type 2 diabetes–affected members. Briefly, families were originally identified through a proband with impaired renal function associated with type 2 diabetes. Medical records were reviewed to verify the etiology of the nephropathy. Impaired renal function was attributed to diabetes in the presence of the following three criteria: serum creatinine ≥1.5 mg/dl, diabetes for >10 years or presence of proliferative diabetic retinopathy, and proteinuria ≥500 mg/24 h or >100 mg/dl, in the absence of other known causes of renal failure. Type 2 diabetes was diagnosed in patients developing diabetes after the age of 35 years and treated at the time of recruitment with oral hypoglycemic agents, insulin, or diet and exercise, where treatment was considered permanent (i.e., excluding steroid-induced diabetes and gestational diabetes). Recruitment strategies and selection criteria have been described in detail previously (12–17). The family set for the genome-wide scan comprised 247 African-American families with 638 affected sibling pairs (ASPs), totaling 675 individuals. Of these families, 177 contained two affected siblings, 41 had three affected siblings, 18 had four affected siblings, and 11 had >5 affected siblings, including 1 large family with seven affected siblings plus three affected offspring of one of these individuals. However, in general these family data consisted primarily of individuals from a single generation, with both parents available for only 2 families and one parent for 25 families. There were a total of 638 diabetes-affected individuals, 34 individuals without type 2 diabetes at recruitment, and 3 participants whose diabetes status was unknown. 
In 73 (30%) of the 247 genotyped African-American families, only the proband had both type 2 diabetes and impaired renal function. In 168 (68%) of the families, there were two or more family members with diabetic renal disease, containing 210 ASPs with diabetic renal disease from a total of 360 affected individuals, whereas in 6 families (2%) the renal status of relatives with type 2 diabetes was undetermined.
Genotyping.
DNA extraction was performed using the PureGene system (Gentra Systems, Minneapolis, MN). Through the International Type 2 Diabetes Linkage Analysis Consortium, a genome-wide scan was completed by the Center for Inherited Disease Research (CIDR). The marker set used was based on Marshfield Panel 8 and included 392 primer pairs at an average spacing of 8.9 cM and no gaps >18 cM.
Linkage analyses.
Each pedigree was examined for consistency of familial relationships using the Pedigree Relationship Statistical Test (18). When the self-reported familial relationships were strongly inconsistent with the genotypic data for that pedigree, then 1) the pedigree was modified when the identity-by-descent statistics suggested a very clear alternative, or 2) a minimal set of genotypic data were converted to missing. A total of 52 pedigrees (21%) exhibited probable misspecified familial relationships and were modified as above, with 84% of these changes from a full sibling to half-sibling. After modifying all family relationships that appeared to be inconsistent with the genome scan data, there were a total of 475 full sibling, 99 half-sibling, 66 parent-child, 31 avuncular, 3 half avuncular, and 3 grandparent-child genotyped affected relative pairs. Each marker was examined for Mendelian inconsistencies using PedCheck (19), and sporadic problem genotypes were converted to missing. Allele frequency estimates were computed using the maximum likelihood methods implemented in the software Recode (D. Weeks, personal communication). Map distances were based on the Marshfield genetic map (20).
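The Mendelian consistency check described above can be illustrated for the simplest case of a child with two genotyped parents. The following is a minimal sketch only: it is not the algorithm implemented in PedCheck, and the function names `mendel_consistent` and `flag_for_missing` and the integer allele coding are hypothetical choices made for illustration.

```python
from typing import List, Optional, Tuple

Genotype = Tuple[int, int]  # unordered pair of allele labels at one marker
Trio = Tuple[Optional[Genotype], Optional[Genotype], Optional[Genotype]]


def mendel_consistent(child: Optional[Genotype],
                      mother: Optional[Genotype],
                      father: Optional[Genotype]) -> bool:
    """Return True if the trio shows no Mendelian inconsistency.

    A missing genotype (None) cannot be checked in this simple sketch,
    so such trios are treated as consistent (an assumption).
    """
    if child is None or mother is None or father is None:
        return True
    a, b = child
    # One allele must come from the mother and the other from the father.
    return (a in mother and b in father) or (b in mother and a in father)


def flag_for_missing(trios: List[Trio]) -> List[int]:
    """Return indices of (child, mother, father) trios that are inconsistent."""
    return [i for i, (c, m, f) in enumerate(trios)
            if not mendel_consistent(c, m, f)]
```

In this sketch the flagged indices mark genotypes that would be converted to missing; a production tool must also handle pedigrees with ungenotyped parents, which make up most of the families in this study.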
The data were initially analyzed using the nonparametric linkage (NPL) pairs statistic (NPLpairs) and the exponential allele-sharing model implemented in Genehunter Plus (21). Genehunter (22) tests whether the observed inheritance at a marker deviates from proportions expected under independent assortment, i.e., whether allele sharing among affected relatives is greater than expected. Genehunter Plus (21) uses one-parameter alternative allele-sharing models for the distribution of the inheritance vector and sharing probabilities, which allows exact calculation of likelihood ratios and logarithm of odds (LOD) scores, and this approach is therefore less conservative than Genehunter. Both single-point and multipoint analyses were computed, although only multipoint results are reported.
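For orientation, a LOD score from a one-parameter allele-sharing model can be converted to an approximate pointwise P value. The sketch below assumes the usual large-sample approximation that the square root of 2 ln(10) × LOD is standard normal under the null hypothesis of no linkage, with a one-sided test for excess sharing. This conversion is an assumption added here for illustration and is not part of the analysis described in the text; the function name `lod_to_pvalue` is hypothetical.

```python
import math


def lod_to_pvalue(lod: float) -> float:
    """Approximate one-sided pointwise P value for an allele-sharing LOD.

    Assumes sqrt(2 * ln(10) * LOD) is asymptotically standard normal
    under the null hypothesis (one free parameter, excess sharing only).
    """
    if lod < 0:
        raise ValueError("this sketch expects a non-negative LOD score")
    z = math.sqrt(2.0 * math.log(10.0) * lod)
    # Upper-tail probability of a standard normal, computed via erfc.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


if __name__ == "__main__":
    for lod in (1.0, 2.0, 3.0):
        print(f"LOD {lod:.1f} -> approximate pointwise P = {lod_to_pvalue(lod):.2g}")
```

These are pointwise values for a single position; they do not account for testing across the whole genome.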
We also computed NPL regression analyses using the NPLpairs statistics output from Genehunter, which we modified (23–26). The single-locus models without covariates test excess allele sharing and are asymptotically equivalent to the Genehunter Plus allele-sharing model. To test for an interaction between two loci, we included the two loci and their statistical interaction in the model and computed the one-degree-of-freedom test of the interaction coefficient. In addition, we tested for interactions between the degree of sharing (identity by descent) at a locus and the pedigree-specific mean age at diabetes diagnosis or BMI. For further details, please see the online appendix (available from http://diabetes.diabetesjournals.org).
Ordered subsets linkage analysis.
A series of ordered subset analyses (OSAs) (27–29) was computed to investigate the influence of a pedigree’s mean age of diagnosis and mean BMI on linkage analyses. Analyses were conducted ranking the family-level means for these parameters in ascending, and then in descending, order. The statistical significance of the change in the LOD score was evaluated by a permutation test under the null hypothesis that the ranking of the covariate is independent of the LOD score of the family on the target chromosome. Thus, the families were randomly permuted with respect to the covariate ranking, and the analysis proceeded as above for each permutation of these data. The resulting empirical distribution of the change in the LOD scores yielded a chromosome-wide P value (ΔP). This method is described in greater detail in the online appendix.
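The ordered subset procedure and its permutation test can be sketched in a few lines. The code below is a simplified illustration under stated assumptions, not the software used in this study: it assumes that per-family LOD scores are available at a grid of positions on one chromosome and that subset LOD scores are obtained by summing family LOD scores. The function names (`max_subset_lod`, `osa`) and the toy data are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def max_subset_lod(family_lods: Sequence[Sequence[float]],
                   order: Sequence[int]) -> Tuple[float, int]:
    """Add families one at a time in the given order.

    Returns the largest summed LOD over all nested subsets and all map
    positions, together with the number of families in that subset.
    """
    running = [0.0] * len(family_lods[0])
    best, best_size = float("-inf"), 0
    for size, fam in enumerate(order, start=1):
        running = [r + x for r, x in zip(running, family_lods[fam])]
        peak = max(running)
        if peak > best:
            best, best_size = peak, size
    return best, best_size


def osa(family_lods: Sequence[Sequence[float]],
        covariate: Sequence[float],
        descending: bool = False,
        n_perm: int = 1000,
        seed: int = 0) -> Tuple[float, int, float, float]:
    """Return (subset max LOD, subset size, change in LOD, empirical P)."""
    order: List[int] = sorted(range(len(covariate)),
                              key=lambda i: covariate[i],
                              reverse=descending)
    # Maximum LOD on the chromosome using all families.
    overall = max(sum(col) for col in zip(*family_lods))
    best, size = max_subset_lod(family_lods, order)
    observed_delta = best - overall

    # Null hypothesis: covariate ranking is independent of family LODs,
    # so shuffle the family order and repeat the subset search.
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        shuffled = list(order)
        rng.shuffle(shuffled)
        perm_best, _ = max_subset_lod(family_lods, shuffled)
        if perm_best - overall >= observed_delta - 1e-12:
            hits += 1
    p_value = (hits + 1) / (n_perm + 1)
    return best, size, observed_delta, p_value


if __name__ == "__main__":
    # Toy data: six families, two map positions (made-up numbers).
    lods = [[0.5, 0.1], [0.4, 0.0], [0.6, -0.1],
            [-0.3, 0.0], [-0.4, 0.1], [-0.2, -0.1]]
    mean_age_at_dx = [30, 32, 35, 50, 52, 55]
    print(osa(lods, mean_age_at_dx))
```

Because the full sample is always one of the nested subsets, the change in LOD is never negative in this sketch; the permutation step is what guards against the upward bias introduced by choosing the best subset.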
RESULTS
Clinical and phenotypic data for diabetic African-American individuals.
The clinical and phenotypic characteristics for the diabetes-affected individuals who were genotyped as part of the genome-wide scan are summarized in Table 1. The genotyped population was 65% female, probably reflecting both the increased prevalence of type 2 diabetes among African-American women (30) and participation bias. The diabetes-affected individuals are obese (median BMI >30 kg/m2), have relatively poor glucose control (median HbA1c 8.0%, normal range 4.5–5.7%), and have relatively early onset (median age at diagnosis, 42 years).
Multipoint single-locus linkage results.
Genome-wide multipoint results are shown in Fig. 1, and the maximum LOD scores for each chromosome from multipoint analyses considering each locus separately (single-locus analyses) are presented in Table 2. Only two regions of the genome yielded LOD scores >1. Chromosome 6 at 163.5 cM had the strongest evidence for linkage with type 2 diabetes (LOD 2.26), and the second strongest evidence for linkage was on chromosome 22 at 32 cM (LOD 1.33).
Multilocus linkage analysis results.
The results of the multilocus NPL regression model are also shown in Table 2. Three chromosomal regions (6q, 7p, and 18q) remained statistically significant (P < 0.05) after adjusting for the evidence for linkage at the other two chromosomal regions. A comparison of the single locus and multilocus results for chromosome 6 is shown in Fig. 2. Interestingly, conditional on the model containing these three loci, no other regions of the genome provided significant evidence of linkage.
Influences of age at type 2 diabetes diagnosis and BMI
NPL regression interaction analyses for age at type 2 diabetes diagnosis or BMI.
The results of the NPL regression analyses testing for interaction between locus-specific linkage and age at diabetes diagnosis or BMI are summarized in Table 3. Regions showing statistically significant (P < 0.01) interactions with age at diagnosis or BMI, or corresponding with significant (P < 0.01) loci identified using OSA, are listed, and the direction and magnitude of each interaction are indicated by Pearson’s correlation coefficient. In total, eight loci displayed significant (P < 0.01) interactions with type 2 diabetes age at diagnosis, and additional loci on chromosomes 10 and 20 (P = 0.0119 and 0.0128) are included in Table 3 because they showed consistency with the OSA results. The strongest interactions with age at diagnosis are seen at 5p, 1p, and 7p, where the linked families at these loci were on average >3 years younger at diagnosis of type 2 diabetes (Table 3). The most significant interaction with BMI was detected at 3q, in a subgroup of slightly more obese families (mean BMI 32.4 kg/m2 in linked pedigrees versus 30.7 in those unlinked). Significant interactions (P < 0.01) with BMI were also detected at 1q and 7p (Table 3).
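The use of a correlation coefficient to summarize the direction of an interaction can be illustrated with a short sketch. The code below computes Pearson's correlation between per-pedigree allele-sharing scores at a locus and a pedigree-level covariate. It is an illustration only: the regression model actually used is specified in the online appendix and is not reproduced here, and the function name `pearson_r` and the toy values are hypothetical.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson's correlation coefficient between two equal-length samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Made-up pedigree-level values for illustration only.
    mean_age_at_dx = [35.0, 38.5, 41.0, 44.0, 50.5, 56.0]
    sharing_score = [1.8, 1.2, 0.9, 0.3, -0.2, -0.6]
    r = pearson_r(mean_age_at_dx, sharing_score)
    # A negative r means sharing is stronger in earlier-diagnosed pedigrees.
    print(f"r = {r:.2f}")
```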
OSA on age at type 2 diabetes diagnosis or BMI.
The OSA found differential evidence for linkage depending on the age at type 2 diabetes diagnosis and BMI. Regions displaying a significant increase in the LOD score (chromosome-wide P value for the change, ΔP < 0.01) are shown in Table 4, plus three regions that were consistent with the NPL regression interaction results (Table 3) for age at type 2 diabetes diagnosis (chromosomes 1 and 5, ΔP = 0.0128 and 0.0375, respectively) or BMI (chromosome 7, ΔP = 0.0704). Three regions of the genome exhibited significant OSA maximum LOD scores >2.5. Subset analysis on the 65 pedigrees (29%) with the earliest age of diagnosis increased the chromosome 7p LOD score from 0.48 to 3.85 (ΔP = 0.0034), as shown in Fig. 3. In contrast, subsetting on the 45 pedigrees (20%) with the latest age at type 2 diabetes diagnosis increased the chromosome 10 LOD score from 0.40 to 3.56 (ΔP = 0.0019), and subsetting on the 32 pedigrees (13%) with the latest age at diagnosis increased the LOD score on chromosome 20 from 0.13 to 2.57 (ΔP = 0.0080) (Table 4). The OSA based on BMI did not yield statistically significant increases in the LOD scores, although subset analysis on the 158 pedigrees (68%) with the lowest BMI increased the LOD score at chromosome 7p from 0.77 to 2.33 (ΔP = 0.0704; Table 4), consistent with the NPL regression interaction analysis with BMI (Table 3).
DISCUSSION
Our investigation represents the first large-scale effort to identify chromosomal regions with putative type 2 diabetes–predisposing genes in an African-American sample. Diabetes is a genetically complex multifactorial disease that requires sophisticated consideration of multigenic and phenotypic influences. As well as standard nonparametric methods, we used novel approaches to evaluate and identify locus heterogeneity. It has also proved productive to consider phenotypes such as age at type 2 diabetes onset and obesity, which may define a more homogeneous subgroup of families. A genome-wide scan of 247 African-American families has identified a locus on chromosome 6q and a region of 7p that apparently interacts with early-onset type 2 diabetes and low BMI, as target regions in the search for African-American type 2 diabetes susceptibility genes.
Our ascertainment scheme identifies families with patients with one of the most serious complications of diabetes. Thus our sample, the largest diabetic African-American family sample genotyped to date, may be more genetically homogeneous than many genome scans for type 2 diabetes. It is of note that the mean age at type 2 diabetes diagnosis of 41.8 years is almost 10 years younger than the figure of 51.0 years of age seen in the comparable African-American GENNID families (31), although our sample size was approximately fivefold larger. The relatively long duration of diabetes (mean 16.0 years) is likely to reflect our method of ascertainment. Because the probands had impaired renal function, at least one individual in each family has had a sufficient duration of type 2 diabetes to develop impaired renal function. The mean BMI of 31.7 kg/m2 is similar to that observed in the GENNID (31) and Project SuGAR (Sea Islands Genetic African American Registry) (32) African-American families, where mean BMI values were 33–34 kg/m2.
The only reported genome-wide scan to date of African Americans with type 2 diabetes has been conducted by the GENNID group (31). Investigation of a phenotype of impaired glucose homeostasis (combining type 2 diabetes, impaired glucose tolerance, and impaired fasting glucose) using 109 ASPs (152 individuals) showed the greatest evidence for linkage near D10S1412 (LOD 2.39). The only overlap in our data with this locus comes from the NPL regression interaction analysis with age at type 2 diabetes diagnosis (P = 0.007) (Table 3), where the linked families in our study had an older mean age at diagnosis. Genome-wide scans of type 2 diabetes (33) and insulin resistance syndrome phenotypes (34) from the Africa America Diabetes Mellitus study (35) of African diabetic families from Ghana and Nigeria reported in abstract form show no overlap with our linkage results.
Evidence supporting our linkage findings can be derived from two sources: replication of type 2 diabetes loci reported by previous studies and internal consistency between results using different statistical approaches. In unstratified analyses, the locus of greatest effect is on chromosome 6q24-27. In the GENNID genome-wide scan for impaired glucose homeostasis (combined type 2 diabetes, impaired glucose tolerance, and impaired fasting glucose) in African Americans, a LOD of ∼2.0 was detected on chromosome 6q (31). A study (36) of obesity in African Americans found that the second largest peak for percentage of body fat was at 6q, with an LOD of ∼1.5. NPL regression analysis revealed an interaction with BMI in our population at 155 cM (P = 0.0426), 8.5 cM proximal to the unadjusted 6q peak, but the linked families displayed a slightly lower mean BMI of 30.6 kg/m2, compared with a mean BMI of 32.0 kg/m2 in the unlinked pedigrees. Further work is required to understand the relationship, if any, between obesity and the linkage to type 2 diabetes within a region identified in the only two genome scans in samples of African Americans. Although the majority of families (68%) contain at least one relative pair concordant for both type 2 diabetes and impaired renal function, these families account for less than one-third of ASPs, and analysis using only the pedigrees concordant for impaired renal function did not provide evidence for linkage to 6q (data not shown).
Other genome-wide scans with at least some degree of overlap with our linkage peak at 6q24-q27 include those of type 2 diabetes in Pima Indians (37), possibly paternally derived (38), type 2 diabetes in Han Chinese (39), and measures of insulin resistance, obesity, and lipids in Mexican Americans (40–42). More proximal diabetes-related 6q loci (29,38,43) probably differ from that detected in the African-American families. Within the 35.5-cM 1-LOD interval of the 6q linkage peak, there are several plausible candidates for type 2 diabetes. These include genes for estrogen receptor 1, tubby superfamily protein, insulin-like growth factor 2 receptor, mitogen-activated protein kinase kinase 4, and manganese superoxide dismutase.
Although the NPL regression and OSA approaches are fundamentally different types of tests for interaction, it is important that they identified the same relationships. Specifically, the NPL regression interaction analysis tested whether the evidence for linkage varied by phenotype levels in the entire sample, whereas the OSA identified the trait-ordered subset of pedigrees that provided the maximal evidence for linkage and tested whether that maximum was a statistically significant increase over the entire sample. Consistent evidence for age at diagnosis effects on linkage from both nonparametric regression-based interaction analyses (Table 3) and OSA (Table 4) was obtained for chromosomes 1, 5, 7, 10, and 20. Maximal interaction values were within 4.5 cM of those revealed by OSA, and phenotypic effects were in identical directions for each locus. Similarly, the chromosome 7 locus found by the regression-based evaluation of interaction with BMI (Table 3) was also detected by subsetting on BMI (Table 4), with effects in the same direction and peaks at the same chromosomal position. It is interesting to note that the 6q linkage peak was not substantially changed using the subsetting approaches, suggesting that the hypothesized gene influences development of diabetes independently of age and adiposity.
The evidence for linkage to 7p was detected using the multilocus NPL regression method (Table 2), which accounts for some genetic heterogeneity, and both the lower BMI and earlier-onset interaction analyses of the NPL regression and OSA (Fig. 3). Given that major covariates for type 2 diabetes are higher BMI and later age of onset, linkage at 7p in those with lower BMI and earlier age at diagnosis could suggest that these individuals may have a higher genetic contribution to their diabetes status. The novel analytical approaches used can distinguish these individuals in the overall diabetic background. Linkage to this region of 7p has been reported in a study (44) of type 2 diabetes in Japanese families, with increased evidence for linkage in the families with mean family BMI <30 kg/m2. If this represents the same locus, both studies support a role in susceptibility in the leaner families, although the linked African-American families were still overweight or obese (mean family BMI 30.3 or 28.1 kg/m2) (Tables 3 and 4). Plausible candidates in this region include genes for phosphodiesterase 1C and neuropeptide Y.
Our interaction and OSA analyses of phenotypic parameters suffer from the following limitations. Age at diagnosis was used as a surrogate for age at type 2 diabetes onset, although diagnosis frequently lags behind disease onset and may only be made in the presence of other complications. However, actual age at onset can only be determined in prospective studies, where sample size would be prohibitive for family-based linkage studies. BMI was calculated from measures taken at study entry. Particularly for individuals who have had impaired renal function for some time, this is unlikely to accurately represent the degree of obesity that affected type 2 diabetes susceptibility. We now collect data on self-reported maximum BMI for newly recruited patients and are evaluating different methodological approaches to more precisely evaluate the relationship between BMI and type 2 diabetes.
Overall, the evidence for linkage in the entire sample is modest, but the primary peaks of interest exceed the Lander and Kruglyak (45) criteria for suggestive linkage in a genome-wide scan. As in any genome scan, one must be mindful that some of these linkage results could represent chance events. Approximately one-third (210 of 638) of the ASPs in the genome scan were concordant for diabetic renal disease. Although our major linkage signal at chromosome 6q appears to be driven by diabetes or a diabetes-related phenotype, not all linkage signals equally partition the linkage evidence into diabetes-related phenotypes or impaired renal function (data not shown). Novel analytical approaches have identified regions that apparently interact with type 2 diabetes onset and BMI, and priority loci have been targeted for fine mapping and further investigation in studies of African Americans with type 2 diabetes.
Additional information for this article can be found in an online appendix at http://diabetes.diabetesjournals.org.
Article Information
This work was supported by grants R01 DK53591 (to D.W.B.), HL56266 (to B.I.F.), and U01 DK58026 (International Type 2 Diabetes Linkage Analysis Consortium). Genotyping services were provided by the CIDR. CIDR is fully funded through a federal contract from the National Institutes of Health to The Johns Hopkins University, contract number N01-HG-65403.
We thank the patients, their relatives, and the staff of the Southeastern Kidney Council/ESRD Network 6 for their participation. Our thanks also go to Nancy Cox, Michael Boehnke, and Catherine McKeon of the International Type 2 Diabetes Linkage Analysis Consortium for their support and encouragement.