Both the coding and control regions of mitochondrial DNA (mtDNA) play roles in the generation of diabetes; however, no studies have thoroughly reported on the combined diabetogenic effects of variants in the two regions. We determined the mitochondrial haplogroup and the mtDNA sequence of the control region in 859 subjects with diabetes and 1,151 normoglycemic control subjects. Full-length mtDNA sequencing was performed in 40 subjects harboring specific diabetes-related haplogroups. Multivariate logistic regression analysis with adjustment for age, sex, and BMI revealed that mitochondrial haplogroup B4 was significantly associated with diabetes (DM) (odds ratio [OR], 1.54 [95% CI 1.18–2.02]; P < 0.001), whereas haplogroup D4 showed borderline resistance against DM generation (0.68 [0.49–0.94]; P = 0.02). Upon further study, we identified an mtDNA composite group susceptible to DM generation consisting of a 10398A allele in the coding region and a polycytosine variant at nucleotide pair 16184–16193 of the control region, as well as a resistant group consisting of the C5178A, A10398G, and T152C variants. The OR for the susceptible group is 1.31 (95% CI 1.04–1.67; P = 0.024) and that for the resistant group is 0.48 (0.31–0.75; P = 0.001). Our study found that mtDNA variants in the coding and control regions can have combined effects influencing diabetes generation.
Variants in mitochondrial DNA (mtDNA) have been suggested as potential genetic etiologies for the generation of type 2 diabetes mellitus (T2DM). An A-to-G transition at nucleotide pair (np) 3243 in the tRNALeu(UUR) gene of mtDNA is a well-delineated diabetogenic factor with no racial exclusivity (1). Other potential influences include mtDNA rearrangements and several other point mutations/polymorphisms located within the coding region of mtDNA (2–4). Recently, a T-to-C transition at np 16189, which creates a homopolymeric tract of cytosines at np 16184–16193, was found to be associated with the generation of T2DM in certain ethnic groups (5–7). Unlike the previously reported variants, this polycytosine tract (Poly-C) variant is located within the control (D-loop) region of mtDNA. Its diabetogenic pathomechanism was suggested to be linked to the replication process of mtDNA, an essential part of mitochondrial biogenesis (8). Its expression was also observed to be influenced by additional factors such as increased body weight and environmentally derived oxidative stress (9,10). However, its inconsistent diabetogenic role in different ethnicities suggests that additional factors are involved in the process (11,12).
The human mitochondrial genome contains two parts: a coding region encoding 13 mRNAs, 2 rRNAs, and 22 tRNAs, and a control region responsible for the expression of the mitochondrial genome. DNA variations can randomly arise within the coding and control regions of mtDNA during human evolution. These variants are inherited through the maternal lineage, and new variants may develop in each descendant branch of a population (13,14). These additional variants in each subgroup are further spread with the migration of populations into disparate ethnic groups. Because variants develop at a faster, asymmetric rate in the control region, different control region variants may appear within the same cluster of coding region variants (15). Several mtDNA haplogroups, primarily defined by variants in the coding region, were found to be associated with certain human diseases (16,17). However, the individual roles of control region variants in disease, and their combined effects when analyzed together with coding region variants, have yet to be fully investigated. In this article, we report the results of our study on the individual and combined effects of variants in the mtDNA control and coding regions on the development of diabetes.
RESEARCH DESIGN AND METHODS
A total of 2,010 unrelated Taiwanese subjects of ethnic Chinese background were enrolled in this study. The subjects were divided into two groups according to their personal history and presence of diabetes. Group 1 consisted of 859 subjects (480 male and 379 female) with known histories of diabetes who were receiving regular follow-up care in our hospital. Individuals who had diabetes onset before the age of 30 years were not included in our study. Group 2 was composed of 1,151 nondiabetic individuals (671 male and 480 female) randomly selected from the health screening center or outpatient service. Nondiabetic status was determined by the patient’s history, a fasting plasma glucose level of <6.1 mmol/L (110 mg/dL), and a normal blood glycosylated hemoglobin level (HbA1c <6.0%). Participants were all >30 years of age, the oldest being 80 years. Prior informed consent was given by all subjects. The studies were conducted according to the guidelines of the Declaration of Helsinki, and the study protocols were approved by the Ethics Committee of the Chang Gung Memorial Hospital.
Methods for determination of mitochondrial haplogroup.
Genomic DNA was extracted from whole blood. We used 24 primer pairs (Supplementary Table 1) to perform gene amplification by multiplex PCR. The amplicon size ranged from 190 to 300 base pairs. For this study, we used 94 probes for mitochondrial haplogroup definition. The oligonucleotide probe sequences for haplogrouping are shown in Supplementary Table 2. These synthesized oligonucleotide probes were modified at the 5′ end with a terminal amino group and covalently bound to the carboxylated fluorescent microbeads using ethylene dichloride. The oligonucleotide-labeled microbeads (oligobeads) were mixed together for hybridization. After hybridization, the amplicons were labeled with streptavidin-phycoerythrin using the Eppendorf MasterCycler gradient (Eppendorf). The reactions were then measured with the Luminex100 flow cytometer (Luminex). The detailed methodology used for genotyping followed the protocol described by Itoh et al. (18).
Selection of mitochondrial polymorphism for haplogroup classification.
By referencing the human mitochondrial single nucleotide polymorphism (mtSNP) database provided on the Mitomap website and the previously constructed phylogenetic trees for the Chinese population and the Japanese population, we selected 40 mtSNPs that define 15 major haplogroups (A, B, C, D, E, F, G, M7, M8, M9, M10, M11, M12, M13, and N9) and their constitutive subhaplogroups (B4, B5, D4, D5, F1, F2, F3, F4, M7a, M7b, M7c, M8a, and N9a) in our population (mtSNPs for the corresponding haplogroup are shown in Fig. 1) (19,20).
Methods for determination of mtDNA full-length sequence.
The entire mitochondrial genome was amplified as six fragments, each ∼3.0 kb in length, by a symmetric PCR method with the primer pairs (L and H primers) shown in Supplementary Table 3. The amplified fragments were analyzed by electrophoresis on a 1% agarose gel and visualized by staining with ethidium bromide. The first PCR DNA templates for the sequence analysis of the entire mitochondrial genome were amplified as 32 overlapping segments, each of ∼500–1,100 base pairs, by a symmetric PCR method with the primer pairs (FL and H primers) shown in Supplementary Tables 4 and 5, respectively. These second PCR fragments were analyzed by electrophoresis on a 1.5% agarose gel and visualized by staining with ethidium bromide. Primers used for sequence are shown in Supplementary Table 6. To identify each mtSNP, at least two overlapping DNA templates amplified with different primer pairs were used. Mitochondrial SNPs were identified by referencing the revised Cambridge sequence reported by Andrews et al. (21).
Methods for determination of mtDNA control region sequence.
The mtDNA control region segment (relative to region 15,911–602 in the Cambridge reference sequence) was amplified using forward primer L15911 (5′-ACCAGTCTTGTAAACCGGAG-3′) and reverse primer H602 (5′-GCTTTGAGGAGGTAAGCTAC-3′). The products were purified with gel extraction kits (Watson BioMedicals) and sequenced by using primer L15911 and another primer, L29 (5′-CTCACGGGAGCTCTCCATGC-3′), on an ABI 377XL DNA Sequencer (Applied Biosystems). However, because of the frequent conversion of thymine to cytosine and the presence of homopolymeric cytosine tracts at np 16184–16193 and 303–315 within the control region, the sequencing reaction terminated prematurely in samples harboring these variants. These samples were therefore sequenced in the reverse direction by using two additional primers: H81 (5′-CAGCGTCTCGCAATGCTATC-3′) and H602 (5′-GCTTTGAGGAGGTAAGCTAC-3′). DNA sequences were analyzed with the DNASTAR and BioEdit sequence analysis software.
Statistical analysis.
Statistical analysis was performed using the Statistical Package for Social Science program (SPSS for Windows, version 11.5; SPSS, Chicago, IL). We performed multivariate logistic regression analysis to adjust for risk factors, with diabetes as the dependent variable and with independent variables including age, sex, BMI, and occurrence rates of mtDNA haplogroups, rates of the identified mtDNA variants, or rates of the composite mtDNA variants. Results of the logistic regression model are presented as the odds ratio (OR) and 95% CI. Unless indicated otherwise, a P value <0.05 was considered significant. The χ2 test was used to compare the rate differences of mtDNA variants between haplogroups. Some haplogroups (M9, M10, M11, M12, M13, F3, F4, Y, and Z) with a limited number of cases were allocated to the groups “Others in N” or “Others in M” for comparison, according to their related macrohaplogroup on the phylogenetic tree. Because of multiple comparisons of mtDNA haplogroups, we applied Bonferroni correction. Because we examined 16 haplogroups (A, B4, B5, C, D4, D5, E, F1, F2, G, M7b, M7c, M8, N9, others in N, and others in M), we divided 0.05 by 16 to give 0.0031. Thus, a P value of <0.0031 was considered statistically significant for haplogroup comparisons.
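As an illustration of the quantities named above, the following minimal sketch computes an unadjusted OR with a Woolf (log-based) 95% CI and a Pearson χ2 statistic for a 2×2 table. It is not the adjusted multivariate logistic regression used in this study, and the counts and function names (`odds_ratio_ci`, `chi_square_2x2`) are hypothetical, chosen only for demonstration.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Woolf 95% CI for a 2x2 table.

    a: cases with the variant      b: cases without the variant
    c: controls with the variant   d: controls without the variant
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (1 df, no continuity correction)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Hypothetical counts, not data from this study.
or_value, ci_low, ci_high = odds_ratio_ci(30, 70, 20, 80)
chi2 = chi_square_2x2(30, 70, 20, 80)
print(f"OR = {or_value:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f}), chi2 = {chi2:.2f}")
```

An adjusted OR, as reported in this article, would instead come from exponentiating the fitted logistic regression coefficient for the variant with age, sex, and BMI included as covariates.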
RESULTS
Association between diabetes and mitochondrial haplogroup.
Multivariate logistic analysis with adjustment for age, sex, and BMI revealed that mitochondrial haplogroup B was associated with an increased risk of diabetes generation (multivariate OR, 1.52 [95% CI 1.21–1.91]; P < 0.001), whereas mitochondrial haplogroup D was associated with resistance against development of diabetes (0.74 [0.57–0.95]; P = 0.018) (Table 1). Further analysis showed an association between subhaplogroup B4 and the development of diabetes (1.54 [1.18–2.02]; P < 0.001). This association was not found in subjects harboring subhaplogroup B5. Within haplogroup D, the protection against diabetes development was found in subhaplogroup D4 (0.68 [0.49–0.94]; P = 0.02), whereas no such association was found in subhaplogroup D5. After Bonferroni correction (threshold P < 0.0031), the relationship between generation of diabetes and mitochondrial haplogroup B4 remained significant. However, the protective relationship between diabetes and mitochondrial haplogroup D4 was no longer significant.
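The Bonferroni step can be checked with a short sketch. The P values are those reported in this section; values reported as "P < 0.001" are entered as their upper bound 0.001, which is an assumption made only so that they can be compared with the threshold.

```python
ALPHA = 0.05
N_HAPLOGROUPS = 16  # number of haplogroups compared in this study

threshold = ALPHA / N_HAPLOGROUPS  # 0.003125, reported as 0.0031

# Reported P values; "P < 0.001" is represented by its upper bound.
reported_p = {
    "B": 0.001,
    "D": 0.018,
    "B4": 0.001,
    "D4": 0.02,
}

significant = {hg: p < threshold for hg, p in reported_p.items()}
for hg, is_sig in significant.items():
    print(hg, "significant" if is_sig else "not significant", "after correction")
```

Under this threshold the D and D4 associations (P = 0.018 and P = 0.02) fall short of significance, consistent with the description of D4 as only borderline resistant.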
Identification of probable diabetes-related mtDNA variants in the coding region.
To identify the specific mtDNA variants that may be implicated in the diabetogenic effects of mitochondrial haplogroups B and D, we performed full-length mtDNA sequencing in 40 unrelated cases known to harbor mitochondrial haplogroups B4, B5, D4, and D5, with 10 cases randomly selected from each haplogroup (Table 2). We then analyzed the mtDNA variants of each case, with special focus on nucleotide changes within functional locations of the mitochondrial genome.
Five nonsynonymous mtDNA variants, which cause changes in amino acids, were consistently found in subjects harboring certain mtDNA haplogroups. Among them, an A-to-G transition at np 10398 (A10398G) was noted in haplogroup B5 but not in the diabetes-susceptible haplogroup B4. Additionally, the A10398G variant was present in the borderline resistant haplogroup D, including the haplogroups D4 and D5. Other findings included the presence of G8584A only in subhaplogroup B5; C5178A and A8701G only in subhaplogroups D4 and D5; and A5301G only in subhaplogroup D5. Twenty additional nonsynonymous mtDNA variants were also found in subjects harboring the four mtDNA haplogroups. These variants are random and showed no consistent presence in particular haplogroups.
Eighteen synonymous mtDNA variants and a nine-base pair deletion between genes for cytochrome oxidase subunit II and t-RNAlys were consistently found in subjects harboring certain haplogroups. These included eight mtSNPs (C4883T, T9540C, C10400T, T10873C, C12705T, T14783C, G15043A, and G15301A) specifically present in haplogroup D; two additional mtSNPs (G3010A and C14668T) in subhaplogroup D4 only; and three additional mtSNPs (T1107C, A9180G, and A10397G) in subhaplogroup D5 only. We also noted 5 of the 18 mtSNPs (A3537G, C6960T, T9950C, G10325A, and A15235G) in subhaplogroup B5. The nine-base pair deletion is a hallmark consistently found in subhaplogroups B4 and B5.
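The marker pattern described above can be summarized as a toy decision rule. This is a simplified sketch based only on the variants named in this section (the nine-base pair deletion, C5178A, A5301G, and A10398G); it is not the 40-mtSNP panel used for haplogrouping in this study, and the function name and input format are hypothetical.

```python
def assign_subhaplogroup(alleles, has_9bp_deletion):
    """Toy assignment among B4, B5, D4, and D5.

    alleles: dict mapping nucleotide position to observed base,
             e.g. {5178: "A", 10398: "G"}.
    has_9bp_deletion: True if the nine-base pair deletion is present.
    """
    if has_9bp_deletion:
        # The deletion marks B4 and B5; 10398G separates B5 from B4.
        return "B5" if alleles.get(10398) == "G" else "B4"
    if alleles.get(5178) == "A":
        # 5178A marks D4 and D5; 5301G was seen only in D5.
        return "D5" if alleles.get(5301) == "G" else "D4"
    return "other"

print(assign_subhaplogroup({10398: "A"}, has_9bp_deletion=True))               # B4
print(assign_subhaplogroup({5178: "A", 5301: "G"}, has_9bp_deletion=False))    # D5
```

A real assignment would check many more positions against a phylogenetic tree, since single markers can recur on unrelated branches.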
Among these mtDNA variants, of note are the synonymous C10400T and the nonsynonymous A8701G variants, which are commonly considered major mtDNA markers for distinguishing mitochondrial superhaplogroup M from N. Another synonymous variant, C12705T, is found in all haplogroups other than superhaplogroup R (in which it is 12705C). The R superhaplogroup is a rooting ancestor for most of the Caucasian population and for the Asian population harboring haplogroups B and F. In addition, the A10398G variant is a specific marker found in all haplogroups within superhaplogroup M but also in some haplogroups within superhaplogroup N. These fundamental mtDNA variants are noted to have existed at the early stages of modern human evolution and can be traced back through phylogenetics to the beginnings of human global migration.
Identification of probable diabetes-related mtDNA variants in the control region.
In addition to the identified coding region variants in these diabetes-related haplogroups, we studied the role of mtDNA control region variants in the generation of diabetes (Figs. 2 and 3). Sequencing of the mtDNA control region from np 16024 to 576 was therefore performed in all 2,010 cases. We then analyzed the variants specifically present in the diabetes-related haplogroup B by comparing the subject group harboring B with the counterpart group harboring non-B haplogroups to identify the control region mtDNA variants probably implicated in the generation of diabetes. We also analyzed the different occurrence rates between subhaplogroups B4 and B5, as well as between D4 and D5, to determine the mtDNA variants probably responsible for the diabetes-related effects of these subhaplogroups. Two hypervariable locations of mtDNA in the control region, which cause the formation of a continuous Poly-C variant, or loss thereof, were specifically identified and compared on a segmental pattern between np 303–315 and 16184–16193 (22).
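To make the formation of a continuous Poly-C tract concrete, the sketch below applies the T16189C transition to the np 16184–16193 segment and tests whether the result is an uninterrupted run of cytosines. The reference segment is taken here as CCCCCTCCCC, which is consistent with a thymine at np 16189 as described in this article but should be treated as an assumption to verify against the reference sequence. The helper names are hypothetical, and length variation of the tract is ignored.

```python
SEGMENT_START = 16184
# Assumed reference bases for np 16184-16193 (T at np 16189).
REF_SEGMENT = "CCCCCTCCCC"

def apply_snp(segment, position, new_base, start=SEGMENT_START):
    """Return the segment with the base at `position` replaced."""
    index = position - start
    if not 0 <= index < len(segment):
        raise ValueError("position lies outside the segment")
    return segment[:index] + new_base + segment[index + 1:]

def is_uninterrupted_poly_c(segment):
    """True if every base in the segment is a cytosine."""
    return len(segment) > 0 and set(segment) == {"C"}

variant_segment = apply_snp(REF_SEGMENT, 16189, "C")
print(is_uninterrupted_poly_c(REF_SEGMENT))      # reference: interrupted by T
print(is_uninterrupted_poly_c(variant_segment))  # T16189C: continuous tract
```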
Between 7 and 28 mtSNP variants (average 15.3 ± 2.8 mtSNP) were found per subject, encompassing 0.6–2.4% of the total 1,122 nucleotide pair numbers (from np 16024 to 576) in the control region of mtDNA. After analysis of the mtDNA variants specifically present in the diabetes-related haplogroups, we found that 64 mtSNPs and the Poly-C variant at np 16184–16193 showed significant differences of occurrence between subjects harboring haplogroup B (n = 423) and the rest of the subjects (non-B haplogroups; n = 1,587). The difference between the occurrence rates of these specific mtDNA variants is shown in Fig. 2. Among them, the T16189C, T16217C mtSNP, and the Poly-C variant were noted to have distinctly higher rates of occurrence in haplogroup B. As well, mtSNPs at T16519C, T16140C, C16261T, T16136C, A210G, C16266A, G16274A, and G207A also showed higher rates of occurrence in haplogroup B. Contrarily, mtSNPs of C16223T, T489C, T16362C, T16304C, T152C, T16172C, T199C, and T16298C showed notably low rates of occurrence in haplogroup B. We also noted an additional 46 mtSNPs showing significant difference of occurrence rates with rate differences <12%. These control region variants are candidates probably implicated in the generation of diabetes among subjects harboring haplogroup B.
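The figure of 1,122 nucleotide pairs follows from the circular numbering of mtDNA, because the control region runs past the last position and wraps around to np 1. A minimal sketch of that arithmetic, assuming the 16,569-np length of the Cambridge reference sequence (the function name is hypothetical):

```python
MTDNA_LENGTH = 16569  # length of the Cambridge reference sequence

def circular_span(start, end, genome_length=MTDNA_LENGTH):
    """Number of positions from start to end inclusive on a circular genome."""
    if start <= end:
        return end - start + 1
    # The region wraps past the last position back to position 1.
    return (genome_length - start + 1) + end

control_region_length = circular_span(16024, 576)
print(control_region_length)  # 1122
```

Dividing the number of variants found in a subject by this length gives the share of control region positions that differ from the reference.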
Upon further subhaplogroup analysis, we found that 17 mtSNPs had distinctly different rates of occurrence between B4 and B5, whereas 12 mtSNPs had distinctly different rates of occurrence between D4 and D5 (Fig. 3). Among these mtSNPs, T16217C was noted to be specifically present in the diabetes-susceptible B4 subhaplogroup, and C16261T, T16136C, and G499A had high occurrence rates in B4, whereas T16140C, A210G, and C16266T had high occurrence rates only in B5. Regarding the rate differences between D4 and D5, we have noted a high occurrence rate of C150T in D5 and moderate rates of C456T in D5 and G16129A in D4. There is an ∼100% rate of occurrence of Poly-C in D5, but a complete absence of the Poly-C variant in D4.
Investigation into the combined effects of coding and control region mtDNA variants on diabetes generation.
A serial analysis was conducted to examine the combined effects of the coding and control region mtDNA variants on diabetes generation (Table 3). Composite groups combined the identified coding region mtSNPs specifically present in these diabetes-related haplogroups (5178A, 5301G, 8584A, 8701A, 8701G, 10398A, 10398G, 10400C, 10400T, 12705C, and 12705T) with those control region variants showing a significantly high difference in occurrence rate between the analyzed diabetes-related mtDNA haplogroups and subhaplogroups. Of the 65 control region variants with significant rate differences between the B and non-B groups and/or between subhaplogroups B4 and B5 or D4 and D5, six (Poly-C, T16217C, T16519C, T16136C, C16223T, and T152C) also had significant relationships with diabetes when combined with one of the previously identified coding region variants; these are listed in Table 3. The remaining 59 variants met the above criteria but, after a preliminary logistic analysis, lacked statistically significant relationships with diabetes when combined with one of the previously identified coding region variants and are therefore not shown in the table.
After multivariate logistic analysis with adjustment for age, sex, and BMI, one control region variant (T152C) exhibited consistently resistant effects against diabetes generation when in combination with the 5178A, 8701G, 10398G, and 10400T variants in the coding region, whereas three control region variants (T16136C, T16217C, and the np 16184–16193 Poly-C) showed consistently susceptible effects when in combination with the wild-type 8701A, 10398A, 10400C, and 12705C alleles in the coding region. ORs for the significant composite groups are shown in Table 3. Some additional composite groups also showed a relationship with the generation of diabetes (resistant: 5178A–16519C and 12705T–16223T; susceptible: 12705C–16519C), but these were present only at random.
Identification of specific mtDNA composite groups for diabetes generation.
Through the above analysis, we identified some mtDNA composite groups consisting of specific control and coding region mtDNA alleles with implications in the generation of diabetes. Of those composite groups associated with susceptibility, three control region variants, the T16136C, T16217C, and the np 16184–16193 Poly-C, were found to associate with four coding region mtSNPs. These four mtSNPs match with the early ancestral determining mtSNPs of the human phylogenetic branches. The 8701A and 10400C are the determinant markers of macrohaplogroup N; as well, 12705C is the marker of N’s daughter clade, the R macrohaplogroup. The other coding region mtSNP of note is the 10398A, which is present in a majority of N haplogroups, changing to 10398G in some haplogroups, including the B5, Y (commonly found in the Asian population) and the I, J1, J2, and K1 haplogroups (commonly found in the European population). This mtDNA allele, when combined with the 16136C, 16217C, or Poly-C variant forms composite groups that are found in the diabetes-susceptible B4 subhaplogroup. The 16136C and 16217C variants, present only in the B4 subhaplogroup, are nonfunctional variants. Interestingly, although the Poly-C variant is present in both the B4 and B5 subhaplogroups, the key factor influencing diabetes generation lies in the 10398A–Poly-C composite group, which is present only in the B4 subhaplogroup, whereas in the B5 subhaplogroup, it is 10398G–Poly-C. The 10398A–Poly-C group shows consistent diabetogenic effects with an OR of 1.31 (95% CI 1.04–1.67; P = 0.024). It must be noted that the absence of this composite group, subsequent to the variant 10398A mutating into 10398G, likely negates such diabetogenic effects. The existence or absence of the composite group (10398A–Poly-C) in the diabetes-related haplogroups B4, B5, D4, and D5 appears to be a contributing factor influencing either diabetes resistance or susceptibility (Fig. 1).
For the mtDNA composite groups associated with resistance, the T152C variant, when combined with four different coding region mtSNPs, shows resistant effects against diabetes generation. Of the four mtSNPs, 10400T, 8701G, and 10398G are determinant markers of the macrohaplogroup M, whereas 5178A is a determinant marker of haplogroup D. The presence of these diabetes-resistant composite groups can also be found on the early branches of the phylogenetic mtDNA tree. But, this characteristic became evident only after the additional presence of the variants 5178A and 152C in the phylogenetic branches.
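The composite groups discussed above can be expressed as a simple genotype rule. The sketch below labels a subject as carrying the susceptible composite (10398A together with the np 16184–16193 Poly-C) or the resistant composite as summarized in the abstract (5178A, 10398G, and 152C together). The `Genotype` class and `composite_group` function are hypothetical names introduced for illustration, and the rule says nothing about an individual's actual risk.

```python
from dataclasses import dataclass

@dataclass
class Genotype:
    allele_5178: str          # "C" or "A"
    allele_10398: str         # "A" or "G"
    allele_152: str           # "T" or "C"
    poly_c_16184_16193: bool  # uninterrupted Poly-C present

def composite_group(g):
    """Label the mtDNA composite group carried by a genotype."""
    if g.allele_10398 == "A" and g.poly_c_16184_16193:
        return "susceptible"
    if g.allele_5178 == "A" and g.allele_10398 == "G" and g.allele_152 == "C":
        return "resistant"
    return "neither"

# Pattern of a B4-like genotype: 10398A with Poly-C.
print(composite_group(Genotype("C", "A", "T", True)))   # susceptible
# Pattern of a D4-like genotype: 5178A, 10398G, 152C, no Poly-C.
print(composite_group(Genotype("A", "G", "C", False)))  # resistant
```

Because the two composites require different alleles at np 10398, a genotype cannot fall into both groups.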
Examination of distribution of the identified composite components within different mitochondrial haplogroups and lineage thereof.
In an attempt to clarify the role of a founder effect, which might be implicated in the diabetogenesis of the identified mtDNA variants, we analyzed the presentation rate of each variant within different mitochondrial haplogroups. Table 4 shows the results. Generally, the identified mtDNA variants appear randomly dispersed, although partial clusters were noted in certain mitochondrial haplogroups. Of the diabetes susceptible composite components, Poly-C was randomly distributed within different haplogroups of the macrohaplogroups N and M. It was found to have a notably high presentation rate (93–100%) in the diabetogenic-related haplogroups B4, B5, and D5. The variants T16136C and T16217C were exclusively present in haplogroup B4. Of the diabetes-resistant composite components, we noted the presence of the T152C variant in almost all mitochondrial haplogroups, but with notably high rates of presentation in haplogroups A, D4, G, and Z.
DISCUSSION
In this study, we identified the mitochondrial haplogroup B as a group susceptible to the development of diabetes, whereas haplogroup D was found to be a borderline resistant group. Upon further analysis, these associations were only found in the subhaplogroups B4 (susceptible) and D4 (resistant). The counterpart subhaplogroups B5 and D5 showed no significant associations with the generation of diabetes. In a further comprehensive study of the differences between these implicated subhaplogroups, we found that the lack of mtDNA A10398G variant in the coding region distinguishes the diabetes-susceptible B4 from D4, D5, and B5; as well, a lack of a Poly-C variant at np 16184–16193 of the control region distinguishes the diabetes-resistant D4 from D5, B4, and B5 (Fig. 3). The persistence of the 10398A genotype in the coding region together with the presence of a Poly-C variant in the control region appears to play a role in the characterization of B4 as a diabetes-susceptible subhaplogroup. This observation was subsequently authenticated by the finding of a positive association between the presence of a composite 10398A–Poly-C mtDNA group and the generation of diabetes in our subjects. In our study, we also illustrate a random dispersion of Poly-C and other variants as found in different lineages of mitochondrial haplogroups (Table 4). These findings suggest the possibility of a founder effect playing a significant role is minimal at best.
An A-to-G transition at np 10398 of mtDNA, which causes an amino acid change from threonine to alanine, has previously been observed to be implicated in the generation of certain diseases. However, due to the lack of a consensus and the inconsistency of findings on the causative role of the 10398G allele on human conditions, it is plausible that other factors are influential. Further analysis found that additional variants in the coding or control regions of mtDNA also play roles in the generation of disease (23–26). This notion is further supported by our findings of the diabetes-related effects of the 10398A or 10398G alleles when present in combination with the Poly-C and 152C mtDNA variants. A thorough screening and subsequent study of ethnospecific mtDNA variants on disease association is therefore mandatory. There are also four additional nonsynonymous amino acid changes identified from the comparison between subject groups harboring diabetes-susceptible B4 and B5 and diabetes-resistant D4 and D5 haplogroups. These amino-acid substitutions are likely functionally conservative due to their negative association with diabetes in our study. However, a possible change in protein structure and functional expression after these substitutions warrants further study in order to develop a more comprehensive understanding of their roles.
An uninterrupted Poly-C variant at np 16184–16193 of mtDNA control region has been identified as a genetic risk factor for generation of T2DM in certain ethnic groups, especially within the East Asian population (7,9,27). However, this relationship with T2DM in the European population is still debatable (11,12). The suggested pathomechanism for this association is that a Poly-C variant may cause a defect in the replication process of mtDNA, which in turn decreases the mtDNA copy number and can lead to disease (28). This hypothesis has recently been authenticated by the finding of an association between the presence of an uninterrupted Poly-C and a low content of mtDNA copy number in human peripheral blood cells (22). Influence on either the replication or transcription process of mtDNA by noncoding region variants or the Poly-C variant was also illustrated by recent cellular model studies (29,30). This finding supports the notion that quantitative change of mtDNA copy number, subsequent to a variation in the control region of mtDNA, could be an epistatic cause of medical diseases. Our present study suggests the necessity of further characterization of the mutual interaction, either augmentation or attenuation, between different mtDNA variants in the coding region and control region in future studies of the pathological role of mtDNA variants on medical diseases.
Mitochondria are critical intracellular organelles responsible for cellular energy supply. Their roles in fulfilling physiological demands and overcoming the pathological challenges of cells are regulated by a highly efficient cellular pathway responsible for the transcription and replication of mtDNA, known as mitochondrial biogenesis. mtDNA, being a template for the execution of these two processes, plays an important role. In a recent study, a low-binding affinity between the single-stranded DNA-binding protein, activated during mtDNA replication, and the control region of cybrid cells harboring a Poly-C variant was observed (7). This observation offers further pathological evidence for a previous hypothesis suggesting that a Poly-C variant causes defects in the replication process. A study of the influence of mtDNA variants on cellular function also found that a particular mtDNA composite group, made up of a specific combination of the 8701/10398 alleles, may affect mitochondrial matrix pH and intracellular calcium dynamics, potentially causing pathophysiological change in cells and generation of disease (31). These two observations from cellular models imply that specific mtDNA change in either coding region or control region, or their combined effects, as an etiology of pathogenesis of medical diseases is plausible.
In this study, we found that an mtDNA composite group, consisting of a wild 10398A allele and an np 16184–16193 Poly-C variant, has a significant association with generation of diabetes. A similar finding has previously been reported in a study of the North Indian population (32). This composite group was specifically present in the subhaplogroup B4 and partially present in its rooting macrohaplogroups, R and N. However, it was not present in the macrohaplogroup M. Phylogenetically, N and M macrohaplogroups are believed to represent the two major human macrohaplogroups originating from the African continent. A variant form of np 10398, mutated from A to G, was found as a characteristic allele present in all descendents of macrohaplogroup M, but not N. However, it was noted that some haplogroups within the N macrohaplogroup also harbor 10398G, including the Caucasian haplogroups J and K and the Asian haplogroups B5 and Y. Of these haplogroups, J and K are found to have potentially protective effects against development of Parkinson disease (23), whereas in our study, we found that B5 is likely to provide a protective effect against development of diabetes. The harboring of this specific 10398G variant, possibly when associated with other nonsynonymous variants in the coding region, is notable for its potentially important role in human evolution by providing beneficial effects that protect against the development of certain human diseases.
ACKNOWLEDGMENTS
This work was supported by research grants NSC-95-2314-B-182A-072 and NSC-96-2314-B-182A-122 from the National Science Council (Republic of China) and grants CMRPG850243, CMRPG850253, and CMRPG850263 from Chang Gung University College of Medicine and Kaohsiung Chang Gung Memorial Hospital. The Sequencing Core Facility is supported by the National Research Program for Genomic Medicine, National Science Council.
No potential conflicts of interest relevant to this article were reported.
C.-W.L. wrote the manuscript and researched data. J.-B.C. and T.-L.H. researched data. M.-M.T., S.-D.C., and Y.-C.C. reviewed and edited the manuscript. J.-H.C. contributed to discussion and reviewed and edited the manuscript. S.-W.W. and W.-C.L. researched data and contributed to discussion. T.-K.L. and P.-W.W. researched data and edited the manuscript. C.-W.L. is the guarantor of this work and, as such, had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
The authors thank James Waddell of the Han Mei Language Institute in Kaohsiung, Taiwan, for proofreading assistance. The authors also acknowledge the technical support provided by the Sequencing Core Facility of the National Yang-Ming University Genome Research Center.