FUT2 encodes the α(1,2) fucosyltransferase that determines blood group secretor status. Homozygotes (A/A) for the common nonsense mutation rs601338A>G (W143X) are nonsecretors and are unable to express histo-blood group antigens in secretions and on mucosal surfaces. This mutation has been reported to provide resistance to Norovirus and susceptibility to Crohn’s disease, and hence we aimed to determine if it also affects risk of type 1 diabetes.
rs601338A>G was genotyped in 8,344 patients with type 1 diabetes, 10,008 control subjects, and 3,360 type 1 diabetic families. Logistic regression models were used to analyze the case-control collection, and conditional logistic regression was used to analyze the family collection.
The nonsecretor A/A genotype of rs601338A>G was found to confer susceptibility to type 1 diabetes in both the case-control and family collections (odds ratio for AA 1.29 [95% CI 1.20–1.37] and relative risk for AA 1.22 [95% CI 1.12–1.32]; combined P = 4.3 × 10−18), based on a recessive effects model.
Our findings linking FUT2 and type 1 diabetes highlight the intriguing relationship between host resistance to infections and susceptibility to autoimmune disease.
Type 1 diabetes results from the complex interaction between genetic predisposition, the immune system, and the environment (1). The incidence of the autoimmune disease type 1 diabetes has increased dramatically over the last 25 years, which cannot be explained by genetic variation in the population. This increase in type 1 diabetes incidence is most likely due to the changing influence of many environmental factors—such as toxins, cow’s milk, vitamin D, and microbial infections—acting on an underlying genetic susceptibility. One hypothesis is that our increasingly hygienic, healthy environment has altered the microbiomes of individuals at risk and the immune system’s level of development and regulation such that in a higher proportion of children it is imbalanced away from regulation and toward autoimmunity (1). Accordingly, one would expect to find genes that are involved directly in host resistance to infectious disease among type 1 diabetes susceptibility genes (1,2).
The FUT2 gene on chromosome 19q13.33 codes for the α(1,2) fucosyltransferase responsible for the synthesis of the H antigen, which is the precursor of the ABO histo-blood group antigens in body fluids and on the intestinal mucosa. Subjects homozygous for FUT2 null alleles are nonsecretors (se) who do not express ABO antigens in saliva and the gastrointestinal tract. Nonsecretor status results from being homozygous for one of two nonsecretor variants: se428, which codes for a stop codon at position 143 (W143X rs601338A>G) in Europeans and Africans; and se385 (I129F, rs1047781A>T) in South East and East Asians.
The H antigen secretion depends on additional glycosyltransferases, including the Lewis (FUT3) A/B enzymes, found in epithelial cells and erythrocytes. Both the Lewis and ABO genes are highly polymorphic, giving rise to null phenotypes, which are thought to be related to evolutionary changes for host-pathogen interactions because many pathogens use surface glycoproteins in host evasion (3).
The genetic inability to secrete ABO blood group antigens in body fluids has been associated with a variety of infectious diseases, in particular with susceptibility to Candida albicans (4,5) and Streptococcus infection (6). However, FUT2 secretors are more susceptible to Helicobacter pylori (7). Additionally, the FUT2 143X/rs601338A allele has been implicated in higher circulating serum vitamin B12 levels (8) and slower progression of HIV (9); and most intriguingly, it provides resistance to Norovirus infection (Norwalk virus) (10–12). FUT2 nonsecretors have been shown to have almost complete protection against GGII Noroviruses, which are a major cause of acute gastroenteritis worldwide, because they do not express the H type-1 oligosaccharide ligand required for GGII Norovirus binding (10–12).
Recently, a genome-wide association (GWA) study (13) and a meta-analysis of GWA studies (14) identified the FUT2 region as a Crohn’s disease locus. In the GWA study, McGovern et al. (13) reported several single nucleotide polymorphisms (SNPs) in high linkage disequilibrium (r2 > 0.80) with evidence of association with Crohn’s disease: rs492602, rs601338 (W143X), rs602662 (S258G), and rs485186. In the meta-analysis, Franke et al. (14) reported an association with rs281379 (odds ratio for allele A 1.07 [95% CI 1.04–1.11]; P = 7.4 × 10−12); r2 between rs281379 and the nonsecretor SNP rs601338 was 0.80 (14).
The FUT2 region also showed some evidence of association in a recent meta-analysis of type 1 diabetes GWA studies (15) (www.t1dbase.org). The most associated SNP was rs485186 T>C (multiplicative allelic effects P = 1.3 × 10−5, and genotype effects P = 3.3 × 10−8), which is in linkage disequilibrium with rs601338 (r2 of 0.85 in 60 unrelated CEU parents [www.1000genomes.org]). Because the FUT2 SNP rs485186 T>C association was based on partially imputed genotypes and did not meet the criteria for follow-up in the type 1 diabetes GWA meta-analysis study (15) (multiplicative allelic effects P < 1 × 10−6 [15]), and because rs601338A>G (W143X) underlies the nonsecretor genotype (A/A) in subjects of white European ancestry, we genotyped rs601338 in 8,344 case subjects, 10,008 control subjects, and 3,360 families (providing 5,182 trios) and tested for an association with type 1 diabetes.
RESEARCH DESIGN AND METHODS
Case subjects were diagnosed with type 1 diabetes before 17 years of age (mean age at diagnosis 7.8 years) and were from the Juvenile Diabetes Research Foundation/Wellcome Trust Diabetes and Inflammation Laboratory Genetic Resource Investigating Diabetes study (www.childhood-diabetes.org.uk/grid.shtml). Control subjects were obtained from the British 1958 Birth Cohort (www.cls.ioe.ac.uk/studies.asp?section=000100020003) and the Wellcome Trust Case-Control Consortium U.K. Blood Service Common Control sample collection (15,16). All samples were of white European ancestry.
Families consisted of 211 multiplex families from Northern Ireland, 394 multiplex families from the Diabetes U.K. Warren collection, 52 multiplex families from Yorkshire (U.K.), 246 simplex/multiplex families from Romania, 299 multiplex families from the Human Biological Data Interchange (U.S.), 951 simplex/multiplex families from Finland, and 1,207 families made available through the Type 1 Diabetes Genetics Consortium (T1DGC) (http://www-gene.cimr.cam.ac.uk/todd/dna-refs.shtml; http://www.t1dgc.org). The T1DGC families comprised 579 families from across Europe, 481 from North America, 52 from the U.K., and 95 from the Asian-Pacific region.
Genotyping.
A total of 8,344 type 1 diabetic subjects, 10,008 control subjects, and 3,360 type 1 diabetic families were genotyped at the FUT2 SNP rs601338 and at two HLA tag SNPs rs2187668 and rs660895 using TaqMan 5′ nuclease assay (Applied Biosystems, Warrington, U.K.) according to the manufacturer’s protocol. Genotyping was performed blind to case-control status and double scored to minimize error. Over 3,000 samples from the family collections were double genotyped, and no discordances were found.
It has been reported that there is a SNP (rs1800459) adjacent to rs601338 (8), which could be problematic for TaqMan primer or probe binding efficiency. We resequenced the flanking region of rs601338/rs1800459 in 768 samples, a mix of case and control subjects, and found rs1800459 to be monomorphic. No additional SNPs were found in the probe or primer sequences; therefore, we do not agree with Hazra et al. (8). The sequenced genotypes for rs601338 were concordant with the TaqMan genotypes.
Primers and probes were as follows: forward primer GGGAGTACGTCCGCTTCAC; reverse primer TGGCGGAGGTGGTGGTA; FAM probe CTGCTCCTGGACCTT; and VIC probe CCTGCTCCTAGACCTT. HLA-DRB1 and HLA-DQB1 were genotyped at four digit resolution using Dynal RELI SSO assays (Invitrogen Ltd, Paisley, U.K.).
Statistical analysis.
All statistical analyses were performed in either Stata (www.stata.com) or R (www.r-project.org). The genotype frequencies of SNP rs601338 in the control subjects conformed to Hardy-Weinberg equilibrium (HWE; P = 0.62). In contrast, we found deviation in the case subjects (P = 1.1 × 10−5), affected offspring (P = 2.2 × 10−4), and the unaffected parents (P = 0.0033), all adjusted for population. Disease association can result in deviation from HWE in case subjects, affected offspring, and parents of affected offspring, who are not representative of the general population. The deviation is due to an excess of A/A homozygotes in the case subjects. In Crohn’s disease case subjects, rs601338 genotype frequencies also deviated from HWE (13). Nevertheless, we performed a number of genotyping checks, including inspection of allele signal intensity plots, regenotyping 3,000 samples from the family collections and resequencing the flanking regions of rs601338 to check for unknown polymorphisms. No inconsistencies were detected.
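The Hardy-Weinberg comparison described above can be illustrated with a minimal sketch. It applies a crude 1-df Pearson chi-square test to the case and control genotype counts listed in Table 1. It is not adjusted for population, so its P values are only expected to be of the same order as those reported here, and the function name `hwe_chi2_test` is an illustrative choice rather than part of the original analysis.

```python
import math

def hwe_chi2_test(n_gg, n_ag, n_aa):
    """Crude Pearson chi-square test (1 df) of Hardy-Weinberg proportions."""
    n = n_gg + n_ag + n_aa
    p_a = (2 * n_aa + n_ag) / (2 * n)  # frequency of the A (nonsecretor) allele
    p_g = 1.0 - p_a
    expected = {
        "GG": n * p_g ** 2,
        "AG": 2 * n * p_a * p_g,
        "AA": n * p_a ** 2,
    }
    observed = {"GG": n_gg, "AG": n_ag, "AA": n_aa}
    chi2 = sum((observed[g] - expected[g]) ** 2 / expected[g] for g in observed)
    # Upper tail probability of a chi-square variable with 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value, expected

# Genotype counts taken from Table 1 (case-control collection)
controls = hwe_chi2_test(n_gg=2435, n_ag=4978, n_aa=2595)
cases = hwe_chi2_test(n_gg=1849, n_ag=3943, n_aa=2552)

print("controls: chi2 = %.2f, P = %.2g" % controls[:2])
print("cases:    chi2 = %.2f, P = %.2g" % cases[:2])
```

In this unadjusted sketch the observed number of A/A case subjects exceeds the number expected under HWE, which is the direction of deviation described in the text.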
The case-control collection was analyzed using a logistic regression model, adjusted for 12 geographical regions within Great Britain (Southwestern, Southern, Southeastern, London, Eastern, Wales, Midlands, North Midlands, Northwestern, East and West Riding, Northern, and Scotland) to account for possible confounding by geographical variation in genotype frequency and disease incidence. These regions corresponded to the place of collection for case and control subjects. Because the case and control subjects were well matched for region, this stratification resulted in little loss of power (17). The family collection was analyzed using conditional logistic regression, after generating pseudo-controls (18).
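The pseudo-control construction used for the family collection can be sketched for a single biallelic SNP as follows. This is a simplified illustration of the general idea (the affected offspring is matched with the three other genotypes the parents could have transmitted), not the implementation of Cordell et al. (18); the function name and the genotype encoding as two-letter strings are assumptions made for this example.

```python
from itertools import product

def pseudo_controls(father, mother, child):
    """Return the three untransmitted genotypes for one parent-offspring trio.

    Genotypes are two-character strings such as "AG". Each possible
    offspring genotype combines one paternal and one maternal allele.
    """
    possible = ["".join(sorted(f + m)) for f, m in product(father, mother)]
    case = "".join(sorted(child))
    if case not in possible:
        raise ValueError("child genotype is not compatible with the parents")
    possible.remove(case)  # drop one copy: the genotype actually transmitted
    return possible

# Two heterozygous parents with an A/A affected child
print(pseudo_controls("AG", "AG", "AA"))
```

Each case and its three pseudo-controls then form one matched set for the conditional logistic regression.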
When testing for an association between type 1 diabetes and rs601338, we performed a 1-degree of freedom (df) likelihood ratio test to determine whether a 1-df multiplicative allelic effects model was an appropriate approximation for the data or whether a 2-df genotype effects model (no specific mode of inheritance assumed) was required. We obtained evidence for nonmultiplicative allelic effects in both the case-control and family collections (P = 2.8 × 10−3 and 0.025, respectively, for rejection of the multiplicative allelic effects model). The case-control and family genotype risks indicated a recessive predisposing effect (see genotype risks in Table 1), because two copies of the A allele (the nonsecretor allele) were required to confer susceptibility to type 1 diabetes. Consequently, we then performed an additional 1-df likelihood ratio test to determine whether a 1-df recessive effects model or a 2-df genotype effects model was more appropriate for the data. The scores and their variances were summed to pool case-control and family information. Importantly, there was no evidence of heterogeneity in the disease associations across different populations, regions or countries, despite there being significant frequency differences (Supplementary Tables 1–4).
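The comparison of a 1-df recessive effects model with a 2-df genotype effects model can be sketched from the Table 1 case-control counts alone. This works because, without covariates, both models have closed-form maximum likelihood fits (each genotype group simply gets its observed case proportion). The sketch omits the adjustment for geographical region, so its P value is expected to differ somewhat from the reported value; all variable names are illustrative.

```python
import math

def binom_loglik(cases, controls):
    """Maximized binomial log-likelihood for one genotype group."""
    total = cases + controls
    return cases * math.log(cases / total) + controls * math.log(controls / total)

# (case subjects, control subjects) per genotype, from Table 1
counts = {"GG": (1849, 2435), "AG": (3943, 4978), "AA": (2552, 2595)}

# Genotype model: a separate risk for each of the three genotypes (2 df)
ll_genotype = sum(binom_loglik(*counts[g]) for g in counts)

# Recessive model: GG and AG share one risk, AA has its own (1 df)
g_carriers = (counts["GG"][0] + counts["AG"][0],
              counts["GG"][1] + counts["AG"][1])
ll_recessive = binom_loglik(*g_carriers) + binom_loglik(*counts["AA"])

# Likelihood ratio statistic on 1 df and its upper tail probability
lrt = 2.0 * (ll_genotype - ll_recessive)
p_value = math.erfc(math.sqrt(lrt / 2.0))

print("LRT = %.2f, P = %.2f" % (lrt, p_value))
```

A large P value here means the genotype model does not fit significantly better, so the simpler recessive model is an acceptable approximation.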
Genotype | Case subjects | Control subjects | OR (95% CI) | P value
---|---|---|---|---
Genotype model | | | | 2.94 × 10−13
GG | 1,849 (22.2) | 2,435 (24.3) | 1.00 (ref) |
AG | 3,943 (47.3) | 4,978 (49.7) | 1.05 (0.97–1.45) |
AA | 2,552 (30.6) | 2,595 (25.9) | 1.33 (1.29–1.73) |
Recessive model | | | | 7.28 × 10−14
GG,GA | 5,792 (69.4) | 7,413 (74.1) | 1.00 (ref) |
AA | 2,552 (30.6) | 2,595 (25.9) | 1.29 (1.20–1.37) |
 | Affected offspring | Parents | RR (95% CI) |
Genotype model | | | | 2.72 × 10−5
GG | 1,414 (27.4) | 1,989 (29.6) | 1.00 (ref) |
AG | 2,447 (47.5) | 3,165 (47.1) | 1.04 (0.95–1.13) |
AA | 1,295 (25.1) | 1,566 (23.3) | 1.26 (1.12–1.40) |
Recessive model | | | | 6.81 × 10−6
GG,GA | 3,861 (74.9) | 5,154 (76.7) | 1.00 (ref) |
AA | 1,295 (25.1) | 1,566 (23.3) | 1.22 (1.12–1.32) |
Data are n (%) unless otherwise indicated. CI, confidence interval; OR, odds ratio; ref, reference genotype; RR, relative risk. The recessive model was found to be an appropriate approximation in both datasets (case-control P = 0.19 and family P = 0.35, for rejection of the recessive effects model).
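As a worked check on the recessive model rows of Table 1, the following sketch computes a crude odds ratio with a Woolf (log-scale) 95% confidence interval from the case-control counts. It is unadjusted for geographical region, whereas the table reports region-adjusted estimates, so a small difference from the tabulated 1.29 is expected. The helper name `odds_ratio_woolf` is an assumption for this example.

```python
import math

def odds_ratio_woolf(exposed_cases, unexposed_cases,
                     exposed_controls, unexposed_controls, z=1.96):
    """Crude odds ratio with a Woolf (log-scale) confidence interval."""
    or_ = (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)
    se_log = math.sqrt(1 / exposed_cases + 1 / unexposed_cases +
                       1 / exposed_controls + 1 / unexposed_controls)
    lower = math.exp(math.log(or_) - z * se_log)
    upper = math.exp(math.log(or_) + z * se_log)
    return or_, lower, upper

# Recessive coding: A/A ("exposed") versus G/G and A/G combined
or_aa, ci_low, ci_high = odds_ratio_woolf(
    exposed_cases=2552, unexposed_cases=5792,
    exposed_controls=2595, unexposed_controls=7413,
)
print("crude OR for A/A = %.2f (95%% CI %.2f-%.2f)" % (or_aa, ci_low, ci_high))
```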
A case subject–only (affected offspring only) test for age-at-diagnosis effects was performed by regressing age at diagnosis on SNP genotype (coded 0 or 1 for counts of the A/A genotype) on more than 13,500 case subjects. We adjusted for geographical region when analyzing case subjects from the case-control collections, and for collection when analyzing affected offspring from the family collections. We tested for sex effects in a similar manner using a logistic regression model with sex as the outcome variable.
To maximize power, interaction between HLA class II and FUT2 was only tested in the case subjects and affected family members, which assumes conditional independence in the control population (19). Under the null, the loci are assumed to interact multiplicatively, while under the alternative they interact nonmultiplicatively. Due to the number of genotypes present at HLA-DRB1 and HLA-DQB1, the coding of the class II locus had to be simplified. Therefore, two SNPs, rs2187668 and rs660895, were used to tag the HLA-DRB1*03 (DR3) and HLA-DRB1*04 (DR4) class II alleles in up to 15,841 type 1 diabetic subjects and affected offspring (20). Two codings were used. In the first, the SNPs were used to code the HLA-DRB1*03/04 genotype versus the remainder. The HLA-DRB1*03/04 genotype was confined to those that were HLA-DQB1*0302 positive, with the protective HLA-DRB1*0403 and *0407 included in the non-DR3/DR4 group where classical genotypes were available (12,476 subjects). The second coding used six genotypes: HLA-DRB1*03/03, HLA-DRB1*03/04, HLA-DRB1*04/04, HLA-DRB1*04/X, HLA-DRB1*03/X, HLA-DRB1*X/X, where X represents any non–HLA-DRB1*03 or non-HLA-DRB1*04 allele, and included HLA-DRB1*0403, HLA-DRB1*0407, HLA-DQB1*0301 where classical genotypes were available. Interaction was tested using multinomial logistic regression, with FUT2 genotype as the outcome variable and HLA genotype as the predictor. Robust variance estimates were used to account for dependence between sibs, and geographical region or collection was included as strata.
The pseudo–control-case methods proposed by Cordell et al. (18) were used to test for parent-of-origin effects. Maternal and offspring genotypes were coded assuming a recessive model, and the interaction of maternal genotype and offspring genotype was tested in a conditional logistic regression model using a likelihood ratio test on 1 df.
RESULTS
The wild-type/functional allele (secretor allele) of rs601338A>G is the G allele, which has a frequency of 49% in the control samples and 48% in the European CEU HapMap samples (hapmap.ncbi.nlm.nih.gov; Supplementary Table 1). In the analysis of rs601338A>G we used the G allele as reference, because the nonsecretor allele (A) confers susceptibility to Crohn's disease and homozygotes for the nonsecretor allele are resistant to the Norovirus. We found consistent and convincing evidence of a recessive association between type 1 diabetes and rs601338 in both collections (see RESEARCH DESIGN AND METHODS; Table 1). In the case-control collection, the odds ratio for A/A against A/G and G/G was 1.29 (95% CI 1.20–1.37; P = 7.3 × 10−14) and in the family collection, the relative risk for A/A against A/G and G/G was 1.22 (95% CI 1.12–1.32; P = 6.8 × 10−6). The combined evidence for the case-control and family collections provided very convincing evidence for a type 1 diabetes locus at FUT2 (P = 4.3 × 10−18).
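The pooling of case-control and family information (summing scores and their variances, as described in RESEARCH DESIGN AND METHODS) can be approximated from the published estimates alone. The sketch below is a rough reconstruction under stated assumptions: each score U and variance V is back-calculated from the reported effect size and 95% CI as V ≈ 1/SE² and U ≈ ln(effect) × V, and the odds ratio and relative risk are treated as being on the same log scale. It therefore illustrates the arithmetic only and is not expected to reproduce the reported combined P value exactly.

```python
import math

def score_and_variance(estimate, ci_low, ci_high, z=1.96):
    """Approximate a score statistic and its variance from an estimate and CI."""
    beta = math.log(estimate)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    variance = 1.0 / se ** 2   # information about the log effect (assumed)
    score = beta * variance    # first-order approximation U = beta * V
    return score, variance

# Recessive model estimates for A/A as reported in the text
u_cc, v_cc = score_and_variance(1.29, 1.20, 1.37)    # case-control odds ratio
u_fam, v_fam = score_and_variance(1.22, 1.12, 1.32)  # family relative risk

# Sum the scores and variances, then form a single 1-df test
z_pooled = (u_cc + u_fam) / math.sqrt(v_cc + v_fam)
p_pooled = math.erfc(abs(z_pooled) / math.sqrt(2.0))  # two-sided normal P

print("pooled z = %.2f, P = %.1e" % (z_pooled, p_pooled))
```

Summing scores rather than multiplying P values weights each collection by the information it carries, which is why the larger case-control collection dominates the pooled statistic.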
We found unconvincing evidence of an age-at-diagnosis effect (P = 0.031) and no evidence of a sex effect (P = 0.71) in over 13,500 case subjects and affected offspring. In addition, there was no evidence of interaction between FUT2 and the primary type 1 diabetes susceptibility locus, HLA class II (P > 0.60; see RESEARCH DESIGN AND METHODS). Given that the A/A genotype protects from Norovirus infection and confers susceptibility to type 1 diabetes, offspring of homozygous nonsecretor mothers, who would not receive maternal antibodies to GGII Noroviruses, could have an altered risk of type 1 diabetes distinct from the direct effect attributable to the FUT2 genotype of the offspring. Furthermore, maternal-fetal ABO incompatibility has also been shown to be a risk factor for type 1 diabetes (21). Consequently, FUT2 could also be associated with perinatal routes of infection or maternal-fetal immune dysregulation in the pathophysiology of type 1 diabetes. However, no evidence of a maternal-fetal interaction was obtained (P = 0.49; see RESEARCH DESIGN AND METHODS).
To test whether rs601338 was the sole causal variant in the 19q13.33 region, which includes the fucosyltransferase gene (FUT1; also involved in the synthesis of the A and B blood group antigens), we combined the rs601338 genotype data with the GWA data from Barrett et al. (15) (116 SNPs) and conducted a stepwise regression analysis in 3,419 case subjects and 3,524 control subjects with complete genotype data (22). We found some unconvincing evidence for independent effects in the 19q13.33 region (data not shown). However, further analysis will be required on a dense SNP map of the 19q13.33 region in a much larger number of samples to map this region.
DISCUSSION
We found convincing evidence that having two copies of the nonsecretor allele (A) at rs601338 confers susceptibility to type 1 diabetes. This is only the fourth type 1 diabetes locus with a recessive-like effect, the others being INS/11p15.5, CCR5/3p21 (23), and BACH2/6q15 (J.D.C., J.M.M.H., D.J.S., J.A.T., unpublished data). Previously, two copies of the nonsecretor allele at rs601338 were shown to provide resistance to some strains of Norovirus. Conversely, having a single functional copy of the FUT2 gene and the subsequent Norovirus and/or some other infection (10–12) has a protective effect on type 1 diabetes.
It has previously been noted that the allele frequency of rs601338 se428 and other FUT2 polymorphisms vary between populations (Supplementary Table 1) (24), and this was also observed in our study (Supplementary Tables 2–4). The rs601338 se428 nonfunctional allele (A) has increased in frequency in the European and Sub-Saharan African populations (Supplementary Table 1) and is more common than the ancestral allele (G), suggesting that the mutant A allele has been under selective pressure (24). Previous studies have failed to detect significant signatures of positive selection for FUT2 (2,24), which can be explained by the possibility that the A allele is not a recent mutation in human populations allowing sufficient time for recombination to break down haplotypes, and the possibility of convergent evolution in different populations (24). The interpopulation differentiation statistic, FST, for rs601338 is 0.26 and, therefore, does provide some evidence supporting balancing selection as well as the existence of different evolutionary forces acting in different geographical areas (24).
Having nonfunctional alleles of FUT2 protects against Norovirus and slows the progression of HIV (9), which are both beneficial effects, and is also associated with other bacterial and fungal infections, including reduced susceptibility to H. pylori infection (http://geneticassociationdb.nih.gov/cgi-bin/tableview.cgi?table=allview&cond=GENE=). However, these nonfunctional alleles of FUT2 increase susceptibility to type 1 diabetes and Crohn’s disease, although in prewestern civilization, this is not expected to be a selective advantage since these immune diseases would have been very rare. The other possibility is that the mechanistic link between the enzyme FUT2 and type 1 diabetes has nothing to do with Norovirus or other pathogens and instead involves some other target of this α(1,2) fucosyltransferase. We have recently found that the ABO gene is associated with positivity of autoantibodies to gastric parietal cells in type 1 diabetic patients (25). The FUT2 enzyme is required for synthesis of the ABO blood group antigens in secretions. The ABO SNP rs657152G>T was not associated with risk of type 1 diabetes itself, although there was weak evidence of an association between FUT2 and parietal cell autoantibody positivity (25). Hence, FUT2 could also be associated with susceptibility to autoimmune gastritis, which causes vitamin B12 malabsorption and deficiency, and pernicious anemia (8).
Our results justify follow-up functional investigations and highlight the intriguing relationship between the alleles of host defense genes and those of complex chronic immune disorders, supporting a model in which alleles selected for diversity and resistance in host immune responses can now, in a modern environment, predispose to autoimmune or immune-mediated chronic diseases (1,2). In this case, even one copy of a functional FUT2 allele provides about 30% protection against type 1 diabetes across different populations of European origin. It will also be interesting to test whether the Asian-specific nonsecretor variant of FUT2, I129F (rs1047781), is associated with type 1 diabetes in East and South East Asian populations.
See accompanying commentary, p. 2685.
ACKNOWLEDGMENTS
This work was funded by the Juvenile Diabetes Research Foundation International (JDRF), the Wellcome Trust, the National Institute for Health Research Cambridge Biomedical Centre, and the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK). The Cambridge Institute for Medical Research is in receipt of a Wellcome Trust Strategic Award (079895).
No potential conflicts of interest relevant to this article were reported.
D.J.S. and J.A.T. researched data. D.J.S., J.D.C., and J.M.M.H. conducted analyses. D.J.S., J.D.C., and J.A.T. wrote the manuscript. N.M.W. managed the data. P.C., T.M., and H.S. organized the DNA samples. K.D. contributed to discussion. All authors reviewed, edited, and discussed the manuscript.
The authors thank all the patients, control subjects, and family members for their participation. They thank the U.K. Medical Research Council and Wellcome Trust for funding the collection of DNA for the British 1958 Birth Cohort (MRC grant G0000934, WT grant 068545/Z/02). DNA control samples were prepared and provided by S. Ring, R. Jones, and M. and W. McArdle of the University of Bristol; D. Strachan of the University of London; and P. Burton of the University of Leicester. The authors thank David Dunger and Barry Widmer of the University of Cambridge, and the British Society for Paediatric Endocrinology and Diabetes for the type 1 diabetes case collection. The authors acknowledge use of DNA from the U.K. Blood Services collection of Common Controls, which is funded by the Wellcome Trust grant 076113/C/04/Z, by the Wellcome Trust/JDRF grant 061858, and by the National Institute for Health Research of England. The collection was established as part of the Wellcome Trust Case-Control Consortium. The authors acknowledge use of DNA from the Human Biological Data Interchange and from Diabetes UK for the U.S. and U.K. multiplex families, respectively; D. Savage of the Belfast Health and Social Care Trust, C. Patterson and D. Carson of Queen's University Belfast, and P. Maxwell of Belfast City Hospital for the Northern Irish families; the Genetics of Type 1 Diabetes in Finland, J. Tuomilehto, L. Kinnunen, E. Tuomilehto-Wolf, V. Harjutsalo, and T. Valle of the National Public Health Institute, Helsinki, for the Finnish families; and C. Guja and C. Ionescu-Tirgoviste of the Institute of Diabetes “N Paulescu,” Romania, for the Romanian families. 
This research uses resources provided by the Type 1 Diabetes Genetics Consortium, a collaborative clinical study sponsored by the NIDDK, the National Institute of Allergy and Infectious Diseases, the National Human Genome Research Institute, the National Institute of Child Health and Human Development, and JDRF and supported by U01 DK062418. The authors also thank H. Stevens, P. Clarke, G. Coleman, S. Duley, D. Harrison, S. Hawkins, M. Maisuria, T. Mistry, and N. Taylor for preparation of DNA samples.