In the genetic analysis of common, multifactorial diseases such as type 1 diabetes, irrefutable true-positive linkage and association results have been rare to date. Recently, a single nucleotide polymorphism (SNP), 1858C>T, in the gene PTPN22 was reported to be associated with an increased risk of type 1 diabetes. The SNP encodes Arg620Trp in the lymphoid protein tyrosine phosphatase (LYP), which has been shown to be a negative regulator of T-cell activation. Here, we have replicated these findings in 1,388 type 1 diabetic families and in a collection of 1,599 case and 1,718 control subjects, confirming the association of the PTPN22 locus with type 1 diabetes (family-based relative risk [RR] 1.67 [95% CI 1.46–1.91] and case-control odds ratio [OR] 1.78 [95% CI 1.54–2.06]; overall P = 6.02 × 10⁻²⁷). We also report evidence for an association of Trp620 with another autoimmune disorder, Graves’ disease, in 1,734 case and control subjects (P = 6.26 × 10⁻⁴; OR 1.43 [95% CI 1.17–1.76]). Taken together, these results indicate a more general association of the PTPN22 locus with autoimmune disease.
Type 1 diabetes is a multigenic autoimmune disease with three loci identified so far: the HLA class II genes (1), the insulin gene on chromosome 11p15 (2,3), and the CTLA4 locus on 2q33 (4,5), all of which are involved in T-cell activation, homeostasis, and repertoire formation. Very recently, evidence for a fourth locus has been reported (6): PTPN22 on chromosome 1p13, which encodes a lymphoid protein tyrosine phosphatase (LYP) that is important in negative control of T-cell activation and in T-cell development (7,8). A nonsynonymous single nucleotide polymorphism (SNP) at nucleotide 1858 in codon 620 (Arg620Trp) in PTPN22 was associated with type 1 diabetes in North American and Sardinian collections (6). The disease-associated variant Trp620 may alter the binding of LYP to the cytoplasmic tyrosine kinase (6,9), which regulates the T-cell receptor–signaling kinases, T-cell–specific protein tyrosine kinase (LCK) and FYN (8,9).
Evidence for this association was based on two independent case-control collections, from North America (294 case and 395 control subjects, P = 5.99 × 10⁻⁴, odds ratio [OR] for allele T = 1.83 [95% CI 1.28–2.60]) and Sardinia (174 case and 214 control subjects, P = 0.047, OR for allele T = 2.31 [0.93–5.82]). Because irrefutable results are rare in complex diseases (10,11,12), we sought to confirm the PTPN22/LYP Arg620Trp/1858C>T SNP association in several independent populations. We genotyped the SNP in 791 U.K., 336 U.S., and 261 Romanian multiplex and simplex type 1 diabetic families (13), providing 1,946 parent-child trio genotypes, and in 1,573 type 1 diabetic case subjects and 1,718 control subjects from the U.K. We confirmed the association in the family (P = 5.62 × 10⁻¹⁴, relative risk [RR] for allele T = 1.67 [95% CI 1.46–1.91]) and case-control (likelihood-ratio test P = 4.22 × 10⁻¹⁵, OR for allele T = 1.78 [95% CI 1.54–2.06]) collections, with the combined result being highly significant (P = 6.02 × 10⁻²⁷) (Table 1). There was evidence of population heterogeneity in the 1858C>T genotype frequencies (P = 1.28 × 10⁻⁵) but not in the disease association (P = 0.98) (Table 2).
Having confirmed the association and found a striking consistency across different Caucasian populations, we performed case-only locus-locus interaction analyses (14,15) between 1858C>T and the known type 1 diabetes susceptibility loci at IDDM1/HLA, the IDDM12/CTLA4 CT60 SNP (rs3087243) (5), and the IDDM2/INS variable number of tandem repeats (VNTR) (−23HphI, rs689) (2). No evidence for an interaction between PTPN22/LYP Arg620Trp/1858C>T and CTLA4 CT60, the INS VNTR, or HLA-DRB1 was found consistently in both the affected offspring from the family collection and the case subjects from the case-control collection (Table 3). However, evidence of a statistical interaction, or lack of one, is difficult to interpret (14,15).
The distribution of 1858C>T genotypes was also evaluated on the basis of age at onset of type 1 diabetes, parent of origin, and sex, and no consistent evidence of heterogeneity was found (Table 3).
Given that Graves’ disease, type 1 diabetes, autoimmune hypothyroidism, and other autoimmune diseases, such as rheumatoid arthritis, commonly cluster in the same families (16), it is likely that they share some of the same susceptibility genes and alleles, as demonstrated at CTLA4 for type 1 diabetes and Graves’ disease (5). Owing to the central role of LYP in T-cell signaling, the Trp620 variant could be a shared determinant among different autoimmune and immune-mediated diseases. Hence, we investigated whether this locus was also associated with autoimmune thyroid disease, using a case-control collection (901 Graves’ disease case subjects and 833 control subjects, all of whom were independent of the subjects in the type 1 diabetes study). We found evidence for an association (likelihood-ratio test P = 6.26 × 10⁻⁴, OR for allele T = 1.43 [95% CI 1.17–1.76]) (Table 4), indicating that this locus may have a general effect on predisposition to autoimmunity. Very recently, the Trp620 variant has also been associated with rheumatoid arthritis and systemic lupus erythematosus (17,18). It will be interesting to test in future experiments whether Arg620Trp is the only disease-associated variant in the gene and in this chromosome region.
RESEARCH DESIGN AND METHODS
The 1,573 case subjects were recruited as part of the U.K. Genetic Resource Investigating Diabetes (GRID) study, which is a joint project between the University of Cambridge Departments of Pediatrics and Medical Genetics and is funded by the Juvenile Diabetes Research Foundation and the Wellcome Trust. The eventual aim of this project is to collect 8,000 case subjects with type 1 diabetes matched geographically across Great Britain (http://www-gene.cimr.cam.ac.uk/ucdr/grid.shtml) to 8,000 control subjects from the 1958 British Birth Cohort (http://www.cls.ioe.ac.uk/Cohort/Ncds/mainncds.htm) to allow statistically powered genetic association studies. The 1,718 control subjects from the 1958 British Birth Cohort are part of a longitudinal study in which the subjects are British citizens born in a particular week in March 1958. The case subjects, all Caucasian and <16 years of age, have a mean age at onset of type 1 diabetes of 7.5 years (SD 4 years). The regional distribution of case and control subjects is matched.
All families were Caucasian of European descent and were composed of two parents and at least one affected child. The families consisted of 528 multiplex families from the Diabetes U.K. Warren 1 collection (20), including 56 simplex families from Yorkshire, providing 1,912 genotypes (2,040 individuals attempted); 336 multiplex families from the Human Biological Data Interchange (U.S.) (21), providing 1,315 genotypes (1,382 attempted); 263 multiplex/simplex families from Belfast (22), providing 847 genotypes (885 attempted); and 261 Romanian simplex families, providing 819 genotypes (845 attempted), with inclusion criteria as reported in Vella et al. (13).
Caucasian, U.K.-born Graves’ disease case subjects (n = 901) were recruited from thyroid clinics as described previously (5,23). Ethnically matched control subjects (n = 833) with no history of autoimmune disease were recruited at various sites in Birmingham and Oxford (independent of the U.K. GRID study). All DNA samples were collected after approval from the relevant research ethics committees, and written informed consent was obtained from the participants.
Genotyping.
Genotyping in the type 1 diabetes collections was undertaken using TaqMan (Applied Biosystems, Warrington, U.K.); the probes and primers were also designed by Applied Biosystems.
All genotyping was double scored to minimize error, and a duplicate plate was typed to check genotyping quality. No mismatches were observed. The primers and probes were as follows: forward primer, CAACTGCTCCAAGGATAGATGATGA; reverse primer, CCAGCTTCCTAACCACAATAAATG; FAM probe, TCAGGTGTCCGTACAGG; and VIC probe, TCAGGTGTCCATACAGG.
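The two probe sequences listed above differ at a single base, which is what allows an allele-discrimination assay to separate the two alleles. The following is a minimal Python sketch, assuming both probes are written 5'→3'; the helper names are illustrative and are not part of the published protocol. It locates the differing base and reads the same site on the complementary strand.

```python
# Probe sequences are copied from the text above. The helper functions and the
# strand interpretation are illustrative assumptions, not the authors' code.
FAM_PROBE = "TCAGGTGTCCGTACAGG"
VIC_PROBE = "TCAGGTGTCCATACAGG"

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def mismatch_positions(seq_a: str, seq_b: str) -> list:
    """Return 0-based positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length")
    return [i for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


positions = mismatch_positions(FAM_PROBE, VIC_PROBE)
pos = positions[0]

# Bases as written in the probes, and the same site read on the other strand.
probe_bases = (FAM_PROBE[pos], VIC_PROBE[pos])
other_strand_bases = (COMPLEMENT[FAM_PROBE[pos]], COMPLEMENT[VIC_PROBE[pos]])

print(positions, probe_bases, other_strand_bases)
print(reverse_complement(FAM_PROBE), reverse_complement(VIC_PROBE))
```

Under these assumptions, the probes differ only in G (FAM) versus A (VIC), which corresponds to C versus T on the complementary strand, in keeping with the 1858C>T nomenclature.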
Genotyping in the Graves’ disease and control collection was undertaken using PCR followed by restriction enzyme digest with PCR primers and XcmI, as described by Bottini et al. (8).
Statistical analysis.
All statistical analyses were performed in the STATA statistical package (http://www.stata.com). Some additional STATA routines were used and may be downloaded from http://www-gene.cimr.cam.ac.uk/clayton/software/stata.
A score test was used to combine tests from the family and case-control studies. Let $U$ be the score statistic, contrasting allele frequencies in case and control subjects or, in family studies, frequencies of transmitted and untransmitted alleles, and let $V$ be the estimated variance of the score statistic; then $U^2/V$ is asymptotically distributed as $\chi^2$ with 1 degree of freedom. To combine results, we first calculate $U$ and $V$ for each study and then obtain an overall $U$ and $V$ by summing the contributions from each study, $U = U_1 + U_2$ and $V = V_1 + V_2$. We then calculate $U^2/V$.
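The combination rule above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the STATA routines used for the analysis: the family score uses the standard TDT form (each transmission from a heterozygous parent is treated as a fair coin flip under the null), and the case-control score is computed from a 2×2 table of allele counts with a hypergeometric variance. Because the reported case-control result comes from logistic regression, this simplified version should not be expected to reproduce the published combined P value exactly. All function names are hypothetical.

```python
import math


def chi2_sf_1df(x: float) -> float:
    """Upper-tail probability of a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def tdt_score(transmitted: int, untransmitted: int) -> tuple:
    """Score and variance for the TDT.

    Under the null, U = T - (T + NT)/2 and V = (T + NT)/4.
    """
    n = transmitted + untransmitted
    return transmitted - n / 2.0, n / 4.0


def allele_score(case_alt: int, case_ref: int, ctrl_alt: int, ctrl_ref: int) -> tuple:
    """Score and hypergeometric variance from a 2x2 table of allele counts."""
    n_case = case_alt + case_ref
    n_ctrl = ctrl_alt + ctrl_ref
    n_alt = case_alt + ctrl_alt
    n_ref = case_ref + ctrl_ref
    n = n_case + n_ctrl
    u = case_alt - n_case * n_alt / n          # observed minus expected
    v = n_case * n_ctrl * n_alt * n_ref / (n ** 2 * (n - 1))
    return u, v


def combine(*studies: tuple) -> tuple:
    """Sum U and V over studies and return (chi-square, P value)."""
    u = sum(s[0] for s in studies)
    v = sum(s[1] for s in studies)
    chi2 = u * u / v
    return chi2, chi2_sf_1df(chi2)


# Counts taken from Table 1 (allele T transmissions and allele counts).
family = tdt_score(565, 339)
case_control = allele_score(536, 2610, 359, 3077)

print(combine(family))                # family collection alone
print(combine(family, case_control))  # both collections combined
```

For the family collection alone, $U^2/V$ reduces to $(T - NT)^2/(T + NT)$, the familiar TDT statistic. Summing $U$ and $V$ rather than adding the two chi-square values keeps the combined test at 1 degree of freedom and rewards effects that point in the same direction.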
Arg620Trp genotype frequencies in parents and control subjects were in Hardy-Weinberg equilibrium. The case-only (affected offspring only) locus-locus interaction analysis, defined as deviation from a multiplicative model for the joint effects of the two genotypes (25), was performed using a regression model as a score test for association between genotypes in case subjects.
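The text does not say which test was used to check Hardy-Weinberg equilibrium. As an illustration only, the sketch below applies a Pearson goodness-of-fit test (1 degree of freedom), a common choice, to the control genotype counts reported in Table 1. The function name is hypothetical.

```python
import math


def hwe_chi2(n_cc: int, n_ct: int, n_tt: int) -> tuple:
    """Pearson goodness-of-fit test of Hardy-Weinberg proportions (1 df)."""
    n = n_cc + n_ct + n_tt
    q = (2 * n_tt + n_ct) / (2 * n)   # frequency of allele T
    p = 1.0 - q                        # frequency of allele C
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_cc, n_ct, n_tt)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # chi-square upper tail, 1 df
    return chi2, p_value


# Control genotype counts (C/C, C/T, T/T) from the type 1 diabetes
# case-control collection in Table 1.
chi2, p_value = hwe_chi2(1377, 323, 18)
print(f"chi-square = {chi2:.3f}, P = {p_value:.2f}")
```

A large P value here means that the observed genotype counts do not depart detectably from the proportions expected under random mating.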
Potential population substructure within the U.K. case-control collection could slightly inflate the significance of the reported association. We have, therefore, also analyzed the case-control collection adjusting for 12 geographical regions within Britain. With this adjustment, the P value increases to 8.23 × 10⁻¹² and the OR decreases to 1.72 (95% CI 1.47–2.01).
Table 1. The 1858C>T/Trp620 allele and genotype frequencies and association test results in the type 1 diabetic family and case-control collections
Type 1 diabetic family collection | Transmitted | Untransmitted* | RR (95% CI) | P |
---|---|---|---|---|
Allele T/Trp620 (TDT) | 565 | 339 | 1.67 (1.46–1.91) | 5.62 × 10⁻¹⁴ |
Genotypes | | | | |
C/C | 1,364 (70.1) | 4,448 (76.2) | 1.00 (reference) | — |
C/T | 516 (26.5) | 1,288 (22.1) | 1.61 (1.38–1.87) | 9.96 × 10⁻¹⁰ |
T/T | 66 (3.4) | 102 (1.7) | 3.13 (2.18–4.48) | 5.30 × 10⁻¹⁰ |
Type 1 diabetes case-control collection | Case subjects | Control subjects | OR (95% CI) | P |
---|---|---|---|---|
Alleles | | | | |
C | 2,610 (83.0) | 3,077 (89.6) | 1.00 (reference) | — |
T | 536 (17.0) | 359 (10.4) | 1.78 (1.54–2.06) | 1.17 × 10⁻¹⁴ |
Genotypes | | | | |
C/C | 1,077 (68.5) | 1,377 (80.2) | 1.00 (reference) | — |
C/T | 456 (29.0) | 323 (18.8) | 1.81 (1.53–2.13) | 1.34 × 10⁻¹² |
T/T | 40 (2.5) | 18 (1.0) | 2.84 (1.62–4.98) | 2.73 × 10⁻⁴ |
Data are n (%), unless noted otherwise. For the case-control collection, using logistic regression, we assumed a multiplicative model (likelihood-ratio test, $\chi^2_1 = 61.57$, P = 4.22 × 10⁻¹⁵) because it was not significantly different (likelihood-ratio test, $\chi^2_1 = 0.18$, P = 0.67) from the full genotype model (likelihood-ratio test, $\chi^2_2 = 61.75$, P = 3.90 × 10⁻¹⁴) (19).
*Untransmitted (pseudocontrol) data for genotypes in the type 1 diabetic family collection are estimated, using conditional logistic regression, as in Cordell and Clayton (19). TDT, transmission/disequilibrium test.
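The model comparison described in the footnote above is a nested likelihood-ratio test: the multiplicative (1-df) model is a special case of the full genotype (2-df) model, so the difference between their test statistics is itself a 1-df chi-square statistic. The short Python sketch below redoes that arithmetic from the reported statistics. It does not refit the logistic regressions, and the helper name is hypothetical.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail chi-square probability; closed forms for 1 and 2 df only."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("only df = 1 or 2 is supported in this sketch")


# Likelihood-ratio statistics as reported for the case-control collection.
lrt_multiplicative = 61.57   # multiplicative model, 1 df
lrt_genotype = 61.75         # full genotype model, 2 df

# Nested models: the difference is a 1-df test of departure from
# the multiplicative model.
lrt_difference = lrt_genotype - lrt_multiplicative

print(f"multiplicative model: P = {chi2_sf(lrt_multiplicative, 1):.2e}")
print(f"genotype model:       P = {chi2_sf(lrt_genotype, 2):.2e}")
print(f"difference (1 df):    P = {chi2_sf(lrt_difference, 1):.2f}")
```

Small discrepancies from the published P values are to be expected, because the inputs here are the rounded statistics rather than the fitted likelihoods.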
Table 2. Population PTPN22/LYP 1858C>T parental allele frequencies and transmission-disequilibrium test results for allele T/Trp620
Population | No. of parent-child trios | Parental allele frequency (%) | Transmitted | Untransmitted | RR (95% CI) | P |
---|---|---|---|---|---|---|
U.K.* | 1,087 | 14.3 | 338 | 204 | 1.66 (1.40–1.98) | 8.6 × 10⁻⁹ |
Great Britain | 850 | 16.4 | 272 | 175 | 1.55 (1.28–1.87) | 4.5 × 10⁻⁶ |
Northern Ireland | 237 | 12.3 | 66 | 29 | 2.28 (1.47–3.53) | 1.5 × 10⁻⁴ |
U.S. | 626 | 12.9 | 173 | 100 | 1.73 (1.35–2.21) | 1.0 × 10⁻⁵ |
Romania | 233 | 10.2 | 54 | 35 | 1.54 (1.01–2.36) | 0.04 |
*Combined result for Great Britain and Northern Ireland.
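The relative risks in the table above follow directly from the transmission counts. As a rough check, the Python sketch below computes RR as the ratio of transmitted to untransmitted T alleles, with a Wald interval on the log scale. The standard error sqrt(1/T + 1/NT) is an assumed approximation, not necessarily the method behind the published intervals, so the results are close to, but not always identical to, the tabulated values. The function name is hypothetical.

```python
import math


def tdt_relative_risk(transmitted: int, untransmitted: int, z: float = 1.96) -> tuple:
    """RR = T/NT with a Wald interval computed on the log scale.

    Assumes SE(log RR) = sqrt(1/T + 1/NT), the usual approximation for a
    ratio of two counts.
    """
    rr = transmitted / untransmitted
    se = math.sqrt(1.0 / transmitted + 1.0 / untransmitted)
    return rr, rr * math.exp(-z * se), rr * math.exp(z * se)


# Transmitted / untransmitted counts of allele T. The per-population rows come
# from the table above; the pooled row is the TDT row of Table 1.
counts = {
    "U.K.": (338, 204),
    "U.S.": (173, 100),
    "Romania": (54, 35),
    "All families": (565, 339),
}

for population, (t, nt) in counts.items():
    rr, lower, upper = tdt_relative_risk(t, nt)
    print(f"{population}: RR {rr:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```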
Table 3. P values obtained by using a regression model as a score test for association
Variable | Affected offspring | Case subjects |
---|---|---|
HLA-DRB1 | 0.203 (1,755) | 0.130 (1,604) |
INS VNTR | 0.054 (1,849) | 0.201 (1,599) |
CTLA4 (CT60) | 0.028 (1,816) | 0.742 (1,583) |
Age at onset | 0.240 (2,023) | 0.343 (1,657) |
Sex | 0.951 (2,061) | 0.029 (1,593) |
Parent of origin | 0.681 (2,064) | N/A |
Data are P (n individuals). Associations were investigated between 1858C>T/Trp620 and HLA-DRB1 (DR3/DR3, DR3/DR4, DR4/DR4, DR3/non-DR4, DR4/non-DR3, and non-DR3/non-DR4), INS VNTR in two subgroups (I/I and I/III + III/III), the CTLA4 CT60 SNP, age at onset of type 1 diabetes, and sex in case subjects (research design and methods). The regression model for the affected offspring included a population variable. N/A, not available.
Table 4. PTPN22/LYP 1858C>T/Trp620 allele and genotype frequencies and association test results (logistic regression) in the Graves’ disease case-control collection
Graves’ disease case-control collection | Case subjects | Control subjects | OR (95% CI) | P |
---|---|---|---|---|
Alleles | | | | |
C | 1,544 (85.7) | 1,492 (89.6) | 1.00 (reference) | — |
T | 258 (14.3) | 174 (10.4) | 1.43 (1.17–1.76) | 6.26 × 10⁻⁴ |
Genotypes | | | | |
C/C | 661 (73.4) | 669 (80.3) | 1.00 (reference) | — |
C/T | 222 (24.6) | 154 (18.5) | 1.46 (1.16–1.84) | 1.42 × 10⁻³ |
T/T | 18 (2.0) | 10 (1.2) | 1.82 (0.83–3.98) | 0.132 |
Data are n (%), unless noted otherwise. We assumed a multiplicative model (likelihood-ratio test, $\chi^2_1 = 11.95$, P = 5.46 × 10⁻⁴) because it was not significantly different (likelihood-ratio test, $\chi^2_1 = 0.12$, P = 0.73) from the full genotype model (likelihood-ratio test, $\chi^2_2 = 12.06$, P = 2.41 × 10⁻³) (19).
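The allelic odds ratio for the Graves’ disease collection can be approximated directly from the allele counts in the table above. The Python sketch below uses the cross-product ratio with a Woolf (log-scale Wald) interval. This is an assumed shortcut rather than the logistic regression reported in the table, although for a single 2×2 table the two generally give very similar estimates. The function name is hypothetical.

```python
import math


def allele_odds_ratio(case_t: int, case_c: int, ctrl_t: int, ctrl_c: int,
                      z: float = 1.96) -> tuple:
    """Odds ratio for allele T versus C with a Woolf (log-scale Wald) interval."""
    odds_ratio = (case_t * ctrl_c) / (case_c * ctrl_t)
    se = math.sqrt(1 / case_t + 1 / case_c + 1 / ctrl_t + 1 / ctrl_c)
    return odds_ratio, odds_ratio * math.exp(-z * se), odds_ratio * math.exp(z * se)


# Allele counts from the Graves' disease case-control table above:
# T and C alleles in case subjects, then T and C alleles in control subjects.
or_t, lower, upper = allele_odds_ratio(258, 1544, 174, 1492)
print(f"OR for allele T = {or_t:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

Counting alleles rather than individuals treats the two alleles of each subject as independent, which is reasonable when genotype frequencies are close to Hardy-Weinberg proportions.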
D.S. and J.D.C. contributed equally to this work.
Article Information
We thank the Juvenile Diabetes Research Foundation, the Wellcome Trust, Diabetes U.K., and the Medical Research Council for financial support.
We gratefully acknowledge the participation of all of the patients, control subjects, and family members and thank Tasneem Hassanali, Jayne Hutchings, Gillian Coleman, Sarah Field, Trupti Mistry, Kirsi Bourget, Sally Clayton, Matthew Hardy, Jennifer Keylock, Pamela Lauder, Meeta Maisuria, William Meadows, Meera Sebastian, and Sarah Wood for preparing the DNA samples.