We determined longitudinal serum proteomics profiles from children with HLA-conferred diabetes susceptibility to identify changes that could be detected before seroconversion and positivity for disease-associated autoantibodies. Comparisons were made between children who seroconverted and progressed to type 1 diabetes (progressors) and those who remained autoantibody negative, matched by age, sex, sample periodicity, and risk group. The samples represented the prediabetic period and ranged from the age of 3 months to 12 years. After immunoaffinity depletion of the most abundant serum proteins, isobaric tags for relative and absolute quantification were used for sample labeling. Quantitative proteomic profiles were then measured for 13 case-control pairs by high-performance liquid chromatography-tandem mass spectrometry (LC-MS/MS). Additionally, a label-free LC-MS/MS approach was used to analyze depleted sera from six case-control pairs. Importantly, differences in abundance of a set of proteins were consistently detected before the appearance of autoantibodies in the progressors. Based on top-scoring pairs analysis, classification of such progressors was observed with a high success rate. Overall, the data provide a reference of temporal changes in the serum proteome in healthy children and children progressing to type 1 diabetes, including new protein candidates, the levels of which change before clinical diagnosis.
Introduction
The measurement of islet cell autoantibodies is currently the principal means of identifying an emerging threat of developing type 1 diabetes (1). The risks associated with the appearance of islet antibodies have been evaluated in depth, and overall, the appearance of multiple biochemically defined autoantibodies correlates with progression to disease irrespective of family history, genetic risk group, or autoantibody combination (1). Nevertheless, it remains an open question whether even earlier indications of future disease development can be found. Such markers could shed further light on disease etiology and potentially be used in the evaluation of risks and preventive treatments.
Proteomic analysis in the study of type 1 diabetes has been previously reviewed (2) and applied in studies addressing differences in the sera of patients with diabetes and subjects without diabetes (3–5). Zhang et al. (5) compared protein levels in plasma from patients with type 1 diabetes and healthy subjects, observing significant differences in the abundance of 24 proteins. Similarly, Zhi et al. (4) detected differences in the levels of 21 serum proteins between patients with type 1 diabetes and healthy subjects; six of the proteins were validated by immunoassay.
Although in-depth comparisons of proteins in samples from healthy subjects and patients with type 1 diabetes have distinguished the diseased state, the identification of changes preceding this aggressive autoimmune disease is important for disease prediction and prevention. McGuire et al. (6) used a proteomic approach to identify predictive markers in the cord blood of children in whom type 1 diabetes developed later. Although their measurements with surface-enhanced laser desorption/ionization mass spectrometry revealed different patterns, the discriminating peaks were not identified.
To establish the origin of, and the changes associated with, the development of type 1 diabetes, careful selection of appropriate study groups is essential; such groups have been established by prospective sampling from at-risk individuals (7,8). The Finnish Type 1 Diabetes Prediction and Prevention (DIPP) project collected samples from Finnish children with HLA-defined predisposition to type 1 diabetes (7,9), thus creating an extensive prospective sample collection covering the period from birth to diagnosis or, for children who remained healthy, until 15 years of age. This resource has allowed investigation of the longitudinal profiles of a wide range of factors in children who developed type 1 diabetes, using samples ranging from early infancy to diagnosis, as well as sample measurements from carefully matched control subjects (10–14).
In the current study, we determined the longitudinal serum proteomics profiles of a group of type 1 diabetes–susceptible children enrolled in the DIPP study. The measurements were made in sera from 38 children comprising 19 type 1 diabetes case-control pairs matched by date and location of birth, sex, and HLA-conferred genetic risk. The samples selected for analysis represent the time course from autoantibody negativity to seroconversion to diagnosis. The analyses were made using two mass spectrometry–based quantitative proteomics techniques. First, we used isobaric tags for relative and absolute quantification (iTRAQ) reagents, which have previously been extensively used in serum proteomics applications, including the development of robust analytical protocols and applied studies up to the scale of hundreds of subjects (15,16). Second, we used a label-free method, which has also been applied in serum proteomics analyses (17,18). The present results reveal a spectrum of changes and differences in the serum protein profiles between children progressing to type 1 diabetes and matched control subjects. Some of these changes were consistently detected before the appearance of autoantibodies. To our knowledge, this study is the first to report longitudinal proteomics profiles in children who develop type 1 diabetes as well as such profiles in healthy children.
Research Design and Methods
A schematic of the experimental design is shown in Fig. 1. Detailed descriptions of the proteomics measurements, the sample comparisons, and the availability of the raw data are provided as supplementary information.
Schematic presentation of the study design. The study used a prospective longitudinal serum sample collection from children with an HLA-conferred risk for type 1 diabetes. Samples were selected based on clinical outcome and the titers of diabetes-associated autoantibodies. The samples were prepared for proteomics analysis by mass spectrometry. Comparisons were made between children who developed type 1 diabetes and age-, HLA risk–, and sex-matched control subjects. Two quantitative approaches were applied: first, iTRAQ reagents and second, a label-free approach.
Subjects and Sample Collection
All children studied were participants in the Finnish DIPP study (9), where children identified as at risk for type 1 diabetes based on HLA genotype were followed prospectively from birth. Venous nonfasting blood samples were collected at each study visit; sera were separated and stored at −70°C within 3 h from collection. Serum islet cell autoantibody (ICA) measurements were made as previously described (19). For ICA-positive children, levels of GAD antibody (GADA), tyrosine phosphatase-related protein antibody (IA-2A), and insulin antibodies (IAA) were also analyzed.
The proteomics measurements were performed on sera from 19 case children who developed type 1 diabetes during the DIPP follow-up. Prospective serum samples (5–11 per child) were selected to represent phases of disease progression from autoantibody negativity to seroconversion to overt disease. A persistently autoantibody-negative control child (typically on the order of seven samples per child) was matched with each case child based on date and place of birth, sex, and HLA-DQB1 genotype. The prospective control serum samples were matched with the case samples by age at sample draw. Altogether, 266 serum samples were analyzed, of which sera from 26 children (13 case-control pairs) were processed for iTRAQ analysis and sera from 12 children (6 case-control pairs) for analysis using a label-free approach (Table 1, Supplementary Table 1, and Fig. 2).
Summary of the children progressing to type 1 diabetes whose samples were studied with proteomics
ID | Sex | HLA-DQB1 risk alleles | Age at seroconversion (years) | Age at diagnosis (years) | Autoantibodies detected | Samples analyzed (case) | Samples analyzed (control) | Analysis method |
---|---|---|---|---|---|---|---|---|
D1 | M | *02, *03:02 | 1.3 | 2.2 | ICA, IAA, GADA, IA-2A | 7 | 7 | iTRAQ |
D2 | M | *02, *03:02 | 1.4 | 4.0 | ICA, IAA, GADA, IA-2A | 7 | 7 | iTRAQ |
D3 | M | *02, *03:02 | 1.3 | 3.9 | ICA, IAA, GADA, IA-2A | 7 | 7 | iTRAQ |
D4 | F | *02, *03:02 | 1.5 | 3.3 | ICA, IAA, GADA, IA-2A | 7 | 7 | iTRAQ |
D5 | F | *02, *03:02 | 3.4 | 7.0 | ICA, IAA, GADA, IA-2A | 7 | 7 | iTRAQ |
D6 | F | *02, *03:02 | 0.5 | 4.0 | ICA, IAA, GADA, IA-2A | 7 | 7 | iTRAQ |
D7 | F | *02, *03:02 | 0.6 | 4.1 | ICA, IAA, GADA, IA-2A | 7 | 7 | iTRAQ |
D8 | M | *02, *03:02 | 1.0 | 4.4 | ICA, IAA, GADA, IA-2A | 7 | 7 | iTRAQ |
D9 | M | *02, *03:02 | 2.5 | 3.6 | ICA, IAA, GADA | 7 | 7 | iTRAQ |
D10 | M | *03:02, x | 1.5 | 2.5 | ICA, IAA, GADA, IA-2A | 7 | 7 | iTRAQ |
D11 | M | *03:02, x | 1.3 | 4.0 | ICA, IAA, IA-2A | 11 | 9 | iTRAQ |
D12 | M | *03:02, x | 2.0 | 2.2 | ICA, IAA, GADA, IA-2A | 5 | 7 | iTRAQ |
D13 | M | *03:02, x | 3.5 | 5.5 | ICA, GADA, IA-2A | 7 | 6 | iTRAQ |
D14 | M | *02, *03:02 | 6.1 | 8.8 | ICA, IAA, GADA, IA-2A | 7 | 8 | Label free |
D15 | M | *02, *03:02 | 2.6 | 8.3 | ICA, IAA | 6 | 7 | Label free |
D16 | F | *02, *03:02 | 1.0 | 10.0 | ICA, IAA, GADA, IA-2A | 6 | 6 | Label free |
D17 | F | *02, *03:02 | 5.0 | 7.7 | ICA, IAA, GADA, IA-2A | 7 | 7 | Label free |
D18 | M | *03:02, x | 1.3 | 12.1 | ICA, IAA, GADA, IA-2A | 6 | 8 | Label free |
D19 | F | *03:02, x | 1.3 | 8.6 | ICA, IAA, IA-2A | 7 | 8 | Label free |
See also Supplementary Table 1 for more information on the case subjects and matched control subjects. x ≠ *02, *03:01, *06:02/3.
Timing of the serum sample collection (years) relative to the first detection of diabetes-associated autoantibodies (A) and relative to the diagnosis of type 1 diabetes (B). ♦, samples profiled for the children who progressed to type 1 diabetes. For the comparison between children (healthy vs. progressors), the analyses considered protein abundance throughout the series (119 vs. 119 age-matched samples) and samples before (45 vs. 45 age-matched samples) and after (74 vs. 74 age-matched samples) detection of seroconversion. Other details of the comparisons are indicated in Table 2. An additional 13 case and 13 control samples were included in the measurements. See Supplementary Tables 2 and 3 for further information.
Sample Preparation
Serum samples were depleted of the most abundant proteins using immunoaffinity columns from Beckman Coulter (ProteomeLab IgY-12) and Agilent (Hu14). The same depletion method was always applied to the follow-up samples of each case-control pair.
For iTRAQ labeling, the samples were processed in accordance with the manufacturer’s protocol for 8plex reagents (AB Sciex, Framingham, MA) and then fractionated using strong cation exchange chromatography as previously described (20). Samples from 26 children were compared using the iTRAQ method, applying 27 paired/cross-referenced 8plex iTRAQ labeling schemes of the samples. Samples from 12 additional children were analyzed in quadruplicate using a label-free quantitative (LFQ) approach (depletion with the Hu14 system), with concentration and digestion performed in a similar manner as for the iTRAQ samples (21), with the digests otherwise unfractionated before high-performance liquid chromatography-tandem mass spectrometry (LC-MS/MS).
LC-MS/MS Analysis
LC-MS/MS analyses were performed with a QSTAR Elite time-of-flight instrument and an Orbitrap Velos Pro Fourier transform instrument. For the analysis of iTRAQ-labeled samples, the collision-induced dissociation and higher-energy collisional dissociation modes were used to record positive ion tandem mass spectra for the QSTAR Elite and Orbitrap Velos, respectively. The LFQ data were acquired with the Orbitrap Velos using collision-induced dissociation. Chromatographic separations were made with 150 mm × 75 μm internal diameter columns packed with magic C18-bonded silica (200 Å) using binary gradients of water and acetonitrile with 0.2% formic acid.
LC-MS/MS Data Processing
The iTRAQ data were analyzed with ProteinPilot software using the Paragon identification algorithm (22) with a Human Swiss-Prot database (18 August 2011; 20,245 entries). The database searches were made in thorough mode, specifying 8plex iTRAQ quantification, trypsin digestion, and MMTS (S-methyl methanethiosulfonate) modification of cysteine. The QSTAR data were analyzed directly, and the Orbitrap data were converted to mascot generic format using Proteome Discoverer version 1.3 (Thermo Scientific) (23). False discovery rates (FDRs) for protein identification were estimated using the ProteinPilot PSEP functionality (24,25). A confidence threshold of 95% for protein identification was applied. iTRAQ ratios were calculated using ProteinPilot.
LFQ data were analyzed with Proteome Discoverer together with Mascot 2.1 (Matrix Science). The search criteria were trypsin digestion, MMTS modification of cysteine, deamidation of N/Q, and methionine oxidation, using the aforementioned database. For the quantitative analysis, Progenesis version 4.0 software was used for feature detection, alignment, and calculation of intensity-based abundance measurements for each protein (26). To facilitate comparison of the label-free data with the iTRAQ results, the intensity values of each protein were scaled relative to the median intensity of each protein across the paired case-control sample series.
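The median scaling described above can be illustrated with a minimal sketch. It is not the processing code used in the study: the data layout (a dictionary mapping a protein identifier to a list of intensities for one paired sample series), the function name, and the example values are assumptions made for illustration, and intensities are assumed to be positive.

```python
from statistics import median

def scale_to_series_median(intensities):
    """Divide each protein's intensities by that protein's median over the series.

    `intensities` maps a protein identifier to a list of intensity values, one
    per sample in a paired case-control series (hypothetical layout). Missing
    values are passed as None and are returned as None.
    """
    scaled = {}
    for protein, values in intensities.items():
        observed = [v for v in values if v is not None]
        if not observed:
            # Nothing was quantified for this protein; keep the row unchanged.
            scaled[protein] = list(values)
            continue
        med = median(observed)
        scaled[protein] = [None if v is None else v / med for v in values]
    return scaled

# Invented intensities for two proteins across three samples.
example = {
    "APOC4": [2.0e6, 4.0e6, 8.0e6],
    "AFAM": [1.0e7, None, 3.0e7],
}
scaled = scale_to_series_median(example)
print(scaled["APOC4"])  # [0.5, 1.0, 2.0]
print(scaled["AFAM"])   # [0.5, None, 1.5]
```

After such scaling, each protein is expressed relative to its own typical level in the series, which places the label-free values on a ratio-like scale comparable to iTRAQ reporter ratios.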
Data Analysis
Serum Proteomics Differences Between Healthy Children and Type 1 Diabetes Progressors
Case-control abundance ratios were calculated for the paired samples. The ratios were log2 transformed and used in rank product analyses (27) to identify differences throughout the time series (n = 19), before the detection of autoantibody seroconversion (n = 14), and before diagnosis (n = 19). The rank product analyses were made with 10,000 permutations, and an FDR ≤5% was applied (Benjamini-Hochberg correction). The averaged log2 case-control ratios were used to compare time intervals selected on the basis of the similarity of the sample series as indicated in Table 2.
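As an illustration of the rank product procedure with permutation-based P values and Benjamini-Hochberg correction, the following minimal sketch may be helpful. It is not the software used in the study. It assumes a complete matrix of log2 case-control ratios (no missing values), tests only for proteins that are consistently lower in case subjects (negating the ratios would test the opposite direction), and uses invented ratios and fewer permutations than the 10,000 applied here. All function names are hypothetical.

```python
import math
import random

def ranks_ascending(values):
    """Rank 1 = smallest value; ties are broken by input order (sketch only)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks

def rank_products(log_ratios):
    """Geometric mean of per-pair ranks; log_ratios[p][j] = protein p, pair j."""
    n_prot, n_pairs = len(log_ratios), len(log_ratios[0])
    per_pair = [ranks_ascending([row[j] for row in log_ratios])
                for j in range(n_pairs)]
    return [math.exp(sum(math.log(per_pair[j][p]) for j in range(n_pairs)) / n_pairs)
            for p in range(n_prot)]

def permutation_pvalues(log_ratios, n_perm=1000, seed=0):
    """Fraction of null rank products that are <= each observed rank product."""
    rng = random.Random(seed)
    observed = rank_products(log_ratios)
    n_prot, n_pairs = len(log_ratios), len(log_ratios[0])
    counts = [0] * n_prot
    for _ in range(n_perm):
        # Shuffle protein labels independently within every case-control pair.
        cols = []
        for j in range(n_pairs):
            col = [row[j] for row in log_ratios]
            rng.shuffle(col)
            cols.append(col)
        permuted = [[cols[j][p] for j in range(n_pairs)] for p in range(n_prot)]
        null = rank_products(permuted)
        for p in range(n_prot):
            counts[p] += sum(1 for v in null if v <= observed[p])
    return [c / (n_perm * n_prot) for c in counts]

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Invented log2 case-control ratios: 3 proteins (rows) x 3 pairs (columns).
demo = [[-1.0, -1.2, -0.9],
        [0.1, 0.0, 0.2],
        [0.5, 0.4, 0.6]]
demo_rp = rank_products(demo)
demo_p = permutation_pvalues(demo, n_perm=200, seed=1)
demo_fdr = benjamini_hochberg(demo_p)
```

In this toy matrix the first protein has the lowest ratio in every pair, so its rank product is the smallest possible and its permutation P value is no larger than those of the other proteins.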
Serum proteins detected at different levels in children progressing to type 1 diabetes and matched control subjects
Protein name | Entry | Entry name | Average unique peptides iTRAQ | Average unique peptides label free | Average % sequence coverage | Sample comparisons where detected | Average case-control ratio | % FDR |
---|---|---|---|---|---|---|---|---|
Mannose-binding protein C | P11226 | MBL2 | 24 | 9 | 39 | a | 0.92 | 2.0 |
Complement factor H–related protein 5 | Q9BXR6 | FHR5 | 10 | 9 | 10 | a | 1.24 | 4.0 |
Complement component C9 | P02748 | CO9 | 49 | 35 | 36 | a,e,f | 1.20 | 3.3 |
Apolipoprotein C-IV | P55056 | APOC4 | 7 | 6 | 22 | a,b,c,d | 0.62 | 0.6 |
Apolipoprotein C-II | P02655 | APOC2 | 8 | 5 | 63 | a,b | 0.76 | 0.0 |
Profilin-1 | P07737 | PFN1 | 5 | 5 | 30 | d | 1.42 | 3.0 |
Profilin-1 | P07737 | PFN1 | 5 | 5 | 30 | e | 0.81 | 0.4 |
Coagulation factor IX | P00740 | FA9 | 11 | 3 | 19 | g | 1.46 | 4.5 |
Dopamine beta-hydroxylase | P09172 | DOPO | 10 | 5 | 15 | e | 1.39 | 0.0 |
C4b-binding protein beta chain | P20851 | C4BPB | 5 | 2 | 21 | e | 1.32 | 5.0 |
Adiponectin | Q15848 | ADIPO | 13 | 3 | 33 | e | 0.68 | 0.01 |
Sex hormone–binding globulin | P04278 | SHBG | 30 | 13 | 50 | e | 0.79 | 0.9 |
Periostin | Q15063 | POSTN | 13 | 7 | 17 | e | 0.76 | 0.9 |
Transforming growth factor-beta–induced protein ig-h3 | Q15582 | BGH3 | 17 | 10 | 24 | e | 0.83 | 1.6 |
Peptidase inhibitor 16 | Q6UXB8 | PI16 | 15 | 16 | 19 | e | 0.83 | 1.0 |
Protein S100-A9 | P06702 | S10A9 | 6 | 3 | 40 | f | 1.41 | 4.4 |
The analyses were made by rank product analysis of the case-control ratios, with the samples considered in relation to time intervals relative to diagnosis and seroconversion (as indicated). a, throughout (n = 19); b, preseroconversion (n = 14); c, 9–12 months preseroconversion; d, 3–6 months preseroconversion; e, postseroconversion and <1.5 year prediagnosis (n = 19); f, 15–18 months postseroconversion (n = 14); g, 3–6 months postseroconversion (n = 14).
Because the two data sets (iTRAQ vs. label-free) demonstrated a close overlap of the proteins repeatedly detected and quantified, the ranked results from these analyses were combined to investigate the longitudinal paired differences and trends. Previous studies have indicated that these two methods are complementary (28,29). If a protein was absent in either the case or the control subject, the paired measurement was neither used nor imputed, to minimize the influence of missing values.
Longitudinal Serum Proteomics Profiles in Children Progressing to Type 1 Diabetes
Spearman rank correlation analyses were used to assess whether any of the protein profiles were related to progression to diabetes. The analyses were made for the collected case-control abundance ratios of the paired samples. For this comparison, there were 11 well-matched pairs with samples before and after seroconversion (Table 3). The analysis was repeated separately for the case- and the control-to-reference ratios. To unify this analysis, the age/time axis was scaled between birth and diagnosis (0–1). An absolute Spearman correlation coefficient ≥0.4 was considered a valid weak correlation (two-tailed P value based on 10,000 permutations of the time axis, FDR ≤5%).
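The correlation analysis on the scaled time axis can be sketched as follows. This is an illustrative reconstruction under stated assumptions rather than the analysis code of the study: the ages, the age at diagnosis, and the log2 ratios are invented; a single sample series is shown; and the P value is obtained by shuffling the time axis, as described above. The function names are hypothetical.

```python
import random

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def permutation_p(x, y, n_perm=10000, seed=0):
    """Two-tailed P value from random permutations of the time axis x."""
    rng = random.Random(seed)
    observed = abs(spearman(x, y))
    shuffled = list(x)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(spearman(shuffled, y)) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Invented series for one child: sampling ages (years) and age at diagnosis.
ages = [0.5, 1.0, 1.5, 2.0, 3.0, 3.5]
age_at_diagnosis = 4.0
scaled_time = [a / age_at_diagnosis for a in ages]  # 0 = birth, 1 = diagnosis

# Invented log2 subject-to-reference ratios that rise steadily with time.
log2_ratios = [-0.30, -0.10, 0.00, 0.20, 0.35, 0.50]

rho = spearman(scaled_time, log2_ratios)
p_value = permutation_p(scaled_time, log2_ratios, n_perm=2000, seed=1)
```

Scaling the time axis leaves the ranks within a single series unchanged; its purpose is to make samples from children diagnosed at different ages comparable when the series are considered together.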
Longitudinal correlation of serum proteins specific to the children who progressed to type 1 diabetes
Protein name | Entry name | Entry | Average unique peptides iTRAQ | Average unique peptides label free | Average % sequence coverage* | Correlation coefficient (Spearman) |
---|---|---|---|---|---|---|
Fetuin-B | FETUB | Q9UGM5 | 18 | 7 | 30 | 0.63 |
Serum amyloid P-component | SAMP | P02743 | 32 | 9 | 37 | 0.51 |
Clusterin | CLUS | P10909 | 70 | 35 | 43 | 0.50 |
C4b-binding protein alpha chain | C4BPA | P04003 | 21 | 11 | 24 | 0.49 |
C4b-binding protein beta chain | C4BPB | P20851 | 5 | 2 | 20 | 0.48 |
Complement factor I | CFAI | P05156 | 50 | 44 | 48 | 0.45 |
Inter-alpha-trypsin inhibitor heavy chain H4 | ITIH4 | Q14624 | 251 | 92 | 63 | 0.44 |
Apolipoprotein C-IV | APOC4 | P55056 | 4 | 6 | 22 | 0.44 |
Insulin-like growth factor–binding protein 3 | IBP3 | P17936 | 12 | 13 | 27 | 0.43 |
Serum amyloid A-4 protein | SAA4 | P35542 | 5 | 10 | 22 | 0.43 |
Complement component C8 alpha chain | CO8A | P07357 | 51 | 35 | 38 | 0.42 |
Complement C1q subcomponent subunit B | C1QB | P02746 | 27 | 17 | 29 | 0.42 |
Hyaluronan-binding protein 2 | HABP2 | Q14520 | 20 | 12 | 26 | 0.40 |
Complement component C8 gamma chain | CO8G | P07360 | 28 | 7 | 58 | 0.40 |
Transforming growth factor-beta–induced protein ig-h3 | BGH3 | Q15582 | 17 | 10 | 24 | −0.41 |
Ectonucleotide pyrophosphatase/phosphodiesterase family member 2 | ENPP2 | Q13822 | 11 | 4 | 12 | −0.41 |
Poliovirus receptor | PVR | P15151 | 5 | 4 | 8 | −0.42 |
Vinculin | VINC | P18206 | 6 | 3 | 6 | −0.42 |
N-acetylmuramoyl-l-alanine amidase | PGRP2 | Q96PD5 | 61 | 31 | 50 | −0.42 |
Contactin-1 | CNTN1 | Q12860 | 8 | 1 | 9 | −0.43 |
L-lactate dehydrogenase B chain | LDHB | P07195 | 9 | 2 | 24 | −0.46 |
Extracellular superoxide dismutase (Cu-Zn) | SODE | P08294 | 7 | 4 | 28 | −0.48 |
Apolipoprotein A-IV | APOA4 | P06727 | 189 | 97 | 74 | −0.54 |
Adiponectin | ADIPO | Q15848 | 13 | 3 | 33 | −0.54 |
Neural cell adhesion molecule 1 | NCAM1 | P13591 | 13 | 5 | 17 | −0.60 |
Insulin-like growth factor–binding protein 2 | IBP2 | P18065 | 6 | 6 | 19 | −0.64 |
A subset of proteins was identified where the absolute Spearman correlation coefficient was >0.4 (permutation-based two-tailed test with FDR ≤5%) that were not observed at or above these thresholds in the control subjects. The analysis was based on the changes observed in 11 children representing the samples before and after seroconversion (D1, D4, D5, D9, D10, D11, D12, D14, D15, D17, and D19). The functional enrichment for these proteins is shown in Table 4. The equivalent analysis was made for the control subjects, and direct comparison of these lists is shown in Supplementary Table 6A and B.
*Coverage based on iTRAQ measurements.
Subject and Status Classification
The top-scoring pairs (TSP) method was applied to identify whether combinations of the quantified proteins could classify the samples and subjects (30,31). The leave-one-out method was used for cross-validation. The method was applied with the subject-averaged log2 case-to-control abundance ratios for the time periods compared and similarly for the log2 subject-to-reference ratios. Although highlighted by the rank product analyses, apolipoprotein C-IV (APOC4) was detected in only 16 of 19 subject pairs; these 16 were analyzed separately with the TSP method. The failure to quantify APOC4 in all children was attributed to differences in instrument performance rather than to its absence.
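A minimal sketch of a top-scoring pair classifier with leave-one-out cross-validation is given below for orientation. It is a simplified illustration, not the implementation used in the study (30,31): ties between equally scoring pairs are resolved by keeping the first pair found rather than by a secondary score, and the feature values are invented log2 subject-to-reference ratios. The column order (APOC4, AFAM, and a third unnamed protein) and all function names are assumptions.

```python
from itertools import combinations

def fit_tsp(samples, labels):
    """Return (i, j, class_if_less) for the feature pair with the highest score.

    The score of a pair (i, j) is |P(x_i < x_j | class 1) - P(x_i < x_j | class 0)|.
    Both classes must be present in `labels`.
    """
    n_feat = len(samples[0])
    best = None
    for i, j in combinations(range(n_feat), 2):
        freq = {}
        for cls in (0, 1):
            rows = [s for s, lab in zip(samples, labels) if lab == cls]
            freq[cls] = sum(1 for s in rows if s[i] < s[j]) / len(rows)
        score = abs(freq[1] - freq[0])
        if best is None or score > best[0]:
            best = (score, i, j, 1 if freq[1] > freq[0] else 0)
    _, i, j, class_if_less = best
    return i, j, class_if_less

def predict_tsp(model, sample):
    """Assign the class by the ordering of the two selected features."""
    i, j, class_if_less = model
    return class_if_less if sample[i] < sample[j] else 1 - class_if_less

def leave_one_out_accuracy(samples, labels):
    """Refit on all subjects but one and test on the subject left out."""
    correct = 0
    for k in range(len(samples)):
        train_x = samples[:k] + samples[k + 1:]
        train_y = labels[:k] + labels[k + 1:]
        model = fit_tsp(train_x, train_y)
        correct += int(predict_tsp(model, samples[k]) == labels[k])
    return correct / len(samples)

# Invented values; columns: APOC4, AFAM, and a third protein (hypothetical).
samples = [
    [-0.8, 0.3, 2.0],   # case
    [-0.5, 0.2, -2.0],  # case
    [-0.6, 0.4, 2.0],   # case
    [0.2, -0.1, 2.0],   # control
    [0.3, 0.0, -2.0],   # control
    [0.1, -0.2, -2.0],  # control
]
labels = [1, 1, 1, 0, 0, 0]  # 1 = progressor, 0 = control

model = fit_tsp(samples, labels)
accuracy = leave_one_out_accuracy(samples, labels)
print(model, accuracy)  # (0, 1, 1) 1.0
```

Because the decision depends only on the relative ordering of two proteins within a sample, the rule is insensitive to any monotonic normalization applied to that sample, which is one commonly cited motivation for the TSP approach.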
Hierarchical Clustering and Correlation Analysis
To identify proteins with similar longitudinal profiles and highlight intersubject differences, k-medians clustering of the diabetes case-control paired subject data was performed using the Pearson correlation coefficient (k = 15). The Multiexperiment Viewer was used for these analyses (32).
For the comparative analysis of changes in the complement proteins, the Pearson correlation coefficients from each subject of complement component 5 (CO5) with the other proteins were used together in rank product analyses. CO5 was selected because of its central role in the formation of the membrane attack complex.
Gene Ontology Annotation and Pathway Analysis
DAVID (33) was used to perform functional annotation and pathway analysis of the proteins correlated with time to diagnosis and to further analyze the protein clusters where the correlation coefficient was ≥0.6. To reduce bias in these enrichment analyses, the protein background was restricted to the proteins identified or used in the comparison (34).
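The effect of the choice of background on an enrichment P value can be illustrated with a plain hypergeometric test. This sketch is not DAVID's own statistic, which is a modified Fisher exact test, and the numbers of annotated background proteins used below (40 and 400) are hypothetical values chosen only for illustration; the list size (14), the number of annotated list members (10), the core protein count (248), and the database size (20,245) are taken from the text and tables.

```python
from math import comb

def enrichment_p(k, n, K, N):
    """Upper-tail hypergeometric probability P(X >= k).

    N: proteins in the background (for example, all proteins quantified)
    K: background proteins carrying the annotation
    n: proteins in the list of interest
    k: proteins in the list of interest carrying the annotation
    """
    upper = min(n, K)
    total = comb(N, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total

# 10 of 14 listed proteins carry the term. K = 40 and K = 400 are hypothetical.
p_restricted = enrichment_p(k=10, n=14, K=40, N=248)       # detected-protein background
p_database = enrichment_p(k=10, n=14, K=400, N=20245)      # whole-database background
print(p_restricted > p_database)  # True
```

With these assumed counts, the background restricted to detected serum proteins yields the larger, more conservative P value, because serum proteins are already rich in immune and inflammatory annotations relative to the whole database.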
Comparisons With Published Data
We compared the current study results with those from two studies of the serum proteomes of patients with type 1 diabetes (4,5). Collectively, these reports present 38 proteins (in the UniProt database) putatively associated with type 1 diabetes status (Supplementary Table 9).
Results
The iTRAQ measurements detailed, on average, the quantitative comparison of 220 proteins, and in total, 658 proteins were identified and quantified with two or more unique peptides. In comparing with reference concentrations (35) and after excluding depletion targets, these spanned a range of estimated concentrations of six orders of magnitude. With the analyses using a label-free approach, 261 proteins were consistently detected and quantified with more than one unique peptide and spanned a similar dynamic range of detection. The comparison of the proteins identified by the iTRAQ and label-free methods revealed an overlap of a core 248 proteins detected with two or more peptides.
Differences Between the Serum Proteomes of Children Who Developed Type 1 Diabetes and Age-Matched Control Subjects
The children who developed type 1 diabetes had lower levels of APOC4 and apolipoprotein C-II (APOC2) than age-matched healthy control subjects (FDR <1%). Mannose-binding protein C (MBL2) was likewise lower than in the matched control subjects (FDR 2%). In contrast, the relative abundances of complement factor H–related protein 5 (FHR-5) and CO9 were higher in the children who developed type 1 diabetes (FDR <5%) (Table 2).
In samples before seroconversion, lower levels of APOC4 and APOC2 were apparent in children who developed type 1 diabetes than in the matched control subjects (FDR <1% and 4%, respectively) (Supplementary Fig. 9B). Similarly, specific consideration of the samples 3–6 months before seroconversion was consistent with the lower levels of both APOC2 and APOC4 as well as with a larger relative abundance of profilin-1 (PFN1).
With a similar analysis of the age-matched data from the period after detection of seroconversion, several proteins were distinguished with a lower relative abundance, including sex hormone–binding globulin, adiponectin (ADIPO), and periostin (FDR <5%). A higher relative abundance of dopamine β-hydroxylase was observed as well as an apparent peak in protein S100-A9 and a decrease in PFN1 (Table 2).
Longitudinal Changes in the Serum Proteomes of Children En Route to Type 1 Diabetes
From the analysis of protein abundance ratios, no significant correlations were observed between the case-control ratios and the time to diagnosis; the strongest were found for Ig mu-chain C region (a depletion target) and tetranectin, which were positively and negatively correlated, respectively. In contrast, both the case- and the control-to-reference correlations gave a much clearer indication of the longitudinal changes in the serum proteomes. Changes in the abundance of 26 proteins (14 increased and 12 decreased, FDR ≤5%) (Table 3) were distinct from proteins observed to be correlated in both the case and the control children.
Serum Proteomics Classification of the Subjects Progressing to Type 1 Diabetes
TSP analysis classified the children progressing to type 1 diabetes at a success rate of 91% (Fig. 3A), the area under the curve being 0.85 (Supplementary Fig. 9A). The classification was based on the combination of the relative levels of APOC4 and afamin (AFAM), which were lower and higher than in the control subjects, respectively (P = 3 × 10⁻⁴, Wilcoxon signed rank test) (Fig. 3B). Similar analysis of the preseroconversion data did not reveal any clear classification, whereas for the postseroconversion data, vitronectin (VTN) and CO5 classified the children who progressed to type 1 diabetes with a success rate of 77% (Supplementary Fig. 10). With the evaluation of longitudinal changes in subjects progressing to type 1 diabetes, TSP analysis resulted in classification between the pre- and postseroconversion samples at a success rate of ∼80% based on changes in abundance of both apolipoprotein A-IV and insulin-like growth factor–binding protein complex acid labile subunit (Supplementary Fig. 11).
A: Classification between children who developed type 1 diabetes and age-matched control subjects based on abundance of APOC4 and AFAM. The TSP method was used, yielding a 91% success rate. ▲, control subjects; □, case subjects. B: Relative abundance measurements for APOC4 and AFAM for case and control subjects.
Functional Annotation Enrichment Analysis and Hierarchical Clustering
For the proteins observed to be positively correlated specifically in children who progressed to type 1 diabetes, a significant enrichment of proteins was associated with inflammation and immune response (Tables 3 and 4). There was no specific functional enrichment in the inversely correlated proteins.
GO annotations enriched in proteins increasing in children who progressed to type 1 diabetes
Term | P value | Proteins | % FDR |
---|---|---|---|
GO:0002526 acute inflammatory response | 3.8E-04 | P07357, Q14624, P04003, P05156, P02746, P20851, P10909, P07360, P02743, P35542 | 0.3 |
GO:0019724 B-cell–mediated immunity | 8.6E-04 | P07357, P04003, P05156, P02746, P20851, P10909, P07360 | 1.0 |
GO:0006958 complement activation, classical pathway | 8.6E-04 | P07357, P04003, P05156, P02746, P20851, P10909, P07360 | 1.0 |
GO:0016064 immunoglobulin-mediated immune response | 8.6E-04 | P07357, P04003, P05156, P02746, P20851, P10909, P07360 | 1.0 |
GO:0002455 humoral immune response mediated by circulating immunoglobulin | 8.6E-04 | P07357, P04003, P05156, P02746, P20851, P10909, P07360 | 1.0 |
GO:0006954 inflammatory response | 0.0010 | P07357, Q14624, P04003, P05156, P02746, P20851, P10909, P07360, P02743, P35542 | 1.3 |
Complement pathway | 0.0013 | P07357, P04003, P05156, P02746, P20851, P10909, P07360 | 1.3 |
GO:0002250 adaptive immune response | 0.0014 | P07357, P04003, P05156, P02746, P20851, P10909, P07360 | 1.7 |
GO:0002449 lymphocyte-mediated immunity | 0.0014 | P07357, P04003, P05156, P02746, P20851, P10909, P07360 | 1.7 |
GO:0002460 adaptive immune response based on somatic recombination of immune receptors built from immunoglobulin superfamily domains | 0.0014 | P07357, P04003, P05156, P02746, P20851, P10909, P07360 | 1.7 |
GO:0002443 leukocyte-mediated immunity | 0.0018 | P07357, P04003, P05156, P02746, P20851, P10909, P07360 | 2.1 |
GO:0006952 defense response | 0.0032 | P07357, Q14624, P04003, P05156, P02746, P20851, P10909, P07360, P02743, P35542 | 3.8 |
Innate immunity | 0.0039 | P07357, P04003, P05156, P02746, P20851, P10909, P07360 | 3.8 |
GO:0006959 humoral immune response | 0.0041 | P07357, P04003, P05156, P02746, P20851, P10909, P07360 | 4.8 |
GO:0002541 activation of plasma proteins involved in acute inflammatory response | 0.0041 | P07357, P04003, P05156, P02746, P20851, P10909, P07360 | 4.8 |
GO:0002253 activation of immune response | 0.0041 | P07357, P04003, P05156, P02746, P20851, P10909, P07360 | 4.8 |
GO:0006956 complement activation | 0.0041 | P07357, P04003, P05156, P02746, P20851, P10909, P07360 | 4.8 |
Enrichment was calculated for proteins with an absolute Spearman correlation coefficient ≥0.4 (FDR ≤5%) that were not observed at or above these thresholds in the control subjects. A background of the 208 proteins detected for these analyses was used in the enrichment analysis. The protein names and detection details are indicated in Table 3. GO, gene ontology.
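For orientation, an enrichment P value against a fixed background can be approximated with a one-sided hypergeometric test, as in the sketch below. The source does not state here which tool or exact statistic produced the tabulated values, so this is an assumption-laden illustration and is not expected to reproduce them. Only the background size of 208 comes from the text; the query size of 14 assumes that the 14 increasing proteins form the query list, the overlap of 7 is the number of accessions listed for several of the terms, and the number of annotated background proteins is hypothetical.

```python
from math import comb


def enrichment_p(k, n, K, N):
    """One-sided hypergeometric P(X >= k).

    N: proteins in the background
    K: background proteins annotated with the term
    n: proteins in the query list
    k: query proteins annotated with the term
    """
    upper = min(n, K)
    # math.comb returns 0 when the second argument exceeds the first,
    # so impossible draws contribute nothing to the sum.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


background = 208        # from the table footnote
query_size = 14         # assumption: the 14 increasing proteins
overlap = 7             # accessions listed for, e.g., complement activation
annotated_in_bg = 20    # hypothetical value, not given in the source

p = enrichment_p(overlap, query_size, annotated_in_bg, background)
print(f"{p:.2e}")
```

Correction for testing many terms (the % FDR column) would be applied across all terms afterwards and is outside this sketch.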
From similar enrichment analysis after k-medians clustering of the age-matched subject data, lipid and cholesterol transport, acute inflammatory response, and humoral and innate immunity were the most frequently observed enriched functional annotations (FDR ≤5%) for distinct protein profiles between the children progressing to type 1 diabetes and matched control subjects. These clusters frequently included a high representation of the complement proteins (data not shown). Because CO5 plays an important role in the formation of the membrane attack complex, we further analyzed the changes in the complement proteins correlated with CO5. This analysis revealed stronger correlations in the children en route to type 1 diabetes than in the age-matched control subjects (Wilcoxon signed rank test P = 0.004 and 0.0002, for positively and inversely correlated proteins, respectively). Overall, these analyses supported a strong positive correlation of the components of the membrane attack complex with CO5 (i.e., CO8A, CO8B, CO8G, CO6, CO9 [although not CO7]) in both groups, although the inverse correlations with CO5 in the case subjects were more clearly different from the control subjects (Table 5). From a separate analysis of proteins correlated with CO7, no such clear correlations were observed.
Correlation analysis of protein relative abundance profiles with CO5
Protein name | Protein abbreviation | Accession | Case % FDR | Type 1 diabetes median | Control % FDR | Control median |
---|---|---|---|---|---|---|
Complement component C8 gamma chain | CO8G | P07360 | 0.0 | 0.86 | 0.0 | 0.64 |
Serum amyloid P-component | SAMP | P02743 | 0.0 | 0.82 | 0.1 | 0.73 |
Complement component C9 | CO9 | P02748 | 0.0 | 0.81 | 0.0 | 0.73 |
Complement C1r subcomponent | C1R | P00736 | 0.0 | 0.8 | 15 | 0.41 |
Complement factor H–related protein 5 | FHR5 | Q9BXR6 | 0.0 | 0.78 | 0.0 | 0.62 |
Ceruloplasmin | CERU | P00450 | 0.0 | 0.77 | 0.0 | 0.65 |
Leucine-rich alpha-2-glycoprotein | A2GL | P02750 | 0.0 | 0.76 | 0.0 | 0.65 |
Inter-alpha-trypsin inhibitor heavy chain H4 | ITIH4 | Q14624 | 0.0 | 0.75 | 0.0 | 0.68 |
Complement component C8 beta chain | CO8B | P07358 | 0.0 | 0.74 | 0.0 | 0.68 |
Complement factor B | CFAB | P00751 | 0.0 | 0.72 | 0.0 | 0.71 |
Complement factor I | CFAI | P05156 | 0.0 | 0.72 | 0.0 | 0.76 |
Complement component C6 | CO6 | P13671 | 0.1 | 0.71 | 0.0 | 0.69 |
C4b-binding protein alpha chain | C4BPA | P04003 | 0.0 | 0.71 | 0.1 | 0.6 |
Alpha-1-acid glycoprotein 1 | A1AG1 | P02763 | 0.0 | 0.7 | 0.0 | 0.75 |
Complement C1s subcomponent | C1S | P09871 | 0.0 | 0.7 | 1.0 | 0.39 |
Complement component C8 alpha chain | CO8A | P07357 | 0.0 | 0.66 | 0.1 | 0.66 |
Lipopolysaccharide-binding protein | LBP | P18428 | 0.4 | 0.66 | 0.2 | 0.65 |
Alpha-1-antichymotrypsin | AACT | P01011 | 0.0 | 0.66 | 0.0 | 0.77 |
Protein S100-A9 | S100A9 | P06702 | 0.8 | 0.65 | 1.0 | 0.28 |
Complement C4-B | CO4B | P0C0L5 | 0.0 | 0.65 | 0.3 | 0.62 |
Alpha-1-acid glycoprotein 2 | A1AG2 | P19652 | 0.0 | 0.64 | 0.0 | 0.72 |
Complement factor H–related protein 3 | FHR3 | Q02985 | 0.4 | 0.54 | 0.2 | 0.55 |
Apolipoprotein A-IV | APOA4 | P06727 | 0.0 | −0.51 | 5.0 | −0.36 |
Alpha-2-HS-glycoprotein | FETUA | P02765 | 0.5 | −0.52 | 9.0 | −0.29 |
Complement component C1q receptor | C1QR1 | Q9NPY3 | 0.4 | −0.55 | 20 | −0.2 |
Sex hormone–binding globulin | SHBG | P04278 | 0.2 | −0.56 | 10 | −0.39 |
Receptor-type tyrosine-protein phosphatase gamma | PTPRG | P23470 | 0.2 | −0.59 | 4.0 | −0.33 |
Periostin | POSTN | Q15063 | 0.0 | −0.59 | 10 | −0.32 |
Vasorin | VASN | Q6EMK4 | 0.0 | −0.61 | 0.3 | −0.42 |
Peptidase inhibitor 16 | PI16 | Q6UXB8 | 0.0 | −0.63 | 2.0 | −0.34 |
Collectin-11 | COL11 | Q9BWP8 | 0.1 | −0.64 | 6.0 | −0.3 |
72 kDa type IV collagenase | MMP2 | P08253 | 0.0 | −0.65 | 7.0 | −0.2 |
Endothelial protein C receptor | EPCR | Q9UNN8 | 0.0 | −0.65 | 0.9 | −0.48 |
Collagen alpha-1(I) chain | CO1A1 | P02452 | 0.2 | −0.66 | 3.0 | −0.34 |
Aggrecan core protein | PGCA | P16112 | 0.0 | −0.66 | 15 | −0.19 |
Collagen alpha-1(VI) chain | CO6A1 | P12109 | 0.0 | −0.67 | 0.1 | −0.48 |
Contactin-1 | CNTN1 | Q12860 | 0.0 | −0.68 | 6.0 | −0.4 |
Collagen alpha-1(XII) chain | COCA1 | Q99715 | 0.0 | −0.71 | 5.0 | −0.3 |
The table indicates the median value of the top-ranked correlation coefficients from 19 × 2 subjects (case and control). To assign consistency in the observations, a rank product analysis was made for the generated coefficients and is represented by the % FDRs for the highest-ranked proteins.
Comparison With Serum Proteomics of Patients With Type 1 Diabetes
We compared the present data with observations from two studies of the serum proteomes of patients with type 1 diabetes (4,5). From the merged list of proteins of the latter studies, 32 of 38 were clearly defined in the present data. In large-scale targeted validations by Zhi et al. (4), patients with type 1 diabetes were found to have significantly higher serum levels of ADIPO, insulin-like growth factor–binding protein 2 (IGFBP2), serum amyloid protein A, and C-reactive protein and significantly lower levels of myeloperoxidase and transforming growth factor-beta–induced protein (BGH3) (4,5). In comparison, we found significantly lower abundances of both BGH3 and ADIPO in the children progressing to diabetes within the time frame of 18 months from diagnosis (Table 2), whereas a more pronounced time-related decrease was observed in these and IGFBP2 (Table 3). Among the biomarker candidates evaluated by Zhang et al. (5), we observed time-correlated increases of β-ala-his dipeptidase and glutathione peroxidase 3 in both the progressors and the control subjects, whereas a more pronounced increase in clusterin over time was indicated in the progressors (Table 3 and Supplementary Table 9).
Discussion
The serum proteomics profiles of a type 1 diabetes risk cohort enrolled in the Finnish DIPP study were analyzed. Comparisons were made between children who developed type 1 diabetes–associated autoantibodies and subsequently progressed to overt diabetes and age-matched children who remained persistently autoantibody negative. APOC4 and APOC2 were detected at lower levels in prospective samples of children progressing to diabetes. Recent studies have reinforced the hypothesis that viral infections play a role in type 1 diabetes (13), and lower levels of apolipoproteins have been associated with viral infection (36,37). In view of this potential association, we compared the current sample data with enterovirus data available in the DIPP project. Of the subjects considered in the proteomics measurements, data on neutralizing antibodies to coxsackievirus B1 were available for 12. Six of eight progressors had antibodies to the virus, whereas none of the four control subjects was antibody positive. Owing to the lack of control data, no further analyses were performed.
Analysis with the TSP classifier demonstrated classification within this group of subjects with a 91% success rate based on APOC4 together with AFAM levels. AFAM is involved in vitamin E transport and has been associated with insulin secretion in islet cells (38). However, because these measurements represent a small, yet well-controlled group of individuals, the applicability of these markers on a wider scale is not clear and needs to be analyzed in other cohorts.
In the children who were autoantibody positive, a lower abundance of ADIPO was observed. ADIPO is involved both in the control of fat metabolism and in insulin sensitivity and has been positively correlated with insulin sensitivity in patients with type 1 diabetes (4,39). PFN1 has been associated with inflammation and insulin resistance (40), and notably, significant differences were detected both before and after seroconversion (decreasing in the latter). Further evaluation of this observation is needed. Collectively, these findings appear to reflect metabolic differences and changes preceding the diagnosis of type 1 diabetes. Although we used age-matched control subjects, these changes should ideally be considered in relation to growth and puberty as well as to the broad range of ages among the subjects and the size of the cohort.
This study revealed correlations in the levels of the complement proteins and in particular, the components of the membrane attack complex. In general, the components of this complex circulate independently and interact in a highly specific manner following the cleavage of CO5 (41). We interpreted the observed correlation in the relative abundance of circulating concentrations to reflect activation of the terminal pathway. The best correlations for these components were found with CO8G and CO9. Because CO8G is less abundant than CO8A and CO8B (35) and CO9 contributes six molecules to the membrane attack complex, stoichiometry may be the reason that these proteins provide a better reflection of activation changes. Although the complement system is an integral and functional component of blood, it has been implicated as a contributor to the development of various autoimmune diseases, including type 1 diabetes (42). Thus, the present observations of the distinct profiles of the complement system in the children progressing to type 1 diabetes may reflect different challenges to the immune system or levels of immune responses. Notably, in relation to the complement proteins, the TSP analysis of samples postseroconversion relative to control samples provides classification with a success rate of 77% based on CO5 and VTN. VTN is involved in the regulation of the complement system and minimizes off-target effects during complement-mediated attack (43). In addition, a larger relative abundance of CO9 was observed in the children progressing to type 1 diabetes. From the time-wise correlations, the analysis of data from children progressing to type 1 diabetes emphasizes an increasing immune response before diagnosis (Table 4). Likewise, in the paired subjects, the detection of increasing Ig mu-chain C region could reflect the ongoing immune process.
Although MBL2 levels are variable within the population, the progressors displayed predominantly lower levels than matched control subjects. Notably, MBL2 is an important component of immune defense, where it is a key protein in lectin activation of the complement pathway, and as a genetic disorder, a deficiency of MBL2 is associated with an increased infection risk (44). However, no strong correlations were identified between the relative abundance of MBL2 and the other proteins quantified.
Consistent with a number of other multiple-comparison serum iTRAQ measurements, the data were limited to comparison of ∼250 proteins between subject pairs (15,16,45,46). Although the present measurements were mostly restricted to the description of the moderate abundance proteome, an underlying tenet in the proteomics community is that characterization of the lower abundance serum proteins holds the key to finding biomarkers (47). Instrumental improvements (48), alternative fractionation strategies, and data-independent approaches have begun to address this problem (49).
Validation measurements in a larger and independent cohort are an important next step in this research. On the basis of the present findings, studies exploiting selected reaction monitoring (SRM) are currently in progress to validate these putative biomarkers of development of β-cell autoimmunity. As a pertinent example of the application of this methodology, García-Bailo et al. (50) applied SRM protocols on the scale of 1,000 individuals to quantify a similar panel of plasma proteins. Our future goal is to investigate the extent to which we can improve the detection of disease activity and the prediction of type 1 diabetes through the detection of a selected set of serum protein targets by SRM together with other parameters, such as autoantibody status and other analytes identified from our ongoing omics studies.
In summary, using mass spectrometry–based analysis of immunodepleted sera, we demonstrate for the first time serum proteomics profiles of the prediabetic transition all the way to diagnosis, comparing profiles between children progressing to type 1 diabetes and healthy children. These results demonstrate shared and group-specific longitudinal changes against a background of wide subject heterogeneity, suggesting that components of the moderately abundant serum proteins could indicate the emerging threat of type 1 diabetes. Future developments of this nature could include the determination of a panel of such proteins in addition to current markers of seroconversion as a means to stratify disease risk.
Article Information
Acknowledgments. The authors thank the DIPP families for participation; S. Simell and the staff of the DIPP study for work with the families and obtaining the samples of the study; J. Hakalax, M. Laaksonen, and E. Pakarinen (of the Department of Pediatrics, Turku University Hospital, Turku, Finland, and the Department of Pediatrics, University of Turku, Turku, Finland) for handling and managing study participant data; and V. Simell (of the Department of Pediatrics, Turku University Hospital, Turku, Finland, and the Department of Pediatrics, University of Turku, Turku, Finland) for managing the DIPP biobank and samples. The measurements presented in this work were performed at the Turku Centre for Biotechnology Proteomics core facility from which the excellent technical support of Arttu Heinonen and constructive criticisms of Susumu Imanishi (both of the Turku Centre for Biotechnology, University of Turku, Turku, Finland) are greatly appreciated. Waltteri Hosia is acknowledged for early work in this project and former students Ida Koho and Joona Valtonen (all from Turku Centre for Biotechnology, University of Turku, Turku, Finland) are thanked for their assistance. AB Sciex, particularly Sean Seymour, is thanked for assistance with ProteinPilot.
Funding. The work was financially supported by the National Technology Agency of Finland grants (40453/04, 40229/08, and 40398/11), JDRF (17-2011-586), and Academy of Finland grants (77773, 203725, 207490, 116639, 115939, 123864, 126063, 110432 to J.M. and the Centre of Excellence in Molecular Systems Immunology and Physiology Research, 2012–2017, Decision No. 250114 to M.K., H.L., O.S., and R.L.).
Duality of Interest. No potential conflicts of interest relevant to this article were reported.
Author Contributions. R.M. set up the experimental methods, prepared and analyzed samples, analyzed and interpreted the data, and was the key author of the manuscript. S.D.B. prepared and analyzed samples and played an important role in implementing the label-free method. T.E., E.L., and J.S. participated in establishing and testing methods for the analysis of the longitudinal data. E.V.N. contributed to the clustering analysis and preliminary analysis of the label-free data. H.K. was involved in the interpretation of the observations and discussions throughout the study. J.M. participated in the initiation of the study. M.V.-M. participated in selecting the samples. H.H. was responsible for the virus analysis within the study. R.V. and M.K. were responsible for the analyses of diabetes-associated autoantibodies. J.I. was responsible for the DNA isolation and HLA screening of the study children. T.S., J.T., and O.S. provided the samples and the clinical information of the study children. D.R.G. was involved in selecting the analytical methods and data analysis strategies. H.L. supervised T.E. and E.L. and participated in the interpretation of the results. O.S. and R.L. initiated the study, designed the study setup, supervised the study, and participated in the interpretation of the results and writing the manuscript. All authors contributed to the final version of the manuscript. R.M., H.L., O.S., and R.L. are the guarantors of this work and, as such, had full access to all the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.