Progression to clinical type 1 diabetes varies among children who develop β-cell autoantibodies. Differences in autoantibody patterns could relate to disease progression and etiology. Here we modeled complex longitudinal autoantibody profiles by using a novel wavelet-based algorithm. We identified clusters of similar profiles associated with various types of progression among 600 children from The Environmental Determinants of Diabetes in the Young (TEDDY) birth cohort study; these children developed persistent insulin autoantibodies (IAA), GAD autoantibodies (GADA), insulinoma-associated antigen 2 autoantibodies (IA-2A), or a combination of these, and they were followed up prospectively at 3- to 6-month intervals (median follow-up 6.5 years). Children who developed multiple autoantibody types (n = 370) were clustered, and progression from seroconversion to clinical diabetes within 5 years ranged between clusters from 6% (95% CI 0, 17.4) to 84% (59.2, 93.6). Children who seroconverted early in life (median age <2 years) and developed IAA and IA-2A that were stable-positive on follow-up had the highest risk of diabetes, and this risk was unaffected by GADA status. Clusters of children who lacked stable-positive GADA responses contained more boys and lower frequencies of the HLA-DR3 allele. Our novel algorithm allows refined grouping of β-cell autoantibody–positive children who distinctly progressed to clinical type 1 diabetes, and it provides new opportunities in searching for etiological factors and elucidating complex disease mechanisms.
Introduction
Clinical type 1 diabetes is commonly preceded by the development of autoantibodies against pancreatic β-cell antigens, such as insulin autoantibodies (IAA), GAD autoantibodies (GADA), insulinoma-associated antigen-2 autoantibodies (IA-2A), and zinc transporter 8 autoantibodies (ZnT8A) (1). In particular, children who develop two or more of these autoantibody types almost inevitably progress to clinically symptomatic diabetes (2). These findings have led to a new staging of type 1 diabetes that classifies the presence of advanced β-cell autoimmunity (multiple autoantibody types) but no symptoms of diabetes as an early stage of disease, that is, presymptomatic type 1 diabetes (3,4). However, the duration of progression from presymptomatic to clinical type 1 diabetes varies among children who are positive for multiple autoantibody types (2). Autoantibody characteristics stratify diabetes risk; these characteristics include age at seroconversion (2,5–7), antibody number (8–10), titer (6,7,9–12), affinity (13,14), antigen specificity (9,15–17), and epitope binding (9,14,18,19). Nevertheless, the relation between longitudinal autoantibody profiles and the rate of progression to diabetes has rarely been studied. The Environmental Determinants of Diabetes in the Young (TEDDY) study recently reported that among children positive for multiple autoantibody types, those who reverted from GADA-positive to GADA-negative status at follow-up had greater risk of diabetes than those with persistent autoantibodies (20). Likewise, clustering children on the basis of similarities between sequential autoantibody patterns in the German BABYDIAB cohort revealed delayed progression to type 1 diabetes in children who were positive for multiple autoantibody types and became IAA-negative at follow-up (21). To our knowledge, however, no study to date has analyzed longitudinal profiles of multiple autoantibodies with due consideration of the timing of changes in the qualitative status of the various autoantibodies.
The TEDDY study provides unique opportunities for analyzing longitudinal autoantibody profiles on the basis of complete time series of autoantibody measurements, which are available because type 1 diabetes–associated autoantibodies were sampled and measured frequently, starting in early infancy (22). This could refine stratification of progression to clinical diabetes on the basis of similarities in the timing of changes in autoantibody responses. However, the high complexity and multivariate nature of the longitudinal autoantibody data remain obstacles to analysis. To address this issue, we developed a mathematical algorithm based on Haar wavelet decomposition that enables children to be clustered according to similarities in their longitudinal autoantibody profiles. In contrast to most published approaches (2,5–10,12,20), our proposed method does not require a priori definition of relevant autoantibody patterns or seroconversion ages, but intrinsically groups children by taking longitudinal characteristics into account.
Research Design and Methods
Study Population and Samples
The TEDDY study is a prospective cohort study with the primary goal of identifying environmental causes of type 1 diabetes. It includes six clinical research centers: three in the U.S. (Colorado, Georgia/Florida, Washington) and three in Europe (Finland, Germany, Sweden). Details of the study design and methods have been published previously (22). The TEDDY study enrolled 8,676 children who are genetically at risk for developing type 1 diabetes on the basis of their HLA genotype (23). Enrolled children are monitored prospectively from age 3 months, with study visits every 3 months until age 4 years and thereafter every 3 or 6 months, depending on autoantibody positivity, until age 15 years. Children who are persistently positive for any autoantibody are monitored every 3 months until the age of 15 years or the onset of type 1 diabetes. If remission of all autoantibodies occurs for four consecutive visits or a period of 1 year, an interval of 6 months becomes effective. Autoantibody-negative children are monitored every 6 months. The study was approved by local institutional review or ethics boards and monitored by an external evaluation committee formed by the National Institutes of Health. All participants provided written informed consent before participating in the genetic screening and in the prospective follow-up.
As of 31 December 2014, 618 children had developed confirmed persistent β-cell autoantibodies (IAA, GADA, IA-2A, or multiple types; 242 children were positive for a single autoantibody type and 376 for multiple autoantibody types) during a median follow-up of 6.5 years (interquartile range 5.2–8.0 years); 172 of those had developed diabetes. To avoid bias due to short follow-up profiles, all children with fewer than five longitudinal samples were excluded from the analysis (n = 18). Thus this analysis included 600 children (230 positive for a single autoantibody type and 370 for multiple autoantibody types), 165 of whom developed diabetes. We analyzed the qualitative status of IAA, GADA, and IA-2A over time using 37,047 measurements from these children from birth.
β-Cell Autoantibodies
IAA, GADA, and IA-2A were measured in two laboratories by using radiobinding assays, as previously described (22). In the U.S., all sera were assayed at the Barbara Davis Center for Childhood Diabetes at the University of Colorado, Denver; in Europe, all sera were assayed at the University of Bristol, Bristol, U.K. Both laboratories reported high sensitivity, specificity, and concordance (24). All samples positive for β-cell autoantibodies and 5% of negative samples were retested at the other reference laboratory, and if the results were concordant, the results were deemed confirmed. Persistent β-cell autoimmunity was defined as the presence of an autoantibody at two or more consecutive visits 3 months apart and confirmed by two TEDDY laboratories. Age at seroconversion was defined as the age of the child on the initial date of seroconversion to persistent β-cell autoimmunity, as previously described (25). A child was considered to be positive for multiple autoantibody types if at least two autoantibodies—IAA, GADA, IA-2A, or a combination—were positive in two consecutive samples, or if at least two of these autoantibody types were positive in the last available sample before the development of type 1 diabetes. An autoantibody response was defined as transiently positive if at least two consecutive samples were autoantibody-positive followed by at least two consecutive autoantibody-negative samples or an autoantibody-negative last available sample. An autoantibody response was defined as stable-positive if it was not transiently positive. An autoantibody profile was defined as the qualitative status of IAA, GADA, IA-2A (i.e., positive or negative as defined by a cutoff) at a single time point, building a three-dimensional binary vector. A longitudinal autoantibody profile was defined as the temporal sequence of all of a child’s single autoantibody profiles. Type 1 diabetes was defined according to American Diabetes Association criteria for diagnosis (3).
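The definitions of transiently positive and stable-positive responses given above can be illustrated with a minimal Python sketch. The function name `classify_response` and the label `"negative"` for a response that never shows two consecutive positive samples are assumptions made for illustration. The sketch also assumes that "followed by" refers to any later point in the sequence, and it does not model confirmation by the second laboratory.

```python
from typing import Sequence


def classify_response(status: Sequence[bool]) -> str:
    """Classify one autoantibody's visit-by-visit status (True = positive).

    Returns "negative", "transient", or "stable" (hypothetical labels).
    """
    n = len(status)
    # Locate the first run of at least two consecutive positive samples.
    first_run_end = None
    for i in range(n - 1):
        if status[i] and status[i + 1]:
            first_run_end = i + 1
            break
    if first_run_end is None:
        return "negative"  # never positive at two consecutive visits

    # Transient: the last available sample is negative ...
    if not status[-1]:
        return "transient"
    # ... or two consecutive negative samples occur after the positive run.
    for j in range(first_run_end + 1, n - 1):
        if not status[j] and not status[j + 1]:
            return "transient"
    # Stable-positive: positive by the rule above and not transient.
    return "stable"
```

For example, a sequence that turns positive and stays positive is labeled `"stable"`, whereas a sequence with two positive visits followed by two negative visits is labeled `"transient"`, even if it later becomes positive again.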
Statistical Analysis
Based on binary longitudinal autoantibody profiles of IAA, GADA, and IA-2A (i.e., temporal sequences of the positive or negative autoantibody status of all children), we developed a mathematical algorithm using Haar wavelets (26) to quantify the similarity between longitudinal autoantibody profiles. Hierarchical clustering was subsequently performed to group children on the basis of similarities. Imputation of data was required whenever samples were missing from a sequence of autoantibody measurements. A missing sample was assigned as autoantibody-positive if the samples immediately preceding and immediately after the missing sample were positive for the particular autoantibody. In all other cases, missing samples were assigned as autoantibody-negative.
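The imputation rule described above can be sketched in a few lines of Python. The function name `impute_missing` and the use of `None` to mark a missing sample are illustrative assumptions. The sketch also assumes that the neighboring samples are taken from the observed data (not from previously imputed values), so a missing sample next to another missing sample, or at either end of the sequence, is set to negative.

```python
from typing import List, Optional


def impute_missing(status: List[Optional[bool]]) -> List[bool]:
    """Fill missing samples (None) in one autoantibody's status sequence.

    A missing sample becomes positive only if the observed samples
    immediately before and immediately after it are both positive;
    in all other cases it becomes negative.
    """
    imputed: List[bool] = []
    for i, value in enumerate(status):
        if value is not None:
            imputed.append(value)
            continue
        before = status[i - 1] if i > 0 else None
        after = status[i + 1] if i + 1 < len(status) else None
        imputed.append(before is True and after is True)
    return imputed
```

Applied to `[True, None, True, None, False]`, the first gap is filled as positive and the second as negative.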
Follow-up time, and accordingly, the number of available samples and autoantibody measurements, varied considerably among children. We observed a bimodal distribution of follow-up time (Supplementary Fig. 1): 82 children (69 positive for multiple and 13 for single autoantibody types) were followed up for up to 42 months, and 518 children (301 positive for multiple and 217 for single autoantibody types) were followed up for more than 42 and up to 122 months. Because only shared follow-up periods could be used for pairwise comparisons of children, short periods shared by children with considerably different durations of follow-up did not contain sufficient information to achieve reasonable clustering results on the basis of wavelet coefficients alone, as children with qualitatively different longitudinal autoantibody profiles could be clustered together. Therefore, we first grouped children with long (more than 42 months) or short (up to 42 months) follow-up periods separately and later integrated the children with short follow-up into the clusters of children with long follow-up by using a combination of similar autoantibody patterns and similar timing of autoantibody development.
First, we applied a Haar wavelet decomposition to autoantibody sequences of children with more than 42 months of follow-up; this was done separately for IAA, GADA, and IA-2A. The resulting wavelet coefficients of the three autoantibodies were then combined into a single vector, which was used to estimate Euclidean distances of wavelet coefficients between pairs of children. Hierarchical clustering with complete linkage was performed on the resulting distances; this was done separately for multiple autoantibody–positive and single autoantibody–positive children. Second, children with shorter follow-up (up to 42 months) were assigned to clusters of children with longer follow-up by using a combination of distances that were based on wavelet coefficients and a recently described algorithm for measuring similarity between sequential autoantibody patterns (21). Both distance measures were calculated between a child with shorter follow-up and all children with longer follow-up. Next, a child with shorter follow-up was assigned to a cluster of children with longer follow-up by calculating a Student t statistic comparing distances of children in each respective cluster to distances of the remaining children. The Student t statistic of both distance measures was summed up, and each child with short follow-up was assigned to the cluster with long follow-up with the maximum of this sum. The combination of distances from wavelet coefficients and from sequential autoantibody patterns ensured that children with short follow-up times were assigned to clusters of children with both similar timing of autoantibody appearance and similar longitudinal autoantibody profiles.
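The first step of this procedure (per-autoantibody Haar decomposition, concatenation of the coefficients, and a Euclidean distance between two children) might look roughly like the following Python sketch. It is not the authors' implementation. The orthonormal normalization, the truncation to the shared number of visits, the zero-padding to the next power of two, and the names `haar_coefficients`, `profile_vector`, `profile_distance`, and the dictionary key `"IA2A"` are all assumptions made for illustration.

```python
import math
from typing import Dict, List, Sequence


def haar_coefficients(signal: Sequence[float]) -> List[float]:
    """Full orthonormal Haar decomposition of a power-of-two-length signal.

    Output order: overall approximation, then detail coefficients from
    the coarsest level to the finest level.
    """
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    approx = [float(v) for v in signal]
    details: List[float] = []
    while len(approx) > 1:
        pairs = [(approx[i], approx[i + 1]) for i in range(0, len(approx), 2)]
        level_detail = [(a - b) / math.sqrt(2) for a, b in pairs]
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
        details = level_detail + details  # coarser levels go in front
    return approx + details


def profile_vector(profile: Dict[str, Sequence[int]], length: int) -> List[float]:
    """Concatenate Haar coefficients of IAA, GADA, and IA-2A sequences."""
    size = 1 << max(length - 1, 0).bit_length()  # next power of two >= length
    vector: List[float] = []
    for name in ("IAA", "GADA", "IA2A"):
        seq = list(profile[name][:length])   # keep the shared follow-up only
        seq += [0] * (size - len(seq))       # zero-padding (an assumption)
        vector.extend(haar_coefficients(seq))
    return vector


def profile_distance(child_a: Dict[str, Sequence[int]],
                     child_b: Dict[str, Sequence[int]]) -> float:
    """Euclidean distance between the combined coefficient vectors."""
    shared = min(len(child_a["IAA"]), len(child_b["IAA"]))
    return math.dist(profile_vector(child_a, shared),
                     profile_vector(child_b, shared))
```

One consequence of the normalization chosen here is worth noting: because the complete orthonormal Haar transform preserves Euclidean norms, this particular distance equals the distance between the (padded) raw binary sequences. Any level-specific weighting or selection of coefficients would change that, and such details are not specified in the text above.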
We used Kaplan-Meier survival analysis with the log-rank test to compare progression from autoantibody seroconversion to type 1 diabetes between clusters. The event time was the time from the age at seroconversion to the age at diagnosis of diabetes or, for children without diabetes, the age at last contact. Children who were lost to follow-up were treated as censored observations. The 5-year diabetes-free survival is presented for clusters comprising 10 or more children. We used the Fisher exact test to compare frequencies between groups. All statistical analyses were performed using R version 3.2.2.
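As a rough illustration of how a 5-year diabetes-free survival estimate is obtained from event times with censoring, the following Python sketch implements the Kaplan-Meier product-limit estimator. It is a didactic stand-in, not the R code used in the study; the function names `kaplan_meier` and `survival_at` and the example values in the usage note are hypothetical, and the log-rank test and confidence intervals are not included.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float],
                 events: Sequence[bool]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct event time.

    times  : years from seroconversion to diagnosis or last contact
    events : True if diabetes was diagnosed, False if censored
    """
    event_times = sorted({t for t, e in zip(times, events) if e})
    survival = 1.0
    curve: List[Tuple[float, float]] = []
    for t in event_times:
        at_risk = sum(1 for x in times if x >= t)
        diagnosed = sum(1 for x, e in zip(times, events) if e and x == t)
        survival *= 1.0 - diagnosed / at_risk
        curve.append((t, survival))
    return curve


def survival_at(curve: Sequence[Tuple[float, float]], horizon: float) -> float:
    """Diabetes-free survival at `horizon` (step function, 1.0 before first event)."""
    prob = 1.0
    for t, s in curve:
        if t <= horizon:
            prob = s
    return prob
```

With hypothetical follow-up times `[1, 2, 3, 4, 6]` years and event indicators `[True, False, True, True, False]`, the 5-year risk would be computed as `1 - survival_at(kaplan_meier(times, events), 5)`.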
Results
Clustering of Multiple Autoantibody–Positive Children
We hypothesized that clustering multiple autoantibody–positive children on the basis of their consecutive profiles of IAA, GADA, and IA-2A could provide refined stratification with respect to progression to clinical type 1 diabetes and disease etiopathogenesis. Children who developed multiple β-cell autoantibody types (n = 370) were clustered on the basis of wavelet coefficients. We used the resulting dendrogram (Fig. 1) to define 12 multiple autoantibody clusters (mC1–mC12), each comprising 12–88 children who differed with respect to their age at autoantibody appearance and their autoantibody profile at follow-up (Fig. 2). Characteristics of the children in these clusters are summarized in Table 1. The clusters differed considerably with respect to the percentage of children who progressed from seroconversion to clinical diabetes within 5 years, ranging from 6% (95% CI 0, 17.4; cluster mC9) to 84% (59.2, 93.6; mC5) (Table 1). In particular, those clusters with the shortest distances from each other in the dendrogram (e.g., mC7 and mC8) (Fig. 2) had markedly different risks of diabetes, indicating that the approach could distinguish children with different progression on the basis of relatively small differences in their longitudinal autoantibody profiles. Next we explored whether the clusters could stratify progression in children with common characteristics such as similar age at seroconversion and similar autoantibody patterns.
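To make the clustering step concrete, the sketch below shows a naive complete-linkage agglomerative procedure that merges clusters until a requested number remains, which corresponds to cutting a dendrogram at that number of clusters. It assumes a precomputed symmetric distance matrix; the function name `complete_linkage` is hypothetical, and how the authors chose the cut that yielded 12 clusters is not reproduced here.

```python
from typing import List, Sequence


def complete_linkage(dist: Sequence[Sequence[float]],
                     n_clusters: int) -> List[List[int]]:
    """Agglomerative clustering with complete linkage.

    dist       : symmetric matrix of pairwise distances between children
    n_clusters : number of clusters at which merging stops
    Returns a list of clusters, each a sorted list of row indices.
    """
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    clusters = [[i] for i in range(len(dist))]

    def linkage(a: List[int], b: List[int]) -> float:
        # Complete linkage: distance between the two farthest members.
        return max(dist[i][j] for i in a for j in b)

    while len(clusters) > n_clusters:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = linkage(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]  # merge the closest pair
        del clusters[y]
    return [sorted(c) for c in clusters]
```

The quadratic search over cluster pairs is adequate for a few hundred children; a library routine would normally be preferred for larger data sets.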
Hierarchical clustering results for longitudinal autoantibody profiles of 370 children who developed multiple β-cell autoantibodies. The dendrogram is divided into 12 multiple autoantibody clusters (mC1–mC12). Each column within a cluster represents the follow-up time from birth (age) for one child. The qualitative status of IAA, GADA, and IA-2A is indicated by color (red = antibody-positive; blue = antibody-negative) with respect to the child’s age when antibodies were measured.
Aggregated longitudinal profiles of IAA, GADA, and IA-2A for children in the multiple-autoantibody clusters (mC1–mC12). For each cluster, the percentages of children who had the respective autoantibodies are indicated by color (white: 0% positive; red: 100% positive) with respect to age. The blue line indicates the age until which >50% of children in the cluster were followed up, and the green lines indicate the age until which >25% of children in the cluster were followed up. Autoantibody profiles are plotted until only two children in the cluster remained in follow-up.
Distribution of features among the multiple autoantibody clusters
Characteristic | All clusters | mC1 | mC2 | mC3 | mC4 | mC5 | mC6 | mC7 | mC8 | mC9 | mC10 | mC11 | mC12 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Children, n | 370 | 35 | 30 | 12 | 21 | 27 | 88 | 27 | 24 | 16 | 30 | 19 | 41 |
Age (years) at seroconversion, median (IQR) | 2.0 (1.1–3.1) | 4.0 (3.0–4.7) | 4.2 (2.8–5.5) | 1.0 (0.8–2.0) | 7.2 (3.3–7.5) | 1.0 (0.8–1.1) | 1.4 (0.8–1.8) | 3.3 (3.0–3.9) | 2.4 (1.8–2.9) | 2.5 (2.3–3.0) | 1.7 (1.0–2.2) | 1.6 (1.1–2.0) | 1.5 (1.0–2.3) |
T1D prevalence | 41 | 11 | 17 | 50 | 14 | 85 | 66 | 7 | 46 | 12 | 63 | 21 | 32 |
5-Year T1D risk, % (95% CI) | 44 (38–49) | 12 (0–23) | 28 (0–49) | 51 (5–75) | 26 (0–49) | 84 (59–94) | 63 (51–72) | 9 (0–21) | 53 (21–72) | 6 (0–17) | 41 (20–57) | 22 (0–39) | 40 (19–56) |
Maternal T1D | 6 | 14 | 10 | 8 | 0 | 7 | 5 | 4 | 4 | 6 | 3 | 0 | 5 |
HLA | |||||||||||||
HLA-DR3/DR3 | 7 | 9 | 3 | 8 | 5 | 0 | 3 | 4 | 0 | 31 | 17 | 21 | 7 |
HLA-DR3/DR4 | 57 | 69 | 63 | 33 | 48 | 44 | 64 | 56 | 38 | 56 | 43 | 68 | 63 |
HLA-DR4/DR4 | 18 | 17 | 23 | 33 | 14 | 15 | 10 | 26 | 42 | 13 | 27 | 0 | 15 |
HLA-DR4/DRx | 18 | 6 | 10 | 25 | 33 | 41 | 22 | 15 | 21 | 0 | 13 | 11 | 15 |
ZnT8A | 62 | 51 | 73 | 83 | 52 | 44 | 58 | 74 | 71 | 75 | 77 | 47 | 56 |
Born by cesarean delivery | 22 | 26 | 20 | 50 | 24 | 37 | 17 | 26 | 17 | 25 | 20 | 16 | 20 |
Male sex | 56 | 57 | 50 | 83 | 52 | 74 | 55 | 52 | 67 | 69 | 50 | 47 | 41 |
Data are percentages unless otherwise indicated. IQR, interquartile range; T1D, type 1 diabetes.
Children Who Seroconverted at a Very Young Age
First, we compared clusters of children with a similar young age at seroconversion but variable longitudinal autoantibody profiles with respect to differences in their progression to clinical type 1 diabetes. We therefore selected all clusters in which the median age at seroconversion was <2 years. This resulted in six clusters (Fig. 3A) characterized by the development of either three stable-positive autoantibodies (cluster mC6), two stable-positive autoantibodies and a transiently positive or negative third autoantibody (clusters mC5, mC10, mC12), or stable-positive GADA (cluster mC11) or stable-positive IA-2A (cluster mC3) and a transiently positive or negative second, or even third, autoantibody. Regardless of GADA status, children with the combination of stable-positive IAA and IA-2A (mC6 and mC5) had similar 5-year risks for diabetes: their risk was significantly higher than the risks in all remaining clusters of children with a very young age at seroconversion (P < 0.0001; hazard ratio [HR] 2.8 [95% CI 1.9, 4.2]) (Fig. 3B and Table 1). In contrast, the 5-year risks for diabetes were not significantly different between clusters of children with the combination of stable-positive GADA and IA-2A (mC10) or stable-positive IAA and GADA (mC12) and those with only stable-positive GADA (mC11) or IA-2A (mC3) (Fig. 3B and Table 1). However, the overall frequency of diabetes throughout follow-up was higher in clusters of children with stable-positive IA-2A (mC10 [63%] and mC3 [50%]) than in those without stable-positive IA-2A (mC12 [32%] and mC11 [21%]; P = 0.002) (Fig. 3C).
Characteristics of clusters of children who had multiple autoantibodies and who seroconverted at a very young age (median age <2 years). A: The percentage of children in each cluster who were stable-positive, transiently positive, or negative for IAA, GADA, and IA-2A at follow-up. B: The cumulative diabetes-free survival from autoantibody seroconversion. C: The overall frequency of diabetes throughout follow-up.
Children With Similar Autoantibody Patterns
Second, we compared clusters of children with similar autoantibody patterns but variable age at seroconversion with respect to differences in their progression to clinical type 1 diabetes. We therefore grouped clusters on the basis of autoantibody patterns over time and then compared clusters within the groups according to the median age at seroconversion: <2, 2–4, or >4 years. This resulted in four groups of three clusters each (Fig. 4A and Supplementary Fig. 2A). Clusters were characterized by the development of either stable-positive IAA, GADA, and IA-2A (clusters mC6, mC7, and mC2); stable-positive IA-2A and IAA (clusters mC5 and mC8) or only stable-positive IA-2A (cluster mC3); stable-positive IA-2A and GADA (clusters mC10, mC9, and mC4); or stable-positive GADA and IAA (cluster mC12) or only stable-positive GADA (clusters mC11 and mC1). Within each cluster group, younger age at seroconversion was generally associated with increased 5-year risk for diabetes (Fig. 4B–E), with the exception of children in cluster mC3, who seroconverted at a median age <2 years, developed stable-positive IA-2A, but lost IAA reactivity during follow-up (Figs. 2 and 4A) and presented with relatively delayed progression to clinical diabetes (Fig. 4C). The most significant effects of younger age at seroconversion on diabetes risk were observed among children who developed three stable-positive autoantibody types (mC6 vs. mC7/mC2; P < 0.0001; HR 5.4 [95% CI 2.5, 11.9]) (Fig. 4B), those developing stable-positive IA-2A and IAA (mC5 vs. mC8; P = 0.02; HR 2.3 [1.1, 4.9]) (Fig. 4C), and those developing stable-positive IA-2A and GADA (mC10 vs. mC9/mC4; P = 0.045; HR 3.9 [1.0, 9.3]) (Fig. 4D). Clusters of children with seroconversion at a median age <2 years also showed a higher overall frequency of diabetes than did those with similar autoantibody patterns but older age at seroconversion (mC6 vs. mC7/mC2 [P < 0.0001]; mC5 vs. mC8 [P = 0.007]; mC10 vs. mC9/mC4 [P < 0.0001]) (Supplementary Fig. 2B). In contrast, the 5-year risk for diabetes and overall frequency of diabetes were not statistically different between children who mainly lacked IA-2A and developed stable-positive GADA (mC12 vs. mC11 vs. mC1; P > 0.05 for all pairwise comparisons) (Fig. 4E and Supplementary Fig. 2B), and this was irrespective of age at seroconversion. Of note, the differences in diabetes risk and overall diabetes frequency between clusters of multiple autoantibody–positive children were not explained by ZnT8A status (Supplementary Fig. 2C).
Progression to type 1 diabetes among clusters of children who were positive for multiple autoantibody types and who have similar autoantibody characteristics but among whom the age at seroconversion varied. Clusters are organized into four groups of three clusters each on the basis of the similarity of autoantibody profiles. A: The percentage of children who were stable-positive, transiently positive, or negative for IAA, GADA, and IA-2A at follow-up. B–E: Cumulative diabetes-free survival from autoantibody seroconversion.
Features Associated With Autoantibody Patterns
Clusters lacking stable-positive GADA responses (clusters mC5, mC8, mC3) included more boys (P = 0.002) (Fig. 5A) and showed lower frequencies of the HLA-DR3 allele (P = 0.0002) (Fig. 5B) than did all other multiple-autoantibody clusters.
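Frequency comparisons of this kind rely on the Fisher exact test. A minimal two-sided version for a 2 × 2 table can be sketched in Python as follows. It is an illustration rather than the R routine used in the study, and the function name `fisher_exact_two_sided` is hypothetical. The sketch uses the common convention of summing the probabilities of all tables that are no more likely than the observed one.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]].

    Rows could be, e.g., cluster group vs. all other clusters;
    columns could be boys vs. girls.
    """
    row1, col1, total = a + b, a + c, a + b + c + d

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell,
        # given fixed row and column totals.
        return comb(col1, x) * comb(total - col1, row1 - x) / comb(total, row1)

    lowest = max(0, row1 + col1 - total)
    highest = min(row1, col1)
    observed = prob(a)
    # Sum over all tables at least as extreme (no more probable) than observed;
    # the small tolerance guards against floating-point ties.
    return sum(p for p in (prob(x) for x in range(lowest, highest + 1))
               if p <= observed * (1 + 1e-9))
```

For instance, back-calculating approximate counts from the rounded percentages in Table 1 (about 46 boys among the 63 children in mC5, mC8, and mC3, and about 160 boys among the remaining 307 children) gives a call such as `fisher_exact_two_sided(46, 17, 160, 147)`. Because these counts are reconstructed from rounded values, the result need not reproduce the reported P value exactly.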
The proportions of boys (A) and HLA-DR genotypes (B) among the children in the multiple-autoantibody clusters (mC1–mC12). Clusters are organized into four groups of three clusters each on the basis of the similarity of autoantibody profiles. The group comprising mC5, mC8, and mC3 included a significantly larger proportion of boys (P = 0.002) and a lower frequency of HLA-DR3 (P = 0.0002) than did the other cluster groups.
Clustering of Single Autoantibody–Positive Children
To analyze characteristics associated with different patterns of autoantibody reactivity against single β-cell antigens, 230 children positive for a single autoantibody type were clustered on the basis of wavelet decomposition of longitudinal time series of IAA, GADA, and IA-2A. We used the resulting dendrogram (Supplementary Fig. 3) to define nine single-autoantibody clusters (sC1–sC9) containing groups of 5–50 children who differed with respect to their longitudinal autoantibody profiles (Supplementary Fig. 4). Characteristics of the children in these clusters are summarized in Supplementary Table 1.
Children with stable-positive GADA responses were clustered into two groups (clusters sC1 and sC2) that differed with respect to the children’s age at seroconversion (P < 0.0001) (Supplementary Table 1), but they had similar 5-year risk for diabetes (Supplementary Fig. 5A and B). Compared with clusters sC1 and sC2 combined, significantly increased 5-year risk for diabetes was found for cluster sC5, comprising children with stable-positive IAA (P = 0.008; HR 4.3 [95% CI 1.3, 13.5]) (Supplementary Table 1 and Supplementary Fig. 5A and B). None of the children in clusters characterized by either transiently positive GADA (clusters sC3 and sC6) or transiently positive IAA (clusters sC7, sC8, sC9) developed diabetes during follow-up (Supplementary Table 1 and Supplementary Fig. 5A and B). A cluster characterized by transiently positive IA-2A was not observed.
Differences were found in ZnT8A positivity between clusters (Supplementary Table 1 and Supplementary Fig. 5C). While 80% of children (4 of 5) in the small cluster sC4 (stable-positive IA-2A) developed ZnT8A, only 37% of children (15 of 41) in cluster sC1 (stable-positive GADA), 12% of children (4 of 33) in cluster sC2 (stable-positive GADA), and 10% of children (2 of 21) in cluster sC5 (stable-positive IAA) developed ZnT8A.
HLA genotype was associated with single-autoantibody clusters. Of note, HLA DR3-DQ2/DR3-DQ2 was absent in clusters of children with stable-positive IA-2A (sC4) or stable-positive IAA (sC5). In contrast, HLA DR3-DQ2/DR3-DQ2 was relatively common among clusters of children with stable-positive GADA (sC1 [38%] and sC2 [24%]) or transiently positive GADA (sC3 [44%] and sC6 [24%]) (Supplementary Table 1 and Supplementary Fig. 5D). Clusters of children with stable-positive GADA (sC1 and sC2) showed higher frequency of the HLA-DR3 allele than did clusters of children with stable-positive IAA (sC5; P = 0.003) (Supplementary Fig. 5D) or stable-positive IA-2A (sC4; P = 0.05) (Supplementary Fig. 5D).
Discussion
In this study we tackled the challenge of combined analysis of complex longitudinal profiles of multiple biomarkers—namely, three types of β-cell autoantibodies—in a time-resolved fashion. We specifically considered the sequence of changes in the qualitative status (i.e., positive or negative) of each autoantibody in every serum sample collected throughout follow-up of 600 children who developed confirmed persistent IAA, GADA, IA-2A, or a combination of these while participating in the TEDDY study; in total this comprised more than 37,000 antibody measurements. We also specifically considered the age of the children when these changes in autoantibody status occurred. Using a novel wavelet-based algorithm, we were able to define similarities among the longitudinal autoantibody profiles of the children, including the temporal resolution of changes in autoantibody patterns. On the basis of these similarities, we could hierarchically cluster children who were positive for single and multiple autoantibody types to define clusters that were associated with markedly different rates of progression from seroconversion to clinical diabetes, particularly among those children with multiple autoantibodies. Depending on the cluster, 6% to 84% of those children progressed to diabetes within 5 years. Furthermore, we could pinpoint specific autoantibody patterns and characteristics related to different progression rates. We suggest that our approach holds great potential for refined explorations of the underlying etiology of different phenotypes of β-cell autoimmunity.
Strengths of our study include the unique and well-defined cohort and our use of an innovative analytical approach. The TEDDY study is the largest prospective study to date that monitors genetically at-risk children for the development of β-cell autoimmunity and type 1 diabetes (22). TEDDY researchers have collected data on various possible exposures that could be important to the occurrence and progression of β-cell autoimmunity (27–32). Associations with genetic risk factors and age at occurrence, type and levels of β-cell autoantibodies, and progression to clinical type 1 diabetes have recently been reported (7,20,25,33–35). This analysis adds to these previous studies in that our novel approach is data-driven and considers changes in autoantibody characteristics at the time they occur.
We made use of Haar wavelet coefficients (26) to define similarities between longitudinal profiles of children. This approach holds a number of advantages for analyzing prospective study data but has not yet been used in prospective studies of type 1 diabetes. Wavelets enable a time- and frequency-based decomposition of time series data. By applying an iterative scheme, coefficients determined at the earlier steps capture “high-frequency” information in the data, such as on-off switches, whereas coefficients at later steps in the iteration allow long-term trends to be identified in time series data. Wavelets are therefore a powerful tool for characterizing dynamic temporal patterns in autoantibody progression. Still, intrinsic characteristics of the method need to be considered. First, using Haar wavelets (i.e., decomposition on the basis of piecewise constant functions) might not provide the best orders of approximation, although they are computationally very efficient. Nevertheless, for the type of data analyzed in this study, Haar wavelet coefficients captured the information well enough, and no wavelets with high-order moments were needed. Second, in order to compare the children’s longitudinal autoantibody profiles, we had to truncate time series of various lengths to the length of the shorter series. Thus, in particular when comparing a very short time series with a longer one, comparison on the basis of wavelets ignores a substantial part of the information provided by the longer series. We compensated for this deficiency in our analysis by combining wavelet decomposition with another qualitative algorithm provided previously (21). Third, the method also requires that time series be sampled at equal intervals. Although this is the case in the TEDDY study, other decompositions would have to be applied for scattered data.
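The iterative Haar scheme and the truncation step described above can be illustrated with a minimal Python sketch. It is not the implementation used in this study. The function names, the unnormalized averaging and differencing variant, the restriction to power-of-two lengths, and the Euclidean distance between coefficient vectors are all assumptions made for illustration only.

```python
import math


def haar_coefficients(series):
    """Unnormalized Haar decomposition (illustrative sketch).

    Returns (details, mean): `details` lists the detail coefficients
    level by level, finest level first, and `mean` is the overall average.
    Assumes equally spaced samples and a power-of-two length.
    """
    n = len(series)
    if n == 0 or n & (n - 1):
        raise ValueError("series length must be a power of two")
    approx = [float(x) for x in series]
    details = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        # Early levels capture on-off switches between neighboring visits.
        details.append([(a - b) / 2 for a, b in pairs])
        # Pairwise averages feed the next, coarser level (long-term trend).
        approx = [(a + b) / 2 for a, b in pairs]
    return details, approx[0]


def truncate_to_common_length(a, b):
    """Cut both series to the largest power of two that fits the shorter one."""
    n = min(len(a), len(b))
    if n == 0:
        raise ValueError("series must not be empty")
    p = 1 << (n.bit_length() - 1)
    return a[:p], b[:p]


def profile_distance(a, b):
    """Euclidean distance between Haar coefficient vectors (assumed metric)."""
    a, b = truncate_to_common_length(a, b)
    details_a, mean_a = haar_coefficients(a)
    details_b, mean_b = haar_coefficients(b)
    flat_a = [mean_a] + [c for level in details_a for c in level]
    flat_b = [mean_b] + [c for level in details_b for c in level]
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(flat_a, flat_b)))


# Hypothetical status series for one autoantibody (0 = negative, 1 = positive).
long_profile = [0, 0, 0, 1, 1, 1, 1, 1]
short_profile = [0, 0, 0, 1]

print(haar_coefficients(long_profile))
print(profile_distance(long_profile, short_profile))
```

In this toy example the distance between the long and the short profile is zero, because the second half of the longer series is discarded before comparison. This is the loss of information that the second limitation above refers to.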
We focused our analysis on the group of children who tested positive for multiple β-cell autoantibodies. Considerable differences exist between children at this presymptomatic stage of type 1 diabetes with respect to the time until clinical onset of the disease (2,7,36). A well-known risk factor for fast progression to clinical diabetes among autoantibody-positive individuals is young age at seroconversion (5–7,34). It is therefore remarkable that we could 1) distinguish different rates of progression among clusters of children who were positive for multiple autoantibody types (n = 217), all of whom seroconverted at a very young age; and 2) link differences in progression to defined longitudinal autoantibody profiles. The highest risks were found in 115 children who developed both stable-positive IAA and stable-positive IA-2A responses early in life (clusters mC6 and mC5). It is interesting that the presence (mC6) or absence (mC5) of stable-positive GADA responses did not influence the high risk in those children. In contrast, risk was significantly lower for 102 children who seroconverted early and developed multiple autoantibodies but not stable-positive IAA and stable-positive IA-2A. This is in line with our previous observation in the BABYDIAB cohort that losing IAA reactivity is associated with delayed progression to type 1 diabetes in multiple autoantibody–positive children (21). Among clusters of children with similar autoantibody patterns, younger age at seroconversion was associated with faster progression to diabetes. Children who developed stable-positive GADA but lacked IA-2A responses were an exception to this rule; they progressed relatively slowly regardless of age at seroconversion.
Of note, the majority of children seemed to require HLA-DR3 in order to develop an autoantibody response to GAD that was stable-positive over time and was therefore presumably relevant for the individual immune phenotype and disease pathogenesis. Associations have been reported between HLA-DR and β-cell autoantibody specificity (13,14,25,37–39). In particular, the TEDDY study recently demonstrated that the presence of HLA-DR4 or HLA-DR3 strongly influenced the appearance of either IAA or GADA, respectively, as the first autoantibody in children (25,35). Our current data suggest an influence of HLA genotype on the longitudinal autoantibody profile. Likewise, male sex has been associated with IAA only as the first autoantibody in children (35). We observed here a predominance of boys among those children with longitudinal autoantibody profiles lacking stable-positive GADA responses, a finding that requires further attention.
As a limitation of our study, longitudinal ZnT8A profiles could not be included in the current clustering analyses because the time series of ZnT8A measurements were incomplete; including them would have considerably reduced our sample size. However, we considered the overall ZnT8A status of each child in our analysis. As expected, this revealed that some children in our single-autoantibody clusters in fact had developed ZnT8A as a second positive β-cell autoantibody. The strongest effect was found in the small cluster sC4, characterized by stable-positive IA-2A; in this cluster, four of five children (all male and carrying HLA-DR4) were ZnT8A-positive, and two have progressed to clinical diabetes. This illustrates that certain low-frequency immune patterns could be highly relevant to the disease. With respect to longitudinal GADA patterns, our study of children could underestimate the effect of these patterns on diabetes risk, given that GADA is associated with onset of type 1 diabetes at an older age (40). Another limitation is that the study population was highly selected for HLA-conferred risk of type 1 diabetes (23). Validation is therefore necessary in a study population that is not preselected and in cohorts of individuals who seroconverted to β-cell autoantibodies at an older age in order to ensure the wider applicability of our observations.
Altogether, our data support the notion that gene-environment interactions influence the individual pattern of β-cell autoantibodies (i.e., the pattern of main target autoantigens), the timing of their appearance, dynamic changes over time, and progression to diabetes. It is possible that certain disease-promoting factors or conditions act on genetically predisposed individuals only within certain age ranges. Identifying such etiological factors could pave the way for new preventive therapies, and we believe that our analytical approach could prove useful in that search.
In conclusion, our novel wavelet-based clustering algorithm allows refined grouping of children who are positive for multiple β-cell autoantibody types. This data-driven approach can identify groups of children with distinct progression to clinical type 1 diabetes and provides new opportunities for elucidating complex disease mechanisms.
Article Information
Acknowledgments. The authors especially acknowledge all families participating in the TEDDY Study.
Funding. The TEDDY study is funded by the National Institute of Diabetes and Digestive and Kidney Diseases (U01-DK-63829, U01-DK-63861, U01-DK-63821, U01-DK-63865, U01-DK-63863, U01-DK-63836, U01-DK-63790, UC4-DK-63829, UC4-DK-63861, UC4-DK-63821, UC4-DK-63865, UC4-DK-63863, UC4-DK-63836, UC4-DK-95300, UC4-DK-100238, UC4-DK-106955, UC4-DK-112243, UC4-DK-117483, and contract no. HHSN267200700014C), National Institute of Allergy and Infectious Diseases, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institute of Environmental Health Sciences, JDRF, Centers for Disease Control and Prevention, and National Institutes of Health Clinical Center. This work was supported in part by the National Institutes of Health/National Center for Advancing Translational Sciences Clinical and Translational Science Awards to the University of Florida (UL1-TR-000064) and the University of Colorado (UL1-TR-001082). The funders had no impact on the design, implementation, analysis, and interpretation of the data.
Duality of Interest. No potential conflicts of interest relevant to this article were reported.
Author Contributions. D.E. performed the analysis. D.E. and W.z.C. developed the algorithm. D.E., W.z.C., E.B., M.R., W.A.H., J.-X.S., Å.L., J.T., K.V., A.J.K.W., L.Y., B.A., J.P.K., A.-G.Z., and P.A. attest to meeting the International Committee of Medical Journal Editors uniform requirements for authorship by making substantial contributions to conceiving and designing this paper; acquiring, analyzing, and interpreting the data; drafting or revising the article for intellectual content; and giving final approval of the published version. D.E., W.z.C., and P.A. interpreted the findings and wrote the manuscript. W.z.C. and P.A. proposed the analysis. E.B., M.R., W.A.H., J.-X.S., Å.L., J.T., K.V., A.J.K.W., L.Y., B.A., J.P.K., and A.-G.Z. acquired, analyzed, or interpreted data and reviewed and edited the manuscript for intellectual content. M.R., W.A.H., J.-X.S., Å.L., J.T., B.A., J.P.K., and A.-G.Z. designed the TEDDY Study. W.z.C. and P.A. are the guarantors of this work and, as such, had full access to all the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.