Plasma protein N-glycan profiling integrates information on enzymatic protein glycosylation, which is a highly controlled ubiquitous posttranslational modification. Here we investigate the ability of the plasma N-glycome to predict incidence of type 2 diabetes and cardiovascular diseases (CVDs; i.e., myocardial infarction and stroke).
Based on the European Prospective Investigation into Cancer and Nutrition (EPIC)-Potsdam cohort (n = 27,548), we constructed case-cohorts including a random subsample of 2,500 participants and all physician-verified incident cases of type 2 diabetes (n = 820; median follow-up time 6.5 years) and CVD (n = 508; median follow-up time 8.2 years). Information on the relative abundance of 39 N-glycan groups in baseline plasma samples was generated by chromatographic profiling. We selected predictive N-glycans for type 2 diabetes and CVD separately, based on cross-validated machine learning, nonlinear model building, and construction of weighted prediction scores. For CVD, this workflow was applied separately in men and women.
The N-glycan–based type 2 diabetes score was strongly predictive for diabetes risk in an internal validation cohort (weighted C-index 0.83, 95% CI 0.78–0.88), and this finding was externally validated in the Finland Cardiovascular Risk Study (FINRISK) cohort. N-glycans were moderately predictive for CVD incidence (weighted C-indices 0.66, 95% CI 0.60–0.72, for men; 0.64, 95% CI 0.55–0.73, for women). Information on the selected N-glycans improved the accuracy of established and clinically applied risk prediction scores for type 2 diabetes and CVD.
Selected N-glycans improve type 2 diabetes and CVD prediction beyond established risk markers. Plasma protein N-glycan profiling may thus be useful for risk stratification in the context of precisely targeted primary prevention of cardiometabolic diseases.
Cardiometabolic diseases, such as type 2 diabetes and cardiovascular disease (CVD; i.e., myocardial infarction and stroke), constitute a major burden on public health systems. Primary prevention of cardiometabolic end points is therefore particularly advantageous to improve population health. Accurate risk prediction is an important tool to allocate resources for primary prevention to those who will maximally benefit.
Moreover, refined molecular phenotyping may differentiate between subgroups of high-risk individuals with distinct pathophysiological mechanisms of action. Thus, broadening the spectrum of molecular risk markers is a promising step toward precisely tailored prevention concepts. Specifically, biomarkers need to be discovered in biological domains that have thus far been underrepresented in risk prediction efforts.
N-Glycosylation denotes an essential and ubiquitous, complexly regulated, posttranslational modification at defined asparagine residues in target proteins. N-glycosylation enhances both structural and functional protein diversity (1,2) and is fundamentally different from nonenzymatic glycation, reflected through HbA1c levels. Protein glycosylation is essential for self-recognition in biological systems, organization of intermolecular interactions, and determination of the spatial configuration of molecules (3). Plasma protein N-glycosylation patterns are partially genetically determined (4) and remarkably stable within healthy individuals (5). Composition of the plasma N-glycome can be reliably measured in chromatography-based high-throughput profiling approaches (6). Altogether, these facts suggest potential clinical utility of monitoring N-glycan patterns as early indicators of pathogenic processes.
Alterations in N-glycan profiles were cross-sectionally linked to various pathological conditions, such as obesity (7), inflammation (8), type 2 diabetes (9), and cardiovascular risk factors (10), indicating a potential link between N-glycan profiles and cardiometabolic risk. However, powerful prospective investigations that link N-glycomics to the onset of cardiometabolic end points are scarce.
Here we present the first human population study on the predictive value of the total plasma protein N-glycome for future incidence of major cardiometabolic end points. We developed plasma N-glycan–based risk scores to predict type 2 diabetes and CVD in baseline-healthy participants of the European Prospective Investigation into Cancer and Nutrition (EPIC)-Potsdam study and evaluated the predictive performance of these scores in comparison with established risk prediction tools.
Research Design and Methods
The prospective EPIC-Potsdam cohort study includes 27,548 participants (16,644 women and 10,904 men), who were recruited within an age range of 35–65 years from the general population between 1994 and 1998 (11). Baseline assessments comprised anthropometric measures and blood sampling by qualified medical personnel. Blood pressure was measured with the individual sitting with the arm elevated at heart level by using oscillometric devices (BOSO-Oscillomat; Bosch & Sohn, Jungingen, Germany), and the mean of the second and third reading was used. Lifestyle, sociodemographic characteristics, and current health status were assessed with validated questionnaires and in face-to-face interviews (12). Participants were then actively contacted by sending out questionnaires and, if necessary, by telephone every 2–3 years with response rates between 90 and 96% per follow-up round (13). All participants gave informed consent for biomedical research, and the study was approved by the Ethics Committee of the State of Brandenburg, Germany (11).
Nested case-cohorts were constructed for efficient studies into molecular phenotypes (Supplementary Fig. 1). From all participants who provided blood at baseline (n = 26,437), a random sample (subcohort, n = 2,500) was drawn. In addition, all incident diabetes and CVD cases that occurred in the full cohort until a specified censoring date were included. When the oversampling of cases is statistically accounted for, this design provides unbiased risk estimates for the full cohort.
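The weighting idea behind the case-cohort design can be sketched as follows. This is a minimal illustration that assumes inverse-sampling-probability weights, one common choice for case-cohort data. The weighting scheme actually used in this study is not specified in this section, and the function name and example calls are hypothetical.

```python
def case_cohort_weight(is_case: bool, in_subcohort: bool, sampling_fraction: float) -> float:
    """Return an analysis weight for one participant of a case-cohort sample.

    Assumption (not taken from the study): every incident case of the full
    cohort is sampled and gets weight 1, whereas non-cases enter only through
    the random subcohort and each represents 1 / sampling_fraction non-cases.
    """
    if is_case:
        return 1.0
    if in_subcohort:
        return 1.0 / sampling_fraction
    raise ValueError("non-cases outside the subcohort are not part of the sample")


# Subcohort sampling fraction implied by the numbers reported above.
SAMPLING_FRACTION = 2500 / 26437

weights = [
    case_cohort_weight(is_case=True, in_subcohort=False, sampling_fraction=SAMPLING_FRACTION),
    case_cohort_weight(is_case=True, in_subcohort=True, sampling_fraction=SAMPLING_FRACTION),
    case_cohort_weight(is_case=False, in_subcohort=True, sampling_fraction=SAMPLING_FRACTION),
]
```

Under this assumption, each subcohort non-case stands for roughly 10.6 non-cases of the full cohort, which is what keeps the oversampled cases from inflating risk estimates.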
For type 2 diabetes, the censoring date was 31 August 2005 (820 incident cases). After exclusion of participants with missing follow-up information, prevalent diabetes at recruitment, insufficient blood specimens, or nonverifiable information on diabetes incidence, the analytical sample comprised 2,813 participants (1,578 women and 1,235 men), including 743 participants with incident type 2 diabetes, of whom 64 were randomly included in the subcohort. The median follow-up time for type 2 diabetes was 6.5 years (interquartile range 6–8.6 years). This study sample was split into a training set (two-thirds of the population) and a validation set (one-third of the population) by random selection within strata defined by sex and case status.
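A stratified two-thirds/one-third split of this kind can be sketched as below. The record layout (keys `id`, `sex`, `case`), the seed, and the toy data are assumptions for illustration. The original analysis was carried out in SAS and R, not with this code.

```python
import random
from collections import defaultdict


def stratified_split(participants, train_fraction=2 / 3, seed=0):
    """Split participants into training and validation sets within strata.

    Each participant is a dict with the hypothetical keys 'id', 'sex', 'case'.
    Strata are defined by (sex, case status), as described in the text.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for p in participants:
        strata[(p["sex"], p["case"])].append(p)

    train, validation = [], []
    for key in sorted(strata):
        members = strata[key][:]
        rng.shuffle(members)  # random selection within the stratum
        n_train = round(len(members) * train_fraction)
        train.extend(members[:n_train])
        validation.extend(members[n_train:])
    return train, validation


# Toy data: 30 participants per stratum (not study data).
toy = [
    {"id": i, "sex": sex, "case": case}
    for i, (sex, case) in enumerate(
        (s, c) for s in ("F", "M") for c in (False, True) for _ in range(30)
    )
]
train, validation = stratified_split(toy)
```

Splitting within strata keeps the sex distribution and the case fraction nearly identical in both sets, so validation performance is not distorted by a chance imbalance.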
For CVD, 508 incident primary cardiovascular events occurred until the censoring date (30 November 2006). After equivalent exclusions (using prevalent and nonverifiable CVD instead of diabetes as the exclusion criterion), the CVD sample comprised 2,558 participants (1,481 women and 1,077 men), including 418 participants with incident CVD (44 randomly included in the subcohort). The median follow-up time for CVD was 8.3 years (interquartile range 7.5–9.2 years).
Potential incident cases of diabetes and CVD were identified from self-reports of a disease diagnosis, disease-related medication, or dietary treatment during the regular follow-up procedure, or from other health-related data sources (such as death certificates for CVD). For all participants with a potential incident end point, the diagnosis was verified by contacting the treating primary care physician. Only participants with a physician-verified diagnosis of type 2 diabetes (ICD-10 code: E11), myocardial infarction, or stroke (ICD-10 codes: I21 for acute myocardial infarction, I63.0–I63.9 for ischemic stroke, I61.0–I61.9 for intracerebral hemorrhage, I60.0–I60.9 for subarachnoid hemorrhage, and I64.0–I64.9 for unspecified stroke) and a diagnosis date after the baseline examination were considered incident cases. For CVD, the World Health Organization MONICA (MONItoring Trends and Determinants on CArdiovascular Diseases) criteria were applied to classify events into possible, probable, and definite (for further details see Supplementary Note 1).
Individual plasma samples (10 μL) were denatured by incubation with SDS (Invitrogen) at 65°C for 10 min. N-Glycans were then released with the addition of 1.2 units of peptide:N-glycosidase F (Promega) and overnight incubation at 37°C (14). The released N-glycans were labeled with 2-aminobenzamide (Sigma-Aldrich), separated by hydrophilic interaction liquid chromatography on an Acquity Ultra-Performance Liquid Chromatography (UPLC Technology) instrument (Waters), and quantified with a fluorescence detector. Data were processed using an automated integration method (15), and chromatograms were separated into 39 glycan peaks (GPs), where the integrated area under each peak relative to the total integrated area was used as a measure of relative abundance of the designated N-glycan group (Supplementary Note 2 and Supplementary Fig. 2). Supplementary Table 1 matches the detected peaks to the most abundant glycan structures. Samples were randomly distributed across batches, and laboratory personnel were blinded to case status. Total cholesterol, HDL cholesterol, random glucose, and HbA1c were measured using an automatic ADVIA 1650 analyzer (Siemens Medical Solutions, Erlangen, Germany) at the University Clinic Tübingen, Tübingen, Germany.
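The relative-abundance measure described above (area under each peak divided by the total integrated area) amounts to a simple normalization. The sketch below illustrates it. The function name and the three toy peak areas are hypothetical, and the automated integration step itself (15) is not reproduced.

```python
def relative_abundance(peak_areas):
    """Convert integrated peak areas to percentages of the total integrated area."""
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("total integrated area must be positive")
    return {peak: 100.0 * area / total for peak, area in peak_areas.items()}


# Toy chromatogram with three peaks (hypothetical areas, arbitrary units).
toy_areas = {"GP1": 2.0, "GP2": 6.0, "GP3": 12.0}
toy_percent = relative_abundance(toy_areas)
```

Because every peak is expressed as a share of the same total, the 39 values of one sample always sum to 100%, so an increase in one glycan group necessarily lowers the relative abundance of the others.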
First, N-glycan data were adjusted for age by applying multiple fractional polynomial (MFP) regression separately in men and women. The age-adjusted residuals were used to construct an N-glycan score to predict type 2 diabetes by applying the following steps (Supplementary Fig. 3 and Supplementary Note 3):
1. Machine learning–based preselection: N-glycan type 2 diabetes risk markers were preselected by applying random survival forest (RSF) in combination with a backward selection procedure in the training set (16).
2. Building the model: MFP testing in Cox proportional hazards models was used to omit redundant information and select the optimal power transformation for nonlinear associations (17).
3. Deriving a weighted score: All GPs were variance standardized (mean = 0, SD = 1). Betas from the Cox model from step 2 were used as weights to build the glycan-based type 2 diabetes prediction score (GST2D).
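The score construction in the last step can be sketched as a beta-weighted sum of standardized predictors. This is only an illustration under stated assumptions: the predictor names, values, and coefficients are hypothetical and are not the published GST2D weights, and any power transformation is assumed to have been applied before standardization.

```python
from statistics import mean, stdev


def standardize(values):
    """Scale values to mean 0 and SD 1 (sample standard deviation)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def weighted_score(predictors, betas):
    """Sum beta-weighted, standardized predictors for each participant.

    predictors: dict mapping predictor name -> list of (already transformed) values
    betas:      dict mapping predictor name -> Cox regression coefficient
    """
    standardized = {name: standardize(vals) for name, vals in predictors.items()}
    n = len(next(iter(standardized.values())))
    return [
        sum(betas[name] * standardized[name][i] for name in betas)
        for i in range(n)
    ]


# Hypothetical values and coefficients, for illustration only.
toy_predictors = {"GP_a": [1.0, 2.0, 3.0], "GP_b": [10.0, 30.0, 20.0]}
toy_betas = {"GP_a": 0.5, "GP_b": -0.2}
toy_scores = weighted_score(toy_predictors, toy_betas)
```

Standardizing first puts all glycan peaks on the same scale, so each beta can be read as the log hazard ratio per SD of the corresponding predictor.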
Type 2 diabetes prediction with the GST2D was evaluated in the independent validation cohort. Several models were assessed for their ability to predict type 2 diabetes incidence:
an N-glycan score based on age-adjusted, type 2 diabetes–related N-glycans (GST2D)
a weighted combination of the GST2D and age (GST2D+age)
plasma HbA1c and random glucose concentrations (GlucMarkers)
a weighted combination of the GST2D and GDRS (GST2D+GDRS)
a weighted combination of the GST2D and GlucMarkers (GST2D+GlucMarkers)
The German Diabetes Risk Score (GDRS) relies on information on age, sex, anthropometry (waist circumference and height), lifestyle (physical activity, smoking status, and consumption of coffee, red meat, and whole grain), hypertension, and family history of diabetes.
The CVD analyses were stratified by sex and did not include an internal validation step because of power considerations. Based on the hypothesis of shared risk factors between diabetes and CVD, we first tested whether the GST2D was also linked to cardiovascular risk. In parallel, an exploratory selection of potential CVD-specific predictors was conducted analogous to the selection of diabetes-related N-glycans, with minor modifications (Supplementary Fig. 4):
1. Machine learning–based preselection: RSF selection was applied in men and women separately.
2. Building the model: MFP was used to combine exploratory selected N-glycans from step 1 with the GST2D (only included if significantly associated with CVD risk) allowing for nonlinearity.
3. Deriving a weighted score: Combination of predictors into a weighted score was only applied if more than one predictor was selected.
The recalibrated version of the American Heart Association (AHA) score (20) was used as an established prediction tool for comparison. The AHA score includes information on age, total and HDL cholesterol, systolic blood pressure, antihypertensive medication, smoking, and prevalent type 2 diabetes and relies on sex-specific risk equations.
To evaluate model predictions, the receiver operating characteristics (ROC) curve was used to visualize adequacy of model predictions, weighted C-indices (accounting for the case-cohort design) were estimated to evaluate the model discrimination between cases and noncases, and the weighted continuous net reclassification index (NRI) was estimated to evaluate whether additional information on the N-glycans improved disease prediction with established risk prediction tools (21).
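As a rough illustration of what a weighted C-index measures, the sketch below computes a pairwise (Harrell-type) concordance in which each comparable pair counts with the product of the two participants' sampling weights. This pair weighting is an assumption made for illustration. It is not the estimator implemented in the SAS macros used for this study, and the toy data are hypothetical.

```python
def weighted_c_index(times, events, scores, weights):
    """Weighted concordance between risk scores and observed event times.

    A pair (i, j) is comparable if participant i had an event before the
    follow-up time of participant j. It is concordant if i also has the
    higher score; tied scores count as half-concordant.
    """
    concordant = 0.0
    comparable = 0.0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored participants cannot be the earlier event
        for j in range(n):
            if times[i] < times[j]:
                pair_weight = weights[i] * weights[j]
                comparable += pair_weight
                if scores[i] > scores[j]:
                    concordant += pair_weight
                elif scores[i] == scores[j]:
                    concordant += 0.5 * pair_weight
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: two cases (weight 1) and two subcohort non-cases (weight 10).
toy_times = [2.0, 4.0, 6.0, 8.0]
toy_events = [True, True, False, False]
toy_risk = [3.0, 1.0, 2.0, 0.5]
toy_weights = [1.0, 1.0, 10.0, 10.0]
toy_c = weighted_c_index(toy_times, toy_events, toy_risk, toy_weights)
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking. The weights matter because each subcohort non-case stands in for many non-cases of the full cohort.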
Statistical analyses were performed in SAS 9.4 and R 3.3.2 software. For RSFs we used the randomForestSRC package in R (https://cran.r-project.org/web/packages/randomForestSRC/index.html). For MFP modeling we used a modified version of the SAS macro by Sauerbrei et al. (https://mfp.imbi.uni-freiburg.de/software). For the estimation of weighted C-indices and weighted NRIs, we used SAS macros by Cook et al. (https://ncook.bwh.harvard.edu/sas-macros.html).
The distribution of baseline characteristics in EPIC-Potsdam participants is provided in Table 1. Among participants with incident cardiometabolic end points, median age, waist circumference, and total cholesterol were higher, while median height, HDL cholesterol plasma concentrations, and whole grain consumption were lower compared with the cohort-representative subcohort. Hypertension was more prevalent among participants with incident cardiometabolic diseases. Current smoking at baseline was most prevalent in participants with incident CVD in the study course. The distribution of total plasma protein N-glycans at baseline in the EPIC-Potsdam cohort and age-adjusted and sex-stratified intercorrelation of the assessed N-glycans are shown across participants with and without incident cardiometabolic diseases in Supplementary Table 2 and Supplementary Figs. 5–7. Sex-stratified and age-adjusted correlations of GPs with blood lipids (total and HDL cholesterol and triglycerides) are reported in Supplementary Table 3.
Glycan-Based Prediction of Type 2 Diabetes Incidence
Stepwise RSF selection in the training set indicated that 8 of 39 N-glycans were informative with regard to time to type 2 diabetes incidence, namely, GP10, GP18, GP20, GP22, GP26, GP32, GP35, and GP36 (Supplementary Fig. 8). MFP modeling included GP20, GP26, and GP36 as linear predictors and GP18, GP32, and GP35 as nonlinear terms (power-transformed GP18, GP32², and power-transformed GP35), while GP10 and GP22 were omitted. After influential observations were excluded, this model was used to estimate standardized betas, which served as the weights of the N-glycan score for type 2 diabetes prediction (GST2D).
In both the training and the validation cohort, GP36 was related to lower relative risk of type 2 diabetes, whereas GP20 and GP26 were linked to higher risk. For GP18 and GP35, the power transformation reversed the order of values, so that the positive and negative risk estimate for the transformed terms translate into lower GP18-related and higher GP35-related type 2 diabetes risk with higher values on the original scale. GP32 was related to lower risk in the training cohort but was noninformative for type 2 diabetes risk in the validation cohort. The type 2 diabetes hazard ratio (HR) for 1 SD higher GST2D in the validation cohort was 3.73 (95% CI 3.0–4.63) (Table 2).
ROC curves of the GST2D and GST2D+age are displayed in Fig. 1, in comparison with the GDRS (Fig. 1A) and established clinical type 2 diabetes risk markers (HbA1c and blood glucose concentrations) (Fig. 1B) in the validation cohort. The C-index of 0.83 (95% CI 0.78–0.88) for type 2 diabetes prediction with GST2D in the validation cohort indicated strong discriminatory ability of the selected age-adjusted plasma N-glycans alone. Addition of age (GST2D+age) improved the N-glycan–based type 2 diabetes prediction (C-index 0.87, 95% CI 0.82–0.91), which was comparable to the predictive performance of the GDRS (C-index 0.88, 95% CI 0.82–0.92). A weighted combination of the two scores (GST2D+GDRS) conferred better discrimination, as indicated by a C-index of 0.9 (95% CI 0.85–0.94). Improvement of the model predictions by GST2D+GDRS compared with the reference model (GDRS) was corroborated by a positive NRI of 0.62 (95% CI 0.53–0.71). Comparison of the reclassification indices among participants with incident type 2 diabetes (RIcases 0.42, 95% CI 0.37–0.46) and without type 2 diabetes incidence (RInoncases 0.2, 95% CI 0.19–0.22) clarified that the net reclassification improvement was predominantly driven by assignment of higher risk to cases (i.e., improved model sensitivity) (Table 3).
Similarly, the combination of GST2D with HbA1c and glucose (C-index 0.91, 95% CI 0.87–0.95) outperformed type 2 diabetes prediction with these established clinical risk markers alone (C-index 0.88, 95% CI 0.83–0.92). Again, net reclassification analyses underpinned the added predictive value of glycans compared with HbA1c+glucose alone (NRI 0.87, 95% CI 0.79–0.94) and clarified that this model improvement was dominated by higher sensitivity (i.e., assignment of higher risk to participants with incident type 2 diabetes) (Table 3).
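The continuous NRI is the sum of the reclassification index among cases and the one among noncases, which is consistent with the values reported for GST2D+GDRS (0.42 + 0.20 = 0.62). A minimal sketch of this decomposition follows. The function name and the toy risks are hypothetical, and the optional weights stand in for case-cohort weights without reproducing the SAS macros used in this study.

```python
def continuous_nri(old_risk, new_risk, is_case, weights=None):
    """Return (RI_cases, RI_noncases, NRI) for two sets of predicted risks.

    RI_cases    = P(risk goes up | case)      - P(risk goes down | case)
    RI_noncases = P(risk goes down | noncase) - P(risk goes up | noncase)
    """
    if weights is None:
        weights = [1.0] * len(old_risk)
    up = {True: 0.0, False: 0.0}
    down = {True: 0.0, False: 0.0}
    total = {True: 0.0, False: 0.0}
    for old, new, case, w in zip(old_risk, new_risk, is_case, weights):
        case = bool(case)
        total[case] += w
        if new > old:
            up[case] += w
        elif new < old:
            down[case] += w
    ri_cases = (up[True] - down[True]) / total[True]
    ri_noncases = (down[False] - up[False]) / total[False]
    return ri_cases, ri_noncases, ri_cases + ri_noncases


# Toy predicted risks for four cases followed by four noncases.
toy_old = [0.2, 0.3, 0.4, 0.5, 0.2, 0.2, 0.3, 0.1]
toy_new = [0.3, 0.4, 0.5, 0.4, 0.1, 0.1, 0.3, 0.2]
toy_case = [True, True, True, True, False, False, False, False]
ri_cases, ri_noncases, nri = continuous_nri(toy_old, toy_new, toy_case)
```

A large RI among cases with a small RI among noncases, as observed here, means the added marker mainly moves future cases toward higher predicted risk.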
Glycan-Based Prediction of Incident Cardiovascular Events
We first tested whether the GST2D was also linked to cardiovascular risk in sex-stratified analyses in the case-cohort. In men, higher GST2D was significantly associated with higher risk of cardiovascular events. In addition, the exploratory selection procedure suggested CVD predictive ability of GP16, GP23, and GP29 in men (Supplementary Fig. 9). The association of GP29 was rendered nonsignificant when additionally adjusted for GST2D. Therefore, the N-glycan–based CVD score for men (GSCVD) was constructed as a weighted combination of GP16 (associated with lower CVD risk) and GP23 and GST2D (both associated with higher CVD risk). Higher GSCVD points were associated with a higher risk of cardiovascular events in men in an age-adjusted model (HR per SD 1.89, 95% CI 1.57–2.27) (Table 2). The weighted C-index for GSCVD in men of 0.66 (95% CI 0.6–0.72) suggested moderate discrimination of age-adjusted N-glycans alone. The C-index of 0.71 (95% CI 0.65–0.77) for GSCVD+age corresponds to good discrimination (Fig. 1C). However, discrimination by the AHA score was slightly better (C-index 0.73, 95% CI 0.67–0.78), and the combination of GSCVD+AHA score did not yield a higher C-index (0.72, 95% CI 0.66–0.78). Still, closer inspection of model improvement (GSCVD+AHA score vs. AHA score) with the NRI (0.31, 95% CI 0.24–0.39) suggested improved prediction due to increased specificity (i.e., assignment of lower risk to participants without incident cardiovascular events) (Table 3).
In women, GST2D was not independently associated with CVD risk, and exploratory selection pointed at GP5 as a CVD risk marker (Supplementary Fig. 10). GP5 was inversely associated with incident cardiovascular events in women (HR per SD 0.7, 95% CI 0.57–0.86). The C-index for GP5 (0.64, 95% CI 0.55–0.73) indicated moderate predictive ability of this N-glycan alone, while good model predictions were generated for GP5+age (C-index 0.75, 95% CI 0.66–0.82) (Fig. 1D). Discrimination by the AHA score (C-index 0.74, 95% CI 0.65–0.82) was comparable to GP5+age. A combination of GP5 and the AHA score in women yielded a higher C-index of 0.77 (95% CI 0.69–0.85). The NRI of 0.58 (95% CI 0.49–0.67) corroborated net model improvement by combining GP5 with the AHA score. Higher sensitivity was the main driver of the improved predictions (RIcases 0.5, 95% CI 0.46–0.54) (Table 3).
External Validation and Sensitivity Analyses
We validated the association with type 2 diabetes risk in a case-control study nested within the Finland Cardiovascular Risk Study (FINRISK) cohort, including 37 participants with incident type 2 diabetes and 37 age- and sex-matched participants who remained diabetes free and served as control subjects (the study sample has been described elsewhere). In this sample, 1 SD higher GST2D was associated with a doubling of type 2 diabetes risk (odds ratio 2.03, 95% CI 1.13–3.63), and the C-index of 0.76 (95% CI 0.65–0.87) indicated strong type 2 diabetes prediction over 10 years (Supplementary Fig. 11).
Sensitivity analyses demonstrated that adjustment for lipid lowering and antihypertensive medication did not substantially alter the association of glycan scores with cardiometabolic risk. The CVD results were robust against exclusion of participants with prevalent diabetes at baseline and after restriction to participants with a definite cardiovascular event according to the MONICA criteria. Associations of GSCVD (men) and GP5 (women) did not substantially differ between stroke and myocardial infarction (Supplementary Table 4).
From a panel of 39 N-glycan groups originating from total plasma proteins, 6 N-glycan residuals exhibited strong, robust, and mutually independent associations with type 2 diabetes risk. A score based on these N-glycans was strongly predictive for type 2 diabetes incidence and improved risk prediction with known risk factors. The strong link between the glycan score and type 2 diabetes risk was replicated in internal and external validation studies. Our results further suggested that information on age-adjusted N-glycans alone was moderately predictive for incident CVD and that additional information on N-glycans may improve CVD prediction with the established clinical AHA score. In summary, we provide first-line evidence that plasma N-glycome screening generates valuable markers for the risk of developing type 2 diabetes and CVD.
The cross-sectional link between plasma protein glycosylation patterns and type 2 diabetes and cardiometabolic risk markers is well documented (8,9,22–24), and a small case-control study reported a link between baseline N-glycan profiles and type 2 diabetes risk (22). Our internally validated findings now demonstrate that a plasma N-glycan score, GST2D, accurately predicts type 2 diabetes incidence. Moreover, we externally validated these results in the independent FINRISK study.
Furthermore, we showed that the GST2D improved prediction with the widely applied GDRS (25) and with plasma HbA1c and random blood glucose concentration. The probable source proteins of the type 2 diabetes–related N-glycans comprise (26) α1-acid glycoprotein (branched, highly sialylated GP26, GP32, GP35, and GP36) and transferrin, hemopexin, and α-antitrypsin (GP18 and GP20). Through transport, signaling, and intermolecular interactions, these proteins are involved in different aspects of metabolism and immunity (26) and were previously associated with increased risk of several diseases (27), including type 2 diabetes (28–30) and CVD (31–33). Importantly, altered glycosylation of plasma proteins affects their function. A recent study, for example, showed that incomplete glycosylation of lipoproteins leads to accumulation of cholesterol in cells (34). Prospective human studies with integrated glycomics and glycoproteomics data will be able to address the interaction between protein concentration and glycosylation patterns in relation to disease risk and are thus highly warranted.
In sex-stratified analyses, age-adjusted plasma N-glycans were moderately predictive for CVD incidence. Moreover, N-glycans improved CVD prediction with the AHA score, which combines age with the most established clinical risk factors for CVD. GP5 particularly assigned higher risk to women with incident CVD. Among men, the net reclassification improvement was determined by assigning lower risk to participants without incident cardiovascular events, which may be less relevant from a clinical standpoint. The probable source proteins of exclusively CVD-related GPs are IgG (GP5 and GP16) and IgA (GP23). Hence, type 2 diabetes–related markers were also useful for CVD prediction in men, which may be interpreted as a hint toward pathways related to both end points. The fact that all CVD-specific glycan markers stemmed from Ig’s suggests a particular role of glycosylation-dependent immune response in CVD etiology.
Several studies suggest that protein N-glycosylation is mechanistically involved in type 2 diabetes etiology. For instance, the ST6GAL1 gene has been linked to type 2 diabetes risk, where its risk variant was related to increased expression of the encoded α2,6-sialyltransferase in β-cells and to decreased stimulated insulin secretion (35). It was also recently shown in mouse models that improper sialylation of IgG can cause diabetes (36). Other candidate pathways that may link the N-glycome to cardiometabolic diseases comprise glycosylation-dependent protein trafficking (e.g., surface distribution of GLUTs) (37,38), the hexosamine pathway (39), and adaptive immunity (40). N-glycan–based biomarkers may thus reflect the activity of intracellular pathways that predetermine diabetes onset, and integrated molecular epidemiological and experimental studies to elucidate these pathways are highly encouraged by our observations.
To our knowledge, this is the largest prospective study linking N-glycomics to cardiometabolic end points. Limitations of this study relate to sample size and validation of glycan-based CVD prediction, to the limited ability to test robustness of glycan-based disease prediction across different ethnicities, and to the laboratory measurement approach; these are discussed below.
The link between GST2D and type 2 diabetes risk in the external FINRISK cohort was strong, but the C-index was slightly lower compared with EPIC-Potsdam analyses. However, the matched case-control design, which is not optimal for generating generalizable prediction estimates, and some imprecision due to the relatively small sample size in FINRISK may explain these minor differences. Overall, our results support the generalizable value of plasma N-glycans for type 2 diabetes prediction in populations of central European descent. However, validation of our findings in other races and ethnicities is warranted.
The power to detect CVD-related N-glycans was lower due to the sex-stratified analysis design and slightly fewer incident cases, and we therefore did not split the sample. Cross-validated RSF should still have selected robust predictors. Nonetheless, external validation of our CVD results is warranted.
It is noteworthy that some of the N-glycans were highly correlated so that the selection of other, closely related N-glycans as risk markers may have resulted in similarly accurate predictions. Moreover, we explored N-glycan residuals from total plasma proteins for cardiometabolic risk markers, which implies that the interpretation of our findings remains somewhat speculative with regard to the source proteins. Our observations, for example, suggest a targeted investigation of Ig-specific N-glycosylation patterns in relation to CVD risk.
In this work we have demonstrated that profiling of the total plasma protein N-glycome generates highly informative predictors of the future onset of cardiometabolic end points. The N-glycan–based prediction of type 2 diabetes incidence was stronger, and N-glycan predictors for cardiovascular end points only partially overlapped with those related to type 2 diabetes. Moreover, our results suggest that N-glycan biomarkers improve prediction of type 2 diabetes and CVD when combined with established risk scores. From a biological perspective, specific N-glycan structures likely reflect distinct disease-related pathways. We therefore conclude that plasma N-glycan biomarkers should complement the toolbox of novel biomarkers that are considered for risk stratification in precision prevention approaches.
M.B.S. and G.L. share senior authorship.
Acknowledgments. The authors thank the Human Study Centre (HSC) of the German Institute of Human Nutrition Potsdam-Rehbrücke, namely, the trustee and the data hub, for data processing, the biobank for the processing of biological samples, and Manuela Bergmann, the head of the HSC, for contribution to the study design and leading the underlying processes of data generation. Furthermore, the authors thank all EPIC-Potsdam participants for their invaluable contribution to the study.
Funding. The work was supported by the Federal Ministry of Science, Germany (grant no. 01 EA 9401) and the European Union (grant no. SOC 95201408 05 F02) for the recruitment phase of the EPIC-Potsdam study, by the German Cancer Aid (grant no. 70-2488-Ha I) and the European Community (grant no. SOC 98200769 05 F02) for the follow-up of the EPIC-Potsdam study, and by a grant from the German Federal Ministry of Education and Research (Bundesministerium für Bildung und Forschung) to the German Center for Diabetes Research (DZD, grant 82DZD00302). Plasma glycome analysis was performed in the Genos Glycoscience Research Laboratory and partly supported by the European Union’s Horizon 2020 Framework Programme grants IMforFUTURE (grant agreement no. 721815) and GlySign (grant agreement no. 722095) as well as by the European Structural and Investment Funds IRI grant (no. KK.01.2.1.01.0003), CEKOM grant (no. KK.01.2.2.03.0006), and Croatian National Centre of Research Excellence in Personalized Healthcare grant (no. KK.01.1.1.01.0010).
Duality of Interest. G.L. is the founder and chief executive officer of Genos Ltd, a private research organization that specializes in high-throughput glycomics analysis and has several patents in this field. F.V., J.Š., and O.G. are employees of Genos Ltd. No other potential conflicts of interest relevant to this article were reported.
Author Contributions. C.W., O.G., M.B.S., and G.L. designed the study. T.Š., N.R., F.V., J.Š., and D.R. were involved in laboratory measurements and processing of the glycomics data. C.S. and S.D. contributed to prediction methods. C.W. and O.K. conducted the statistical analysis. C.W. drafted the manuscript. C.W., M.B.S., and G.L. are responsible for the content of the manuscript. All authors contributed to the interpretation of the results and critically revised the manuscript. C.W. is the guarantor of this work and, as such, had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.