A quarter of the world’s population is estimated to meet the criteria for metabolic syndrome (MetS), a cluster of cardiometabolic risk factors that promote development of coronary artery disease and type 2 diabetes, leading to increased risk of premature death and significant health costs. In this study we investigate whether the genetics associated with MetS components mirror their phenotypic clustering. A multivariate approach that leverages genetic correlations of fasting glucose, HDL cholesterol, systolic blood pressure, triglycerides, and waist circumference was used, which revealed that these genetic correlations are best captured by a one-factor genetic model. The common genetic factor genome-wide association study (GWAS) detects 235 associated loci, 174 more than the largest GWAS on MetS to date. Of these loci, 53 (22.5%) overlap with loci identified for two or more MetS components, indicating that MetS is a complex, heterogeneous disorder. Associated loci harbor genes that show increased expression in the brain, especially in GABAergic and dopaminergic neurons. A polygenic risk score derived from the MetS factor GWAS predicts 5.9% of the variance in MetS. These results provide mechanistic insights into the genetics of MetS and suggest candidate drugs, especially fenofibrate, which shows promise for tackling multiple MetS components.
Introduction
Metabolic syndrome (MetS) is defined by the International Diabetes Federation and American Heart Association/National Heart, Lung, and Blood Institute as the presence of three out of five symptoms, including elevated fasting glucose (FG), low HDL cholesterol (HDL-C), high diastolic or systolic blood pressure (SBP), elevated triglycerides (TG), and increased waist circumference (WC) (1). MetS confers increased risk of coronary artery disease and type 2 diabetes (2,3) and is related to elevated risk for cholelithiasis and mental disorders, amongst others (4–6). The prevalence of MetS has increased rapidly among U.S. adults, from 25.3% in 1988–1994 to 38.3% in 2017–2018 (7–9). MetS is caused by a combination of environmental factors, such as a sedentary lifestyle and poor diet, and genetic factors (10), which are not yet completely understood. A single causative etiology for MetS has not been established; both obesity and insulin resistance have been thought to be at the core of MetS (11). Excess adipose tissue (especially abdominal) is known to release products that directly influence hypertension, hyperglycemia, and dyslipidemia (11). On the other hand, insulin-resistant muscle tissue can contribute to the development of MetS components as well (11). Twin heritability estimates of the individual MetS components range between 0.33 (TG) and 0.40 (HDL-C) (12). More recently, the largest genome-wide association study (GWAS) of MetS, by Lind (13), identified 61 genetic loci associated with MetS risk. This GWAS was run on 291,107 individuals with self-reported British descent and European ethnicity. MetS status was based on criteria similar to those described above.
While MetS is often studied based on phenotypic clustering, in this study we focus on genetic clustering of its components, by using structural equation modeling (SEM). SEM is a multivariate statistical analysis technique that combines factor analysis and a structural (path) model (Supplementary Material). Next to its potential to explore latent constructs and their relationships with measured variables, an advantage of SEM is its ability to examine many (complex) relationships simultaneously while accounting for measurement error (14). Combined with genetics, SEM can model the shared genetic architectures across phenotypes into one or more latent factors and identify variants with effects on this shared dimension (15).
It is well accepted and established that MetS, a combination of several risk factors, carries a greater risk for adverse clinical outcomes than a single risk factor (16,17). Furthermore, it is known that when one of the MetS components is present within an individual, e.g., increased WC, the chance that another is present as well is highly elevated. As MetS is not a disease that can be directly measured but, rather, is estimated based on phenotypic clustering of its components, this study exploits SEM to investigate whether genetics associated with the individual components show clustering as well. The following questions are asked: (1) To what extent are the genetic/biological elements for individual MetS components shared? (2) Can we leverage this genetic overlap to detect additional genetic loci associated with MetS? And (3) can we investigate biological features underlying MetS? To answer these questions, we applied genomic SEM (15) on previously published GWAS summary statistics to scrutinize the genetic factor structure underlying the MetS components. Having established one common factor using genomic SEM, we run a GWAS on this common MetS factor, identify functional features of associated single nucleotide polymorphisms (SNPs), and investigate overlap between the common MetS factor and the individual components. By addressing these questions, we aim to provide insight into the genetic architecture underlying MetS.
Research Design and Methods
This study was performed according to a preregistered analysis plan (https://osf.io/kwq27/).
Input Summary Statistics
To maximize sample size, we selected GWAS reflecting a continuous spectrum of the MetS components FG, HDL-C, SBP, TG, and WC. Genomic SEM requires that GWAS be run with genome-wide arrays, so we included no GWAS containing individuals genotyped on a targeted chip such as MetaboChip (15). We selected GWAS run on individuals of European ancestry only. Furthermore, we chose only GWAS that were not corrected for BMI. Correcting for BMI is often performed, as metabolic traits are highly genetically correlated, but in our study it would alter the correlation between WC and the other MetS components and would therefore not properly reflect the underlying genetic structure. Selected GWAS used in our main analyses are outlined in Supplementary Material, and a full description of methods can be found in the individual manuscripts.
Cross Trait Genetic Correlations
Observed scale SNP heritabilities of the MetS components and genetic correlations between the components and other traits and diseases were estimated with LD Score regression (LDSC software) (18) (Supplementary Material). Observed scale SNP heritability was used instead of the customary liability scale heritability for binary phenotypes, for comparative purposes but also because the latter requires sample prevalence of a phenotype, which we could not estimate for the MetS factor. Observed scale SNP heritability assumes an underlying continuous liability, which is likely to be the case in MetS.
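To make concrete why the liability scale needs a sample prevalence, the sketch below applies the standard threshold-model transformation from observed scale to liability scale heritability. It is an illustration only and was not part of the analyses; the function name and the prevalence values in the usage line are hypothetical.

```python
from statistics import NormalDist


def liability_scale_h2(h2_obs: float, pop_prev: float, sample_prev: float) -> float:
    """Convert an observed-scale SNP heritability to the liability scale.

    pop_prev (K) is the population prevalence and sample_prev (P) the
    proportion of cases in the GWAS sample. Both are needed, which is why
    the conversion cannot be applied when P is unknown.
    """
    std_normal = NormalDist()
    # Liability threshold above which an individual is a case.
    threshold = std_normal.inv_cdf(1.0 - pop_prev)
    # Height of the standard normal density at that threshold.
    z = std_normal.pdf(threshold)
    k, p = pop_prev, sample_prev
    return h2_obs * (k * (1.0 - k)) ** 2 / (z ** 2 * p * (1.0 - p))


# Hypothetical inputs, chosen only to show the call.
example = liability_scale_h2(h2_obs=0.10, pop_prev=0.25, sample_prev=0.25)
```

Without a value for `sample_prev` the denominator is undefined, which is the practical obstacle described above for the MetS factor.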
Exploratory and Confirmatory Factor Analysis
We used genomic SEM (15) to model multivariate genetic associations among MetS components based on genetic correlations and SNP heritabilities derived from GWAS summary statistics. Genomic SEM is not biased by sample overlap. In the first stage, the genetic covariance matrix, derived from LDSC, and the associated sampling covariance matrix for the five MetS components were estimated in genomic SEM. Quality control for this step consisted of removing SNPs with a minor allele frequency (MAF) <1% (when available), SNPs with an INFO score <0.9 (when available), SNPs from the MHC region, and SNPs not present in HapMap 3. MAF and INFO scores were not always available in the individual GWAS summary statistics, but filtering SNPs to HapMap 3 should ensure a set of relatively common SNPs of good quality.
First, an exploratory factor analysis (EFA) was performed to investigate how many factors were needed to describe the observed genetic covariance matrix between the five MetS components. For this EFA, promax rotation was used with the factanal function in R (19). A scree plot was generated with the R nFactors package (20). As a one-factor model was suggested, a confirmatory factor analysis was run within genomic SEM to establish how well this one-factor model fitted the data. Model fit, used to evaluate the extent to which the model-implied covariance matrix approximates the empirical, observed covariance matrix, is considered good with comparative fit index values >0.95 (21). Furthermore, standardized root mean square residual values <0.10 are considered acceptable fit and <0.05 good fit (15).
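The comparison between the model-implied and observed matrices can be sketched as follows for a standardized one-factor model. This is a minimal illustration, not the genomic SEM implementation: the loadings are hypothetical, and including the diagonal in the average is one common convention that is assumed here.

```python
import math


def implied_correlations(loadings):
    """Model-implied correlation matrix of a standardized one-factor model.

    Off-diagonal entries are products of the two loadings; the diagonal is 1
    because residual variances absorb what the factor does not explain.
    """
    n = len(loadings)
    return [
        [1.0 if i == j else loadings[i] * loadings[j] for j in range(n)]
        for i in range(n)
    ]


def srmr(observed, implied):
    """Root mean square of residual correlations (lower triangle incl. diagonal)."""
    n = len(observed)
    squared = [
        (observed[i][j] - implied[i][j]) ** 2
        for i in range(n)
        for j in range(i + 1)
    ]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical loadings for three indicators, for illustration only.
implied = implied_correlations([0.5, -0.6, 0.2])
```

A value of zero means the one-factor model reproduces the observed correlations exactly; larger values indicate correlations that the single factor leaves unexplained.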
Common Factor GWAS and Functional Annotation
Using the univariate genetic summary statistics for each of the components as indicators of the MetS genetic common factor, we used genomic SEM to run a GWAS on the MetS factor. Following genomic SEM guidelines, quality control for this step involved restricting to SNPs with an INFO score >0.6 (when available) and to SNPs present in the 1000 Genomes phase 3 European reference data set with a MAF of >1% in the reference panel (22). The qqman package in R was used to generate a Manhattan plot (23). Summary statistics and code are publicly available via https://ctg.cncr.nl/software/summary_statistics. Summary statistics obtained from the common factor MetS GWAS and those from the individual risk factor GWAS were submitted to FUMA v1.3.7 (24) for exploration of functional consequences of all candidate SNPs, with use of default parameters. Additionally, MAGMA, version 1.08 (25), was used in FUMA to perform gene-based, gene-set, and gene-property analyses (Supplementary Material). To further specify expression of identified genes, we used FUMA to run cell-type analyses with expression data from mouse and human brain single-cell RNA–sequencing analyses (26).
Polygenic Prediction
We used common factor MetS and MetS components summary statistics to estimate the variance explained in FG, HDL-C, SBP, TG, and WC measurements as well as MetS diagnosis in an external data set, from the Framingham Heart Study (FHS), using polygenic risk scoring. Quality control is described in Supplementary Material. Because FHS subjects were included in the GWAS of FG, HDL-C, and TG that were used in this study, we used Erase Sample Overlap and Relatedness (EraSOR) (27) to correct for sample overlap (14% for FG and 7% for HDL-C and TG) (Supplementary Material). We then calculated polygenic risk scores (PRS), which consist of the sum of GWAS alleles weighted by their effect sizes, to predict MetS diagnosis and MetS component measurements in FHS, with PRSice-2 (28). We used summary statistics from the MetS components (adjusted with EraSOR for FG, HDL-C, and TG because of sample overlap with FHS), the EraSOR-adjusted MetS factor, and the Lind GWAS, with default parameters. Sex, age², and the first 20 principal components were added as covariates. Finally, we investigated the ability to predict MetS diagnosis with a logistic regression model containing the MetS factor PRS and covariates and a model with all the MetS components PRS and covariates. We calculated Nagelkerke R2 and compared the performance of the two models with a likelihood ratio test using the rms package in R (20,29).
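The definition of a PRS as a sum of alleles weighted by effect sizes can be written out in a few lines. The sketch below is a simplified illustration with hypothetical SNP identifiers and weights; it omits the clumping and P value thresholding that PRSice-2 performs.

```python
def polygenic_score(dosages, weights):
    """Sum of effect-allele dosages (0 to 2) weighted by GWAS effect sizes.

    Only SNPs present in both the genotype data and the summary statistics
    contribute to the score.
    """
    shared = dosages.keys() & weights.keys()
    return sum(dosages[snp] * weights[snp] for snp in shared)


# Hypothetical individual and effect sizes, for illustration only.
person = {"rsA": 2, "rsB": 1, "rsC": 0}
effects = {"rsA": 0.10, "rsB": -0.20, "rsD": 0.50}
score = polygenic_score(person, effects)
```

Here "rsC" and "rsD" are ignored because each appears in only one of the two inputs.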
Drug Gene Set Analysis
For identification of drugs associated with MetS that may be candidates for treatment, genetically informed drug repurposing was performed with use of DRUg Gene SEt Analysis (DRUGSEA) (30), a software tool for performing drug–gene set analysis. DRUGSEA tests drug-phenotype associations using competitive gene set analysis in MAGMA (25) (Supplementary Material).
Data and Resource Availability
The data sets generated and analyzed during the current study are available from the CTG Lab repository (https://ctg.cncr.nl/software/summary_statistics/). The resources used during the current study are available at the following locations: summary statistics, FG https://magicinvestigators.org/downloads/, HDL-C and TG https://csg.sph.umich.edu/willer/public/lipids2010/, SBP https://atlas.ctglab.nl/traitDB/3380, WC https://atlas.ctglab.nl/traitDB/3185, coronavirus disease 2019 (COVID-19) https://www.covid19hg.org/results/, coronary artery disease and cholelithiasis https://www.leelabsg.org/resources/, type 2 diabetes https://diagram-consortium.org/downloads.html, MetS GWAS by Lind https://www.ebi.ac.uk/gwas/efotraits/EFO_0000195, schizophrenia https://walters.psycm.cf.ac.uk/, other psychiatric traits https://www.med.unc.edu/pgc/download-results/, smoking initiation and alcoholic drinks per week https://conservancy.umn.edu/handle/11299/201564, insomnia https://ctg.cncr.nl/software/summary_statistics, age at first birth, number of children, and educational attainment https://www.thessgac.com/, and height https://portals.broadinstitute.org/collaboration/giant/index.php/GIANT_consortium_data_files#GWAS_Anthropometric_2014_Height_Summary_Statistics. Software includes LD Score regression, https://github.com/bulik/ldsc; genomic SEM, https://github.com/genomicsem/genomicsem; linkage disequilibrium information 1000 Genomes EUR phase 3 and HapMap 3 SNP list, https://utexas.app.box.com/s/vkd36n197m8klbaio3yzoxsee6sxo11v; FUMA, https://fuma.ctglab.nl/; MAGMA, https://ctg.cncr.nl/software/magma; EraSOR, https://choishingwan.gitlab.io/EraSOR/; and PRSice-2, https://choishingwan.github.io/PRS-Tutorial/prsice/.
Results
Genetic Correlations Between MetS Components
Genetic correlation is a quantitative parameter that describes the genetic relationship between two traits or between two GWAS on the same trait. Genetic correlations between GWAS summary statistics of the MetS components (Table 1 and Supplementary Table 1), calculated with LD Score regression (LDSC) (18), are negative for HDL-C, as expected, as decreased HDL-C is a MetS component (11), and positive for all others. The strongest correlation is between HDL-C and TG (rG = −0.60 ± 0.07) (Fig. 1A and Supplementary Tables 2 and 3). SBP has the lowest correlations with the other components, consistent with suggestions that blood pressure is less “metabolic” than other components (11). Despite the varying magnitude of genetic correlation between MetS components, all except one correlation are significant (Fig. 1A and Supplementary Table 3), suggesting that a shared genetic structure underlies MetS.
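As a reminder of how this parameter relates to heritability, the genetic correlation is the genetic covariance scaled by the square root of the product of the two SNP heritabilities. The sketch below is illustrative only: the covariance is backed out from the reported rG and the Table 1 heritabilities rather than taken from the LDSC output, so it should be read as an approximation.

```python
import math


def genetic_correlation(cov_g: float, h2_a: float, h2_b: float) -> float:
    """Genetic covariance divided by the geometric mean of the heritabilities."""
    return cov_g / math.sqrt(h2_a * h2_b)


# Heritabilities from Table 1 (HDL-C and TG) and the reported rG of -0.60.
h2_hdl, h2_tg, reported_rg = 0.11, 0.15, -0.60

# Genetic covariance implied by those three numbers (derived, not reported).
implied_cov = reported_rg * math.sqrt(h2_hdl * h2_tg)

# Plugging the implied covariance back in recovers the reported correlation.
recovered_rg = genetic_correlation(implied_cov, h2_hdl, h2_tg)
```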
Genetic correlations between MetS components, confirmatory factor analysis, and common factor GWAS of MetS. A: Genetic correlations between the MetS components, calculated with LDSC. *Genetic correlation between two components is significant at the multiple testing threshold (P < 0.005 [0.05/10 correlations]). B: Path diagram of the standardized common factor model estimated with genomic SEM. SEs are shown within parentheses. The subscript G is used to indicate genetic variables. U reflects the variance in the MetS components not explained by the common factor. C: Manhattan plot of the MetS factor GWAS, with the negative log10-transformed P value on the y-axis. The red line indicates the genome-wide significance threshold. For each chromosome with a lead SNP with a P value <1e−20, the protein-coding gene closest to the strongest SNP is shown (one per chromosome). The y-axis is limited to −log10(P) = 100 to improve visualization (maximum −log10(P) = 151.29 for the locus near FTO).
Selected summary statistics of MetS components
Trait | N | Genomic risk loci | SNP heritability, h2 (SE) | Reference |
---|---|---|---|---|
FG | 46,186 | 13 | 0.09 (0.02) | Dupuis et al. (2010) (54) |
HDL-C | 99,000 | 50 | 0.11 (0.02) | Teslovich et al. (2010) (55) |
SBP | 361,402 | 255 | 0.13 (0.01) | Watanabe et al. (2019) (52) |
TG | 96,598 | 31 | 0.15 (0.03) | Teslovich et al. (2010) (55) |
WC | 385,932 | 367 | 0.20 (0.01) | Watanabe et al. (2019) (52) |
Factor Analyses
To investigate how many latent factors should be constructed, an EFA was performed. Results from this analysis suggest retaining one factor (Supplementary Fig. 4A), which corresponds well with reported phenotypic factor models of MetS (31). A follow-up confirmatory factor analysis shows adequate fit to the data [χ2(5) = 28.42, P = 3.01e−5, comparative fit index = 0.95, standardized root mean square residual = 0.058]. This model (hereafter referred to as “the MetS factor”) was selected as our final factor model. Factor loadings are relatively high (0.45–0.70) for HDL-C, TG, WC, and FG and lower but still significant for SBP (0.22, P = 2.69e−21) (Fig. 1B and Supplementary Table 4).
Common Factor GWAS
We then used the MetS factor, summarizing the genetic variance shared between the five MetS components, to identify SNPs associated with MetS. We employed genomic SEM, which can be used even in the case of uneven sample sizes or sample overlap (15), to conduct a common factor GWAS on the MetS factor; i.e., we estimated individual SNP effects, using an effective sample size of 461,920 (as estimated with genomic SEM). We identify 6,718 genome-wide significant (P < 5e−8) SNPs (Fig. 1C) tagging 318 independent (r2 < 0.1) lead SNPs located in 235 genomic risk loci (Supplementary Table 5). The estimated LDSC intercept indicates no evidence of bias due to uncorrected population stratification (intercept 0.97 [SE 0.02]) (Supplementary Fig. 4C).
Total observed scale SNP heritability (h2) for our MetS factor is 0.14 (SE 0.01), compared with an observed scale h2 of 0.09 (SE 0.005) from the largest GWAS on MetS to date, by Lind (13). This previous GWAS, which was based on phenotypic assessment (the presence of three or more of five components) of binarized MetS, yielded 61 loci (13), of which we replicated 35 (Supplementary Table 6).
Functional Annotation
Because SNPs affect phenotypes through influencing gene expression or protein structure, we mapped SNP-level results to genes in four ways. FUMA (24) was used to annotate individual significant SNPs to genes through 1) positional mapping, coupling SNPs to 845 genes by genomic location; 2) cis- and trans-expression quantitative trait locus mapping, linking SNPs to 2,186 genes of which the expression may be influenced; and 3) chromatin interaction mapping, linking SNPs to 2,636 genes with which they have three-dimensional DNA-DNA interactions (see Research Design and Methods). Overall, FUMA implicates 3,693 mapped genes (Supplementary Table 7). Additionally, as a fourth method, MAGMA (25) was used to perform a gene-based association study. SNPs were mapped to 518 significantly associated genes (P < 2.82e−6 [0.05/17,706 genes]) (Supplementary Fig. 4B and Supplementary Table 8). A total of 328 genes were mapped with all four methods (Supplementary Table 9). Of those, ABCA1 was especially notable because it was mapped by four separate loci on four different chromosomes. ABCA1 aids HDL-C in transporting excess cholesterol and has a function in TG metabolism and blood glucose homeostasis (32,33).
As genes do not function in isolation, we ran a competitive gene set analysis in MAGMA, which tests whether the mean association of genes within a gene set with the genetic MetS factor is stronger than that of genes not represented in that gene set. Fifteen gene sets are significantly enriched at P < 3.23e−6 (0.05/15,481 gene sets) (Fig. 2A and Supplementary Table 10). Six of these are involved in lipoprotein particle remodeling. Because there is considerable overlap between these gene sets, we used a series of conditional gene set analyses to better characterize these associations (see Research Design and Methods and Supplementary Table 10). This indicates that there were five independently associated gene sets: TG-rich lipoprotein particle remodeling, DNA binding, DNA repair after ultraviolet radiation, and negative and positive regulation of the biosynthetic process (Supplementary Table 10). Most results are in line with findings from earlier MetS studies (13,34,35).
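The competitive null hypothesis (genes in the set are no more strongly associated than genes outside it) can be illustrated with a deliberately simplified two-group comparison of gene-level Z scores. This is not MAGMA's model, which is regression based and adjusts for further gene properties; the gene names and Z scores below are hypothetical.

```python
import math
from statistics import mean, variance


def competitive_set_statistic(gene_z, gene_set):
    """Welch-type statistic comparing mean gene Z inside versus outside a set.

    Requires at least two genes in each group so that both sample variances
    are defined. Larger positive values mean stronger association within the set.
    """
    inside = [z for gene, z in gene_z.items() if gene in gene_set]
    outside = [z for gene, z in gene_z.items() if gene not in gene_set]
    difference = mean(inside) - mean(outside)
    standard_error = math.sqrt(
        variance(inside) / len(inside) + variance(outside) / len(outside)
    )
    return difference / standard_error


# Hypothetical gene-level Z scores and gene set, for illustration only.
toy_z = {"G1": 3.0, "G2": 4.0, "G3": 0.0, "G4": 1.0}
toy_stat = competitive_set_statistic(toy_z, {"G1", "G2"})
```

The key design point carried over from the competitive framework is that the reference group is the rest of the genome rather than a null of no association at all.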
Gene-set, tissue, and cell-type enrichment analyses. Bonferroni-corrected significant analyses (horizontal line, the number of tests differs per analysis) are depicted in blue. For each analysis, the top 20 results are shown. y-axes show the −log10-transformed P values of association. A: MAGMA gene set analysis using 15,481 gene sets from MSigDB v6.2. B: Gene expression profiles of MetS factor–associated genes obtained from MAGMA gene property analysis using 54 tissues from the Genotype-Tissue Expression (GTEx) database, version 8. C and D: Single-cell gene expression profiles of associated genes obtained from MAGMA gene property analysis using RNA-sequencing data from all mouse tissue sets (C) (n = 805) and all human brain tissue sets (D) (n = 255) available in FUMA (Supplementary Material). GW, gestational week; UV, ultraviolet.
Next, we examined which tissues and cell types were enriched for genes associated with the MetS factor. Linking MAGMA gene-based P values to tissue-specific gene expression, we observe strong enrichment (significant at P < 9.26e−4 for 54 tissues) of genes expressed in the cerebellum (P = 7.72e−10 and P = 4.99e−10 for the cerebellum and cerebellar hemisphere, respectively), as well as the (frontal) cortex (Brodmann area 9, P = 2.38e−4, and cortex, P = 2.81e−4) and the pituitary (P = 4.04e−4) (Fig. 2B and Supplementary Table 11).
The significance of brain tissue was confirmed with FUMA cell-type analyses in mice, which show enrichment (significant at P < 6.21e−05 for 805 cell types) in brain neurons (P = 2.04e−5), embryonic mesenchyme neurons (neuropeptide Y high) (P = 2.13e−5), and oligodendrocyte precursor cells (P = 4.72e−5), in addition to enteroendocrine cells from the large intestine (P = 1.90e−7), which produce gut hormones that coordinate appetite, food absorption, digestion, and insulin secretion (36) (Fig. 2C and Supplementary Table 12). All significant cell types remain significant after within–data set conditional cell-type analysis with forward selection (in which, for each pair of significantly associated cell types, the cell type with the lowest marginal P value is retained).
Focusing on human brain cells, we find strong associations (significant at P < 1.96e−4 for 255 cell types) with GABAergic cells (midbrain GABA cells, P = 1.21e−12; midbrain neuroblast GABA cells, P = 3.72e−12; GABAergic neurons in the prefrontal cortex at gestational week 26, P = 1.25e−10; GABAergic neurons in the prefrontal cortex at gestational week 23, P = 1.70e−8; hippocampal GABAergic2 interneurons, P = 1.03e−5; and GABAergic neurons in the prefrontal cortex at gestational week 16, P = 3.64e−5), cells belonging to the dopamine system (midbrain dopaminergic 1 cells, P = 1.28e−6, and pyramidal neurons from the hippocampal Cornu Ammonis 1 region, P = 1.19e−4), and neuroblasts (midbrain mediolateral neuroblast 5 cells, P = 1.16e−6) (Fig. 2D and Supplementary Table 13). After within-dataset conditional cell type analysis, associations for midbrain GABA cells, GABAergic neurons in the prefrontal cortex at gestational week 26, hippocampal GABAergic2 interneurons, and pyramidal neurons from the hippocampal Cornu Ammonis 1 region remained significant.
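The Bonferroni thresholds quoted in the gene, gene-set, tissue, and cell-type analyses above all follow from dividing 0.05 by the number of tests. A short sketch that reproduces them from the test counts given in the text (the function name is ours):

```python
def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test significance threshold controlling the family-wise error rate."""
    return alpha / n_tests


# Numbers of tests as stated in the text for each analysis.
analyses = {
    "genes": 17706,
    "gene sets": 15481,
    "tissues": 54,
    "mouse cell types": 805,
    "human brain cell types": 255,
}

for label, n_tests in analyses.items():
    print(f"{label}: P < {bonferroni_threshold(n_tests):.2e}")
```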
Overlap Between MetS Components and MetS Factor
To examine what the MetS components share and what makes them unique, we investigated overlapping loci, genes, gene sets, tissues, and cell types among the MetS components and the MetS factor. Of the 235 loci associated with the MetS factor, 27 do not overlap with any loci for MetS components, 155 overlap with loci associated with one MetS component, and 53 overlap with loci associated with two or more MetS components (Supplementary Table 14). One MetS locus is significant in four of the five components. Although functional follow-up studies would be required to conclude that these overlapping loci represent the same signal, this overlap does suggest that some of the genetic signal of the MetS components is shared.
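Counting how many components share a locus amounts to an interval overlap check per chromosome. The sketch below assumes that two loci overlap when they share at least one base pair, which is our assumption rather than a definition given in the text; the coordinates are hypothetical.

```python
def loci_overlap(a, b):
    """True when two loci (chromosome, start, end) share at least one base pair."""
    return a[0] == b[0] and a[1] <= b[2] and b[1] <= a[2]


def count_overlapping_traits(locus, loci_by_trait):
    """Number of traits that have at least one locus overlapping the given locus."""
    return sum(
        any(loci_overlap(locus, other) for other in trait_loci)
        for trait_loci in loci_by_trait.values()
    )


# Hypothetical coordinates, for illustration only.
factor_locus = (1, 100, 200)
component_loci = {
    "WC": [(1, 150, 300)],   # overlaps
    "TG": [(2, 100, 200)],   # different chromosome
    "FG": [(1, 201, 250)],   # starts one base pair after the locus ends
    "SBP": [(1, 50, 100)],   # shares the boundary position
}
n_shared = count_overlapping_traits(factor_locus, component_loci)
```

Applying such a count to every MetS factor locus yields the three categories reported above (no overlap, one component, two or more components).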
From further scrutinizing overlap among MetS components in loci, mapped genes from the MAGMA gene-based analysis, gene sets, tissues, and cell types, we can conclude there is little pairwise overlap (Fig. 3 and Supplementary Tables 14–19). Overall, both our MetS factor and the Lind GWAS show the largest overlap with WC (except for gene sets and for mouse cell types). Because WC both had the highest factor loading on the MetS factor and had the highest SNP heritability (Table 1), we aimed to investigate to what extent WC is driving our MetS results. We therefore repeated the MetS factor GWAS without WC. Results are described in Supplementary Material. The results suggest that, even though there is significant genetic correlation with the MetS factor when we leave out WC, the functional features differ, and WC therefore seems to be a key determinant of the genetics associated with the MetS factor. However, we cannot rule out that the large overlap of the MetS factor GWAS with the WC GWAS is determined by the power of the latter.
Overlap between MetS factor GWAS results with components and the MetS GWAS by Lind. Overlapping genomic risk loci (A), significant genes from the MAGMA gene-based analysis (B), gene sets (C), enriched tissues (D), mouse cell types (E), and human brain cell types (F). MetS components are within black lines. The cell color reflects the percentage shared of phenotype 1 results on the y-axis, with phenotype 2 on the x-axis. MetS Lind, phenotypic MetS GWAS by Lind.
Genetic Correlations With External Traits
To investigate whether the MetS factor shows genetic overlap with other traits, we used LDSC to estimate genetic correlations of our MetS factor GWAS with various external traits. The genetic correlation between the MetS factor GWAS and the Lind GWAS is 0.92 (P = 0) (Fig. 4 and Supplementary Table 26). The MetS factor shows significant (at P < 1.92e−3 [0.05/26 traits]) positive genetic correlations with known phenotypically associated diseases and traits: type 2 diabetes (rG = 0.69, P = 3.39e−239), cholelithiasis (rG = 0.52, P = 2.34e−31), coronary artery disease (rG = 0.48, P = 1.58e−75), childhood BMI (rG = 0.47, P = 2.03e−38), and very severe respiratory confirmed COVID-19 (rG = 0.26, P = 6.30e−7). Less expected significant positive genetic correlations are with attention-deficit/hyperactivity disorder (rG = 0.34, P = 2.85e−35), smoking initiation (rG = 0.24, P = 1.71e−44), insomnia (rG = 0.21, P = 1.14e−17), number of children born (rG = 0.17, P = 1.56e−8), major depressive disorder (rG = 0.16, P = 7.54e−11), cannabis use disorder (rG = 0.16, P = 5.47e−7), and height (rG = 0.10, P = 9.08e−9). Significant negative genetic correlations were found with age at first birth (rG = −0.40, P = 5.01e−82), educational attainment (rG = −0.37, P = 5.97e−123), anorexia nervosa (rG = −0.22, P = 2.40e−15), and schizophrenia (rG = −0.09, P = 7.97e−07).
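For readers who want to relate an estimate and its SE to a P value, a two-sided test of rG = 0 under a normal approximation can be sketched as below. The inputs are hypothetical, and the test shown is a generic Wald-type calculation rather than a description of the LDSC internals.

```python
import math


def two_sided_p(estimate: float, se: float) -> float:
    """Two-sided P value for the null of zero under a normal approximation."""
    z = abs(estimate / se)
    # erfc(z / sqrt(2)) equals 2 * (1 - Phi(z)) for the standard normal.
    return math.erfc(z / math.sqrt(2.0))


# Hypothetical estimate and SE, for illustration only.
p_moderate = two_sided_p(0.20, 0.05)

# Very large test statistics underflow double precision and return exactly 0.
p_extreme = two_sided_p(50.0, 1.0)
```

A reported P of 0 is what such a calculation returns once the tail probability falls below the smallest representable number; whether that explains the value quoted above for the correlation with the Lind GWAS is an assumption on our part.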
Genetic correlations between the MetS factor and other complex diseases and traits. Error bars represent 95% CIs. Asterisks reflect significance at the multiple testing–corrected P value threshold (0.05/26 = 1.92e−3): 10e−30 < P < 1.92e−3 (*), 10e−60 < P < 10e−30 (**), 10e−90 < P < 10e−60 (***), and P < 10e−90 (****). ADHD, attention-deficit/hyperactivity disorder.
Polygenic Prediction
To investigate the predictive power of the MetS factor GWAS, we calculated PRS with PRSice-2 (28) using summary statistics of the MetS factor, the MetS components used in this study, and the Lind GWAS (13). We then investigated what proportion of phenotypic variance of FG, HDL-C, SBP, TG, WC, and MetS (see Research Design and Methods) they explained in unrelated second and third generation participants of the FHS (n = 2,095). Because the FG, HDL-C, and TG GWAS contained FHS participants, we corrected those summary statistics for sample overlap with EraSOR (27) and reran our MetS factor GWAS with those adjusted summary statistics. Variance explained in MetS is 0.059 with MetS factor GWAS summary statistics, which is higher than with any other set of summary statistics (Fig. 5 and Supplementary Table 27). Furthermore, variance explained by MetS PRS plus covariates is 0.21 (Nagelkerke R2), whereas variance explained by all MetS component PRS plus covariates is 0.19, showing that the MetS PRS predictive abilities are better than the sum of its parts (P = 0.0058) (Supplementary Table 28).
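Nagelkerke R2 for a logistic model can be computed from the log-likelihoods of the fitted model and of an intercept-only model. The sketch below shows the standard formula; the log-likelihood values in the usage line are hypothetical and are not taken from the FHS analysis.

```python
import math


def nagelkerke_r2(loglik_null: float, loglik_model: float, n: int) -> float:
    """Cox-Snell R2 rescaled by its maximum attainable value.

    loglik_null is the log-likelihood of the intercept-only model,
    loglik_model that of the fitted model, and n the number of individuals.
    """
    cox_snell = 1.0 - math.exp(2.0 * (loglik_null - loglik_model) / n)
    max_cox_snell = 1.0 - math.exp(2.0 * loglik_null / n)
    return cox_snell / max_cox_snell


# Hypothetical log-likelihoods for a sample of 2,095 individuals.
r2_example = nagelkerke_r2(loglik_null=-1300.0, loglik_model=-1150.0, n=2095)
```

The rescaling matters for binary outcomes because the unscaled Cox-Snell value cannot reach 1 even for a perfectly fitting model.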
Variance explained by MetS factor GWAS summary statistics, the MetS components GWAS summary statistics used to define MetS factor, and the largest phenotypic MetS GWAS, by Lind. R2 indicates the proportion of variance explained by the PRS, independent of covariates. MetS Lind, phenotypic MetS GWAS by Lind.
Drug Repurposing
To investigate whether any clinically used drugs are candidates to treat MetS, we ran a drug-repurposing analysis with DRUGSEA (30). After competitive and conditional gene-set analyses, one drug is significantly associated with MetS factor, fenofibrate (P = 7.84e−06) (Supplementary Table 29). Fenofibrate activates the peroxisome proliferator–activated receptor α, which triggers lipoprotein lipase leading to lipolysis and improved HDL-C and TG levels (37). Additionally, several drug categories were enriched for drugs with high MAGMA Z statistics from the drug gene set analysis run in MAGMA (Supplementary Material).
Discussion
Leveraging the power advantage expected in a genomic SEM approach, we studied the genetics of MetS, which was defined by a combination of FG, HDL-C, SBP, TG, and WC. We show that the genetic and biological elements associated with MetS components are most often unique (see Supplementary Note for a discussion of whether MetS can be seen as a [homogeneous] syndrome). Their modest but significant genetic correlation was modeled into a common MetS factor, and a GWAS run on that factor yielded 235 genome-wide significant loci and provided new biological insights into MetS etiology.
First, our MetS factor GWAS results point toward involvement of the brain, especially the cerebellum. Many food intake–regulating hormones cross the blood-brain barrier and act as signaling factors in the central nervous system, controlling appetite and adipose tissue lipolysis, for example. Tissue enrichment analysis of genes found in the Lind GWAS also pointed uniquely toward involvement of the cerebellum (Supplementary Table 17), confirming our findings. In a recent study by Low et al. (38), the cerebellum showed differences in neural activity in individuals with Prader-Willi syndrome, a syndrome characterized by obesity and lack of satiation. They later showed that activating neurons within the mouse cerebellum led to a pronounced reduction of food intake. They suggested that this was guided by an increase in striatal dopamine levels and a reduction of the phasic dopamine response following food consumption, which is likely to reduce meal size by attenuating the reward value of food. Zooming in on human brain cells, we find a strong signal for dopaminergic cells as well as GABAergic cells. Targeting GABAergic pathways might be a promising therapeutic strategy for improving MetS and associated diseases; GABA knockdown in mice improved insulin sensitivity, decreased food intake, and induced weight loss (39).
Second, we observed positive genetic correlations with diseases known to be phenotypically associated with MetS, such as coronary artery disease and type 2 diabetes. Furthermore, MetS factor showed genetic correlations with neuropsychiatric diseases such as attention-deficit/hyperactivity disorder, smoking initiation, insomnia, major depressive disorder, cannabis use disorder, and schizophrenia. This is consistent with earlier reports that revealed clinical (40) or genetic (41,42) overlap between MetS components and neuropsychiatric disease and is backed by the neuroregulatory cell involvement and the enrichment of antidepressant drugs in the drug repurposing analysis.
Third, our MetS factor GWAS showed the highest overlap with WC, both genetically and in follow-up results, e.g., the enrichment of brain tissues (43). Furthermore, the most significant loci from the MetS factor GWAS are near FTO and MC4R, two genes strongly implicated by obesity GWAS (44). This corresponds with other studies that point toward abdominal obesity, which exacerbates other cardiometabolic risk factors, as a major driving force and therapeutic target of MetS (11,45). The Lind GWAS also showed its largest overlap with WC, albeit less pronounced than that of our MetS factor. However, our MetS factor GWAS is affected by the large sample size and high SNP heritability (h2 = 0.20) of the WC GWAS and by WC’s relatively high genetic factor loading on the MetS factor, which increase power to detect WC-related SNP effects on the MetS factor (15). Sample size and SNP heritability also affect the power to detect enriched gene sets, tissues, and cell types, and this power might have been insufficient for FG and TG, for example.
Finally, results of drug repurposing analyses suggest fenofibrate as a potential therapeutic agent. Early trials show a moderate beneficial effect of fenofibrate on cardiovascular events (46), but recent reports indicated a significant reduction of major cardiovascular events in individuals with MetS with fenofibrate used as an add-on to statin (adjusted hazard ratio 0.74 [95% CI 0.58–0.93], P = 0.01) (47). Furthermore, it has been suggested that fenofibrate might have a greater beneficial effect in individuals with multiple MetS features (48). Studies in rodents show that fenofibrate targets multiple MetS components, including TGs, HDL-C, insulin resistance, and obesity (49–51). However, it is unlikely that one drug would prove to be a panacea for all individuals with MetS, given the heterogeneity of the syndrome.
It is important to consider this work in the light of its main limitations. As we did not have access to an independent, well-powered sample to replicate our MetS factor GWAS findings, our results should be interpreted with caution. The GWAS summary statistics of SBP that we used did not correct for antihypertensive medication use. However, the well-powered GWAS on SBP from the International Consortium for Blood Pressure (ICBP) were all corrected for BMI, which would lead to an incorrect genetic structure of the proposed SEM. We therefore opted for the uncorrected SBP GWAS by Watanabe et al. (52). Also, mapped SNPs and genes might not be causative. Fine mapping and further functional experiments could point toward causative genes and guide therapeutic strategies, as exemplified for metabolic traits by Akbari et al. (53). Furthermore, our results depend on the availability of functional data sets in FUMA. Finally, the samples used in our study are, for technical reasons, restricted to individuals of European ancestry, and the findings might not generalize to the general population.
In conclusion, to tackle the complex observed phenotypic and genetic covariance of the components that make up MetS, we have used an SEM approach that leverages genetic correlations of individual components to delineate the genetic architecture of MetS (10). We show shared and unique enrichment in specific biological pathways among MetS components, tissues, and cell types. Our findings give starting points for further functional follow-up experiments and provide valuable information for potential therapeutic targets.
This article contains supplementary material online at https://doi.org/10.2337/figshare.20497188.
Article Information
Acknowledgments. The Framingham Heart Study data were obtained from the database of Genotypes and Phenotypes (dbGaP) (accession no. phs000007). The authors thank all study participants, researchers, and staff (including Meta-Analyses of Glucose and Insulin-related traits Consortium [MAGIC] investigators, the Early Growth Genetics [EGG] Consortium, the COVID-19 host genetics initiative, the DIAbetes Genetics Replication And Meta-analysis [DIAGRAM] consortium, the Psychiatric Genomics Consortium [PGC], the Genetic Investigation of ANthropometric Traits [GIANT] Consortium, CTGlab, and the Social Science Genetic Association Consortium [SSGAC]) who contributed to publicly available summary statistics and the developers of the tools that enabled this study.
Funding. C.d.L. was funded by F. Hoffmann-La Roche AG. M.N. is supported by a ZonMw Vici grant, 2020 (09150182010020).
Duality of Interest. No potential conflicts of interest relevant to this article were reported.
Author Contributions. E.S.v.W., I.E.J., J.E.S., C.d.L., M.N., and D.P. contributed to study conceptualization. E.S.v.W. performed data curation. Investigations were performed by E.S.v.W. and N.Y.B. Software was used by E.S.v.W. and N.Y.B. Visualization was performed by E.S.v.W. E.S.v.W. wrote the original draft of the manuscript. E.S.v.W., N.Y.B., J.E.S., C.d.L., M.N., S.v.d.S., and D.P. reviewed and edited the manuscript. E.S.v.W. is the guarantor of this work and, as such, had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Prior Presentation. Parts of this study were presented in abstract form at the 8th Joint Dutch/UK Clinical Genetics Societies and Cancer Genetics Groups meeting.