Although hyperlipidemia is traditionally considered a risk factor for type 2 diabetes (T2D), evidence has emerged from statin trials and candidate gene investigations suggesting that lower LDL cholesterol (LDL-C) increases T2D risk. We thus sought to more comprehensively examine the phenotypic and genotypic relationships of LDL-C with T2D. Using data from the UK Biobank, we found that levels of circulating LDL-C were negatively associated with T2D prevalence (odds ratio 0.41 [95% CI 0.39, 0.43] per mmol/L unit of LDL-C), despite positive associations of circulating LDL-C with HbA1c and BMI. We then performed the first genome-wide exploration of variants simultaneously associated with lower circulating LDL-C and increased T2D risk, using data on LDL-C from the UK Biobank (n = 431,167) and the Global Lipids Genetics Consortium (n = 188,577), and data on T2D from the Diabetes Genetics Replication and Meta-Analysis consortium (n = 898,130). We identified 31 loci associated with lower circulating LDL-C and increased T2D, capturing several potential mechanisms. Seven of these loci have previously been identified for this dual phenotype, and nine have previously been implicated in nonalcoholic fatty liver disease. These findings extend our current understanding of the higher T2D risk among individuals with low circulating LDL-C and of the underlying mechanisms, including those responsible for the diabetogenic effect of LDL-C–lowering medications.
Rates of cardiovascular disease (CVD) and type 2 diabetes (T2D) are among the most pressing health concerns worldwide. These two diseases share many risk factors and tend to co-occur, with an excess of CVD among individuals with T2D (1,2). Yet, controversy remains over whether all risk factors exert similar effects on the development of these two conditions. LDL cholesterol (LDL-C) is a class of highly atherogenic particles, and circulating levels of LDL-C are a causal risk factor for CVD across the life span (3). However, several lines of evidence suggest that decreased levels of circulating LDL-C are associated with an increased T2D risk.
Lipid-lowering medications, in particular from the statin drug class, are effective at lowering levels of circulating LDL-C and rates of adverse cardiovascular events (4) but confer an increased T2D risk (odds ratio [OR] 1.09) (5,6) in a dose-dependent manner (7). This increased risk, however, is outweighed at a population level by the cardiovascular event rate reduction. An increased T2D risk has also been reported in observational studies. Individuals with low levels of circulating LDL-C (e.g., <60 mg/dL) exhibit a higher risk of prevalent and incident T2D (8,9), and among individuals with coronary disease, LDL-C and T2D are inversely related (10). In addition, individuals with familial hypercholesterolemia exhibit a decreased risk of T2D as well as lower BMI and triglyceride (TG) levels (11).
Genetic studies have lent further support to inverse phenotypic associations between LDL-C and T2D, with recent studies pointing to genetic loci that harbor variants exerting opposing effects on LDL-C and T2D. These include loci containing the HMGCR (12), APOE (13,14), PCSK9 (12,14,15), NPC1L1 (12,14), PNPLA3 (14), TM6SF2 (14), GCKR (14), and HNF4A (14) genes. Furthermore, Fall et al. (16) and White et al. (17) have both found that genetically predicted higher circulating LDL-C was associated with a lower risk of T2D. Yet, genetic findings show that not all variants have opposing effects on circulating LDL-C levels and T2D risk. LDL-C–lowering variants in ABCG5/G8 and LDLR genes were not shown to alter T2D risk (12), and subsets of LDL-C–lowering alleles pose a stronger risk for T2D than the full gamut (16). Circulating LDL-C levels, like T2D, are reflective of a number of physiological processes. The findings outlined above suggest that there is heterogeneity in T2D outcomes, depending on which pathways are the primary LDL-C–lowering mechanisms, and that genetic studies may give us insights into these pathways. For example, it is not clear whether these associations are driven by changes in circulating levels of LDL-C or by changes in intracellular levels of cholesterol. A better understanding of which genetic loci lower circulating levels of LDL-C and increase T2D risk may yield mechanistic insights that could help develop therapeutic options that lower lipid levels without raising the risk of T2D and help identify individuals at greater risk for T2D with statin use.
Here, we first examined the relationship of directly measured circulating LDL-C levels with prevalent T2D, HbA1c, and BMI. We then sought to identify, for the first time on a genome-wide scale, loci simultaneously associated with lower LDL-C and increased T2D risk (and vice versa). Upon identifying variants, we sought to generate additional mechanistic insights by testing associations with seven other traits in the UK Biobank related to T2D, LDL-C, and nonalcoholic fatty liver disease (NAFLD).
Research Design and Methods
Data from the UK Biobank were used for 1) phenotypic data analysis, which examined the associations of circulating levels of LDL-C and TG with T2D, HbA1c, and BMI, and 2) a discovery genome-wide association study (GWAS) for variants that are associated with lower circulating levels of LDL-C and higher T2D risk. The UK Biobank is a prospective cohort study of ∼500,000 individuals between the ages of 39 and 72 years living throughout the U.K. Participants attended 1 of 21 assessment centers in the U.K., where their blood was drawn for biomarker and genetic analysis and their weight and height were measured to derive BMI (kg/m²). Directly measured circulating LDL-C, HbA1c, HDL-C, TG, alanine aminotransferase (ALT), and AST were obtained from all UK Biobank participants at the baseline visit between 2006 and 2010 in a nonfasting state. LDL-C was assessed by enzymatic protective selection analysis on a Beckman Coulter AU5800.
To define prevalent T2D case and control subjects, we used criteria previously used by Yaghootkar et al. (18) and Eastwood et al. (19). We first excluded individuals with a missing age of T2D diagnosis, those reporting a T2D diagnosis within 1 year of the baseline examination, those self-reporting type 1 diabetes in the verbal interview, and women reporting only gestational diabetes on the touchscreen or verbal interview. Prevalent T2D was defined using the following criteria: 1) self-reported diabetes diagnosed by a doctor during the touchscreen, or self-reported T2D or generic diabetes in verbal interviews; 2) having a nonmissing age of diagnosis and an age of diagnosis >35 years of age (>30 years of age for participants reporting an ethnicity of South Asian or African Caribbean); and 3) not using insulin within 1 year of diagnosis (to exclude possible type 1 diabetes case subjects). Control subjects were participants who reported no diabetes of any type and no insulin use in the touchscreen or verbal interview, who were not excluded according to the aforementioned criteria, and who did not report nonmetformin T2D medication (see the list in Supplementary Table 2).
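The case and control criteria above can be read as a small decision procedure. The following is a minimal sketch only: the `Participant` fields are hypothetical names, not UK Biobank field identifiers, and treating "within 1 year of baseline" as a difference of less than one year between age at baseline and age at diagnosis is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Participant:
    # All field names are hypothetical; they are not UK Biobank field IDs.
    reports_diabetes: bool                  # touchscreen or verbal interview
    reports_type1: bool                     # self-reported type 1 diabetes
    gestational_only: bool                  # woman reporting only gestational diabetes
    age_at_baseline: float
    age_at_diagnosis: Optional[float]
    insulin_within_1y_of_diagnosis: bool
    reports_insulin_use: bool
    nonmetformin_t2d_medication: bool
    south_asian_or_african_caribbean: bool


def classify(p: Participant) -> Optional[str]:
    """Return 'case', 'control', or None (excluded from the analysis)."""
    if p.reports_type1 or p.gestational_only:
        return None
    if p.reports_diabetes:
        if p.age_at_diagnosis is None:
            return None
        # Assumption: "within 1 year of baseline" means less than 1 year apart.
        if p.age_at_baseline - p.age_at_diagnosis < 1:
            return None
        min_age = 30 if p.south_asian_or_african_caribbean else 35
        if p.age_at_diagnosis > min_age and not p.insulin_within_1y_of_diagnosis:
            return "case"
        return None
    if p.reports_insulin_use or p.nonmetformin_t2d_medication:
        return None
    return "control"
```

Participants who report diabetes but fail a case criterion are returned as `None` here, so they enter neither group; that handling is an assumption of the sketch.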
UK Biobank Genotypes
Genotypes in the UK Biobank were obtained with the Affymetrix UK Biobank Axiom Array (Santa Clara, CA), whereas 10% of participants were genotyped with the Affymetrix UK BiLEVE Axiom Array. Details regarding imputation, principal components analysis, and quality control procedures are described elsewhere (20). The analysis excluded individuals with unusually high heterozygosity, with a high (>5%) missing rate, or with a mismatch between self-reported and genetically inferred sex. Single nucleotide polymorphisms (SNPs) out of Hardy-Weinberg equilibrium (P < 1 × 10−6), with a high missing rate (>1.5%), with a low minor allele frequency (<0.1%), or with a low imputation accuracy (info <0.4) were excluded from analyses. This resulted in the availability of ∼15 million SNPs for analysis.
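As an illustration of the SNP-level quality control thresholds listed above, the filter can be expressed as a single predicate. This is a sketch under stated assumptions: the function and argument names are hypothetical, and the treatment of values exactly at a threshold is a choice made here, not something stated in the text.

```python
def passes_snp_qc(hwe_p: float, missing_rate: float, maf: float, info: float) -> bool:
    """Return True if a SNP survives the four exclusion criteria.

    Excluded: Hardy-Weinberg P < 1e-6, missing rate > 1.5%,
    minor allele frequency < 0.1%, imputation info < 0.4.
    """
    return (
        hwe_p >= 1e-6
        and missing_rate <= 0.015
        and maf >= 0.001
        and info >= 0.4
    )
```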
Diabetes Genetics Replication and Meta-Analysis and Global Lipids Genetics Consortium GWAS Meta-Analysis Summary Statistics
The latest GWAS meta-analysis summary statistics for T2D (unadjusted for BMI) were obtained from the Diabetes Genetics Replication and Meta-Analysis consortium (DIAGRAM), which includes data on up to 898,130 individuals (74,124 case and 824,006 control subjects), including UK Biobank individuals (21). We used the results of our GWAS of circulating LDL-C in UK Biobank, along with the aforementioned DIAGRAM-T2D results, for the discovery of inverse association signals. We then replicated LDL-C associations of our top hits with an independent GWAS meta-analysis of LDL-C from the Global Lipids Genetics Consortium (GLGC) (22) (n = 188,577); this meta-analysis does not include the UK Biobank study. Across the UK Biobank, DIAGRAM, and GLGC summary statistics, we aligned all SNP alleles and their corresponding effects by using the harmonize function in the TwoSampleMR package in R software (23).
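The idea behind aligning alleles across summary statistics can be sketched in a few lines. This is not the TwoSampleMR implementation; it is a simplified illustration with hypothetical names that only handles the case in which the effect and other alleles are swapped between two studies. A full harmonization tool also has to deal with strand flips and palindromic SNPs, which are ignored here.

```python
from typing import Optional, Tuple


def harmonize(
    effect_a: str, other_a: str, beta_a: float,
    effect_b: str, other_b: str, beta_b: float,
) -> Optional[Tuple[float, float]]:
    """Align study B's effect to study A's effect allele.

    Returns (beta_a, aligned beta_b), or None if the allele pairs do not match.
    """
    if (effect_a, other_a) == (effect_b, other_b):
        return beta_a, beta_b
    if (effect_a, other_a) == (other_b, effect_b):
        # Same SNP, opposite coding: flip the sign of study B's effect.
        return beta_a, -beta_b
    return None
```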
To evaluate and plot the prevalence of T2D and of BMI and HbA1c by decile of circulating LDL-C and TG in the UK Biobank, we excluded all participants who self-reported (at baseline) use of cholesterol-lowering medication in either the touchscreen survey or the verbal interview (see Supplementary Table 1 for list of medications). Approximately 91% of individuals taking cholesterol-lowering medications were taking statins. To examine levels of HbA1c by decile of circulating LDL-C, we excluded participants defined as T2D cases (see above). We further excluded individuals with outlier values of HbA1c >4 SDs from the mean. Deciles were calculated using the “quantcut” function in the “gtools v3.5.0” library in R software. Once deciles were established, T2D prevalence by LDL-C/TG decile was calculated and plotted with CIs determined by the Clopper-Pearson interval (24). Mean HbA1c and BMI and their distributions are shown in boxplots for each decile of circulating LDL-C. We further examined T2D prevalence by circulating LDL-C decile separately in men and women, and in different age-groups (40–49 years, 50–59 years, and 60–69 years).
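For readers unfamiliar with the Clopper-Pearson interval used for the prevalence estimates, the sketch below computes it from first principles using only the Python standard library. It is an illustration, not the code used in the study: the function names are hypothetical, and the bounds are found by bisection on the binomial distribution function rather than by the beta quantiles that statistical packages typically use.

```python
import math


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p), with each term computed in log space."""
    if k < 0:
        return 0.0
    if k >= n:
        return 1.0
    if p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 0.0
    total = 0.0
    for i in range(k + 1):
        log_pmf = (
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * math.log(p) + (n - i) * math.log1p(-p)
        )
        total += math.exp(log_pmf)
    return min(total, 1.0)


def _solve_decreasing(f, target: float) -> float:
    """Find p in [0, 1] with f(p) = target, for f decreasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k: int, n: int, alpha: float = 0.05):
    """Exact two-sided (1 - alpha) interval for k events among n individuals."""
    # Lower bound: P(X >= k | p) = alpha / 2, i.e. P(X <= k - 1 | p) = 1 - alpha / 2.
    lower = 0.0 if k == 0 else _solve_decreasing(
        lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    # Upper bound: P(X <= k | p) = alpha / 2.
    upper = 1.0 if k == n else _solve_decreasing(
        lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper
```

Each evaluation sums up to `k + 1` terms, so for decile-sized groups in a large cohort a beta-quantile routine from a statistics library would be the more practical choice.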
To statistically evaluate these phenotypic associations, we performed logistic regression with T2D prevalence as the outcome and linear regression with HbA1c and BMI as outcomes. As mentioned above, all individuals on cholesterol-lowering medication were excluded. To normalize residuals, we transformed circulating LDL-C, TG, HbA1c, and BMI by inverse normalization for all linear regression analyses. For each analysis, we used the same exclusion criteria as those mentioned above and adjusted for “last eating” time (excluding individuals reporting extreme values, >16 h), age, sex, and center. We considered an expanded model with additional covariates: education (college/university degree or not), Townsend Deprivation Index, BMI, hypertension status (self-reported status or hypertension medication), ethnicity (white/European or not), family history of T2D (at least one first-degree family member), smoking status (never, past, current), and alcohol consumption (never or only special occasions, one to three times per month, one to two times per week, three to four times per week, daily/almost daily). In analyzing the association of circulating LDL-C with T2D, we also tested for interactions with sex and age and provided stratified analyses accordingly. Finally, we also examined the association of LDL-C with T2D among only the individuals taking cholesterol-lowering medication.
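The inverse normalization mentioned above is commonly implemented as a rank-based transform. The sketch below assumes that form; the text does not state which rank offset was used, so the default `offset=0.5` (and the function name) are assumptions. Ties receive their average rank.

```python
from statistics import NormalDist


def inverse_normalize(values, offset: float = 0.5):
    """Rank-based inverse normal transform.

    Each value is replaced by the standard normal quantile of
    (rank - offset) / (n - 2 * offset + 1); offset=0.375 gives Blom's variant.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied values.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    nd = NormalDist()
    return [nd.inv_cdf((r - offset) / (n - 2 * offset + 1)) for r in ranks]
```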
To address possible ascertainment bias of prevalent T2D case subjects due to exclusion of people taking cholesterol-lowering medication, we performed a sensitivity analysis using propensity score matching to remove bias between the two groups due to observed covariates (25). We used the propensity score to match on the probability of taking cholesterol-lowering medication given the set of baseline characteristics listed in Supplementary Table 3. All covariates were selected based on previous literature or if considered potential significant confounders for cholesterol-lowering medication use or T2D (26–28). Matching analyses were performed using R software and the package MatchIt v3.0.2 with 1:1 nearest-neighbor matching and a caliper width equal to 0.1 to achieve balanced covariates between the two groups (29). Standardized mean differences were used to assess covariate balance before and after matching. Standardized mean differences <0.1 were considered adequately balanced to reduce significant differences between the two groups (30). Subsequent analyses were conducted among all individuals, regardless of cholesterol-lowering medication use, using logistic regression for T2D as the outcome and linear regression for HbA1c as the outcome. Model 1 was adjusted for time since eating, age, sex, and center. Model 2 was additionally adjusted for BMI and use of cholesterol-lowering medication. BMI was included as a covariate because the addition of BMI to the unadjusted model in the matched sample changed the regression coefficient for LDL-C by >10%. Although cholesterol medication use did not result in this magnitude of change, we adjusted for cholesterol medication use in model 2 to address any potential confounding. The only covariates considered in these regression analyses were those used in the main analyses stratified by cholesterol-lowering medication (see above and model 2 in Supplementary Tables 6 and 7). 
To meet the assumptions of linear regression, HbA1c, LDL-C, and BMI were inverse normalized in all linear regression models of HbA1c regressed on LDL-C. We used complete case analysis to address missing data. As a result of greater missingness for HDL-C (n = 35,382), analyses were repeated excluding HDL-C from the matching covariates. However, the regression results were not attenuated, and only results including HDL-C were reported.
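The matching step and the balance check described above can be illustrated with a short sketch. It is not the MatchIt implementation: it assumes propensity scores have already been estimated, matches greedily without replacement, and applies the caliper as an absolute difference in propensity score, whereas some packages express the caliper in SD units of the score. All names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def match_nearest(treated: dict, controls: dict, caliper: float = 0.1):
    """Greedy 1:1 nearest-neighbor matching on the propensity score.

    treated, controls: mapping of participant id -> propensity score.
    Returns a list of (treated_id, control_id) pairs; treated individuals
    with no control within the caliper are left unmatched.
    """
    available = dict(controls)
    pairs = []
    # Process treated individuals from highest to lowest score.
    for t_id, t_ps in sorted(treated.items(), key=lambda kv: -kv[1]):
        if not available:
            break
        c_id = min(available, key=lambda c: abs(available[c] - t_ps))
        if abs(available[c_id] - t_ps) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # matching without replacement
    return pairs


def standardized_mean_difference(x_treated, x_control) -> float:
    """Difference in means divided by the pooled SD of the two groups."""
    pooled_sd = sqrt((variance(x_treated) + variance(x_control)) / 2)
    return (mean(x_treated) - mean(x_control)) / pooled_sd
```

After matching, a covariate would be considered adequately balanced under the criterion above when the absolute value returned by `standardized_mean_difference` is below 0.1.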
For the GWAS of LDL-C in the UK Biobank, the circulating LDL-C level of individuals on cholesterol-lowering medication was corrected by dividing it by a correction factor of 0.63 (31). We also ran a GWAS only on individuals not taking cholesterol-lowering medication. We transformed LDL-C by inverse normalization. We used BOLT-LMM software (32) to perform GWAS on individuals of European descent (n = 431,167) and included “last eating” time (see above), sex, age, age², center, genotyping chip, and the first 10 principal components as covariates. BOLT-LMM performs a linear mixed model regression that includes a random effect of all SNP genotypes other than the one being tested. We aligned effect sizes across the GWAS summary statistics of each trait to the same effect allele using the harmonize function, as mentioned above. We used metaCCA v1.12.0 (33) to perform a multivariate GWAS with the LDL-C and T2D GWAS summary statistics. Briefly, metaCCA implements a canonical correlation analysis on GWAS summary statistic data in which the phenotype correlation structure is estimated from the univariate GWAS summary statistics. We first selected only those SNPs that exhibited opposite directions of univariate effects for LDL-C and T2D and had a metaCCA P < 5 × 10−8. To further minimize the potential of selecting false-positive loci, we selected among these SNPs only those with a univariate association P < 5 × 10−5 for each of LDL-C and T2D. At this univariate P-value threshold, the prior probability that a given SNP is associated with both traits with discordant directions of effect under the null hypothesis corresponds to 0.00005 × 0.000025 = 1.25 × 10−9 (34). SNPs within 500 kb of each other or in linkage disequilibrium of r² > 0.05 were clumped together, and the SNP with the lowest metaCCA P value was reported.
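The selection rules above can be sketched as a filter followed by a simple clumping pass. This is an illustration under stated assumptions: the dictionary keys and function names are hypothetical, and clumping is done by distance only, because the linkage disequilibrium criterion would require a reference panel that is outside the scope of a short example.

```python
def correct_ldl(ldl: float, on_medication: bool, factor: float = 0.63) -> float:
    """Approximate untreated LDL-C by dividing treated values by the correction factor."""
    return ldl / factor if on_medication else ldl


def select_candidates(snps, p_multi: float = 5e-8, p_uni: float = 5e-5):
    """Keep SNPs with opposite effect directions that pass all three P thresholds."""
    return [
        s for s in snps
        if s["beta_ldl"] * s["beta_t2d"] < 0
        and s["p_metacca"] < p_multi
        and s["p_ldl"] < p_uni
        and s["p_t2d"] < p_uni
    ]


def clump_by_distance(snps, window: int = 500_000):
    """Greedy clumping: keep the lowest-P SNP, drop others within `window` bp of it."""
    leads = []
    for s in sorted(snps, key=lambda s: s["p_metacca"]):
        if all(s["chr"] != l["chr"] or abs(s["pos"] - l["pos"]) >= window
               for l in leads):
            leads.append(s)
    return leads


# Chance that a null SNP passes both univariate thresholds with discordant
# directions: the second factor is halved because only one of the two
# possible directions of the second effect counts as discordant.
prior_discordant = 5e-5 * (5e-5 / 2)
```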
For the replication of the 44 discovered loci, we considered both the univariate results for LDL-C from GLGC and multivariate results from metaCCA using the GLGC LDL-C and DIAGRAM T2D. Because of incomplete overlap of SNPs in GLGC with those in the UK Biobank and DIAGRAM and differences in population composition, we examined all SNPs within each locus identified in the discovery stage (i.e., the base pair range at a given locus for which all SNPs satisfied the above univariate and multivariate criteria in the discovery analysis). After further restricting to only variants for which the effect size for (GLGC) LDL-C and T2D exhibited opposite directions of effect, we chose the SNP with the lowest metaCCA P value. A locus was considered to be successfully replicated if this top SNP had a univariate LDL-C P < 5 × 10−3 and a metaCCA P < 5 × 10−5. Among the replicated loci, we tested for colocalization using the DIAGRAM T2D and UK Biobank LDL-C results to determine whether, at a given locus, the two traits are likely to be affected by the same causal variant. Specifically at each of the replicated loci, we used the coloc v3.2-1 package in R software (35) to test for colocalization using all SNPs within 250 kb of the SNP with the lowest metaCCA P value. We used default parameters and priors. We considered that there was evidence for colocalization if the posterior probability for a shared causal variant hypothesis 4 (PP.H4) was >80%.
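The replication rule for a single locus can be summarized in code. The sketch below uses hypothetical dictionary keys and function names; it assumes the GLGC-based summary statistics for all SNPs in the discovery interval are already available, and it simply applies the direction filter, picks the lowest metaCCA P value, and checks the two thresholds. The colocalization criterion is shown as a plain threshold on the posterior probability reported by a colocalization tool.

```python
from typing import Optional


def replicate_locus(snps_in_locus) -> Optional[dict]:
    """Return the top replication SNP if the locus replicates, else None.

    Each SNP is a dict with keys beta_ldl, beta_t2d, p_ldl, p_metacca,
    all based on GLGC LDL-C and DIAGRAM T2D summary statistics.
    """
    discordant = [s for s in snps_in_locus if s["beta_ldl"] * s["beta_t2d"] < 0]
    if not discordant:
        return None
    top = min(discordant, key=lambda s: s["p_metacca"])
    if top["p_ldl"] < 5e-3 and top["p_metacca"] < 5e-5:
        return top
    return None


def colocalized(pp_h4: float, threshold: float = 0.8) -> bool:
    """Evidence for a shared causal variant if PP.H4 exceeds the threshold."""
    return pp_h4 > threshold
```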
To test the association of the 31 SNPs (T2D increasing allele) that we identified with a range of other cardiometabolic traits that are known to be related to LDL-C and T2D and are available in the UK Biobank, we used methods similar to those described above for the circulating LDL-C GWAS. For TG and HDL-C, we excluded individuals reporting cholesterol-lowering medication. For ALT and AST, we excluded 15,138 individuals with medical conditions, other than NAFLD, that could affect liver enzyme levels (36). For HbA1c, we excluded individuals with prevalent T2D (see above). For the waist-to-hip ratio, we additionally adjusted for BMI before inverse normalization and subsequent GWAS. We inverse normalized all traits before the GWAS. We tested the association of each of the 31 SNPs with each of these seven additional phenotypes. We then normalized the effect sizes by dividing the β-coefficients by the corresponding SEs and dividing by the square root of the respective sample size. We used hierarchical clustering to group the identified variants according to their pattern of association with all nine traits, including T2D and circulating LDL-C. Clustering was performed with the hclust v3.6.2 function in R, with the Euclidean metric to calculate distances, and the Ward clustering method (37). Cluster stability was assessed by using the clValid v0.6-6 package in R software, evaluating the hierarchical, k-means, and partitioning around medoids methods, and evaluating 2–10 clusters (38). Finally, using UK Biobank individual-level data, we used a multivariate approach, MultiPhen v2.0.3 (39), which uses ordinal regression to model each SNP as the outcome and includes all traits as covariates (except for T2D) in addition to age and sex. We present only β-coefficients from these models because the P values are nearly all >0.05, possibly due to the inclusion of many correlated phenotypes into each model.
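The effect-size normalization and the distance used for clustering can be written out explicitly. This is a minimal sketch with hypothetical function names; it shows only the inputs to the clustering step, and the Ward clustering itself would be delegated to a statistics library.

```python
import math


def normalized_effect(beta: float, se: float, n: int) -> float:
    """Z-score (beta / SE) scaled by the square root of the sample size.

    Dividing by sqrt(n) makes effects comparable across traits that were
    analyzed in samples of different sizes.
    """
    return beta / se / math.sqrt(n)


def euclidean(profile_a, profile_b) -> float:
    """Euclidean distance between two SNPs' trait-association profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(profile_a, profile_b)))
```

For example, a SNP with β = 0.02 and SE = 0.004 in a sample of 10,000 has a z-score of 5 and a normalized effect of about 0.05.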
Data and Resource Availability
The data that support the findings of this study are available to researchers, upon application, from the UK Biobank, but restrictions apply to the availability of these data, which were used under license for the current study. Data from the DIAGRAM and GLGC consortia are publicly available at their respective websites: https://www.diagram-consortium.org/ and http://lipidgenetics.org/.
In a sample size of 375,783 individuals after exclusion of individuals on lipid-lowering medication, T2D prevalence was 0.8% and was higher in men (1.15%) than in women (0.54%). Individuals with prevalent T2D had lower circulating LDL-C and HDL-C, higher circulating TG, higher HbA1c, and higher BMI (Supplementary Table 4). Among individuals on cholesterol-lowering medication (n = 78,626), T2D prevalence was 18.2%, and individuals with T2D had lower circulating LDL-C and HDL-C, and higher circulating TG, HbA1c, and BMI (Supplementary Table 5).
Association of Circulating LDL-C With T2D
We observed an inverse relationship between circulating LDL-C and T2D prevalence (OR 0.41 [95% CI 0.39, 0.43] per mmol/L unit of LDL-C, P = 1.26 × 10−263). Individuals in the lowest decile of circulating LDL-C exhibited the highest prevalence of T2D, and a consistent decrease in T2D prevalence was observed with increasing circulating LDL-C (Fig. 1). We found a very similar negative association of circulating LDL-C with T2D among only the individuals reporting the use of cholesterol-lowering medication. We found a significant interaction of circulating LDL-C with sex (P = 1.52 × 10−13), whereby the association of circulating LDL-C with T2D prevalence was stronger among men (OR 0.35 [95% CI 0.32, 0.37] per mmol/L unit of LDL-C, P = 7.37 × 10−215) than among women (OR 0.51 [0.47, 0.55] per mmol/L unit of LDL-C, P = 3.63 × 10−62) (Supplementary Table 6 and Supplementary Fig. 1). We also observed a stronger inverse association between circulating LDL-C and T2D prevalence among older individuals (Pinteraction = 3.54 × 10−13) (Supplementary Table 6 and Supplementary Fig. 3). Positive associations were found between circulating LDL-C and both HbA1c (after exclusion of individuals with T2D; β = 0.14, SE = 0.0017, P < 5.0 × 10−300) and BMI (β = 0.16, SE = 0.0016, P < 5.0 × 10−300) (Fig. 1 and Supplementary Table 6). We also observed a positive association between circulating TG and T2D prevalence (OR 1.34 [95% CI 1.31, 1.38], P = 8.03 × 10−109) (Supplementary Table 6 and Supplementary Fig. 4). Among individuals on cholesterol-lowering medications, we found a nearly identical negative association of circulating LDL-C and T2D prevalence but a much weaker positive association with HbA1c, and a negative association with BMI (Supplementary Table 7 and Supplementary Fig. 1). In models including additional covariates, the results remained very similar (Supplementary Tables 6 and 7). Results were also very similar in propensity score–matching analyses. 
In a total sample size of ∼70,000 individuals, the T2D prevalence was 6.69%, and these analyses showed similar negative associations of circulating LDL-C with T2D (OR 0.51 [95% CI 0.49, 0.54] in model 2) and positive associations of circulating LDL-C with HbA1c (Supplementary Tables 3 and 8).
Loci Associated Inversely With LDL-C and T2D
We identified 44 loci associated in opposite directions with circulating LDL-C and T2D using the UK Biobank LDL-C and the DIAGRAM T2D results (Supplementary Table 9). In an analysis in which we used a GWAS of circulating LDL-C excluding individuals on cholesterol-lowering medication, we observed nearly identical results (Supplementary Table 10). Among these 44 loci, 31 replicated with respect to LDL-C association when using the GLGC LDL-C GWAS results instead of UK Biobank (Table 1). Several loci are previously known or suspected to be inversely associated with circulating LDL-C and T2D (HMGCR, APOE, NPC1L1, PNPLA3, TM6SF2, GCKR, and HNF4A). However, most of the loci are novel for this LDL-C–T2D trait. Of these novel loci, 12 have previously been identified for LDL-C in the GLGC GWAS, 14 were previously identified in T2D GWAS, and 14 have not been identified previously with either trait. The loci with the strongest degree of opposing effects include FNDC7-STXBP3, SORT1-PSMA5, HMGCR-POC5, PPP1R3B, and GCKR (Fig. 2). Colocalization analyses suggest that of the 31 loci, GCKR, PPP1R3B, TM6SF2, HNF4A, MICAL3, and PNPLA3 have the same causal variant(s) influencing circulating LDL-C and T2D (PP.H4 > 0.8). Although most SNPs showed colocalization at shared or distinct causal variants, a few loci showed no evidence of colocalization (Supplementary Table 11).
|UK Biobank LDL-C and DIAGRAM T2D (first 10 columns)|GLGC (last 3 columns)|
|Chr|bp_min|bp_max|Nearest genes|Top SNP-metaCCA|metaCCA-min P|T2D β|T2D P|LDL β|LDL P|Top SNP-metaCCA|LDL P|metaCCA P|
β-Coefficients refer to the SNP with the lowest P value for each trait in the respective region. The minimum univariate P value for LDL-C in GLGC summary statistics as well as the top SNP and P value of the top metaCCA SNP for the region discovered in UK Biobank is also shown. bp_min and bp_max are the minimum and maximum base pair position for which all SNPs have P < 5E−5 for LDL-C and T2D, with opposite directions of effect, and with metaCCA P < 5E−8. Human build GRCh37/hg19; β-coefficients correspond to log-ORs for T2D and SDs for LDL-C. P values from GLGC LDL-C associations are for SNPs showing opposite direction of association with T2D (DIAGRAM). Chr, chromosome.
The variants that we have identified can be linked with genes that affect de novo fatty acid synthesis, hepatic lipid uptake, hepatic lipid export, peripheral tissue lipid balance, fatty liver of unknown origin, insulin secretion, and insulin action (Supplementary Table 12). They are associated in distinct patterns across a range of cardiometabolic traits (Fig. 3). At these loci, the T2D-increasing alleles are generally associated with higher HbA1c levels and lower HDL-C levels, although this pattern is not entirely consistent across all 31 SNPs. According to cluster stability evaluation, two clusters were optimally identified by hierarchical clustering (Supplementary Table 13). However, it is difficult to discern any consistent trait association patterns that differentiate the two sets of loci. In Supplementary Fig. 6 we present the trait-specific β-coefficients based on MultiPhen, some of which are substantially different from the univariate results.
We used the largest sample to date to examine the association of circulating LDL-C with T2D prevalence and found that individuals with low circulating LDL-C exhibit a higher prevalence of T2D. Then, in the first genome-wide analysis aimed at identifying variants associated with both lower circulating LDL-C and higher T2D risk, we identified 24 novel loci exerting opposite-direction effects on these traits. Our analyses lend weight to the notion that the association between lower circulating LDL-C and increased T2D risk is driven, at least in part, by a specific group of genetic variants that may be implicated via diverse mechanisms, including hepatic lipid synthesis, export, and uptake, as well as insulin secretion and action. These variants provide insight into the heterogeneous outcomes for different lipid and glucose metabolism pathways.
We found that low circulating LDL-C is associated with greater T2D prevalence, which is consistent with two previous studies examining T2D prevalence (8) and incidence (9). In addition, we found that lower levels of circulating LDL-C are associated with lower HbA1c (among individuals without T2D) and lower BMI. Our finding that lower circulating LDL-C is associated with increased T2D prevalence but lower HbA1c appears counterintuitive. It is important to note that the latter analysis was performed in a slightly different subset than the former (i.e., excluding those with T2D). We may be observing a threshold effect, whereby the etiology of “normal” HbA1c variation is somewhat distinct from the etiology of crossing into overt T2D (e.g., 40). It is also possible that our results could be affected by collider bias because individuals on cholesterol-lowering medications are excluded from our main analysis. However, we observed a similar inverse relationship of circulating LDL-C with T2D in the set of people on cholesterol-lowering medication and in a propensity score–matching analysis. We also found that, unlike LDL-C, TG levels are positively associated with T2D prevalence. This opposing relationship of circulating LDL-C and TG with T2D prevalence may suggest that LDL particles are being overfilled in individuals with T2D.
Previous research into loci that jointly affect circulating LDL-C and T2D risk has focused on the genomic targets of lipid-lowering medications in the hope that these analyses will give specific insights into the associated T2D risk. On one hand, our analyses confirmed that variants in HMGCR (41) and NPC1L1 (14) are associated with lower circulating LDL-C and increased T2D risk. On the other hand, our analyses did not identify variants at PCSK9. The lowest T2D P value in this region was 0.003, for a SNP with opposite directions of effect on LDL-C and T2D. However, our analyses identified a fourth target of lipid-lowering medications: variants in the peroxisome proliferator–activated receptor (PPARG) gene, the target of fibrates and thiazolidinediones.
We observed nine variants previously identified as being associated with NAFLD: PNPLA3, GCKR, TM6SF2, PPP1R3B, ERLIN1-CWF19L1, REEP3, HNF1A, SLC2A2, and MICAL3 (42–44). Furthermore, five of the seven colocalizing loci are among these nine. This enrichment for NAFLD-related genes may reflect increased synthesis and storage of TG and reduced export/secretion of VLDL, leading to reduced circulating LDL-C. Indeed, the LDL-C–decreasing alleles at most of these loci are associated with increased liver enzymes, indicative of hepatic steatosis, with the exception of GCKR and SLC2A2, consistent with a previous finding (45). In turn, lower levels of circulating LDL-C along with increased liver enzymes would be expected to indicate increased NAFLD and T2D. A recent bidirectional Mendelian randomization study provides support for this hypothesized causal effect of NAFLD on T2D (46). Our finding that liver fat may be an important mediator of the effect of cholesterol lowering on T2D is consistent with a report showing that liver fat may help identify statin-taking individuals at risk for T2D (47). Finally, it is noteworthy that the HMGCR variant that lowers circulating LDL-C is not associated with any significant change in liver enzymes, potentially reflecting the lack of an increase in NAFLD incidence seen with statin medications (48).
Our analyses identified a number of variants previously implicated in lipid and glucose metabolism. Sortilin 1 (SORT1) is highly expressed in adipocytes, and the sortilin gene product facilitates the formation and export of VLDL from the liver (49,50). The role of SORT1 in T2D risk is not well understood. Sortilin 1 is required for insulin-dependent glucose uptake (51–53), yet Sort1-knockout mice may show reduced glucose and glycolytic intermediates in the fasted state (54). This highlights again the potential for heterogeneous paths in T2D risk and the dependence on multiple pathways of lipid and glucose metabolism to explain our findings.
Several loci were also identified that are known to be related to T2D, without known associations with circulating LDL-C. These include THADA, C2CD4A, CENPW, and SLC12A8 (21). In addition, we identified several variants associated with lower circulating LDL-C but increased T2D risk with no known biological pathways linking these loci to either trait. SLC2A2, which encodes GLUT2, has not previously been associated with circulating LDL-C or T2D in the large respective GWAS consortia. However, GLUT2 is key to hepatic glucose uptake after a meal and the associated hepatic de novo lipogenesis (55). In fact, liver-specific GLUT2 knockout decreases liver TG concentrations. Importantly, GLUT2 expression in the β-cell is required for the glucose-stimulated insulin response (56). In turn, a locus that decreases GLUT2 expression would be expected to limit serum insulin, increase HbA1c, and decrease circulating LDL-C.
Our approach is subject to several limitations. We used prevalent T2D in the UK Biobank, which limits inferences related to the direction of causality. As incident T2D cases develop in the UK Biobank, it will be important to examine the association of circulating LDL-C at baseline with incident T2D. The risk of a false-positive finding (i.e., a SNP that is associated with two traits in opposite directions, each with P < 5 × 10−5, and with a genome-wide significant metaCCA P value) is extremely low. However, replication was limited to the LDL-C effect estimates of these SNPs, which could increase the risk of false-positive signals with respect to their associations with T2D. It is also difficult to identify the causal gene at identified loci. Although we annotated these loci according to nearby genes and/or previous annotation, the listed and mentioned genes may not necessarily be directly implicated, if at all. Our identification of loci associated with both LDL-C and T2D does not necessarily imply that in each case, the effects of the genetic variant on each trait are linked by a common pathway or mechanism. In other words, the way in which a variant lowers LDL-C could be distinct (different tissue and/or pathway) from the way in which it increases T2D risk. Furthermore, because we are not necessarily identifying a single causal variant, it is possible that within a locus, multiple different variants affect each trait. Results from colocalization do suggest that in many cases, there are different causal variants. However, differences in patterns of linkage disequilibrium between the UK Biobank and DIAGRAM consortium studies could reduce our ability to colocalize causal variants. 
If there were a single shared pathway, we might expect that the effect size on LDL-C would be directly proportional to the effect on T2D across all LDL-C–lowering alleles. However, many variants that are known to be strongly associated with LDL-C were not identified in this analysis (e.g., LDLR, APOB, ABCG5/8), which argues against a single pathway shared by all LDL-C–lowering alleles.
In conclusion, our results suggest that low circulating LDL-C may be a risk factor for T2D, although further study is warranted. We have identified a collection of genetic variants that may provide insight into the mechanisms underlying the diabetogenic risk of low circulating LDL-C and of lipid-lowering medications, and the decreased T2D risk among individuals with familial hypercholesterolemia.
See accompanying article, p. 2058.
This article contains supplementary material online at https://doi.org/10.2337/figshare.12389273.
Acknowledgments. The authors acknowledge the vital contributions of the GLGC and DIAGRAM as well as all organizers and participants of individual participating studies. This research was conducted using the UK Biobank resource under application number 15678. The authors thank the participants and organizers of the UK Biobank.
Funding. The study received support from the National Heart, Lung, and Blood Institute (R01-HL-136528). A.C.W. was funded, in part, by U.S. Department of Agriculture/Agricultural Research Service cooperative agreement no. 58-3092-5-001.
The contents of this publication do not necessarily reflect the views or policies of the U.S. Department of Agriculture, nor does mention of trade names, commercial products, or organizations imply endorsement by the U.S. Government. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Duality of Interest. No potential conflicts of interest relevant to this article were reported.
Author Contributions. Y.C.K. conceived and designed the study. Y.C.K., A.A., M.N., and J.Z. performed data analyses. Y.C.K., M.N., B.J.R., and A.C.W. wrote the manuscript. J.M.O., B.J.R., and A.C.W. contributed to writing of the introduction and discussion. All authors read and edited the full manuscript. Y.C.K. is the guarantor of this work and, as such, had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Prior Presentation. Parts of this study were presented in abstract form at the American Society of Human Genetics 2019 Annual Meeting, Houston, Texas, 15–19 October 2019.