Advances in small RNA sequencing have revealed the enormous diversity of small noncoding RNA (sRNA) classes in mammalian cells. At this point, most investigators in diabetes are aware of the success of microRNA (miRNA) research and appreciate the importance of posttranscriptional gene regulation in glycemic control. Nevertheless, miRNAs are just one of multiple classes of sRNAs and likely represent only a minor fraction of sRNA sequences in a given cell. Despite the widespread appreciation of sRNAs, very little research into non-miRNA sRNA function has been completed, likely due to some major barriers that present unique challenges for study. To emphasize the importance of sRNA research in cardiometabolic diseases, we highlight the success of miRNAs and competitive endogenous RNAs in cholesterol and glucose metabolism. Moreover, we argue that sequencing studies have demonstrated that miRNAs are just the tip of the iceberg for sRNAs. We are likely standing at the precipice of immense discovery for novel sRNA-mediated gene regulation in cardiometabolic diseases. To realize this potential, we must first address critical barriers with an open mind and refrain from viewing non-miRNA sRNA function through the lens of miRNAs, as they likely have their own set of distinct regulatory factors and functional mechanisms.
Introduction
MicroRNA (miRNA) research has enjoyed two decades of remarkable success and landmark studies that have defined miRNA biological function and role(s) in disease pathogenesis, including diabetes (1). This groundswell of interest in miRNAs, along with technological advances in sequencing, has also elevated research into other non-miRNA small noncoding RNAs (sRNAs). The composition of the mammalian transcriptome includes both coding and noncoding transcripts. Noncoding RNAs are further classified based on length (2). For example, the arbitrary cutoff between long noncoding RNAs (lncRNAs) and sRNAs is generally accepted as 200 nucleotides (nts). Strikingly, both lncRNA and sRNA transcripts are further processed into short-length sRNA fragments (<50 nts in length), hereafter referred to as non-miRNA sRNAs, likely through regulated processes resulting in guided and/or positional hydrolysis (2). Mammalian cells express an assortment of short-length non-miRNA sRNAs, including sRNAs derived from parent tRNAs (tRNA-derived sRNAs [tDRs]), rRNAs (rDRs), snoRNAs (snoDRs), snRNAs (snDRs), Y RNAs (yDRs), and many other miscellaneous RNAs (other sRNAs [osDRs]) (Table 1 and Fig. 1). Despite multiple reports describing expression changes of non-miRNA sRNAs in cells and extracellular fluids, to date, the functional relevance and physiological impact of non-miRNA sRNAs are largely unknown, particularly in glucose metabolism and diabetes.
Small and long noncoding RNAs and their role in cardiometabolic diseases
Classification | Abbreviation | Length (nts) | Potential roles in cardiometabolic disease | Ref. |
---|---|---|---|---|
Long noncoding RNAs | | | | |
Competing endogenous RNAs | ceRNAs | >200 | miRNA sponge that includes circRNAs and lncRNAs | (54,55) |
Circular RNAs | circRNAs | >200 | miRNA sponge, rRNA maturation | (56) |
Long noncoding RNAs | lncRNAs | >200 | Epigenetic regulation | (57) |
Parent small noncoding RNAs | | | | |
Primary/precursor miRNAs | pri/pre-miRNAs | <200 | | |
Ribosomal RNAs | rRNAs | <200 | Part of ribosomes, protein synthesis | |
Small nuclear RNAs | snRNAs | <200 | Intron splicing from mRNA precursors | (58) |
Small nucleolar RNAs | snoRNAs | <200 | Guide posttranscriptional modifications of RNAs (rRNAs) | (59) |
Transfer RNAs | tRNAs | <200 | Transfer amino acids to the ribosome for protein synthesis | (60) |
Y RNAs | Y RNAs | <200 | Assist RNA binding proteins | (61) |
sRNA cleavage products | | | | |
MicroRNAs | miRNAs | 19–24 | Posttranscriptional gene regulation | (18) |
rRNA-derived sRNAs | rDRs | <50 | rRNA processing, gene regulation, DNA binding | (62) |
snRNA-derived sRNAs | snDRs | <50 | | |
snoRNA-derived sRNAs | snoDRs | <50 | Posttranscriptional gene regulation | (63) |
tRNA-derived sRNAs | tDRs | <50 | Translation suppression, metabolic inheritance | (40,41) |
Y RNA-derived sRNAs | yDRs | <50 | Gene expression regulation | (64,65) |
Other miscellaneous derived sRNAs | osDRs | <50 | | |
Read-length distribution of non-miRNA sRNAs. Parent noncoding RNAs and the read-length distribution (x-axis) of their sRNA products as reported by reads per million total reads (y-axis). Read lengths for miRNAs are generally around 22 nts, while non-miRNA sRNAs can be quite diverse, as seen with snDRs. Mouse liver, n = 7. Created using Biorender.com.
Although new miRNA studies in glucose metabolism are still being reported and provide the basis for future drug therapies (3), many miRNA investigators have spread to the far corners of the RNA world in search of the next big thing. Consequently, there has been a recent explosion of studies exploring other types of noncoding RNAs; however, these have largely been restricted to the investigation of long-length RNA transcripts (>200 nts), e.g., lncRNAs, pseudogenes, competing endogenous RNAs (ceRNAs), and circular RNAs (circRNAs) (4–8). Despite the high likelihood that these transcripts have other functions, investigations are mainly focused on their ability to competitively bind to and sequester miRNAs (6,9). This highlights a fundamental deficiency in noncoding RNA research, and we need to move toward discovering new functions of both long RNAs (e.g., lncRNAs) and short-length sRNAs (e.g., non-miRNA sRNAs) outside of miRNA activity. Here, we discuss the current state of noncoding RNA research in cholesterol and glucose metabolism and outline the current barriers and potential solutions in pursuing non-miRNA sRNA biology.
Recent Success of miRNA Research in Cardiometabolic Diseases
The study of miRNAs has far outpaced the alternative, i.e., non-miRNA sRNAs, and miRNA research has benefitted tremendously from widespread availability and catalogs of predesigned miRNA tools to rapidly perform the standard set of experiments required for miRNA-based investigation. Advances in miRNA studies have also been greatly aided by a centralized miRNA database (miRbase.org), multiple free, user-friendly software tools for in silico mRNA target prediction, and, most importantly, established canonical pathways for biogenesis and function (10–12). Using these and other wonderful resources for miRNA research, investigators have recently reported novel functions of miRNAs linked to cholesterol and glucose metabolism. For example, the miR-29 family was recently reported to be a negative regulator of the sterol sensing pathway through repression of regulation of sterol regulatory element-binding protein (SREBP) cleavage-activating protein (SCAP), potentially within a feedback network to limit cholesterol and lipid metabolism (13). Moreover, liver-specific Dicer1 knockout mice were found to have increased expression of β-hydroxy β-methylglutaryl-CoA reductase (Hmgcr), the rate-limiting enzyme in cholesterol biosynthesis (14). Although every cell can synthesize cholesterol, the liver is a major source of plasma cholesterol, thus supporting a key role for Dicer processing and miRNA activity in the regulation of cholesterol metabolism. Dicer is an RNase III enzyme responsible for cleaving precursor miRNA into mature miRNA forms, and in the aforementioned study, increased Hmgcr activity in liver-specific Dicer1 knockout mice was attributed to loss of miR-29 processing (10,14). Conversely, we have recently reported that inhibition of miR-29 in vivo reduced hepatic lipogenesis, specifically de novo cholesterol biosynthesis (15). 
In this study, injection of locked nucleic acid (LNA) inhibitors significantly decreased plasma cholesterol levels by 40% in C57BL/6 mice (in vivo) and significantly decreased the cellular conversion of radiolabeled acetate into cholesterol within hepatoma cells (in vitro) (15). Neither the study by Ru et al. (13) nor the study by Liu et al. (14) directly measured the impact of miR-29 inhibition on plasma cholesterol levels or performed cholesterol synthesis assays (i.e., acetate incorporation assays), which provide the most direct test of the directional influence of miR-29 on hepatic cholesterol synthesis. Despite the strong evidence that miR-29 has the potential to regulate Hmgcr, the rate-limiting enzyme in cholesterol biosynthesis, the impact of Dicer1 deficiency on liver cholesterol content and blood cholesterol levels could be due to a simultaneous reduction in other miRNAs, besides miR-29, that are processed by Dicer in the liver (14). Most interestingly, the miR-29 family also likely plays a role in glycemic control. For example, Praveen Sethupathy and colleagues (16) recently reported that inhibition of miR-29b-3p in vivo resulted in improved glycemic control and reduced insulin resistance, thus supporting a critical role for miR-29 in both cholesterol and glucose metabolism in vivo. In addition, miR-29 has also been shown to play a critical role in glucose metabolism in pancreatic islets and β-cell functions. For example, Rutter and colleagues (17) reported that miR-29 is upregulated in pancreatic β-cells and directly targets the plasma membrane monocarboxylate transporter, which facilitates normal insulin secretion. Conversely, miR-29 has also been shown to aid in pancreatic β-cell death in type 1 diabetic mice through regulation of the antiapoptotic gene induced myeloid leukemia cell differentiation protein (Mcl1) (18). Nonetheless, miR-29 is likely one of many miRNAs that contribute to both cholesterol and glucose metabolism. 
We have also previously shown that miR-27b-3p is a posttranscriptional regulatory hub for lipid metabolism, i.e., miR-27b-3p is predicted to regulate more lipid-associated genes than expected by chance, and altered miR-27b-3p expression was associated with inversely regulated lipid metabolism (19). Other groups have also reported key roles for miR-27 in cholesterol homeostasis, including regulation of low-density lipoprotein receptor (LDLR) and ATP-binding cassette transfer protein A1 (ABCA1), key membrane proteins in cholesterol uptake and efflux, respectively (20,21). miR-27b has also recently been found to control key genes within glucose pathways, including regulation of the insulin receptor (INSR) in adipocytes (22). Over time, miRNAs have repeatedly proven to be critical regulators of metabolism, and the success in miRNA biology is likely a direct result of investigators having access to predesigned reagents and databases as well as a canonical mechanism of function that hypotheses and models can be applied to. In contrast, the tools and databases for non-miRNA sRNAs are severely underdeveloped and non-miRNA sRNA research is not afforded all of the luxuries that miRNA research currently enjoys.
Role of ceRNA in Cardiometabolic Diseases
Many miRNA researchers have turned to other types of noncoding RNAs; however, most research activity migrated toward longer noncoding RNAs as opposed to other types of short-length non-miRNA sRNAs. As a result, there has been an incredible burst of basic research into lncRNAs, pseudogenes, and circRNAs. For example, multiple lncRNAs have recently emerged as critical regulators of cholesterol metabolism, and these newly identified transcripts include LeXis (23), NONMMUG027912 (24), ENST00000602558.1 (25), DAPK-IT1 (26), and NONRATT021972 (27). Further information on the gene regulatory mechanisms of lncRNAs in cholesterol and lipid metabolism is reviewed by van Solingen et al. (28). In parallel, multiple lncRNAs have been demonstrated to contribute to glycemic control, including H19 (29), MALAT1 (30), and Bhmt-AS (4). The role of lncRNAs in glycemic control and diabetes is reviewed by Ruan (5). Based on the literature, the most frequently reported biological function for lncRNAs relates to their ability to bind to and inhibit miRNAs from regulating target genes. For example, there is a wave of research into the functional role(s) of noncoding RNA transcripts serving as miRNA sponges, also known as ceRNAs (6,9).
Although early controversies likely delayed the growth of this new field, there has been a burst of scientific advances in this area (31). Recently, we reported that cholesterol homeostasis regulator of miRNA expression (CHROME), a primate-specific lncRNA, serves as a ceRNA and regulates cholesterol metabolism through binding to and suppressing miR-27b-3p, miR-33a/b-5p, and miR-128-3p (7). All three of these miRNAs have previously been reported to regulate specific genes that likely contribute to glucose metabolism (21,22,32–34). CHROME is a remarkable example of an lncRNA that has the capacity to sequester and inhibit the activity of multiple key metabolic miRNAs and thus likely control a substantial number of critically important genes for cholesterol and glucose homeostasis to achieve a higher level of posttranscriptional regulatory control.
Most ceRNAs have been identified as lncRNAs or circRNAs. It should be noted, however, that miRNA sponge activity does not account for all reported functions of lncRNAs or circRNAs, but it does account for a sizeable fraction. Initially, the concept of ceRNAs regulating miRNA activity was met with justified skepticism, which was primarily centered on the issue of transcriptome-wide miRNA binding-site abundance. Nonetheless, these issues may not be as critical as once thought, as many sound and convincing ceRNA studies have been recently published, including strong evidence supporting ceRNA regulation of cholesterol (7,8,23,28) and glucose homeostasis (35). It is now clear that miRNA-mediated posttranscriptional gene regulation likely contributes to multifaceted regulatory networks in glucose metabolism and diabetes, and ceRNA regulation of miRNAs greatly adds to this complexity. Due to the relative infancy of ceRNA research, we expect that ceRNAs will be further implicated in cardiometabolic diseases. Nonetheless, the biggest area of potential discovery may lie in the investigation not of miRNAs and ceRNAs but of non-miRNA sRNAs, reinforcing the notion that we must be open-minded toward new functions for noncoding RNAs that are not related to miRNA activity.
Non-miRNA sRNAs
Although miRNAs have gained the most attention, non-miRNA sRNAs are collectively more abundant in cells and extracellular fluids than miRNAs. For example, we recently performed in-depth sequencing analyses of sRNAs associated with lipoproteins, bile, urine, and liver using high-throughput sRNA sequencing (sRNA-seq) in mice (36). In normal C57BL/6 mouse livers, miRNA reads accounted for ∼20% of the host sRNA read counts compared with non-miRNA sRNAs at ∼80% (Fig. 2A). The most abundant sRNA class in livers was rDRs (55%), followed by miRNAs (20%), snoDRs (15%), and tDRs (7%), with additional contributions from the other miscellaneous classes (osDRs) (Fig. 2A). The non-miRNA sRNAs, particularly the rDRs, are unlikely to be random degradation products, as multiple features support a regulated biogenesis process. For example, each sRNA class produces a distinct pattern of sRNA lengths, e.g., rDRs and tDRs are enriched for sequences approximately 45 nts and 35 nts in length, respectively (Fig. 1). Moreover, non-miRNA sRNAs are consistently produced from specific domains and enriched regions of the parent RNA. For example, sRNA sequences that are processed from 18S rRNA are cleaved from two distinct internal domains (Fig. 2B); however, the sequences and lengths of the sRNAs that are processed from these enriched domains are highly variable. At this time, there is little to no research into the biological functions and physiological relevance of rDRs. Based on their high expression and regulated processing, rDRs likely contribute to some form of gene regulation, potentially of genes associated with cholesterol and glucose metabolism. However, this likely occurs through completely unknown mechanisms, as rDRs and other non-miRNA sRNAs are not normally present in the canonical Argonaute family-containing RNA-induced silencing complex (AGO-RISC) that facilitates miRNA-based posttranscriptional gene regulation (10,11). 
Despite reports that some non-miRNA sRNAs are detected in AGO-RISC, the levels of non-miRNA sRNAs are very low, and this does not support robust posttranscriptional gene regulation by non-miRNA sRNAs within the canonical AGO-RISC silencing process (37). To solidify this point, we performed a meta-analysis of publicly available AGO2 cross-linking immunoprecipitation (CLIP)-seq data to demonstrate the diversity of sRNAs in the AGO2-RISC (38). Briefly, we downloaded sRNA-seq data sets from Gene Expression Omnibus and performed sRNA analyses using our in-house pipeline TIGER (Tools for Integrative Genome analysis of Extracellular sRNAs) (36). Based on this meta-analysis, we found that sRNAs in AGO2-RISC are almost entirely miRNAs, with only minor evidence that other sRNAs, e.g., tDRs, are present (Fig. 2C). Currently, the biological functions for non-miRNA sRNAs are not well understood; however, if we assume that they are more than biological noise, they likely confer some level of gene regulation, although this regulation is not likely to occur through canonical miRNA-mediated silencing mechanisms. This has likely created a barrier to their investigation as it requires the discovery of a completely new mechanism of function for gene regulation, which is difficult and discouraging despite the tremendous potential for novel research and biology. The most well-studied class of non-miRNA sRNAs is tDRs, which have indeed proven to have multiple gene regulatory functions outside of AGO2-RISC (39).
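The class composition described above (the share of reads assigned to each sRNA class in a sample) reduces to a simple tally. The following is a minimal sketch, not the TIGER pipeline; the function name `class_percentages` and the toy read counts are hypothetical and were chosen only to mirror the approximate liver proportions reported in the text.

```python
from collections import Counter


def class_percentages(read_counts):
    """Return {sRNA class: percent of total reads} from (class, count) pairs."""
    totals = Counter()
    for srna_class, count in read_counts:
        totals[srna_class] += count
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {c: 100.0 * n / grand_total for c, n in totals.items()}


# Hypothetical toy counts (not real data); osDR stands in for the remainder.
toy_reads = [
    ("rDR", 550),
    ("miRNA", 200),
    ("snoDR", 150),
    ("tDR", 70),
    ("osDR", 30),
]

pct = class_percentages(toy_reads)
# Combined share of every class other than miRNAs.
non_mirna = sum(value for name, value in pct.items() if name != "miRNA")

for name, value in sorted(pct.items(), key=lambda item: -item[1]):
    print(f"{name}: {value:.1f}%")
print(f"non-miRNA total: {non_mirna:.1f}%")
```

The same tally applied to AGO2 CLIP-seq reads would be expected to show the opposite picture, with miRNAs accounting for nearly all reads.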
Diversity of small RNAs. A, B, and D: Small RNA sequencing analyses of mouse liver samples (n = 7, S1–S7). A: Percentage of miRNA and non-miRNA sRNA reads (y-axis) for each mouse liver sample (x-axis). rDRs are the most abundant sRNAs in each sample. B: Enrichment domains of non-miRNA sRNAs across 18S rRNA, as reported as positional base counts. x-axis: Positional base count of 18S sRNAs; y-axis: mouse liver samples. Red, highly enriched base count; white, no enrichment. Across 7 samples, 18S rRNA products are predominantly produced from two distinct internal domains. C: Meta-analysis of AGO2 CLIP-seq data of 293S cells (38). Percentage of miRNA and non-miRNA sRNA reads (y-axis) for each 293S sample (x-axis). AGO2-RISC complex predominantly contains miRNAs and not non-miRNA sRNAs. D: Box plots showing unique sequences per million reads (y-axis) for the top 10 of sRNAs of each class (x-axis) in mouse liver samples. n = 7. Non-miRNA sRNAs have diverse amount of unique sequences compared with miRNAs. Wilcoxon rank sum test, ***P < 0.0001.
Multiple studies have reported tDR changes in biological tissues and fluids in response to disease; however, loss-of-function studies for specific tDRs are limited, so a complete understanding of their impact remains to be determined. Nevertheless, one area of fascinating metabolic research that has emerged is the biological relevance of sperm tDRs conferring paternal and/or transgenerational metabolic inheritance (40,41). For example, multiple studies have reported that tDRs in sperm are a conduit for the transfer of paternal metabolic health features to progeny and have defined the impact of low-protein and high-fat diets on this process (40,41). Importantly, Chen et al. (41) demonstrated that high-fat diet–fed fathers had offspring with altered pancreatic islet transcriptomes, impaired glucose tolerance, and increased insulin resistance. A recent study also found that paternal exercise negated the effects of high-fat diets in fathers on offspring; specifically, exercise improved glucose tolerance and glucose uptake and reduced fat accumulation (42). The underlying biology of these effects was demonstrated to be conferred by sperm tDRs, e.g., paternal exercise was found to reverse the observed diet-induced increase in sperm tDRs in the aforementioned study (42). Most interestingly, tDRs have also been shown to be enriched in hypertrophic hearts compared with controls, and this phenomenon is apparently also passed to offspring through tDRs in sperm (43). This study reported that cardiac tDRs likely regulate the tissue inhibitor of metalloproteinase 3 (Timp3) and contribute to fibrosis and apoptosis in the hearts of offspring (43). Furthermore, it was recently reported that this is not just a consequence of paternal metabolic health, as maternal metabolic effects were also shown to be transmitted across generations via tDRs (44). 
Maternal metabolic features (e.g., high-fat diet–induced effects) were found to be transferred to F1 offspring sperm (by tDRs), which in turn affected two more generations of offspring (44). Remarkably, sperm tDRs were reported to not only transmit obesity-associated phenotypes but also alter gene expression in the brains of offspring associated with addiction (44). While these studies and others have reported that sperm tDRs recognize and repress target genes (mRNAs), recent results from Qi Chen and colleagues (45) suggest that RNA base modifications harbored on sperm tDRs conferred the transgenerational metabolic inheritance. For example, deletion of a specific tRNA methyltransferase, tRNA aspartic acid methyltransferase 1 (Trdmt1), resulted in reduced tDR m5C modifications at the C38 position and increased tDR content in sperm (45). Strikingly, this report demonstrated that loss of this one modification conferred the paternal metabolic inheritance specifically related to high-fat diet–induced impaired glucose metabolism (45). In a nonepigenetic inheritance study, deficiency of another methyltransferase, tRNA methyltransferase 10 homolog A (TRM10A), caused hypomethylation of parent tRNAs and their tDR products, which resulted in pancreatic β-cell death (46). In addition to tDRs, rDRs are also abundant in mature sperm, suggesting that other sRNAs in sperm may also confer these transmitted metabolic outcomes; however, this remains to be determined (47). These reports comprise a very exciting aspect of non-miRNA sRNA function in metabolism, specifically for tDRs and rDRs, and represent a novel process by which risk of diabetes or impaired metabolic control is passed on to future generations. It should be noted that analyzing tDRs in sRNA-seq data sets presents its own set of challenges and that this important topic has also been reviewed elsewhere (48,49). 
Importantly, many sRNA-seq approaches likely miss a considerable fraction of tDRs and rDRs due to base modifications on the sRNAs. For example, we have known for some time that RNA base modifications impede reverse transcriptase activity and prevent cDNA first-strand synthesis, a critical step in sRNA-seq library preparation. Therefore, heavily modified tDRs, as well as rDRs, are not likely to be fully represented in sRNA-seq data sets, resulting in underestimation of tDR and rDR content in biological samples (50,51). To overcome this barrier, it is recommended that investigators use demethylation-based sRNA-seq and/or improved reverse transcriptase enzyme approaches to facilitate the inclusion of modified sRNAs in the sequencing reactions and gain a more comprehensive picture of the non-miRNA sRNA signature in biological samples. Based on strong evidence that non-miRNA sRNAs are likely more abundant than miRNAs and have the capacity to harbor and transfer metabolic disease–linked imprints, the general paucity of research into the biological function of non-miRNA sRNAs in glucose metabolism and diabetes is a problem and represents a great need in diabetes research.
Current Barriers to Investigating Non-miRNA sRNAs
The central issue for studying non-miRNA sRNA classes is that their cleavage and processing events are considerably less precise than miRNA processing; thus, non-miRNA sRNAs are often highly variable in both length and sequence. Therefore, the diversity of individual unique sequences produced by a single parent RNA is high and substantially greater than that of miRNAs. This point is readily apparent in sRNA-seq data sets. For example, we recently found that each class of non-miRNA sRNAs had significantly more unique sequences (nonredundant read counts per million sRNA counts) than miRNAs in mouse livers (n = 7) for the top 10 most abundant sRNAs per class for each sample: tDRs (P < 0.0001 compared with miRNA using a Wilcoxon rank sum test), rDRs (P < 0.0001), snoDRs (P < 0.0001), snDRs (P < 0.0001), and osDRs (P < 0.0001) (Fig. 2D) (36). This demonstrates that non-miRNA sRNA processing is not as precise or uniform as miRNA biogenesis, which creates a barrier for their further analyses. In some instances, a single sequence will be substantially more abundant than other candidate sequences for non-miRNA sRNAs, which will greatly increase confidence in candidate selection; however, this may not always be readily apparent, and the expected high diversity of sequences and lengths for non-miRNA sRNAs can often present a challenge for selecting a single sRNA sequence for further study.
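To make the diversity measure concrete, the sketch below counts distinct sequences per parent RNA, scales them per million total sRNA reads, and compares two groups with a rank-based statistic. It is one plausible reading of "nonredundant read counts per million sRNA counts", not the published analysis; the function names, the parent names `miR-A` and `tDR-B`, and all sequences and values are hypothetical.

```python
from collections import defaultdict


def unique_sequences_per_million(reads, total_reads):
    """Distinct sequences per parent RNA, scaled per million total reads.

    reads: iterable of (parent_name, sequence, read_count) tuples.
    """
    distinct = defaultdict(set)
    for parent, sequence, count in reads:
        if count > 0:
            distinct[parent].add(sequence)
    return {p: len(s) * 1_000_000 / total_reads for p, s in distinct.items()}


def mann_whitney_u(x, y):
    """U statistic for x versus y: pairs with x > y, ties counted as half."""
    u = 0.0
    for a in x:
        for b in y:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


# Hypothetical sequences: one miRNA with two isoforms, one tDR locus with
# five fragments that differ only at their ends (imprecise cleavage).
mirna_seq = "ACGTACGTACGTACGTACGTAC"
tdr_base = "GGCCTTAA" * 4 + "GGC"  # 35 nts

toy_reads = [
    ("miR-A", mirna_seq, 900),
    ("miR-A", mirna_seq[:-1], 100),
    ("tDR-B", tdr_base, 40),
    ("tDR-B", tdr_base[:34], 25),
    ("tDR-B", tdr_base[:33], 15),
    ("tDR-B", tdr_base[1:], 12),
    ("tDR-B", tdr_base[2:], 8),
]

upm = unique_sequences_per_million(toy_reads, total_reads=1_000_000)
print(upm)

# Toy per-sRNA values for two classes (made up); U equals len(x) * len(y)
# when every non-miRNA value exceeds every miRNA value.
u_stat = mann_whitney_u([5.0, 6.0, 7.0], [1.0, 2.0, 2.5])
print(u_stat)
```

For P values, an established implementation such as `scipy.stats.mannwhitneyu` would be the usual choice; the hand-written statistic here only shows what the test ranks.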
This issue of sequence variability creates a few critical barriers for downstream investigation of non-miRNA sRNAs, including 1) the selection of a single candidate sRNA sequence to study, 2) the lack of predesigned reagents, probes, and reporters, and 3) the high variability across species, samples, and even cells. The first and foremost barrier to studying non-miRNA sRNAs is the challenge of identifying a candidate sRNA to study. For miRNAs, this is relatively easy since miRNAs have a low number of unique sequences. Most likely, one of these few unique miRNA sequences would account for the majority of the molecules for a selected miRNA. Researchers attempting to investigate most non-miRNA sRNA classes do not have this luxury and must choose a sequence they deem to be most representative, which is often not obvious. Furthermore, for miRNAs, the primary and precursor miRNA transcripts are rapidly processed to the mature form, and thus the mature product likely represents a transcriptional response to the biological stimulus. On the contrary, quantification of a single non-miRNA sRNA, e.g., a selected unique sequence, may not accurately represent transcriptional activation of the parent RNA but could represent a processing response to a biological stimulus. Moreover, parent transcripts for many of the non-miRNA sRNAs, e.g., tRNAs and rRNAs, are likely to be more abundant than the cleaved products, i.e., non-miRNA sRNAs, and are likely to be more stable than primary or precursor miRNA transcripts. These observations support that expression analyses for miRNAs and non-miRNA sRNAs could represent disparate cellular responses, and the lack of a direct link between transcriptional changes of the parent RNAs and changes to the abundance of a unique sequence for a given non-miRNA sRNA likely dampens enthusiasm for non-miRNA sRNA research.
The next major barrier to non-miRNA sRNA research is the lack of predesigned tools, probes, and reagents for validation of sRNA-seq results and downstream functional analyses. To study the expression and function of candidate miRNAs, one only has to search online catalogs to purchase predesigned PCR probes, miRNA inhibitors (e.g., LNA inhibitors or antagomiRs), and gene reporter luciferase constructs and reagents. On the contrary, non-miRNA sRNAs require designing custom probes, inhibitors, and other tools, which in and of itself is not particularly challenging but is often more expensive. Another potential barrier to studying non-miRNA sRNAs is the high variability of sequences and lengths for non-miRNA sRNAs between samples and subjects. One beneficial feature of miRNAs is that miRNA expression and processing are generally consistent between samples. This does not appear to be the case for non-miRNA sRNAs, despite the fact that the non-miRNA sRNAs are generally thought to be processed from parent RNAs that have strong conservation between species and are consistent across samples in a group, i.e., tRNAs and rRNAs. On the contrary, non-miRNA sRNAs are not consistent across samples or conserved between species. The contributing parent RNA transcripts may very well be conserved, but due to the imprecise nature of their cleavage, the products are likely to be highly variable. Although the lack of interspecies conservation for non-miRNA sRNAs is problematic, particularly for representative models of disease or preclinical experiments, the lack of intraspecies consistency represents more of a barrier for investigation, as it is difficult to study the function of a non-miRNA sRNA if the sequences and lengths are different in each sample. This barrier poses unique challenges for studying the physiological impact of non-miRNA sRNAs in glucose metabolism and diabetes. 
In addition to the technical and methodological problems, another major barrier to studying non-miRNA sRNAs is the general lack of understanding of their underlying biological functions.
Although there are many exceptions, miRNAs for the most part posttranscriptionally regulate genes through a canonical process where mature miRNAs are loaded into AGO-RISC and target mRNAs harboring seed-based miRNA target sites within their 3′ untranslated regions. This does not appear to be the case for non-miRNA sRNAs (Fig. 2C), as any potential gene regulation would likely be mediated through multiple different mechanisms. Although it is difficult to gauge, the lack of established mechanisms for non-miRNA sRNAs likely contributes to the limited interest in their investigation. Nevertheless, non-miRNA sRNAs may very well contribute to pathophysiology in a meaningful way, and thus entire new fields of gene regulation and sRNA biology are likely to be discovered if these barriers to progress are resolved.
Potential Ladders to Climb Over the Barriers
It is easy to discount non-miRNA sRNAs as biological noise with limited biological relevance; however, we believe there is great potential to be discovered if these barriers are addressed. Currently, there are more problems than solutions to the issues inherent in studying non-miRNA sRNAs. Nevertheless, a few conceptual changes may advance the field, particularly those related to how we define the regulatory space. At this time, it is unclear whether PCR quantification of a single non-miRNA sRNA sequence is sufficient to make expression and/or processing claims for a given sRNA in response to disease or a biological context. Instead of selecting a single sRNA fragment for PCR, another option for downstream analysis is to use custom-designed hybridization probes that detect a common motif or core that is shared among all or most of the fragments for a given non-miRNA sRNA locus. The consequence of this approach is that the parent RNAs that were processed into the non-miRNA sRNA fragments will also contain the common motif and will be included in the detection signal, which is a major problem when the parent RNA species are much more abundant than the processed fragments. This overestimation problem with parent RNA can be addressed by size-selecting the sRNAs (or converted cDNAs) to only include short-length sRNAs (e.g., <50 nts in length) in the analysis.
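As an illustration of how a shared core might be located before designing such a probe, the following is a minimal sketch, not an established protocol. It assumes each fragment has already been mapped to its parent RNA and is described by hypothetical 0-based, half-open (start, end) coordinates; the function name, the min_count parameter, and the toy coordinates are assumptions made for this example only.

```python
def shared_core(fragments, parent_length, min_count=None):
    """Return the longest contiguous region of the parent RNA covered by at
    least `min_count` fragments, as a 0-based half-open (start, end) tuple.

    fragments     : list of (start, end) intervals on the parent RNA
    parent_length : length of the parent RNA in nucleotides
    min_count     : minimum number of fragments that must cover a position;
                    defaults to all fragments (a strict shared core)
    """
    if not fragments:
        return None
    if min_count is None:
        min_count = len(fragments)

    # Per-position depth: how many fragments cover each parent position.
    depth = [0] * parent_length
    for start, end in fragments:
        for pos in range(start, end):
            depth[pos] += 1

    # Scan for the longest run of positions meeting the depth threshold.
    best = None
    run_start = None
    for pos in range(parent_length + 1):
        covered = pos < parent_length and depth[pos] >= min_count
        if covered and run_start is None:
            run_start = pos
        elif not covered and run_start is not None:
            if best is None or pos - run_start > best[1] - best[0]:
                best = (run_start, pos)
            run_start = None
    return best


# Hypothetical fragments from one 76-nt parent RNA (toy data, not real reads).
toy_fragments = [(2, 24), (5, 30), (4, 27), (6, 22), (18, 40)]

strict_core = shared_core(toy_fragments, parent_length=76)
relaxed_core = shared_core(toy_fragments, parent_length=76, min_count=4)
```

Requiring all fragments to share the core yields a short region here, whereas relaxing the threshold to "most" fragments yields a longer one that may be more practical for probe design. In either case the parent RNA itself contains the core, so the size-selection caveat described above still applies.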
The sheer number of potential regulatory molecules for non-miRNA sRNAs is extreme and likely distracts from meaningful investigation of the biological functions of a single non-miRNA sRNA sequence. One major conceptual advance toward overcoming this issue would be to investigate whether sRNAs in cells regulate gene expression or cellular phenotypes independent of length and sequence. For example, it is possible that non-miRNA sRNA function is not conferred by individual sRNA molecules based on sequence and antisense recognition of target molecules, but rather that their biological relevance is tied to their abundance (in toto) and is simply related to their form, i.e., short-length single-stranded RNAs (ssRNAs). It is entirely plausible that non-miRNA sRNAs regulate gene expression, signaling cascades, and cellular phenotypes through activation of specific ssRNA receptors or RNA binding proteins (ribonucleoproteins). For example, endosomal Toll-like receptors 7 and 8, which have been studied as pathogenic receptors of ssRNA viruses, have also been reported to recognize miRNAs and other sRNAs (52). Moreover, it is plausible that there are other receptors (e.g., cytoplasmic sensors) and/or ribonucleoproteins that could recognize non-miRNA sRNAs and elicit a specific cellular response; however, identification of novel cytoplasmic RNA binding proteins is likely required to advance this hypothesis.
To address the major issue of sequence and length variance for non-miRNA sRNAs across samples, one potential improvement would be to use positional cleavage counts instead of expression counts for individual sequences. For example, instead of trying to quantify the many different sRNA fragments for a given sRNA, the base counts at the 5′ terminal start positions could be used in tandem with other bioinformatic approaches to quantify the impact of the cellular response on processing of the parent RNA. This can be achieved by analyzing the distribution of base counts for the 5′ terminal start positions of the sRNA fragments. For example, in our recent study of mouse livers, the positional base counts clearly show the enriched positions and intensity of parent RNA processing into non-miRNA sRNA fragments (Fig. 3). This analysis also highlights the differences in cleavage precision between sRNA classes, with miRNAs having greater precision than non-miRNA sRNAs. Researchers could then validate sRNA-seq results and quantify parent RNA processing using different methods, including circularized reverse transcription coupled with PCR or a modified version of 5′ rapid amplification of cDNA ends (RACE) PCR for sRNAs (53). It is unknown whether these conceptual changes will advance the field, but it is clear that changes and/or improvements are needed to overcome the current barriers to studying non-miRNA sRNAs in cardiometabolic diseases, e.g., diabetes.
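The positional counting described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the pipeline used for Fig. 3: it assumes reads have already been mapped to a parent RNA and reduced to hypothetical 0-based 5′ start positions, and the dominant-start fraction is an illustrative summary of cleavage precision that we introduce here, not a metric defined in the text. All values are toy data.

```python
from collections import Counter


def start_position_counts(read_starts):
    """Count reads by their 5' start position on the parent RNA (0-based)."""
    return Counter(read_starts)


def dominant_start_fraction(counts):
    """Fraction of reads that begin at the single most common 5' start
    position; values near 1 suggest precise cleavage, lower values suggest
    start sites spread across the parent RNA."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return max(counts.values()) / total


# Hypothetical 5' start positions for reads mapped to one miRNA and one tDR.
mirna_starts = [0, 0, 0, 0, 0, 0, 0, 0, 1, 0]
tdr_starts = [0, 3, 3, 12, 15, 15, 31, 33, 33, 40]

mirna_counts = start_position_counts(mirna_starts)
tdr_counts = start_position_counts(tdr_starts)

mirna_precision = dominant_start_fraction(mirna_counts)
tdr_precision = dominant_start_fraction(tdr_counts)
```

Because the counts are keyed by position on the parent RNA and not by fragment sequence, samples can be compared even when the exact fragment sequences and lengths differ between them.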
Positional cleavage counts of small RNAs. Distribution of 5′ start position (x-axis) counts (cleavage counts) (y-axis) from selected candidate miRNA and non-miRNA sRNAs in mouse livers (n = 7). 5′ miRNAs are predominantly processed at position 0 from their parent RNA, while non-miRNA sRNAs can be processed from sRNAs. miRNAs, blue; rDRs, orange; snoDRs, purple; snDRs, red; tDRs, green; osDRs, mustard.
Conclusions
Collectively, the major obstacles to studying non-miRNA sRNAs are directly related to imprecise processing, which is a barrier to candidate selection and causes high variability across samples. The most likely path forward will require new biology and advances in methodology, particularly in bioinformatics. Critically important research into the physiological impact of miRNAs in cardiometabolic diseases is still being conducted, and novel findings are continuing to be reported. Nevertheless, there are many other classes of sRNAs that have not been extensively studied that may have equal or more regulatory potential than miRNAs. It is most likely time to leave the comfort of miRNA research and turn our collective attention to the many other sRNAs that are being neglected in cholesterol and glucose metabolism research.
This article is part of a special article collection available at https://diabetes.diabetesjournals.org/collection/small-noncoding-RNAs-in-diabetes.
Article Information
Funding. This work was funded by the W.M. Keck Foundation and grant awards from the National Institutes of Health National Heart, Lung, and Blood Institute (HL128996, HL127173, HL116263, DK103067).
Duality of Interest. No potential conflicts of interest relevant to this article were reported.
Appendix
The research design and methods are as follows. Fig. 1: After reads were mapped to different sRNA categories using the TIGER system, the read length distribution for each sRNA category was visualized by histogram plot. Fig. 2A and C: Seven mouse liver samples were analyzed using the TIGER system. The eight sequencing result files of data set GSE44378 were downloaded from NCBI (National Center for Biotechnology Information), converted to FASTQ format, and analyzed using the TIGER system. The percentage of each sRNA category identified in each sample was summarized. Fig. 2B: For each base position in the 18S rRNA sequence, the read coverage was defined as the number of reads that covered this base position in the sample. The read coverage percentage of this position was calculated as the read coverage divided by the total reads mapped to 18S rRNA and visualized by geom_tile of ggplot2. Fig. 2D: In each sample of the publicly available mouse liver data set, the unique sequences per million reads (USPM) for each sRNA category were calculated by normalizing the number of unique sequences mapped to those 10 selected features by the total number of reads mapped to those 10 features, multiplied by 1 million. Wilcoxon rank sum tests were used to test differences in USPM. Fig. 3: One feature (sRNA) was picked as representative for each sRNA category in sample S6 of the mouse liver data set. The start position of each read mapped to this feature was recorded and then visualized by histogram plot.
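The two normalizations described for Fig. 2B and Fig. 2D can be written out as a minimal sketch. This is not the TIGER implementation; the function names, the 0-based half-open read coordinates, and the toy inputs are assumptions made for illustration only.

```python
def coverage_percentage(reads, parent_length):
    """Per-position read coverage as a percentage of all reads mapped to the
    parent RNA (the Fig. 2B calculation, here for a generic parent).

    reads         : list of (start, end) 0-based half-open intervals
    parent_length : length of the parent RNA in nucleotides
    """
    total_reads = len(reads)
    coverage = [0] * parent_length
    for start, end in reads:
        for pos in range(start, end):
            coverage[pos] += 1
    if total_reads == 0:
        return [0.0] * parent_length
    return [100.0 * depth / total_reads for depth in coverage]


def unique_sequences_per_million(read_sequences):
    """USPM: number of unique sequences divided by the total number of reads,
    multiplied by 1 million (the Fig. 2D calculation).

    read_sequences : one sequence string per mapped read (duplicates kept)
    """
    total_reads = len(read_sequences)
    if total_reads == 0:
        return 0.0
    return len(set(read_sequences)) / total_reads * 1_000_000


# Hypothetical toy inputs (not real sequencing data).
toy_reads = [(0, 4), (2, 6), (2, 4), (5, 8)]
toy_coverage = coverage_percentage(toy_reads, parent_length=8)

toy_sequences = ["AAC", "AAC", "GGT", "GGT", "GGT", "TTA", "AAC", "AAC"]
toy_uspm = unique_sequences_per_million(toy_sequences)
```

A higher USPM indicates that the same number of reads is spread over more distinct sequences, which is one way to express the sequence variability discussed in the main text.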