BIOLOGICAL SCIENCES / MICROBIOLOGY
Identification of a large noncoding RNA in extremophilic eubacteria
Departments of Molecular, Cellular, and Developmental Biology and Molecular Biophysics and Biochemistry, and Howard Hughes Medical Institute, Yale University, P.O. Box 208103, New Haven, CT 06520-8103
Edited by Sidney Altman, Yale University, New Haven, CT, and approved October 19, 2006 (received for review August 30, 2006)
| Abstract |
|---|
610 nucleotides, and the 35 representatives examined exhibit extraordinary conservation of nucleotide sequence and base pairing. Structural probing of the OLE RNA from Bacillus halodurans corroborates a complex secondary structure model predicted from comparative sequence analysis. The patterns of structural conservation, and its unique phylogenetic distribution, suggest that OLE RNA carries out a complex and critical function only in certain extremophilic bacteria.
isoprenoid | riboswitch | ribozyme | superoperon
RNAs such as tRNAs, rRNAs, and some ribozymes have noncoding functions that have long been known to be central to RNA processing and protein synthesis mechanisms. However, it seems possible that additional noncoding RNAs will be found that perform fundamental biochemical tasks that to date have been assumed to be the exclusive province of protein factors. A number of noncoding RNAs discovered recently have proven to participate as the key components of gene regulation systems (e.g., refs. 17–19). Additional examples of widespread noncoding RNAs in bacteria, such as 6S RNA (20–22), the dual-function tmRNA (23), and noncoding portions of messenger RNAs such as T-Box elements (24) and riboswitches (25–27), perform important gene control and molecular sensing tasks that are critical for cells to function normally. The existence of so many RNAs with atypical functions implies that some newly discovered noncoding RNAs might perform surprising and important roles in fundamental cellular processes. In this article, we describe features of a noncoding RNA element whose size, structural sophistication, and unique phylogenetic distribution are suggestive of a complex biological function.
| Results |
|---|
However, these lists of structured RNA motifs routinely include elements that do not appear to be riboswitches (e.g., ref. 21). Many of these predicted RNA motifs carry only a few conserved nucleotides and have relatively simple secondary structure architectures compared with riboswitches (data not shown). Some of these unclassified motifs are likely to represent noncoding RNAs whose functions do not demand highly specific nucleotide sequences or structures, or that otherwise are not well conserved among distantly related bacterial species. In addition, the genomic positions of the DNAs that code for these elements frequently are inconsistent with riboswitches, because the latter are almost always located in the 5' untranslated regions of the genes they control.
One of the most striking examples of a new-found structured RNA element was encountered during a recent bioinformatics search of 116 bacterial genomes. Although the search revealed a number of new riboswitch candidates in α-proteobacteria (28), one RNA motif "hit" was not in this bacterial lineage and therefore was not described in the previous report. However, evidence for sequence conservation corresponding to this RNA was made available on the Breaker Laboratory Intergenic Sequence Server (BLISS; http://bliss.biology.yale.edu) (11, 28). Subsequently, we found that this RNA is listed as a candidate gene control element amongst a large number of hits provided as supplementary data for a bioinformatics report published by another laboratory (12).
We call members of this class "OLE RNAs" because they have a particularly ornate secondary structure, are distinctively large compared with the sizes of other well conserved noncoding RNAs, and have been found almost exclusively in species of eubacteria that are characterized as extremophilic. Initially, only the region of OLE RNA encompassed by P12.2 through P14.1 (Fig. 2) was identified. We iteratively used a combination of manual sequence comparison and computational sequence analysis (see Materials and Methods), which revealed the full extent of its size and increased the number of OLE RNA examples identified. This effort yielded 15 representatives (Fig. 1) that are ~600 nt in length and are almost exclusively present in Firmicutes.
Primers for population PCR were designed to hybridize near the 5' and 3' termini of the OLE element (see SI Fig. 5), corresponding to two conserved regions of the gene that were identified upon examining the 15 representatives present in sequenced bacterial genomes (Fig. 1). The majority of the products generated by population PCR corresponded in length (~580 nt) to that expected if known OLE RNA sequences were amplified with the same primers. Although cloning of the PCR products sometimes yielded individual constructs that were substantially larger or smaller than typical OLE RNA templates (data not shown), only those of similar size produced new representatives of the motif. Of 48 clones sequenced, 35 yielded OLE RNA sequences, of which 20 were unique (SI Fig. 5).
Elaborate Sequence and Structural Features of OLE RNAs. The alignment of all OLE RNAs identified (SI Fig. 5) was used to define a consensus sequence (Fig. 1) and a secondary structure model (Fig. 2) for this class of RNAs. The alignment of these sequences from diverse organisms reveals that the RNA exhibits a strikingly high level of sequence and secondary structure conservation. Of the ~610 nucleotides in OLE RNAs spanning from the start of P1 to its completion, nearly half (294 nucleotides) exhibit only one or no mutations relative to the defined consensus sequence. This exceptionally high level of sequence conservation, much of it present within bulges and loops of a complex secondary structure, indicates that OLE RNA forms a complex tertiary structure that is critical for its biological function.
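The tally of highly conserved positions described above can be illustrated with a minimal sketch. It assumes a gapless toy alignment and takes the most frequent nucleotide in each column as the consensus; the function name and the example sequences are hypothetical and are not taken from the OLE RNA alignment in SI Fig. 5.

```python
from collections import Counter


def count_conserved_columns(alignment, max_mismatches=1):
    """Count alignment columns in which at most `max_mismatches`
    sequences differ from the column's most frequent nucleotide."""
    if not alignment:
        return 0
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    conserved = 0
    for column in zip(*alignment):
        # Assumption: the consensus residue is the most common symbol.
        _, top_count = Counter(column).most_common(1)[0]
        if len(column) - top_count <= max_mismatches:
            conserved += 1
    return conserved


# Hypothetical toy alignment (four sequences, seven columns).
toy_alignment = [
    "GGAUCCA",
    "GGAUCUA",
    "GGCUCAA",
    "GGAUCGA",
]

print(count_conserved_columns(toy_alignment))
```

Applied to a real alignment, gap characters would need explicit handling, which this sketch leaves out.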
Some of the conserved nucleotides also occur in segments of the RNA that are predicted to form base-paired stems. However, many of the nucleotides within predicted stems are not restricted in sequence identity. These relatively less conserved nucleotides almost always retain the ability to base-pair with their predicted partners. In total, 23 base-paired elements were defined, and the importance of all these stems for the function of the RNA is supported by frequent evidence of nucleotide covariation. This pattern of sequence conservation also suggests that OLE RNA is not a transposable element or a recently acquired genetic element. Unlike OLE RNAs, such elements exhibit far greater sequence similarity between representatives and they lack substantive structural elements with sequence covariation.
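The idea of covariation, in which paired positions change identity yet retain the ability to base-pair, can be sketched as follows. This is only an illustration under stated assumptions: Watson-Crick and G-U wobble pairs are counted as pairable, a column pair is called covarying when every sequence can pair and both columns vary, and the function name and toy sequences are hypothetical.

```python
# Pairs treated as compatible with a stem (Watson-Crick plus G-U wobble).
PAIRABLE = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),
}


def stem_support(alignment, i, j):
    """Return (fraction of sequences able to pair at columns i and j,
    whether the two columns covary while pairing is fully retained)."""
    pairs = [(seq[i], seq[j]) for seq in alignment]
    pairable = sum(1 for pair in pairs if pair in PAIRABLE)
    fraction = pairable / len(pairs)

    left_varies = len({left for left, _ in pairs}) > 1
    right_varies = len({right for _, right in pairs}) > 1
    covaries = pairable == len(pairs) and left_varies and right_varies
    return fraction, covaries


# Hypothetical toy alignment: columns 0 and 4 form a predicted pair.
toy = ["GAAAC", "CAAAG", "AAAAU"]

print(stem_support(toy, 0, 4))
print(stem_support(toy, 1, 3))
```

A statistical treatment of covariation would also account for phylogenetic relatedness among the sequences, which this sketch ignores.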
We performed a series of in-line probing assays (34) to provide further support for the secondary structure model presented (SI Figs. 6 and 7, and data not shown). In-line probing assays take advantage of the conformation-dependent rates of spontaneous RNA degradation. The resulting data can be used to identify RNA linkages that reside in unstructured regions because their greater flexibility typically permits greater levels of RNA cleavage compared with linkages within structured regions. The sequence from B. halodurans exhibits a pattern of spontaneous RNA cleavage (SI Fig. 7) that largely is consistent with the model derived from comparative sequence analysis (Fig. 2), again indicating that OLE RNAs form an exceptionally complex secondary structure.
OLE RNAs Are Embedded in a Large Operon. In 14 of the 15 organisms known to carry OLE RNA, the gene encoding the RNA is located immediately 5' relative to an ORF for a protein of unknown function. A second ORF further downstream is the dxs gene, which codes for 1-deoxy-D-xylulose-5-phosphate synthase. This compound is a metabolic intermediate for the biosynthesis of a number of fundamental metabolites including some coenzymes and lipids. With riboswitches, the proximity of genes for metabolic enzymes frequently can be used to rationalize the ligand specificity and the gene-control function of the RNA element. Therefore, we sought to further clarify the composition of the transcriptional unit that carries OLE RNA in B. halodurans.
To verify the expression of OLE RNA in B. halodurans, and to establish the length of the RNA transcript, we isolated total RNA from cells and performed a series of RT-PCRs (Fig. 3 A and B). DNA primer pairs also were designed to determine which of the adjacent genes (Fig. 3A) were expressed in a transcriptional unit that is contiguous with OLE RNA. For example, the presence of RT-PCR products in lanes 3–3' and 4–4' (Fig. 3B) confirms that at least some OLE RNA is expressed by B. halodurans grown under aerobic conditions in peptone-starch-carbonate medium (pH 10, 37°C). Furthermore, only DNA primer pairs 6–6' and 10–10' failed to amplify products of the expected sizes, indicating that the RNA is expressed as part of a large polycistronic mRNA containing 10 other genes. This finding is consistent with the architecture of the corresponding genomic region in Bacillus subtilis, which is predicted to carry transcription terminators preceding and following the analogous operon (but lacking OLE RNA).
In other organisms, the ORFs flanking the gene for OLE RNA are homologous, and their genomic arrangements are the same. The only known exception occurs in Desulfitobacterium hafniense, in which the OLE RNA gene resides at a completely different locus, although the remainder of the operon lacking OLE RNA is still present elsewhere in the genome. This suggests that OLE RNA is not exclusively a riboswitch control element that is needed to regulate genes within the large operon. Furthermore, the D. hafniense OLE RNA gene is flanked by intergenic regions that are >200 nt, and therefore the RNA might be expressed as a monocistronic transcript. These findings suggest that the biological function of the RNA does not require that it be coexpressed within the larger polycistronic mRNA.
If OLE RNA is functional as a separate RNA transcript, then it also might be processed from the larger operon transcript produced by most organisms. We examined this possibility by conducting 5' RACE (rapid amplification of cDNA ends) (35, 36), which can be used to identify the 5' termini of processed RNAs. We identified a dominant PCR product and determined the termini of these molecules by cloning and sequencing the resulting amplicons. Of 17 clones examined, 13 had 5' termini that mapped to the P1 region of OLE RNA (Fig. 3C). Therefore, OLE RNAs appear to be processed to yield a dominant RNA product whose nucleotide sequence initiates at the same location as the highly conserved sequences and structures that are characteristic of OLE RNAs. Of the remaining clones, two terminate in the junction between P4 and P5, and two terminate near the loop of P5.
Co-occurrence of OLE RNA and a Predicted Membrane Protein. The gene immediately downstream of the OLE RNA gene in all of the organisms but D. hafniense encodes a putative membrane protein whose function remains unknown (Fig. 3A; BH2780 in B. halodurans). In related organisms that lack OLE RNA (such as B. subtilis or Clostridium perfringens), this predicted membrane protein also is absent, even though the operon otherwise retains the same organization. In addition, we conducted a BLAST search with the BH2780 protein sequence and found that the only significant hits corresponded to organisms that also carry an OLE RNA gene. These findings suggest that a functional link might exist between OLE RNAs and BH2780-like proteins.
Comparative amino acid sequence analysis of the 15 BH2780 homologs from sequenced bacterial genomes reveals that considerable conservation occurs in several regions of the protein (SI Fig. 8). Some of these conserved amino acids reside within four predicted transmembrane domains (SI Figs. 8 and 9), although most of the conserved residues are present in two segments of the polypeptide that are predicted to be positioned on the cytoplasmic side of the membrane. Moreover, preliminary efforts to purify BH2780 protein from a transgenic Escherichia coli strain support cell membrane localization (data not shown).
Mutations in OLE RNA Do Not Alter Gene Expression. OLE RNAs are distinctive in that they carry many more highly conserved nucleotides compared with known riboswitch classes (25–27). OLE RNAs also are at least three times longer than the largest known riboswitch aptamers (Fig. 4A), again suggesting that OLE RNAs are not likely to function exclusively as gene control elements. However, the juxtaposition of OLE RNA elements with the same ORFs is similar to the arrangements commonly seen with riboswitches, and this caused us to consider the possibility that OLE RNAs were a new class of metabolite-sensing gene control element.
β-galactosidase reporter gene in transformed B. subtilis cells also was examined by using methods similar to those reported previously for assessing riboswitch function (37, 38).
Fusion of the entire noncoding portion of the intergenic region between the B. halodurans geranyltranstransferase and BH2780 coding regions (including OLE RNA) to a β-galactosidase gene carrying its own translation start site yields ~100 Miller units when transformed B. subtilis cells are grown in either minimal or rich medium (data not shown). In contrast, fusions of this same noncoding region in each of three reading frames to a β-galactosidase gene lacking its own translation start site do not yield any reporter gene expression in rich medium, suggesting that OLE RNA lacks signals for translation initiation.
Most importantly, no changes were observed in gene expression between the wild-type transcriptional fusion and OLE RNA mutants that disrupt stems P4, P6.1, P12.1, and P12.2 (data not shown). Our results are consistent with the hypothesis that OLE RNA is unlikely to be a gene-control element. It is important to note, however, that the growth conditions examined might not permit the normal function of some riboswitch types, and that the transgenic organism used does not naturally carry OLE RNA.
| Discussion |
|---|
Bioinformatics searches for OLE RNAs from published sequences derived from environmental samples isolated from the Sargasso sea (39), Minnesota soil and whale fall (40), and an acid mine drainage biofilm (41) also did not provide any additional hits. In contrast, cloning of OLE RNA sequences from genomic DNA isolated from a hypersaline mat produced numerous sequences. Although the bacterial mat is estimated to contain at least 750 different species, the apparent enrichment for OLE RNAs might be due to the presence of this RNA primarily in organisms that survive in extreme growth conditions (Fig. 4B).
Of the 15 OLE RNA representatives, 14 are present adjacent to ORFs that encode proteins important for isoprenoid biosynthesis. Changes in isoprenoid type and amount can alter membrane stability, and chemical variations of isoprenoids are favored by some anaerobic bacteria (42, 43). This characteristic localization of OLE RNA genes suggests that the RNA might contribute to biochemical processes that produce unique features of membranes in some extremophilic or anaerobic bacteria. Unfortunately, the operon carrying OLE RNA is populated with other ORFs whose protein products cannot easily be assigned to a single biological process (see legend of Fig. 3A). This operon could be considered a "superoperon," which is a term that is sometimes used to denote long mRNAs that code for an eclectic mixture of proteins (44–46). Therefore, the precise role of OLE RNA cannot easily be deduced from the functions of its coexpressed neighbors.
OLE RNAs are distinctly greater in length and complexity than almost all other noncoding RNAs that are known to be widespread in bacteria. All well conserved noncoding RNAs with sizes similar to OLE RNAs (Fig. 4A) are known to be ribozymes that carry out sophisticated chemical reactions (RNA hydrolysis, RNA splicing, protein synthesis). One structural element in particular, the P12.2 through P14.1 region (Fig. 4C), exhibits a remarkable symmetry in sequence and structure that includes two GNRA tetraloops. GNRA tetraloops are commonly occurring structural elements in structured RNAs that are intrinsically stable and often interact with proteins or other RNA elements (47–49). The closest structural precedent to the P12.2 through P14.1 region is the tandem aptamer arrangement found in cooperative glycine-binding riboswitches (38), and perhaps this region of OLE RNA also binds two identical ligands.
The co-occurrence of a putative membrane protein of unknown function (BH2780) is suggestive of a functional link between the RNA and the protein. The protein sequence alignment reveals conserved stretches of basic residues (lysine and arginine) located at either the amino or carboxy termini of the proteins (SI Fig. 8), which is a characteristic of some RNA-binding proteins. Therefore, we are currently investigating the possibility that OLE RNA and the OLE-associated protein BH2780 form a stable functional complex. Although the true function of OLE RNA remains a mystery, the characteristics of this RNA suggest that it might have a complex function that is critical for many species of eubacteria that live in extreme environments.
| Materials and Methods |
|---|
PCR Amplification and Cloning of OLE RNA Sequences from Environmental Samples. Community genomic DNA was obtained as a gift from Norman Pace (University of Colorado, Boulder), who extracted samples from a hypersaline microbial mat from Guerrero Negro, Baja California Sur, Mexico (31). DNAs used in this study were isolated from layers 8, 9, and 10 (other layers were not examined), which sample a microbial community located at a depth of 13–60 mm from the surface in a zone that is rich in hydrogen sulfide. One hundred to 200 ng of DNA from each layer was PCR-amplified with degenerate synthetic DNA primers 5'-GAAGRCGGGSCTAAAAATCCG and 5'-GGCWTAGGTCCAGYAGGWTTTCCC, where R represents A or G; Y represents T or C; S represents C or G; and W represents A or T (1:1 mixtures). The DNA primers (prepared by the Keck Foundation Biotechnology Resource Center at Yale University) correspond to highly conserved P4 and P12.2 regions of OLE RNA and were expected to produce ~580-bp amplification products.
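As an illustration of what the degenerate primers represent, the following minimal sketch enumerates every individual sequence encoded by each primer. The primer sequences and the meanings of R, Y, S, and W are those stated above; the function and variable names are hypothetical.

```python
from itertools import product

# Degenerate codes as defined in the text (R = A/G, Y = T/C, S = C/G, W = A/T).
DEGENERATE = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "TC", "S": "CG", "W": "AT",
}


def expand_primer(primer):
    """List every non-degenerate sequence encoded by a degenerate primer."""
    choices = [DEGENERATE[base] for base in primer]
    return ["".join(combo) for combo in product(*choices)]


forward_primer = "GAAGRCGGGSCTAAAAATCCG"
reverse_primer = "GGCWTAGGTCCAGYAGGWTTTCCC"

forward_variants = expand_primer(forward_primer)
reverse_variants = expand_primer(reverse_primer)

print(len(forward_variants), "forward variants")
print(len(reverse_variants), "reverse variants")
```

Because each degenerate position is a 1:1 mixture of two nucleotides, a primer with n degenerate positions corresponds to 2^n sequences present in equal proportions.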
Each PCR contained 10 mM Tris·HCl (pH 8.3 at 23°C), 50 mM KCl, 2 mM MgSO4, 250 µM each dNTP, 2 µM each DNA primer, 100 mM betaine, 80 µg/ml BSA, and 1 unit of Taq DNA polymerase (Promega). Amplicons of ~580 bp dominated the PCR products generated from layer 9 DNA, and these were isolated by agarose gel electrophoresis, purified from the gel by using a QIAquick kit (Qiagen), cloned into TOPO TA pCR2.1 vector, and transformed into E. coli TOP10 cells by using a protocol supplied by the vendor (Invitrogen). Plasmids from 48 colonies that carried an insert of ~580 bp were purified, and the appropriate DNA segments were sequenced by the Keck Foundation Biotechnology Resource Center at Yale University.
RNA Expression Analysis by RT-PCR. B. halodurans cells (ATCC 21591) were grown at 37°C in peptone-starch-carbonate medium (ATCC medium 542, which contains the following constituents: 5.0 g/liter peptone, 5.0 g/liter yeast extract, 1.0 g/liter K2HPO4, 0.2 g/liter MgSO4·7H2O, and 10 g/liter soluble starch). The medium was autoclaved at 121°C for 20 min, and subsequently 100 ml of sterile 10% Na2CO3 was added to yield a pH of 10.
Total RNA was isolated from cells grown overnight. Cells (3 OD600) were pelleted and resuspended in 100 µl of 3 mg/ml lysozyme in TE buffer [10 mM Tris·HCl (pH 7.5 at 25°C)/1 mM EDTA]. Cell lysis was facilitated by a freeze–thaw cycle before isolating RNA with 1 ml of TRIzol reagent (Invitrogen) by using the protocol supplied by the vendor. Five micrograms of total RNA was used as template for each 20-µl reverse transcription reaction with SuperScript II reverse transcriptase (Invitrogen) and an appropriate DNA primer, using the buffer and the protocol supplied by the vendor. Negative controls without enzyme were included for each sample or DNA primer pair used. Five microliters of each reverse transcription reaction was used as template for PCR with the DNA primers indicated for each experiment.
5' RACE. RNA was isolated as described above from B. halodurans cells grown overnight in peptone-starch-carbonate medium. Fifteen micrograms of RNA was used to determine cDNA ends using a protocol similar to that described previously (35, 36). The RNAs were prepared either with or without tobacco acid pyrophosphatase to distinguish primary transcript 5' ends from internal 5' processing sites. The DNA primer for cDNA synthesis, 5'-ACATGCCTTTGTGGATCGCCCTACTA, is complementary to the P6 region of OLE RNA. A second DNA primer for subsequent PCR amplification of cDNAs, 5'-GGACACTGACATGGACTGAAGGAGTA, is homologous to the adaptor RNA, 5'-GGACACUGACAUGGACUGAAGGAGUAGAAAC, used for 5' RACE. PCR products that were detected both with and without tobacco acid pyrophosphatase treatment were purified by using a QIAquick kit and cloned by using a TOPO TA kit, and 17 clones were sequenced.
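The relationships among the 5' RACE oligonucleotides can be checked with a short sketch. It uses the three sequences given above, confirms that the PCR primer matches the 5' portion of the adaptor RNA once T is read as U, and derives the RNA segment to which the cDNA synthesis primer is complementary. The function and variable names are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def dna_to_rna(dna):
    """Rewrite a DNA sequence in the RNA alphabet (T becomes U)."""
    return dna.replace("T", "U")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))


adaptor_rna = "GGACACUGACAUGGACUGAAGGAGUAGAAAC"
pcr_primer = "GGACACTGACATGGACTGAAGGAGTA"
cdna_primer = "ACATGCCTTTGTGGATCGCCCTACTA"

# The PCR primer should be identical to the start of the adaptor RNA.
primer_matches_adaptor = adaptor_rna.startswith(dna_to_rna(pcr_primer))

# RNA segment (5' to 3') that the cDNA synthesis primer anneals to.
target_rna = dna_to_rna(reverse_complement(cdna_primer))

print(primer_matches_adaptor)
print(target_rna)
```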
| Footnotes |
|---|
Abbreviations: OLE, ornate, large, extremophilic
To whom correspondence should be addressed. E-mail: ronald.breaker{at}yale.edu
Author contributions: E.P.-F. and R.R.B. designed research; E.P.-F., J.E.B., and A.R. performed research; E.P.-F., J.E.B., A.R., and R.R.B. analyzed data; and E.P.-F. and R.R.B. wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS direct submission.
This article contains supporting information online at www.pnas.org/cgi/content/full/0607493103/DC1.
© 2006 by The National Academy of Sciences of the USA