Simple sequence repeats in prokaryotic genomes
- Jan Mrázek†,‡,§, Xiangxue Guo†, and Apurva Shah†
- †Department of Microbiology and ‡Institute of Bioinformatics, University of Georgia, Athens, GA 30602
Communicated by Samuel Karlin, Stanford University, Stanford, CA, March 19, 2007 (received for review December 5, 2006)
Abstract
Simple sequence repeats (SSRs) in DNA sequences are composed of tandem iterations of short oligonucleotides and may have functional and/or structural properties that distinguish them from general DNA sequences. They are variable in length because of slip-strand mutations and may also affect local structure of the DNA molecule or the encoded proteins. Long SSRs (LSSRs) are common in eukaryotes but rare in most prokaryotes. In pathogens, SSRs can enhance antigenic variance of the pathogen population in a strategy that counteracts the host immune response. We analyze representations of SSRs in >300 prokaryotic genomes and report significant differences among different prokaryotes as well as among different types of SSRs. LSSRs composed of short oligonucleotides (1–4 bp length, designated LSSR1–4) are often found in host-adapted pathogens with reduced genomes that are not known to readily survive in a natural environment outside the host. In contrast, LSSRs composed of longer oligonucleotides (5–11 bp length, designated LSSR5–11) are found mostly in nonpathogens and opportunistic pathogens with large genomes. Comparisons among SSRs of different lengths suggest that LSSR1–4 are likely maintained by selection. This is consistent with the established role of some LSSR1–4 in enhancing antigenic variance. By contrast, abundance of LSSR5–11 in some genomes may reflect the SSRs' general tendency to expand rather than their specific role in the organisms' physiology. Differences among genomes in terms of SSR representations and their possible interpretations are discussed.
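The definition above (tandem iterations of a short oligonucleotide, grouped by unit length into the 1–4 bp and 5–11 bp classes) can be illustrated with a minimal Python sketch. This is not the authors' procedure: the function names `find_ssrs` and `lssr_class` are hypothetical, and the `min_length` cutoff of 12 bp is an arbitrary placeholder, since the paper's actual definition of a "long" SSR is given in its Methods and is not reproduced here.

```python
import re

def find_ssrs(seq, min_unit=1, max_unit=11, min_length=12):
    """Return (start, unit, copies) for perfect tandem repeats in seq.

    min_length is an assumed placeholder cutoff in bp, not the
    paper's definition of a "long" SSR.
    """
    seq = seq.upper()
    hits = []
    for k in range(min_unit, max_unit + 1):
        # A k-mer followed by one or more exact copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1+" % k)
        for m in pattern.finditer(seq):
            unit = m.group(1)
            # Skip units that are themselves iterations of a shorter
            # motif (e.g. "CATCAT" is reported once, as "CAT").
            if any(unit == unit[:d] * (k // d)
                   for d in range(1, k) if k % d == 0):
                continue
            if len(m.group(0)) >= min_length:
                hits.append((m.start(), unit, len(m.group(0)) // k))
    return hits

def lssr_class(unit):
    """Assign a repeat unit to the abstract's two classes by its length."""
    return "LSSR1-4" if len(unit) <= 4 else "LSSR5-11"

if __name__ == "__main__":
    for start, unit, copies in find_ssrs("GGCATCATCATCATCC"):
        print(start, unit, copies, lssr_class(unit))
```

The regular-expression scan only finds perfect repeats and uses non-overlapping matches, so it is a rough illustration of the concept rather than a substitute for a genome-scale SSR survey.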
Footnotes
- §To whom correspondence should be addressed at: Department of Microbiology, University of Georgia, 550 Biological Sciences, Athens, GA 30602. E-mail: mrazek{at}uga.edu
Author contributions: J.M. designed research; J.M., X.G., and A.S. performed research; J.M. and X.G. analyzed data; and J.M. wrote the paper.
The authors declare no conflict of interest.
¶ Mrázek, J., Kypr, J. (1994) Miami Biotechnol Short Rep 5:39.
This article contains supporting information online at www.pnas.org/cgi/content/full/0702412104/DC1.
- Abbreviations: SSR, simple sequence repeat; LSSR, “long” SSR (see Methods for definition); LSSR1–4, LSSR composed of iterations of 1-mer to 4-mer; LSSR5–11, LSSR composed of iterations of 5-mer to 11-mer.
- © 2007 by The National Academy of Sciences of the USA





