Table 1. General genome features
| Feature | T. denticola | T. pallidum* | B. burgdorferi* | L. interrogans |
|---|---|---|---|---|
| Size, bp | 2,843,201 | 1,138,012 | 910,725 | 4,691,184 |
| G+C content, % | 37.9 | 52.8 | 28.6 | 36.0 |
| Protein-coding genes | | | | |
| No. with assigned function | 1,223 | 542 | 487 | 2,060 |
| No. of unknown function | 352 | 35 | 22 | 146 |
| No. of conserved hypotheticals§ | 477 | 175 | 102 | 569 |
| No. with no database match | 734 | 288 | 242 | 1,952 |
| Total | 2,786 | 1,040 | 853 | 4,727 |
| Average CDS size, bp | 939 | 1,017 | 992 | 778 |
| Coding, % | 92.1 | 93.0 | 93.5 | 78.4 |
| rRNA | 6 | 6 | 5 | 4 |
| tRNA | 44 | 45 | 34 | 37 |
  • * The distribution of CDSs in the T. pallidum and B. burgdorferi chromosomes is derived from the original annotations. These numbers, particularly those for hypothetical and conserved hypothetical proteins, may differ significantly with updated BLAST searches and annotation.

  • The genome information for L. interrogans represents combined data from both chromosomes.

  • Unknown function, significant sequence similarity to a named protein for which no function is currently attributed.

  • § Conserved hypothetical protein, sequence similarity to a translation of another ORF; however, no experimental evidence for protein expression exists.

  • Hypothetical protein, no significant similarity to any other sequenced protein.

  • Twenty-five of the CDSs in T. denticola contain one or more authentic frameshifts or point mutations, or are truncated.
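
The four protein-coding categories in Table 1 should add up to the reported total for each genome. The short Python sketch below checks that arithmetic and expresses each category as a share of the total. The counts are transcribed from the table; the names `GENE_COUNTS`, `REPORTED_TOTALS`, and `category_shares` are illustrative and not part of the original work.

```python
# Protein-coding gene counts transcribed from Table 1.
# Tuple order follows CATEGORIES below. All names here are illustrative.
CATEGORIES = (
    "assigned function",
    "unknown function",
    "conserved hypothetical",
    "no database match",
)

GENE_COUNTS = {
    "T. denticola": (1223, 352, 477, 734),
    "T. pallidum": (542, 35, 175, 288),
    "B. burgdorferi": (487, 22, 102, 242),
    "L. interrogans": (2060, 146, 569, 1952),
}

# Totals as reported in the "Total" row of Table 1.
REPORTED_TOTALS = {
    "T. denticola": 2786,
    "T. pallidum": 1040,
    "B. burgdorferi": 853,
    "L. interrogans": 4727,
}


def category_shares(counts):
    """Return each category's percentage of the summed counts."""
    total = sum(counts)
    return {name: 100.0 * n / total for name, n in zip(CATEGORIES, counts)}


if __name__ == "__main__":
    for genome, counts in GENE_COUNTS.items():
        # The category counts should reproduce the reported total.
        assert sum(counts) == REPORTED_TOTALS[genome], genome
        shares = category_shares(counts)
        summary = ", ".join(f"{k}: {v:.1f}%" for k, v in shares.items())
        print(f"{genome} ({sum(counts)} CDSs): {summary}")
```

Comparing the "no database match" share across genomes should be done with the first footnote in mind, since the T. pallidum and B. burgdorferi counts come from the original annotations.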