Table 1. Preferred genome partners based on maximally representative clusters
| Phylum/division | Shared nonexclusively, uncorrected | Shared nonexclusively, corrected | Shared exclusively, uncorrected | Shared exclusively, corrected |
|---|---|---|---|---|
| Aquificales | γ-Proteobacteria | Thermotogales | Euryarchaeota | ε-Proteobacteria |
| Bacteroidetes | γ-Proteobacteria | Listeria | γ-Proteobacteria | Chlorobi |
| Chlamydiales | γ-Proteobacteria | Aquificales | Lactobacillales | ε-Proteobacteria |
| Chlorobi | γ-Proteobacteria | Aquificales | γ-Proteobacteria | Bacteroidetes |
| Crenarchaeota | Euryarchaeota | Clostridia | Euryarchaeota | Euryarchaeota |
| Cyanobacteria | γ-Proteobacteria | Clostridia | α-Proteobacteria | α-Proteobacteria |
| Euryarchaeota | γ-Proteobacteria | Clostridia | Crenarchaeota | Crenarchaeota |
| High-G+C Firmicutes | γ-Proteobacteria | Clostridia | γ-Proteobacteria | α-Proteobacteria |
| Nanoarchaeota | Euryarchaeota | Aquificales | γ-Proteobacteria | Chlamydiales |
| Planctomycetes | γ-Proteobacteria | Clostridia | γ-Proteobacteria | γ-Proteobacteria |
| Spirochaetales | γ-Proteobacteria | Clostridia | γ-Proteobacteria | Cyanobacteria |
| Thermotogales | Clostridia | Aquificales | Euryarchaeota | Clostridia |
| Thermus/Deinococcus | γ-Proteobacteria | Aquificales | γ-Proteobacteria | Cyanobacteria |
| α-Proteobacteria | γ-Proteobacteria | β-Proteobacteria | γ-Proteobacteria | γ-Proteobacteria |
| β-Proteobacteria | γ-Proteobacteria | Clostridia | γ-Proteobacteria | γ-Proteobacteria |
| γ-Proteobacteria | β-Proteobacteria | β-Proteobacteria | β-Proteobacteria | β-Proteobacteria |
| ε-Proteobacteria | γ-Proteobacteria | Aquificales | γ-Proteobacteria | Aquificales |
| Bacilli | γ-Proteobacteria | Clostridia | Clostridia | Clostridia |
| Clostridia | γ-Proteobacteria | Listeria | Bacilli | Bacilli |
| Lactobacilli | γ-Proteobacteria | Listeria | Listeria | Listeria |
| Listeria | Bacilli | Staphylococci | Lactobacilli | Lactobacilli |
| Mollicutes | γ-Proteobacteria | Listeria | γ-Proteobacteria | Staphylococci |
| Staphylococci | Bacilli | Listeria | Bacilli | Bacilli |
Genomes are grouped according to the National Center for Biotechnology Information level 4 taxonomy, except for the proteobacteria and the low-G+C Gram-positive bacteria, which we subdivide into four and six divisions, respectively, as implied by the MRP supertree (see Fig. 6). For each defined taxonomic group G, we determine the group G′ ≠ G that is represented most frequently in maximally representative clusters (MRCs) that also contain proteins from G. The co-occurrence requirement can be either nonexclusive, in which case all MRCs containing proteins from both G and G′ are counted, or exclusive, in which case only MRCs that contain proteins from G and G′ and from no other taxonomic group are counted. Preferred partners are reported both for the raw count of shared MRCs (uncorrected) and after normalization (corrected), in which each raw count is divided by the total number of MRCs that contain a protein from G and/or G′. For instance, in the first row, MRCs in which proteins from Aquificales (Aquifex aeolicus) and γ-Proteobacteria co-occur nonexclusively are more numerous than those containing proteins from Aquificales and any other single phylum represented in our data set (column 2). However, normalizing for the large number of MRCs that contain γ-proteobacterial proteins (column 3) reveals that proteins from Aquificales co-occur preferentially with those from Thermotogales (Thermotoga maritima). The preferred exclusive partner is Euryarchaeota based on raw count (column 4) but ε-Proteobacteria after normalization (column 5).
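The partner selection described in this footnote can be illustrated with a minimal Python sketch. It assumes each MRC has already been reduced to the set of taxonomic groups it contains. The function name `preferred_partners`, the toy group labels, the toy counts, and the alphabetical tie-breaking are all hypothetical and serve only to illustrate the four counting schemes; this is not the code used to produce Table 1.

```python
from collections import Counter

def preferred_partners(mrcs, group):
    """Preferred partner of `group` under the four schemes of the table.

    `mrcs` is an iterable of sets of group labels, one set per MRC
    (an assumed input format). Ties are broken alphabetically, which
    is an assumption of this sketch.
    """
    mrcs = [frozenset(m) for m in mrcs]
    # Number of MRCs in which each group occurs at all.
    occurrences = Counter(g for m in mrcs for g in m)

    nonexclusive = Counter()  # MRCs with group and partner, others allowed
    exclusive = Counter()     # MRCs with group and partner and nothing else
    for m in mrcs:
        if group not in m:
            continue
        partners = m - {group}
        for other in partners:
            nonexclusive[other] += 1
        if len(partners) == 1:
            (other,) = partners
            exclusive[other] += 1

    def best(counts, normalize):
        if not counts:
            return None

        def score(other):
            raw = counts[other]
            if not normalize:
                return raw
            # MRCs containing a protein from group and/or other.
            union = occurrences[group] + occurrences[other] - nonexclusive[other]
            return raw / union

        return max(sorted(counts), key=score)

    return {
        "nonexclusive_uncorrected": best(nonexclusive, False),
        "nonexclusive_corrected": best(nonexclusive, True),
        "exclusive_uncorrected": best(exclusive, False),
        "exclusive_corrected": best(exclusive, True),
    }

# Toy data with made-up labels: "big" occurs in many MRCs, "rare" in one.
toy_mrcs = (
    [{"G", "big", "small"}] * 3
    + [{"G", "big", "other"}]
    + [{"G", "other"}] * 2
    + [{"G", "rare"}]
    + [{"big"}] * 2
    + [{"big", "other"}]
    + [{"other"}] * 7
)
result = preferred_partners(toy_mrcs, "G")
```

In this toy data set, "big" is the preferred nonexclusive partner by raw count (4 shared MRCs), but normalization favors "small" (3/7 versus 4/10), mirroring the Aquificales example in which a widely represented group wins on raw counts only. The same reversal occurs for exclusive partners, where "other" wins on raw count (2 versus 1) and "rare" wins after normalization (1/7 versus 2/15).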