Table 1.

H. dujardini assembly comparison

Genome assembly                  nHd.2.3        UNC (13)
Scaffold metrics
 No. scaffolds                   13,202         22,497
 Span (Mb)                       134.96         252.54*
 Min length (bp)                 500            2,000
 N50 length (bp)                 50,531         15,907
 Scaffolds in N50                701            4,078
 GC proportion                   0.452          0.469
Quality assessment
 CEGMA completeness              97.2%          94.8%
 CEGMA average copies            1.55           3.52
 RNA-Seq mapping                 92.8%          89.5%
Genome content
 Protein-coding genes            23,021         39,532*
 Contaminant span (Mb)           1.5 (1.1%)     68.9 (27.3%)
 Initial bacterial HGT loci      554            6,663
 Bacterial contaminants          355            9,872
 HGT with expression             196            NA
  • An extended version of this table is available as SI Appendix, Table S1. NA, not applicable.

  • * The UNC genome was reported (13) to have a span of 212 Mb and contain 38,145 genes, but the correct values are derived from the deposited data files from ref. 13.

  • 9,872 loci were predicted on the 68.9 Mb of contaminant scaffolds, but not all were flagged as fHGT by Boothby et al. (13).
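
The percentages in the "Contaminant span (Mb)" row can be checked against the "Span (Mb)" row. The following is a minimal sketch, assuming each percentage is the contaminant span divided by the total assembly span; the variable and function names are illustrative and do not come from the source.

```python
# Values copied from Table 1 (both in Mb).
assemblies = {
    "nHd.2.3": {"span_mb": 134.96, "contaminant_mb": 1.5},
    "UNC": {"span_mb": 252.54, "contaminant_mb": 68.9},
}


def contaminant_percent(span_mb, contaminant_mb):
    """Contaminant span as a percentage of total span, rounded to one decimal."""
    # Assumption: the table's percentages use the total span as denominator.
    return round(100.0 * contaminant_mb / span_mb, 1)


percentages = {
    name: contaminant_percent(**values) for name, values in assemblies.items()
}

for name, pct in percentages.items():
    print(f"{name}: {pct}% of span flagged as contaminant")
```

Under that assumption the computed values agree with the table (1.1% for nHd.2.3 and 27.3% for UNC), which also indicates that the UNC percentage is relative to the 252.54 Mb span derived from the deposited files rather than the reported 212 Mb.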