Population genomics of Mycobacterium tuberculosis in the Inuit

Edited by Carl F. Nathan, Weill Cornell Medical College, New York, NY, and approved September 16, 2015 (received for review April 13, 2015)
October 19, 2015
112 (44) 13609-13614
Science Sessions podcast
Tracking endemic tuberculosis

Significance

Through an in-depth analysis of whole-genome sequencing data from Nunavik, Québec, we inferred the evolution of a single dominant strain of Mycobacterium tuberculosis. Our analyses suggest that M. tuberculosis was first introduced into this region in the early 20th century. Since this time, M. tuberculosis has spread extensively, predominantly within but also between villages. Despite a genomic profile that lacks features of a hypervirulent strain, this strain has thrived in this region and continues to cause outbreaks. This suggests that successful clones of M. tuberculosis need not be inherently exceptional; host or social factors conducive to transmission may contribute to the ongoing tuberculosis epidemic in this and other high-incidence settings.

Abstract

Nunavik, Québec suffers from epidemic tuberculosis (TB), with an incidence 50-fold higher than the Canadian average. Molecular studies in this region have documented limited bacterial genetic diversity among Mycobacterium tuberculosis isolates, consistent with a founder strain and/or ongoing spread. We have used whole-genome sequencing on 163 M. tuberculosis isolates from 11 geographically isolated villages to provide a high-resolution portrait of bacterial genetic diversity in this setting. All isolates were lineage 4 (Euro-American), with two sublineages present (major, n = 153; minor, n = 10). Among major sublineage isolates, there was a median of 46 pairwise single-nucleotide polymorphisms (SNPs), and the most recent common ancestor (MRCA) was in the early 20th century. Pairs of isolates within a village had significantly fewer SNPs than pairs from different villages (median: 6 vs. 47, P < 0.00005), indicating that most transmission occurs within villages. There was an excess of nonsynonymous SNPs after the diversification of M. tuberculosis within Nunavik: The ratio of nonsynonymous to synonymous substitution rates (dN/dS) was 0.534 before the MRCA but 0.777 subsequently (P = 0.010). Nonsynonymous SNPs were detected across all gene categories, arguing against positive selection and toward genetic drift with relaxation of purifying selection. Supporting the latter possibility, 28 genes were partially or completely deleted since the MRCA, including genes previously reported to be essential for M. tuberculosis growth. Our findings indicate that the epidemiologic success of M. tuberculosis in this region is more likely due to an environment conducive to TB transmission than a particularly well-adapted strain.
The tubercule bacillus, Mycobacterium tuberculosis, is a highly successful, medically important human-adapted pathogen. Studies of diverse strain collections reveal a geographic aggregation of the principal M. tuberculosis lineages (1) consistent with a dissemination of this organism around the world with the paleo migration (2). Ancient DNA studies also support the notion that M. tuberculosis has caused disease in humans for thousands of years. Thus, it can be inferred that M. tuberculosis has evolved in step with its human host, successfully responding to changes in the host and its environment that could affect the capacity to cause transmissible disease.
In contrast to the global diversity of M. tuberculosis strains (13), we have previously observed limited genetic diversity in the Nunavik region of Québec (4). One possible explanation is a founder strain, wherein genetic similarity is due to a single recent introduction of a bacterium and may not necessarily represent ongoing spread between communities. In this scenario, isolates might have indistinguishable genotypes by conventional genotyping modalities (restriction fragment length polymorphism, mycobacterial interspersed repetitive units, spoligotyping) but distinct genotypes when assessed using a higher-resolution method, namely whole-genome sequencing (WGS) (5). An additional explanation is that a single clone of M. tuberculosis is currently spreading both within and between villages; however, the great distances between these communities that are not linked by roads make intervillage spread less likely. These possible explanations need not be mutually exclusive.
To evaluate these possibilities, we conducted WGS on M. tuberculosis isolates from Nunavik isolated over 23 y. Estimation of the divergence date of the most recent common ancestor (MRCA) provided evidence that tuberculosis (TB) was introduced into this region in the early 20th century, following which time there has been substantial ongoing transmission, predominantly within villages. This setting provides a unique opportunity to study the genomic characteristics of an epidemiologically successful strain of M. tuberculosis over time.

Results

Whole-Genome Sequencing and Lineage Identification.

There were 149 microbiologically confirmed TB cases diagnosed in Nunavik between 2001 and 2013; we obtained high-quality WGS data for 137/149 (92%). An additional 26 genomes were successfully sequenced from strains previously sampled between 1990 and 2000 (4). In total, WGS was conducted on 163 M. tuberculosis isolates. The average depth of coverage was 44.6× across 99.6% of the H37Rv reference genome.
All 163 genomes from the Nunavik region presented the 7-bp deletion in polyketide synthase (pks) 15/1 that characterizes lineage 4 of M. tuberculosis (the Euro-American lineage) (6). By comparing the Nunavik isolates with three genomes from each of the M. tuberculosis lineages (1–7), we observed that the 163 genomes were tightly clustered in two distinct sublineages: one consisting of 153 isolates (major; Mj) and the other consisting of 10 isolates (minor; Mn) (Fig. 1). Phylogenetic analyses based on single-nucleotide polymorphisms (SNPs) (Figs. 1 and 2) were supported by deletions confirmed by PCR (Fig. S1 and Dataset S1).
Fig. 1.
Maximum likelihood tree of 163 M. tuberculosis isolates from Nunavik and 21 representative genomes of lineages 1–7. Phylogenetic clusters based on 9,016 single-nucleotide polymorphic loci identified across 184 genomes compared with H37Rv (solid black circle). The scale bars represent the number of substitutions per site. Bootstrap values from 1,000 replicates are shown for branches within the Mj and Mn sublineages. For clarity, only values ≥98 are shown.
Fig. 2.
Maximum likelihood tree of 163 M. tuberculosis isolates from Nunavik. Phylogenetic clusters were identified based on 1,288 single-nucleotide polymorphic loci compared with H37Rv. Solid and dashed lines indicate isolates of the Mj and Mn sublineages, respectively. Colored shapes represent the reference genome (bordered black square) and the villages of Nunavik: A (bordered blue triangle), B (full orange square), C (bordered purple circle), D (full green diamond), E (bordered purple diamond), K (full pink triangle), and other (full green circle). *Genome with a unique single-nucleotide polymorphism profile. #Phylogenetic clusters defined previously in ref. 32. Years of diagnosis are indicated. Bootstrap support from 1,000 replicates is shown. Branches supported by less than 80% of bootstrap replicates are collapsed.
Fig. S1.
Most recent common ancestors and deletion events during the evolution of the Mj sublineage of Nunavik. Because similar estimates with overlapping 95% highest posterior density intervals were obtained in all three MRCA analyses, MRCA dates obtained via analysis 1 are presented for simplicity. Arrows indicate the positions and annotations of the H37Rv reference genome. A phylogenetic cluster was defined as at least two genomes sharing a minimum of two single-nucleotide polymorphic loci. Gray and colored circles represent the common ancestors and phylogenetic clusters based on identified SNPs, respectively. Isolate 58385 had a unique single-nucleotide polymorphism profile (i.e., was not clustered) and, because the posterior density for the divergence date of this isolate was <0.8, has not been shown. The dashed line indicates the point estimate for the MRCA of Mj, under analysis 1. Left of the dashed line, prediversification; right of the dashed line, postdiversification.
Excluding SNPs in PE/PGRS and PPE genes, as well as mobile elements, as these may be at higher risk of false positives (5, 7), 1,288 single-nucleotide polymorphic loci were included comparing all genomes together against H37Rv (Dataset S2). The 153 isolates of the Mj sublineage had an average of 674 SNPs compared with H37Rv; the 10 isolates of the Mn sublineage had an average of 451 SNPs. There were 442 SNP loci shared across all Mj isolates, unique to this sublineage, and 214 SNP loci shared by all 10 Mn isolates that were not present in the Mj sublineage. According to the barcode proposed by Coll et al. (8) and the PhyTb tool of the PhyloTrack library (pathogenseq.lshtm.ac.uk/phytblive/index.php), the Mj and Mn sublineages can be classified as M. tuberculosis 4.1.2 and 4.8, respectively.
Phylogenetic analysis and the geographic distribution of isolates further distinguished the Mj and Mn sublineages (Fig. 2). To quantify diversity, we determined the number of pairwise SNPs within each sublineage. Among isolates of the Mj sublineage, the median number of pairwise SNPs was 46 [interquartile range (IQR) 13–49], with a maximum of 72. For isolates of the Mn sublineage, the median number of pairwise SNPs was 1 (IQR 0–2), with a maximum of 22. Nine of the 10 isolates from this sublineage were from the same village.

Transmission Occurs Mostly Within Villages.

To evaluate where most ongoing transmission occurs, we examined pairwise SNPs between isolates of the Mj sublineage, within and between villages, as these comprised over 90% of the cases in this region. The median number of pairwise SNPs was significantly lower for intravillage pairs (6, IQR 3–46) than for intervillage pairs (47, IQR 44–50, Wilcoxon–Mann–Whitney, P < 0.00005). For both intra- and intervillage comparisons, a bimodal distribution was evident (Fig. 3). For the intravillage pairs (n = 3,689), the first mode comprised 61% of all pairwise comparisons and had a median of 3 SNPs (IQR 2–5). For the intervillage pairs (n = 7,939), the first mode comprised only 12% of all pairwise comparisons, and had a median of 9 SNPs (IQR 6–13). For both intra- and intervillage pairs, the second mode had similar distributions (median 47, IQR 45–49 and median 48, IQR 45–50, respectively), consistent with the star-like pattern shown in Figs. 1 and 2.
Fig. 3.
Pairwise SNPs between isolates of the major sublineage of Nunavik. There were a total of 11,628 pairwise comparisons: 3,689 intravillage case pairs and 7,939 intervillage case pairs.
We also considered thresholds for transmission based on published M. tuberculosis substitution rates (0.5 SNPs per genome per y, 95% confidence interval 0.3–0.7) (5, 9). For a study spanning 23 y (1991–2013 inclusive), we expected that epidemiologically linked cases would be separated by no more than 12 SNPs. Applying this threshold, 2,208 of 3,689 (60%) intravillage pairs were separated by 12 or fewer SNPs, compared with 683 of 7,939 (9%) intervillage pairs (two-sample z test for difference in proportions, P < 0.00005). Sensitivity analyses applying substitution rates of 0.3 and 0.7 SNPs per genome per y yielded similar results.

M. tuberculosis Diversified in Nunavik During the 20th Century.

Relative to the global genetic diversity of M. tuberculosis, the total diversity of strains in Nunavik was low, consistent with a recent introduction of TB into this region. To evaluate this hypothesis, we estimated the MRCAs for each sublineage using Bayesian molecular dating (10, 11). Constraining the substitution rate of M. tuberculosis based on previous estimates (5, 9) we inferred the MRCA of the Mj sublineage to be 1919 [95% highest posterior density interval (HPD) 1892–1946], with other divergence dates within the Mj sublineage scattered over the 20th century (Table 1, analysis 1). The Mn sublineage was found to have an MRCA of 1976 (95% HPD 1951–1994). Repeating these analyses without constraining the substitution rate yielded similar results (Table 1, analyses 2 and 3).
Table 1.
Estimated year of divergence of M. tuberculosis sublineages and clusters of Nunavik
Phylogenetic sublineages and clustersAnalysis 1*Analysis 2Analysis 3
Mj–Mn1053 (602–1450)1243 (836–1575)744 (230–1216)
Mj1919 (1892–1946)1922 (1890–1950)1904 (1873–1930)
Mj-I-II1942 (1919–1964)1947 (1921–1967)1925 (1898–1948)
Mj-V1952 (1929–1973)1956 (1932–1978)1935 (1909–1958)
Mj-IV1965 (1949–1978)1966 (1951–1980)1958 (1941–1973)
Mj-III.a.b.c1999 (1993–2004)1999 (1993–2004)2000 (1993–2004)
Mj-VI1999 (1995–2000)1999 (1995–2000)1999 (1995–2000)
Mn1976 (1951–1994)1979 (1953–1997)1969 (1943–1987)
All numbers are expressed in calendar years, rounded to the nearest whole number. Analysis 1: calibration point, concatenated alleles. Analysis 2: no calibration point, concatenated alleles. Analysis 3: no calibration point, weighting for constant sites. The median date of divergence is shown in years, with corresponding 95% highest posterior density intervals.
*
Results of this analysis are reported in the text.
Strain code per Lee et al. (32).

Natural Selection of M. tuberculosis in a New Environment.

The M. tuberculosis population may have experienced a new regime of natural selection upon its introduction into Nunavik. First, M. tuberculosis could have experienced a population bottleneck upon introduction, reducing the efficacy of natural selection and allowing the fixation of deleterious mutations. Second, upon entry into a new environment, M. tuberculosis could have experienced positive selection, retaining fitter variants over time. Third, if the environment was conducive to transmission of M. tuberculosis, there may have been a relaxation of purifying selection across the entire genome. These scenarios are not mutually exclusive, and other scenarios are possible as well.
To measure natural selection at the protein level, we used the ratio of nonsynonymous to synonymous substitution rates (dN/dS), reasoning that this should remain stable over time in the absence of changing regimes of natural selection (12). Specifically, we tested the null hypothesis that dN/dS remained the same pre- and postdiversification of each M. tuberculosis sublineage. We first reconstructed the ancestral sequences of the MRCA for each sublineage, along with that of the common ancestor for these two sublineages (denoted “Mj–Mn”). We then compared the nonsynonymous and synonymous SNPs (nsSNPs and sSNPs, respectively) between these reconstructed ancestors (Mj–Mn versus Mj, Mj–Mn versus Mn) to obtain the dN/dS for each sublineage prediversification. To calculate the dN/dS postdiversification (i.e., subsequent to the MRCAs for each sublineage), we generated a concatenated sequence of codons for both the Mj and Mn sublineages that included all SNP loci and compared each sequence with that of its respective ancestor. In this phylogenetic approach, each independent SNP was counted exactly once (i.e., SNPs present in multiple isolates were not recounted). In total, for the Mj sublineage, we identified 229 nsSNPs and 154 sSNPs before its introduction into Nunavik, compared with 238 nsSNPs and 107 sSNPs that occurred subsequently (Dataset S2). The dN/dS ratio for SNPs prediversification was 0.534, consistent with published estimates for M. tuberculosis (13), whereas the dN/dS postdiversification was 0.777 (Table 2, analysis 1a; G test based on numbers of nsSNPs and sSNPs pre- and postdiversification, P = 0.010). Singleton SNPs, present in only one isolate, are expected to be enriched in nonsynonymous mutations destined to be purged by purifying selection. To evaluate whether the increased dN/dS was attributable to these transient mutations, we restricted our analysis to SNPs present in ≥2 isolates. We still observed a significant increase in the dN/dS, going from 0.534 prediversification to 0.928 postdiversification (Table 2, analysis 1b). There was no significant difference in postdiversification nsSNPs and sSNPs comparing analyses with and without singletons (Fisher’s exact test, P = 0.472). As an alternative method of calculating the dN/dS postdiversification, we conducted a pairwise analysis wherein the median dN/dS was obtained by comparing each of the 153 Mj isolates with its respective ancestral sequence. This yielded similar results, whether singletons were included or excluded (Table 2, analysis 2). Compared with the Mj sublineage, the dN/dS ratios for the Mn sublineage were more stable over time (Table 2).
Table 2.
dN/dS of M. tuberculosis sublineages pre- and postdiversification in Nunavik
 Mj sublineageMn sublineage
AnalysisPrediversificationPostdiversificationP valuePrediversificationPostdiversificationP value
1a: all SNPs0.5340.7770.0100.5470.6150.873
1b: excluding singletons0.5340.9280.0050.5470.7590.767*
2a: all SNPs0.5340.947<0.000050.5470.7590.006
2b: excluding singletons0.5340.953<0.000050.5470.7590.006
Ancestral sequences were reconstructed for the MRCA of the Mj–Mn sublineages, as well as the Mj sublineage and the Mn sublineage. Prediversification: 229 nonsynonymous SNPs and 154 synonymous SNPs identified in the Mj sublineage, and 113 nsSNPs and 75 sSNPs in the Mn sublineage. Analysis 1a: dN/dS prediversification was calculated by comparing ancestral sequences. For postdiversification, concatenated sequences of codons for each sublineage were generated based on all SNP loci identified, with SNPs in more than one isolate only contributing once. Overall, there were 238 nsSNPS and 107 sSNPs in the Mj sublineage, and 13 nsSNPs and 8 sSNPs in the Mn. These concatenated sequences were then compared with their respective ancestral sequences to obtain dN/dS. The raw counts of nonredundant nsSNPs and sSNPs pre- and postdiversification were compared for each sublineage using the G test, with P values shown. Analysis 1b: excluding singleton SNPs. The G test was based on 120 nsSNPs and 46 sSNPs for Mj and 8 nsSNPs and 4 sSNPs for Mn postdiversification. Analysis 2a: dN/dS was calculated for each isolate compared with its respective ancestral sequence (i.e., 153 Mj isolates were compared with the imputed ancestral sequence for Mj). Within each sublineage, the median dN/dS was calculated and is shown above. Analysis 2b: excluding singleton SNPs. The Wilcoxon signed-rank test was used to compare the median dN/dS postdiversification for each sublineage with its respective prediversification estimate.
*
Fisher’s exact test due to cell counts <5.
The efficiency of purifying selection to remove deleterious nonsynonymous mutations is reduced when populations undergo dramatic size fluctuations due, for example, to bottlenecks or exponential growth. To investigate whether the increased dN/dS ratio in the Mj sublineage was due to an expanding bacterial population size over time, we constructed Bayesian skyline plots (Fig. S2). Model comparison using Akaike’s information criterion for Markov chain Monte Carlo samples [AICM (14)] rejected an exponential population growth in favor of a constant population size or Bayesian skyline model (Table S1). Together, these results suggest that the genome-wide increase in dN/dS was not due to a population bottleneck followed by exponential growth, nor to a lack of time for purifying selection to purge deleterious nsSNPs.
Fig. S2.
Bayesian skyline plots of M. tuberculosis in Nunavik. (A) All sublineages together. (B) The major sublineage. (C) The minor sublineage. The estimated effective population sizes through time are shown (black line). The shaded area represents the 95% credibility intervals.
Table S1.
Model comparison by posterior simulation-based analog of Akaike’s information criterion
Clock modelCoalescent priorAICMSEExponential growthConstant population sizeBayesian skyline
RelaxedExponential growth11830577.14±0.099−10.577−19.503
RelaxedConstant population size11830566.57±0.12110.577−8.926
RelaxedBayesian skyline11830557.64±0.1219.5038.926
Lower AICM values (14) indicate better model fit. The differences between AICM are reported. Positive values indicate better relative model fit of the row’s model compared with the column’s model.

Genes Affected by SNPs and/or Deletions.

Unlike genome-wide relaxation, wherein the whole genome is affected, positive selection is thought to target specific genes (15, 16). Across the 153 genomes of the Mj sublineage, we identified 218 and 227 genes with nsSNPs pre- and postdiversification, respectively (Dataset S2). To evaluate whether any particular categories of M. tuberculosis genes were unusually variable postdiversification, we tabulated these SNPs according to gene categories described in the literature (Fig. 4 and Datasets S3 and S4). There was no statistically significant difference between the proportion of genes with nsSNPs in any categories pre- versus postdiversification (two-sample z test for difference in proportions, P > 0.05). However, genes predicted to be conditionally essential for M. tuberculosis survival in vitro, in macrophages, or in vivo were not spared nsSNPs (Dataset S5). Mutations in essential genes often affected a residue that is conserved in the closely related mycobacterial species Mycobacterium canettii (17) and Mycobacterium kansasii (18), with three genes (Rv0338c, echA5, and murC) incurring distinct nsSNPs in different strains (Dataset S5).
Fig. 4.
Proportion of genes with nonsynonymous single-nucleotide polymorphisms (Top) and the number of deleted genes (Bottom) for the major sublineage, pre- and postdiversification. Gene categories are as defined in the publications: M. tuberculosis (MTB) deletions (36), bacillus Calmette–Guérin (BCG) deletions (37), essential genes in vitro (20), in macrophages (38), or in vivo (19), M. tuberculosis-specific genes (39), lateral gene transfer or duplication acquisition (39), human T-cell epitopes (7), genes coding membrane proteins (40), mobile elements (7), and genes coding PPE family proteins (7). Genes designated as PE/PGRS, PPE, or mobile elements were excluded from the SNP analysis (7).
In addition to these potentially deleterious SNPs, all Mj isolates lacked eight regions, resulting in 13 deleted genes. Certain strains also suffered a further seven deletions, disrupting 28 genes (Dataset S1). Certain gene categories appeared overrepresented in postdiversification deletions (e.g., genes acquired through lateral gene transfer, mobile elements), but the low number of deleted genes precluded robust statistical analysis (Fig. 4 and Dataset S1). Four genes predicted to be essential in genomic screens were completely (Rv2335) or partially (Rv1939, Rv2885c, and Rv3135) deleted in some isolates of the Mj sublineage (Dataset S3). Rv2335 (i.e., cysE) codes for a serine acetyltransferase, predicted to be essential for survival in vivo (19), that was absent in eight isolates. Rv2885c codes for a transposase in the IS1539 insertion sequence that is predicted to be essential for survival in vivo (19), whereas Rv3135 codes for PPE50 and is predicted to be essential for survival in vitro (20). Rv1939 codes for an oxydoreductase predicted to be essential for survival in vivo (19) that was deleted in one isolate (18421) (Dataset S1).

Discussion

The Inuit originally came from eastern Siberia, via the Bering Strait, in two waves over several thousands of years (21). Given the recognized close association between M. tuberculosis and human populations, it is theoretically possible that they brought an East Asian lineage of M. tuberculosis with them to the Canadian Arctic. Our data refute this scenario by revealing only lineage 4 (Euro-American) isolates. The low amount of genetic diversity among isolates from different villages indicates that the vast majority of TB cases in this region are the consequence of a single introduction of M. tuberculosis, perhaps from Europe, around the early 20th century. The introduction and diversification of a single dominant clone in Nunavik provide an unobstructed view of M. tuberculosis over time, enabling us to draw certain inferences about the epidemiology and evolution of this highly successful human-adapted pathogen.
The Inuit have had casual interactions with Europeans since the 17th century, most notably with whalers and explorers who sailed along the coasts of Hudson’s Bay and Labrador (22). However, the first permanent settlements of the Hudson’s Bay Company in the region now known as Nunavik date to the late 19th and early 20th centuries, following which there were more sustained interactions between the Inuit and traders (23). Our MRCA estimates support an introduction of TB into this region during this period, which is consistent with some, but not all, historical accounts of when TB was first observed (24). The apparent lack of TB before the early 20th century, despite several centuries of Inuit–European interactions, supports that TB is generally not spread through casual contact, as is the case for measles or chickenpox. This is also consistent with our analysis of the pairwise SNPs between isolates across villages; only a small proportion of intervillage case pairs had low SNP differences, arguing against transmission during casual contact, as can occur at cultural gatherings that bring together members of different villages. Supporting this, villages often had one predominant strain, and individual strains were mostly confined to one village (Fig. 2). This observation presents both an opportunity and a challenge for public health; whereas TB should in theory be amenable to control through scaled-up efforts, it may be that village-by-village, rather than regional, interventions will be needed to interrupt transmission in this setting.
In a number of high-incidence countries, the emergence of an epidemiologically successful strain has been attributed to virulence features encoded in the bacterial genome (25). For instance, the polyketide synthase-derived phenolic glycolipid (PGL) coded by the intact pks15/1 locus of strain HN878 (Beijing genotype) induces hyperlethality in murine disease models (26), potentially explaining the emergence of the Beijing strain in a number of settings worldwide (27). Furthermore, compared with other clinical strains, strains 1471 and HN878 (Beijing genotypes) result in increased macrophage necrosis (28) and more progressive pathology in experimental infections (29). However, although certain strains have a propensity to cause accelerated life-threatening pathology in experimental models, it is not yet clear whether this property predicts epidemiologic success, as a strain that causes chronic, nonprogressive pathology may be the most likely to transmit.
In Nunavik, we observed a set of related strains that meet the epidemiologic criterion of success, without any clear genomic indicators of increased bacterial virulence. Instead, for the Mj sublineage, we observe an enrichment of nsSNPs since its introduction into this region, some of which are expected to affect the function of proteins that contribute to the survival of M. tuberculosis during infection. There are a number of potential causes of an increased dN/dS, including insufficient time for purifying selection to act, positive selection, relaxed purifying selection, and genetic drift. Whereas an increased dN/dS at the tips of a phylogenetic tree may indicate insufficient time for purifying selection (13), the postdiversification inflation of dN/dS holds even with the exclusion of evolutionarily recent singleton SNPs. Therefore, a simple time dependence is unlikely to be the only explanation. Positive selection is unlikely to inflate the dN/dS across the entire genome but rather should target genes with specific functions (15, 16). Although we did not identify any particular functional category of genes enriched in nsSNPs, this does not exclude positive selection on a small number of genes. However, it suggests that positive selection was not the pervasive force leading to a high dN/dS genome-wide. The remaining potential explanations for the dN/dS elevation are a genome-wide relaxation of purifying selection and genetic drift. The nsSNPs and deletions in putatively essential genes provide further support for these two interpretations.
The global M. tuberculosis population has been previously shown to evolve through mostly weak selection and strong drift (30); here we show that the same is true on a local level, to an even greater extent. Given that drift will have stronger effects when effective populations are reduced (31) and that our data suggest that population size remained more or less constant, we hypothesize that relaxation of purifying selection has contributed significantly to the evolution of the Nunavik strain of M. tuberculosis. Further investigation in this and other similar populations is needed. Regardless of the forces that have driven the elevated dN/dS, our findings suggest that M. tuberculosis has not thrived in Nunavik due to a unique virulence profile of the bacteria. It follows that M. tuberculosis control in this region, and in similar settings, will require looking beyond the bacterial culprit to the social conditions that foster TB.

Materials and Methods

Detailed methods can be found in SI Materials and Methods. In brief, the Nunavik region is composed of 14 Inuit communities, with a total population of 12,090 (in 2011). Between 1990 and 2013, there were 200 cases of TB in Nunavik, of which 163 were available for whole-genome sequencing using the MiSeq 250 System (Illumina). Reads were assembled and compared as previously described (32). The final dataset of SNPs excluded those in PE/PGRS and PPE genes, as well as mobile elements, as these may be prone to false positives (5, 7). Deletion events were identified with the Integrative Genomics Viewer (33) and confirmed by PCR and Sanger sequencing. Concatenated sequences of the SNPs were used to generate phylogenetic trees via the maximum likelihood method in Molecular Evolutionary Genetics Analysis [MEGA (34)]. Divergence times for the 163 Nunavik isolates were estimated using Bayesian Markov chain Monte Carlo methods [Bayesian Evolutionary Analysis by Sampling Trees (10, 11)], with H37Rv used as an outgroup.
We used three approaches to derive MRCAs. Using the concatenated sequences of SNPs across the 163 genomes, we first conducted an analysis that incorporated prior knowledge of the substitution rate of M. tuberculosis in the form of a calibration node for the Mj sublineage (analysis 1). We then performed an analysis agnostic to the reported substitution rate (i.e., without calibration), also using concatenated sequences (analysis 2). We then repeated this second analysis but applied a correction for the constant sites across the genomes (analysis 3).
Different coalescent models were tested to explore changes in effective population size over time (35). The AICM (14) was used to select the model providing the best fit. Bayesian skyline plots were generated (Fig. S2).
To calculate the dN/dS ratios, the ancestral sequences for each MRCA (Mj–Mn, Mj, and Mn) were reconstructed manually (Dataset S2). We then calculated the dN/dS pre- and postdiversification for the Mj and Mn sublineages, using both a phylogenetics-based approach (analysis 1) and a pairwise dN/dS analysis (analysis 2) (7). For both analyses, we repeated the dN/dS calculations after excluding SNPs that were present only once across all 163 genomes (singletons).
Ethical approval for this work was obtained from the McGill University Faculty of Medicine Institutional Review Board.

SI Materials and Methods

Study Population.

The Nunavik region is 443,685 km2 in size and is composed of 14 Inuit communities, with a total population of 12,090 (Statistics Canada, 2011) (www12.statcan.gc.ca/nhs-enm/2011/dp-pd/aprof/index.cfm?Lang=E). Each of these communities is separated from the nearest village by a median distance of 137 km (IQR 110–178), without adjoining roads.

Bacteria.

All specimens from TB suspects in Nunavik are sent to the mycobacteriology laboratory of the McGill University Health Centre for processing. Culture-positive specimens are then forwarded to the Laboratoire de Santé Publique du Québec for drug susceptibility testing. Between 2001 and 2013, there were 149 cases of microbiologically confirmed TB in Nunavik. All available isolates were included in this study and were provided by these two laboratories. Between 1990 and 2000, there were 51 cases of TB in Nunavik; 26 isolates were available from a previous study for these years (4).

DNA Extraction.

M. tuberculosis DNA was isolated as previously described in Lee et al. (32).

Whole-Genome Sequencing.

High-throughput sequencing of extracted DNA was performed by the McGill University and Génome Québec Innovation Centre. The amount of genomic DNA (gDNA) was checked by a Quant-iT PicoGreen dsDNA Assay Kit (Life Technologies). gDNA was then fragmented by sonication. A TruSeq DNA Sample Preparation Kit was used to prepare gDNA libraries (Illumina) and quality was estimated by examining the distribution of fragment lengths (conducted by McGill University and Génome Québec Innovation Centre). Fragments were multiplexed by 24 for paired-end 250-bp sequencing using the MiSeq 250 System (Illumina).

Identification, Annotation, and Confirmation of SNPs and Deletions.

Reads were aligned to H37Rv [National Center for Biotechnology Information (NCBI) accession no. NC_000962.3] and compared as previously described (32). Aligned reads were deposited in the NCBI’s Sequence Read Archive under accession no. SRP039605 (BioProject PRJNA240330). Due to their repetitive nature, it is more difficult to accurately map reads to the PE/PGRS and PPE genes, as well as mobile elements, and therefore these regions may be at higher risk of false positives (5, 7). Consequently, SNPs in these regions were excluded from the analyses presented in this manuscript. For a list of nonsynonymous SNPs that were excluded, see Dataset S3. Additional analyses including SNPs in these regions did not alter our key findings.
Deletion events were identified against the H37Rv and CDC1551 reference genomes with the Integrative Genomics Viewer (version 2.3.34) (33) and confirmed by PCR and Sanger sequencing. The SNPs and deletions were annotated against two reference genomes (uid57777 and uid57775 databases for H37Rv and CDC1551, respectively) using SnpEff (version 3.3h) (41) and Artemis (version 15.0.0) (42), respectively.

Phylogenetic Analysis.

Concatenated sequences of the SNPs were used to generate phylogenetic trees via the maximum likelihood method in Molecular Evolutionary Genetics Analysis (MEGA; version 6) (34). Based on the lowest Bayesian information criterion (BIC), the general time-reversible model (43) with uniform site-specific rate variation and Tamura three-parameter (44) model were used to construct phylogenetic trees of genomes from Nunavik and isolates from lineages 1–7. The initial trees for the heuristic search were performed by the neighbor-joining method (45) and maximum composite likelihood approach (34). The branches of the trees presenting the highest log likelihood were condensed at the 80% bootstrap value (46). The M. tuberculosis genomes of lineages 1 (SAMEA1877171, SAMEA1877209, SAMEA1877280), 2 (SAMEA1877068, SAMEA1877086, SAMEA1877286), 3 (SAMEA1877096, SAMEA1877124, SAMEA1877277), 4 (SAMEA1877192, SAMEA1877238, SAMEA1877276), 5 (SAMEA1877073, SAMEA1877165, SAMEA1877206), 6 (SAMEA1877101, SAMEA1877190, SAMEA1877233), and 7 (SAMEA1877077, SAMEA1877197, SAMEA1877216) were obtained from European Nucleotide Archive ERP001731 (47).

Molecular Dating.

Divergence times for the 163 Nunavik isolates were estimated using Bayesian Markov chain Monte Carlo methods [Bayesian Evolutionary Analysis by Sampling Trees (BEAST), versions 1.8 and 2.1.3] (10, 11). Because the Tamura three-parameter model of nucleotide substitution was not available in BEAST, we used the general time-reversible (GTR) model (43), which had the next-lowest BIC (Δ 9.03).
Divergence dates were estimated using all 163 genomes dated with the year of isolation. H37Rv was used as an outgroup (with a date of isolation of 1905) (48). To assess the robustness of our findings, we compared three different approaches to deriving the MRCAs. Using the concatenated sequences of SNPs across the 163 genomes, we first conducted an analysis that incorporated prior knowledge of the substitution rate of M. tuberculosis in the form of a calibration node for the Mj sublineage (analysis 1). We identified the SNP difference between the two most-divergent isolates from the Mj sublineage as 72 SNPs. Assuming equal divergence, we then applied the previously reported substitution rate of 0.5 SNPs per genome per y to obtain a mean node age, whereas the SD was calculated using the extremes of frequently reported confidence intervals (0.3 SNPs per genome per y and 0.7 SNPs per genome per y, respectively) (5, 9). The resultant mean node age and its SD were then used as a prior for the MRCA of the Mj sublineage. We then compared these results with an analysis agnostic to the reported substitution rate (i.e., without calibration), also using concatenated sequences (analysis 2). We then repeated this second analysis but applied a correction for the constant sites across the genomes (analysis 3).
The null hypothesis of a single molecular clock across all branches was tested before analysis using the likelihood ratio test in MEGA and was rejected with P < 0.05. Therefore, all models used an uncorrelated relaxed lognormal clock set at 1.3 × 10−7 substitutions per site in the genome per y. For analyses 1 and 2, a coalescent constant population tree prior was used. Analysis 3 used a coalescent Bayesian skyline tree prior, as described below. All models were run using a Markov chain Monte Carlo (MCMC) chain length of 200,000,000, with 10% burn-in and sampling every 10,000 generations. Convergence was assessed in Tracer (version 1.6) (49) with evidence of adequate mixing, and all parameters had an effective sample size >190. Maximum clade credibility trees were generated using TreeAnnotator (10), with 10% burn-in. Summaries of the posterior densities were generated for nodes with a probability of at least 0.8. The 95% highest posterior densities were used to reflect uncertainty in our estimates.

Coalescent-Based Analyses.

Different coalescent models were tested, including constant population size, exponential growth (assuming a constant growth rate through time), and the Bayesian skyline plot demographic model [a general, nonparametric prior that enforces no particular demographic history (35)]. The posterior simulation-based analog of Akaike’s information criterion (AICM) (14) implemented in Tracer 1.6 has been used to select the model providing the best fit to our data. The estimations of the AICM from 1,000 bootstrap replicates support that, among the different models, the Bayesian skyline model provides the better fit overall (marginally lower value of AIC), followed by the constant population size model (Table S1). Substitution rates were estimated for each sublineage in BEAST using a GTR model of nucleotide substitution and a coalescent Bayesian skyline tree prior. To formally assess changes in population demographics over time, Bayesian skyline plots were generated in Tracer for the 163 genomes overall, as well as the Mj and Mn sublineages individually (Fig. S2).

Calculation of dN/dS Ratios.

The ancestral sequences for each MRCA (Mj–Mn, Mj, and Mn) were reconstructed manually (Dataset S2). We calculated the dN/dS between Mj–Mn and Mj by comparing these reconstructed codon sequences using the Nei-Gojobori (Jukes-Cantor) method (50) in MEGA (version 6), as has been previously done for M. tuberculosis (7, 29, 31). Similarly, to obtain a dN/dS for Mj–Mn versus Mn, we compared these two reconstructed sequences.
For dN/dS calculations postdiversification, two methods were applied. First, we calculated the dN/dS for each sublineage compared with its ancestral sequence using a phylogenetics-based approach (analysis 1). In this method, we generated a concatenated sequence of codons for each sublineage based on all SNP loci. SNPs that occurred in more than one isolate only contributed once to this sequence; no SNPs were double-counted. These concatenated sequences (one for each of the Mj and Mn sublineages) were then compared with their respective imputed ancestral sequences to obtain the dN/dS postdiversification. Second, we conducted a pairwise dN/dS analysis (analysis 2) (7). In this analysis, we calculated dN/dS for each isolate of the Mj sublineage compared with the reconstructed ancestral sequence of Mj. We then determined the median dN/dS across the 153 pairwise comparisons. For both analyses 1 and 2, we repeated the dN/dS calculations after excluding SNPs that were present only once across all 163 genomes (singletons).

Comparisons to Gene Categories from the Literature.

nsSNPs were mapped to gene categories described in the literature: M. tuberculosis-specific genes (39), in vitro (20) or in vivo (19) essential genes, genes coding membrane proteins (40), lateral gene transfer or duplication acquisition (39), regions of differentiation (36), macrophage survival (38), human T-cell epitopes (7), and deleted genes in bacillus Calmette–Guérin (37). In addition to these gene categories, deletions were also identified across genes coding PE/PPE family proteins (7) and mobile elements (7). These comparisons were performed using the most recent version of the H37Rv reference genome (NCBI accession no. NC_000962.3) including gene nomenclatures of older versions of H37Rv used by these authors at the time of their studies.

Statistical Analyses.

The distributions of pairwise SNPs within and between villages were compared using the Wilcoxon–Mann–Whitney test. The two-sample z test for difference in proportions was used to compare proportions of pairwise comparisons less than the prespecified threshold as a proxy of transmission during the study period. This test was also used to compare the proportions of genes with nsSNPs in each gene category pre- and postdiversification. For the dN/dS analyses, the raw numbers of nsSNPs and sSNPs pre- and postdiversification were compared using the G test (12), whereas the median pairwise dN/dS values postdiversification for each sublineage were compared with their respective prediversification value using the Wilcoxon signed-rank test. All tests were two-tailed, with a P value of <0.05 considered statistically significant. Analyses were conducted in Stata (version 13; StataCorp 2013) and R (version 3.1.2; cran.r-project.org).

Ethics.

Ethical approval for this work was obtained from the McGill University Faculty of Medicine Institutional Review Board. Individual patient consent was not required.

Data Availability

Data deposition: The aligned reads reported in this paper have been deposited in the National Center for Biotechnology Information’s Sequence Read Archive (accession no. SRP039605, BioProject PRJNA240330).

Acknowledgments

The authors thank the Nunavik Regional Board of Health and Social Services for their collaboration on this study and Drs. Erwin Schurr, PhD and Michael Reed, PhD of the Research Institute of McGill University Health Centre for their input into the genetic analysis. This work was supported by the Canadian Institutes of Health Research (MOP 125858 to M.A.B. and D.M.) and Fonds de Recherche Santé Québec (29836 and 26274 to N.R.). B.J.S. was supported by the Canada Research Chairs Program (CRC 2289986).

Supporting Information

Supporting Information (PDF)
Supporting Information
pnas.1507071112.sd01.xlsx
pnas.1507071112.sd02.xlsx
pnas.1507071112.sd03.xlsx
pnas.1507071112.sd04.xlsx
pnas.1507071112.sd05.xlsx

References

1
S Gagneux, PM Small, Global phylogeography of Mycobacterium tuberculosis and implications for tuberculosis product development. Lancet Infect Dis 7, 328–337 (2007).
2
T Wirth, et al., Origin, spread and demography of the Mycobacterium tuberculosis complex. PLoS Pathog 4, e1000160 (2008).
3
S Gagneux, et al., Variable host-pathogen compatibility in Mycobacterium tuberculosis. Proc Natl Acad Sci USA 103, 2869–2873 (2006).
4
D Nguyen, et al., Tuberculosis in the Inuit community of Quebec, Canada. Am J Respir Crit Care Med 168, 1353–1357 (2003).
5
A Roetzer, et al., Whole genome sequencing versus traditional genotyping for investigation of a Mycobacterium tuberculosis outbreak: A longitudinal molecular epidemiological study. PLoS Med 10, e1001387 (2013).
6
M Marmiesse, et al., Macro-array and bioinformatic analyses reveal mycobacterial ‘core’ genes, variation in the ESAT-6 gene family and new phylogenetic markers for the Mycobacterium tuberculosis complex. Microbiology 150, 483–496 (2004).
7
I Comas, et al., Human T cell epitopes of Mycobacterium tuberculosis are evolutionarily hyperconserved. Nat Genet 42, 498–503 (2010).
8
F Coll, et al., A robust SNP barcode for typing Mycobacterium tuberculosis complex strains. Nat Commun 5, 4812 (2014).
9
TM Walker, et al., Whole-genome sequencing to delineate Mycobacterium tuberculosis outbreaks: A retrospective observational study. Lancet Infect Dis 13, 137–146 (2013).
10
R Bouckaert, et al., BEAST 2: A software platform for Bayesian evolutionary analysis. PLOS Comput Biol 10, e1003537 (2014).
11
AJ Drummond, MA Suchard, D Xie, A Rambaut, Bayesian phylogenetics with BEAUti and the BEAST 1.7. Mol Biol Evol 29, 1969–1973 (2012).
12
JH McDonald, M Kreitman, Adaptive protein evolution at the Adh locus in Drosophila. Nature 351, 652–654 (1991).
13
EPC Rocha, et al., Comparisons of dN/dS are time dependent for closely related bacterial genomes. J Theor Biol 239, 226–235 (2006).
14
G Baele, et al., Improving the accuracy of demographic and molecular clock model comparison while accommodating phylogenetic uncertainty. Mol Biol Evol 29, 2157–2167 (2012).
15
PS Novichkov, YI Wolf, I Dubchak, EV Koonin, Trends in prokaryotic evolution revealed by comparison of closely related bacterial and archaeal genomes. J Bacteriol 191, 65–73 (2009).
16
BJ Shapiro, EJ Alm, Comparing patterns of natural selection across species using selective signatures. PLoS Genet 4, e23 (2008).
17
P Supply, et al., Genomic analysis of smooth tubercle bacilli provides insights into ancestry and pathoadaptation of Mycobacterium tuberculosis. Nat Genet 45, 172–179 (2013).
18
J Wang, et al., Insights on the emergence of Mycobacterium tuberculosis from the analysis of Mycobacterium kansasii. Genome Biol Evol 7, 856–870 (2015).
19
CM Sassetti, EJ Rubin, Genetic requirements for mycobacterial survival during infection. Proc Natl Acad Sci USA 100, 12989–12994 (2003).
20
CM Sassetti, DH Boyd, EJ Rubin, Genes required for mycobacterial growth defined by high density mutagenesis. Mol Microbiol 48, 77–84 (2003).
21
M Raghavan, et al., The genetic prehistory of the New World Arctic. Science 345, 1255832 (2014).
22
J Higdon, Commercial and subsistence harvests of bowhead whales (Balaena mysticetus) in eastern Canada and west Greenland. J Cetacean Res Manag 11, 185 (2010).
23
S Bonesteel Canada’s Relationship with the Inuit , ed E Anderson (published under the authority of the Minister of Indian Affairs and Northern Development and Federal Interlocutor for Métis and Non-Status Indians, Ottawa, Canada, 2006).
24
PS Grygier A Long Way from Home: The Tuberculosis Epidemic Among the Inuit (McGill-Queen’s Univ Press, Montreal, 1994).
25
H Alonso, et al., Deciphering the role of IS6110 in a highly transmissible Mycobacterium tuberculosis Beijing strain, GC1237. Tuberculosis (Edinb) 91, 117–126 (2011).
26
MB Reed, et al., A glycolipid of hypervirulent tuberculosis strains that inhibits the innate immune response. Nature 431, 84–87 (2004).
27
I Parwati, R van Crevel, D van Soolingen, Possible underlying mechanisms for successful emergence of the Mycobacterium tuberculosis Beijing genotype strains. Lancet Infect Dis 10, 103–111 (2010).
28
EP Amaral, et al., Pulmonary infection with hypervirulent Mycobacteria reveals a crucial role for the P2X7 receptor in aggressive forms of tuberculosis. PLoS Pathog 10, e1004188 (2014).
29
D Ordway, et al., The hypervirulent Mycobacterium tuberculosis strain HN878 induces a potent TH1 response followed by rapid down-regulation. J Immunol 179, 522–531 (2007).
30
R Hershberg, et al., High functional diversity in Mycobacterium tuberculosis driven by genetic drift and human demography. PLoS Biol 6, e311 (2008).
31
CH Kuo, NA Moran, H Ochman, The consequences of genetic drift for bacterial genome complexity. Genome Res 19, 1450–1454 (2009).
32
RS Lee, et al., Reemergence and amplification of tuberculosis in the Canadian Arctic. J Infect Dis 211, 1905–1914 (2015).
33
H Thorvaldsdóttir, JT Robinson, JP Mesirov, Integrative Genomics Viewer (IGV): High-performance genomics data visualization and exploration. Brief Bioinform 14, 178–192 (2013).
34
K Tamura, G Stecher, D Peterson, A Filipski, S Kumar, MEGA6: Molecular Evolutionary Genetics Analysis version 6.0. Mol Biol Evol 30, 2725–2729 (2013).
35
AJ Drummond, SYW Ho, MJ Phillips, A Rambaut, Relaxed phylogenetics and dating with confidence. PLoS Biol 4, e88 (2006).
36
AG Tsolaki, et al., Functional and evolutionary genomics of Mycobacterium tuberculosis: Insights from genomic deletions in 100 strains. Proc Natl Acad Sci USA 101, 4865–4870 (2004).
37
S Mostowy, AG Tsolaki, PM Small, MA Behr, The in vitro evolution of BCG vaccines. Vaccine 21, 4270–4274 (2003).
38
J Rengarajan, BR Bloom, EJ Rubin, Genome-wide requirements for Mycobacterium tuberculosis adaptation and survival in macrophages. Proc Natl Acad Sci USA 102, 8327–8332 (2005).
39
TP Stinear, et al., Insights from the complete genome sequence of Mycobacterium marinum on the evolution of Mycobacterium tuberculosis. Genome Res 18, 729–741 (2008).
40
NS Osório, et al., Evidence for diversifying selection in a set of Mycobacterium tuberculosis genes in response to antibiotic- and nonantibiotic-related pressure. Mol Biol Evol 30, 1326–1336 (2013).
41
P Cingolani, et al., A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly 6, 80–92 (2012).
42
K Rutherford, et al., Artemis: Sequence visualization and annotation. Bioinformatics 16, 944–945 (2000).
43
PJ Waddell, MA Steel, General time-reversible distances with unequal rates across sites: Mixing gamma and inverse Gaussian distributions with invariant sites. Mol Phylogenet Evol 8, 398–414 (1997).
44
K Tamura, Estimation of the number of nucleotide substitutions when there are strong transition-transversion and G+C-content biases. Mol Biol Evol 9, 678–687 (1992).
45
N Saitou, M Nei, The Neighbor-joining method: A new method for reconstructing phylogenetic trees. Mol Biol Evol 4, 406–425 (1987).
46
J Felsenstein, Confidence limits on phylogenies: An approach using the bootstrap. Evolution 39, 783–791 (1985).
47
I Comas, et al., Out-of-Africa migration and Neolithic coexpansion of Mycobacterium tuberculosis with modern humans. Nat Genet 45, 1176–1182 (2013).
48
W Steenken, WH Oatway, SA Petroff, Biological studies of the tubercle bacillus: III. Dissociation and pathogenicity of the R and S variants of the human tubercle bacillus (H(37)). J Exp Med 60, 515–540 (1934).
49
A Rambaut, M Suchard, D Xie, AJ Drummond, Tracer v1.6. Available at beast.bio.ed.ac.uk/Tracer. (2014).
50
M Nei, T Gojobori, Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Mol Biol Evol 3, 418–426 (1986).

Information & Authors

Information

Published in

Go to Proceedings of the National Academy of Sciences
Proceedings of the National Academy of Sciences
Vol. 112 | No. 44
November 3, 2015
PubMed: 26483462

Classifications

Data Availability

Data deposition: The aligned reads reported in this paper have been deposited in the National Center for Biotechnology Information’s Sequence Read Archive (accession no. SRP039605, BioProject PRJNA240330).

Submission history

Published online: October 19, 2015
Published in issue: November 3, 2015

Keywords

  1. Mycobacterium tuberculosis
  2. evolution
  3. whole-genome sequencing

Acknowledgments

The authors thank the Nunavik Regional Board of Health and Social Services for their collaboration on this study and Drs. Erwin Schurr, PhD and Michael Reed, PhD of the Research Institute of McGill University Health Centre for their input into the genetic analysis. This work was supported by the Canadian Institutes of Health Research (MOP 125858 to M.A.B. and D.M.) and Fonds de Recherche Santé Québec (29836 and 26274 to N.R.). B.J.S. was supported by the Canada Research Chairs Program (CRC 2289986).

Notes

This article is a PNAS Direct Submission.

Authors

Affiliations

Robyn S. Lee1
Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montreal, QC, Canada, H3A 1A2;
The Research Institute of the McGill University Health Centre, Montreal, QC, Canada, H4A 3J1;
McGill International TB Centre, McGill University Health Centre, Montreal, QC, Canada, H4A 3J1;
Nicolas Radomski1
The Research Institute of the McGill University Health Centre, Montreal, QC, Canada, H4A 3J1;
McGill International TB Centre, McGill University Health Centre, Montreal, QC, Canada, H4A 3J1;
Jean-Francois Proulx
Nunavik Regional Board of Health and Social Services, Kuujjuaq, QC, Canada, J0M 1C0;
Ines Levade
Département de Sciences Biologiques, Université de Montréal, Montreal, QC, Canada, H2V 2S9;
B. Jesse Shapiro
Département de Sciences Biologiques, Université de Montréal, Montreal, QC, Canada, H2V 2S9;
Fiona McIntosh
The Research Institute of the McGill University Health Centre, Montreal, QC, Canada, H4A 3J1;
McGill International TB Centre, McGill University Health Centre, Montreal, QC, Canada, H4A 3J1;
Hafid Soualhine
Laboratoire de Santé Publique du Québec, Sainte-Anne-de-Bellevue, QC, Canada, H9X 3R5;
Dick Menzies
The Research Institute of the McGill University Health Centre, Montreal, QC, Canada, H4A 3J1;
McGill International TB Centre, McGill University Health Centre, Montreal, QC, Canada, H4A 3J1;
Respiratory Epidemiology and Clinical Research Unit, Montreal Chest Institute, Montreal, QC, Canada, H4A 3J1
Marcel A. Behr2 [email protected]
The Research Institute of the McGill University Health Centre, Montreal, QC, Canada, H4A 3J1;
McGill International TB Centre, McGill University Health Centre, Montreal, QC, Canada, H4A 3J1;

Notes

2
To whom correspondence should be addressed. Email: [email protected].
Author contributions: J.-F.P., B.J.S., D.M., and M.A.B. designed research; R.S.L., N.R., F.M., and H.S. performed research; J.-F.P., I.L., B.J.S., F.M., and H.S. contributed new reagents/analytic tools; R.S.L., N.R., I.L., B.J.S., and M.A.B. analyzed data; and R.S.L., N.R., D.M., and M.A.B. wrote the paper.
1
R.S.L. and N.R. contributed equally to this work.

Competing Interests

The authors declare no conflict of interest.

Metrics & Citations

Metrics

Note: The article usage is presented with a three- to four-day delay and will update daily once available. Due to ths delay, usage data will not appear immediately following publication. Citation information is sourced from Crossref Cited-by service.


Citation statements




Altmetrics

Citations

If you have the appropriate software installed, you can download article citation data to the citation manager of your choice. Simply select your manager software from the list below and click Download.

Cited by

    Loading...

    View Options

    View options

    PDF format

    Download this article as a PDF file

    DOWNLOAD PDF

    Get Access

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Personal login Institutional Login

    Recommend to a librarian

    Recommend PNAS to a Librarian

    Purchase options

    Purchase this article to access the full text.

    Single Article Purchase

    Population genomics of Mycobacterium tuberculosis in the Inuit
    Proceedings of the National Academy of Sciences
    • Vol. 112
    • No. 44
    • pp. 13417-E6081

    Media

    Figures

    Tables

    Other

    Share

    Share

    Share article link

    Share on social media