AGRICULTURAL SCIENCES
Estimating genome conservation between crop and model legume species









Department of Plant Pathology and
College of Agricultural and Environmental Sciences Genomics Facility, University of California, One Shields Avenue, Davis, CA 95616; ¶Department of Plant Pathology, University of Minnesota, St. Paul, MN 55108; ||Advanced Center for Genome Technology, University of Oklahoma, Norman, OK 73019; 
John Innes Centre, Norwich NR4 7UH, United Kingdom; 
Department of Plant Biology, Cornell University, Ithaca, NY 14853; and 
Biological Research Center, Institute of Genetics, H-6701 Szeged, Hungary
Edited by Susan R. Wessler, University of Georgia, Athens, GA and approved August 13, 2004 (received for review March 30, 2004)
| Abstract |
|---|
≈12% of Earth's arable land and accounting for ≈27% of the world's primary crop production (1). Their unusual capacity for symbiotic nitrogen fixation underlies their importance as a source of protein in the human diet and of nitrogen in both natural and agricultural ecosystems. Legumes are also increasingly recognized as a source of valuable secondary metabolites. These factors have fueled a significant increase in legume research over the past decade.
The ≈20,000 legume species are divided into three subfamilies: Mimosoideae, Caesalpinioideae, and the numerically and economically dominant Papilionoideae (2). With the notable exception of peanut, the important crop legumes occur in two Papilionoid clades, referred to here as the "phaseoloid" and "galegoid" legumes (Table 1). Despite their close phylogenetic affiliations (Fig. 1), the genetic systems represented within this group are diverse, ranging from simple autogamous diploids to complex out-crossing polyploids. Genome size also varies widely among legumes, with pea having a genome size 10 times that of some related diploid genera.
A pressing need in legume genomics is to integrate knowledge gained from the study of model legume genomes with the biological and agronomic questions of importance in the crop species. Comparative genetic mapping is well established in several plant families, most notably the Poaceae (4), where initial studies predicted that synteny would greatly facilitate gene discovery among related species (5, 6). However, even closely related grass species (7, 8), in some cases members of the same species (9), can exhibit significant divergence in genome organization. It is important to know whether similar features are prevalent in other plant families, in particular because the extent of such differences may define the limits of comparative structural genomics as a strategy for applied agriculture.
Here we combined genetic and phylogenetic analyses to map putatively orthologous genes across seven legume species. Complementing the genetic linkage analysis, we surveyed the conservation of genome microstructure between M. truncatula and L. japonicus and M. truncatula and Glycine max (soybean) by comparing fully sequenced bacterial artificial chromosome (BAC) clones. The combined genetic, phylogenetic, and genomic analyses demonstrate extensive conservation of gene order and orthology between the crop and model legumes and also identify features of structural divergence between these genomes.
| Materials and Methods |
|---|
Development of Cross-Species Genetic Markers. The development and genetic mapping of gene-specific markers were as described by Choi et al. (10). BLASTN [National Center for Biotechnology Information (NCBI), Bethesda] was used to identify conserved sequences between ESTs of the legume M. truncatula and other legume species. Multiple sequence alignments, with the Arabidopsis genomic sequence used to infer intron position, facilitated design of PCR primers that anneal to conserved exon sequences and amplify across more diverged introns. Polymorphisms (Table 2, which is published as supporting information on the PNAS web site) were identified by sequencing PCR products from parental lines (Table 1), followed by manual inspection of alignments and chromatograms. Markers were typically analyzed as cleavable amplified polymorphic sequences (10). Single-nucleotide polymorphisms that did not alter a restriction site were scored by DNA sequencing of PCR products. Resequencing was used to confirm or refute apparently ambiguous data.
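The distinction drawn above between polymorphisms scored as cleaved markers and those scored by sequencing can be illustrated with a minimal sketch. It assumes two parental amplicons of equal length that differ by a single-nucleotide polymorphism, and it uses the EcoRI recognition sequence GAATTC as the example enzyme. The sequences and the function names `site_positions` and `caps_scorable` are hypothetical and are not taken from the study.

```python
def site_positions(seq: str, site: str) -> list[int]:
    """Return 0-based start positions of every occurrence of `site` in `seq`."""
    seq, site = seq.upper(), site.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def caps_scorable(allele_a: str, allele_b: str, site: str) -> bool:
    """True if the recognition site occurs at different positions in the two alleles.

    Assumes the alleles have equal length (SNPs only, no indels), so that
    positions are directly comparable.
    """
    return site_positions(allele_a, site) != site_positions(allele_b, site)


ECORI = "GAATTC"  # EcoRI recognition sequence (palindromic)

# Hypothetical parental amplicons; illustrative sequences only.
parent_a = "ATGCCGAATTCGGTTAAC"
parent_b = "ATGCCGAGTTCGGTTAAC"  # SNP inside the EcoRI site
parent_c = "ATGCCGAATTCGGTTGAC"  # SNP outside the EcoRI site

print(caps_scorable(parent_a, parent_b, ECORI))  # site altered: restriction assay possible
print(caps_scorable(parent_a, parent_c, ECORI))  # site intact: score by sequencing instead
```

In practice one would scan a panel of enzymes rather than a single site; the single-enzyme version keeps the logic visible.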
Phylogenetic Analysis. Neighbor-joining trees were rooted by using the closest Arabidopsis sequence as an outgroup or left unrooted where no close homolog was present in Arabidopsis. Phylogenetic analyses were conducted by using parsimony options in PAUP* (17). The principal analysis involved 100 searches with random taxon addition and tree bisection-reconnection (TBR) branch swapping, with maxtrees set to increase without limit. Support for branches was assayed by 100 bootstrap replicate searches using simple taxon addition, TBR branch swapping, and maxtrees set to 1,000.
Microsynteny Analysis. Accession numbers for sequenced BACs are given in Tables 2–5, which are published as supporting information on the PNAS web site. Homologous transformation-competent BAC (TAC) clones of L. japonicus were obtained from NCBI (18). Ab initio gene prediction involved the eudicot version of FGENESH (www.softberry.com/berry.phtml?topic=gfind). Gene prediction based on identity to transcribed sequences was obtained by BLASTN against The Institute for Genomic Research M. truncatula or L. japonicus Gene Index databases (www.tigr.org/tdb/tgi). Additional predicted proteins were identified by means of BLASTX (NCBI) against the NCBI nonredundant protein database. BLASTP (NCBI) was used to compare predicted proteins between M. truncatula and L. japonicus clone pairs, with a maximum E value cutoff of e-10 and a median E value of <e-100 for 533 protein pairs. REPEATMASKER (http://repeatmasker.genome.washington.edu/cgi-bin/Repeat-Masker) was used to screen interspersed repeats, transfer RNA, and low-complexity DNA sequences.
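The E value filtering described above can be sketched as follows. This is an illustration only, not the authors' pipeline: it assumes BLASTP results in the standard 12-column tabular layout (query, subject, percent identity, alignment length, mismatches, gap opens, query start, query end, subject start, subject end, E value, bit score), keeps the best hit per query at or below the e-10 cutoff, and reports the median E value of the retained pairs. The protein identifiers and the function name `best_hits` are hypothetical.

```python
import statistics


def best_hits(lines, max_evalue=1e-10):
    """Map each query to its lowest-E-value hit at or below `max_evalue`.

    `lines` are tab-delimited BLAST tabular records; column 11 (index 10)
    is assumed to hold the E value.
    """
    best = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.rstrip("\n").split("\t")
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        if evalue > max_evalue:
            continue
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return best


def record(query, subject, evalue):
    """Build one hypothetical tabular record; only the E value matters here."""
    return "\t".join([query, subject, "90.0", "200", "20", "0",
                      "1", "200", "1", "200", evalue, "300"])


# Hypothetical hits between predicted proteins of a clone pair.
hits = [
    record("Mt_p1", "Lj_p1", "1e-120"),
    record("Mt_p1", "Lj_p9", "1e-5"),    # weaker second hit, above cutoff
    record("Mt_p2", "Lj_p2", "2e-50"),
    record("Mt_p3", "Lj_p3", "0.001"),   # above cutoff, discarded
    record("Mt_p4", "Lj_p4", "1e-101"),
]

pairs = best_hits(hits)
median_e = statistics.median(e for _, e in pairs.values())
print(sorted(pairs), median_e)
```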
| Results |
|---|
Macrosynteny Analysis Among M. truncatula, M. sativa, Pi. sativum, G. max, V. radiata, and Ph. vulgaris. For purposes of establishing a comparative genetic map spanning galegoid and phaseoloid species, we analyzed marker segregation in M. truncatula, M. sativa, Pi. sativum, and V. radiata. In addition to the markers developed based on phylogenetic criteria, we analyzed 60 primer pairs developed based on homology to genetic markers in G. max and 117 additional markers developed for the M. truncatula core genetic map (10). In all cases, M. truncatula was the central point of comparison. Comparisons between the two Medicago species and between Ph. vulgaris and V. radiata have been presented elsewhere (11, 12) and are included here for the sake of integration.
The pea genome is much larger (≈10 times) than that of M. truncatula and has a base chromosome number of 7, compared to 8 in M. truncatula. Despite these overt differences, analysis of 57 gene-specific markers reveals broad conservation of genome structure, with the major evident differences being sites of inferred chromosomal rearrangements. An average of eight colinear genetic markers were present for each pea chromosome, and in only one case (i.e., marker PTSB) did we identify the possible translocation of an orthologous gene. Instead, we identified a limited number of chromosomal translocation events such as that shown for the top terminal region of PsLGIII and the bottom portion of MtLG2 (Fig. 2). MtLG6 could not be effectively integrated into the Pi. sativum genetic map, due to a lack of comparative markers in this linkage group. This result corresponds with the previous observation that MtLG6 is rich in heterochromatic DNA (20) and relatively poor in transcribed genes (10). The absence of a corresponding single linkage group in pea suggests that chromosomal rearrangements involving M. truncatula chromosome 6 might be responsible for the difference in chromosome number between these two species.
Soybean is also a member of the phaseoloid clade but has a polyploid genome that is predicted to have undergone duplication since its divergence from other Phaseoleae. Sixty loci were mapped in common between M. truncatula and soybean, with the majority of markers derived from homologs of soybean restriction fragment length polymorphism probes identified in M. truncatula (10). A complicating feature of cultivated soybean is a low level of polymorphism, as evidenced in Table 1, which when taken together with the high level of soybean gene duplication significantly reduced our ability to interpret the comparative map between these two species. Nevertheless, 38% of the markers revealed putative synteny between M. truncatula and soybean, identifying 11 colinear blocks between the two genomes (Figs. 2 and 8). Yan et al. (21, 22) report genome-wide conserved microsynteny between the genomes of M. truncatula and soybean, with 54% of 50 soybean contig groups showing conserved microsynteny to M. truncatula. Five of the extensively microsyntenic contigs (21) were mapped in M. truncatula in this study, three of which, namely A095, A064, and A315, were mapped in regions showing putative synteny between M. truncatula and soybean.
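One way to make the notion of a colinear block concrete is sketched below. The definition used here is an assumption for illustration and is not necessarily the criterion applied in the study: a block is a run of at least two markers that are adjacent on one linkage group in species A and fall on a single linkage group in species B with consistently increasing or consistently decreasing map positions. The marker names, linkage groups, and positions are hypothetical.

```python
from typing import NamedTuple


class Marker(NamedTuple):
    name: str
    lg_a: str     # linkage group in species A
    pos_a: float  # map position (cM) in species A
    lg_b: str     # linkage group in species B
    pos_b: float  # map position (cM) in species B


def colinear_blocks(markers, min_size=2):
    """Group markers into runs that keep linkage group and order in both maps."""
    ordered = sorted(markers, key=lambda m: (m.lg_a, m.pos_a))
    blocks, current, direction = [], [], 0
    for m in ordered:
        if current:
            prev = current[-1]
            # +1 if species-B position increases, -1 if it decreases, 0 if tied
            step = (m.pos_b > prev.pos_b) - (m.pos_b < prev.pos_b)
            same_groups = m.lg_a == prev.lg_a and m.lg_b == prev.lg_b
            if same_groups and step != 0 and direction in (0, step):
                current.append(m)
                direction = step
                continue
            if len(current) >= min_size:
                blocks.append([x.name for x in current])
        current, direction = [m], 0
    if len(current) >= min_size:
        blocks.append([x.name for x in current])
    return blocks


# Hypothetical comparative map data.
markers = [
    Marker("m1", "LG1", 0.0, "X", 10.0),
    Marker("m2", "LG1", 5.0, "X", 20.0),
    Marker("m3", "LG1", 9.0, "X", 35.0),
    Marker("m4", "LG1", 15.0, "Y", 3.0),   # maps to a different group in B
    Marker("m5", "LG2", 2.0, "Z", 50.0),
    Marker("m6", "LG2", 8.0, "Z", 40.0),   # inverted order relative to A
]

blocks = colinear_blocks(markers)
in_blocks = sum(len(b) for b in blocks)
print(blocks, f"{in_blocks}/{len(markers)} markers in colinear blocks")
```

Under this strict definition a single translocated marker splits a block in two; a more tolerant rule would allow a limited number of intervening nonsyntenic markers.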
Microsynteny Among Papilionoid Legume Genomes. To determine the extent to which macrosyntenic relationships identified by genetic mapping are indicative of conserved genome microstructure, we compared putatively orthologous large insert clone pairs [i.e., BAC or transformation-competent BAC (TAC) clones] between M. truncatula and L. japonicus and between M. truncatula and soybean. The Loteae are a sister group to the galegoid legumes (Fig. 1), and, thus, L. japonicus shares a more recent common ancestor with M. truncatula than with soybean. Sixty-three sequenced BAC and TAC clone pairs containing an average of nine microsyntenic gene pairs were mapped between the M. truncatula and L. japonicus genomes. As shown in Fig. 3, the genomes are highly syntenic, with macrosynteny punctuated by rearrangements that frequently involve translocation of chromosome arms, reflecting the difference of six chromosomes in L. japonicus vs. eight chromosomes in M. truncatula.
The example of a 141-kb region of M. truncatula at genetic marker MtEIL is shown in Fig. 4. All 16 predicted M. truncatula genes possess strong similarity to annotated genes in the Arabidopsis genome. A remarkable feature of this region is the frequent occurrence of local gene duplication, including two argonaut-like genes (MtEIL-e12), two blue-copper-binding proteins (MtEIL-k12), five kinase-like genes (MtEIL-l14), and two I-box-binding factors (MtEIL-m12). Analysis of the corresponding 97-kb segment from L. japonicus (LjEIL) revealed region-wide colinearity with the MtEIL contig. Ten distinct genes were identified in LjEIL, with only a single case of tandem duplication (LjEIL-d12). All 10 L. japonicus genes and a transfer RNALeu had homologs in the MtEIL region. Six of the 16 distinct homologs from the MtEIL and LjEIL regions exhibit a network of microsynteny with two homeologous regions of Arabidopsis chromosomes 2 and 5, respectively (Fig. 4).
| Discussion |
|---|
In the case of the legume family of plants, there are numerous species with a long history of traditional breeding but with limited molecular characterization, and there are several species that are characterized at both the genetic and genomic levels. Determining the extent of synteny (and the frequency and nature of its exceptions) among these legume genomes was the focus of this study. We report that synteny is high among closely related species, and that the degree of synteny declines with increasing phylogenetic distance. Although this study is unusual in its use of explicit phylogenetic measures to assess gene orthology and its incorporation of a large genome sequence data set, the features of genome conservation and divergence that we describe are typical of those observed in comparative analysis of both plant and animal genomes.
The high level of conservation between the genomes of M. truncatula and Pi. sativum is particularly striking given the 10 times larger genome (26) and one less chromosome in Pi. sativum. Much of the expansion in Pi. sativum genome size may be due to retroelements (27), but, whatever the mechanism, it has done little to disrupt macrosyntenic relationships. M. truncatula LG6 could not be effectively integrated into the Pi. sativum genetic map. MtLG6 is interesting for several reasons: (i) it is the shortest and most highly heterochromatic of the chromosomes (20), (ii) it is underrepresented for randomly selected EST markers (5), and (iii) it is remarkably rich in resistance gene analogs (RGAs) (28). The inability to establish synteny in this study between MtLG6 and Pi. sativum is undoubtedly due to the low frequency of non-RGA EST markers in MtLG6 (10) and the fact that the genetic maps of pea (13) are not well populated by RGA markers. However, parallel analyses conducted by Kaló et al. (29) suggest that M. sativa LG6 is syntenic with several regions in the pea genome, in particular PsLGVI and PsLGVII (Fig. 7d). The absence of a corresponding single linkage group in pea suggests that chromosomal fission/fusion events involving Medicago chromosome 6 might be responsible for the reduction of chromosome number in Pi. sativum.
L. japonicus (tribe Loteae) and M. truncatula represent the two best-characterized legume genomes. Although there are no important crop legumes within the Loteae, the relatively recent divergence and sister-clade relationship to the galegoid legumes (Fig. 1) offers a potentially useful point of comparison to M. truncatula. We determined that M. truncatula and L. japonicus share a remarkably high level of conserved macrosynteny, dominated by a few large chromosome arm-size rearrangements. The availability of fully sequenced large insert clones [i.e., BACs and transformation-competent BAC (TACs)] at each of these genetically syntenic loci provided an opportunity to evaluate the correlation between genetic macrosynteny and sequence microsynteny. Conserved microsynteny was characterized by ≈80% of close homologs in the same order and transcriptional orientation, similar to values obtained between human and mouse (30) and within the range identified for the grasses (7). The current analysis also reveals significant divergence between these two legume genomes, with the insertion or deletion of individual or groups of genes accounting for ≈20% divergence of gene content in microsyntenic intervals. Species-specific tandem duplication of genes accounted for an additional 12–17% divergence of gene content, and each species possessed a unique distribution of mobile DNAs. Of 21 tandemly duplicated genes, only one duplication was reciprocal. Similarly, of 26 cases of mobile DNAs, only one mobile DNA was potentially syntenic. The lack of ancestral tandem duplication is suggestive of either efficient removal of tandem duplicates that predate speciation or a recent increase in the rate of tandem duplication. Moreover, the observation that tandemly duplicated genes are occasionally interspersed with single copy genes (Table 4) suggests that purification of duplicates by homologous recombination would simultaneously eliminate the intervening single copy gene(s). Such a mechanism could explain, at least in part, the loss of gene homologs from microsyntenic regions.
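The measure of homologs "in the same order and transcriptional orientation" can be sketched as a small calculation. The approach below is an assumption made for illustration, not the procedure used in the study: homolog pairs are treated as one-to-one, each gene is given its rank along its clone and a strand, and the conserved set is taken as the longest run of pairs whose ranks increase in both clones with matching strands (or decrease with opposite strands, if one clone was sequenced in the reverse orientation). The input data and the function names are hypothetical.

```python
from bisect import bisect_left


def lis_length(values):
    """Length of the longest strictly increasing subsequence."""
    tails = []
    for v in values:
        i = bisect_left(tails, v)
        if i == len(tails):
            tails.append(v)
        else:
            tails[i] = v
    return len(tails)


def conserved_fraction(pairs):
    """Fraction of homolog pairs conserved in order and orientation.

    `pairs` holds (rank_a, strand_a, rank_b, strand_b) tuples, with strands
    given as '+' or '-'. Pairs are assumed to be one-to-one.
    """
    if not pairs:
        return 0.0
    ordered = sorted(pairs)  # walk along clone A
    # Clones in the same orientation: ranks increase, strands match.
    forward = [rb for _, sa, rb, sb in ordered if sa == sb]
    # Clones in opposite orientation: ranks decrease, strands are flipped.
    reverse = [-rb for _, sa, rb, sb in ordered if sa != sb]
    return max(lis_length(forward), lis_length(reverse)) / len(pairs)


# Hypothetical homolog pairs from one clone pair.
pairs = [
    (0, "+", 0, "+"),
    (1, "-", 1, "-"),
    (2, "+", 3, "+"),
    (3, "+", 2, "-"),  # locally inverted gene breaks orientation
    (4, "-", 4, "-"),
]

print(conserved_fraction(pairs))
```

Tandem duplicates would need to be collapsed to a single representative before applying this one-to-one calculation.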
Syntenic relationships were significantly more convoluted between the galegoid and phaseoloid clades. Twenty-five percent of genetically mapped orthologous genes were potentially nonsyntenic, resulting in smaller regions of colinearity than those observed between M. truncatula and Pi. sativum or between M. truncatula and L. japonicus. This fragmentation of synteny might be expected based solely on the differences in chromosome number between these two clades. Despite the relatively large number of genetic markers used for comparison, synteny between M. truncatula and soybean was difficult to characterize. We attribute this situation to a combination of recent duplication and low rates of polymorphism in the soybean genome. Nevertheless, comparison of putatively orthologous BAC clone pairs revealed significant conservation of microsynteny between M. truncatula and soybean, consistent with previous comparison of other genome regions between these two species (22, 31). In all cases, conservation of microsynteny was significantly greater between legume genomes than between legume genomes and the corresponding regions of the Arabidopsis genome (Figs. 4 and 9).
The genetic mapping of orthologous genes across multiple species provides an opportunity to propose an integrated view of legume synteny. As shown in Fig. 5, the results suggest broad macrosynteny, particularly within the galegoid or phaseoloid legumes, punctuated by chromosomal rearrangements that increase in frequency with phylogenetic distance. The inclusion of phylogenetic measures in marker analysis also aided the inference of genomic rearrangements. For example, in M. truncatula, orthologous marker PTSB maps to LG5, which is highly conserved with PsLGI (Fig. 2). In Pi. sativum, PTSB maps to a nonsyntenic region of PsLGIII, near the point of an inferred chromosomal rearrangement. The combination of PTSB orthology to a marker on MtLG5 and disrupted synteny of the flanking markers is consistent with a complex genome rearrangement involving both chromosomal fragment translocation and single gene translocation. Such events are likely to be significantly more frequent between the galegoid and phaseoloid legumes, because several markers that map to conserved regions in the galegoid legumes occur in nonsyntenic regions of V. radiata (Figs. 2, 5, and 8). This result is consistent with significant genomic changes that must underlie differences in chromosome number between galegoid (typically eight chromosomes) and phaseoloid (typically 11 chromosomes) lineages.
| Conclusion |
|---|
Although the current study documents substantial conservation between the genomes of crop and model legumes, it also reveals features of genome divergence. The degree to which genome synteny can facilitate cross-species analysis of gene function will depend both on the conservation of gene order and content, as well as on the frequency with which similar traits have a common genetic basis in different species. An indication that this latter criterion might not always be met is suggested by a recent study of branching in foxtail millet (34). By contrast, similar studies of symbiotic nitrogen fixation in legumes (reviewed in ref. 3) demonstrate that functionally conserved genes occupy syntenic positions across the diversity of legume species analyzed in this study. Moreover, even large and rapidly evolving gene families, such as the nucleotide-binding site-leucine-rich repeat resistance gene homologs, can occupy ancestral genome locations among legumes (28). Testing the extent to which inferences made from comparative genomics can be translated to practical applications in crop improvement represents one of the major current challenges facing plant biology.
| Footnotes |
|---|
Abbreviations: BAC, bacterial artificial chromosome; NCBI, National Center for Biotechnology Information.
H.-K.C., J.-H.M., and D.-J.K. contributed equally to this work.
¶¶ To whom correspondence should be addressed. E-mail: drcook{at}ucdavis.edu.
© 2004 by The National Academy of Sciences of the USA