Genome organization in dicots: Genome duplication in Arabidopsis and synteny between soybean and Arabidopsis

David Grant, Perry Cregan, and Randy C. Shoemaker
PNAS April 11, 2000 97 (8) 4168-4173; https://doi.org/10.1073/pnas.070430597
Edited by Ronald R. Sederoff, North Carolina State University, Raleigh, NC, and approved February 1, 2000 (received for review October 6, 1999)

Abstract

Synteny between soybean and Arabidopsis was studied by using conceptual translations of DNA sequences from loci that map to soybean linkage groups A2, J, and L. Synteny was found between these linkage groups and all four of the Arabidopsis chromosomes, where GenBank contained enough sequence for synteny to be identified confidently. Soybean linkage group A2 (soyA2) and Arabidopsis chromosome I showed significant synteny over almost their entire lengths, with only 2–3 chromosomal rearrangements required to bring the maps into substantial agreement. Smaller blocks of synteny were identified between soyA2 and Arabidopsis chromosomes IV and V (near the RPP5 and RPP8 genes) and between soyA2 and Arabidopsis chromosomes I and V (near the PhyA and PhyC genes). These subchromosomal syntenic regions were themselves homeologous, suggesting that Arabidopsis has undergone a number of segmental duplications or possibly a complete genome duplication during its evolution. Homologies between the homeologous soybean linkage groups J and L and Arabidopsis chromosomes II and IV also revealed evidence of segmental duplication in Arabidopsis. Further support for this hypothesis was provided by the observation of very close linkage in Arabidopsis of homologs of soybean Vsp27 and Bng181 (three locations) and purple acid phosphatase-like sequences and homologs of soybean A256 (five locations). Simulations show that the synteny and duplications we report are unlikely to have arisen by chance during our analysis of the homology reports.

The 145-Mbp genome of Arabidopsis thaliana is one of the smallest known among higher plants (1). Its low, interspersed, repetitive DNA content (2) makes it an ideal model for genomic studies. On the other hand, the soybean unreplicated haploid genome contains 1,115 Mbp (1). This almost 8-fold difference in genome size appears to be due to ancient polyploidization event(s) during the evolution of the Glycinea (3) and the high level of repetitive sequences in the soybean genome (4).

DNA hybridization under moderate stringency indicated that more than 90% of the nonrepetitive sequences in soybean are present in more than two copies, with the average chromosomal segment being duplicated approximately 2.55 times (3). Arabidopsis presents a different story. McGrath et al. (5) suggested that only about 15% of the Arabidopsis genes may be encoded by duplicate loci. A later study based on numbers of restriction fragment bands observed through hybridization with restriction fragment length polymorphism (RFLP) probes estimated that number to be 14% (6). Approximately 98% of Arabidopsis RFLP markers mapped to a single locus (7).

An analysis of approximately 25,000 Arabidopsis expressed sequence tags suggested that relatively few highly similar isoforms of genes are found in the Arabidopsis genome (8). Still, many examples of multigene families have been reported (9). A comparative mapping study between A. thaliana and Brassica oleracea revealed islands of conserved organization between the two chromosome complements (6) and identified a region of Arabidopsis chromosome I that seemed to be homeologous with a region on chromosome V. Short regions of synteny between a much broader sample of higher plant taxa have been reported, including a possible duplication between Arabidopsis chromosomes I and III (10). Recently, an analysis of a 400-kb contig from Arabidopsis chromosome IV uncovered a 45-kb segment that seemed to be duplicated on chromosome II (9). These isolated observations indicated that segmental duplications within the Arabidopsis genome may have occurred during its evolutionary past.

Comparative genome analyses between soybean and Arabidopsis could facilitate cross-utilization of genetic resources and tools of both species and could shed light on evolutionary events associated with the divergence of their seemingly disparate genomes. The public availability of data generated from various genomics programs makes possible the comparative analyses of plant genomes representing broadly divergent genera. However, the detection of duplicated genes by DNA hybridization is less effective than comparisons at the protein sequence level because of the degeneracy of the genetic code in directing amino acid sequence. Consequently, gene duplications that occurred long ago are not likely to be detected by hybridization techniques or direct DNA sequence comparisons although they may be inferred by comparisons of protein sequences.

The objectives of this project were to investigate the degree of synteny between Arabidopsis and soybean by using conceptual translations of newly available DNA sequences rather than hybridization techniques. During the course of this project we detected significant synteny between soybean and Arabidopsis. We also found compelling evidence for multiple segmental duplications or possibly whole genome duplication of the Arabidopsis genome during its evolutionary history.

Materials and Methods

Soybean RFLP probes were chosen from the composite molecular map described by Cregan et al. (11). Many of the probes have only a single reported map location, although upon hybridization to restriction enzyme-digested genomic DNA each probe produces an average of 2.55 RFLP bands (3). For this study, both the 23 clones from which simple sequence repeat (SSR) markers on linkage group J were derived and all 68 available RFLP probes that mapped to soybean linkage groups A2, L, and J were sequenced. These soybean linkage groups were chosen as being representative of a densely populated map (A2) and a pair of homeologous linkage groups (J and L).

Plasmid DNA of clones containing PstI fragments of soybean genomic DNA (12) was prepared by using alkaline lysis minipreps and Qiagen columns. Single sequence runs were made from both ends of the cloned soybean DNA insert by using primers located in the cloning vector. Sequencing was performed by the DNA Sequencing Facility at Iowa State University. Reactions used the Applied Biosystems Prism BigDye terminator cycle sequencing kit with AmpliTaq DNA Polymerase FS and were electrophoresed on an Applied Biosystems Prism 377 DNA sequencer. DNA sequences for the SSR clones were obtained previously (11).

The acid phosphatase (AP) gene located on soybean linkage group A2 has not been isolated. In this case, Entrez searches of the Arabidopsis bacterial artificial chromosome (BAC) annotations at the National Center for Biotechnology Information were used to find acid phosphatase-related sequences in Arabidopsis.

Homology searches were performed by using BLAST programs (13–15) at the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/) and AtDB (Stanford University; http://genome-www.stanford.edu/Arabidopsis/). Default parameter values were used for all homology searches.

Because comparisons were between two evolutionarily distant species, we analyzed all matches to Arabidopsis BAC sequence conceptual translations that met three criteria: the region of homology to the soybean sequence was both a subset of, and in the same translation frame as, the most significant match; the Expect value was less than 0.025; and a map location was listed in the summaries at the Arabidopsis Genome Analysis Project (Cold Spring Harbor Laboratory; http://nucleus.cshl.org/protarab/). Other sources of Arabidopsis sequence and map information used in this analysis were the Arabidopsis thaliana BAC Sequencing Project (The Institute for Genomic Research; http://www.tigr.org/tdb/at/atgenome/atgenome.html), the Kazusa Arabidopsis thaliana Genome Project (http://www.kazusa.or.jp/arabi/), The Nottingham Arabidopsis Stock Centre (http://nasc.nott.ac.uk/), and the Munich Information Centre for Protein Sequences Arabidopsis thaliana Sequencing Project (http://www.mips.biochem.mpg.de/proj/thal/).
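
The selection criteria in the preceding paragraph can be expressed as a simple filter. The sketch below is illustrative only: the `Hit` record, its field names, and the example hits are hypothetical and do not reproduce any parser or data used in this study. It assumes that the most significant match defines the reference region and frame even when that match itself has no map location.

```python
from dataclasses import dataclass
from typing import List, Optional

E_CUTOFF = 0.025  # Expect threshold stated in the text


@dataclass
class Hit:
    bac: str                 # name of the matching Arabidopsis sequence (hypothetical field)
    q_start: int             # start of the matched region on the soybean query
    q_end: int               # end of the matched region on the soybean query
    frame: int               # query translation frame (+1..+3 or -1..-3)
    expect: float            # Expect value reported for the match
    map_cm: Optional[float]  # map position in cM, or None if unmapped


def select_hits(hits: List[Hit]) -> List[Hit]:
    """Keep mapped hits below the cutoff that fall inside the best hit's region and frame."""
    if not hits:
        return []
    best = min(hits, key=lambda h: h.expect)  # most significant match
    return [
        h for h in hits
        if h.expect < E_CUTOFF
        and h.map_cm is not None
        and h.frame == best.frame
        and best.q_start <= h.q_start
        and h.q_end <= best.q_end
    ]


# Hypothetical hits for one soybean sequence.
example_hits = [
    Hit("BAC_A", 10, 400, 2, 1e-30, 55.0),  # best match, kept
    Hit("BAC_B", 50, 300, 2, 1e-4, 12.0),   # inside region, same frame, kept
    Hit("BAC_C", 50, 300, 3, 1e-6, 80.0),   # different frame, dropped
    Hit("BAC_D", 5, 300, 2, 1e-5, 20.0),    # extends beyond best region, dropped
    Hit("BAC_E", 60, 200, 2, 0.5, 33.0),    # Expect above cutoff, dropped
    Hit("BAC_F", 60, 200, 2, 1e-3, None),   # no map location, dropped
]
kept = [h.bac for h in select_hits(example_hits)]
```

Requiring weaker matches to lie within the region and frame of the strongest match is what allows a permissive Expect cutoff to be used without admitting unrelated similarities elsewhere in the query.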

To assess the probability that the inter- and intragenomic synteny we report was detected by chance, a simulated Arabidopsis genome was divided into appropriately sized bins based on either the number of BACs in the putative homologous regions or the genetic size of the regions. We then randomly placed “homologies” in as many bins as there were sequence homologies detected for each soybean sequence. The order of the simulated homologies in each bin was not considered in the analysis. At least 10,000 simulated genomes were analyzed for each case of putative synteny or duplication. A simulated genome was considered a match to our results if the number of bins containing at least one copy of each soybean sequence homolog was at least equal to the number of such bins actually observed. Complete details of the soybean/Arabidopsis homologies we found, along with the simulation algorithms and the results of the simulations, can be found at http://soybase.agron.iastate.edu/publication_data/Grant/synteny1.html
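
The binning procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the simulation code used for the reported results (which is documented at the URL above). The function name, the choice to place each sequence's homologies in distinct bins, and the example numbers are all hypothetical.

```python
import random


def simulate_p_value(n_bins, hits_per_probe, observed_bins, n_sims=10_000, seed=0):
    """Estimate how often random placement matches or exceeds the observation.

    n_bins         -- number of bins the simulated genome is divided into
    hits_per_probe -- for each soybean sequence, the number of homologies detected
    observed_bins  -- number of bins observed to contain a homolog of every sequence
    """
    rng = random.Random(seed)
    matches = 0
    for _ in range(n_sims):
        # Bins that still contain at least one homolog of every sequence so far.
        shared = set(range(n_bins))
        for k in hits_per_probe:
            # Assumption: homologies of one sequence fall in distinct bins.
            placed = rng.sample(range(n_bins), min(k, n_bins))
            shared &= set(placed)
        # Order within a bin is ignored, as in the text.
        if len(shared) >= observed_bins:
            matches += 1
    return matches / n_sims


# Hypothetical case: two sequences with one homology each, 10 bins, and one
# bin observed to contain both. The analytic expectation is 1/10.
p_example = simulate_p_value(n_bins=10, hits_per_probe=[1, 1], observed_bins=1)
```

Because locus order within a bin is not scored, an estimate of this kind is conservative with respect to any test that also required conserved order.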

Results

Synteny Between Soybean and Arabidopsis.

The gapped-tblastx program, which makes comparisons of predicted amino acid sequences by using each of the six reading frames (15), was used to compare DNA sequences from 68 soybean RFLP clones and 23 SSR-containing genomic DNA clones against all Arabidopsis sequences in GenBank. Conceptual translations of DNA sequences were used for the comparisons because they provide a more sensitive test of homology between evolutionarily widely separated species than do nucleotide sequence comparisons. The soybean RFLP probes used in this study were generated through the use of methylation-sensitive enzymes (12). This approach has long been thought to enrich for transcribed sequences (16). This is borne out by our finding that 72% of the RFLP clones showed significant homology to at least one Arabidopsis genomic or cDNA sequence. In contrast, only one of the sequences surrounding soybean SSRs had any detectable homology to Arabidopsis. In this case, the homology was to a putative exon and the relative position of the SSR was in an intron. The BLAST reports suggested that none of the homologies we detected were to known repetitive sequences. Many matches were to cDNAs or isolated genes that did not have a reported map location, although in some cases these sequences were contained in mapped BACs. The map location(s) of the matching BAC(s) was determined by using information provided at the Cold Spring Harbor web site.
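
Comparison in all six reading frames starts from six conceptual translations of each nucleotide sequence. The following sketch shows only that translation step under the standard genetic code; it is not tblastx, which also performs the alignment and scoring. The function names and the nine-base example sequence are illustrative assumptions.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate from the first base; unknown codons become 'X', '*' marks a stop."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Map frame number (+1..+3 forward, -1..-3 reverse strand) to its translation."""
    dna = dna.upper()
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[offset + 1] = translate(dna[offset:])
        frames[-(offset + 1)] = translate(rc[offset:])
    return frames


# Invented example sequence (not from this study).
frames = six_frame_translations("ATGGCCTAA")
# frames[1] == "MA*" and frames[-1] == "LGH"
```

Each of the six translated strings can then be compared at the amino acid level, which is why synonymous nucleotide changes do not reduce the similarity detected.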

Comparisons of map positions of linked soybean RFLP probes and Arabidopsis BACs revealed many regions of synteny. Fig. 1 shows the homologies detected between sequences from soyA2 and Arabidopsis chromosome I (arabI), where soybean sequences distributed along the entire linkage group had homologs on arabI. Because the homologous sequences in soybean and Arabidopsis have been separated for approximately 90 million years (17) and, at least in soybean, there have been 1–2 rounds of genome duplication since their divergence (3), we did not expect a priori that there would be any correlation between an RFLP's map position in soybean and the location in Arabidopsis of the BAC that contained the most significant homology. However, for those 14 sequences on soyA2 that have homologs in BACs on arabI, seven (50%) had the lowest expect value returned by tblastx, whereas four were the second lowest. Eleven of the 14 soybean sequences that revealed synteny between soyA2 and arabI were homologous to Arabidopsis genes or cDNAs. The remaining three with no reported matches to expressed sequences (A096, A117, and T153) had expect values of 1.7e-7, 4e-9, and 7e-34, respectively. In three instances two soybean sequences were homologous to distinct sequences in the same Arabidopsis BAC (F21M11, yUP8H12, and F14J16). Simulations based on our data were conducted to test the likelihood that the synteny we observed was an artifact of analyzing a very large data set. This appears not to be the case because only 23 of 10,000 simulated Arabidopsis genomes had a chromosome that contained all of the loci we report. Our simulations did not consider order of loci on the chromosomes. If we had, the number of matches found would perforce have been much lower.

Figure 1

Synteny between soyA2 and arabI. BACs showing homology to soybean sequences are indicated with lines connecting them to their soybean homolog(s). The soyA2 map at the right shows the modern linkage group, with each locus that was analyzed in this study indicated. The proposed progenitor soyA2 in the middle shows a rearranged soyA2 that maximizes the synteny with arabI. Soybean Vsp27 and AP cannot be distinguished at this level of analysis; this ambiguity is indicated by broken lines connecting Arabidopsis BAC F21M11 and the two soybean loci. The Arabidopsis map is drawn inverted relative to the usual presentation. Tic marks and numbers indicate 10-cM intervals on arabI. The previously identified regions of homeology between soybean linkage groups (3) are identified and shown as vertical lines to the right of soyA2.

The soybean vegetative storage protein gene Vsp27 (18), which maps more than 50 cM from AP on linkage group A2 (11), has acid phosphatase activity (19), and its DNA sequence shows significant homology to many nonpurple AP. For this reason, we were unable to determine unambiguously the orthologous or paralogous relationships between these genes and homologous sequences in Arabidopsis. To indicate this ambiguity we use broken lines in Fig. 1 to show the most parsimonious syntenic relationships.

Although there is substantial synteny between soyA2 and arabI, the maps are not completely colinear. However, the magnitude of the differences is similar to that found in other interspecies comparisons in which synteny has been observed and a limited number of chromosomal rearrangements need be invoked to explain how the observed locus orders were derived from the ancestral ones (20, 21). Fig. 1 demonstrates how one translocation and one inversion in the evolution of soyA2 substantially explain the observed map order differences between soybean and Arabidopsis. Interestingly, the putatively rearranged blocks of soyA2 indicated in Fig. 1 are very similar to regions of homeology reported in the soybean genome (3).
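
The way a small number of rearrangements can reconcile two locus orders may be easier to follow with a toy example. In the sketch below the marker names (m1 to m8), their orders, and the two helper functions are hypothetical; they do not correspond to the loci or breakpoints shown in Fig. 1.

```python
def invert(order, start, end):
    """Return a copy of the locus order with the segment order[start:end] reversed."""
    return order[:start] + order[start:end][::-1] + order[end:]


def translocate(order, start, end, dest):
    """Return a copy with the block order[start:end] moved to index dest of the remainder."""
    block = order[start:end]
    rest = order[:start] + order[end:]
    return rest[:dest] + block + rest[dest:]


# Hypothetical reference order (standing in for the order on the other genome).
reference = ["m1", "m2", "m3", "m4", "m5", "m6", "m7", "m8"]

# Hypothetical modern order that differs from the reference.
modern = ["m6", "m5", "m4", "m1", "m2", "m3", "m7", "m8"]

# One translocation (move the m1-m3 block to the front) ...
step1 = translocate(modern, 3, 6, 0)
# ... followed by one inversion (reverse the m6-m4 segment) ...
progenitor = invert(step1, 3, 6)
# ... restores the reference order.
```

In this toy case the two orders look quite different, yet two events suffice to make them colinear, which is the sense in which a proposed progenitor order is said to maximize synteny.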

Duplicated Segments Within the Arabidopsis Genome.

Most soybean RFLP sequences had strong homologies to BACs on more than one Arabidopsis chromosome. Surprisingly, we found that these multiple homologies often identified homeologous regions in Arabidopsis.

Segmental duplication involving three Arabidopsis chromosomes.

Fig. 2 shows the synteny between soyA2 and subchromosomal regions of arabI, arabIV, and arabV. To help in visualizing the relationships between soybean and Arabidopsis we have shown the proposed progenitor to soybean linkage group A2 in Fig. 2 because this map likely is more similar to that of their last common ancestor than is the arrangement of the current soyA2. Nine soybean loci spanning the entire length of soyA2 had homologs in an approximately 20- to 30-cM region of arabV (Fig. 2). A subset of these loci also had homologs on arabIV (Fig. 2A), and a distinct but partially overlapping subset had homologs on arabI (Fig. 2B). Eight of 11 soybean sequences had matches to Arabidopsis genes or cDNAs. The remaining three had significant homology only to BACs (A096, E = 1.7e-7; Bng205, E = 2e-8; B132, E = 5e-22). In addition to the shared homologies to soybean RFLP sequences, and providing further support of their homeologous relationships, the three chromosomal segments also each contain at least one copy of a disease-resistance gene and a phytochrome gene. RPP5 and RPP8 (arabIV and arabV, respectively) both confer resistance to Peronospora parasitica (downy mildew). RPP5 is a member of the TIR-NBS-LRR R-gene subclass whereas RPP8 is an example of the LZ-NBS-LRR subclass (22, 23). RPS5 (arabI) conditions resistance to Pseudomonas syringae and is also a member of the LZ-NBS-LRR R-gene subclass (24). The five members of the phytochrome gene family in Arabidopsis can be grouped into three lineages: PhyA, PhyB/D/E, and PhyC (25). Each of these ancient lineages is represented on only one of the three evolutionarily related segments (PhyA, arabI; PhyD/E, arabIV; PhyC, arabV). Simulations showed that the probability of observing such apparently duplicated segments by chance is approximately 0.036.

Figure 2

SoyA2 is shown with the proposed progenitor locus order (see Fig. 1). Only those loci that had significant homology to Arabidopsis sequences on arabIV or arabV are connected by lines, although tic marks for every soybean sequence analyzed are shown on the proposed progenitor soyA2 map. Thin lines indicate soybean sequences that had homologs on only one Arabidopsis chromosome. Broken lines are used to indicate uncertainty in syntenic relationships because of duplicated loci in soybean. Known genes in Arabidopsis are shown in bold type. Tic marks and numbers indicate 10-cM intervals on the Arabidopsis chromosomes. (A) Synteny between loci on soybean linkage group A2 and duplicated segments of Arabidopsis chromosomes IV and V. (B) Synteny between soybean linkage group A2 and duplicated segments of Arabidopsis chromosomes I and V.

A comparison of Fig. 2 A and B shows that some of the homologs to soybean sequences that define the arabIV/arabV homeologous regions also contribute to defining the arabI/arabV regions. Despite this similarity, both pairs of homeologous regions have slightly different orders of soyA2 homologs between their members. This surprising overlap of homologous sequence composition suggests that all three of the modern Arabidopsis chromosomal regions could have been derived from a single progenitor chromosome. Fig. 3 shows how, starting with a single chromosomal segment, a relatively simple series of chromosomal duplications and rearrangements could generate the order of homologous loci in all three regions of the modern Arabidopsis genome. In this model the progenitor chromosome contained single copies of each locus, which then diverged through a series of duplications and rearrangements to yield the chromosomal segments observed today.

Figure 3

Proposed evolutionary derivation of related regions of Arabidopsis chromosomes I, IV, and V. Green and orange are used to help track chromosomal segments only and do not necessarily indicate related functionality. In this model, an ancestral chromosome or chromosomal segment (protochromosome) was duplicated, producing lineages that culminated in parts of the modern Arabidopsis chromosomes I, IV, and V. In the arabIV/V lineage the path leading to arabIV branched off before the final inversion occurred. Vsp is used in the figure for both AP similarity and sequence homology to the soybean Vsp27 gene. Tic marks and numbers on the modern Arabidopsis chromosomes indicate 10-cM intervals.

Arabidopsis chromosomes II and IV share a duplicated segment.

Based on duplicate RFLP loci, soybean chromosomes J and L (soyJ and soyL) have been proposed to be homeologous (3). Our analysis showed that the upper 15–20 cM of both soybean linkage groups corresponded to approximately 6 cM on both arabII and arabIV (Fig. 4). Additionally, two linked markers in soybean (Sct_046 and A060) mapped to single BACs in each duplicated region. Other soybean sequences from this region also mapped to similar regions in arabII and arabIV, although clearly significant rearrangements have occurred in both genomes since their last common ancestor. Six of seven soybean sequences had matches to genes or cDNAs; B101 had a match to a BAC with an expect value of 9e-28. Simulations using the three loci in common between the homologous segments, but not considering that a single BAC in each region contained the same two soybean sequence homologies, show that the probability of all three occurring in two subgenomic regions is approximately 0.002.

Figure 4

Synteny between parts of homeologous soybean linkage groups J and L and duplicated segments of Arabidopsis chromosomes II and IV located at approximately 80 and 10 cM, respectively. Homologous sequences are connected by solid lines. Homologs in Arabidopsis to soybean sequences Sct_046 and A060 are located on a single BAC on both arabII and arabIV.

Multiple segmental duplications involving the acid phosphatase genes in Arabidopsis.

Associations of soybean sequences with acid phosphatase-like sequences in Arabidopsis suggest that segmental duplication may be a common event in Arabidopsis and may underlie some of the extensive gene duplication that has been reported for both species.

DNA sequence from soybean Vsp27 (18) shows homology to Arabidopsis nonpurple AP but not to the purple AP (PAP). In two instances, Vsp27 and Bng181 homologs appear on the same Arabidopsis BAC on arabI (F21M11) and arabII (T28M21) (Fig. 5). In addition, the Cold Spring Harbor table shows these homologs to be about nine BACs apart on arabV (MBD2 and MRH10). In three cases, Arabidopsis PAP-related sequences and homologs of soybean A256 map to the same BAC on arabII (F24L7) and arabIV (ESSA AP2 fragment 2) or are separated by one BAC (about 140 kb) on arabII (F16F14 and T24I21). In two other cases they are separated by five or seven BACs on arabIV (F13M23 and F20B18) or arabII (T22O13 and T17D12), respectively. In all cases the soybean RFLP homologs and the Vsp27 sequence or AP similarity are in different parts of the BAC. Each of these sequences or genes also appears several times in the Arabidopsis genome not in close association. Because only about 40% of the Arabidopsis genome has been annotated, the BACs we identified by using Entrez probably represent only a subset of the AP-like sequences in Arabidopsis. Simulations showed that the probability of such tight linkage between A256 and PAP is approximately 0.0003 whereas that for Bng181 and AP is approximately 0.008.
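
Detecting this kind of close association amounts to asking whether homologs of two sequences fall within a few BACs of each other on the same chromosome. The sketch below illustrates that check. The BAC names for arabI, arabII, and arabV are taken from the text, but the tiling-path indices, the assignment of MBD2 and MRH10 to particular homologs, the two "unlinked" entries, and the function itself are hypothetical.

```python
from itertools import product


def linked_pairs(hits_a, hits_b, max_bacs_apart):
    """Find homolog pairs on the same chromosome within max_bacs_apart BACs.

    Each hit is (chromosome, BAC name, index of the BAC along the tiling path).
    Returns sorted tuples of (chromosome, BAC of a, BAC of b, separation).
    """
    pairs = []
    for (chrom_a, bac_a, idx_a), (chrom_b, bac_b, idx_b) in product(hits_a, hits_b):
        separation = abs(idx_a - idx_b)
        if chrom_a == chrom_b and separation <= max_bacs_apart:
            pairs.append((chrom_a, bac_a, bac_b, separation))
    return sorted(pairs)


# Tiling-path indices below are invented for illustration.
vsp27_like = [
    ("I", "F21M11", 40),
    ("II", "T28M21", 15),
    ("V", "MBD2", 100),        # assumed to carry the Vsp27 homolog
    ("IV", "unlinked_1", 5),   # hypothetical hit with no nearby partner
]
bng181_like = [
    ("I", "F21M11", 40),
    ("II", "T28M21", 15),
    ("V", "MRH10", 109),       # assumed to carry the Bng181 homolog
    ("III", "unlinked_2", 60), # hypothetical hit with no nearby partner
]

pairs = linked_pairs(vsp27_like, bng181_like, max_bacs_apart=10)
```

With these inputs the two same-BAC associations and the arabV pair nine BACs apart are returned, while the unpaired hits are ignored.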

Figure 5

Locations of AP- and PAP-like sequences and homologs to soybean Vsp27, A256, and Bng181 on Arabidopsis chromosomes. Numbers on the maps show an approximate scale in centimorgans. Arabidopsis homologs to soybean AP/Vsp27 and Bng181 are tightly linked in three locations. Arabidopsis homologs to PAP and soybean A256 are tightly linked at five locations.

There were no obvious patterns in the expect values for paralogous sequences in the duplicated regions in Arabidopsis although both Bng181 and A256 are homologous to cDNAs in Arabidopsis. For example, BAC F21M11 on arabI contained homologs to Vsp27 (E = 3.1e-24) and Bng181 (E = 2.4e-3) whereas the paralogous sequences in BAC T28M21 on arabII had expect values of 1.5e-2 and 2.2e-11, respectively. This suggests that the genes in the duplicated regions have evolved independently since the duplication event.

The proposed soybean chromosomal rearrangements shown in Fig. 1 explain many of the differences in map order now seen between arabI and soyA2. The proposed telomeric translocation places Bng181 and A256 very close to the AP gene in the ancestral chromosome. It is tempting to hypothesize that this association represents the starting point for several segmental duplications in Arabidopsis.

Discussion

We have conducted a preliminary comparative analysis of the soybean genome with that of Arabidopsis. Using data from only three linkage groups of soybean and the information currently available from the Arabidopsis Genome Initiative, we were able to demonstrate synteny between the two genera and show that duplicated segments spanning 10–20 cM are common in Arabidopsis. To make our analysis more powerful, we compared genome organization of soybean and Arabidopsis through tblastx conceptual translations of the DNA sequences. Such comparisons may have a higher probability of recognizing true evolutionary relationships because (i) many changes at the DNA level are not reflected at the amino acid level due to the degeneracy of the genetic code and the functional similarity of some amino acids and (ii) subgenic regions or motifs often are widely conserved while the sequences between them are not. Most of the soybean RFLP sequences we analyzed showed homology to BACs from multiple locations in Arabidopsis. Because the sequences we used were from random, PstI-generated genomic fragments, and thus were not full-length genes, the homologies we observed were necessarily to subgenic regions. This makes it difficult to know whether we are detecting members of gene families. However, the high incidence of conserved amino acid sequences allowed us to infer the evolutionary relatedness of sequences in soybean and Arabidopsis without any direct knowledge of gene function in either organism. This approach revealed extensive genome duplication within the Arabidopsis genome. Our results suggest that extensive segmental duplication has occurred during the evolution of this genome or even that, similar to soybean, Arabidopsis may be an ancient paleopolyploid.

Despite the substantial colinearity observed between soyA2 and arabI, it is clear that these two chromosomes do not simply represent modern chromosomes that evolved from a single, common ancestor. Many loci on soyA2 have no detectable homolog on arabI, and, although we analyzed sequences from only three soybean linkage groups, loci from all three had homologs on arabI. This along with the numerous duplications we observed in Arabidopsis suggests that considerable chromosomal rearrangement involving small regions likely has occurred in both species. Elucidating the evolutionary history of these chromosomes is complicated further by the one or two genome duplications and subsequent diploidization in the soybean lineage (3). In most cases only one of the 2–4 loci detected by an RFLP probe in soybean has been mapped. Thus, the DNA sequence we used in these comparisons is not necessarily that of the mapped locus, which may explain why the matches we report between soyA2 and arabI were often but not always the most significant ones of those we observed.

The evidence we present for duplicated chromosomal regions in Arabidopsis is based on homologies between the predicted amino acid sequences derived from soybean genomic clones and Arabidopsis BACs. One of the decisions we had to make in making these comparisons was whether a given sequence similarity was evolutionarily significant or was simply due to chance. In this study we did not want to miss any weak homologies resulting from ancient duplications or speciation with subsequent divergence of the sequences. Thus, we included any low-scoring Arabidopsis sequences whose region of homology to the soybean sequence was both a subset of, and in the same translation frame as, the most significant match and that had expect values of less than 0.025, although such E values normally would not be considered significant. Although this means that potentially some matches might be accepted when they were not actually significant, we found only one soybean/Arabidopsis homology in a duplicated region where the probability of the match occurring by chance was greater than 1 in 50 (soyJ/A060 on arabII). In total, only eight of the homologies we report (5.5%) had probabilities of being due to chance of greater than 1:1,000.

The level of duplicated loci in Arabidopsis has been proposed to approximate a basal level of duplicated genes among crucifers (5). Our observation of extensive segmental genomic duplication covering regions in all of the Arabidopsis chromosomes for which sequence is available suggests that even a basal genomic level of redundancy in a higher eukaryote may include a high level of ancient genome duplication beyond the single-gene level.

Circumstantial evidence would suggest that all organisms have experienced at least one round of genome duplication in their phylogenetic past. Thus, all eukaryotes probably are ancient polyploids (26). There is a tendency for a polyploid genome to evolve into a diploid state through sequence diversification and chromosomal rearrangement (3, 26–28). This process of “diploidization” may result in changes in the amount of nuclear DNA because of additions and deletions, major genome restructuring because of rearrangements (29), as well as an accumulation of sequence and functional differences (30). Not surprisingly, then, our results indicate that at least one member of each pair of the large duplicated regions we identified in Arabidopsis has been rearranged since the duplication event.

Cretaceous fossil records have dated rosids of various types (a lineage that includes legumes) and a capparalean taxon (Capparales include Cruciferae) to about 92 million years ago (17), indicating that divergence of the lineages that gave rise to Cruciferae and legumes probably occurred at about that time. Despite this long period of separation, we were able to detect numerous instances of sequence homology and several regions of synteny between Arabidopsis and soybean. Our results, along with those reported previously between various Brassicas and Arabidopsis (20, 31, 32), Arabidopsis and cotton (10), among some legumes (33, 34), and between members of the Solanaceae (21), suggest that it should be possible to use the maps and molecular information developed for Arabidopsis widely throughout the dicots.

Acknowledgments

We thank Ms. Brianne Veach for technical assistance, Dr. T. Vision for critical reading of the manuscript, and Dr. D. Ashlock for statistics assistance. Contributions of the Corn Insect and Crop Genetics Research Unit, U.S. Department of Agriculture–Agricultural Research Service, Midwest Area, and Project 3236 of the Iowa Agriculture and Home Economics Experiment Station (Ames, IA) are acknowledged. This is journal paper no. 18657.

Footnotes

    • † To whom reprint requests should be addressed at: U.S. Department of Agriculture–Agricultural Research Service and Iowa State University, G304 Agronomy Hall, Ames, IA 50011. E-mail: dgrant@iastate.edu.

    • This paper was submitted directly (Track II) to the PNAS office.

    • Data deposition: The sequences reported in this paper have been deposited in the GenBank database (accession nos. AZ044886–AZ045004 and AF237389–AF237412).

    • Article published online before print: Proc. Natl. Acad. Sci. USA, 10.1073/pnas.070430597.

    • Article and publication date are at www.pnas.org/cgi/doi/10.1073/pnas.070430597

    Abbreviations

    RFLP, restriction fragment length polymorphism; SSR, simple sequence repeat; BAC, bacterial artificial chromosome; soyA2, soybean linkage group A2; arabI, Arabidopsis chromosome I; AP, acid phosphatase(s); PAP, purple AP(s)
    • Received October 6, 1999.
    • Copyright © The National Academy of Sciences

    References

    1. Arumuganathan K, Earle E D (1991) Plant Mol Biol Rep 9:208–218.
    2. Pruitt R E, Meyerowitz E M (1986) J Mol Biol 187:169–184, pmid:3701864.
    3. Shoemaker R C, Polzin K, Labate J, Specht J, Brummer E C, Olson T, Young N, Concibido V, Wilcox J, Tamulonis J P, et al. (1996) Genetics 144:329–338, pmid:8878696.
    4. Gurley W B, Hepburn A G, Key J L (1979) Biochim Biophys Acta 561:167–183, pmid:570420.
    5. McGrath J M, Jancso M M, Pichersky E (1993) Theor Appl Genet 86:880–888.
    6. Kowalski S P, Lan T-H, Feldmann K A, Paterson A H (1994) Genetics 138:499–510, pmid:7828831.
    7. Chang C, Bowman J L, DeJohn A W, Lander E S, Meyerowitz E M (1988) Proc Natl Acad Sci USA 85:6856–6860, pmid:2901107.
    8. Rounsley S D, Glodek A, Sutton G (1996) Plant Physiol 112:1177–1193, pmid:8938416.
    9. Terryn N, Heijnen L, De Keyser A, Van Asseldonck M, De Clercq R, Verbakel H, Gielen J, Zabeau M, Villaroel R, Jesse T, et al. (1999) FEBS Lett 445:237–245, pmid:10094464.
    10. Paterson A H, Lan T-H, Reischmann K P, Chang C, Lin Y-R, Liu S-C, Burow M D, Kowalski S P, Katsur C S, DelMonet T A, et al. (1996) Nat Genet 14:380–382, pmid:8944014.
    11. Cregan P B, Jarvik T, Bush A L, Shoemaker R C, Lark K G, Kahler A L, Kaya N, VanToai T T, Lohnes D G, Chung J, et al. (1999) Crop Sci 39:1464–1490.
    12. Keim P, Shoemaker R C (1988) Soybean Genet Newsletter 15:147–148.
    13. Altschul S F, Gish W, Miller W, Myers E W, Lipman D J (1990) J Mol Biol 215:403–410, pmid:2231712.
    14. Gish W, States D J (1993) Nat Genet 3:266–272, pmid:8485583.
    15. Altschul S F, Madden T L, Schaffer A A, Zhang J, Zhang Z, Miller W, Lipman D J (1997) Nucleic Acids Res 25:3389–3402, pmid:9254694.
    16. Burr B, Burr F, Thompson K, Albertson M, Stuber C (1988) Genetics 118:519–526, pmid:3366363.
    17. Gandolfo M, Nixon K, Crepet W (1998) Am J Bot 85:964–974.
    18. Staswick P E (1988) Plant Physiol 87:250–254.
    19. DeWald D B, Mason H S, Mullet J E (1992) J Biol Chem 267:15958–15964, pmid:1639823.
    20. Lagercrantz U (1998) Genetics 150:1217–1228, pmid:9799273.
    21. Livingstone K, Lackney V K, Blauth J R, van Wijk R, Jahn M K (1999) Genetics 152:1183–1202, pmid:10388833.
    22. Parker J E, Coleman M J, Szabo V, Frost L N, Schmidt R, van der Biezen E A, Moores T, Dean C, Daniels M J, Jones M D (1997) Plant Cell 9:879–894, pmid:9212464.
    23. McDowell J M, Dhandaydham M, Long T A, Aarts M G, Holub E B, Dangl J L (1998) Plant Cell 10:1861–1874, pmid:9811794.
    24. Warren R F, Henk A, Mowery P, Holub E, Innes R W (1998) Plant Cell 10:1439–1452, pmid:9724691.
    25. Mathews S, Sharrock R A (1997) Plant Cell Environ 20:666–671.
    26. Leipoldt M, Schmidtke J (1982) in Genome Evolution, eds Dover G, Flavell R (Academic, New York), pp 219–236.
    27. Song K, Lu P, Tang K, Osborn T (1995) Proc Natl Acad Sci USA 92:7719–7723, pmid:7644483.
    28. Lagercrantz U, Lydiate D (1996) Genetics 144:1903–1910, pmid:8978073.
    29. Ohno S (1970) Evolution by Gene Duplication (Springer, New York).
    30. Pickett F B, Meeks-Wagner D R (1995) Plant Cell 7:1347–1356, pmid:8589620.
    31. Cavell A C, Lydiate D J, Parkin I A P, Dean C, Trick M (1998) Genome 41:62–69, pmid:9549059.
    32. Conner J A, Conner P, Nasrallah M E, Nasrallah J B (1998) Plant Cell 10:801–812, pmid:9596638.
    33. Weeden N F, Muehlbauer F J, Ladizinsky G (1992) J Hered 83:123–129.
    34. Boutin S R, Young N D, Olson T C, Yu Z H, Shoemaker R C, Vallejos C E (1995) Genome 38:928–937.
    Genome organization in dicots: Genome duplication in Arabidopsis and synteny between soybean and Arabidopsis
    David Grant, Perry Cregan, Randy C. Shoemaker
    Proceedings of the National Academy of Sciences Apr 2000, 97 (8) 4168-4173; DOI: 10.1073/pnas.070430597
