BIOLOGICAL SCIENCES / MEDICAL SCIENCES
Evolution and expression of chimeric POTE–actin genes in the human genome
Laboratory of Molecular Biology, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892-4264
Contributed by Ira Pastan, September 25, 2006
| Abstract |
|---|
|
|
|---|
We previously described a primate-specific gene family, POTE, that is expressed in many cancers but in a limited number of normal organs. The 13 POTE genes are dispersed among eight different chromosomes and evolved by duplications and remodeling of the human genome from an ancestral gene, ANKRD26. Based on sequence similarity, the POTE gene family members can be divided into three groups. By genome database searches, we identified an actin retroposon insertion at the carboxyl terminus of one of the ancestral POTE paralogs. By Northern blot analysis, we identified the expected 7.5-kb POTE–actin chimeric transcript in a breast cancer cell line. The protein encoded by the POTE–actin transcript is predicted to be 120 kDa in size. Using anti-POTE mAbs that recognize the amino-terminal portion of the POTE protein, we detected the 120-kDa POTE–actin fusion protein in breast cancer cell lines known to express the fusion transcript. These data demonstrate that insertion of a retroposon produced an altered functional POTE gene. This example indicates that new functional human genes can evolve by insertion of retroposons.
ANKRD26 | cancer | primate | retroposon | testis
The mechanism by which novel gene functions evolve during speciation is of fundamental interest. Several such mechanisms have been suggested, including gene duplication, retroposition, and exon shuffling (5). Retro-elements, which include long interspersed elements (LINEs), short interspersed elements, and LTR retrotransposons, comprise 15–20% of the genome and can provide promoters, alternative exons, or poly(A) signals to potentially influence the transcription of affected genes and neighboring genomic loci (6–10). Their insertion may have played an important role in the speciation of hominoids by providing new regulatory modules capable of changing gene expression networks.
Analysis of genome sequence databases indicates that the POTE gene entered the primate genome before the divergence of the New World monkeys (NWMs) and the Old World monkeys (OWMs). This analysis also shows that a β-actin cDNA was inserted at the carboxyl terminus of one of the ancestral POTE paralogs before the divergence of the OWMs and apes and is now present in several POTE paralogs. This insertion has led to the formation of a new chimeric protein that contains both POTE and actin modules. There has apparently been strong selection pressure to keep the chimeric POTE–actin ORF intact during primate evolution.
| Results |
|---|
|
|
|---|
|
β-actin transcript and surrounded by putative target site duplications, a hallmark of a retroposition event (Fig. 1B) (11). Examination of the DNA sequence at the POTE–actin junction of these genes reveals an ORF extending from exon 11 of POTE into actin (Fig. 1A). This ORF is present in three members of group 3, POTE-2
, POTE-2
, and POTE-2
. In the other POTE paralogs of group 3 (POTE-2
, POTE-14
, POTE-14
and POTE-22), there is a stop codon before the actin module, caused by a small deletion in a preceding exon that produces a frameshift and premature termination.
Evolution of POTE Genes.
Because all of the POTE genes have exon 11 and a LINE element in their 3' UTR sequence, we used the POTE exon 11-LINE fusion as a marker of POTE gene formation in the primate lineage (probe 1 in Fig. 1A). To estimate the time of POTE gene emergence, we performed an extensive analysis of all available primate genome sequences by using the POTE exon 11-LINE junction sequence as a query. The POTE–LINE fusion sequences were detected in apes (chimpanzee, orangutan, and gibbon), OWMs (macaque and baboon), and NWMs (marmoset) (Fig. 2) (12). The fact that a POTE gene is found in marmosets indicates that the POTE gene ancestor had emerged before the divergence of NWMs and OWMs. The marmoset sequence is most similar to POTE-8 (group 1), which is likely to be the most ancient member of the POTE genes (3). However, the sequence similarities among members of the three groups are greater than that between the marmoset sequence and any POTE gene. Therefore, we deduce that the marmoset sequence might be a direct descendant of the ancestor of all POTE genes. The events that produced the three POTE gene groups must have occurred before the OWM–ape split but after the OWM–NWM split.
|
gene as a query (probe 2 in Figs. 1A and 2). The fusion sequence was found in all species that contain a POTE-2
homolog or a member of group 3 detected by probe 1 (Fig. 2). This observation and the fact that the POTE–actin fusion is found only in group 3 members indicate that the retroposition of actin occurred in the ancestor of group 3 before the divergence of OWMs and apes. By using the ancestral POTE ORF inferred from the ANKRD26 gene (3) or from a consensus sequence of all POTE genes, the actin sequence can be fused seamlessly to generate a single ORF, indicating that the ancestral POTE–actin fusion gene must have encoded a POTE–actin fusion protein. Subsequent rounds of duplication must have resulted in propagation of POTE–actin fusion genes in the human genome, some of which suffered nucleotide deletions that resulted in frameshifts and premature terminations. The actin retrogene in all group 3 POTE genes retains the original ORF without any deleterious mutation, indicating that the actin retrogene has been under purifying selection pressure since it was introduced in the ancestral genome of OWMs and apes. It also implies that the frameshift mutations that have led to premature terminations in some members of group 3 occurred very recently. The 4-bp deletion in exon 5 of POTE-2
was not detected in the chimpanzee genome sequence (March 2006 freeze), suggesting that the deletion might be human-specific. However, the 5-bp deletion in exon 10 of POTE-14
, POTE-14
, and POTE-22 was found in the chimpanzee genome, indicating that this deletion occurred before the human–chimp split.
Detection of RNA Containing POTE and Actin Sequences.
To determine whether the actin cDNA that is inserted into the 3' end of the POTE-2
, POTE-2
, and POTE-2
genes is transcribed into mRNA, we used several breast cancer cell lines, because we had previously shown that POTE-2
and POTE-2
were expressed in breast cancer (4). We performed RT-PCR with a primer pair in which one primer was located at a conserved sequence in POTE-2
, POTE-2
, and POTE-2
(exon 10) and the other in the actin insert of exon 11 (primers PA01 and PA07 shown in Fig. 1A). As the RNA source, we used mRNA from four different breast cancer cell lines (MCF-7, HTB-19, HTB-20, and HTB-30) that express POTE-2
, POTE-2
, and/or POTE-2
(Table 1). POTE is not expressed in the normal breast cell line MCF-10A, consistent with the lack of POTE expression in normal breast (4). However, when the MCF-10A cell line is transformed with the ErbB2 and c-Ha-Ras oncogenes, individually or together, to deregulate transcription, POTE-2
becomes highly expressed (Table 1).
|
|
7.5 kb in size was observed. This mRNA is large enough to represent a POTE–actin fusion transcript. Furthermore, the band is approximately the same size as POTE transcripts detected in prostate and testis, normal tissues that express many POTE paralogs (1). No band was detected in POTE-negative MCF-10A cells.
|
actin fusion protein.
Detection of POTE–Actin Fusion Protein in Breast Cancer Cell Lines by Specific Antibodies.
To determine which POTE proteins are present in different cancers and tissues, we have prepared mAbs that specifically react with various POTE paralogs (unpublished data). We used two of these mAbs to detect the POTE–actin fusion protein. Protein extracts were prepared from cell lines, then immunoprecipitation (IP) was performed with one mAb, HP8, and a Western blot analysis was performed with a different mAb, PG5. As shown in Fig. 5, a specific band of 120 kDa, the expected size for the POTE–actin fusion protein, was detected in protein extracts from the four breast cancer cell lines. No band was detected in that region in Raji, KB, or 293T cells that do not contain POTE transcripts. To demonstrate the performance of this anti-POTE IP-Western blot method, we transfected 293T cells with a POTE-2 isoform that does not contain actin sequences and detected the expected 40-kDa band (293T + 2
C lane in Fig. 5). This POTE isoform also was detected in HTB-19 and HTB-30 cells. As positive controls for Western blot analysis, cell lysates prepared from 293T cells transfected with each POTE-encoding plasmid were loaded directly in the first three lanes without IP. The PG5 antibody specifically detected all three POTE paralogs: POTE-22, POTE-2
C, and POTE-2
actin. We also attempted to determine whether anti-actin antibodies reacted with the fusion protein but were unsuccessful (data not shown). This failure probably reflects the relatively low affinity of these antibodies or a loss of reactivity attributable to changes in the amino acid sequence of actin (see below).
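The size relationship between the 120-kDa band and the 7.5-kb transcript can be checked with back-of-the-envelope arithmetic. The sketch below is illustrative only: it assumes the common rule-of-thumb average of about 110 Da per amino acid residue, which is not a value taken from this study, and the function names are hypothetical.

```python
# Rough check that a 7.5-kb mRNA can accommodate an ORF for a 120-kDa protein.
# ASSUMPTION: ~110 Da average mass per residue (a textbook rule of thumb).
AVG_RESIDUE_MASS_DA = 110.0


def residues_for_mass(mass_kda: float) -> int:
    """Approximate number of residues in a protein of the given mass."""
    return round(mass_kda * 1000.0 / AVG_RESIDUE_MASS_DA)


def min_coding_length_nt(residues: int) -> int:
    """Nucleotides needed to encode the residues plus one stop codon."""
    return 3 * residues + 3


if __name__ == "__main__":
    n_res = residues_for_mass(120.0)
    n_nt = min_coding_length_nt(n_res)
    print(f"~{n_res} residues, ~{n_nt} nt of coding sequence")
    print("fits in a 7.5-kb transcript:", n_nt <= 7500)
```

Under this assumption the ORF would occupy somewhat less than half of the transcript, leaving room for untranslated regions.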
|
|
| Discussion |
|---|
|
|
|---|
We used the DIVERGE part of the GCG program package (Wisconsin Package Version 10.2; Accelrys, San Diego, CA) to calculate the synonymous (Ks) and nonsynonymous (Ka) nucleotide substitution rates between the human β-actin gene and the actin part of the POTE genes. Some of the results are shown in Table 2. The Ka values between β-actin and the actin part of POTE genes are between 4% and 5%, comparable to the Ka value of 4% between nonactin parts of group 3 POTE genes 2
and 14 and somewhat smaller than 6% or 7% between nonactin parts of group 2 genes 15 and 21 and the group 3 gene 2
(1). This finding implies that the actin part of the gene is under purifying evolutionary pressure similar to that on the rest of the POTE gene. On the other hand, the Ks values between the β-actin gene and the actin part of the POTE genes are much larger than the Ks values either between the actin parts (Table 2) or between the nonactin parts (1) of different POTE genes. It has been observed (13, 14) that retrogenes derived from high-GC content parental genes but inserted into a low-GC content genomic region tend to show an increased rate of silent site substitutions, which make the GC content of the retrogene more similar to that of the new location. The GC content of the human and rhesus macaque authentic β-actin coding regions is 59%, whereas that of the consensus POTE parts ranges from 42% to 43%. The actin ORFs of the POTE genes show a reduced GC content of 57–58%, suggesting that they are still adapting to the local GC content.
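The GC-content comparison above can be reproduced for any coding sequence with a few lines of code. The following is a minimal sketch, not the procedure used in this study; the function names and the toy sequence are hypothetical. Because silent substitutions fall mostly at third codon positions, the sketch also reports GC content at those positions.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G or C among the A/C/G/T bases in seq."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * sum(b in "GC" for b in bases) / len(bases)


def gc3_percent(cds: str) -> float:
    """GC percentage at third codon positions of an in-frame coding sequence."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return gc_percent(cds[2::3])


# Hypothetical 4-codon example, NOT a real actin or POTE sequence.
toy_cds = "ATGGCCGATTAA"

if __name__ == "__main__":
    print(f"overall GC: {gc_percent(toy_cds):.1f}%")
    print(f"third-position GC: {gc3_percent(toy_cds):.1f}%")
```

Applied to the parental actin coding region, the actin ORFs of the POTE genes, and the nonactin POTE parts, such a calculation would yield the kind of percentages compared in the text.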
|
Introduction of premature stop codons into some of the POTE–actin genes apparently occurred after redundant copies were propagated. The actin part of these genes (POTE-14
, POTE-14
, and POTE-22) appears intact and could code for an actin protein were it not for the frameshift mutation upstream of actin. The Ka values between the actin parts of these genes and those of other genes in group 3 (Table 2 and data not shown) are similar. These observations indicate that the frameshift mutation was introduced quite recently and that the POTE genes may still be actively evolving. It is interesting that the POTE protein already contains spectrin-like domains, which are predicted to interact with actin. The fusion of actin with POTE apparently provides an additional way for POTE proteins to interact with the cytoskeleton.
| Materials and Methods |
|---|
|
|
|---|
Primers. The primers used in this study were synthesized by the Lofstrand Laboratory (Gaithersburg, MD), and the sequences of the primers used were as follows: T444, 5'-CAA TGC CAG GAA GAT GAA TGT GCG-3'; T445, 5'-TCT CTG GCC GTC TGT CCA GAT AGA T-3'; PA01, 5'-GAA CAA AAT GAT ACT CAG AAG CA-3'; and PA07, 5'-TGT TGA AGG TCT CAA ACA TGA TC-3'.
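As an illustration of how a primer pair such as PA01 and PA07 defines an amplification product, the sketch below performs a naive exact-match "in silico PCR". It is a hedged example rather than a tool used in this study: the function names are hypothetical, the template is a made-up toy sequence, and real primer analysis would also need to allow for mismatches and both template strands.

```python
# Primer sequences as listed above, with spaces removed.
PRIMERS = {
    "T444": "CAATGCCAGGAAGATGAATGTGCG",
    "T445": "TCTCTGGCCGTCTGTCCAGATAGAT",
    "PA01": "GAACAAAATGATACTCAGAAGCA",
    "PA07": "TGTTGAAGGTCTCAAACATGATC",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def in_silico_pcr(template: str, forward: str, reverse: str):
    """Return the product bounded by an exact-match primer pair, or None.

    The forward primer must match the given (sense) strand; the reverse
    primer must match its complement downstream of the forward site.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    rev_site_seq = reverse_complement(reverse)
    rev_site = template.find(rev_site_seq, start + len(forward))
    if rev_site == -1:
        return None
    return template[start:rev_site + len(rev_site_seq)]


# Hypothetical toy template: flank + PA01 site + spacer + PA07 site + flank.
toy_template = (
    "AAAA" + PRIMERS["PA01"] + "C" * 10
    + reverse_complement(PRIMERS["PA07"]) + "TTTT"
)

if __name__ == "__main__":
    product = in_silico_pcr(toy_template, PRIMERS["PA01"], PRIMERS["PA07"])
    print(len(product), "bp product")
```

On a real transcript sequence, a product would be returned only if the exon 10 site and the actin insert site lie on the same molecule, which is the logic behind using this primer pair to detect the fusion transcript.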
Computer Analysis of Genomic Databases.
Systematic search of the growing primate genome sequence information in the National Center for Biotechnology Information (NCBI) (Bethesda, MD) sequence database enabled us to estimate the time of POTE gene emergence and of the retroposition of a β-actin transcript. First, we prepared three query sequences, each 400 bp long (probe 1 in Fig. 1A), encompassing POTE exon 11 and the LINE element. These sequences were selected from the representative members of the three POTE gene groups (3): group 1, POTE-8; group 2, POTE-21; and group 3, POTE-2
. In addition, a query sequence for detection of the actin retroposition was prepared from POTE-2
(probe 2 in Fig. 1A). These four query sequences were used to detect the POTE gene and actin retroposition in primate genome sequences by using the NCBI BLAST server (http://ncbi.nih.gov/BLAST). If the entire query sequence aligned to a genomic sequence in a given species with >80% sequence identity, we concluded that the species possessed a POTE gene. Sequence databases used in this study include genome assemblies, GenBank (nr, htgs, chromosome, and wgs) databases, and whole-genome shotgun traces in the NCBI Trace database. Whole-genome shotgun traces were assembled into contigs with the CAP3 program (21).
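The decision rule described above (full-length alignment of the query with >80% identity) can be written down explicitly. The sketch below is one possible reading of that rule, not the authors' script; the `Hit` record, its field names, and the function name are hypothetical, and the values would have to be taken from the BLAST report for each hit.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Minimal summary of one alignment (hypothetical field names)."""
    query_start: int         # first aligned query position, 1-based
    query_end: int           # last aligned query position, inclusive
    percent_identity: float  # identity over the aligned region, in percent


def supports_pote_gene(hit: Hit, query_length: int,
                       min_identity: float = 80.0) -> bool:
    """True if the entire query aligned with identity above the threshold."""
    covers_entire_query = (
        hit.query_start == 1 and hit.query_end == query_length
    )
    return covers_entire_query and hit.percent_identity > min_identity


if __name__ == "__main__":
    full_length = Hit(query_start=1, query_end=400, percent_identity=85.0)
    partial = Hit(query_start=20, query_end=400, percent_identity=95.0)
    print(supports_pote_gene(full_length, query_length=400))  # full coverage
    print(supports_pote_gene(partial, query_length=400))      # partial coverage
```

Requiring coverage of the whole query, and not only high identity over a short stretch, is what keeps hits to the LINE element alone from being counted as evidence for the exon 11-LINE junction.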
Northern Blot Analysis. Total RNAs were isolated from different cell lines by using the RNA miniprep kit from Stratagene (La Jolla, CA), followed by isolation of the mRNA by using the Oligotex mRNA midi kit from Qiagen (Alameda, CA). Approximately 2.5 µg of mRNA from each sample was run on an agarose gel and transferred to a nylon membrane. The 1.2-kb probe was generated by PCR and labeled with ³²P by using the random priming extension method. Membranes were incubated for 2 h in hybridization buffer, followed by addition of denatured probe and incubation for an additional 12 h. Membranes were washed two times for 15 min each in 2× SSC/0.1% SDS at room temperature and then two times for 20 min each in 0.1× SSC/0.1% SDS at 60°C. The membranes then were subjected to autoradiography.
Expression Analysis of POTE–Actin Fusion Transcript by RT-PCR. Total RNAs from different cell lines were isolated by using TRIzol reagents (Invitrogen, Carlsbad, CA). First-strand cDNAs were synthesized from the isolated RNA by using a first-strand cDNA synthesis kit following the manufacturer's instructions (Amersham, Piscataway, NJ). PCR on cDNA from breast cancer cell lines was performed with primers specific for the POTE–actin fusion transcript (PA01 and PA07 shown in Fig. 1A).
Cloning of the POTE–Actin Fusion cDNA. RACE-ready cDNAs were prepared from MCF-10A-Ras/ErbB2 RNA by using a SMART RACE cDNA amplification kit (Clontech, Palo Alto, CA) and were used to perform 5' and 3' RACE reactions. Gene-specific primers T444 and PA01 were used for the 3' RACE reaction, and primer T445 was used for the 5' RACE reaction. Several individual clones from the RACE product were isolated and sequenced to establish the correct POTE–actin transcript sequence. Finally, the full-length cDNA encoding the POTE–actin ORF was amplified from MCF-10A-Ras/ErbB2 cDNA by PCR, and Taq-amplified PCR products were cloned with the Topo TA cloning kit (Invitrogen) and sequenced.
IP and Western Blot Analysis.
Cells were grown in log phase, washed with PBS, and harvested with a cell scraper. After centrifugation, cell pellets were lysed with RIPA buffer (50 mM Tris·HCl, pH 7.5/150 mM NaCl/5 mM EDTA/1% Nonidet P-40/1% deoxycholate/0.1% SDS/1 mM PMSF/1 µg/ml leupeptin/1 µg/ml aprotinin) on ice. After centrifugation, the protein concentration of the supernatants was determined by Coomassie Plus Protein Assay Reagent (Pierce, Rockford, IL). For IP, 4 mg of protein from each cell lysate was incubated with 10 µg of HP8 anti-POTE mAb (T.I., S.N., and I.P., unpublished data) at 4°C for 2 h, and then 10 µl of protein G-Sepharose beads was added and incubated for 1 h at 4°C. After washing, the proteins bound to the beads were dissolved in SDS/PAGE sample buffer and separated on a 4–20% gradient SDS/PAGE gel under reducing conditions. After transfer to a PVDF membrane (0.2 µm; Immuno-Blot; Bio-Rad, Richmond, CA), the POTE proteins on the membrane were detected by incubation with 0.5 µg/ml of PG5 anti-POTE mAb (T.I., S.N., and I.P., unpublished data) followed by alkaline phosphatase-labeled goat anti-mouse IgG2b (Invitrogen) (final 1/3,000 dilution) and 5-bromo-4-chloro-3-indolyl phosphate/p-nitroblue tetrazolium chloride substrate. As a control for IP, 3 µg of POTE-2
C/293T lysate was added to 4 mg of 293T lysates, and the IP was performed as described. For the controls in Western blot analysis, 3 µg of 293T cell lysates transfected with plasmids encoding each POTE paralog was run on the gel without IP.
Immunofluorescence Analysis. MCF-7 cells grown on a cover glass were washed with PBS and fixed with 4% paraformaldehyde for 30 min. After permeabilization with 0.5% Triton X-100 for 10 min, the samples were blocked with 3% BSA for 30 min. Samples then were stained with primary antibody (anti-PG5, 0.5 µg/ml) for 2 h in PBS/BSA, followed by Alexa Fluor-conjugated secondary antibody staining for 1 h. Filamentous actin was stained with TRITC-labeled phalloidin (Sigma, St. Louis, MO) during the secondary antibody incubation. Cells were washed four times with PBS after secondary antibody staining and then mounted on glass slides by using mounting medium with DAPI (Vector Laboratories, Burlingame, CA). Images were obtained by using the Zeiss (Thornwood, NY) LSM 510 confocal microscope.
| Acknowledgements |
|---|
|
|
|---|
| Footnotes |
|---|
Abbreviations: LINEs, long interspersed elements; NWM, New World monkey; OWM, Old World monkey; IP, immunoprecipitation; Ks, synonymous; Ka, nonsynonymous
*To whom correspondence should be addressed at: Laboratory of Molecular Biology, National Cancer Institute, 37 Convent Drive, Room 5106, Bethesda, MD 20892-4264. E-mail: pastani{at}mail.nih.gov
Author contributions: Y.L. and T.I. contributed equally to this work. B.L., T.K.B., and I.P. designed research; Y.L., T.I., D.H., A.S.F., X.-F.L., and T.K.B. performed research; T.I. and S.N. contributed new reagents/analytic tools; Y.L., Y.H., S.N., B.L., T.K.B., and I.P. analyzed data; and Y.L., Y.H., B.L., T.K.B., and I.P. wrote the paper.
The authors declare no conflict of interest.
| References |
|---|
|
|
|---|