There is currently no consensus on the evolutionary origin of eukaryotes. In search of the ancestors of eukaryotes, we analyzed the phylogeny of 46 genomes, including those of 2 eukaryotes, 8 archaea, and 36 eubacteria. To avoid the effects of gene duplications, we used inparalog pairs of genes with orthologous relationships. First, we grouped these inparalogs into the functional categories of the nucleus, cytoplasm, and mitochondria. Next, we counted the sister groups of eukaryotes in prokaryotic phyla and plotted them on a standard phylogenetic tree. Finally, we used Pearson’s chi-square test to estimate the origin of the genomes from specific prokaryotic ancestors. The results suggest that the eukaryotic nuclear genome descends from an archaeon that belonged to neither Euryarchaeota nor Crenarchaeota and that the mitochondrial genome descends from α-proteobacteria. In contrast, genes related to the cytoplasm do not appear to originate from a specific group of prokaryotes.
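The chi-square step described above can be illustrated with a minimal sketch. The abstract does not state the null hypothesis, so the sketch assumes that, if no phylum were favored, sister-group counts would be proportional to the number of genomes sampled per phylum. The phylum names and counts are hypothetical, and the function name `chi_square_goodness_of_fit` is illustrative only.

```python
def chi_square_goodness_of_fit(observed, expected_weights):
    """Pearson's chi-square statistic for observed counts against
    expected counts proportional to `expected_weights`.

    Returns (statistic, degrees_of_freedom, per_category_contributions).
    """
    if len(observed) != len(expected_weights):
        raise ValueError("observed and expected_weights must have equal length")
    total = sum(observed)
    weight_sum = sum(expected_weights)
    # Expected count per category under the assumed null hypothesis.
    expected = [total * w / weight_sum for w in expected_weights]
    contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
    return sum(contributions), len(observed) - 1, contributions


# Hypothetical data: how often each phylum is the sister group of eukaryotes,
# and how many genomes of each phylum were sampled (an assumed null model).
sister_counts = [30, 10, 20]
genomes_per_phylum = [2, 2, 2]

statistic, dof, parts = chi_square_goodness_of_fit(sister_counts, genomes_per_phylum)
print(statistic, dof, parts)
```

The statistic would then be compared with a chi-square distribution with `dof` degrees of freedom, for example using a statistics library or a critical-value table. The per-category contributions show which phyla deviate most from the assumed expectation.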
The Arabidopsis acaulis1-1 (acl1-1) mutant exhibits severe growth defects when grown at 22°C. The leaves are tiny and curled and the inflorescence stems are short. We identified an inversion mutation in the original acl1-1 plants. The acl1-1 plants were crossed with Columbia wild-type, and the acl1-1 phenotype and the inversion segregated from each other in the F2 generation. Compared to the original acl1-1 plants with the inversion, the genuine acl1-1 plants without the inversion grew larger and their inflorescence stems grew longer at 22°C. When the plants were grown at 24°C, the differences in growth became more apparent. We investigated the expression of genes located in the inversion. Two genes, one located at each end of the inversion, were disrupted, and their full-length transcripts were not expressed. The expression of some genes within and adjacent to the inversion was also altered. Our results indicate that the expression of multiple genes may be involved in the enhancement of the acl1-1 phenotype.
We investigated the evolutionary dynamics of wheat mitochondrial genes with respect to their structural differentiation during organellar evolution and to the mutations that occurred during cereal evolution. First, we compared the nucleotide sequences of three wheat mitochondrial genes to those of wheat chloroplast, α-proteobacterium and cyanobacterium orthologs. As a result, we were able to (1) differentiate the conserved and variable segments of the orthologs, (2) reveal the functional importance of the conserved segments, and (3) provide corroborating support for the α-proteobacterial and cyanobacterial origins of those mitochondrial and chloroplast genes, respectively. Second, we compared the nucleotide sequences of wheat mitochondrial genes to those of rice and maize to determine the types and frequencies of base changes and indels that occurred during cereal evolution. Our analyses showed that both the evolutionary rate, in terms of the number of base substitutions per site, and the transition/transversion ratio of the cereal mitochondrial genes were less than two-fifths of those of the chloroplast genes. Eight mitochondrial gene groups differed in their evolutionary variability, with RNA and Complex I (nad) genes being the most stable and Complex V (atp) and ribosomal protein genes the most variable. C-to-T transition was the most frequent type of base change; C-to-G and G-to-C transversions occurred at lower rates than all other changes. The excess of C-to-T transitions was attributed to C-to-U RNA editing that developed at an early stage of vascular plant evolution. In contrast, the editing of C residues at cereal T-to-C transition sites developed mostly during cereal divergence. Most indels were associated with short direct repeats, suggesting intra- and intermolecular recombination as an important mechanism for their origin. Most of the repeats associated with indels were di- or trinucleotides, although no preference was noticed for their sequences. The maize mt genome was characterized by a high incidence of indels compared to the wheat and rice mt genomes.
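The two quantities compared above, base substitutions per site and the transition/transversion ratio, can be sketched for a pair of aligned sequences as follows. This is a minimal illustration, not the authors' procedure: it uses the uncorrected proportion of differing sites (no multiple-hit correction), skips gap and ambiguous positions, and the example sequences are invented.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitutions(seq_a, seq_b):
    """Count transitions and transversions between two aligned sequences.

    Returns (transitions, transversions, compared_sites). Positions where
    either sequence has a gap or an ambiguous base are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # gap or ambiguity code: not comparable
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1  # purine<->pyrimidine
    return transitions, transversions, compared


# Invented aligned sequences, for illustration only.
ts, tv, sites = classify_substitutions("ACGTACGT-A", "ATGTCCGAGA")
substitutions_per_site = (ts + tv) / sites  # uncorrected p-distance
ts_tv_ratio = ts / tv if tv else float("inf")
print(ts, tv, sites, substitutions_per_site, ts_tv_ratio)
```

For divergent sequences, a substitution model that corrects for multiple hits at the same site would normally replace the raw proportion used here.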
Transposable elements (TEs) have played important roles in the evolution of genes and genomes of higher eukaryotes. Among the TEs in the rice genome, miniature inverted-repeat transposable elements (MITEs) exist at the highest copy number. Some MITEs in the rice genome contain poly(A) signals and putative cis-acting regulatory domains. Insertion events of such MITEs may have caused many structural and functional changes in genomes. A genome-wide examination of MITE-derived sequences could elucidate the contribution of MITEs to gene evolution. Here we report on the MITEs in the rice genome that have contributed to the emergence of novel genes and the expansion of the sequence diversity of the genome and mRNAs. Of the MITE-derived sequences, approximately 6000 were found in gene regions (exons and introns) and 67,000 in intergenic regions. In gene regions, most MITEs are located in introns rather than exons. For over 300 protein-coding genes, coding sequences, poly(A) sites, transcription start sites, and splicing sites overlap with MITEs. These sequence alterations via MITE insertions potentially affect the biological functions of gene products. Many MITE insertions also exist in 5’-untranslated regions (UTRs), 3’-UTRs, and in the proximity of genes. Although mutations in these non-protein-coding regions do not alter protein sequences, these regions have key roles in gene regulation. Moreover, MITE family sequences (Tourist, Stowaway, and others) are unevenly distributed in introns. Our findings suggest that MITEs may have contributed to the expansion of genome diversity by causing alterations not only in gene functions but also in the regulation of many genes.
Natural selection operating at the amino acid sequence level can be detected by comparing the rates of synonymous (rS) and nonsynonymous (rN) nucleotide substitutions, where rN/rS (ω) > 1 and ω < 1 suggest positive and negative selection, respectively. The branch-site test has been developed for detecting positive selection operating at a group of amino acid sites for a pre-specified (foreground) branch of a phylogenetic tree by taking into account the heterogeneity of ω among sites and branches. Here the performance of the branch-site test was examined by computer simulation, with special reference to the false-positive rate when the divergence of the sequences analyzed was small. The false-positive rate was found to be inflated when the assumptions made about the ω values for the foreground and other (background) branches in the branch-site test were violated. In addition, under similar conditions, false-positive results were often obtained even when Bonferroni correction was applied and the false-discovery rate was controlled in a large-scale analysis. False-positive results were also obtained even when the number of nonsynonymous substitutions for the foreground branch was smaller than the minimum value required for detecting positive selection. The existence of a codon site at which multiple nonsynonymous substitutions could have occurred on the foreground branch often caused the branch-site test to falsely identify positive selection. In the re-analysis of orthologous trios of protein-coding genes from humans, chimpanzees, and macaques, most of the genes previously identified as positively selected on the human or chimpanzee branch by the branch-site test contained such a codon site, suggesting that a significant fraction of these genes may be false positives.
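The two multiple-testing corrections mentioned above can be sketched as follows. The abstract does not name the false-discovery-rate procedure used, so the Benjamini-Hochberg step-up procedure is assumed here as a common choice. The p-values are invented and do not come from the study.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject a test only if its p-value is at most alpha / number_of_tests."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure (assumed FDR method).

    Finds the largest rank k with p_(k) <= alpha * k / m and rejects
    the k tests with the smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            max_rank = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


# Invented per-gene p-values from a hypothetical genome-wide scan.
p_values = [0.001, 0.02, 0.03, 0.2]
print(bonferroni(p_values))          # [True, False, False, False]
print(benjamini_hochberg(p_values))  # [True, True, True, False]
```

Both corrections take the p-values as given. If the test itself yields p-values that are too small when its assumptions are violated, as the simulations above indicate, neither correction can remove the resulting false positives.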
Acute myeloid leukemia (AML) is a hematologic malignancy characterized by recurring chromosomal abnormalities. The chromosome translocation t(9;11)(p22;q23) is one of the most common genetic aberrations and results in the formation of the MLL-AF9 fusion gene, which directly facilitates cell growth. In order to study this type of AML, cell lines with cytogenetically diagnosed t(9;11)(p22;q23), such as Mono Mac 6 (MM6), have been widely used. To examine whether gene expression differs between primary human t(9;11) AML cells and the MM6 cell line, genome-wide transcriptome analysis was performed on the MM6 cell line using SAGE, and the results were compared to the profile of primary human t(9;11) AML cells. A total of 884 transcripts that were differentially expressed between MM6 cells and primary human t(9;11) cells were identified by statistical analysis (P < 0.05) and a 4-fold expression change. Of these transcripts, 830 (94%) matched known genes or ESTs and were classified by functional category (http://david.abcc.ncifcrf.gov/). The majority of differentially expressed genes in MM6 were involved in biosynthetic and metabolic processes, but HRAS, whose protein product is known to be associated with leukemogenesis, was expressed only in MM6 cells, and several other genes involved in the Erk1/Erk2 MAPK pathway were also over-expressed in MM6. Therefore, since the MM6 cell line has an expression profile that is generally similar to that of primary human t(9;11) AML and uniquely expresses a strong Erk1/Erk2 MAPK pathway including HRAS, it can be used as a model for HRAS-positive t(9;11) AML.
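The selection criteria stated above (P < 0.05 together with a 4-fold expression change) can be sketched as a simple filter. This is an illustration under assumptions, not the authors' pipeline: the tag counts are taken to be already normalized to a common library size, p-values are taken as precomputed by whatever test was used, a tag present in only one sample is treated as an infinite fold change, and all records are invented.

```python
import math


def fold_change(count_a, count_b):
    """Direction-independent fold change (always >= 1).

    Returns infinity when exactly one of the counts is zero, i.e. the
    transcript is detected in only one of the two samples.
    """
    if count_a == count_b:
        return 1.0
    high, low = max(count_a, count_b), min(count_a, count_b)
    return math.inf if low == 0 else high / low


def select_transcripts(records, p_cutoff=0.05, min_fold=4.0):
    """Keep names of records with p < p_cutoff and fold change >= min_fold.

    Each record is (name, normalized_count_a, normalized_count_b, p_value).
    """
    return [
        name
        for name, count_a, count_b, p_value in records
        if p_value < p_cutoff and fold_change(count_a, count_b) >= min_fold
    ]


# Invented SAGE tag records: (tag, cell-line count, primary-cell count, P).
records = [
    ("tag1", 40, 8, 0.001),   # 5-fold, significant: kept
    ("tag2", 12, 0, 0.01),    # detected in one sample only: kept
    ("tag3", 30, 10, 0.001),  # 3-fold: dropped
    ("tag4", 50, 5, 0.2),     # 10-fold but not significant: dropped
]
print(select_transcripts(records))  # ['tag1', 'tag2']
```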
FBXW7 has been reported to be a candidate tumor suppressor gene on 4q31. Three isoforms (α-form, β-form, and γ-form) of FBXW7 are produced from mRNAs with distinct 5’ exons. Our previous study identified the specific suppression of the mRNA expression of the FBXW7 β-form in human gliomas. Because this form is the major FBXW7 isoform in the human brain, we sought to elucidate the silencing mechanisms for the FBXW7 β-form in gliomas. No genetic alterations were found in the whole FBXW7 gene, including the putative promoter region of the β-form. Treatments with 5-azacytidine and trichostatin A did not induce re-expression. A sodium bisulfite-modification assay indicated that CpG sequences in the promoter of the FBXW7 β-form were not methylated in glioma cells. We also examined FBXW7 expression and sodium bisulfite sequences in normal human peripheral blood cells and, surprisingly, found that the mRNA expression of the FBXW7 β-form was highly suppressed and the CpG sequences in the promoter region of the FBXW7 β-form were heavily methylated. Our data suggest that the inactivation of the FBXW7 β-form plays an important role in the pathogenesis of gliomas and that an unknown mechanism (or mechanisms) other than mutation and methylation is the major cause of the suppression of the FBXW7 β-form in gliomas.
In order to analyze the pattern of DNA polymorphism in detail, we have developed a simple method using a new statistic, θi, which estimates 4Nμ from the number of segregating sites whose allelic nucleotide frequency is i/n among n DNA sequences, where N is the effective population size and μ is the mutation rate per generation per nucleotide site. Under the assumption that mutations are selectively neutral and the population size is constant, the expectation of θi is equal to that of θ, which estimates 4Nμ from the total number of segregating sites, so that the distribution of θi is flat. Therefore, the departure of the distribution of θi from the horizontal line, which represents the value of θ, reflects changes in population size and natural selection. Results of the coalescent simulation show that the distributions of θi in populations that experienced expansion and reduction are U-shaped and upside-down U-shaped, respectively. The distributions of θi in some populations that experienced a bottleneck are W-shaped. Furthermore, we have applied this method to the SNP data from the International HapMap Project. Results of the data analyses show that the distributions of θi in the CEU (European), CHB and JPT (Asian) populations are different from that in the YRI (African) population. From these results for nuclear DNA and the known pattern of polymorphism in human mitochondrial DNA, we infer that the CEU, CHB and JPT populations experienced a bottleneck.
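The idea of a frequency-class estimator with a flat expectation can be illustrated with a minimal sketch. The exact definition of θi is not given in this abstract, so the sketch relies on a standard result of the neutral, constant-size model: the expected number of segregating sites at which the derived allele occurs i times among n sequences is θ/i, which makes i times the observed count an estimator of θ for every class. Treating the counts as an unfolded (derived-allele) spectrum, the function names, and the example spectrum are all assumptions for illustration.

```python
def theta_by_frequency_class(sfs):
    """Per-class estimates of theta from an unfolded site frequency spectrum.

    `sfs[i - 1]` is the number of segregating sites whose derived allele
    occurs i times among n = len(sfs) + 1 sequences. Under neutrality and
    constant population size, E[count_i] = theta / i, so i * count_i
    estimates theta in every class (assumed definition, see lead-in).
    """
    return [i * count for i, count in enumerate(sfs, start=1)]


def watterson_theta(sfs):
    """Watterson's estimator: total segregating sites divided by
    a_n = 1 + 1/2 + ... + 1/(n - 1)."""
    n = len(sfs) + 1
    a_n = sum(1.0 / k for k in range(1, n))
    return sum(sfs) / a_n


# Invented spectrum for n = 5 sequences that exactly matches the neutral
# expectation theta / i with theta = 12, giving a flat profile.
sfs = [12, 6, 4, 3]
print(theta_by_frequency_class(sfs))  # [12, 12, 12, 12]
print(watterson_theta(sfs))
```

Dividing both estimates by the number of nucleotide sites analyzed gives per-site values. Plotting the per-class estimates against i, with Watterson's estimate as a horizontal reference line, gives the kind of profile whose shape the abstract interprets.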