Recent advances in metabolomics technology have enabled large-scale comprehensive analyses of metabolites, but the throughput of data processing of non-targeted, quantitative differential analyses is very low. It is crucial to solve this problem to generate biological hypotheses from a large-scale dataset. To improve the analysis of metabolite data, we focused on the processing of quantitative differential analysis after multiple peak alignment. We have developed a program named FAQuant that automatically selects reliable peaks from each chromatogram, quantifies the mean of peak intensity to compare between sample groups, and selects the peaks with differences in accumulation of metabolites. This program was incorporated into a quantitative differential metabolome pipeline as a module to improve the throughput of gas chromatography-mass spectrometry dataset analysis. As a result, the module incorporation largely reduced the total processing time. Furthermore, differential analysis of metabolites in soybean (Glycine max) cultivars was demonstrated by use of the system. This system might facilitate biological hypothesis generation from large-scale comparative metabolome analysis.
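The three steps described in the abstract above (select reliable peaks, quantify mean intensity per sample group, select peaks that differ) can be sketched roughly as follows. This is an illustrative sketch only, not FAQuant's actual procedure: the reliability rule (a minimum fraction of samples in which the peak is detected), the use of Welch's t statistic, the threshold values, and all function names are assumptions made for the example.

```python
from math import sqrt
from statistics import mean, stdev


def welch_t(a, b):
    """Welch's t statistic for two lists of intensities (each of length >= 2)."""
    se = sqrt(stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b))
    diff = mean(a) - mean(b)
    if se == 0:
        # Both groups have zero variance: no evidence if means are equal,
        # otherwise treat the difference as unbounded.
        return 0.0 if diff == 0 else float("inf") * (1 if diff > 0 else -1)
    return diff / se


def differential_peaks(peaks, min_detected=0.75, min_abs_t=4.0):
    """Return (peak_id, mean_a, mean_b, t) for peaks that differ between groups.

    peaks maps a peak id to (group_a, group_b), two lists of aligned peak
    intensities in which None marks a sample where the peak was not detected.
    Both thresholds are hypothetical defaults chosen for illustration.
    """
    selected = []
    for peak_id, (group_a, group_b) in peaks.items():
        a = [x for x in group_a if x is not None]
        b = [x for x in group_b if x is not None]
        # Assumed reliability rule: detected in enough samples of each group.
        if len(a) < min_detected * len(group_a) or len(b) < min_detected * len(group_b):
            continue
        if len(a) < 2 or len(b) < 2:
            continue
        t = welch_t(a, b)
        if abs(t) >= min_abs_t:
            selected.append((peak_id, mean(a), mean(b), t))
    return selected
```

In practice a p-value with multiple-testing correction would usually replace the fixed threshold on t; the fixed cutoff is kept here only to keep the sketch short.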
Near-infrared spectroscopy is used to analyze various foods because the measurement time is very short and its coverage is very wide. The technique is recognized as important for fingerprinting in addition to element measurement. This research aimed to develop data analysis software for metabolic fingerprinting of food using near-infrared spectroscopy. The software was written in Java, which is well suited to developing graphical user interfaces. After preprocessing spectra in the 1000–2500 nm wavelength range by Standard Normal Variate, the software can perform feature extraction using multiple derivatives and Spearman's correlation with an arbitrary dependent variable. In addition, the software can visualize trends in the data by the PCA method and determine a regression model by the PLS method. We demonstrated the usability of this software using Japanese green tea samples. A set of ranked green tea samples from a Japanese commercial tea contest was analyzed by Fourier transform near-infrared (FT-NIR) reflectance spectroscopy. The FT-NIR data were analyzed with our software, and a quality prediction model was constructed. This prediction model had sufficiently high accuracy.
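The preprocessing and feature-extraction steps named in the abstract above (Standard Normal Variate, derivatives, Spearman's correlation) can be illustrated with a small sketch. The original software is written in Java and its internals are not described here, so this is not its implementation: the function names, the use of a simple first difference as the derivative, and the sample standard deviation in SNV are all assumptions for the example.

```python
from math import sqrt
from statistics import mean, stdev


def snv(spectrum):
    """Standard Normal Variate: centre and scale one spectrum by its own mean and SD."""
    m, s = mean(spectrum), stdev(spectrum)  # assumes a non-constant spectrum
    return [(x - m) / s for x in spectrum]


def first_derivative(spectrum):
    """First difference between neighbouring wavelengths (a simple derivative stand-in)."""
    return [b - a for a, b in zip(spectrum, spectrum[1:])]


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den if den else 0.0  # constant input treated as uncorrelated


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))
```

To rank wavelengths as candidate features, one would compute `spearman` between the preprocessed absorbance at each wavelength (across samples) and the dependent variable, for example the contest rank of each tea sample.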
Secondary metabolites are highly species-specific and play important roles in the survival of the producing organism within its natural habitat. Systematization of secondary metabolic pathways is necessary to understand species-specific metabolic pathways and to develop new drugs, etc. To attain this, we have made a database system called KNApSAcK, which describes the relationships between metabolites and species. On March 21, 2009, KNApSAcK had 34,852 metabolite entries and 68,819 metabolite-species pair entries. Though the chemical structures of around 50,000 secondary metabolites are known, information on their pathways is very limited. In this work, we have developed an algorithm to predict metabolic pathways on the basis of chemical structures of metabolites by exploiting the information contained in their cyclic substructures. Also, to handle a huge amount of these metabolites and to predict metabolic pathways automatically, we have developed a software tool called MetClassifier. MetClassifier is written in C and uses the OpenGL, GLUT (http://www.opengl.org/resources/libraries/glut/) and AntTweakBar (http://www.antisphere.com/Wiki/tools:anttweakbar) libraries. MetClassifier can be downloaded from the following URL (http://kanaya.naist.jp/MetClassifier/).
Novel tools are needed for comprehensive comparisons of the inter- and intraspecies characteristics of the large amounts of available genome sequences. An unsupervised neural network algorithm, Kohonen's Self-Organizing Map (SOM), is an effective tool for clustering and visualizing high-dimensional complex data on a single map. We modified the conventional SOM for genome informatics on the basis of Batch Learning SOM (BLSOM), making the resulting map independent of the order of data input. We generated BLSOMs for oligonucleotide frequencies in fragment sequences (e.g. 10-kb) from 13 plant genomes for which almost complete genome sequences are available. BLSOM recognized species-specific characteristics (key combinations of oligonucleotide frequencies) in most of the fragment sequences, permitting classification (self-organization) of sequences according to species without any information regarding the species during computation. To disclose sequence characteristics of a single genome independently of other genomes, we constructed BLSOMs for sequence fragments from one genome plus computer-generated random sequences. Genomic sequences were clearly separated from random sequences, revealing the oligonucleotides with characteristic occurrence levels in the genomic sequences. We discussed these oligonucleotides, which are diagnostic for genomic sequences, in connection with genetic signal sequences. Because its classification and visualization power is very high, BLSOM is thought to be an efficient and powerful tool for extracting a wide range of genomic information.
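The input to a map such as the one described above is a vector of oligonucleotide frequencies per sequence fragment. A minimal sketch of how such vectors might be computed is given below; it covers only this feature-extraction step, not the BLSOM training itself. The function names, the default oligonucleotide length, and the decision to skip windows containing ambiguous bases are assumptions for illustration.

```python
from itertools import product


def oligo_frequencies(seq, k=3):
    """Relative frequency of every k-mer over A/C/G/T, in lexicographic order."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # windows with N or other symbols are skipped
            counts[window] += 1
            total += 1
    return [counts[w] / total if total else 0.0 for w in kmers]


def fragment_vectors(genome, fragment_len=10000, k=3):
    """Cut a sequence into non-overlapping fragments and vectorise each one.

    A trailing piece shorter than fragment_len is discarded.
    """
    return [
        oligo_frequencies(genome[start:start + fragment_len], k)
        for start in range(0, len(genome) - fragment_len + 1, fragment_len)
    ]
```

Each fragment yields a vector of length 4^k (64 for trinucleotides), and these vectors are what an SOM would cluster.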
In this study, we used a graphical representation to integrate, visualize, search, and analyze information on metabolite identifiers. By considering the links between metabolite identifiers described in the metabolite databases to be the edges between vertices in a graph, several metabolite databases can be integrated into a database without defining a new metabolite identifier code. The graphical visualization of metabolite identifier network enables us to understand the meaning of each metabolite identifier and their relationship with associated identifiers. The projection of actual metabolome data on the pathway map was also attained by using the converter function of the metabolite identifier database. We demonstrated that other metabolite-related information, such as chemical ontology and species-metabolite relationship, can be incorporated into the network, and performed an analysis of plant species-alkaloid ontology relationship.
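The idea in the abstract above, treating cross-references between database identifiers as edges of a graph, can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the authors' system: identifiers are assumed to be strings of the form `database:accession`, and the database names and accessions used with it (for example `dbA:1`) are hypothetical placeholders.

```python
from collections import defaultdict, deque


def build_identifier_graph(links):
    """Undirected graph: each cross-reference between two identifiers is an edge."""
    graph = defaultdict(set)
    for a, b in links:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def linked_identifiers(graph, start):
    """All identifiers reachable from start (its connected component), via BFS."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


def convert(graph, identifier, target_db):
    """Converter sketch: identifiers of target_db linked, directly or not, to identifier."""
    prefix = target_db + ":"
    return sorted(i for i in linked_identifiers(graph, identifier) if i.startswith(prefix))
```

Because a connected component stands in for "the same metabolite", no new unified identifier code has to be defined; the trade-off is that a single wrong cross-reference merges two components.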
Recent advances in genome research have yielded a vast amount of large-scale data (e.g. DNA microarray) and have begun to deepen our understanding of plant cellular systems. Meta-analysis such as gene coexpression across publicly available microarrays has demonstrated that this approach is useful for investigating transcriptome organization and for predicting unknown gene functions in biological processes ranging from yeast to humans. However, no overall coexpression-network module in rice has been examined in detail. Here we present the coexpression clusters of rice genes based on unbiased graph clustering of the coexpression network of 4,495 genes. The coexpression network was constructed by using over 230 microarrays; it manifested several properties of a typical complex network (e.g. scale-free degree distribution). Using the DPClus algorithm that can extract densely connected clusters we detected 1,220 clusters. We evaluated these clusters using gene ontology enrichment analysis. We conclude that this approach is important for generating experimentally testable hypotheses for uncharacterized gene functions in rice and we posit that meta-analysis across publicly available microarrays will become increasingly important in crop science.
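A coexpression network of the kind described above is commonly built by connecting gene pairs whose expression profiles correlate strongly across arrays. The sketch below assumes Pearson correlation with a fixed cutoff; the abstract does not state the measure or threshold used, so both, along with the function names, are assumptions. The DPClus clustering step is not reproduced here.

```python
from itertools import combinations
from math import sqrt
from statistics import mean


def pearson(x, y):
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den if den else 0.0  # constant profile treated as uncorrelated


def coexpression_edges(expression, threshold=0.9):
    """Edges (gene1, gene2, r) for gene pairs with Pearson r >= threshold.

    expression maps a gene id to its expression values across the same arrays.
    """
    edges = []
    for g1, g2 in combinations(sorted(expression), 2):
        r = pearson(expression[g1], expression[g2])
        if r >= threshold:
            edges.append((g1, g2, r))
    return edges


def degrees(edges):
    """Number of coexpression partners per gene, e.g. to inspect the degree distribution."""
    deg = {}
    for g1, g2, _ in edges:
        deg[g1] = deg.get(g1, 0) + 1
        deg[g2] = deg.get(g2, 0) + 1
    return deg
```

The all-pairs loop is quadratic in the number of genes, which is manageable for a few thousand genes but would call for a vectorised implementation at genome scale.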
The present study proposes a method to predict the conformation of protein complexes by using statistically significant domain–domain interactions (DDIs). High-throughput methods for detecting protein interactions generate a significant number of false positives; in particular, the combined method of protein-complex purification and mass spectrometry detects both direct and indirect interactions, i.e. “bait–prey” and “prey–prey” interactions, making it difficult to predict the conformation of complexes. Therefore, in this work we utilized the DDIs as a means to support the interactions and subsequently to predict the conformation of complexes. As the first step, we extracted 312 statistically significant DDIs out of 1,162 DDIs underlying 3,118 protein–protein interactions (PPIs) of Arabidopsis thaliana by using Fisher's exact test. Next, 67 protein complexes were obtained by applying a graph clustering algorithm to the PPI network. Finally, we discussed the conformation of protein complexes based on the DDI information extracted in the first step. Information on significant DDIs can also be utilized to annotate proteins of unknown function and to predict the localization of proteins with confidence.
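Fisher's exact test, used above to select significant DDIs, operates on a 2×2 contingency table. The sketch below computes the one-sided (over-representation) p-value from the hypergeometric distribution. How the authors constructed their table is not stated in the abstract; the layout assumed here (interacting versus non-interacting protein pairs, with versus without a given domain pair) and the function name are assumptions for illustration.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Assumed layout (illustrative):
        a: interacting protein pairs that contain the domain pair
        b: interacting protein pairs that do not
        c: non-interacting protein pairs that contain the domain pair
        d: non-interacting protein pairs that do not
    Returns the probability of seeing a or more in the top-left cell when
    row and column totals are fixed (hypergeometric upper tail).
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)
    k_max = min(row1, col1)
    tail = sum(comb(row1, k) * comb(row2, col1 - k) for k in range(a, k_max + 1))
    return tail / total_tables
```

With many domain pairs tested, the resulting p-values would normally be corrected for multiple testing before a significance cutoff is applied.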
The soybean genome has been decoded independently in the United States, China, South Korea, and Japan using varieties indigenous to those regions. Using an EST library of the variety indigenous to the United States, a DNA microarray technique was developed, and gene expression data from microarray experiments have been accumulated in public databases such as the Gene Expression Omnibus database (GEO). Such data can be used for genome-wide bioinformatics approaches to co-expression analysis for predicting gene function. To extract co-expressed genes in soybean, we analyzed soybean gene expression data based on DNA microarray datasets obtained from GEO and developed a database for soybean gene co-expression analysis. Using the database, we illustrate the steps to retrieve co-expressed genes that may be involved in isoflavonoid biosynthesis and secondary cell wall biosynthesis. Our database is useful for extracting co-expressed genes, which may be involved in a particular biological process, and then for understanding the mechanisms of such processes in legumes.
Proper functional classification and statistical assessment of a set of genes are very important for comparing gene compositions among the genomes of different plant species, as well as for post-genomic research such as the assessment of tissue and cell conditions based on gene expression and metabolite accumulation profiles in transcriptomics and metabolomics. We therefore defined five-level categories for Arabidopsis thaliana genes by surveying approximately 3,000 references and classified 14,525 of 27,677 genes into different categories. Based on the classification information accumulated from various sources, we have developed a software tool called ‘Arabidopsis Gene Classifier’. By using this classifier system, we performed comparative genomic analyses of the genome sequences of five plants, Chlamydomonas reinhardtii, Physcomitrella patens, Selaginella moellendorffii, A. thaliana and Oryza sativa, and extracted statistically significant differences in their gene compositions concerning metabolic pathways.
We present a new database called MetalMine that contains the classification of metal-binding sites derived from the structures of protein-metal-ion complexes. Metal-binding sites were automatically extracted from Protein Data Bank (PDB) structures, classified based on the protein domains in which the metal-binding sites are incorporated, and then manually curated. Tentative or artificial metal ion coordinations were excluded during this curation process. On the web pages of the database, the following information about metal-binding sites is presented in a hierarchical manner: the kind of metal ion, metal-binding site typically specified by the name of the protein, and each instance of metal-binding coordinates in the PDB structures. The database search engine currently supports the following two types of queries. First, given the PDB code of a protein–metal–ion complex, it provides a list of metal-binding sites incorporated in the structure file. Second, given an amino acid sequence as a query, it looks for matches with metal-binding residues in the Basic Local Alignment Search Tool (BLAST) sequence alignment. As of October 2009, MetalMine contained 412 classified entries of functional metal-binding sites, which we believe is the largest number of entries in databases available to the public.
Aromatic amino acids function as building blocks of proteins and as precursors for secondary metabolism. To obtain plants that accumulate tryptophan (Trp) and phenylalanine (Phe), we modified the biosynthetic pathways for these amino acids in rice and dicot species. By introducing a gene encoding a feedback-insensitive anthranilate synthase (AS) alpha subunit, we successfully obtained transgenic plants that over-accumulated Trp. In addition, we found mutant calli that accumulated Phe and Trp at high concentrations. The causal gene (mtr1-D) encoded an arogenate dehydratase (ADT)/prephenate dehydratase (PDT) that catalyzes the final reaction in Phe biosynthesis. The wild-type enzyme was sensitive to feedback inhibition by Phe, but the mutant enzyme encoded by mtr1-D was relatively insensitive. Further, detailed analysis of downstream secondary metabolism from Trp in rice revealed that the Trp pathway, by producing serotonin, is involved in the defense response against pathogenic infection. Based on these findings we propose that the reactions catalyzed by AS and ADT are critical regulatory points in the biosynthesis of Trp and Phe, respectively. In addition, detailed characterization of transgenic lines that accumulate these aromatic amino acids provided new insights into the regulation of downstream secondary metabolism, translocation of aromatic amino acids, and effects of accumulation of aromatic amino acids on various agronomic traits.
Colchicaceous plants, Gloriosa spp., Littonia modesta and Sandersonia aurantiaca, are cultivated as ornamentals. However, no large variations in horticultural traits are found within each genus. We examined intergeneric hybridization using 6 genotypes of Gloriosa spp., 1 genotype of L. modesta and 2 genotypes of S. aurantiaca to obtain wider variability and to develop novel cultivars in those groups. Following intergeneric cross-pollination, putative hybrid plantlets were obtained via ovule culture in various combinations. Early confirmation of the hybridity of ovule culture-derived plantlets was accomplished by flow cytometry and random amplified polymorphic DNA analyses. Several intergeneric hybrids have so far produced flowers and been subjected to morphological characterization. All the hybrids examined, i.e. L. modesta×S. aurantiaca, L. modesta×S. aurantiaca ‘Phoenix’, L. modesta×G. superba ‘Lutea’, L. modesta×G. ‘Marron Gold’, S. aurantiaca×G. superba ‘Lutea’, S. aurantiaca×G. ‘Marron Gold’ and S. aurantiaca ‘Phoenix’×G. ‘Marron Gold’, had novel morphological characteristics compared with their parents, some of which were horticulturally attractive. The results obtained in our series of studies indicate the validity of intergeneric hybridization in the improvement programs of colchicaceous ornamentals. We are now working to develop a rapid and efficient micropropagation system and to restore fertility by artificial chromosome doubling of the intergeneric hybrids produced in our series of experiments.
Soybean somatic embryos have attracted attention both as a model of zygotic embryos and as explants for the generation of stable transgenic plants. We have now characterized the maturation of soybean somatic embryos in detail by examining both the accumulation of the major seed storage proteins β-conglycinin and glycinin and changes in cellular organization. Protein storage vacuoles and oil bodies, which are the main depositories of seed storage reserves, formed within cells during the maturation of somatic embryos. The seed storage proteins were gradually synthesized and accumulated in the protein storage vacuoles in a manner similar to that apparent in seeds. The α and α′ subunits of β-conglycinin were detected earlier than the β subunit of this protein and glycinin. In addition, the α and α′ subunits of β-conglycinin accumulated in both the cotyledons and the hypocotyl of somatic embryos, whereas the β subunit of β-conglycinin and glycinin accumulated only in the cotyledons. These temporal and spatial characteristics of storage protein production in maturing somatic embryos are similar to those in developing seeds, although the maturation of somatic embryos ceases prematurely without attainment of the final stages of development. Our findings suggest that somatic embryos are suitable for verification of seed-specific traits such as the biosynthesis of seed components.
Plant hormones are known to play important roles in the maintenance of internal conditions under various environmental stresses. Recent studies revealed that there is significant cross-talk between abiotic and biotic stress responses. To understand such complex mechanisms, comprehensive analyses at multiple levels are required. In this study, to examine the dynamic interactions between plant hormone responses, we analyzed the metabolic movements of Arabidopsis thaliana cultured cells during hormone treatments by NMR metabolic profiling. First, we verified the effect of plant hormone treatments on intracellular metabolites and found that the abscisic acid (ABA), salicylic acid (SA), auxin, and brassinosteroid treatments caused metabolic changes. Second, since SA and ABA act antagonistically against each other, we monitored dynamic metabolic movements during combined ABA and SA treatments. The response to ABA-only treatment suggested that sugars and amino acids significantly increased. Although SA alone caused fewer metabolic changes, SA caused remarkable metabolic changes when applied in combination with ABA. In addition, our NMR data implied that salicylate glucoside (SAG), a major metabolite converted from SA, significantly increased in the SA-only treatment but decreased with ABA in a dose-dependent manner. These results suggest that ABA and SA cross-talk at the metabolite level in a complicated manner and that the combination of various conditions will provide us with a holistic view of plant stress response mechanisms.
The effects of salts on cell proliferation in suspension cultures of Avicennia alba and on callus induction from the leaf-tissue of A. marina were investigated using a small-scale liquid culture method. The effects of the seawater components, NaCl, KCl, CaCl2, MgCl2, and MgSO4 were examined separately. In both Avicennia species, the cell growth was increased in the presence of a low concentration, 10 mM, of MgCl2. Even in the presence of 100 mM NaCl, growth was stimulated in A. marina and there was no inhibition of growth in A. alba. CaCl2 was the most inhibitory and completely inhibited growth at 100 mM in both species. Similarities and differences in the effects of sea salts among Avicennia species and Sonneratia alba and Bruguiera sexangula of different mangrove families are discussed. This is the first report on establishing cell suspension culture of the mangrove plant A. alba belonging to the Avicenniaceae.
We previously bred fragrant cyclamen cultivars by interspecific hybridization between cultivars of Cyclamen persicum and the wild species Cyclamen purpurascens. One of these fragrant cultivars, Kaori-no-mai, bears purple flowers containing malvidin 3,5-diglucoside as the major anthocyanin. Here, we irradiated etiolated petioles of Kaori-no-mai with a 320-MeV carbon-ion beam at 0–16 Gy to increase flower color variation by mutation. Some of the M2 plants derived from self-pollination of M1 plants irradiated at 2 Gy were flower-color mutants that retained desirable flower shape, flower size, and leaf color. One of the mutants bore novel red-purple flowers, the major anthocyanin of which was delphinidin 3,5-diglucoside. Loss of methylation activity at the anthocyanin 3′- and 5′-hydroxyl groups with little influence on anthocyanin concentration was attributed to the mutation. Because the major anthocyanins in flowers of Cyclamen spp. were previously restricted to malvidin, peonidin, and cyanidin types, the generation of a cyclamen containing mostly the delphinidin-type anthocyanin is an important breakthrough in cyclamen breeding. We expect this mutant to become not only a commercial cultivar itself, but also a valuable genetic resource for cyclamen breeding.