Proceedings of the Symposium on Chemoinformatics
30th Symposium on Chemical Information and Computer Sciences, Kyoto

Special Session (Metabolomics and Information Chemistry)
A Batch-learning Self-organizing Map for Functional Classification of Proteins in Sequence Databases
Takashi Abe, Shigehiko Kanaya*, Toshimichi Ikemura

Pages JL02

Abstract
Homology searches have been widely used to predict the functions of genes and proteins in newly sequenced genomes. As genome sequences from a wide range of species have been decoded, a large number of proteins whose functions cannot be estimated by amino acid sequence homology searches have progressively accumulated and remain unused in science and industry. A method for estimating protein function that does not depend on sequence homology searches is urgently needed. We previously developed a Batch-Learning SOM (BL-SOM) for genome informatics, whose results do not depend on the order of data input. The present study focused on BL-SOM analyses of the frequencies of di- to tetrapeptides (two to four contiguous amino acids). For the oligopeptide frequencies in 110,000 proteins that have been classified into 2,853 function-known COGs (clusters of orthologous groups of proteins), we obtained BL-SOMs that faithfully reproduced the COG classifications. This indicates that proteins whose functions are presently unknown because they lack significant homology with function-known proteins can be related to function-known proteins with the BL-SOM.
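
The two ideas in the abstract, oligopeptide frequency vectors and an update that does not depend on input order, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 20-letter alphabet, the square neighborhood, and the choice to replace each node by the plain mean of its assigned inputs are all assumptions made for illustration, and the published BL-SOM may differ in initialization, neighborhood, and learning-rate schedule.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumed standard 20-residue alphabet


def oligopeptide_frequencies(sequence, k=2):
    """Relative frequencies of all 20**k contiguous k-residue peptides."""
    index = {"".join(p): i for i, p in enumerate(product(AMINO_ACIDS, repeat=k))}
    counts = [0] * len(index)
    total = 0
    for i in range(len(sequence) - k + 1):
        j = index.get(sequence[i:i + k])
        if j is not None:  # skip windows containing non-standard residues
            counts[j] += 1
            total += 1
    return [c / total for c in counts] if total else [0.0] * len(counts)


def batch_som_epoch(weights, data, grid, radius):
    """One batch epoch (hypothetical simplification).

    Every input is first assigned to its best-matching node using the
    weights from the start of the epoch; only afterwards is each node
    replaced by the mean of the inputs assigned within `radius` of it
    on the grid. Because weights are not changed while inputs are being
    assigned, shuffling `data` does not change the result (up to
    floating-point rounding).
    """
    dim = len(data[0])
    sums = [[0.0] * dim for _ in weights]
    hits = [0] * len(weights)
    for x in data:
        bmu = min(
            range(len(weights)),
            key=lambda n: sum((w - v) ** 2 for w, v in zip(weights[n], x)),
        )
        bx, by = grid[bmu]
        for n, (gx, gy) in enumerate(grid):
            if abs(gx - bx) <= radius and abs(gy - by) <= radius:
                hits[n] += 1
                for d in range(dim):
                    sums[n][d] += x[d]
    # Nodes that received no inputs keep their previous weights.
    return [
        [s / hits[n] for s in sums[n]] if hits[n] else list(weights[n])
        for n in range(len(weights))
    ]
```

In this sketch a protein would be represented by `oligopeptide_frequencies(seq, k)` for `k` from 2 to 4, and those vectors would be passed as `data` to repeated calls of `batch_som_epoch` with a shrinking `radius`.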
© 2007 The Chemical Society of Japan