Genome Informatics
Online ISSN : 2185-842X
Print ISSN : 0919-9454
ISSN-L : 0919-9454
Knowledge Acquisition from Amino Acid Sequences by Decision Trees and Indexing
Satoru Miyano, Ayumi Shinohara, Setsuo Arikawa, Shinichi Shimozono, Takeshi Shinohara, Satoru Kuhara

1992, Volume 3, pp. 69-72

Abstract
We present a machine learning system for knowledge acquisition that produces hypotheses from positive and negative examples, and report experiments on protein data from the PIR and GenBank databases. The system is built on an algorithmic learning theory for decision trees over regular patterns, which we devised for this research. In experiments on transmembrane domain identification, the system discovered very simple hypotheses with very high accuracy from a small number of positive and negative examples. These hypotheses show that negative motifs, that is, motifs of the negative data, play a key role in this classification. In these experiments, we classified the 20 amino acid residue symbols into 3 categories according to the hydropathy indices of Kyte and Doolittle. We call such a transformation of symbols an indexing. We observed that indexing by hydropathy is important in making the learning algorithm efficient and accurate. This observation motivated us to discover such an indexing itself by a learning algorithm alone. We achieved this by combining the above learning algorithm with a local search technique for finding good indexings. We also report experiments on signal peptides.
We have implemented this learning system, called BONSAI, which will be presented at the Computer Demonstration Session during this workshop.
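
The indexing step and a decision tree over patterns can be illustrated with a minimal Python sketch. It is not the BONSAI implementation. The hydropathy values are the published Kyte-Doolittle indices, but the two cutoffs, the category symbols `*`, `+`, `-`, the function names, and the toy tree are assumptions made for illustration only. Pattern matching is simplified to checking that constant segments occur in order with arbitrary gaps, which may differ from the definition of regular patterns used in the paper.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acid residues.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def index_residue(residue, low=-3.2, high=1.8):
    """Map one residue to one of three categories.

    The cutoffs `low` and `high` are illustrative assumptions,
    not values taken from the abstract.
    """
    h = KYTE_DOOLITTLE[residue]
    if h >= high:
        return "*"  # hydrophobic
    if h <= low:
        return "-"  # hydrophilic
    return "+"      # intermediate


def index_sequence(sequence, low=-3.2, high=1.8):
    """Apply the indexing to every residue of an amino acid sequence."""
    return "".join(index_residue(r, low, high) for r in sequence.upper())


def contains_in_order(text, segments):
    """True if the constant segments occur in `text` from left to right
    without overlapping, with arbitrary (possibly empty) gaps between them."""
    pos = 0
    for segment in segments:
        found = text.find(segment, pos)
        if found < 0:
            return False
        pos = found + len(segment)
    return True


def classify(tree, indexed):
    """Walk a decision tree whose internal nodes are
    (segments, subtree_if_match, subtree_if_no_match) and whose leaves are bools."""
    while not isinstance(tree, bool):
        segments, if_match, if_no_match = tree
        tree = if_match if contains_in_order(indexed, segments) else if_no_match
    return tree


# Hypothetical toy tree: reject a sequence as soon as two hydrophilic
# residues appear anywhere in it. This is NOT a hypothesis from the paper.
TOY_TREE = (("-", "-"), False, True)

if __name__ == "__main__":
    for seq in ("AILVGAL", "AKLDV"):
        indexed = index_sequence(seq)
        print(seq, indexed, classify(TOY_TREE, indexed))
```

The toy tree mirrors the idea of a negative motif: the decision is driven by a pattern that matches the negative examples, and everything that fails to match is accepted. Searching for a good indexing, as described in the abstract, would amount to varying the residue-to-category mapping and re-evaluating the learned tree; that step is not shown here.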
© Japanese Society for Bioinformatics