Genome Informatics
Online ISSN : 2185-842X
Print ISSN : 0919-9454
ISSN-L : 0919-9454
Protein Motif Extraction Using Hidden Markov Model
藤原 由希子, 小長谷 明彦
Journal, free access

1993, Volume 4, pp. 56-64

Abstract

In this paper, we study the application of hidden Markov models (HMMs) to the problem of representing protein sequences by a stochastic motif. A stochastic (protein) motif represents the portions of protein sequences that have a certain function or structure, with conditional probabilities used to capture the stochastic nature of the motif. We propose the iterative duplication method for learning the HMM network. HMMs are much more expressive than symbolic patterns and are better suited to representing the variety of protein sequences. As an experiment, we constructed HMMs for the leucine zipper motif using 112 protein sequences as a training set and obtained an accuracy of 79.3 percent in the prediction of protein sequences, compared with an accuracy of 14.8 percent when using a symbolic representation. Our approach can also be used to validate protein databases: the automatically constructed HMM indicated that one protein sequence annotated as “leucine-zipper like sequence” in the database is quite different from other leucine-zipper sequences in terms of likelihood.
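
The likelihood comparison mentioned at the end of the abstract can be illustrated with a minimal sketch of the standard forward algorithm, which computes log P(sequence | HMM). This is not the authors' network or their iterative duplication method: the two states (`zip`, `bg`), the two-letter alphabet (`L`, `A`), and every probability below are hypothetical values chosen only to make the example runnable.

```python
import math

NEG_INF = float("-inf")

def _log(p):
    # log of a probability, with log(0) mapped to -inf
    return math.log(p) if p > 0 else NEG_INF

def _log_sum_exp(values):
    # numerically stable log(sum(exp(v))) over a list of log-values
    m = max(values)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))

def forward_log_likelihood(seq, start, trans, emit):
    """Return log P(seq | HMM) via the forward algorithm in log space."""
    states = list(start)
    # initialisation: start probability times emission of the first symbol
    alpha = {s: _log(start[s]) + _log(emit[s].get(seq[0], 0.0)) for s in states}
    # recursion: sum over predecessor states, then emit the next symbol
    for symbol in seq[1:]:
        alpha = {
            t: _log_sum_exp([alpha[s] + _log(trans[s].get(t, 0.0)) for s in states])
            + _log(emit[t].get(symbol, 0.0))
            for t in states
        }
    # termination: sum over all states at the final position
    return _log_sum_exp(list(alpha.values()))

# Hypothetical toy model (assumption, not taken from the paper).
START = {"zip": 0.5, "bg": 0.5}
TRANS = {"zip": {"zip": 0.9, "bg": 0.1}, "bg": {"zip": 0.1, "bg": 0.9}}
EMIT = {"zip": {"L": 0.7, "A": 0.3}, "bg": {"L": 0.1, "A": 0.9}}

if __name__ == "__main__":
    for seq in ("LLALLL", "AAAAAL"):
        # per-residue log-likelihood makes sequences of different length comparable
        score = forward_log_likelihood(seq, START, TRANS, EMIT) / len(seq)
        print(seq, round(score, 3))
```

In a database-validation setting of the kind the abstract describes, a sequence whose score falls far below those of the other annotated sequences would be flagged for inspection; how the threshold is chosen is left open here.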

© Japanese Society for Bioinformatics