2005 Volume 6 Pages 83-89
We propose a new method for predicting protein function, especially enzyme activity, based on the statistical characteristics of oligopeptides. The known function of a protein is assumed to be inherited by its oligopeptides, and the correspondence between oligopeptides and the function is computed across all proteins. Our method then uses this correspondence to predict unknown protein functions automatically. We measured prediction performance for several enzymes in terms of recall, precision, and maximum F-measure, using all 28,520 human proteins registered in RefSeq. This paper reports the prediction of a specific enzyme, protein-tyrosine kinase (EC 2.7.1.112), and of a large class of enzymes, the transferases (EC 2.-.-.-). The former attains a maximum F-measure of 0.932 and the latter 0.786. These results suggest that the proposed method is effective in predicting enzyme activity.
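The general idea can be illustrated with a minimal sketch. It is not the implementation evaluated in this paper: the oligopeptide length `k=3`, the per-oligopeptide statistic (the fraction of proteins containing the oligopeptide that have the function), the averaging rule in `score`, and all function names are assumptions made for illustration only. The last function computes the maximum F-measure by sweeping a decision threshold over the predicted scores.

```python
from collections import Counter


def oligopeptides(seq, k=3):
    """Return all overlapping length-k substrings of a protein sequence."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def train(proteins, k=3):
    """Map each oligopeptide to the fraction of proteins containing it
    that are annotated with the function (an assumed statistic).

    proteins: iterable of (sequence, has_function) pairs.
    """
    total = Counter()
    positive = Counter()
    for seq, has_function in proteins:
        # Count each oligopeptide once per protein.
        for pep in set(oligopeptides(seq, k)):
            total[pep] += 1
            if has_function:
                positive[pep] += 1
    return {pep: positive[pep] / total[pep] for pep in total}


def score(seq, table, k=3):
    """Average the table values of the known oligopeptides in seq
    (an assumed scoring rule); return 0.0 if none are known."""
    known = [table[p] for p in oligopeptides(seq, k) if p in table]
    return sum(known) / len(known) if known else 0.0


def max_f_measure(scores, labels):
    """Return the best F-measure over all score thresholds."""
    best = 0.0
    for threshold in sorted(set(scores)):
        predicted = [s >= threshold for s in scores]
        tp = sum(p and l for p, l in zip(predicted, labels))
        fp = sum(p and not l for p, l in zip(predicted, labels))
        fn = sum((not p) and l for p, l in zip(predicted, labels))
        if tp == 0:
            continue  # precision and recall are both zero here
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        best = max(best, 2 * precision * recall / (precision + recall))
    return best
```

In an evaluation of this kind, each protein would be scored with a table built from the other proteins, and `max_f_measure` would then be applied to the resulting scores and the known annotations.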