Abstract
Structural genomics projects are beginning to produce protein structures whose functions are unknown, creating a need for high-throughput methods of function prediction. Although sequence-based function prediction methods have been used extensively, structure-based prediction is believed to provide higher specificity and sensitivity, because function is closely related to the three-dimensional structure of functional sites, which is more strongly conserved during evolution than sequence. We have developed FCANAL, a method that predicts function using a score matrix obtained from the distances between Cα atoms and their frequencies of appearance [1]. That previous report used key residues predicted from sequence comparisons (motifs). In this report, we extend the method to enzymes and binding proteins whose key residues are predicted on the basis of three-dimensional structure. Using FCANAL, we constructed score matrices for 31 enzymes. When these matrices were applied to all structure entries deposited in the Protein Data Bank, FCANAL detected functional sites with high accuracy. These results suggest that FCANAL will help identify the functions of newly determined structures and pinpoint their functionally important regions.