2002 Volume 13 Pages 163-172
Sequence divergence through evolution prevents homology methods based on sequence alignment, when used for protein folding prediction, from detecting structural and folding similarities between distantly related proteins. Homolog coverage of current databases also plays a critical role in the performance of these methods. This limitation is most apparent in the so-called twilight zone of sequence homology, where proteins with a high degree of similarity in both biological function and structure are found but amino acid sequence homology ranges from about 20% to less than 30%. In contrast to these methods, a strategy is proposed here that is based on a different concept of sequence homology, derived from a periodicity analysis of the physicochemical properties of the residues that make up a protein's primary structure. The analysis uses a front-end processing technique from automatic speech recognition: the cepstrum (a measure of the periodic wiggliness of a frequency response) is computed, and from it a spectral envelope is obtained that depicts the subtle periodicity in the physicochemical characteristics of the sequence. Sequence homology is then derived by aligning spectral envelopes. Proteins that share common folding patterns and biological function but have low sequence homology can thus be detected by their similarity in the spectral dimension. Applied to protein folding recognition, the methodology in many cases outscores other methodologies in the twilight zone.
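
The cepstral step described above can be illustrated with a minimal sketch. It is not the authors' implementation, and every specific in it is an assumption made for illustration: the Kyte-Doolittle hydropathy scale as the physicochemical property, the transform length `n_fft`, the lifter cutoff `n_lifter`, and a plain Pearson correlation standing in for the alignment of spectral envelopes. The function names `spectral_envelope` and `envelope_similarity` are hypothetical, and a naive DFT is used only to keep the example free of external dependencies.

```python
import cmath
import math

# Kyte-Doolittle hydropathy values: an assumed choice of physicochemical
# property; the abstract does not say which properties are used.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (inverse includes the 1/n factor)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        out.append(s / n if inverse else s)
    return out


def spectral_envelope(seq, n_fft=64, n_lifter=8):
    """Cepstrally smoothed log-magnitude spectrum of a residue property profile."""
    # Map residues to property values and remove the mean (DC) component.
    signal = [KD[r] for r in seq]
    mean = sum(signal) / len(signal)
    signal = [v - mean for v in signal]
    # Zero-pad (or truncate) to a fixed length so envelopes are comparable.
    padded = (signal + [0.0] * n_fft)[:n_fft]
    # Real cepstrum: inverse DFT of the log-magnitude spectrum.
    log_mag = [math.log(abs(c) + 1e-9) for c in dft(padded)]
    cepstrum = [c.real for c in dft(log_mag, inverse=True)]
    # Liftering: keep only low-quefrency coefficients (and their mirror images).
    liftered = [
        c if (q < n_lifter or q > n_fft - n_lifter) else 0.0
        for q, c in enumerate(cepstrum)
    ]
    # Transforming back gives the smooth spectral envelope; keep the
    # non-redundant half of the spectrum.
    envelope = [c.real for c in dft(liftered)]
    return envelope[: n_fft // 2 + 1]


def envelope_similarity(a, b):
    """Pearson correlation of two envelopes (a stand-in for envelope alignment)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = math.sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))
    return num / den if den else 0.0


# Two made-up peptide fragments, used only to exercise the functions.
env_a = spectral_envelope("MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ")
env_b = spectral_envelope("MKVLAAGIVGLCAKNLQDSGHEVTLLARDP")
print(round(envelope_similarity(env_a, env_b), 3))
```

Comparing envelopes rather than residues means that two sequences with few identical positions can still score as similar if their property profiles share the same underlying periodicities, which is the idea the abstract relies on for the twilight zone.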