Abstract
More than 6,000 amino acid sequence attributes were ranked by their conditional probabilities for indicating ordered or disordered protein structure. The top 10 attributes from each of several groups were merged with additional attributes and then subjected to selection by logistic regression. The results indicate that order or disorder is determined by the interplay among several attributes: average Coordination Number, aromatic content, and the number of non-polar amino acids all favor the ordered state, whereas Net Charge, Flexibility Index, and the presence of certain polar amino acids all favor disorder. The top 12 selected attributes were used as inputs for artificial neural network (ANN) predictors. Five predictors were developed and compared with each other and with previous work. The best of these shows substantially improved generalization compared to our previously published predictor.
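
As an illustration of the kind of sequence attribute discussed above, the following minimal sketch computes two of them for a stretch of sequence. The definitions are assumptions made for this example only: aromatic content is taken as the fraction of F, W, and Y residues, and net charge as the count of K and R minus the count of D and E. The function names are hypothetical and the exact definitions used in this work may differ.

```python
# Illustrative only: attribute definitions below are assumed, not taken from the paper.
AROMATIC = set("FWY")   # assumed aromatic residues (histidine excluded)
POSITIVE = set("KR")    # assumed positively charged residues
NEGATIVE = set("DE")    # assumed negatively charged residues


def aromatic_content(segment: str) -> float:
    """Fraction of residues in the segment that are aromatic (F, W, Y)."""
    if not segment:
        raise ValueError("segment must not be empty")
    return sum(res in AROMATIC for res in segment) / len(segment)


def net_charge(segment: str) -> int:
    """Count of K and R residues minus count of D and E residues."""
    positives = sum(res in POSITIVE for res in segment)
    negatives = sum(res in NEGATIVE for res in segment)
    return positives - negatives
```

Under these assumed definitions, a segment rich in F, W, or Y scores high on the attribute associated with order, while a segment with many like-charged residues scores high in magnitude on the attribute associated with disorder.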