Abstract
Intrinsically disordered regions (IDRs) can be predicted by computer programs. In this work, we investigated which factors provide the basis for predicting IDRs. We conducted a random forest analysis to quantify the contribution of each amino acid residue type to the predictions. The results suggested that the contribution of proline is remarkably larger than that of the other residues. Next, we analyzed the distribution of proline residues around the boundaries between IDRs and structural domains (SDs), and found that proline residues are notably overrepresented on the SD side of the boundaries. These results can contribute to the development of more accurate prediction programs and to a better understanding of the structural nature of intrinsically disordered proteins.
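As an illustration of the boundary analysis described above, the following is a minimal sketch, not the procedure used in this work. It assumes each protein is given as a sequence string plus a per-residue label string of the same length, with 'D' marking IDR residues and 'S' marking SD residues. The function names, the label encoding, the window size, and the toy sequences are all hypothetical. Offsets are oriented so that positive values lie on the SD side of a boundary and negative values on the IDR side.

```python
from collections import defaultdict


def boundary_positions(labels):
    """Yield (index, direction) for every IDR/SD boundary in a label string.

    index is the first residue to the right of the boundary;
    direction is +1 if the SD lies to the right, -1 if it lies to the left.
    """
    for i in range(1, len(labels)):
        if labels[i - 1] != labels[i]:
            yield i, (1 if labels[i] == "S" else -1)


def residue_profile(records, window=5, residue="P"):
    """Fraction of positions occupied by `residue` at each boundary offset.

    records: iterable of (sequence, labels) pairs (assumed input format).
    Returns {offset: fraction}; offset > 0 is the SD side, < 0 the IDR side.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for seq, labels in records:
        for i, direction in boundary_positions(labels):
            for k in range(1, window + 1):
                right, left = i + k - 1, i - k
                sd_idx, idr_idx = (right, left) if direction == 1 else (left, right)
                for offset, idx, side in ((k, sd_idx, "S"), (-k, idr_idx, "D")):
                    # Skip positions outside the protein or past a neighbouring boundary.
                    if 0 <= idx < len(seq) and labels[idx] == side:
                        totals[offset] += 1
                        hits[offset] += seq[idx] == residue
    return {off: hits[off] / totals[off] for off in sorted(totals)}


# Toy input, invented purely for illustration.
toy_records = [
    ("GSGSPPAL", "DDDDSSSS"),  # IDR followed by SD
    ("LAPPSGSG", "SSSSDDDD"),  # SD followed by IDR
]
profile = residue_profile(toy_records, window=2)
for offset, fraction in profile.items():
    print(offset, fraction)
```

Comparing the fractions at positive and negative offsets against the background proline frequency would indicate on which side of the boundary, if either, proline is enriched.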