Organizer: The Japanese Society of Toxicology
Conference: The 51st Annual Meeting of the Japanese Society of Toxicology
Dates: 2024/07/03 - 2024/07/05
The cytochrome P450 (CYP) superfamily metabolizes a wide range of compounds, and drug-induced CYP inhibition can cause adverse drug-drug interactions. Identifying potential CYP inhibitors is therefore crucial for safe drug administration. This study explored multitask deep learning with graph convolutional networks (GCN) to predict CYP inhibition when data are limited. Public databases provided data on 12,654 compounds for seven CYP isoforms, including two small datasets for CYP2B6 and CYP2C8 (481 and 724 compounds, respectively). A baseline model that classifies compounds as inhibitors or non-inhibitors was built with kMoL, but the small size and imbalance of the CYP2B6 and CYP2C8 datasets limited its prediction performance for these two isoforms. Multitask and fine-tuning models were therefore implemented to improve predictions for CYP2B6 and CYP2C8. They produced modest improvements, but the differences were not statistically significant. In addition, missing data exceeding 50% degraded the performance of the multitask model. Imputing the missing data using predictions from both single-task and multitask models led to significant improvements in F1 and Kappa values for the limited datasets. Notably, a multitask model combined with imputation from the multitask model outperformed all other approaches. This study demonstrated that multitask deep learning, particularly with imputation of missing values, can effectively improve the prediction performance of models on small datasets.
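The imputation step and the two reported metrics can be illustrated with a minimal sketch in plain Python. It is not the authors' pipeline and does not use the kMoL API. The function names, the use of `None` for a missing label, and the 0.5 decision threshold are all assumptions made for illustration; the abstract does not state how predictions were converted into imputed labels.

```python
from typing import List, Optional, Sequence

Label = Optional[int]  # 1 = inhibitor, 0 = non-inhibitor, None = not measured


def missing_fraction(labels: Sequence[Sequence[Label]]) -> List[float]:
    """Fraction of compounds with no label, per task (one column per CYP isoform)."""
    n_rows = len(labels)
    n_tasks = len(labels[0])
    return [sum(row[t] is None for row in labels) / n_rows for t in range(n_tasks)]


def impute_missing_labels(
    labels: Sequence[Sequence[Label]],
    predicted_probs: Sequence[Sequence[float]],
    threshold: float = 0.5,  # assumed cutoff, not taken from the abstract
) -> List[List[int]]:
    """Keep measured labels; fill missing ones with thresholded model predictions."""
    imputed = []
    for row_labels, row_probs in zip(labels, predicted_probs):
        imputed.append([
            (1 if p >= threshold else 0) if y is None else y
            for y, p in zip(row_labels, row_probs)
        ])
    return imputed


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 for the positive (inhibitor) class."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def cohen_kappa(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Cohen's kappa for binary labels: agreement corrected for chance."""
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_pos_rate = sum(y_true) / n
    pred_pos_rate = sum(y_pred) / n
    expected = true_pos_rate * pred_pos_rate + (1 - true_pos_rate) * (1 - pred_pos_rate)
    return (observed - expected) / (1 - expected) if expected < 1 else 0.0


if __name__ == "__main__":
    # Toy data: 3 compounds x 2 isoforms, with hypothetical model probabilities.
    toy_labels = [[1, None], [0, 1], [None, 0]]
    toy_probs = [[0.9, 0.2], [0.1, 0.8], [0.7, 0.4]]
    print(missing_fraction(toy_labels))
    print(impute_missing_labels(toy_labels, toy_probs))
```

In this sketch the imputed matrix would serve as the training target for a subsequent multitask model, while F1 and Kappa would be computed only on held-out compounds with measured labels, so that imputed values never count as ground truth.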