Host: The Japanese Society for Artificial Intelligence
Name : The 38th Annual Conference of the Japanese Society for Artificial Intelligence
Number : 38
Location : [in Japanese]
Date : May 28, 2024 - May 31, 2024
In pharmaceutical development, protein language models (pLMs) and reinforcement learning (RL) have become essential techniques for designing desired protein sequences. In this paper, we investigate the effect of the loss function used in reward model training, since reward models are central to obtaining protein sequences with better performance. Two typical loss functions, mean squared error and ranking loss, are used to train the reward models. Numerical experiments show no significant difference between them when the reward models are evaluated on their own. However, the choice of loss function does affect the pLMs after RL is performed. The ranking loss tends to give better performance and to preserve the distribution of the pLMs during RL, resulting in desired protein sequences with better performance.
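
The contrast between the two loss functions can be illustrated with a minimal sketch. The abstract does not state which ranking loss is used, so the pairwise logistic form below is an assumption, and the function names and toy scores are hypothetical rather than taken from the paper. The sketch shows only that mean squared error penalizes errors in the absolute scale of the predicted reward, while a pairwise ranking loss depends only on the differences between predictions.

```python
import math
from itertools import combinations


def mse_loss(pred, target):
    """Mean squared error between predicted and measured scores."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def pairwise_ranking_loss(pred, target):
    """Pairwise logistic ranking loss (an assumed form, not the paper's).

    For every pair of sequences with different measured scores, penalize
    the model when the better sequence is not predicted higher.
    """
    total, n_pairs = 0.0, 0
    for i, j in combinations(range(len(pred)), 2):
        if target[i] == target[j]:
            continue  # ties carry no ordering information
        hi, lo = (i, j) if target[i] > target[j] else (j, i)
        margin = pred[hi] - pred[lo]
        total += math.log1p(math.exp(-margin))  # log(1 + exp(-margin))
        n_pairs += 1
    return total / n_pairs if n_pairs else 0.0


# Hypothetical measured scores for three protein sequences.
target = [0.1, 0.5, 0.9]
# Predictions with the correct order but a constant offset of 10.
shifted = [t + 10.0 for t in target]
# Predictions with the order reversed.
flipped = [-t for t in target]

print("MSE, shifted:", mse_loss(shifted, target))
print("Ranking, exact:", pairwise_ranking_loss(target, target))
print("Ranking, shifted:", pairwise_ranking_loss(shifted, target))
print("Ranking, flipped:", pairwise_ranking_loss(flipped, target))
```

In this toy case the offset predictions incur a large mean squared error but the same ranking loss as the exact predictions, whereas the reversed predictions incur a higher ranking loss. This is only an illustration of how the two objectives differ, not a reproduction of the paper's experiments.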