Proceedings of the Annual Conference of JSAI
Online ISSN : 2758-7347
38th (2024)
Session ID : 4A1-GS-6-02

Learning Methods for LLMs on Game Data Using RLHF
*Tomoya MURATA, Naoki MORI, Makoto OKADA
Keywords: LLM, Alignment, RLHF, BERT

Abstract

Recent advances in Large Language Models (LLMs) have yielded exceptional performance across a wide range of natural language processing tasks. Amid these developments, aligning the values and objectives of LLMs with human perspectives has become increasingly important, and Reinforcement Learning from Human Feedback (RLHF) has attracted notable interest as a method for such alignment. This study explored an RLHF-based learning approach for LLMs, using scenarios from the romance simulation game 'Tokimeki Memorial 3' as the game scenario data. Specifically, we conducted an experiment in which Japanese sentences were generated for five of the game's characters, tailored to each character's personality. Although the evaluation was subjective, it demonstrated that the model could produce sentences appropriately matched to each of the game's distinct characters.
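The core RLHF loop the abstract refers to can be illustrated in miniature. The sketch below is hypothetical and not the paper's implementation: it replaces the LLM with a softmax policy over a few candidate lines, and the learned reward model with a toy keyword-overlap score against an assumed character persona, then applies a simple REINFORCE-style policy-gradient update so that on-persona lines gain probability.

```python
import math
import random

# Assumed toy persona and candidate lines (token sets, for brevity).
# These stand in for the game characters and generated sentences.
PERSONA = {"cheerful", "energetic", "sports"}

CANDIDATES = [
    {"cheerful", "sports"},      # on-persona line
    {"gloomy", "quiet"},         # off-persona line
    {"energetic", "cheerful"},   # on-persona line
]

def reward(tokens):
    """Toy reward model: fraction of persona keywords the line matches."""
    return len(tokens & PERSONA) / len(PERSONA)

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def train(steps=2000, lr=0.5, seed=0):
    rng = random.Random(seed)
    logits = [0.0] * len(CANDIDATES)
    for _ in range(steps):
        probs = softmax(logits)
        # Sample a "generation" from the current policy.
        i = rng.choices(range(len(CANDIDATES)), weights=probs)[0]
        r = reward(CANDIDATES[i])
        # REINFORCE-style update: push up the log-probability of the
        # sampled line in proportion to the reward it received.
        for j in range(len(logits)):
            grad = (1.0 if j == i else 0.0) - probs[j]
            logits[j] += lr * r * grad
    return softmax(logits)

probs = train()
```

After training, probability mass concentrates on the two on-persona lines and the off-persona line is rarely sampled. Production RLHF systems replace the keyword score with a learned reward model (e.g. BERT-based, as the keywords suggest) and use a constrained policy-gradient method such as PPO, but the sample-score-update structure is the same.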

© 2024 The Japanese Society for Artificial Intelligence