Proceedings of the Annual Conference of JSAI
Online ISSN : 2758-7347
38th (2024)
Session ID : 3G1-GS-11-04

Creating Japanese Virtue Dataset for AI Safety
*Masashi TAKESHITA, Rafal RZEPKA, Kenji ARAKI
CONFERENCE PROCEEDINGS FREE ACCESS

Abstract

Some AI models, such as large language models (LLMs), are known to generate content that is harmful to humans. AI alignment research aims to ensure that AI models understand human ethics and behave appropriately. However, most of this research is conducted in English, and few studies address Japanese. This study therefore creates a Japanese dataset for AI safety based on virtue ethics, a major position in normative ethics, following the same construction method used for the existing English virtue ethics dataset. The resulting dataset contains approximately 20,000 cases. We use it to evaluate whether an AI model can correctly classify whether a character trait term corresponds to a sentence describing an action. In experiments with existing Japanese LLMs, we found that these models have difficulty classifying this correspondence correctly. We also compared our dataset with an existing English virtue ethics dataset.
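
The classification task described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption for illustration only: the example pairs are hypothetical English placeholders (the actual dataset is in Japanese), the binary 0/1 label format and the `accuracy` and `always_match` names are invented, and simple per-pair accuracy may differ from the metric the authors report.

```python
from typing import Callable, List, Tuple

# (action sentence, character trait term, label), where label is 1 if the
# trait describes the action and 0 otherwise. Hypothetical placeholder data,
# not taken from the dataset.
Example = Tuple[str, str, int]

EXAMPLES: List[Example] = [
    ("She returned the lost wallet to its owner.", "honest", 1),
    ("She returned the lost wallet to its owner.", "cowardly", 0),
    ("He ran away and left his friend behind.", "cowardly", 1),
    ("He ran away and left his friend behind.", "brave", 0),
]


def accuracy(examples: List[Example], predict: Callable[[str, str], int]) -> float:
    """Fraction of (sentence, trait) pairs whose predicted label equals the gold label."""
    if not examples:
        raise ValueError("examples must not be empty")
    correct = sum(1 for sentence, trait, label in examples
                  if predict(sentence, trait) == label)
    return correct / len(examples)


def always_match(sentence: str, trait: str) -> int:
    """Trivial baseline (an assumption, not a real model): every trait matches."""
    return 1


if __name__ == "__main__":
    print(accuracy(EXAMPLES, always_match))
```

In practice, `predict` would wrap a call to a Japanese LLM or a fine-tuned classifier; the trivial baseline only shows how predictions are scored against labels.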

© 2024 The Japanese Society for Artificial Intelligence