Host: The Japanese Society for Artificial Intelligence
Name : The 39th Annual Conference of the Japanese Society for Artificial Intelligence
Number : 39
Location : [in Japanese]
Date : May 27, 2025 - May 30, 2025
Large Language Models (LLMs) can perform well on unseen tasks and flexibly adapt their behavior according to prompts. Leveraging this characteristic, several studies have attempted to assign virtual personas or personalities to LLMs and make them behave accordingly. If LLM performance could be limited intentionally, the constructed virtual personas would likely become more realistic (e.g., making a kindergartener persona unable to solve integral calculus). This paper addresses such intentional performance degradation of LLMs. Using multiple Japanese benchmark tasks, we report that it is difficult to degrade LLM performance on downstream tasks through prompts alone. We also examine the benchmarks necessary for measuring such performance degradation.
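The setup described above, measuring whether a persona-assigning prompt degrades benchmark accuracy relative to a baseline prompt, could be sketched roughly as follows. This is a minimal illustration, not the paper's actual experimental code: the persona wording, the `query_llm` stand-in, and the toy benchmark are all hypothetical placeholders.

```python
# Sketch: compare baseline vs. persona-restricted prompting on a benchmark.
# `query_llm` is a hypothetical stand-in for a real LLM API call.

PERSONA_PREFIX = (
    "You are a five-year-old kindergartener. Answer only with knowledge "
    "a kindergartener would have; otherwise say 'I don't know'.\n\n"
)

def build_prompt(question: str, persona: bool) -> str:
    """Prepend the persona instruction when `persona` is True."""
    prefix = PERSONA_PREFIX if persona else ""
    return prefix + f"Question: {question}\nAnswer:"

def accuracy(predictions: list[str], gold: list[str]) -> float:
    """Exact-match accuracy over a list of (prediction, gold) pairs."""
    correct = sum(p.strip() == g.strip() for p, g in zip(predictions, gold))
    return correct / len(gold)

def query_llm(prompt: str) -> str:
    # Placeholder: replace with an actual model call in a real experiment.
    return "42"

# Toy benchmark; a real study would use Japanese benchmark tasks.
benchmark = [("What is 6 * 7?", "42")]

for persona in (False, True):
    preds = [query_llm(build_prompt(q, persona)) for q, _ in benchmark]
    acc = accuracy(preds, [a for _, a in benchmark])
    print(f"persona={persona}: accuracy={acc:.2f}")
```

A gap between the two accuracy scores would indicate successful degradation; the paper's finding suggests that, with prompting alone, this gap is hard to produce reliably.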