Abstract
This study evaluates the ability of large language models (LLMs) to generate agreements based on social welfare in multi-issue negotiation. In the experiment, prompts were designed that included the negotiation theme, the issues, the options within each issue, the weights of the issues, the utility values of the options, the method for calculating social welfare, and the output format. These prompts were then input into each LLM. All negotiations described in the prompts were two-party negotiations, and two types of scenarios were considered: one in which the agents’ utilities were aligned and one in which they conflicted. For evaluation, the generated agreements and their corresponding social welfare values were classified into five categories. GPT-4o-mini and Llama 3.3 70B performed strongly in scenarios with aligned interests, whereas performance differed significantly among models in scenarios with conflicting interests. These results suggest that while LLMs can handle simple negotiation scenarios, they face challenges in more complex ones. Future work includes evaluating the consistency of the generation process, improving computational accuracy, and diversifying negotiation scenarios.
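As an illustration of the kind of calculation the prompts ask a model to carry out, the following is a minimal sketch and not the study's actual setup. It assumes that each agent's utility is the weighted sum of the utilities of the chosen options, and that social welfare is the sum of both agents' utilities; the abstract does not state the exact formula. The issues, options, weights, and utility values are hypothetical and describe a small conflicting-interest scenario.

```python
from itertools import product

# Hypothetical two-issue scenario; all names and numbers are illustrative.
OPTIONS = {"price": ["low", "high"], "delivery": ["fast", "slow"]}

AGENTS = [
    {   # Agent A (e.g. a buyer): prefers a low price and fast delivery
        "weights": {"price": 6, "delivery": 4},
        "utils": {"price": {"low": 10, "high": 0},
                  "delivery": {"fast": 10, "slow": 2}},
    },
    {   # Agent B (e.g. a seller): prefers a high price and slow delivery
        "weights": {"price": 7, "delivery": 3},
        "utils": {"price": {"low": 0, "high": 10},
                  "delivery": {"fast": 4, "slow": 10}},
    },
]

def utility(agent, agreement):
    """Weighted additive utility of one agent (assumed form)."""
    return sum(agent["weights"][issue] * agent["utils"][issue][option]
               for issue, option in agreement.items())

def social_welfare(agents, agreement):
    """Assumed definition: sum of all agents' utilities."""
    return sum(utility(agent, agreement) for agent in agents)

def best_agreement(agents, options):
    """Enumerate every combination of options and keep the welfare maximizer."""
    issues = list(options)
    candidates = (dict(zip(issues, combo))
                  for combo in product(*(options[i] for i in issues)))
    return max(candidates, key=lambda ag: social_welfare(agents, ag))

if __name__ == "__main__":
    best = best_agreement(AGENTS, OPTIONS)
    print(best, social_welfare(AGENTS, best))
```

Exhaustive enumeration is feasible here because the number of issues and options is small; it gives a reference agreement against which a model-generated agreement and its reported social welfare value could be compared.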