Host : The Japanese Society for Artificial Intelligence
Name : The 38th Annual Conference of the Japanese Society for Artificial Intelligence
Number : 38
Location : [in Japanese]
Date : May 28, 2024 - May 31, 2024
For Large Language Models (LLMs) to be applied in the real world, the text they generate must be valuable to people and of a quality they find acceptable. This study aims to identify evaluation functions whose scores correlate with human evaluations of fashion coordination descriptions generated by LLMs. Identifying such functions could make it possible to improve the accuracy of fashion coordination description generation models in a direction aligned with human values, and could potentially automate the entire process from description generation to evaluation. In this research, skilled fashion stylists evaluated fashion coordination descriptions generated by LLMs, and a dataset was created from their evaluations. Using this dataset, we searched for evaluation metrics that correlate with the human evaluations. The candidate functions were those used in the abstractive summarization task.
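As an illustration of what it means for a candidate metric to correlate with human evaluations, the following is a minimal sketch and not the procedure used in this study. It assumes Spearman rank correlation as the agreement measure, which the abstract does not specify, and every name and number in it (`metric_a`, `metric_b`, the ratings and scores) is hypothetical.

```python
from statistics import mean


def rank(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(rank(x), rank(y))


# Hypothetical stylist ratings for five generated descriptions (assumed data).
human_ratings = [4, 2, 5, 3, 1]

# Hypothetical scores from two candidate automatic metrics (assumed data).
candidate_metrics = {
    "metric_a": [0.71, 0.40, 0.88, 0.52, 0.15],
    "metric_b": [0.30, 0.35, 0.20, 0.45, 0.50],
}

# Score each candidate by how well its ranking agrees with the human ranking.
correlations = {
    name: spearman(scores, human_ratings)
    for name, scores in candidate_metrics.items()
}
best_metric = max(correlations, key=correlations.get)

for name, rho in correlations.items():
    print(f"{name}: rho = {rho:.2f}")
print("highest correlation with human ratings:", best_metric)
```

In this toy data, `metric_a` orders the descriptions exactly as the stylists do, while `metric_b` orders them almost in reverse, so `metric_a` would be the more promising candidate. A rank-based coefficient is chosen here only because human ratings are often ordinal; other agreement measures would fit the same loop.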