Host: The Japanese Society for Artificial Intelligence
Name: 34th Annual Conference, 2020
Number: 34
Location: Online
Date: June 09, 2020 - June 12, 2020
When evaluating ad creative texts automatically generated by NLG systems, we often place more importance on manual evaluation by human evaluators than on automatic evaluation metrics such as ROUGE. Despite this, there is a lack of evaluation metrics dedicated to the advertising domain and of assistive tools that support best practices. In this paper, we review the metrics used for manual evaluation of NLG systems. We also give an outlook on assistive tools for evaluating automatically generated ads, covering domain-specific evaluation metrics as well as the measurement of evaluators' agreement and performance.
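
As an illustration of what measuring evaluators' agreement can involve, the following minimal sketch computes Cohen's kappa for two evaluators who each label the same set of generated ads. The abstract does not state which agreement statistic is used, so the choice of Cohen's kappa, the function name `cohen_kappa`, and the sample labels are all assumptions made for this example.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Chance-corrected agreement between two raters (hypothetical helper)."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must label the same non-empty set of items")

    n = len(ratings_a)

    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Expected agreement by chance, from each rater's own label distribution.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used one identical label for every item.
        return 1.0
    return (observed - expected) / (1 - expected)


# Made-up labels for six generated ads; not data from the paper.
rater_1 = ["good", "good", "bad", "good", "bad", "bad"]
rater_2 = ["good", "bad", "bad", "good", "bad", "good"]

kappa = cohen_kappa(rater_1, rater_2)
print(f"{kappa:.2f}")  # prints 0.33
```

Here the two evaluators agree on four of six ads (observed agreement 0.67), while their label distributions alone would yield 0.50 agreement by chance, giving a kappa of about 0.33. With more than two evaluators, a statistic such as Fleiss' kappa or Krippendorff's alpha would be a more natural choice.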