Proceedings of the 38th Annual Conference of JSAI (2024)
Online ISSN: 2758-7347
Session ID: 4Xin2-30

Execution-Based Evaluation Method for Code Documentation Generation Using Back-Translation
*Shiho TAKANO, Miyu SATO, Waka ITO, Yuka AKINOBU, Takako KAWAGUCHI, Toshiyuki KURABAYASHI, Haruto TANNO, Kimio KURAMITSU

Abstract

In software development, code documentation is crucial for understanding and maintaining software. Because manually creating and maintaining code documentation is costly, automatic generation using Large Language Models (LLMs) has attracted increasing interest. However, conventional match-based evaluation cannot account for semantic interpretation and incurs high costs for preparing reference texts. To address these issues, we propose an execution-based evaluation method using back-translation: LLM-generated code documentation is back-translated into code, and the documentation is evaluated based on the execution results of that code. This process enables assessments that take semantic interpretation, synonyms, and diversity of expression into account. In this paper, we introduce lm-chaineval-harness, an automated evaluation tool developed by our team that implements the proposed method and provides a user-friendly evaluation environment, and we report validation experiments. The experimental results qualitatively show that the proposed method enables evaluations that incorporate semantic interpretation and account for synonyms and diversity of expression.
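To make the evaluation flow concrete, the following is a minimal sketch of the back-translation loop in Python. The llm object with a complete method, the test cases given as (inputs, expected) pairs, and the assumption that the regenerated code defines a single top-level function are hypothetical placeholders for illustration; this is not the lm-chaineval-harness API.

```python
# Sketch of execution-based evaluation of code documentation via back-translation.
# All names here (llm.complete, test_cases, EvalResult) are illustrative assumptions.

from dataclasses import dataclass


@dataclass
class EvalResult:
    doc: str            # documentation generated by the LLM
    regenerated: str    # code back-translated from that documentation
    passed: int         # number of test cases the regenerated code passes
    total: int          # total number of test cases


def evaluate_documentation(source_code: str, test_cases, llm) -> EvalResult:
    # 1. Ask the LLM to document the original code.
    doc = llm.complete(f"Write documentation for this code:\n{source_code}")

    # 2. Back-translate: ask the LLM to re-implement the code from the
    #    documentation alone, without showing it the original source.
    regenerated = llm.complete(
        f"Implement code that satisfies this documentation:\n{doc}"
    )

    # 3. Execution-based scoring: the documentation is judged by whether the
    #    regenerated code reproduces the original behavior on the test cases.
    passed = 0
    for inputs, expected in test_cases:
        try:
            namespace = {}
            exec(regenerated, namespace)  # assumes one top-level function
            fn = next(v for k, v in namespace.items()
                      if callable(v) and not k.startswith("_"))
            if fn(*inputs) == expected:
                passed += 1
        except Exception:
            pass  # compilation or runtime errors count as failures

    return EvalResult(doc, regenerated, passed, len(test_cases))
```

In this sketch, documentation quality is scored indirectly: the more test cases the regenerated code passes, the more faithfully the documentation must have captured the original behavior, which avoids string matching against reference texts.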

© 2024 The Japanese Society for Artificial Intelligence