Execution-Based Evaluation Method for Code Documentation Generation Using Back-Translation (逆翻訳によるコード文書生成の実行ベース評価法)

髙野 志歩; 佐藤 美唯; 伊東 和香; 秋信 有花; 川口 貴子; 倉林 利行; 丹野 治門; 倉光 君郎

doi:10.11517/pjsai.JSAI2024.0_4Xin230

38th (2024)

Session ID : 4Xin2-30

Conference information

Host: The Japanese Society for Artificial Intelligence

Name : The 38th Annual Conference of the Japanese Society for Artificial Intelligence

Number : 38

Date : May 28, 2024 - May 31, 2024

Execution-Based Evaluation Method for Code Documentation Generation Using Back-Translation

*Shiho TAKANO, Miyu SATO, Waka ITO, Yuka AKINOBU, Takako KAWAGUCHI, Toshiyuki KURABAYASHI, Haruto TANNO, Kimio KURAMITSU

Keywords: Large Language Model, Back-Translation, Execution-Based Evaluation, Automated Evaluation Tool

CONFERENCE PROCEEDINGS FREE ACCESS

Abstract

In software development, code documentation is crucial for understanding and maintaining software. Because writing and maintaining documentation by hand is costly, interest in generating it automatically with Large Language Models (LLMs) has grown. However, conventional match-based evaluation cannot take semantic interpretation into account and requires costly reference texts. To address these issues, we propose an execution-based evaluation method using back-translation. Our approach back-translates LLM-generated code documentation into code and evaluates the documentation based on the execution results of that code. This process enables evaluation that accounts for semantic interpretation, synonyms, and diversity of expression. In this paper, we introduce lm-chaineval-harness, an automated evaluation tool developed by our team that implements the proposed method and provides a user-friendly evaluation environment, and we discuss validation experiments. The experimental results qualitatively show that the proposed method enables evaluation that incorporates semantic interpretation and accounts for synonyms and diversity of expression.
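
The evaluation loop described in the abstract can be illustrated with a minimal sketch. It is not the lm-chaineval-harness API. The names `passes_tests`, `back_translation_score`, and `toy_back_translate` are hypothetical, and the keyword-based `toy_back_translate` merely stands in for the LLM that would turn documentation back into code. The score is assumed here to be the fraction of documentation strings whose back-translated code passes every test case.

```python
from typing import Callable, List, Tuple

TestCase = Tuple[tuple, object]  # (positional arguments, expected return value)


def passes_tests(code: str, entry_point: str, tests: List[TestCase]) -> bool:
    """Execute candidate code and check it against input/output pairs."""
    namespace: dict = {}
    try:
        # NOTE: exec runs arbitrary code; real evaluations need a sandbox.
        exec(code, namespace)
        func = namespace[entry_point]
        return all(func(*args) == expected for args, expected in tests)
    except Exception:
        # Syntax errors, missing functions, and runtime errors count as failures.
        return False


def back_translation_score(
    docs: List[str],
    back_translate: Callable[[str], str],
    entry_point: str,
    tests: List[TestCase],
) -> float:
    """Fraction of docs whose back-translated code passes all tests."""
    if not docs:
        return 0.0
    passed = sum(
        passes_tests(back_translate(doc), entry_point, tests) for doc in docs
    )
    return passed / len(docs)


def toy_back_translate(doc: str) -> str:
    """Hypothetical stand-in for an LLM that generates code from documentation."""
    text = doc.lower()
    if "sum" in text or "add" in text:
        return "def f(a, b):\n    return a + b\n"
    return "def f(a, b):\n    return a - b\n"


# Test cases for the original function (addition of two numbers).
tests = [((1, 2), 3), ((0, 0), 0), ((-1, 5), 4)]

# Three candidate documentation strings: two correct paraphrases, one wrong.
docs = [
    "Return the sum of a and b.",
    "Adds two numbers together.",
    "Return the difference of a and b.",
]

score = back_translation_score(docs, toy_back_translate, "f", tests)
print(f"{score:.2f}")  # prints 0.67
```

The two correct descriptions share almost no wording, yet both pass because their back-translated code behaves correctly. A match-based metric would need a reference text and would penalize one of the two paraphrases.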
