Journal of Information Processing
Online ISSN : 1882-6652
ISSN-L : 1882-6652
 
Hallucination Detection on Code Generation with SelfCheckGPT
Waka Ito, Yui Obara, Miyu Sato, Kimio Kuramitsu

2025, Volume 33, pp. 487-493

Abstract

Large language models (LLMs) are expected to bring automation and efficiency to software development, including programming. However, LLMs face a challenge known as “hallucination,” in which they produce incorrect content or outputs that deviate from the input requirements. SelfCheckGPT is one of the methods designed to detect hallucinations. Its key feature is that it can infer the occurrence of hallucinations without requiring reference data or test cases. Although SelfCheckGPT has been evaluated and applied in natural language processing tasks such as text summarization and question answering, its performance in code generation has not yet been explored. In this study, we applied SelfCheckGPT to the HumanEval dataset, a standard benchmark for code generation, and investigated its evaluation performance by comparing it with execution-based evaluations. The results revealed that calculating similarity using BLEU, ROUGE-L, and EditSim is adequate for predicting the correctness of code, in other words, hallucinations.
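As a rough illustration of the idea described above (not the authors' implementation), SelfCheckGPT-style detection scores a generated program by its consistency with other samples drawn for the same prompt: if the samples disagree, a hallucination is more likely. The sketch below uses one of the similarity measures named in the abstract, EditSim, assumed here to be the standard normalized edit-distance similarity; the function names are hypothetical.

```python
# Hedged sketch of consistency-based hallucination scoring with EditSim.
# Assumption: EditSim(a, b) = 1 - Levenshtein(a, b) / max(|a|, |b|).

def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via dynamic programming (two-row version)."""
    prev = list(range(len(b) + 1))
    for i in range(1, len(a) + 1):
        curr = [i] + [0] * len(b)
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            curr[j] = min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost)
        prev = curr
    return prev[len(b)]

def edit_sim(a: str, b: str) -> float:
    """Normalized edit-distance similarity in [0, 1]."""
    if not a and not b:
        return 1.0
    return 1.0 - edit_distance(a, b) / max(len(a), len(b))

def selfcheck_score(candidate: str, samples: list[str]) -> float:
    """Average dissimilarity of a candidate program to other samples
    for the same prompt. Higher score -> less consistent -> more
    likely hallucinated."""
    sims = [edit_sim(candidate, s) for s in samples]
    return 1.0 - sum(sims) / len(sims)
```

In practice the score would be thresholded (or compared against execution-based pass/fail labels, as the paper does) to decide whether a generation is flagged as hallucinated; BLEU or ROUGE-L could be substituted for `edit_sim` in the same scoring loop.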

© 2025 by the Information Processing Society of Japan