2024 Volume 40 Issue 1 Pages 3-10
Abstract The rapid development of large language models in recent years has been remarkable, and their application to dental education is being explored. In this study, ChatGPT-3.5 and GPT-4, developed by OpenAI, answered the Japanese National Dental Examination, and their correct answer rates were compared and evaluated.
All Japanese National Dental Examination questions were obtained from the website of the Ministry of Health, Labour and Welfare, excluding questions containing charts, images, and similar materials. Prompts and questions were entered into ChatGPT-3.5 and GPT-4, and the correct response rates were evaluated for compulsory questions, general questions, domains A, B, and C, and individual dental subjects.
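The evaluation step described above amounts to comparing each model's chosen options against the official answer key and computing a correct response rate per category. The paper does not describe its scoring procedure in code; the following is a minimal illustrative sketch, with all names and data invented, of how such exact-match scoring could be implemented. Exact set matching matters because some questions ask for two or more answers ("choose two", "choose all").

```python
# Hypothetical sketch (not from the paper): scoring model answers against an
# answer key by exact match of the selected options. All names and the toy
# data below are illustrative assumptions.

def score(model_answers, answer_key):
    """Return the correct answer rate. A response counts as correct only if
    the set of chosen options exactly matches the key, which is the natural
    criterion for 'choose two' / 'choose all' question formats."""
    correct = sum(
        1 for qid, key in answer_key.items()
        if set(model_answers.get(qid, ())) == set(key)
    )
    return correct / len(answer_key)

# Toy example with three multiple-choice questions.
answer_key = {"Q1": {"a"}, "Q2": {"b", "d"}, "Q3": {"c"}}
model_answers = {"Q1": {"a"}, "Q2": {"b"}, "Q3": {"c"}}
print(score(model_answers, answer_key))  # Q2 is only partially correct, so 2/3
```

Grouping questions by category (compulsory, general, domain A/B/C) before calling such a function would yield the per-category rates compared in the study.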
Significant differences in correct response rates were found between ChatGPT-3.5 and GPT-4 for the 288 compulsory questions, 500 general questions, 314 domain A questions, 92 domain B questions, and 94 domain C questions, with GPT-4 scoring significantly higher across all question types and domains. When questions were grouped by the number of answers to be selected, as indicated in the question text, GPT-4 showed significantly higher correct answer rates than GPT-3.5 for all formats except "choose all."
In problem-solving ability on the National Dental Examination, GPT-4 outperformed GPT-3.5. These results suggest that, for questions not involving clinical images or charts, GPT-4 could pass the National Dental Examination except in domain B, and that it has potential as a tool to support dental education.