2026, Volume 52, Issue 4, pp. 200-210
Generative artificial intelligence (AI) has advanced rapidly and is expected to improve the efficiency and quality of clinical practice. In pharmacy practice, deciding which medications to discontinue perioperatively is a critical task directly related to patient safety. However, standardized guidance remains limited, particularly for over-the-counter (OTC) drugs and dietary supplements, and decisions often rely on the expertise of individual pharmacists. This study evaluated the accuracy of nine generative AI models available in April 2025 (GPT-4o, GPT-4o mini, OpenAI o3, Gemini 2.5 Pro Exp, Gemini 2.0 Flash, Claude 3.7 Sonnet, Grok 3, Llama 4 Scout, and DeepSeek R1) in making perioperative medication discontinuation decisions, using 15 mock prescription sets comprising 105 items. Each model received a standardized Japanese prompt, and its outputs were independently assessed by five hospital pharmacists (≥5 years of clinical experience) against three criteria: accurate drug identification, appropriateness of discontinuation and resumption timing, and validity of the pharmacological rationale. A response was considered correct when at least four of the five pharmacists agreed. OpenAI o3 demonstrated the highest accuracy (87.6%), followed by Gemini 2.5 Pro Exp (84.8%) and GPT-4o (83.8%). Lightweight models demonstrated lower accuracy, particularly for OTC products, dietary supplements, and fixed-dose combination drugs. High-performance models with advanced reasoning capabilities exhibited high accuracy and may serve as useful decision-support tools. However, incorrect responses occurred in approximately 10 to 20% of cases, even among the top-performing models. Therefore, safe clinical implementation requires careful model selection, integration with institutional knowledge resources, and final verification by pharmacists.
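The scoring rule described in the abstract (a response counts as correct when at least four of the five pharmacists agree, with accuracy taken over the 105 items) can be illustrated with a minimal Python sketch. It assumes each pharmacist gives a single pass/fail judgment per item covering all three criteria. That encoding, and the names `is_correct` and `accuracy`, are hypothetical and not taken from the study.

```python
from typing import Sequence

N_RATERS = 5    # five hospital pharmacists, as stated in the abstract
MIN_AGREE = 4   # "at least four of the pharmacists agreed"


def is_correct(votes: Sequence[bool], min_agree: int = MIN_AGREE) -> bool:
    """Return True when at least `min_agree` raters judged the response acceptable.

    Assumption: each rater gives one pass/fail vote per item.
    """
    if len(votes) != N_RATERS:
        raise ValueError(f"expected {N_RATERS} votes, got {len(votes)}")
    return sum(votes) >= min_agree


def accuracy(all_votes: Sequence[Sequence[bool]]) -> float:
    """Percentage of items counted as correct under the agreement rule."""
    if not all_votes:
        raise ValueError("no items to score")
    n_correct = sum(is_correct(v) for v in all_votes)
    return 100.0 * n_correct / len(all_votes)


# Illustrative (made-up) votes: 92 items passed by all raters, 13 failed by all.
example = [[True] * 5] * 92 + [[False] * 5] * 13
print(round(accuracy(example), 1))
```

With 105 items, the reported 87.6%, 84.8%, and 83.8% are consistent with 92, 89, and 88 correct responses respectively. These counts are inferred from the percentages and are not stated in the abstract.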