2025 Volume 76 Issue 5 Pages 275-281
Endoscopic laryngo-pharyngeal surgery (ELPS) is a minimally invasive treatment for superficial laryngo-pharyngeal cancer that preserves swallowing and speech function well. To maximize its benefits, however, the tumor must be resected completely without excessive dissection. This paper outlines the development of a system that uses an AI image processing model to assist in determining the appropriate resection range during ELPS. As a pilot study, a holdout validation test (172 images for training, 44 for validation) was conducted using 216 narrow band imaging (NBI) images of superficial laryngo-pharyngeal cancer treated in our department. An upper gastrointestinal endoscopy specialist reviewed the images and annotated the lesions by enclosing, with free-form curves, the areas judged to be cancerous based on NBI findings. We used DeepLab v3+, a semantic segmentation model, to predict the lesion area, and performed transfer learning with our data, starting from a published model pre-trained on Pascal VOC 2014. The average performance indices of the inference results, evaluated pixel by pixel on the 44 validation images, were intersection over union (IoU): 0.596, sensitivity: 0.734, and specificity: 0.947. In Japan, two AI surgical support systems for endoscopic surgery were approved for manufacture and marketing by the Ministry of Health, Labour and Welfare in 2024 as a “surgical image recognition support program.” In otorhinolaryngology, AI surgical support systems should be brought into clinical use in ways that truly benefit society, guided by a clear strategy for the future of healthcare systems.
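
For readers unfamiliar with pixel-wise evaluation, the following is a minimal sketch of how IoU, sensitivity, and specificity can be computed from a predicted lesion mask and the specialist's annotation. It is not the authors' code. The function names `pixel_metrics` and `mean_metrics`, the representation of masks as nested lists of 0 and 1, and the choice to average the indices per image are all assumptions made for illustration, and the small masks are made-up toy data rather than study data.

```python
def pixel_metrics(pred, truth):
    """Compare two binary masks (nested lists of 0/1, same shape) pixel by pixel.

    Hypothetical helper for illustration; 1 marks a lesion pixel.
    """
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1  # lesion pixel correctly predicted
            elif p and not t:
                fp += 1  # background predicted as lesion
            elif t:
                fn += 1  # lesion pixel missed
            else:
                tn += 1  # background correctly predicted

    def ratio(num, den):
        # Undefined ratios (e.g. an image with no lesion pixels) become NaN.
        return num / den if den else float("nan")

    return {
        "iou": ratio(tp, tp + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
    }


def mean_metrics(pairs):
    """Average each index over (pred, truth) pairs.

    Per-image averaging is an assumption, not a detail stated in the abstract.
    """
    results = [pixel_metrics(pred, truth) for pred, truth in pairs]
    return {key: sum(r[key] for r in results) / len(results) for key in results[0]}


# Toy 3x4 masks (made-up data).
truth = [[1, 1, 0, 0],
         [1, 1, 0, 0],
         [0, 0, 0, 0]]
pred = [[1, 1, 1, 0],
        [1, 0, 0, 0],
        [0, 0, 0, 0]]

example = pixel_metrics(pred, truth)
print(example)  # IoU 0.6, sensitivity 0.75, specificity 0.875
```

In this toy case the prediction overlaps three of the four annotated lesion pixels and adds one false positive, so IoU is 3/5, sensitivity is 3/4, and specificity is 7/8. Specificity tends to be high whenever background pixels dominate the image, which is one reason IoU is usually read alongside it.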