Transactions of the Society of Instrument and Control Engineers
Online ISSN : 1883-8189
Print ISSN : 0453-4654
ISSN-L : 0453-4654
Paper
Disaster Related Image Detection on an IoT Device Using Vision-language Technology
Masao KUBO, Quyet Thanh NGUYEN, Hiroshi SATO, Akihiro YAMAGUCHI

2026 Volume 62 Issue 2 Pages 59-68

Abstract

In a crisis, a user's mobile device may be cut off from the Internet and from advanced AI services, so technology that works without them is needed. In this paper, we show through computer experiments that a method based on BLIP-2, a vision-language model, achieves promising accuracy when deciding locally whether the subject of a photo is related to a disaster. The proposed method relies on embedding, the preprocessing step that converts input into vectors for an LLM. When the LLM itself is unavailable, we consider processing based on these vectors to be one of the best alternatives. BLIP-2 vectorizes the photo, and the decision is made by comparing the resulting vector with a built-in database using the k-nearest neighbor method. The experiments show that, given an appropriate database, this method outperforms other methods in accuracy and F1 score.
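
The decision step described above (compare a photo's vector with a labeled database using k-nearest neighbors) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the similarity measure, the value of k, or the embedding dimension, so cosine similarity, k=3, and the toy 3-dimensional vectors below are all assumptions, and the names `cosine_similarity`, `knn_classify`, and `DATABASE` are hypothetical. In practice the vectors would come from BLIP-2 rather than being written by hand.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def knn_classify(query, database, k=3):
    """Return the majority label among the k database entries most similar to query."""
    ranked = sorted(
        database,
        key=lambda item: cosine_similarity(query, item[0]),
        reverse=True,  # most similar first
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy stand-ins for image embeddings; real BLIP-2 vectors are much longer.
DATABASE = [
    ([0.9, 0.1, 0.0], "disaster"),
    ([0.8, 0.2, 0.1], "disaster"),
    ([0.7, 0.3, 0.0], "disaster"),
    ([0.1, 0.9, 0.2], "non-disaster"),
    ([0.0, 0.8, 0.3], "non-disaster"),
    ([0.2, 0.7, 0.1], "non-disaster"),
]

if __name__ == "__main__":
    query = [0.85, 0.15, 0.05]  # hypothetical embedding of a new photo
    print(knn_classify(query, DATABASE, k=3))
```

Because the lookup needs only the stored vectors and a similarity function, it can run on the device without a network connection; an odd k avoids tied votes in the two-class case.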

© 2026 The Society of Instrument and Control Engineers