2025 Volume 81 Issue 16 Article ID: 24-16190
Existing surveys of river space use do not capture user attributes or detailed usage conditions, so a new method is needed to complement current survey results, both to properly evaluate the effects of river improvement and to understand how residents engage with rivers. As a basic investigation toward a survey method that combines river cameras with AI, this study examined how well LLaVA, a large multimodal model (LMM), recognizes the attributes and behaviors of people using rivers. We also compared LLaVA with an object detection model to investigate the influence of the shooting environment, and examined LLaVA's usefulness and the limits of its applicability. LLaVA's gender estimates generally agreed with visual inspection, whereas for age, elderly people tended to be misclassified. Its behavior estimates also generally agreed with visual inspection, except for some behaviors. These results indicate that LLaVA is useful for recognizing human attributes and behaviors.
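The comparison between model estimates and visual inspection can be illustrated with a per-category agreement rate. The following is a minimal sketch, not the procedure used in the study: the function name `agreement_by_category` and the toy labels are hypothetical and do not come from the survey data.

```python
from collections import defaultdict


def agreement_by_category(visual_labels, model_labels):
    """Return {category: share of items where the model matched visual inspection}.

    Categories are taken from the visual-inspection labels, which are
    treated here as the reference (an assumption of this sketch).
    """
    if len(visual_labels) != len(model_labels):
        raise ValueError("label lists must have the same length")

    totals = defaultdict(int)   # items per reference category
    matches = defaultdict(int)  # items where the model agreed
    for visual, model in zip(visual_labels, model_labels):
        totals[visual] += 1
        if visual == model:
            matches[visual] += 1

    return {category: matches[category] / totals[category] for category in totals}


# Hypothetical toy labels for illustration only (not data from the study).
visual = ["adult", "adult", "elderly", "elderly", "child"]
model = ["adult", "adult", "adult", "elderly", "child"]

rates = agreement_by_category(visual, model)
for category, rate in rates.items():
    print(f"{category}: {rate:.2f}")
```

Reporting agreement per category rather than as one overall figure makes it easier to see when a single group, such as elderly people, is misclassified more often than the others.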