Host: Japan Society for Fuzzy Theory and Intelligent Informatics (SOFT)
Name : 41st Fuzzy System Symposium
Number : 41
Location : [in Japanese]
Date : September 03, 2025 - September 05, 2025
Recognizing human-object interaction (hereafter HOI) is essential for deeper scene understanding. Existing two-stage approaches supplement pixel features with spatial information about the person and the object, which captures their relative placement and provides more detailed context. Methods that extract 3D spatial information from monocular images typically rely on 3D modeling or a renderer to produce training data, and in both cases the data cost and rendering cost become bottlenecks. In this study, we propose a new method that extracts spatial information from monocular images using skeleton estimation and depth estimation, without depending on 3D training data or a renderer.
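As an illustration only, the general idea of combining skeleton keypoints with an estimated depth map can be sketched as follows. This is not the authors' implementation: the pinhole camera model, the intrinsics (fx, fy, cx, cy), the assumption that the depth map holds usable per-pixel depth values, and all function names are hypothetical choices made for this sketch.

```python
from typing import List, Sequence, Tuple

Point3D = Tuple[float, float, float]


def backproject(u: float, v: float, depth: float,
                fx: float, fy: float, cx: float, cy: float) -> Point3D:
    """Map pixel (u, v) with depth to camera coordinates (assumed pinhole model)."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


def lift_keypoints(keypoints: Sequence[Tuple[float, float]],
                   depth_map: Sequence[Sequence[float]],
                   fx: float, fy: float, cx: float, cy: float) -> List[Point3D]:
    """Look up the estimated depth at each 2D keypoint and lift it to 3D."""
    height, width = len(depth_map), len(depth_map[0])
    points = []
    for u, v in keypoints:
        # Clamp to the image bounds so keypoints on the border stay valid.
        row = min(max(int(round(v)), 0), height - 1)
        col = min(max(int(round(u)), 0), width - 1)
        points.append(backproject(u, v, depth_map[row][col], fx, fy, cx, cy))
    return points


def relative_offset(person_joint: Point3D, object_center: Point3D) -> Point3D:
    """3D offset from a person joint to an object center (a simple spatial feature)."""
    return tuple(o - p for p, o in zip(person_joint, object_center))
```

In this sketch, offsets such as `relative_offset(wrist_3d, cup_3d)` could be passed to an interaction classifier alongside pixel features. Monocular depth estimators often return relative rather than metric depth, so the resulting coordinates may only be meaningful up to scale.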