2025 Volume 37 Issue 2 Pages 310-321
Product display robots are considered for industrial arm robot applications. Object pose estimation is necessary to automate product displays. However, the shapes of some objects in retail stores are simple, and robots often use RGB images from a single viewpoint. Consequently, the pose estimation accuracy is low depending on the viewpoint. Therefore, this paper proposes a multiview pose estimation method that fuses features using weights for each viewpoint. To calculate the weights, we focus on a shared object representation that expresses object poses through classification. The classification score for each class increased when pose estimation became easier. Thus, we developed a method that weighs features from each viewpoint using classification scores as confidence, and estimates the object pose. We compared the pose estimation results with those of the conventional method, which derives the most plausible pose from multiple estimation results. When the permissible angle error was set to 30°, the success rate of our method was 68.0%, which was 8.2 points higher than that of the conventional method.
This article cannot obtain the latest cited-by information.