IEICE Transactions on Information and Systems
Online ISSN : 1745-1361
Print ISSN : 0916-8532
Regular Section
Attention Voting Network with Prior Distance Augmented Loss for 6DoF Pose Estimation
Yong HEJi LIXuanhong ZHOUZewei CHENXin LIU
Author information
JOURNAL FREE ACCESS

2021 Volume E104.D Issue 7 Pages 1039-1048

Details
Abstract

6DoF pose estimation from a monocular RGB image is a challenging but fundamental task. The methods based on unit direction vector-field representation and Hough voting strategy achieved state-of-the-art performance. Nevertheless, they apply the smooth 1 loss to learn the two elements of the unit vector separately, resulting in which is not taken into account that the prior distance between the pixel and the keypoint. While the positioning error is significantly affected by the prior distance. In this work, we propose a Prior Distance Augmented Loss (PDAL) to exploit the prior distance for more accurate vector-field representation. Furthermore, we propose a lightweight channel-level attention module for adaptive feature fusion. Embedding this Adaptive Fusion Attention Module (AFAM) into the U-Net, we build an Attention Voting Network to further improve the performance of our method. We conduct extensive experiments to demonstrate the effectiveness and performance improvement of our methods on the LINEMOD, OCCLUSION and YCB-Video datasets. Our experiments show that the proposed methods bring significant performance gains and outperform state-of-the-art RGB-based methods without any post-refinement.

Content from these authors
© 2021 The Institute of Electronics, Information and Communication Engineers
Previous article Next article
feedback
Top