2024 Volume 76 Issue 1 Pages 75-80
This paper presents a novel perception framework for 2D and 3D object detection based on the sensor fusion of cameras and LiDAR. Camera images provide abundant environmental features but lack depth information. Conversely, LiDAR point clouds offer accurate depth information but are sparse. Because the strengths and weaknesses of the two sensors are complementary, an unsupervised depth completion network is used to enrich the information from both sensors. The enhanced data are then used for 2D and 3D object detection with a state-of-the-art detection network. The proposed framework is validated on the KITTI dataset, and experimental results demonstrate notable improvements in both 2D and 3D detection tasks over the baseline results.