2022 Volume 10 Issue 1 Pages 127-135
Face alignment has been studied extensively and has seen significant progress in recent years. However, the task remains challenging because of the ambiguity of invisible landmarks under extreme conditions (e.g., large poses and expressions). This paper proposes a novel densely connected network, built on the coordinate regression method, to improve facial landmark localization under such conditions. An attention mechanism can guide the model on which information to emphasize or suppress. In our network, we embed a Convolutional Block Attention Module (CBAM) in each of three stages of the densely connected convolutional network (DenseNet) for the face alignment task. We then concatenate the landmark location labels with the bounding box location and head pose values as guidance information for the network. We evaluate our network on the AFLW2000-3D, AFLW2000-3D-Reannotated, and Menpo-3D test datasets. Comparative experiments show that our network achieves a lower mean NME (3.2%) than the baseline DenseNet (3.66%), ShuffleNet (4.39%), and DenseNet+SE (3.93%) on AFLW2000-3D-Reannotated. We conclude that our network achieves improved performance in facial landmark prediction even under extreme conditions.
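For readers unfamiliar with the reported metric, the following is a minimal sketch of how a normalized mean error (NME) can be computed for one face. It assumes normalization by the square root of the bounding box area, a choice commonly used on AFLW2000-3D; the abstract does not state the normalization factor, so that choice, the function name `nme`, and the toy coordinates are illustrative assumptions rather than the authors' evaluation code.

```python
import math


def nme(pred, gt, bbox_w, bbox_h):
    """Mean Euclidean landmark error divided by sqrt(bbox_w * bbox_h).

    pred, gt: equal-length sequences of (x, y) landmark coordinates.
    bbox_w, bbox_h: bounding box width and height (assumed normalizer).
    """
    if len(pred) != len(gt) or len(pred) == 0:
        raise ValueError("pred and gt must be non-empty and the same length")
    if bbox_w <= 0 or bbox_h <= 0:
        raise ValueError("bounding box dimensions must be positive")

    norm = math.sqrt(bbox_w * bbox_h)
    # Sum of per-landmark Euclidean distances between prediction and label.
    total = sum(math.dist(p, g) for p, g in zip(pred, gt))
    return total / (len(pred) * norm)


# Toy example with two landmarks in a 100 x 100 bounding box (made-up values).
predicted = [(3.0, 4.0), (10.0, 10.0)]
ground_truth = [(0.0, 0.0), (10.0, 10.0)]
value = nme(predicted, ground_truth, 100.0, 100.0)
print(f"{value:.2%}")  # prints 2.50%
```

A dataset-level figure such as the mean NME quoted above would then be the average of this per-face value over all test images.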