2022, Volume 9, Issue 3, pp. 275-284
Cross-view image matching for geo-localization is the task of finding images that contain the same geographic target across different platforms. The task has drawn significant attention from researchers because of its many applications in UAV self-localization and navigation. Given a query image taken from the UAV view, a matching model retrieves the geo-referenced satellite image of the same target from a database, and that image can then be used to locate the UAV's current position precisely. Many studies have achieved high accuracy on existing datasets, but their results can be improved further by combining different feature processing methods. Inspired by previous studies, in this paper we propose a new framework that combines a channel-based attention mechanism with a part-based representation learning method, including multi-level feature aggregation and an alternative pooling strategy, to enhance the feature extraction process. The proposed model significantly improves matching accuracy and surpasses the existing state-of-the-art methods on the University-1652 dataset.
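The retrieval step described above (a UAV-view query matched against a database of satellite images) can be illustrated with a minimal sketch. It is not the paper's model: the feature vectors are hand-written toy values standing in for the output of a feature extractor, the function names are hypothetical, and cosine similarity is an assumed choice, since the abstract does not state which distance measure is used.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors of equal length."""
    a_n, b_n = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_n, b_n))


def rank_gallery(query_feat, gallery_feats):
    """Return gallery indices sorted from most to least similar to the query."""
    scores = [cosine_similarity(query_feat, g) for g in gallery_feats]
    return sorted(range(len(gallery_feats)), key=lambda i: scores[i], reverse=True)


# Toy data (assumption): three satellite-view features and one UAV-view query.
gallery = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.6, 0.8, 0.0],
]
query = [0.5, 0.9, 0.1]

ranking = rank_gallery(query, gallery)
print(ranking)  # index 0 of the ranking is the best-matching satellite image
```

In a full pipeline, the top-ranked satellite image's geo-reference would be taken as the estimated UAV location; the quality of the ranking depends entirely on the learned features, which is the part the proposed framework targets.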