2025, Vol. 29, No. 4, pp. 838-846
An overlap-window-based transformer is proposed for infrared and visible image fusion. A multi-head self-attention mechanism operating on overlapping windows is designed: by introducing overlapping regions between windows, local features can interact across windows, avoiding the discontinuity and information-isolation issues caused by non-overlapping partitions. The model is trained with an unsupervised loss function composed of three terms: a pixel loss, a gradient loss, and a structural loss. With the end-to-end model and the unsupervised loss, the method eliminates the need to manually design complex activity-level measurements and fusion strategies. Extensive experiments on the public TNO (grayscale) and RoadScene (RGB) datasets demonstrate that the proposed method achieves the expected long-range dependency modeling capability when fusing infrared and visible images and yields positive results in both qualitative and quantitative evaluations.
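The core idea of attention over overlapping windows can be illustrated as follows. This is a minimal PyTorch sketch, not the paper's implementation: the window size, stride (which sets the overlap), and head count are illustrative assumptions, and it assumes (H - window) and (W - window) are divisible by the stride so the windows tile the image. Pixels covered by several windows are averaged when the windows are folded back, which is how features computed in one window influence its neighbors.

```python
# Minimal sketch of multi-head self-attention over overlapping windows
# (window size, stride, and head count are illustrative assumptions).
import torch
import torch.nn as nn
import torch.nn.functional as F

class OverlapWindowAttention(nn.Module):
    def __init__(self, dim, window=8, stride=6, heads=4):
        super().__init__()
        self.window, self.stride = window, stride
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):                          # x: (B, C, H, W)
        B, C, H, W = x.shape
        w, s = self.window, self.stride
        # Extract overlapping w×w windows: (B, C*w*w, L), L window positions.
        patches = F.unfold(x, kernel_size=w, stride=s)
        L = patches.shape[-1]
        # Treat each window as a sequence of w*w tokens of dimension C.
        tokens = (patches.view(B, C, w * w, L)
                  .permute(0, 3, 2, 1).reshape(B * L, w * w, C))
        out, _ = self.attn(tokens, tokens, tokens)  # attention within a window
        # Fold windows back; overlapping pixels are averaged via a count map.
        out = (out.reshape(B, L, w * w, C)
               .permute(0, 3, 2, 1).reshape(B, C * w * w, L))
        folded = F.fold(out, (H, W), kernel_size=w, stride=s)
        ones = torch.ones(B, C, H, W, device=x.device)
        count = F.fold(F.unfold(ones, kernel_size=w, stride=s),
                       (H, W), kernel_size=w, stride=s)
        return folded / count

attn = OverlapWindowAttention(dim=32, window=8, stride=6, heads=4)
y = attn(torch.randn(1, 32, 20, 20))  # output keeps the spatial size
```

With stride < window, adjacent windows share a band of pixels, so information propagates across window boundaries without the shifted-window trick used by non-overlapping partitions.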
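The three-term unsupervised loss can likewise be sketched. The abstract names only the pixel, gradient, and structural terms; the reference targets used below (element-wise maximum intensity, maximum Sobel gradient), the compact SSIM surrogate, and the weights `lam_g` and `lam_s` are common choices in this literature and are assumptions here, not the paper's exact definitions.

```python
# Sketch of an unsupervised pixel + gradient + structure fusion loss
# (targets and weights are assumed, not taken from the paper).
import torch
import torch.nn.functional as F

def sobel_grad(x):
    # Per-channel gradient magnitude (L1 of Sobel responses).
    kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]],
                      device=x.device).view(1, 1, 3, 3)
    ky = kx.transpose(2, 3)
    c = x.shape[1]
    gx = F.conv2d(x, kx.expand(c, 1, 3, 3), padding=1, groups=c)
    gy = F.conv2d(x, ky.expand(c, 1, 3, 3), padding=1, groups=c)
    return gx.abs() + gy.abs()

def ssim(a, b, win=11, c1=0.01 ** 2, c2=0.03 ** 2):
    # Local statistics via mean pooling; a compact stand-in for full SSIM.
    mu_a = F.avg_pool2d(a, win, 1, win // 2)
    mu_b = F.avg_pool2d(b, win, 1, win // 2)
    var_a = F.avg_pool2d(a * a, win, 1, win // 2) - mu_a ** 2
    var_b = F.avg_pool2d(b * b, win, 1, win // 2) - mu_b ** 2
    cov = F.avg_pool2d(a * b, win, 1, win // 2) - mu_a * mu_b
    s = ((2 * mu_a * mu_b + c1) * (2 * cov + c2)) / \
        ((mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2))
    return s.mean()

def fusion_loss(fused, ir, vis, lam_g=10.0, lam_s=1.0):
    l_pix = F.l1_loss(fused, torch.maximum(ir, vis))           # pixel term
    l_grad = F.l1_loss(sobel_grad(fused),                      # gradient term
                       torch.maximum(sobel_grad(ir), sobel_grad(vis)))
    l_ssim = 1 - 0.5 * (ssim(fused, ir) + ssim(fused, vis))    # structure term
    return l_pix + lam_g * l_grad + lam_s * l_ssim
```

Because every term is computed from the two source images rather than from a ground-truth fused image, the network can be trained end to end without hand-crafted activity-level measurements or fusion rules.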