ITE Transactions on Media Technology and Applications
Online ISSN : 2186-7364
ISSN-L : 2186-7364
Volume 13, Issue 2
Showing 1-3 of 3 articles from the selected issue
Papers
  • Yoshiki Inazu, Hideaki Kimata
    2025 Volume 13 Issue 2 Pages 200-210
    Published: 2025
    Released: 2025/04/01
    Journal Free Access

    RGBA images, which include an alpha channel for transparency, are common in real-world applications. Traditional RGBA compression methods apply the same processing to both the RGB and alpha channels, potentially leading to suboptimal results because of their different characteristics. This paper proposes a deep neural network that introduces attention modules individually suited to the RGB signal and the alpha channel. The proposed method consists of two networks, one for the RGB signal and one for the alpha channel, each with an appropriate attention module. In particular, a new attention module that focuses on the unmasked regions of the alpha channel is applied. In the evaluation, the proposed method is compared with a simple deep neural network whose input and output layers are extended from three to four channels, and with classical RGBA image compression methods.
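
    As a rough illustration of the channel-specific attention idea described in the abstract, the sketch below shows a hypothetical attention block that uses the alpha channel as a mask so that unmasked (opaque) regions are emphasized. The module name, layer sizes, and residual formulation are assumptions made for illustration only, not the authors' implementation.

    ```python
    import torch
    import torch.nn as nn

    class AlphaMaskedAttention(nn.Module):
        """Hypothetical attention block that emphasizes unmasked (opaque)
        regions indicated by the alpha channel. Illustrative sketch only."""
        def __init__(self, channels: int):
            super().__init__()
            # 1x1 convolutions derive an attention map from features + alpha
            self.attn = nn.Sequential(
                nn.Conv2d(channels + 1, channels, kernel_size=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels, channels, kernel_size=1),
                nn.Sigmoid(),
            )

        def forward(self, features: torch.Tensor, alpha: torch.Tensor) -> torch.Tensor:
            # features: (N, C, H, W); alpha: (N, 1, H, W) with values in [0, 1]
            weights = self.attn(torch.cat([features, alpha], dim=1))
            # Suppress attention in fully transparent regions, keep a residual path
            return features + features * weights * alpha

    # Usage sketch: such a block would sit inside the alpha-channel branch,
    # in parallel with a separate branch for the RGB signal.
    feat = torch.randn(1, 64, 32, 32)
    alpha = torch.rand(1, 1, 32, 32)
    out = AlphaMaskedAttention(64)(feat, alpha)
    ```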

  • Youta Noboru, Yuko Ozasa, Masayuki Tanaka
    2025 Volume 13 Issue 2 Pages 211-220
    Published: 2025
    Released: 2025/04/01
    Journal Free Access

    We present a system for identifying individual penguins in flock images by combining appearance-based penguin detection using RGB images with spectral-based penguin identification using hyperspectral (HS) information. For the spectral-based identification, we propose a classification-and-ensemble approach inspired by the observation that many animals share similar color patterns; penguins, for example, typically have black-and-white coloration with consistent color distributions across individuals. The proposed system's effectiveness was experimentally validated using the mean Average Precision (mAP) metric. The proposed Appearance-and-Spectral-Based identification system significantly outperformed conventional appearance-based and spectral-based systems.
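
    A minimal sketch of the classification-and-ensemble idea, under the assumption that pixel spectra inside each RGB-detected penguin region are classified individually and the per-pixel predictions are pooled by majority vote. The band count, number of individuals, random-forest classifier, and all data here are placeholders, not the paper's actual pipeline.

    ```python
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    B = 100                              # number of hyperspectral bands (assumed)
    rng = np.random.default_rng(0)

    # Assumed training data: labeled pixel spectra from known individuals.
    X_train = rng.random((500, B))
    y_train = rng.integers(0, 5, size=500)   # 5 individual penguins (assumed)

    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(X_train, y_train)

    def identify_individual(detection_spectra: np.ndarray) -> int:
        """Classify every pixel spectrum in one detected penguin region,
        then ensemble the per-pixel predictions by majority vote."""
        per_pixel = clf.predict(detection_spectra)   # shape (P,)
        votes = np.bincount(per_pixel)
        return int(np.argmax(votes))

    # Usage: spectra extracted from one RGB-detected bounding box.
    detection = rng.random((200, B))
    print(identify_individual(detection))
    ```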

  • Chihiro Hoshizawa, Taishi Iriyama, Takashi Komuro
    2025 Volume 13 Issue 2 Pages 221-230
    Published: 2025
    Released: 2025/04/01
    Journal Free Access

    In this study, we attempt to reproduce the material appearance of objects with various optical characteristics using a free-viewpoint image generation network. The network takes RGB and depth images captured at four specific viewpoints as input and generates an image at an intermediate viewpoint among them. The RGB images are geometrically transformed into images seen from the output viewpoint by image warping, and an image with interpolated luminance is then generated using a U-Net-based image transformation network. We use an adversarial loss to generate material-specific appearances rather than to obtain optically correct outputs. In our experiments, we used metal materials that reflect the surrounding environment, glass materials that transmit and refract light, and materials with sub-surface scattering. The results showed that the use of the adversarial loss gave better results for all of these materials, both in LPIPS, an image quality assessment metric close to human perception, and in evaluations by human participants.
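
    The sketch below outlines the described pipeline shape: warped RGB views stacked along the channel axis, a small U-Net-style generator with one skip connection producing the intermediate view, and an adversarial loss combined with a reconstruction term. Every layer size, module name, and loss weight is an assumption for illustration; this is not the authors' network.

    ```python
    import torch
    import torch.nn as nn

    class TinyUNet(nn.Module):
        """Toy U-Net-style generator: four warped RGB views (12 channels)
        concatenated along the channel axis -> predicted intermediate view."""
        def __init__(self, in_ch: int = 12, out_ch: int = 3):
            super().__init__()
            self.enc = nn.Sequential(nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(inplace=True))
            self.down = nn.Sequential(nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True))
            self.up = nn.Sequential(nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(inplace=True))
            self.out = nn.Conv2d(64, out_ch, 3, padding=1)  # 64 = 32 (skip) + 32 (decoded)

        def forward(self, x):
            e = self.enc(x)                      # full-resolution features
            d = self.up(self.down(e))            # downsample then upsample
            return self.out(torch.cat([e, d], dim=1))  # skip connection

    class PatchDiscriminator(nn.Module):
        """Small discriminator providing the adversarial signal."""
        def __init__(self, in_ch: int = 3):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(in_ch, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
                nn.Conv2d(32, 1, 4, stride=2, padding=1))

        def forward(self, x):
            return self.net(x)

    # Usage sketch: warped views -> generator -> adversarial + reconstruction loss.
    warped_views = torch.randn(1, 12, 64, 64)   # four warped RGB images, concatenated
    target = torch.randn(1, 3, 64, 64)          # ground-truth intermediate view

    G, D = TinyUNet(), PatchDiscriminator()
    bce = nn.BCEWithLogitsLoss()

    fake = G(warped_views)
    pred_fake = D(fake)
    adv_loss = bce(pred_fake, torch.ones_like(pred_fake))   # generator tries to fool D
    rec_loss = nn.functional.l1_loss(fake, target)
    g_loss = rec_loss + 0.01 * adv_loss                     # weighting is an assumption
    ```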
