Research on tactical and performance analysis utilizes videos of dynamic sports scenes, and an effective multi-view video switching method can support such analysis. Bullet-time video is a multi-view video browsing approach; because the captured images are presented almost as they are, it is well suited to high-quality observation of a subject from multiple directions. This paper proposes a multi-view image switching method for understanding dynamic scenes in large-scale spaces such as soccer games. We develop a prediction model for the camerawork used to shoot bullet-time videos. The model is a deep neural network that estimates a suitable viewpoint for observing the target scene from the position information of the soccer players, the ball, and the goals.
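A minimal sketch of this idea, assuming a hypothetical linear scorer with hand-set weights rather than the authors' trained network: flatten the player, ball, and goal positions into a feature vector, score each candidate camera, and pick the best one.

```python
# Sketch: pick a viewpoint from player/ball/goal positions with a linear
# scorer. The feature layout and all weights are illustrative assumptions,
# not the trained model from the paper.

def make_features(players, ball, goal):
    """Flatten 2D pitch positions into one feature vector."""
    feats = []
    for x, y in players:
        feats.extend([x, y])
    feats.extend(ball)
    feats.extend(goal)
    return feats

def score(weights, bias, feats):
    return sum(w * f for w, f in zip(weights, feats)) + bias

def predict_viewpoint(camera_models, players, ball, goal):
    """Return the index of the candidate camera with the highest score."""
    feats = make_features(players, ball, goal)
    scores = [score(w, b, feats) for w, b in camera_models]
    return max(range(len(scores)), key=scores.__getitem__)

# Toy example: two candidate cameras, two players.
players = [(10.0, 5.0), (20.0, 8.0)]
ball = (15.0, 6.0)
goal = (52.5, 0.0)
cams = [([0.01] * 8, 0.0),    # camera 0: small uniform weights
        ([0.02] * 8, -1.0)]   # camera 1: larger weights, negative bias
best = predict_viewpoint(cams, players, ball, goal)
```

In practice the scorer would be a trained deep network, but the interface is the same: positions in, preferred viewpoint out.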
For automated game analysis, it is essential to detect the kicking motions of players in soccer videos in order to understand each player's actions. This paper presents a fast and accurate approach to detecting kicking motions with a ball-centric window in multi-view 4K soccer videos. Building on powerful object detection techniques such as SSD and YOLOv3 and pose estimation techniques such as OpenPose and CPN, we propose novel solutions to two challenges in 4K soccer videos. The first challenge is that processing the massive amount of data in multi-view 4K videos is prohibitively expensive. Our solution is to process only a small portion (i.e., a ball-centric window) of each 4K frame, aided by an object tracking technique and a homography transformation. The second challenge is that kicking motions may be incorrectly detected due to two factors: the absence of depth information and the inaccuracy of pose estimation. We fuse multiple views to avoid the depth problem, and we propose enlarging the person areas to effectively improve the accuracy of pose estimation. Experiments on real data from the J1 League demonstrate that the proposed approach detects kicking motions both faster and more accurately than conventional methods.
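The ball-centric window can be sketched as follows, assuming a placeholder homography matrix rather than real camera calibration: project the tracked ball's pitch position into the 4K frame, then crop a fixed-size window around it, clamped to the frame bounds.

```python
# Sketch: map a tracked ball's pitch position into a 4K frame via a planar
# homography, then crop a fixed-size window around it. The homography matrix
# below is an illustrative placeholder, not calibration from the paper.

def apply_homography(H, x, y):
    """Project a 2D point with a 3x3 homography (row-major nested lists)."""
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return u / w, v / w

def ball_window(H, ball_xy, size=512, frame_w=3840, frame_h=2160):
    """Return (left, top, right, bottom) of a crop centred on the ball,
    clamped so the window stays inside the 4K frame."""
    u, v = apply_homography(H, *ball_xy)
    half = size // 2
    left = min(max(int(u) - half, 0), frame_w - size)
    top = min(max(int(v) - half, 0), frame_h - size)
    return left, top, left + size, top + size

# Toy calibration: scale pitch metres to pixels plus an offset (placeholder).
H = [[30.0, 0.0, 100.0],
     [0.0, 30.0, 50.0],
     [0.0, 0.0, 1.0]]
box = ball_window(H, (10.0, 5.0))
```

Only the pixels inside `box` would then be fed to the detector and pose estimator, which is what makes processing multi-view 4K streams tractable.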
The details of a soccer match can be estimated from visual and audio sequences, which correspond to the occurrence of important scenes; these sequences are therefore well suited to important-scene detection. In this paper, a new multimodal method for important-scene detection from visual and audio sequences in far-view soccer videos based on a single deep neural architecture is presented. A unique point of our method is that multiple classifiers are realized by a single deep neural architecture comprising a Convolutional Neural Network-based feature extractor and a Support Vector Machine-based classifier. This approach solves the problem that multiple different deep neural architectures cannot be optimized simultaneously from a small amount of training data. We then monitor the confidence measures output by this architecture for the multimodal data and integrate them to obtain the final classification result.
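The final integration step might look like the following sketch, where the per-modality confidences are SVM-style decision values and the weighting scheme is an illustrative assumption (the paper's exact integration rule may differ).

```python
# Sketch: fuse per-modality confidence scores into one decision.
# The weights and threshold are illustrative assumptions, not the
# integration rule from the paper.

def fuse_confidences(visual_conf, audio_conf, w_visual=0.6, w_audio=0.4):
    """Weighted sum of SVM-style decision values from each modality."""
    return w_visual * visual_conf + w_audio * audio_conf

def is_important_scene(visual_conf, audio_conf, threshold=0.0):
    """Classify a scene as important if the fused confidence clears threshold."""
    return fuse_confidences(visual_conf, audio_conf) > threshold

important = is_important_scene(1.2, -0.5)   # visual strongly positive
quiet = is_important_scene(-0.8, 0.1)       # visual negative, weak audio
```

The point of fusing at the confidence level is that a weak signal in one modality can be overruled by a strong signal in the other.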
This paper presents a novel method of estimating temporal offsets between unsynchronized multi-view videos. When synchronizing multiple cameras scattered across a large area with wide baselines (e.g., a sports stadium or an event hall), conventional epipolar-based approaches sometimes fail because robust point correspondences are difficult to obtain. In such cases, 2D projections of human joints can be robustly associated with each other even in wide-baseline videos and can be used as corresponding points. However, the detected 2D poses generally include detection errors that cause estimation failures. To address these problems, we introduce the motion rhythm of 2D human joints as a cue for synchronization. The proposed method detects motion rhythms from the videos and estimates the temporal offset at which the motion rhythms are best harmonized. Moreover, we propose a hybrid synchronization algorithm to achieve sub-frame precision. We demonstrate our method's performance on indoor and outdoor data.
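The frame-level offset search can be sketched as follows, assuming the motion rhythms have already been detected and reduced to binary per-frame sequences (1 = a motion peak in that frame); the sub-frame refinement of the hybrid algorithm is omitted.

```python
# Sketch: estimate a frame offset between two cameras by best-matching their
# binary "motion rhythm" sequences. Rhythm detection itself is assumed done.

def rhythm_agreement(a, b, offset):
    """Count frames where both rhythms have a peak after shifting b by offset."""
    hits = 0
    for i, va in enumerate(a):
        j = i + offset
        if 0 <= j < len(b) and va == b[j] == 1:
            hits += 1
    return hits

def estimate_offset(a, b, max_offset=10):
    """Return the offset (in frames) maximizing rhythm agreement."""
    return max(range(-max_offset, max_offset + 1),
               key=lambda off: rhythm_agreement(a, b, off))

# Toy rhythms: camera B lags camera A by 3 frames.
cam_a = [0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0]
cam_b = [0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1]
offset = estimate_offset(cam_a, cam_b)
```

Because the cue is the timing of motion peaks rather than appearance, this search tolerates the pose-detection errors that defeat point-based correspondence.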
An interpretable convolutional neural network (CNN) with attribute estimation for image classification is presented in this paper. Although CNNs perform highly accurate image classification, the reasons behind the classification results they produce are not clear. To provide interpretation of CNNs, the proposed method estimates attributes, which describe elements of objects, in an intermediate layer of the network. This improves the interpretability of CNNs and is the main contribution of this paper. Furthermore, the proposed method uses the estimated attributes for image classification in order to enhance its accuracy. Consequently, the proposed method not only provides interpretation of CNNs but also improves the performance of image classification.
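The architecture idea can be sketched as follows, with plain dense layers and placeholder weights standing in for the CNN: an attribute head reads the intermediate features, and the final classifier consumes the features concatenated with the estimated attributes.

```python
# Sketch: an "attribute head" reads intermediate features, and the final
# classifier consumes features + estimated attributes. All weights are
# illustrative placeholders, not a trained CNN.

def linear(weights, bias, x):
    """One dense layer: weights is a list of rows, one row per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def predict(features, attr_w, attr_b, cls_w, cls_b):
    """Estimate binary attributes from intermediate features, then classify
    on the concatenation of features and attributes."""
    attributes = [1.0 if a > 0 else 0.0
                  for a in linear(attr_w, attr_b, features)]
    logits = linear(cls_w, cls_b, features + attributes)
    label = max(range(len(logits)), key=logits.__getitem__)
    return label, attributes

# Toy setup: 3 intermediate features, 2 attributes, 2 classes.
feats = [0.5, -0.2, 0.9]
attr_w = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
attr_b = [0.0, -1.0]
cls_w = [[1.0, 0.0, 0.0, 2.0, 0.0],
         [0.0, 1.0, 0.0, 0.0, 2.0]]
cls_b = [0.0, 0.0]
label, attrs = predict(feats, attr_w, attr_b, cls_w, cls_b)
```

The returned attribute vector is what makes the prediction inspectable: a human can check which object elements the network claims to have seen.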