ITE Transactions on Media Technology and Applications
Online ISSN : 2186-7364
ISSN-L : 2186-7364
Latest issue
Special Section on 3D Media Technology 2026
Special Section on Fast-track Review
  • Shingo Ando
    2026, Vol. 14, No. 1, p. 92
    Published: 2026
    Released online: 2026/01/01
    Journal, free access
  • Yuki Rogi, Kota Yoshida, Ayaka Banno, Takeshi Fujino, Shunsuke Okura
    2026, Vol. 14, No. 1, p. 93-101
    Published: 2026
    Released online: 2026/01/01
    Journal, free access

    With the spread of IoT technology, expectations for edge AI are growing, and security and recovery from attacks are important for its further development. One attack on edge AI is the adversarial example (AE) attack, which artificially causes false recognition by adding a perturbation to the input. As one countermeasure, a defense method has been proposed that removes the adversarial perturbation by adding disturbance noise and then applying a denoising autoencoder (DAE). In this paper, we first show that the effectiveness of this defense method is low when the perturbation noise is generated from a predictable pseudorandom source. Next, we propose a defense method based on the unpredictable pixel reset noise of a CMOS image sensor, together with a pre-processing step that enhances the randomness of the perturbation noise. Simulation results confirm that the defense performance against AE attacks improves by approximately 30%.
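
    The predictability issue raised in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the noise level `sigma`, and the seed are hypothetical, the DAE stage is omitted, and operating-system entropy is used only as a stand-in for the pixel reset noise of a CMOS image sensor.

```python
import random

def pseudorandom_noise(seed, n, sigma=0.1):
    """Gaussian noise from a seeded PRNG (hypothetical helper).
    Anyone who knows the seed can regenerate the same sequence."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, sigma) for _ in range(n)]

def unpredictable_noise(n, sigma=0.1):
    """Gaussian noise drawn from OS entropy. This is only a stand-in
    for physical sensor noise, which the abstract does not specify."""
    rng = random.SystemRandom()
    return [rng.gauss(0.0, sigma) for _ in range(n)]

def add_noise(pixels, noise):
    """Add noise to pixel values in [0, 1] and clip to that range."""
    return [min(1.0, max(0.0, p + e)) for p, e in zip(pixels, noise)]

# Defender and attacker using the same seed obtain identical noise,
# so the attacker can account for it when crafting a perturbation.
defender = pseudorandom_noise(seed=1234, n=8)
attacker = pseudorandom_noise(seed=1234, n=8)
predictable = defender == attacker

image = [0.2, 0.4, 0.6, 0.8, 0.1, 0.3, 0.5, 0.7]
noisy_image = add_noise(image, unpredictable_noise(len(image)))
```

    In this sketch `predictable` is true because a seeded generator is deterministic, while two calls to `unpredictable_noise` are not reproducible by an outside party.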

  • Keiichiro Kuroda, Yudai Morikaku, Yu Osuka, Ryoya Iegaki, Ryuichi Ujii ...
    2026, Vol. 14, No. 1, p. 102-109
    Published: 2026
    Released online: 2026/01/01
    Journal, free access

    Anticipating the rise of the Internet of Things (IoT) era, we have proposed an object detection framework that employs a CMOS image sensor with binary feature extraction to reduce power requirements. First, we present a lightweight deep neural network for the feature data based on YOLOv7. It is comparable to YOLOv7-tiny in the number of parameters and FLOPs, yet it improves large-object recognition accuracy (APL50) by 6.6%. Our approach also achieves a 48.8% reduction in GPU power consumption compared to YOLOv7. In addition, we introduce an on-chip signal processing method for the binary feature data. The proposed method achieves a compression rate of 64.1% and increases GPU power consumption by only 14.9% during the decoding process preceding object detection. Furthermore, the size of the 1-bit feature data is reduced by 96.0%, and object recognition accuracy is improved by 4.0% relative to 1-bit RGB color images.
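
    The abstract does not say which compression scheme the on-chip processing uses. As a rough illustration of compressing 1-bit feature data and decoding it before detection, the sketch below uses run-length encoding, which is a swapped-in technique chosen for simplicity and not the method of the paper. The function names and the sample row are hypothetical.

```python
def rle_encode(bits):
    """Run-length encode a sequence of 0/1 values into (value, count) pairs."""
    runs = []
    if not bits:
        return runs
    current, count = bits[0], 0
    for b in bits:
        if b == current:
            count += 1
        else:
            runs.append((current, count))
            current, count = b, 1
    runs.append((current, count))
    return runs

def rle_decode(runs):
    """Expand (value, count) pairs back into the original bit sequence."""
    bits = []
    for value, count in runs:
        bits.extend([value] * count)
    return bits

# Hypothetical row of a binary feature map with long uniform runs.
row = [0] * 12 + [1] * 5 + [0] * 15
encoded = rle_encode(row)
decoded = rle_decode(encoded)   # decoding step that would precede detection
```

    The round trip is lossless, so the detector would see the same binary features as before encoding. How much such a scheme saves depends entirely on the run structure of the data.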

  • Jinlong Zhu, Keigo Sakurai, Ren Togo, Takahiro Ogawa, Miki Haseyama
    2026, Vol. 14, No. 1, p. 110-118
    Published: 2026
    Released online: 2026/01/01
    Journal, free access

    We propose a novel text-controllable polyphonic symbolic music generation method based on diffusion models. Symbolic music generation has garnered significant attention due to its flexibility and seamless integration with Digital Audio Workstations (DAWs), as it enables the generation of MIDI files, facilitating easier modification compared to waveform music. Although existing techniques enable control through chords or other metadata, few methods allow intuitive control via text prompts, which better align with user preferences. To address this limitation, we introduce Text-Controllable Polyphonic Symbolic Music Generation (TPSMG), a diffusion model specifically designed for text-conditioned symbolic music generation. Our approach incorporates a text condition module into a U-Net backbone within a Denoising Diffusion Probabilistic Model. This module translates text prompts into embeddings that steer the denoising process, thereby enabling precise, text-based control over music generation. Experimental results demonstrate that our method generates high-quality polyphonic symbolic music outputs that closely reflect the intended textual input.
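
    For readers unfamiliar with Denoising Diffusion Probabilistic Models, the forward noising step that such a model learns to invert can be sketched as follows. This is a generic illustration and not the TPSMG implementation: the linear beta schedule, its endpoints, the number of steps, and the toy input vector are all assumptions, and the text-conditioned U-Net denoiser is omitted.

```python
import math
import random

def linear_betas(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule (assumed here; the abstract names no schedule)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

def alpha_bars(betas):
    """Cumulative products of (1 - beta_t), one entry per time step."""
    out, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        out.append(prod)
    return out

def q_sample(x0, t, abar, rng):
    """Draw x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps, eps ~ N(0, I).
    Returns both x_t and eps; eps is what a denoiser is trained to predict."""
    signal = math.sqrt(abar[t])
    noise = math.sqrt(1.0 - abar[t])
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [signal * x + noise * e for x, e in zip(x0, eps)]
    return xt, eps

abar = alpha_bars(linear_betas(1000))
x0 = [0.5, -1.0, 0.25]          # toy stand-in for a symbolic music tensor
xt, eps = q_sample(x0, 500, abar, random.Random(0))
```

    In a text-conditioned model, the denoiser would receive `xt`, the step index, and a text embedding, and would be trained to predict `eps`. That conditioning path is the part the abstract describes and this sketch leaves out.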

Regular Section