Toward a Method for Reconstructing Brain Representations of Visual Experience Using Diffusion Models

石﨑 文都; 小林 一郎

doi:10.14864/fss.41.0_656

Abstract

The human brain has sophisticated mechanisms for processing complex information from the external environment. Understanding these mechanisms can significantly contribute to advances in artificial intelligence, particularly in image and speech recognition. Among various approaches, decoding visual experiences is crucial for clarifying how visual information is represented in the brain and for reconstructing perceived images. In this study, we propose a novel method for predicting and reconstructing visual experiences from brain activity using Stable Diffusion, a generative diffusion model. Unlike conventional methods based on text inputs, our approach conditions the image generation on brain signals, offering a new way to understand visual perception and decode perceptual content. Additionally, we incorporate an encoding model into the loss function of the U-Net used for noise prediction. This model employs an LSTM to process noisy images at each time step and encode the corresponding temporal features of brain activity, helping the U-Net align noise prediction with dynamic neural responses.
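The last two sentences of the abstract describe a training objective that combines the U-Net's noise-prediction error with an encoding-model term. The abstract does not give the exact formulation, so the following is only a minimal sketch under stated assumptions: the weight `lam`, the scalar toy LSTM cell, and every function name here are hypothetical illustrations, not the authors' implementation. It shows one plausible way an LSTM could run over features of the noisy images across time steps and contribute a brain-activity prediction error to the loss.

```python
import math


def mse(a, b):
    """Mean squared error between two equal-length sequences of floats."""
    if len(a) != len(b):
        raise ValueError("length mismatch")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h, c, w):
    """One step of a scalar LSTM cell (toy stand-in for the encoding model).

    w maps each gate name ("i", "f", "o", "g") to a tuple of
    (input weight, recurrent weight, bias). All values are hypothetical.
    """
    def gate(name, activation):
        wx, wh, b = w[name]
        return activation(wx * x + wh * h + b)

    i = gate("i", sigmoid)      # input gate
    f = gate("f", sigmoid)      # forget gate
    o = gate("o", sigmoid)      # output gate
    g = gate("g", math.tanh)    # candidate cell state
    c_new = f * c + i * g
    h_new = o * math.tanh(c_new)
    return h_new, c_new


def encode_brain_activity(noisy_features, w):
    """Run the toy LSTM over one feature per diffusion time step.

    Returns the hidden state at each step, used here as the predicted
    brain-activity feature for that step (an assumption of this sketch).
    """
    h, c = 0.0, 0.0
    outputs = []
    for x in noisy_features:
        h, c = lstm_step(x, h, c, w)
        outputs.append(h)
    return outputs


def combined_loss(eps_true, eps_pred, brain_true, brain_pred, lam=0.1):
    """Noise-prediction MSE plus a weighted encoding-model MSE.

    The additive form and the weight lam are assumptions; the abstract
    only says the encoding model is incorporated into the U-Net loss.
    """
    return mse(eps_true, eps_pred) + lam * mse(brain_true, brain_pred)
```

In an actual system the scalar features would be replaced by image or latent tensors and the cell by a full LSTM layer; the sketch only illustrates how the two error terms might be combined.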
