In this study, we propose BioVL2, an egocentric biochemical video-and-language dataset comprising 32 videos in total (eight recordings of each of four experiments), with a total duration of 2.5 hours. Each video is paired with its protocol, and two types of linguistic annotations are provided: (1) alignments between the video and the protocol text, and (2) bounding boxes that link objects appearing in the video to their mentions in the protocol. As an application of the BioVL2 dataset, we consider the task of generating a protocol from an experimental video. Our experimental results show that the proposed system generates better protocols than a weak baseline that simply outputs the objects appearing in the video frames. The BioVL2 dataset will be released for research purposes only.
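To make the two annotation types concrete, the following minimal Python sketch shows one possible way such records could be organized; the class and field names (StepAlignment, ObjectBox, BioVLSample) are hypothetical illustrations and are not the released file format of the dataset.

from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class StepAlignment:
    """Aligns one protocol step to a video segment (hypothetical schema)."""
    step_index: int   # index of the step sentence in the protocol
    start_sec: float  # segment start time in the video
    end_sec: float    # segment end time in the video

@dataclass
class ObjectBox:
    """Links an object mentioned in the protocol to a box in a video frame."""
    object_name: str                     # object mention from the protocol text
    frame_sec: float                     # timestamp of the annotated frame
    box_xyxy: Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels

@dataclass
class BioVLSample:
    """One experiment video with its protocol and both annotation types."""
    video_path: str
    protocol_steps: List[str]
    alignments: List[StepAlignment] = field(default_factory=list)
    object_boxes: List[ObjectBox] = field(default_factory=list)

Under this assumed schema, the protocol-generation task takes only video_path as input and is evaluated against protocol_steps, while alignments and object_boxes serve as supervision or analysis signals.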