Development of a Japanese Caption Generation System Using Deep Learning

小林 豊; 鈴木 諒; 谷津 元樹; 原田 実

doi:10.11517/jsaisigtwo.2017.AM-17_04

Abstract

With the aim of building a dialogue system that converses after visually understanding its surroundings, we developed Deep Watcher, a Japanese caption generation system, together with image datasets annotated with captions. Captions are generated with the Show and Tell model, which combines a CNN and an LSTM. We manually evaluated the coincidence rate of the caption content and of five feature items. The coincidence rate of the generated caption content was 41.6%; among the feature items, gender scored highest at 86.9%. The coincidence rate of the caption content was not high because of overfitting, but the result for the gender feature item suggests that the system could be applied to a dialogue system.
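The coincidence rates reported above can be read as the fraction of manual checks in which the generated caption agreed with the image. The following is a minimal sketch of that calculation under this assumption, not the authors' evaluation procedure. The function name `coincidence_rates`, the item label `caption_content`, and the toy judgments are hypothetical and are not data from the paper.

```python
from collections import defaultdict


def coincidence_rates(judgments):
    """Return the fraction of matching judgments for each evaluated item.

    `judgments` is an iterable of (item, matched) pairs, one pair per
    manual check; `matched` is True when the annotator judged that the
    caption agreed with the image for that item.
    """
    matched = defaultdict(int)
    total = defaultdict(int)
    for item, ok in judgments:
        total[item] += 1
        if ok:
            matched[item] += 1
    return {item: matched[item] / total[item] for item in total}


# Hypothetical toy judgments for illustration only (not the paper's data).
toy_judgments = [
    ("caption_content", True),
    ("caption_content", False),
    ("gender", True),
    ("gender", True),
    ("gender", False),
]

rates = coincidence_rates(toy_judgments)
for item, rate in rates.items():
    print(f"{item}: {rate:.1%}")
```

Keeping one rate per item, rather than a single pooled score, mirrors how the abstract reports caption content and the gender feature item separately.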
