Journal of the Japan Society for Precision Engineering
Online ISSN : 1882-675X
Print ISSN : 0912-0289
ISSN-L : 0912-0289
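Paper
Data Augmentation for Caption Generation by Cropping and Masking Using an Attention Mechanism
Kiyohiko Iwamura, Jun Younes Louhi Kasahara, Alessandro Moro, Atsushi Yamashita, Hajime Asama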

2020, Vol. 86, No. 11, pp. 904-910

Abstract

Automatic image captioning has various important applications, such as describing visual content for the visually impaired. Most approaches use deep learning and have achieved remarkable results. However, several issues remain unresolved. One of them is the overfitting of the trained model to specific images, usually caused by the limited size of the training dataset. To enlarge the training dataset in such scenarios, previous studies proposed data augmentation using random cropping or masking. However, these methods do not specifically target overfitted regions in images and may therefore remove areas that are needed to generate captions, lowering performance. In this study, we propose a novel data augmentation method that uses attention to specifically target the image regions subject to overfitting. Experimental results show that the proposed method enables the generation of better image captions.
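The abstract does not give implementation details, so the following is only a minimal sketch of what attention-guided cropping and masking could look like, not the authors' method. It assumes a coarse attention grid taken from a captioning model, treats cells at or above a fraction of the peak weight as the attended region, and either blanks that region or crops to it. The function names (`attended_box`, `mask_region`, `crop_region`), the `threshold` rule, and the choice of a zero fill value are all hypothetical.

```python
from typing import List, Tuple

Grid = List[List[float]]          # image or attention map as nested lists
Box = Tuple[int, int, int, int]   # (top, left, bottom, right), in pixels


def attended_box(attention: Grid, image_h: int, image_w: int,
                 threshold: float = 0.5) -> Box:
    """Bounding box, in pixels, around the high-attention cells.

    Assumption (not from the paper): a cell is "high attention" when its
    weight is at least `threshold` times the largest weight in the grid.
    """
    rows, cols = len(attention), len(attention[0])
    peak = max(max(row) for row in attention)
    hot = [(r, c) for r in range(rows) for c in range(cols)
           if attention[r][c] >= threshold * peak]
    r0 = min(r for r, _ in hot)
    r1 = max(r for r, _ in hot) + 1
    c0 = min(c for _, c in hot)
    c1 = max(c for _, c in hot) + 1
    # Scale attention-grid coordinates up to pixel coordinates.
    return (r0 * image_h // rows, c0 * image_w // cols,
            r1 * image_h // rows, c1 * image_w // cols)


def mask_region(image: Grid, box: Box, fill: float = 0.0) -> Grid:
    """Return a copy of `image` with the pixels inside `box` set to `fill`."""
    top, left, bottom, right = box
    return [[fill if top <= y < bottom and left <= x < right else px
             for x, px in enumerate(row)]
            for y, row in enumerate(image)]


def crop_region(image: Grid, box: Box) -> Grid:
    """Return only the pixels of `image` that fall inside `box`."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]
```

In this sketch, each training image would yield extra samples (one masked, one cropped) that share the original caption. Whether the attended region should be hidden or kept, and how the attention map is obtained, are design choices the abstract leaves open.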

© 2020 The Japan Society for Precision Engineering