2021 Volume E104.D Issue 5 Pages 752-761
An encoder-decoder (Enc-Dec) model is one of the fundamental architectures in many computer vision applications. A desirable property of a trained Enc-Dec model is the ability to encode (and decode) diverse input patterns well. To obtain such a model, we propose a simple method called curiosity-guided fine-tuning (CurioFT), which puts more weight on uncommon input patterns without explicitly knowing their frequency. In an experiment on future frame generation with the CUHK Avenue dataset, CurioFT reduced the mean squared error by 7.4% for anomalous scenes, 4.8% for common scenes, and 6.6% overall. Additional experiments with the UCSD dataset further supported the validity of the proposed method.
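The abstract does not state how CurioFT assigns its weights, so the following is only a minimal sketch of the general idea of emphasizing uncommon inputs without counting their frequency. It assumes, purely for illustration, that a sample's current reconstruction error can stand in for how unfamiliar it is to the model. The function names (`per_sample_mse`, `error_weights`, `weighted_loss`) and the `temperature` parameter are hypothetical and do not come from the paper.

```python
def per_sample_mse(pred, target):
    """Mean squared error of each sample; every sample is a flat list of floats."""
    return [
        sum((p - t) ** 2 for p, t in zip(ps, ts)) / len(ps)
        for ps, ts in zip(pred, target)
    ]


def error_weights(errors, temperature=1.0):
    """Hypothetical weighting rule (an assumption, not the paper's):
    weight each sample by error**temperature, normalized to sum to 1."""
    powered = [e ** temperature for e in errors]
    total = sum(powered)
    if total == 0.0:
        # Every sample is reconstructed perfectly: fall back to uniform weights.
        return [1.0 / len(errors)] * len(errors)
    return [p / total for p in powered]


def weighted_loss(pred, target, temperature=1.0):
    """Error-weighted loss: poorly reconstructed samples contribute more."""
    errors = per_sample_mse(pred, target)
    weights = error_weights(errors, temperature)
    return sum(w * e for w, e in zip(weights, errors))


# Toy batch: two well-reconstructed samples and one poorly reconstructed one.
target = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
pred = [[0.1, 0.1], [0.1, 0.1], [1.0, 1.0]]

errors = per_sample_mse(pred, target)
weights = error_weights(errors)
plain_mean = sum(errors) / len(errors)
emphasized = weighted_loss(pred, target)
```

In this toy batch the third sample receives the largest weight, so `emphasized` is larger than `plain_mean`. With `temperature=0` the weights become uniform and the weighted loss reduces to the ordinary mean squared error over samples.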