2025, Volume 16, Issue 3, pp. 681-690
In this paper, we investigate how corpus size affects genre classification of modern Japanese literary works with the CBOW model. When word vectors are constructed, a model trained on a large number of sentences can be expected to represent the meanings of words more accurately than a model trained on fewer sentences. More accurate semantic representations of words should, in turn, lead to more accurate genre classification of modern Japanese literary works. We therefore examine this dependency experimentally. In computer experiments, we address two classification problems: novel versus poetry, and novel versus essay. In both problems, contrary to our expectation, the word vector representation produced by the CBOW model trained on the large corpus gives the worst classification accuracy. This result suggests that the variety of words in a large Japanese corpus causes the characteristic features of modern Japanese literary words to disappear from the semantic representation.