Host : The Japanese Society for Artificial Intelligence
Name : 34th Annual Conference, 2020
Number : 34
Location : Online
Date : June 09, 2020 - June 12, 2020
Author identification from text has been studied for a long time. For Japanese texts, researchers have tried a variety of approaches that focus on features such as the distribution of part-of-speech n-grams and the distribution of characters, and have used classification models such as random forests and neural networks. In this paper, I focus on Doc2Vec, proposed in 2014, and BERT, proposed in 2018, and perform supervised learning using these models together with neural networks. I downloaded the works used as training and test data from "Aozora Bunko", converted them into numerical vectors using Doc2Vec, and used these vectors as the input to the neural network. Trained for multinomial classification, the models achieved an accuracy of 84.89% with Doc2Vec and 55.43% with BERT.
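The classification step described in the abstract (document vectors in, one author label out) can be illustrated with a minimal sketch. It is not the implementation used in the paper. It assumes that each work has already been converted to a fixed-length vector, and it replaces the neural network with a single softmax layer trained by stochastic gradient descent. The two-dimensional vectors, the three author labels, and all function names are hypothetical and chosen only to keep the example small.

```python
import math
import random


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def class_scores(weights, biases, x):
    """Linear score of vector x for every class (author)."""
    return [sum(w_j * x_j for w_j, x_j in zip(w, x)) + b
            for w, b in zip(weights, biases)]


def train(vectors, labels, num_classes, epochs=200, lr=0.5, seed=0):
    """Fit a single softmax layer with SGD on the cross-entropy loss."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    weights = [[rng.uniform(-0.01, 0.01) for _ in range(dim)]
               for _ in range(num_classes)]
    biases = [0.0] * num_classes
    for _ in range(epochs):
        for x, y in zip(vectors, labels):
            probs = softmax(class_scores(weights, biases, x))
            for c in range(num_classes):
                # Gradient of cross-entropy w.r.t. the score of class c.
                grad = probs[c] - (1.0 if c == y else 0.0)
                for j in range(dim):
                    weights[c][j] -= lr * grad * x[j]
                biases[c] -= lr * grad
    return weights, biases


def predict(weights, biases, x):
    """Return the index of the highest-scoring class."""
    scores = class_scores(weights, biases, x)
    return max(range(len(scores)), key=scores.__getitem__)


def accuracy(weights, biases, vectors, labels):
    """Fraction of vectors whose predicted class matches the label."""
    hits = sum(predict(weights, biases, x) == y
               for x, y in zip(vectors, labels))
    return hits / len(labels)


# Hypothetical toy data: 2-D stand-ins for document vectors of 3 authors.
train_vectors = [[1.0, 0.1], [0.9, -0.1],
                 [0.1, 1.0], [-0.1, 0.9],
                 [-1.0, -0.9], [-0.9, -1.0]]
train_labels = [0, 0, 1, 1, 2, 2]
test_vectors = [[0.8, 0.0], [0.0, 0.8], [-0.8, -0.8]]
test_labels = [0, 1, 2]

weights, biases = train(train_vectors, train_labels, num_classes=3)
test_accuracy = accuracy(weights, biases, test_vectors, test_labels)
print(f"test accuracy: {test_accuracy:.2%}")
```

In a real experiment the toy vectors would be replaced by the embeddings of the downloaded works, and accuracy would be measured on held-out works, which is how figures such as the 84.89% and 55.43% reported above are normally computed.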