Proceedings of the Annual Conference of JSAI
Online ISSN : 2758-7347
33rd (2019)
Session ID : 4Rin1-13

Constructing of the word embedding model by Japanese large scale SNS + Web corpus
*Shogo MATSUNO, Sakae MIZUKI, Takeshi SAKAKI
CONFERENCE PROCEEDINGS FREE ACCESS

Abstract

In this paper, we present a word embedding model built from Japanese text on social networking services (SNS), including Twitter. The model is trained on a large-scale Japanese corpus that combines several types of media: SNS data, Wikipedia, and Web pages. We evaluated the model on a word similarity task, using Spearman's rank correlation coefficient as the evaluation metric, and it scored about 7 points higher than a model trained on Wikipedia alone. We plan to release the model through the website, and we hope that it will help make natural language processing research on SNS data more active.
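
The evaluation described in the abstract, a word similarity task scored with Spearman's rank correlation coefficient, can be sketched roughly as follows. This is only a minimal illustration, not the authors' evaluation code: the function names, the toy two-dimensional vectors, and the human ratings are all hypothetical, and cosine similarity is assumed as the model's similarity score because the abstract does not specify one.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def evaluate(embeddings, gold_pairs):
    """Correlate model similarities with human ratings.

    Pairs containing a word missing from the vocabulary are skipped
    (an assumption; other handling is possible).
    """
    model_scores, human_scores = [], []
    for w1, w2, rating in gold_pairs:
        if w1 in embeddings and w2 in embeddings:
            model_scores.append(cosine(embeddings[w1], embeddings[w2]))
            human_scores.append(rating)
    return spearman(model_scores, human_scores)


# Hypothetical toy data, for illustration only.
toy_embeddings = {
    "cat": [1.0, 0.0],
    "dog": [0.9, 0.1],
    "car": [0.0, 1.0],
    "bus": [0.2, 0.8],
}
toy_gold = [
    ("cat", "dog", 9.0),
    ("car", "bus", 8.5),
    ("cat", "car", 1.0),
    ("dog", "bus", 2.0),
]
rho = evaluate(toy_embeddings, toy_gold)
```

Because Spearman's coefficient depends only on how the pairs are ordered, it rewards a model for ranking word pairs the way human annotators do, regardless of the absolute scale of the similarity scores.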

© 2019 The Japanese Society for Artificial Intelligence