Proceedings of the Annual Conference of JSAI
Online ISSN : 2758-7347
34th (2020)
Session ID : 1D3-GS-13-05

Creation of a Japanese SDGs dataset and a baseline model of classification
*XIN ZHANG, YUSUKE MOTOKI, YUYA SONEOKA, YUSUKE IWASAWA, YUTAKA MATSUO
CONFERENCE PROCEEDINGS FREE ACCESS

Abstract

Natural language processing work targeting the Sustainable Development Goals (SDGs), which have begun to influence social structures and corporate philosophy, has recently started to appear. In Japanese, however, such efforts have been difficult because of a lack of language resources. In this study, we collected Japanese SDGs-related data from materials published by universities, created a dataset, and constructed an SDGs classification model. Two data augmentation methods were used: (1) part-of-speech-based replacement using the BERT masked language model, and (2) back-translation, in which an English translation produced with Google Translate is translated into Japanese again. Classification was performed using classical machine learning methods, such as topic models (e.g., LDA), and deep learning models such as BERT. The results show the effect of augmentation in the small-data setting: relatively high accuracy is achieved with a small amount of data.
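
The two augmentation ideas named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract gives no code, model names, or parameters, so every identifier below (`mask_replace`, `back_translate`, `toy_fill_mask`, `toy_ja_to_en`, `toy_en_to_ja`) is hypothetical. The masked language model and the translation service are passed in as plain callables, and the toy stand-ins exist only so the sketch runs without a model or network access.

```python
import random
from typing import Callable, List, Sequence

MASK = "[MASK]"  # placeholder token; assumed here, as in BERT-style vocabularies


def mask_replace(
    tokens: Sequence[str],
    pos_tags: Sequence[str],
    target_pos: str,
    fill_mask: Callable[[List[str]], List[str]],
    rng: random.Random,
) -> List[str]:
    """Mask one token with the given part of speech and substitute a prediction.

    `fill_mask` is an assumed interface: it receives the masked token list and
    returns candidate replacement tokens, best first.
    """
    positions = [i for i, tag in enumerate(pos_tags) if tag == target_pos]
    if not positions:
        return list(tokens)  # nothing to replace
    i = rng.choice(positions)
    masked = list(tokens)
    masked[i] = MASK
    for candidate in fill_mask(masked):
        if candidate != tokens[i]:  # skip predictions identical to the original
            result = list(tokens)
            result[i] = candidate
            return result
    return list(tokens)


def back_translate(
    text: str,
    ja_to_en: Callable[[str], str],
    en_to_ja: Callable[[str], str],
) -> str:
    """Translate Japanese text to English and back to obtain a paraphrase."""
    return en_to_ja(ja_to_en(text))


# Toy stand-ins (assumptions for illustration only, not real models).
def toy_fill_mask(masked_tokens: List[str]) -> List[str]:
    return ["教育", "環境"]


def toy_ja_to_en(text: str) -> str:
    return {"貧困をなくそう": "no poverty"}[text]


def toy_en_to_ja(text: str) -> str:
    return {"no poverty": "貧困のない世界"}[text]


tokens = ["質", "の", "高い", "教育"]
tags = ["NOUN", "ADP", "ADJ", "NOUN"]
augmented = mask_replace(tokens, tags, "NOUN", toy_fill_mask, random.Random(0))
paraphrase = back_translate("貧困をなくそう", toy_ja_to_en, toy_en_to_ja)
```

In practice the stand-ins would be swapped for a real masked language model and a real translation service. Each augmented sentence keeps the SDG label of the sentence it was derived from.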

© 2020 The Japanese Society for Artificial Intelligence