Host : The Japanese Society for Artificial Intelligence
Name : The 36th Annual Conference of the Japanese Society for Artificial Intelligence
Number : 36
Location : [in Japanese]
Date : June 14, 2022 - June 17, 2022
This paper compares three cross-lingual sentiment analysis methods that use various Bidirectional Encoder Representations from Transformers (BERT) models to predict product review ratings for low-resource languages. Sentiment analysis here is the task of predicting a review's rating. Cross-lingual approaches have been studied because prediction accuracy tends to be low when training data for a low-resource language are scarce. Meanwhile, approaches that fine-tune pre-trained models have recently attracted attention; in particular, BERT has achieved high performance on a variety of tasks. We therefore compared cross-lingual methods that use various BERT models with a large amount of English data and a small amount of Japanese data to improve the accuracy of Japanese review-rating prediction. We compared three methods: method A, which fine-tunes an English BERT with English reviews and translates the Japanese test data into English; method B, which fine-tunes a Japanese BERT with English reviews translated into Japanese; and method C, which fine-tunes a multilingual BERT with English-Japanese parallel data. All translation was performed by machine translation. Compared to a baseline that fine-tunes a Japanese BERT with only the small amount of Japanese data, method B improved performance, while methods A and C degraded it.
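The three methods differ mainly in which BERT model is fine-tuned and on which side of the language pair machine translation is applied. The sketch below illustrates that data-flow difference only; every function and string here is an illustrative assumption, not the authors' actual code, and `translate` is a stub standing in for the machine translation step.

```python
# Hedged sketch of the training-data pipelines for methods A, B, C and the
# baseline described in the abstract. Names are hypothetical.

def translate(text: str, src: str, tgt: str) -> str:
    """Stub for machine translation (in practice, an MT system)."""
    return f"[{src}->{tgt}] {text}"

def build_training_data(en_reviews, ja_reviews, method: str):
    """Return (training_texts, note) for a given method.

    A: fine-tune English BERT on English reviews; at prediction time,
       the Japanese test data would be translated into English.
    B: translate English reviews into Japanese and fine-tune Japanese
       BERT on them together with the small Japanese set.
    C: fine-tune multilingual BERT on English-Japanese parallel data
       (here assumed to be the English reviews plus their translations).
    baseline: fine-tune Japanese BERT on the small Japanese set only.
    """
    if method == "A":
        return en_reviews, "English BERT; translate Ja test data to En"
    if method == "B":
        train = [translate(r, "en", "ja") for r in en_reviews] + ja_reviews
        return train, "Japanese BERT"
    if method == "C":
        parallel = en_reviews + [translate(r, "en", "ja") for r in en_reviews]
        return parallel + ja_reviews, "multilingual BERT"
    return ja_reviews, "baseline: Japanese BERT, Japanese data only"
```

The returned texts would then be paired with their ratings and passed to a standard fine-tuning loop for the corresponding BERT variant.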