Proceedings of the Annual Conference of JSAI
Online ISSN : 2758-7347
36th (2022)
Session ID : 3P4-GS-2-04

Construction of Japanese BERT with Fixed Token Embeddings
*Arata SUGANAMI, Hiroyuki SHINNOU
Abstract

In this paper, we propose constructing a Japanese BERT with fixed token embeddings in order to reduce BERT's construction time. Specifically, we learn word embeddings in advance using word2vec and then fix them as BERT's token embeddings. In the experiments, we constructed a 1024-dimensional, 4-layer Japanese BERT with both the conventional method and the proposed method, and verified the effectiveness of the proposed method by comparing model construction time and accuracy on a document classification task for Japanese news articles. The experimental results show that the proposed method reduces construction time by 2.5%, improves accuracy, and reaches that accuracy earlier in training.
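The abstract describes initializing BERT's token embedding table from word2vec vectors trained in advance and keeping it fixed during pre-training. The sketch below illustrates one way this could be set up with PyTorch, gensim, and Hugging Face Transformers; the file name, vocabulary handling, and attention-head count are illustrative assumptions, not the authors' actual configuration.

```python
# Sketch: fixing BERT token embeddings initialized from pre-trained word2vec vectors.
# File path and vocabulary mapping are hypothetical; only the 1024-dim, 4-layer
# setting follows the abstract.
import torch
from gensim.models import KeyedVectors
from transformers import BertConfig, BertForMaskedLM

# Word embeddings learned in advance with word2vec (hypothetical file).
w2v = KeyedVectors.load_word2vec_format("ja_word2vec_1024d.bin", binary=True)

# 1024-dimensional, 4-layer BERT, as in the paper's experiments.
config = BertConfig(
    vocab_size=len(w2v.key_to_index),
    hidden_size=1024,
    num_hidden_layers=4,
    num_attention_heads=16,
    intermediate_size=4096,
)
model = BertForMaskedLM(config)

# Copy the word2vec vectors into BERT's token embedding table
# (here the word2vec vocabulary is assumed to match the tokenizer vocabulary).
with torch.no_grad():
    emb = model.bert.embeddings.word_embeddings
    for idx, token in enumerate(w2v.index_to_key):
        emb.weight[idx] = torch.tensor(w2v[token])

# Fix the token embeddings so they are not updated during pre-training.
emb.weight.requires_grad = False
```

With the embedding table frozen, its gradients are never computed or applied, which is the source of the construction-time reduction reported in the abstract.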

© 2022 The Japanese Society for Artificial Intelligence