2023 Volume 30 Issue 2 Pages 713-747
The meaning and usage of words change over time. One way to analyze these changes is to group word tokens by meaning in each period and compare the usage rates of the resulting groups. Several methods of this kind have been used to analyze semantic change in English, but they have not yet been applied to Japanese, nor have they been compared with one another. Consequently, neither their performance on Japanese nor the conditions under which each method is effective has been clarified. We therefore conducted the following experiments on Japanese words. We applied two grouping methods to the context-dependent vectors produced by BERT, a supervised method based on a dictionary and an unsupervised method based on clustering, and compared them. We also pre-trained BERT on a diachronic corpus and analyzed the diachronic features captured by its context-dependent vectors. The comparison and analysis showed that, in the absence of a well-developed dictionary, the clustering-based method was better able to capture semantic change. Furthermore, fine-tuning with a diachronic corpus made it possible to capture semantic changes in older periods. However, some words with usages that did not appear in the older period could not always be captured.
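The comparison of usage rates described above can be illustrated with a minimal sketch. It assumes each token of a target word has already been assigned a group label (by a dictionary or by clustering) and compares the label distributions of two periods. The Jensen-Shannon divergence is used here only as one plausible comparison measure; the abstract does not state which measure the authors used, and the labels, counts, and function names are hypothetical.

```python
import math
from collections import Counter


def usage_rates(labels, senses):
    """Return the relative frequency of each sense group among the tokens."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    return [counts[s] / len(labels) for s in senses]


def jensen_shannon(p, q):
    """Jensen-Shannon divergence (base 2) between two distributions.

    Assumption: this measure is chosen for illustration only.
    """
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with zero probability in x contribute nothing.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return (kl(p, m) + kl(q, m)) / 2


# Hypothetical group labels for tokens of one word in two periods.
old_period = ["sense_a"] * 8 + ["sense_b"] * 2
new_period = ["sense_a"] * 3 + ["sense_b"] * 7

senses = sorted(set(old_period) | set(new_period))
p_old = usage_rates(old_period, senses)
p_new = usage_rates(new_period, senses)
change_score = jensen_shannon(p_old, p_new)

print(senses, p_old, p_new, round(change_score, 3))
```

With base-2 logarithms the score lies between 0 (identical usage rates) and 1 (no shared groups), so a larger value suggests a larger shift in how the word is used between the two periods.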