Correction of Legal Terms Using the Pretrained Model BERT

山腰 貴大; 駒水 孝裕; 小川 泰弘; 外山 勝彦

doi:10.11517/pjsai.JSAI2020.0_4P3OS805

Abstract

Legal documents contain legal terms that are similar to each other in meaning or pronunciation. Japanese legislation defines their usage on the basis of traditional customs and rules. In accordance with these definitions, such legal terms must be used properly and strictly in statutes. Authors of legal documents in the broad sense, such as contracts and terms of use, are also encouraged to follow the definitions. To assist in writing legal documents, we propose a method that locates inappropriate legal terms in Japanese statutory sentences and suggests corrections. We treat the task as a sentence completion test and solve it with a classifier. Our classifier is based on a BERT model pretrained on a large number of general sentences. To improve performance, we apply three training techniques: domain adaptation, undersampling, and classifier unification. Our experiments show that our classifier outperforms both Random Forest-based and language model-based classifiers.
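
The sentence completion framing and the undersampling step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the term set (及び / 並びに), the `[MASK]` placeholder, and the helper names `make_instances` and `undersample` are assumptions made for illustration, and plain random undersampling is assumed because the abstract does not say which scheme is used. The BERT classifier itself is left out; the sketch only prepares the training instances such a classifier would consume.

```python
import random
from collections import defaultdict

# Illustrative term set (an assumption, not taken from the paper):
# two Japanese conjunctions that both mean "and".
TERM_SET = ["及び", "並びに"]
MASK = "[MASK]"  # hypothetical placeholder for the blanked-out term


def make_instances(sentence, term_set=TERM_SET, mask=MASK):
    """Turn each occurrence of a term into a (masked sentence, correct term) pair."""
    instances = []
    for term in term_set:
        start = sentence.find(term)
        while start != -1:
            masked = sentence[:start] + mask + sentence[start + len(term):]
            instances.append((masked, term))
            start = sentence.find(term, start + len(term))
    return instances


def undersample(instances, seed=0):
    """Randomly drop instances so every label is as frequent as the rarest one."""
    by_label = defaultdict(list)
    for masked, label in instances:
        by_label[label].append((masked, label))
    if not by_label:
        return []
    smallest = min(len(group) for group in by_label.values())
    rng = random.Random(seed)
    balanced = []
    for group in by_label.values():
        balanced.extend(rng.sample(group, smallest))
    return balanced


if __name__ == "__main__":
    instances = make_instances("A及びB並びにC及びD")
    for masked, label in undersample(instances):
        print(label, masked)
```

At correction time, a classifier trained on such instances would predict the term for each blank, and a prediction that differs from the term actually written would be reported as a suggested correction.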
