Proceedings of the Annual Conference of JSAI
Online ISSN : 2758-7347
34th (2020)
Session ID : 3D1-OS-22a-02

Paragraph Segmentation for Novels using BERT with Focal Loss
*Riku IIKURA, Makoto OKADA, Naoki MORI
CONFERENCE PROCEEDINGS FREE ACCESS

Abstract

We address paragraph segmentation as a step toward understanding the content of novels. Estimating the paragraphs of a text can be framed as a binary classification problem: deciding whether the two sentences in question belong to the same paragraph. In this setting, the number of paragraphs is small relative to the number of sentences, so the resulting class imbalance must be taken into account. We applied Bidirectional Encoder Representations from Transformers (BERT), which has shown high accuracy on various natural language processing tasks, to the paragraph segmentation problem, and improved the model's performance by using focal loss as the loss function of the classifier. Experiments on datasets constructed for this work confirmed the effectiveness of the proposed model. In addition, expanding the range of input sentences given to the model improved the value of each evaluation metric.
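
The following is a minimal sketch of the two ideas in the abstract: turning a text into sentence-pair classification examples, and scoring predictions with focal loss. It is not the authors' implementation. The helper names `make_pairs` and `focal_loss`, the use of adjacent sentences as pairs, the convention that label 1 marks a paragraph break, and the values of `gamma` and `alpha` are all assumptions made for illustration. The loss follows the standard binary focal loss form, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t).

```python
import math


def make_pairs(paragraphs):
    """Build (sentence_a, sentence_b, label) triples from adjacent sentences.

    paragraphs: list of paragraphs, each a list of sentence strings.
    label is 1 if a paragraph break lies between the two sentences,
    0 if they share a paragraph (an assumed labelling convention).
    """
    flat = [(s, i) for i, para in enumerate(paragraphs) for s in para]
    return [(a, b, int(pa != pb)) for (a, pa), (b, pb) in zip(flat, flat[1:])]


def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for a single example.

    p: predicted probability of class 1; y: true label (0 or 1).
    gamma down-weights well-classified examples; alpha weights class 1.
    The default values are illustrative, not taken from the paper.
    """
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)           # keep log() finite
    p_t = p if y == 1 else 1.0 - p            # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# Hypothetical toy text: two paragraphs, three sentences in total.
pairs = make_pairs([["It rained.", "She stayed in."], ["Morning came."]])
for a, b, label in pairs:
    print(label, "|", a, "|", b)

# A confident correct prediction contributes far less than a confident error.
print(focal_loss(0.9, 1), focal_loss(0.1, 1))
```

With `gamma=0` the modulating factor disappears and the function reduces to alpha-weighted cross-entropy; raising `gamma` shrinks the contribution of the many easy same-paragraph pairs, which is the motivation for using focal loss on imbalanced data. In a BERT-based setup, `p` would typically be the classifier's output probability for the encoded sentence pair.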

© 2020 The Japanese Society for Artificial Intelligence