IEICE Transactions on Information and Systems
Online ISSN : 1745-1361
Print ISSN : 0916-8532
Regular Section
BRsyn-Caps: Chinese Text Classification Using Capsule Network Based on Bert and Dependency Syntax
Jie LUOChengwan HEHongwei LUO
Author information
JOURNAL FREE ACCESS

2024 Volume E107.D Issue 2 Pages 212-219

Details
Abstract

Text classification is a fundamental task in natural language processing, which finds extensive applications in various domains, such as spam detection and sentiment analysis. Syntactic information can be effectively utilized to improve the performance of neural network models in understanding the semantics of text. The Chinese text exhibits a high degree of syntactic complexity, with individual words often possessing multiple parts of speech. In this paper, we propose BRsyn-caps, a capsule network-based Chinese text classification model that leverages both Bert and dependency syntax. Our proposed approach integrates semantic information through Bert pre-training model for obtaining word representations, extracts contextual information through Long Short-term memory neural network (LSTM), encodes syntactic dependency trees through graph attention neural network, and utilizes capsule network to effectively integrate features for text classification. Additionally, we propose a character-level syntactic dependency tree adjacency matrix construction algorithm, which can introduce syntactic information into character-level representation. Experiments on five datasets demonstrate that BRsyn-caps can effectively integrate semantic, sequential, and syntactic information in text, proving the effectiveness of our proposed method for Chinese text classification.

Content from these authors
© 2024 The Institute of Electronics, Information and Communication Engineers
Previous article Next article
feedback
Top