IEICE Transactions on Information and Systems
Online ISSN : 1745-1361
Print ISSN : 0916-8532
Regular Section
New Word Detection Using BiLSTM+CRF Model with Features
Jianyong DUAN, Zheng TAN, Mei ZHANG, Hao WANG

2020 Volume E103.D Issue 10 Pages 2228-2236

Abstract

With the widespread popularity of social platforms, an increasing number of new words are appearing. Such new words make NLP tasks like word segmentation more challenging, so new word detection remains an important and difficult task in NLP. This paper extracts new words using a BiLSTM+CRF model augmented with features that we select: word length, part of speech (POS), contextual entropy, and degree of word coagulation. Compared with traditional new word detection methods, our method can use both the features learned by the model and the features we select to find new words. Experimental results demonstrate that our model performs better than the benchmark models.
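
The abstract names contextual entropy and degree of word coagulation without giving formulas. As a rough illustration only, the sketch below computes both for a candidate string using definitions that are common in new word detection: the entropy of the characters adjacent to the candidate, and the minimum ratio p(candidate) / (p(left part) * p(right part)) over all binary splits. These definitions, the probability estimate (count divided by text length), and the function names `occurrences`, `contextual_entropy`, and `coagulation` are assumptions made for this example, not the authors' implementation.

```python
import math
from collections import Counter


def occurrences(text, s):
    """Return start indices of all (possibly overlapping) occurrences of s."""
    out = []
    idx = text.find(s)
    while idx != -1:
        out.append(idx)
        idx = text.find(s, idx + 1)
    return out


def entropy(counter):
    """Shannon entropy (natural log) of a Counter of observations."""
    total = sum(counter.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counter.values())


def contextual_entropy(text, cand):
    """Return (left_entropy, right_entropy) of the characters adjacent to cand."""
    left, right = Counter(), Counter()
    for start in occurrences(text, cand):
        end = start + len(cand)
        if start > 0:
            left[text[start - 1]] += 1
        if end < len(text):
            right[text[end]] += 1
    return entropy(left), entropy(right)


def coagulation(text, cand):
    """Minimum of p(cand) / (p(a) * p(b)) over all splits cand = a + b.

    Assumption: p(s) is estimated as count(s) / len(text).
    """
    if len(cand) < 2:
        raise ValueError("candidate must have at least two characters")
    n = len(text)

    def p(s):
        return len(occurrences(text, s)) / n

    p_cand = p(cand)
    if p_cand == 0:
        return 0.0
    return min(p_cand / (p(cand[:k]) * p(cand[k:])) for k in range(1, len(cand)))


if __name__ == "__main__":
    sample = "xabyxabzxabw"  # toy text, not data from the paper
    print(contextual_entropy(sample, "ab"))
    print(coagulation(sample, "ab"))
```

A candidate with high entropy on both sides appears in varied contexts, and a high coagulation value suggests its characters co-occur more often than chance, so both point toward a real word. In a setup like the one the abstract describes, such values would presumably be supplied to the tagger as additional per-token features alongside word length and POS.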

© 2020 The Institute of Electronics, Information and Communication Engineers