Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
General Paper (Peer-Reviewed)
A Simple and Effective Method for Injecting Word-level Information into Character-aware Neural Language Models
Yukun Feng, Hidetaka Kamigaito, Hiroya Takamura, Manabu Okumura

2023 Volume 30 Issue 1 Pages 156-183

Abstract

In this study, we propose a simple and effective method for injecting word-level information into character-aware neural language models. Unlike previous approaches, which typically feed word-level information into a long short-term memory (LSTM) network as input, we inject it into the softmax function. The resultant model can be viewed as a combination of a character-aware language model and a simple word-level language model, and our injection method can be used in conjunction with previous methods. Experiments on 14 typologically diverse languages empirically show that our injection method outperforms previous methods that inject word-level information at the input, including a gating mechanism, averaging, and concatenation of word vectors. Finally, we provide a comprehensive comparison with previous injection methods and analyze in detail the effectiveness of word-level information in character-aware language models and the properties of our injection method.
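To make the idea concrete, here is a minimal sketch of what injection at the softmax (rather than at the LSTM input) might look like. This is an illustrative assumption, not the paper's exact formulation: we assume the character-aware model produces a hidden state `h`, and that word-level information enters as an additional logit term computed from word embeddings, so the two score vectors are summed before the softmax.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    z = x - x.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
vocab_size, hidden_size = 6, 4

# Hidden state from a character-aware LSTM (stand-in values).
h = rng.normal(size=hidden_size)

# Output projection of the character-aware model.
W_char = rng.normal(size=(vocab_size, hidden_size))
# Hypothetical word-embedding matrix acting as a simple word-level model.
E_word = rng.normal(size=(vocab_size, hidden_size))

char_logits = W_char @ h          # character-derived scores
word_logits = E_word @ h          # word-level scores
probs = softmax(char_logits + word_logits)  # injection at the softmax

print(probs.shape, float(probs.sum()))
```

Because the word-level term only modifies the pre-softmax scores, this kind of injection leaves the LSTM input untouched, which is why it can be combined with input-side methods such as gating or concatenation.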

© 2023 The Association for Natural Language Processing