Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
Paper
A Simple Approach to Unknown Word Processing in Japanese Morphological Analysis
Ryohei Sasano, Sadao Kurohashi, Manabu Okumura
JOURNAL FREE ACCESS

2014 Volume 21 Issue 6 Pages 1183-1205

Abstract

This paper presents a simple but effective approach to unknown word processing in Japanese morphological analysis. The approach handles two types of unknown words: 1) words derived from entries in a pre-defined lexicon and 2) onomatopoeias. It leverages derivation rules and onomatopoeia patterns to correctly recognize these types of unknown words. Experiments revealed that our approach recognized about 4,500 unknown words in 100,000 Web sentences, with only about 80 harmful side effects and a 6% loss in speed.
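
As a rough illustration of the two ideas named in the abstract, the sketch below checks a token against one onomatopoeia pattern and one derivation rule. The abstract does not list the actual rules or patterns, so everything here is an assumption made for illustration: the reduplication pattern (a two-kana unit repeated twice), the rule that strips inserted long-sound marks before a lexicon lookup, the toy `LEXICON`, and the function names `is_onomatopoeia_candidate` and `normalize_derived`.

```python
import re
from typing import Optional

# Hypothetical toy lexicon; a real analyzer would consult a full dictionary.
LEXICON = {"すごい", "かわいい"}

# Assumed onomatopoeia pattern: a two-kana unit repeated twice (e.g. "ぺたぺた").
# The first alternative covers hiragana, the second katakana.
REDUPLICATION = re.compile(r"^([ぁ-ゖ]{2}|[ァ-ヺー]{2})\1$")


def is_onomatopoeia_candidate(token: str) -> bool:
    """Return True if the token fits the assumed reduplication pattern."""
    return REDUPLICATION.match(token) is not None


def normalize_derived(token: str) -> Optional[str]:
    """Assumed derivation rule: remove long-sound marks and look the result up.

    Returns the lexicon entry the token appears to derive from, or None if
    the rule does not apply or the stripped form is not in the lexicon.
    """
    stripped = token.replace("ー", "")
    if stripped != token and stripped in LEXICON:
        return stripped
    return None


if __name__ == "__main__":
    for tok in ["ぺたぺた", "ドキドキ", "すごーい", "たべる"]:
        print(tok, is_onomatopoeia_candidate(tok), normalize_derived(tok))
```

In a full analyzer, candidates produced this way would presumably be added as extra nodes alongside ordinary dictionary entries and left to compete with them, which is one plausible way to keep harmful side effects low; that integration step is not shown here.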

© 2014 The Association for Natural Language Processing