IEICE Transactions on Information and Systems
Online ISSN : 1745-1361
Print ISSN : 0916-8532
Regular Section
A Method for Recognizing Noisy Romanized Japanese Words in Learner English
Ryo NAGATA, Jun-ichi KAKEGAWA, Hiromi SUGIMOTO, Yukiko YABUTA
JOURNAL FREE ACCESS

2008 Volume E91.D Issue 10 Pages 2458-2466

Abstract

This paper describes a method for recognizing romanized Japanese words in learner English. Such words act as noise and cause problems in a variety of systems and tools for language learning and teaching, including text analysis, spell checking, and grammatical error detection, because they are Japanese words and thus mostly unknown to those systems and tools. A difficulty in recognizing them is that the spelling rules of romanized Japanese are often violated. To address this problem, the described method uses a clustering algorithm reinforced by a small set of rules. Experiments show that it achieves an F-measure of 0.879 and outperforms other methods. They also show that it requires only the target text and an English word list of reasonable size.
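
To make the difficulty concrete, the following is a minimal illustrative sketch, not the method of the paper. It assumes a simplified Hepburn-style syllable pattern and a tiny English word list, both invented here for illustration, and the function names `looks_like_romaji` and `candidate_romaji` are hypothetical. It shows how a purely rule-based check accepts a well-formed romanized word but rejects a noisy spelling, which is the kind of case that motivates supplementing rules with clustering.

```python
import re

# Assumption: a simplified Hepburn-style pattern, not the rule set of the paper.
# A mora is an optional consonant onset followed by a vowel, or a moraic "n".
_ONSET = r"(?:ky|gy|sh|ch|ny|hy|by|py|my|ry|ts|[kgsztdnhbpmyrwjf])?"
_MORA = rf"(?:{_ONSET}[aiueo]|n)"
_ROMAJI = re.compile(rf"(?:{_MORA})+", re.IGNORECASE)


def looks_like_romaji(token):
    """Return True if the whole token can be segmented into simplified morae."""
    return _ROMAJI.fullmatch(token) is not None


def candidate_romaji(tokens, english_words):
    """Keep tokens that are absent from the English word list and pass the rule check."""
    return [
        t for t in tokens
        if t.lower() not in english_words and looks_like_romaji(t)
    ]


# Hypothetical learner sentence and a toy English word list.
tokens = ["I", "ate", "okonomiyaki", "and", "banana", "sushhi"]
english_words = {"i", "ate", "and", "banana"}

print(candidate_romaji(tokens, english_words))
print(looks_like_romaji("sushhi"))
```

In this sketch, "banana" fits the syllable pattern but is filtered out by the English word list, while the hypothetical misspelling "sushhi" is missed because it breaks the pattern. Strict rules alone therefore lose noisy spellings.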

© 2008 The Institute of Electronics, Information and Communication Engineers