Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
Paper
Extraction of Suppositional Adverb and Clause-Final Modality Form Distant Collocations Using a Web Corpus and Corpus Query System and its Application to Japanese Language Learning
Irena Srdanović, Bor Hodošček, Andrej Bekeš, Kikuko Nishina

2009 Volume 16 Issue 4 Pages 4_29-4_46

Abstract
Natural language processing lacks a systematic account of Japanese modality forms and of the distant collocations between modal adverbs and clause-final modality forms. The same holds for the coverage of this kind of linguistic information in Japanese language education. To remedy this deficiency, this paper makes it possible to extract collocations between Japanese adverbs and clause-final modality forms using the corpus query system Sketch Engine (SkE), and examines how the results can be applied to Japanese language learning, focusing on learners' dictionaries. First, by analyzing various Japanese language corpora, we compile an extensive list of modality forms and their variations. Then, we examine how ChaSen morphologically analyzes these forms and retag a sample of the large-scale Japanese web corpus JpWaC, grouping all morphemes that correspond to an individual modality form under a new modality tag. Finally, we load the newly tagged corpus into SkE, modify the gramrel file, and obtain Word Sketch results for collocations between suppositional adverbs and modality forms. An evaluation of the collocation results shows that the proposed method achieves an accuracy above 93%. The results can be used to create Japanese learners' dictionaries and other language materials, or directly in language teaching and learning.
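The retagging step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the morpheme segmentations, the three modality forms, the adverb set, the tag name `Modality`, and the function names `retag_modality` and `count_pairs` are all assumptions chosen for illustration. The sketch merges consecutive morphemes that spell out a known modality form into one token, then pairs each suppositional adverb with the last modality token that follows it in the sentence.

```python
from collections import Counter
from typing import List, Sequence, Tuple

Token = Tuple[str, str]  # (surface form, part-of-speech tag)

# Hypothetical inventory: each modality form as a sequence of morpheme
# surfaces. The segmentations are assumed, not taken from the paper.
MODALITY_FORMS = [
    ("か", "も", "しれ", "ない"),  # kamoshirenai
    ("に", "ちがい", "ない"),      # nichigainai
    ("だろ", "う"),                # darou
]

# Hypothetical set of suppositional adverbs.
SUPPOSITIONAL_ADVERBS = {"たぶん", "きっと", "おそらく"}


def retag_modality(tokens: Sequence[Token],
                   forms=MODALITY_FORMS,
                   tag: str = "Modality") -> List[Token]:
    """Merge morpheme runs that match a modality form into one tagged token."""
    ordered = sorted(forms, key=len, reverse=True)  # try longest forms first
    out: List[Token] = []
    i = 0
    while i < len(tokens):
        for form in ordered:
            window = tuple(surface for surface, _ in tokens[i:i + len(form)])
            if window == form:
                out.append(("".join(form), tag))
                i += len(form)
                break
        else:
            # No modality form starts here; keep the original token.
            out.append(tokens[i])
            i += 1
    return out


def count_pairs(sentences: Sequence[Sequence[Token]],
                tag: str = "Modality") -> Counter:
    """Count (adverb, modality form) pairs, using the last modality token
    that occurs after the adverb in each sentence."""
    pairs: Counter = Counter()
    for sentence in sentences:
        retagged = retag_modality(sentence, tag=tag)
        for idx, (surface, _) in enumerate(retagged):
            if surface in SUPPOSITIONAL_ADVERBS:
                later = [s for s, t in retagged[idx + 1:] if t == tag]
                if later:
                    pairs[(surface, later[-1])] += 1
    return pairs
```

Calling `count_pairs` on a list of tokenized sentences returns a `Counter` keyed by (adverb, modality form) pairs. In the paper itself, this kind of pairing is obtained through Sketch Engine's gramrel file and Word Sketch rather than through custom counting code.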
© 2009 The Association for Natural Language Processing