Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
General Paper (Peer-Reviewed)
Universal Dependencies for Japanese Based on Long-Unit Words by NINJAL
Mai Omura, Aya Wakasa, Masayuki Asahara
JOURNAL FREE ACCESS

2023 Volume 30 Issue 1 Pages 4-29

Abstract

Universal Dependencies (UD) is an international project that aims to construct cross-lingual dependency treebanks. It defines consistent grammatical annotation standards (parts of speech, morphological features, and syntactic dependencies) across languages, and treebanks have been compiled for more than 100 languages. Languages written without word delimitation must define their syntactic word units in accordance with the UD guidelines. Previous UD Japanese resources are based on NINJAL’s short-unit words, which are defined by lexicon-based morphology. This study introduces the UD Japanese resources UD_Japanese_BCCWJ-GSDLUW, UD_Japanese_PUDLUW, and UD_Japanese_BCCWJLUW, which are based on NINJAL’s long-unit words; these are more suitable as syntactic words in Japanese than NINJAL’s short-unit words.

© 2023 The Association for Natural Language Processing