Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
Paper
A Study on Constants of Natural Language Texts
Daisuke Kimura, Kumiko Tanaka-Ishii

2011 Volume 18 Issue 2 Pages 119-137

Abstract
This paper considers various measures that become constant once a given natural language text is sufficiently long. Studying such measures offers hints about the complexity of natural language. Previously, such measures have been studied mainly on relatively small English texts. In this work, we examine them on texts in languages other than English and on large-scale texts. The measures considered are Yule’s K, Orlov’s Z, and Golcher’s VM, whose convergence has previously been argued empirically, together with the entropy H and r, a measure related to scale-free networks. Our experiments show that both K and VM converge for texts in various languages, whereas the other measures do not.
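As a rough illustration of one of these measures, the sketch below computes Yule’s K in its commonly cited form, K = 10^4 · (Σ_m m² V(m, N) − N) / N², where N is the number of tokens and V(m, N) is the number of distinct words occurring exactly m times. The abstract does not state the exact definition, tokenization, or scaling constant used in the paper, so the function name `yules_k`, the `scale` parameter, and the whitespace tokenization are assumptions made for illustration only.

```python
from collections import Counter


def yules_k(tokens, scale=10_000):
    """Yule's K for a token sequence (textbook form; hypothetical helper).

    K = scale * (sum_m m^2 * V(m, N) - N) / N^2
    N is the token count; V(m, N) is the number of word types seen m times.
    """
    n = len(tokens)
    if n == 0:
        raise ValueError("tokens must not be empty")
    word_counts = Counter(tokens)             # word -> frequency
    spectrum = Counter(word_counts.values())  # m -> V(m, N)
    s2 = sum(m * m * v for m, v in spectrum.items())
    return scale * (s2 - n) / (n * n)


# Assumption: naive whitespace tokenization, for demonstration only.
sample = "a b a c".split()
print(yules_k(sample))
```

To probe convergence in the sense discussed in the abstract, one could evaluate `yules_k(tokens[:n])` for increasing prefix lengths `n` of a long text and check whether the values level off.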
© 2011 The Association for Natural Language Processing