Analysis of Sentence Structure Based on Sentence Length Distribution Types and Dependency Relations

古橋 翔; 早川 美徳

doi:10.24701/mathling.28.7_250

Abstract

In previous studies of Japanese sentence length, Sasaki proposed two types of sentence-generation model: a multiplicative stochastic model, which yields a log-normal sentence length distribution, and an additive stochastic model, which yields a negative binomial sentence length distribution.
In the present study, motivated by Sasaki's suggestion, we examined the structure of dependency trees and tested whether these models could explain it. To do so, we used the Kyoto University Text Corpus (33,082 sentences), which includes information on dependency relations among segments.
As a result, we found that the structure of the dependency trees was consistent with neither the multiplicative nor the additive stochastic process.
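
The contrast between the two model families can be illustrated with a small simulation. The sketch below is not Sasaki's model and not the authors' analysis: it uses generic textbook processes, and every name and parameter in it (`multiplicative_length`, `additive_length`, `steps`, `sigma`, `r`, `p`, the starting length of 10) is a hypothetical choice made for illustration. It assumes only two standard facts: a product of many independent positive factors is approximately log-normal, and a sum of independent geometric counts follows a negative binomial distribution.

```python
import math
import random
import statistics


def multiplicative_length(rng, steps=50, sigma=0.1):
    """Length after `steps` independent multiplicative shocks (hypothetical parameters)."""
    length = 10.0
    for _ in range(steps):
        # Each step rescales the current length by a random positive factor.
        length *= math.exp(rng.gauss(0.0, sigma))
    return length


def additive_length(rng, r=5, p=0.2):
    """Sum of r geometric counts: total failures before the r-th success."""
    total = 0
    for _ in range(r):
        # Count failures until one success with probability p.
        while rng.random() >= p:
            total += 1
    return total


def skewness(xs):
    """Population skewness (third standardized moment)."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return sum((x - m) ** 3 for x in xs) / (len(xs) * s ** 3)


rng = random.Random(0)
mult = [multiplicative_length(rng) for _ in range(20000)]
add = [additive_length(rng) for _ in range(20000)]

# Log-normal signature: raw lengths are right-skewed, their logarithms are roughly symmetric.
log_mult = [math.log(x) for x in mult]
print("multiplicative: skew(raw), skew(log) =", skewness(mult), skewness(log_mult))

# Negative binomial signature: mean r(1-p)/p and variance r(1-p)/p^2, so variance exceeds mean.
print("additive: mean, variance =", statistics.fmean(add), statistics.pvariance(add))
```

With these hypothetical parameters the theoretical mean and variance of the additive process are 20 and 100. Comparing such summary statistics separates the two distribution types, but it says nothing about dependency-tree structure, which is the question the study examines.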

This article is licensed under a Creative Commons [Attribution-NonCommercial-NoDerivatives 4.0 International] license.
https://creativecommons.org/licenses/by-nc-nd/4.0/deed.ja
