A Register Analysis of Sequential Patterns of Conjunctions Based on Large-Scale Corpora

小林 雄一郎; 岡﨑 友子

doi:10.24701/mathling.35.3_38

Abstract

This study aims to conduct a corpus-based register analysis of the Japanese language, targeting a wide range of registers from everyday language use to specialized domains. Utilizing multiple corpora, including the Balanced Corpus of Contemporary Written Japanese (BCCWJ) and the Corpus of Spontaneous Japanese (CSJ), the research compares the sequential patterns of conjunctions across different corpora and registers. Employing several statistical methods such as correspondence analysis, cluster analysis, and random forests, the study identifies linguistic features that distinguish written and spoken Japanese across various registers. The findings reveal significant differences in usage between written and spoken language, as well as variation between monologue and dialogue data.

This article is licensed under a Creative Commons [Attribution-NonCommercial-NoDerivatives 4.0 International] license.
https://creativecommons.org/licenses/by-nc-nd/4.0/deed.ja
