2025 Volume 35 Issue 3 Pages 38-53
This study conducts a corpus-based register analysis of Japanese, covering a wide range of registers from everyday language use to specialized domains. Drawing on multiple corpora, including the Balanced Corpus of Contemporary Written Japanese (BCCWJ) and the Corpus of Spontaneous Japanese (CSJ), it compares the sequential patterns of conjunctions across corpora and registers. Using statistical methods including correspondence analysis, cluster analysis, and random forests, the study identifies linguistic features that distinguish written from spoken Japanese across registers. The findings reveal significant differences in usage between written and spoken language, as well as variation between monologue and dialogue data.