Mathematical Linguistics
Online ISSN : 2433-0302
Print ISSN : 0453-4611
Paper (B) to the Special Issue
Factors Involved in Placing Comma Immediately Following a Conjunction
Modeling and Evaluation by Using Elastic Net
Takuya Iwasaki
JOURNAL OPEN ACCESS

2018 Volume 31 Issue 6 Pages 426-442

Abstract

Commas in Japanese sentences are used inconsistently because orthographic rules have not taken hold in Japanese society. In this paper, I attempt to identify the factors that determine why and where commas are used in Japanese sentences. The prediction model is built from the core data of "The Balanced Corpus of Contemporary Written Japanese (BCCWJ)" using generalized linear models with the Elastic Net, which blends lasso and ridge regression. I assessed model quality with 10-fold cross-validation, which helps guard against overfitting. The model can also handle variables that carry a large amount of information. Reclassifying the original data with the constructed model yielded a recall of 78.99%. In conclusion, I claim that, according to the coefficient plot, the strongest indicators that a comma is placed after a conjunction are the following: (1) after the lexeme de; (2) when the conjunction is at the beginning of the sentence; (3) after the lexeme ga; (4) after the lexeme shikashinagara. In addition, (5) commas are used more frequently in the register "white papers".
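The modeling steps named in the abstract (an Elastic Net penalty on a logistic model, 10-fold cross-validation, and recall) can be illustrated with a minimal pure-Python sketch. This is not the author's implementation, and the abstract does not say which software was used. The data are synthetic, the three binary features are hypothetical stand-ins for predictors such as "conjunction is sentence-initial", and the names `lam` (penalty strength) and `alpha` (lasso/ridge mixing weight) are assumptions chosen for the example. The fitting routine is a plain proximal gradient method, picked for brevity rather than speed.

```python
import math
import random


def elastic_net_penalty(weights, lam, alpha):
    """lam * (alpha * ||w||_1 + (1 - alpha) / 2 * ||w||_2^2).

    alpha = 1 gives the lasso penalty, alpha = 0 the ridge penalty.
    """
    l1 = sum(abs(w) for w in weights)
    l2_sq = sum(w * w for w in weights)
    return lam * (alpha * l1 + 0.5 * (1.0 - alpha) * l2_sq)


def soft_threshold(x, t):
    """Proximal step for the L1 term: shrink x toward zero by t."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit(X, y, lam, alpha, lr=0.5, epochs=150):
    """Logistic regression with an Elastic Net penalty (proximal gradient)."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * xi[j] / n
        b -= lr * grad_b  # the intercept is left unpenalized
        for j in range(p):
            # Gradient step on log-loss plus ridge part, then L1 shrinkage.
            step = w[j] - lr * (grad_w[j] + lam * (1.0 - alpha) * w[j])
            w[j] = soft_threshold(step, lr * lam * alpha)
    return w, b


def predict(X, w, b):
    """Predict 1 (comma) when the fitted probability is at least 0.5."""
    return [1 if sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) >= 0.5 else 0
            for xi in X]


def recall(y_true, y_pred):
    """True positives divided by all actual positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if tp + fn else 0.0


def k_fold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and deal the indices into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validated_recall(X, y, lam, alpha, k=10):
    """Recall over pooled out-of-fold predictions from k-fold CV."""
    y_pred = [0] * len(X)
    for held_out in k_fold_indices(len(X), k):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        w, b = fit([X[i] for i in train], [y[i] for i in train], lam, alpha)
        for i, p in zip(held_out, predict([X[i] for i in held_out], w, b)):
            y_pred[i] = p
    return recall(y, y_pred)


def make_toy_data(n=120, seed=1):
    """Synthetic data only: two informative binary features and one noise feature."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        x = [float(rng.randint(0, 1)) for _ in range(3)]
        prob = sigmoid(-1.0 + 2.0 * x[0] + 1.5 * x[1])
        X.append(x)
        y.append(1 if rng.random() < prob else 0)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    print("10-fold CV recall:", cross_validated_recall(X, y, lam=0.01, alpha=0.5))
```

Pooling the out-of-fold predictions before computing recall avoids undefined per-fold values when a fold happens to contain no positive cases. The figure reported in the abstract comes from reclassifying the original data, so a cross-validated estimate like this one would not be directly comparable to it.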

© Mathematical Linguistic Society of Japan

This article is licensed under a Creative Commons [Attribution-NonCommercial-NoDerivatives 4.0 International] license.
https://creativecommons.org/licenses/by-nc-nd/4.0/deed.ja