Repeated nouns help guarantee continuity and unity of meaning in a text. The purpose of this report is to clarify how frequently such repeated nouns occur in writing. A Type Token Ratio (TTR) analysis is used to accomplish this goal.
The data used in this analysis is the “Publication Sub-corpus” and the “Library Sub-corpus” from the “Balanced Corpus of Contemporary Written Japanese” (Monitor Exhibition Data 2009).
The analysis finds that repeated nouns account for between 4 and 61 percent of a text, with an average of approximately 25 percent. These results suggest that, in most texts, the majority of nouns are not repeated. Even in texts that use many repeated nouns, nearly half of all nouns are new.
In addition, the majority of the texts with a low quantity of repeated nouns are written in the first person. In contrast, texts with a large quantity of repeated nouns consist mainly of legal documents and manuals concerning law.
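As an illustration of the Type Token Ratio measure named in this abstract, the following is a minimal sketch rather than the procedure actually used in the report. It assumes that the share of repeated nouns is computed as one minus the TTR over noun tokens, a definition the abstract does not state. The function names and the sample noun list are hypothetical, and the extraction of nouns from Japanese text (which would require a morphological analyzer) is not shown.

```python
def type_token_ratio(nouns):
    """Return the number of distinct nouns (types) divided by the
    number of noun occurrences (tokens)."""
    if not nouns:
        raise ValueError("noun list is empty")
    return len(set(nouns)) / len(nouns)


def repeated_noun_share(nouns):
    """Share of noun tokens that repeat an earlier noun.

    Assumed definition (not confirmed by the abstract): 1 - TTR.
    """
    return 1.0 - type_token_ratio(nouns)


# Hypothetical, already-extracted noun sequence for one text.
sample_nouns = ["law", "article", "law", "court",
                "article", "law", "judge", "ruling"]

print(f"TTR: {type_token_ratio(sample_nouns):.3f}")
print(f"Repeated-noun share: {repeated_noun_share(sample_nouns):.3f}")
```

Under this assumed definition, a text in which every noun appears only once has a repeated-noun share of zero, and the share rises as the same nouns recur.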