Bulletin of Data Analysis of Japanese Classification Society
Online ISSN : 2434-3382
Print ISSN : 2186-4195
Article
Authorship Attribution Using the Nucleus Bunsetsu as Stylometric Features in Japanese Writings
Yejia Liu, Mingzhe Jin

2023 Volume 12 Issue 1 Pages 33-46

Abstract

Most existing stylometric features used to attribute authorship of Japanese writings are based on the linguistic units that make up sentence elements, such as characters and words. However, attempts to convert the structural characteristics of sentences into stylometric features have been limited and not very effective in distinguishing authors. Following earlier research, we represented Japanese sentences as dendrograms branching from the predicative clause. We defined the root nodes and the clauses attached directly to those root nodes as “Nucleus Bunsetsu” and proposed a series of new stylometric features, called the NBSs. To examine the effectiveness of our proposal, we compared the attribution accuracies of the NBSs and Phrase Pattern, a clause-based stylometric feature, on a corpus containing works by ten contemporary authors in two literary genres, the novel and the essay. The results revealed that, although our approach was narrowly outperformed by Phrase Pattern when there were two candidate authors, it surpassed Phrase Pattern by 2% when there were ten candidates. We therefore concluded that the dependency structure-derived stylometric feature is sufficiently effective for authorship attribution and represents a new attempt to capture authorial idiosyncrasies that existing features might overlook.
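As a rough illustration of the definition above, the following sketch picks out the root bunsetsu of a dependency-parsed sentence together with the bunsetsu that attach to it directly. It is not the authors' implementation: the `(surface, head_index)` input format, the use of `-1` to mark the root, the function name `nucleus_bunsetsu`, and the example sentence are all assumptions made for illustration, and how the NBS features are then derived from these bunsetsu is not specified in the abstract.

```python
from typing import List, Tuple

# Assumed input format (hypothetical): one (surface, head_index) pair per
# bunsetsu, where head_index is the position of the bunsetsu it depends on
# and -1 marks a root node (typically the predicative clause).
Bunsetsu = Tuple[str, int]


def nucleus_bunsetsu(sentence: List[Bunsetsu]) -> List[str]:
    """Return the root bunsetsu and those that depend on a root directly."""
    roots = {i for i, (_, head) in enumerate(sentence) if head == -1}
    return [
        surface
        for i, (surface, head) in enumerate(sentence)
        if i in roots or head in roots
    ]


# Illustrative sentence: "太郎は 赤い 本を 読んだ。"
# "赤い" modifies "本を", so it is not attached directly to the root.
example: List[Bunsetsu] = [
    ("太郎は", 3),
    ("赤い", 2),
    ("本を", 3),
    ("読んだ。", -1),
]

selected = nucleus_bunsetsu(example)
```

Here `selected` holds the root and its two direct dependents, while the nested modifier "赤い" is left out; any real pipeline would obtain the head indices from a Japanese dependency parser rather than writing them by hand.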

© 2023 Japanese Classification Society