2020 Volume 30 Issue 3 Pages 390-400
Various stylometric features have been proposed in the field of authorship identification. In Japanese, for example, morphemes, POS n-grams, and phrase patterns have proved to be effective means of identifying an author. However, because these stylometric features are tabulated in terms of words, parts of speech, and phrases, it is difficult to use them to extract expression patterns pertaining to functions and usage in sentences. This study proposes function phrases, which represent functions and usage in the text, as stylometric features and aims to demonstrate their effectiveness in author identification. To this end, we created a corpus of 400 literary works by 20 authors and compared the proposed features with existing stylometric features (morphemes, POS bigrams, particle distribution, and phrase patterns). The results showed that the proposed stylometric features are effective in authorship identification.
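As an illustration of one of the existing features named above, POS bigrams can be tabulated as relative frequencies of adjacent part-of-speech tag pairs. The following is a minimal sketch, not the procedure used in the study: the function name `pos_bigram_profile`, the tag labels, and the sample tag sequence are all hypothetical, and the input is assumed to have already been tagged by a morphological analyzer.

```python
from collections import Counter


def pos_bigram_profile(pos_tags):
    """Return the relative frequency of each adjacent POS-tag pair.

    pos_tags is assumed to be a list of part-of-speech labels, one per
    morpheme, in text order (hypothetical input format).
    """
    bigrams = list(zip(pos_tags, pos_tags[1:]))
    total = len(bigrams)
    if total == 0:
        # Fewer than two tags: no bigram can be formed.
        return {}
    counts = Counter(bigrams)
    return {bigram: n / total for bigram, n in counts.items()}


# Hypothetical hand-written tag sequence, for illustration only.
tags = ["NOUN", "PARTICLE", "VERB", "AUX", "NOUN", "PARTICLE", "VERB"]
profile = pos_bigram_profile(tags)

for bigram, freq in sorted(profile.items()):
    print(bigram, round(freq, 3))
```

A profile of this kind, computed per work, could serve as a feature vector for a classifier. The function phrases proposed in the study would require a separate extraction step that is not sketched here.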