Paper ID: 678
This study investigated the effectiveness of text mining for identifying the author of an illegal document. The suspected writing evaluated in this study was a claim of responsibility written by a 14-year-old boy, which stated that he had committed the “Kobe child murders” in 1997. It was compared with control writings known to have been written by the same boy (confessions and an essay) and with irrelevant materials (various essays written by five junior high school students and claims of responsibility from four past criminal cases). First, each document was digitized and converted to a text file. Then, for each document, we calculated the relative frequencies of character bigrams and of part-of-speech tag bigrams, the sentence lengths, and the rates of Kanji, Hiragana, and Katakana use. Sammon multidimensional scaling and hierarchical cluster analysis placed the text of the suspected writing in or near the groups of control texts and apart from the groups of irrelevant texts. In a separate analysis, the suspected writing was replaced with a document written by a different offender and the same procedure was repeated. In that analysis, the substituted text was placed apart from both the control texts and the irrelevant texts. These results indicate that text mining is effective for identifying an author when examining forensic documents.
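Three of the surface features named above (character bigram relative frequencies, sentence lengths, and script usage rates) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the choice to ignore whitespace, the sentence terminators, and the Unicode ranges used to classify Kanji, Hiragana, and Katakana are all assumptions made for illustration. Part-of-speech bigrams are omitted because they require a morphological analyzer.

```python
from collections import Counter
from typing import Dict, List, Optional


def char_bigram_freqs(text: str) -> Dict[str, float]:
    """Relative frequency of each adjacent character pair.

    Assumption: whitespace is dropped before pairing characters.
    """
    chars = [c for c in text if not c.isspace()]
    bigrams = [a + b for a, b in zip(chars, chars[1:])]
    total = len(bigrams)
    if total == 0:
        return {}
    return {bg: n / total for bg, n in Counter(bigrams).items()}


def script_of(ch: str) -> Optional[str]:
    """Classify one character by Unicode block (a simplification:
    marks such as the prolonged sound mark are left unclassified)."""
    code = ord(ch)
    if 0x3041 <= code <= 0x3096:
        return "hiragana"
    if 0x30A1 <= code <= 0x30FA:
        return "katakana"
    if 0x4E00 <= code <= 0x9FFF:
        return "kanji"
    return None


def script_rates(text: str) -> Dict[str, float]:
    """Share of Kanji, Hiragana, and Katakana among classified characters."""
    counts = Counter(s for s in map(script_of, text) if s is not None)
    total = sum(counts.values())
    names = ("kanji", "hiragana", "katakana")
    return {n: (counts[n] / total if total else 0.0) for n in names}


def sentence_lengths(text: str, terminators: str = "。！？") -> List[int]:
    """Length in characters of each sentence, excluding the terminator.

    Assumption: sentences end at one of the given terminator characters.
    """
    lengths, current = [], 0
    for ch in text:
        if ch in terminators:
            if current:
                lengths.append(current)
            current = 0
        elif not ch.isspace():
            current += 1
    if current:
        lengths.append(current)
    return lengths
```

Under these assumptions, each document would yield one feature vector (bigram frequencies over a shared vocabulary, plus summary statistics of sentence length and the three script rates), and those vectors would be the input to a scaling or clustering step.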