A Pun Detection Method Using Phonological Similarity of Consonants and SVM

谷津 元樹; 荒木 健治

doi:10.3156/jsoft.28.875

Abstract

In recent years, there has been a growing need for computers that can comprehend the meaning of humorous texts. However, few studies have attempted to detect puns in large corpora of spoken language. A sampling survey of Japanese puns, analyzing their types and the proportion of each type, revealed that the most common type contains two sound sequences in the same sentence whose consonants are phonetically close to each other. Based on this finding, we constructed three rules for detecting pairs of phonetically similar sequences and used their outputs as features for a support vector machine (SVM). In an evaluation experiment that used these features in addition to bag-of-words features, adding the three phonetic similarity features to the baseline classifier proved effective.
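The idea of detecting a pair of phonetically similar sound sequences within one sentence can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the three rules, so the consonant groups in `SIMILAR_GROUPS`, the fixed window length, and all function names are assumptions made purely for illustration. Input is assumed to be a sentence already segmented into romanized morae.

```python
# Hypothetical consonant similarity groups (an assumption for illustration;
# NOT the similarity definition used in the paper).
SIMILAR_GROUPS = [
    {"k", "g"},
    {"s", "z"},
    {"t", "d"},
    {"h", "b", "p"},
]

VOWELS = "aiueo"


def split_mora(mora):
    """Split a romanized mora such as 'ka' into (consonant, vowel)."""
    if mora and mora[-1] in VOWELS:
        return mora[:-1], mora[-1]
    return mora, ""  # e.g. the moraic nasal 'n' has no vowel part


def consonants_close(c1, c2):
    """True if the consonants are identical or share a similarity group."""
    if c1 == c2:
        return True
    return any(c1 in group and c2 in group for group in SIMILAR_GROUPS)


def morae_close(m1, m2):
    """True if two morae share a vowel and have close consonants."""
    c1, v1 = split_mora(m1)
    c2, v2 = split_mora(m2)
    return v1 == v2 and consonants_close(c1, c2)


def find_similar_pairs(morae, length=3):
    """Return start indices (i, j) of non-overlapping windows of `length`
    morae whose morae are pairwise close."""
    pairs = []
    last_start = len(morae) - length
    for i in range(last_start + 1):
        for j in range(i + length, last_start + 1):
            if all(morae_close(morae[i + k], morae[j + k]) for k in range(length)):
                pairs.append((i, j))
    return pairs


def phonetic_feature(morae, length=3):
    """Binary feature: 1 if the sentence contains a similar pair, else 0."""
    return 1 if find_similar_pairs(morae, length) else 0


# "arumikan no ue ni aru mikan", segmented into morae by hand.
pun = ["a", "ru", "mi", "ka", "n", "no", "u", "e", "ni", "a", "ru", "mi", "ka", "n"]
plain = ["kyo", "u", "wa", "ha", "re", "da"]
print(find_similar_pairs(pun))
print(phonetic_feature(plain))
```

A binary value like the one returned by `phonetic_feature` could be appended to a bag-of-words vector before training an SVM, which is the general arrangement the abstract describes. The actual rules and feature encoding are given in the paper itself.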
