2016 Volume 28 Issue 5 Pages 875-886
In recent years, there has been a growing need for computers that can comprehend what is meant in humorous texts. However, few studies have attempted to detect puns in large corpora of spoken language. A sampling survey of the typology and component ratios of Japanese puns revealed that the most common type contains, within the sentence that includes the pun, two sound sequences whose consonants are phonetically close to each other. Based on this finding, we constructed three rules to detect pairs of phonetically similar sequences and used them as features for an SVM. An evaluation experiment that used these features in addition to bag-of-words features confirmed the effectiveness of adding the three phonetic similarity features to the baseline classifier.
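
The abstract does not spell out the three rules, so the following is only a minimal sketch of what one binary phonetic similarity feature could look like. It assumes a sentence has already been converted to a list of (consonant, vowel) morae. The consonant classes, the window length `n`, and the names `consonants_close`, `windows_similar`, and `similar_pair_feature` are all hypothetical and are not taken from the paper.

```python
from itertools import combinations

# Hypothetical classes of "phonetically close" consonants (an assumption,
# not the grouping used in the paper).
CONSONANT_CLASSES = [
    {"k", "g"},
    {"t", "d"},
    {"s", "z"},
    {"h", "b", "p"},
    {"n", "m"},
]


def consonants_close(a, b):
    """True if two consonants are identical or share a hypothetical class."""
    if a == b:
        return True
    return any(a in cls and b in cls for cls in CONSONANT_CLASSES)


def windows_similar(w1, w2):
    """True if every aligned mora pair has equal vowels and close consonants."""
    return all(
        v1 == v2 and consonants_close(c1, c2)
        for (c1, v1), (c2, v2) in zip(w1, w2)
    )


def similar_pair_feature(morae, n=2):
    """Return 1 if the sentence holds two non-overlapping, similar n-mora windows."""
    starts = range(len(morae) - n + 1)
    for i, j in combinations(starts, 2):
        if j - i < n:  # skip overlapping windows
            continue
        if windows_similar(morae[i:i + n], morae[j:j + n]):
            return 1
    return 0


# Toy input: "ka ki no ga gi" contains "kaki" and "gagi", which differ
# only by consonants in the same hypothetical class.
pun_like = [("k", "a"), ("k", "i"), ("n", "o"), ("g", "a"), ("g", "i")]
plain = [("s", "a"), ("k", "u"), ("r", "a"), ("m", "o"), ("t", "i")]
```

A value like this could be appended to a bag-of-words vector before training a classifier. How the paper actually encodes its three rules as SVM features is not stated in the abstract.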