Transactions of Japan Society of Kansei Engineering
Online ISSN : 1884-5258
ISSN-L : 1884-0833
Original Articles
Japanese Stopword List Making for Keyword Extraction Suitable for Semantic Interpretation
Hisatsugu KOKUBU, Haruko YAMAZAKI, Masashi NOSAKA

2013 Volume 12 Issue 4 Pages 511-518

Abstract
Extracting keywords from target text data is essential for analyses that characterize the substance of message content. Among the available alternatives, we chose a stopword filter because it is simple yet effective. The filter we present consists of non-content words and low-content words. Non-content words, mainly function words, were removed using part-of-speech (POS) tag information. Frequently occurring words among the remainder were keyword candidates; however, they usually included some low-content words, such as delexical verbs. This article presents a stopword list of such low-content words, obtained through manual, intuition-based procedures using 40 text files from the CASTEL/J database, and examines it from the standpoint of general versatility.
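The two-stage filter described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The input is assumed to be already tokenized and POS-tagged, the coarse tag names (`noun`, `verb`, `particle`) are placeholders rather than the tag set used in the paper, and the entries in `LOW_CONTENT_STOPWORDS` are hypothetical examples rather than the published list.

```python
from collections import Counter

# Assumed coarse POS labels treated as content-bearing (illustrative only).
CONTENT_POS = {"noun", "verb", "adjective"}

# Hypothetical low-content entries such as delexical verbs; NOT the published list.
LOW_CONTENT_STOPWORDS = {"する", "なる", "ある", "こと", "もの"}


def extract_keywords(tagged_tokens, top_n=3):
    """Return the top_n most frequent tokens that survive both filter stages."""
    # Stage 1: drop non-content (function) words using the POS tag.
    content = [word for word, pos in tagged_tokens if pos in CONTENT_POS]
    # Stage 2: drop low-content words using the stopword list.
    kept = [word for word in content if word not in LOW_CONTENT_STOPWORDS]
    # Rank the remaining words by occurrence count.
    return Counter(kept).most_common(top_n)


# Hand-tagged toy input standing in for the output of a morphological analyzer.
tagged = [
    ("研究", "noun"), ("を", "particle"), ("する", "verb"),
    ("研究", "noun"), ("は", "particle"), ("言語", "noun"), ("の", "particle"),
    ("研究", "noun"), ("に", "particle"), ("なる", "verb"),
    ("言語", "noun"), ("を", "particle"), ("分析", "noun"), ("する", "verb"),
]

keywords = extract_keywords(tagged)
print(keywords)
```

In this toy input, the verb する occurs twice and passes the POS stage, so without the second stage it would rank among the most frequent candidates. That is the case the low-content stopword list is meant to handle.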
© 2013 Japan Society of Kansei Engineering