Proceedings of the Fuzzy System Symposium
25th Fuzzy System Symposium
Session ID : 2D2-03

Examination of text classification using a relativity calculation of keyword and category, based on the search engine results
*Akira Omori, Hajime Nobuhara
CONFERENCE PROCEEDINGS FREE ACCESS

Abstract
In this paper, we examine methods for classifying texts using the hit counts that an internet search engine returns for an AND search combining a keyword and a category. We define two functions that determine the category of a keyword. The first function simply uses the number of AND-search results. The second function divides the number of AND-search results by the number of search results for the category alone. Both functions assign the keyword to the category with the highest value. To classify a text, we first use these functions to classify the keywords that appear in it, then assign the text to the category to which the largest number of its keywords were assigned. We also rank the keywords by frequency or TF-IDF and use only the top 10%, 30%, or 50% of them to classify the text. We compared the precision of every combination of these settings. We used internet news articles as the test data set, and the results suggest that the approach has potential for use in text classification.
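
The procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the hit counts are hypothetical numbers standing in for real search-engine results, and the names `AND_HITS`, `CATEGORY_HITS`, `score_raw`, `score_normalized`, `classify_keyword`, `top_fraction`, and `classify_text` are invented for illustration. The sketch also assumes that each distinct keyword casts one vote, that keywords are ranked by raw frequency (TF-IDF ranking is omitted), and that ties are broken by first occurrence; the abstract does not specify these details.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple

# Hypothetical hit counts standing in for real search-engine results.
# AND_HITS[(keyword, category)] = hits for the query "keyword AND category".
AND_HITS: Dict[Tuple[str, str], int] = {
    ("goal", "sports"): 900, ("goal", "economy"): 1200,
    ("striker", "sports"): 400, ("striker", "economy"): 100,
    ("league", "sports"): 800, ("league", "economy"): 300,
}
# CATEGORY_HITS[category] = hits for the category name searched alone.
CATEGORY_HITS: Dict[str, int] = {"sports": 10_000, "economy": 40_000}

Score = Callable[[str, str], float]


def score_raw(keyword: str, category: str) -> float:
    """First function: the AND-search hit count itself."""
    return float(AND_HITS.get((keyword, category), 0))


def score_normalized(keyword: str, category: str) -> float:
    """Second function: AND-search hits divided by the category's own hits."""
    total = CATEGORY_HITS.get(category, 0)
    return AND_HITS.get((keyword, category), 0) / total if total else 0.0


def classify_keyword(keyword: str, score: Score) -> str:
    """Assign the keyword to the category with the highest score."""
    return max(CATEGORY_HITS, key=lambda category: score(keyword, category))


def top_fraction(keywords: List[str], fraction: float) -> List[str]:
    """Rank distinct keywords by frequency and keep the top fraction (at least one)."""
    ranked = [kw for kw, _ in Counter(keywords).most_common()]
    keep = max(1, int(len(ranked) * fraction))
    return ranked[:keep]


def classify_text(keywords: List[str], score: Score, fraction: float = 1.0) -> str:
    """Classify a text by majority vote over its selected keywords."""
    if not keywords:
        raise ValueError("the text has no keywords to classify")
    selected = top_fraction(keywords, fraction)
    votes = Counter(classify_keyword(kw, score) for kw in selected)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    text_keywords = ["goal", "goal", "striker", "league"]
    print(classify_keyword("goal", score_raw))
    print(classify_keyword("goal", score_normalized))
    print(classify_text(text_keywords, score_raw))
    print(classify_text(text_keywords, score_raw, fraction=0.3))
```

With these made-up counts, the two functions disagree on the keyword "goal": the raw count favors the category with more pages overall, while the normalized score corrects for category size. This is the kind of difference the comparison of combinations in the abstract is meant to measure.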
© 2009 Japan Society for Fuzzy Theory and Intelligent Informatics