SCIS & ISIS
SCIS & ISIS 2008
Session ID : SA-A2-3

Spam Filtering with Active Feature Identification
*Masayuki Okabe, Seiji Yamada
CONFERENCE PROCEEDINGS FREE ACCESS

Abstract
This paper proposes a spam filtering method that combines active learning with feature identification. Identifying effective features is a very important step in spam filtering, because spam mail contains many meaningless words that differ only slightly from one another. Such words increase computational cost and degrade filtering performance. Identifying effective and ineffective features is therefore a promising approach to spam filtering. However, traditional feature selection methods score features using a certain amount of labeled training data, and this assumption does not hold in spam filtering: the filtering process starts with no or few labeled data and gradually accumulates labeled data through user feedback. We propose a method, based on the naive Bayes approach, that identifies effective features during this active learning process. Experimental results show that our method outperforms a traditional method without feature identification.
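
The setting described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not specify the query strategy or the feature-scoring rule, so the sketch assumes uncertainty sampling (query the message whose spam probability is closest to 0.5) and uses an absolute log-odds score as a hypothetical stand-in for feature identification. All function names (`train_nb`, `spam_probability`, `feature_scores`, `active_learning`) and the `oracle` callback that plays the role of user feedback are assumptions made for illustration.

```python
import math
from collections import Counter

CLASSES = ("spam", "ham")

def train_nb(labeled):
    """Count words per class. labeled: list of (tokens, label) pairs."""
    word_counts = {c: Counter() for c in CLASSES}
    doc_counts = Counter()
    for tokens, label in labeled:
        word_counts[label].update(tokens)
        doc_counts[label] += 1
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, doc_counts, vocab

def spam_probability(model, tokens):
    """Posterior P(spam | tokens) under multinomial naive Bayes with Laplace smoothing."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    log_post = {}
    for c in CLASSES:
        log_p = math.log((doc_counts[c] + 1) / (total_docs + 2))
        total_words = sum(word_counts[c].values())
        for t in tokens:
            if t in vocab:  # words never seen in labeled mail are ignored
                log_p += math.log((word_counts[c][t] + 1) / (total_words + len(vocab)))
        log_post[c] = log_p
    m = max(log_post.values())  # subtract the max for numerical stability
    e = {c: math.exp(v - m) for c, v in log_post.items()}
    return e["spam"] / (e["spam"] + e["ham"])

def feature_scores(model):
    """Hypothetical feature score: absolute log-odds of each word between classes."""
    word_counts, _, vocab = model
    totals = {c: sum(word_counts[c].values()) for c in CLASSES}
    v = len(vocab)
    return {
        w: abs(math.log((word_counts["spam"][w] + 1) / (totals["spam"] + v))
               - math.log((word_counts["ham"][w] + 1) / (totals["ham"] + v)))
        for w in vocab
    }

def active_learning(labeled, unlabeled, oracle, rounds):
    """Assumed loop: retrain, query the most uncertain message, add the user's label."""
    labeled, unlabeled = list(labeled), list(unlabeled)
    for _ in range(min(rounds, len(unlabeled))):
        model = train_nb(labeled)
        i = min(range(len(unlabeled)),
                key=lambda k: abs(spam_probability(model, unlabeled[k]) - 0.5))
        tokens = unlabeled.pop(i)
        labeled.append((tokens, oracle(tokens)))  # oracle stands in for user feedback
    return train_nb(labeled)
```

In this sketch, words with a score near zero occur about equally often in both classes and would be candidates for the "ineffective" features the abstract mentions; how the paper actually separates effective from ineffective features is not stated in the abstract.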
© 2008 Japan Society for Fuzzy Theory and Intelligent Informatics