Volume 13 (2014) Pages 1-11
Customer segmentation is a process that divides a business's total customers into groups according to their diversity of purchasing behavior and characteristics. The data mining clustering technique can be used to accomplish this customer segmentation. This technique clusters the customers in such a way that the customers in one group behave similarly when compared to the customers in other groups. The customer related data are categorical in nature. However, the clustering algorithms for categorical data are few and are unable to handle uncertainty. Rough set theory (RST) is a mathematical approach that handles uncertainty and is capable of discovering knowledge from a database. This paper proposes a new clustering technique called MADO (Minimum Average Dissimilarity between Objects) for categorical data based on elements of RST. The proposed algorithm is compared with other RST based clustering algorithms, such as MMR (Min-Min Roughness), MMeR (Min Mean Roughness), SDR (Standard Deviation Roughness), SSDR (Standard deviation of Standard Deviation Roughness), and MADE (Maximal Attributes DEpendency). The results show that for the real customer data considered, the MADO algorithm achieves clusters with higher cohesion, lower coupling, and less computational complexity when compared to the above mentioned algorithms. The proposed algorithm has also been tested on a synthetic data set to prove that it is also suitable for high dimensional data.