Performance Comparison of Particle Swarm Optimization with Traditional Clustering Algorithms used in Self-Organizing Map

The self-organizing map (SOM) is a well-known data reduction technique used in data mining. Through data visualization, it can reveal structure in data sets that is hard to detect from the raw data alone. However, interpretation by visual inspection is error-prone and can be very tedious. Several techniques exist for automatically detecting clusters among the code vectors found by a SOM, but they generally do not take the distribution of the code vectors into account; this may lead to unsatisfactory clustering and poorly defined cluster boundaries, particularly where the density of data points is low. In this paper, we propose an adaptive heuristic particle swarm optimization (PSO) algorithm for finding cluster boundaries directly from the code vectors obtained from a SOM. The PSO algorithm uses the so-called U-matrix of the SOM to determine cluster boundaries. Applying the method to several standard data sets demonstrates its feasibility, and the results of this novel automatic method compare very favorably with boundary detection by the traditional algorithms normally used to interpret SOM output, namely k-means and hierarchical clustering.
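To make the two ingredients named above concrete, the following is a minimal sketch under stated assumptions, not the adaptive heuristic variant proposed in the paper. It assumes a rectangular SOM grid with a 4-neighbourhood, a codebook stored as a NumPy array of shape (rows, cols, dim), and the textbook global-best PSO update. The function names `u_matrix` and `pso_minimize` and all parameter values are hypothetical, and the way the paper couples the PSO search to the U-matrix is not specified here, so the sketch shows the two pieces separately.

```python
import numpy as np

def u_matrix(codebook):
    """Mean distance from each SOM node to its grid neighbours.

    codebook: array of shape (rows, cols, dim) holding the code vectors.
    High values suggest a cluster boundary, low values a cluster interior.
    Assumes a rectangular grid with a 4-neighbourhood (an assumption).
    """
    rows, cols, _ = codebook.shape
    u = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            dists = []
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    dists.append(np.linalg.norm(codebook[r, c] - codebook[rr, cc]))
            u[r, c] = np.mean(dists)
    return u

def pso_minimize(objective, dim, n_particles=20, n_iter=100,
                 bounds=(-5.0, 5.0), w=0.7, c1=1.5, c2=1.5, seed=0):
    """Textbook global-best PSO; illustrative parameter values only."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    x = rng.uniform(lo, hi, size=(n_particles, dim))   # particle positions
    v = np.zeros_like(x)                               # particle velocities
    pbest = x.copy()                                   # personal best positions
    pbest_val = np.array([objective(p) for p in x])
    g = np.argmin(pbest_val)
    gbest, gbest_val = pbest[g].copy(), pbest_val[g]   # global best so far
    for _ in range(n_iter):
        r1 = rng.random(x.shape)
        r2 = rng.random(x.shape)
        # Inertia + pull towards personal best + pull towards global best.
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = np.clip(x + v, lo, hi)
        vals = np.array([objective(p) for p in x])
        improved = vals < pbest_val
        pbest[improved] = x[improved]
        pbest_val[improved] = vals[improved]
        g = np.argmin(pbest_val)
        if pbest_val[g] < gbest_val:
            gbest, gbest_val = pbest[g].copy(), pbest_val[g]
    return gbest, gbest_val

# Toy codebook: a 2 x 4 grid of 1-D code vectors forming two groups.
toy = np.array([[[0.0], [0.0], [10.0], [10.0]],
                [[0.0], [0.0], [10.0], [10.0]]])
u = u_matrix(toy)

# Toy objective (sphere function), used only to exercise the optimizer.
best_x, best_val = pso_minimize(lambda p: float(np.sum(p ** 2)), dim=2)
```

In the toy codebook, the nodes on either side of the jump between the two groups receive larger U-matrix values than the interior nodes, which is the signal a boundary-finding method can exploit. The sphere objective is a stand-in; a clustering application would replace it with a cost defined over the code vectors or the U-matrix.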



