Generating Concept Trees from Dynamic Self-organizing Map

The self-organizing map (SOM) provides both clustering and visualization capabilities for data mining. Dynamic self-organizing maps such as the Growing Self-organizing Map (GSOM) have been developed to overcome the fixed structure of the SOM and so represent the discovered patterns better. However, when mining large datasets or historical data, a hierarchical structure is also useful for viewing cluster formation at different levels of abstraction. In this paper, we present a technique to generate concept trees from the GSOM. We also investigate how trees form under different GSOM spread factor values and analyze the quality of the resulting trees. The results show that concept trees can be generated from the GSOM, eliminating the need to re-cluster the data from scratch to obtain a hierarchical view of the data under study.




