Fuzzy Clustering of Categorical Attributes and its Use in Analyzing Cultural Data

We develop a three-step fuzzy-logic-based algorithm for clustering categorical attributes and apply it to the analysis of cultural data. In the first step, an entropy-based clustering scheme initializes the cluster centers. In the second step, the fuzzy c-modes algorithm produces a fuzzy partition of the data set. In the third step, a novel cluster validity index determines the final number of clusters.
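The second step can be illustrated with a minimal sketch. The abstract gives no formulas, so the code below assumes the commonly published fuzzy k-modes update rules: simple matching dissimilarity, a fuzzifier m > 1, and attribute-wise weighted modes. The function names (`mismatch`, `update_memberships`, `update_modes`, `fuzzy_c_modes`) are hypothetical. The initial modes are passed in by the caller as a stand-in for the entropy-based initialization of the first step, and the validity index of the third step is not shown.

```python
from collections import defaultdict


def mismatch(x, z):
    """Simple matching dissimilarity: count of attributes where x and z differ."""
    return sum(a != b for a, b in zip(x, z))


def update_memberships(data, modes, m=2.0):
    """Return u, where u[j][i] is the membership of object j in cluster i."""
    memberships = []
    for x in data:
        dists = [mismatch(x, z) for z in modes]
        if 0 in dists:
            # The object coincides with a mode: give it full membership
            # in the first such cluster and zero elsewhere.
            hit = dists.index(0)
            row = [1.0 if i == hit else 0.0 for i in range(len(modes))]
        else:
            # u_i is proportional to d_i^(-1/(m-1)); normalize to sum to 1.
            inv = [d ** (-1.0 / (m - 1.0)) for d in dists]
            total = sum(inv)
            row = [v / total for v in inv]
        memberships.append(row)
    return memberships


def update_modes(data, memberships, n_clusters, m=2.0):
    """For each cluster and attribute, pick the category with the largest
    sum of u^m over the objects that carry that category."""
    n_attrs = len(data[0])
    modes = []
    for i in range(n_clusters):
        mode = []
        for a in range(n_attrs):
            weight = defaultdict(float)
            for x, u in zip(data, memberships):
                weight[x[a]] += u[i] ** m
            mode.append(max(weight, key=weight.get))
        modes.append(tuple(mode))
    return modes


def fuzzy_c_modes(data, init_modes, m=2.0, max_iter=100):
    """Alternate membership and mode updates until the modes stop changing.

    init_modes is assumed to come from some initialization step
    (here supplied by the caller, not computed from entropy).
    """
    modes = [tuple(z) for z in init_modes]
    for _ in range(max_iter):
        u = update_memberships(data, modes, m)
        new_modes = update_modes(data, u, len(modes), m)
        if new_modes == modes:
            break
        modes = new_modes
    # Recompute memberships so they match the returned modes.
    return modes, update_memberships(data, modes, m)
```

As a usage example, calling `fuzzy_c_modes(data, init_modes)` with a list of equal-length tuples of category labels returns the final modes together with one membership row per object, where each row sums to 1.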




