Abstract: Lacking an inherent “natural" dissimilarity measure
between objects in categorical dataset presents special difficulties in
clustering analysis. However, each categorical attributes from a given
dataset provides natural probability and information in the sense of
Shannon. In this paper, we proposed a novel method which
heuristically converts categorical attributes to numerical values by
exploiting such associated information. We conduct an experimental
study with real-life categorical dataset. The experiment demonstrates
the effectiveness of our approach.