Abstract: Rough set theory handles uncertainty and incomplete information by means of two crisp sets, the lower approximation and the upper approximation. In this paper, rough clustering algorithms are improved by adopting Similarity, Dissimilarity-Similarity and Entropy based initial centroid selection methods. Three clustering algorithms, namely Entropy based Rough K-Means (ERKM), Similarity based Rough K-Means (SRKM) and Dissimilarity-Similarity based Rough K-Means (DSRKM), were developed and run on the yeast dataset. The rough clustering algorithms are validated with two cluster validity indexes, the Rand index and the Adjusted Rand index. Experimental results show that the ERKM clustering algorithm performs effectively and delivers better results than the other clustering methods. Outlier detection is an important task in data mining; outliers are objects that differ markedly from the rest of the objects in the clusters. The Entropy based Rough Outlier Factor (EROF) method is well suited to detecting outliers in the yeast dataset. In the rough K-Means (RKM) method, tuning the epsilon (ε) value between 0.8 and 1.08 makes it possible to detect outliers in the boundary region, and the RKM algorithm delivers better results when epsilon (ε) is chosen in this range. Experimental results show that the EROF method applied to the clustering algorithms performed very well and is suitable for detecting outliers effectively for all datasets. Further, experimental readings show that the ERKM clustering method outperformed the other methods.
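The Rand and Adjusted Rand indexes named above can be illustrated with a minimal pair-counting sketch. It uses the standard textbook definitions and is not taken from the paper; the function name `rand_indexes` is hypothetical, and the sketch assumes crisp labels and at least two objects.

```python
from collections import Counter
from math import comb

def rand_indexes(labels_true, labels_pred):
    """Return (Rand index, Adjusted Rand index) for two crisp labelings.

    Assumes both label lists have the same length n >= 2.
    """
    n = len(labels_true)
    total = comb(n, 2)  # number of object pairs

    # Pair counts taken from the contingency table and its margins.
    sum_ij = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_pred).values())

    # Rand: pairs grouped together in both labelings plus pairs separated in both.
    rand = (total + 2 * sum_ij - sum_a - sum_b) / total

    # Adjusted Rand: correct the pair agreement for chance.
    expected = sum_a * sum_b / total
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        ari = 1.0  # degenerate case, e.g. a single cluster in both labelings
    else:
        ari = (sum_ij - expected) / (max_index - expected)
    return rand, ari
```

Both indexes ignore how the clusters are named, so `rand_indexes([0, 0, 1, 1], [1, 1, 0, 0])` counts as a perfect match.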
Abstract: We develop a three-step fuzzy logic-based algorithm for clustering categorical attributes, and we apply it to analyze cultural data. In the first step the algorithm employs an entropy-based clustering scheme, which initializes the cluster centers. In the second step we apply the fuzzy c-modes algorithm to obtain a fuzzy partition of the data set, and the third step introduces a novel cluster validity index, which decides the final number of clusters.
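The second step can be illustrated with a minimal sketch of the membership update usually associated with fuzzy c-modes, assuming the simple matching dissimilarity and a fuzziness exponent `alpha` greater than 1. The function names, the default value of `alpha`, and the equal sharing of membership among coinciding modes are assumptions made for illustration, not details from the paper.

```python
def matching_dissimilarity(x, mode):
    """Count the categorical attributes on which x and mode disagree."""
    return sum(1 for a, b in zip(x, mode) if a != b)

def fuzzy_memberships(x, modes, alpha=1.5):
    """Membership degrees of one categorical object x in each cluster mode."""
    d = [matching_dissimilarity(x, m) for m in modes]

    # If x coincides with one or more modes, it belongs fully to them
    # (shared equally here, which is an illustrative choice).
    zeros = [i for i, di in enumerate(d) if di == 0]
    if zeros:
        return [1.0 / len(zeros) if i in zeros else 0.0 for i in range(len(modes))]

    power = 1.0 / (alpha - 1.0)
    return [
        1.0 / sum((d[l] / d[h]) ** power for h in range(len(modes)))
        for l in range(len(modes))
    ]
```

The memberships of an object always sum to 1, and a closer mode receives a larger share; a full algorithm would alternate this update with a recomputation of the modes.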
Abstract: Most fuzzy clustering algorithms have some
shortcomings, e.g. they are not able to detect clusters with non-convex
shapes, the number of clusters must be known a priori, and they
suffer from numerical problems such as sensitivity to the
initialization. This paper studies the synergistic combination of
the hierarchical and graph theoretic minimal spanning tree based
clustering algorithm with the partitional Gath-Geva fuzzy clustering
algorithm. The aim of this hybridization is to increase the robustness
and consistency of the clustering results and to reduce the number
of heuristically defined parameters of these algorithms, thereby
decreasing the influence of the user on the clustering results. For the
analysis of the resulting fuzzy clusters, a new tool based on a fuzzy
similarity measure is presented. The calculated similarities of the
clusters can be used for the hierarchical clustering of the resulting
fuzzy clusters, which is useful for cluster merging and
for the visualization of the clustering results. As the examples used
to illustrate the operation of the new algorithm show,
the proposed algorithm can detect clusters from data with arbitrary
shape and does not suffer from the numerical problems of the
classical Gath-Geva fuzzy clustering algorithm.
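The minimal spanning tree part of the approach can be illustrated with a generic sketch: build the tree with Prim's algorithm and cut its longest edges. The abstract does not state the cutting criterion, so removing the k-1 longest edges is an assumption, and the function names `mst_edges` and `mst_clusters` are hypothetical. The Gath-Geva refinement is not shown.

```python
import math

def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph; returns (weight, i, j) edges."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known connection of each point to the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the point outside the tree that is cheapest to attach.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges

def mst_clusters(points, k):
    """Cut the k-1 longest MST edges and label the connected components."""
    edges = sorted(mst_edges(points))
    kept = edges[: len(edges) - (k - 1)]

    root = list(range(len(points)))  # union-find forest over the points

    def find(i):
        while root[i] != i:
            root[i] = root[root[i]]  # path halving
            i = root[i]
        return i

    for _, i, j in kept:
        root[find(i)] = find(j)

    ids = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(len(points))]
```

Because the tree follows chains of nearby points, the resulting components can have non-convex shapes, which is why such a partition is a plausible starting point for a subsequent fuzzy refinement.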
Abstract: Interpretation of aerial images is an important task in
various applications. Image segmentation can be viewed as the essential
step for extracting information from aerial images. Among many
developed segmentation methods, the technique of clustering has been
extensively investigated and used. However, determining the number
of clusters in an image is inherently a difficult problem, especially
when a priori information on the aerial image is unavailable. This
study proposes a support vector machine approach for clustering
aerial images. Three cluster validity indices, a distance-based index,
the Davies-Bouldin index, and the Xie-Beni index, are utilized as quantitative
measures of the quality of clustering results. Comparisons of the
effectiveness of these indices and of various parameter settings for the
proposed method are conducted. Experimental results are provided
to illustrate the feasibility of the proposed approach.
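Of the three validity indices mentioned, the Davies-Bouldin index can be illustrated with a minimal sketch based on its standard definition (mean distance to the centroid as scatter, Euclidean distance between centroids as separation). The function name and the input layout, a list of clusters each given as a list of point tuples, are assumptions for illustration; at least two non-empty clusters with distinct centroids are assumed.

```python
import math

def davies_bouldin(clusters):
    """Davies-Bouldin index of a crisp partition; lower values indicate better clustering."""
    # Centroid of each cluster, computed coordinate by coordinate.
    centroids = [
        tuple(sum(coord) / len(pts) for coord in zip(*pts)) for pts in clusters
    ]
    # Scatter: mean distance of the cluster's points to its centroid.
    scatter = [
        sum(math.dist(p, c) for p in pts) / len(pts)
        for pts, c in zip(clusters, centroids)
    ]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        # Worst-case similarity of cluster i to any other cluster.
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in range(k)
            if j != i
        )
    return total / k
```

When the number of clusters is unknown, a common use is to compute the index for several candidate partitions and prefer the one with the lowest value.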