Abstract: Data clustering is an important data exploration
technique with many applications in data mining. The k-means
algorithm is well known for its efficiency in clustering large data
sets. However, this algorithm is suited only to spherical clusters
of similar sizes and densities. The quality of the resulting clusters
decreases when the data set contains spherical clusters with large
variance in size. In this paper, we introduce an effective procedure
to overcome this problem. The proposed method is based on shifting
the center of the large cluster toward the small cluster and recomputing
the membership of the small cluster's points. The experimental
results show that the proposed algorithm produces satisfactory
results.
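The size-variance problem and the center-shifting idea can be illustrated with a minimal one-dimensional sketch. This is not the paper's procedure: the data, the fixed centers, and the shift factor of 0.5 are all hypothetical values chosen for illustration, and `assign` is simply the standard nearest-center assignment step of k-means.

```python
def assign(points, centers):
    """Nearest-center assignment step of k-means (1-D): index of the closest center per point."""
    return [min(range(len(centers)), key=lambda j: abs(p - centers[j])) for p in points]

# Hypothetical data: a large cluster (0..20) next to a small one (24..26).
large = list(range(0, 21))
small = [24, 25, 26]
points = large + small
truth = [0] * len(large) + [1] * len(small)

# Centers placed at the true cluster means (10 and 25).
centers = [10.0, 25.0]
before = assign(points, centers)
# The decision boundary is the midpoint 17.5, so 18, 19, 20 fall to the small cluster.
errors_before = sum(b != t for b, t in zip(before, truth))

# Assumption: shift the large cluster's center halfway toward the small cluster.
shift = 0.5  # hypothetical factor, not taken from the paper
shifted = [centers[0] + shift * (centers[1] - centers[0]), centers[1]]

# Recompute membership only for points currently labelled as the small cluster.
after = list(before)
for i, label in enumerate(before):
    if label == 1:
        after[i] = assign([points[i]], shifted)[0]
errors_after = sum(a != t for a, t in zip(after, truth))

print(errors_before, errors_after)
```

With these values the boundary moves from 17.5 to 21.25, so the three edge points of the large cluster are returned to it. How far to shift the center in general is exactly what a real method has to decide; the constant used here is only a placeholder.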
Abstract: Data clustering is an important data exploration technique
with many applications in data mining. We present an enhanced
version of the well-known single-link clustering algorithm. We will
refer to this algorithm as DCBOR. The proposed algorithm alleviates
the chain effect by removing the outliers from the given dataset.
The algorithm thus provides outlier detection and data clustering
simultaneously. It does not need to update the distance
matrix, since it merges the k nearest
objects in one step and each cluster continues to grow as long as possible
under a specified condition. The algorithm consists of two phases:
in the first phase, it removes the outliers from the input dataset; in
the second phase, it performs the clustering process. This algorithm
discovers clusters of different shapes, sizes, and densities and requires
only one input parameter, which represents a threshold for
outlier points. The value of this parameter ranges from 0 to
1, and the algorithm supports the user in determining an appropriate
value for it. We have tested this algorithm on different datasets
that contain outliers and clusters connected by chains of dense points,
and the algorithm discovers the correct clusters. The results of
our experiments demonstrate the effectiveness and efficiency of
DCBOR.
Abstract: Clustering algorithms are attractive for the task of class identification in spatial databases. However, their application to large spatial databases raises the following requirements for clustering algorithms: minimal domain knowledge to determine the input parameters, discovery of clusters with arbitrary shape, and good efficiency on large databases. The well-known clustering algorithms offer no solution to the combination of these requirements. In this paper, a density-based clustering algorithm (DCBRD) is presented that relies on knowledge acquired from the data by dividing the data space into overlapping regions. The proposed algorithm discovers arbitrarily shaped clusters, requires no input parameters, and uses the same definitions as the DBSCAN algorithm. We performed an experimental evaluation of its effectiveness and efficiency and compared the results with those of DBSCAN. The results of our experiments demonstrate that the proposed algorithm is efficient in discovering clusters of arbitrary shape and size.
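The DBSCAN definitions that the abstract refers to (eps-neighborhood, core point, density reachability) can be sketched as follows. This is plain DBSCAN, not DCBRD: it still takes the two usual parameters, here named `eps` and `min_pts`, and does not divide the data space into regions. The sample points and parameter values are hypothetical.

```python
from math import dist

NOISE = -1

def region_query(points, i, eps):
    """Indices of all points within distance eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    labels = [None] * len(points)  # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster += 1  # points[i] is a core point: start a new cluster
        labels[i] = cluster
        seeds = [j for j in neighbors if j != i]
        while seeds:
            j = seeds.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                seeds.extend(j_neighbors)  # j is also a core point: keep expanding
    return labels

# Hypothetical data: two dense 2x2 squares and one isolated point.
sample = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (5, 5)]
labels = dbscan(sample, eps=1.5, min_pts=3)
print(labels)
```

The linear scan in `region_query` makes this sketch quadratic in the number of points; it is meant to show the definitions, not to be efficient on large databases.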
Abstract: This study proposes a model with a new set of evaluation criteria for measuring the efficiency of real-world e-commerce websites. The evaluation criteria cover website design, usability, and performance, and the Data Envelopment Analysis (DEA) technique is used to measure website efficiency. An efficient website is defined as a site that generates the most outputs using the smallest amount of inputs. Inputs are measurements representing the amount of effort required to build, maintain, and run the site. Outputs are the amount of traffic the site generates, measured as the average number of daily hits and the average number of daily unique visitors.
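The notion of efficiency as outputs per unit of input can be illustrated with a simplified sketch. It covers only the special case of one input and one output, where a site's relative efficiency reduces to its output/input ratio divided by the best ratio in the group; general DEA with several inputs and outputs solves a linear program per site, which is not shown here. The site names, the effort figures, and the function name `relative_efficiency` are hypothetical.

```python
def relative_efficiency(sites):
    """Single-input, single-output case: each site's output/input ratio relative to the best ratio.

    `sites` maps a site name to (input_effort, output_traffic).
    A score of 1.0 marks a site on the efficient frontier.
    """
    ratios = {name: out / inp for name, (inp, out) in sites.items()}
    best = max(ratios.values())
    return {name: ratio / best for name, ratio in ratios.items()}

# Hypothetical figures: (effort units, average daily unique visitors).
sites = {
    "site_a": (10.0, 500.0),
    "site_b": (20.0, 800.0),
    "site_c": (5.0, 250.0),
}
scores = relative_efficiency(sites)
for name, score in scores.items():
    print(name, round(score, 2))
```

In this toy data `site_a` and `site_c` produce the same traffic per unit of effort and are both rated efficient, while `site_b` is rated against them.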