Abstract: Web usage mining algorithms have been widely
utilized for modeling user web navigation behavior. In this study we
advance a model for mining users' navigation patterns. The model
builds a user model based on the expectation-maximization (EM)
algorithm. An EM algorithm is used in statistics for finding maximum
likelihood estimates of parameters in probabilistic models, where the
model depends on unobserved latent variables. The experimental
results show that as the number of clusters decreases, the log
likelihood converges toward lower values, and that the probability of
the largest cluster decreases as the number of clusters increases in
each treatment.
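To make the EM idea concrete, the following is a minimal sketch, not the model from the paper: it assumes one-dimensional data and a Gaussian mixture, and the function names and toy values are hypothetical. The E-step computes each component's responsibility for each point (the latent cluster assignment), and the M-step re-estimates the parameters from those responsibilities.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D normal distribution."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_gmm_1d(data, initial_means, n_iter=50):
    """Fit a 1-D Gaussian mixture with EM (illustrative assumption, not the paper's model).

    Returns (weights, means, variances, log_likelihood_history).
    """
    k = len(initial_means)
    means = list(initial_means)
    weights = [1.0 / k] * k
    variances = [1.0] * k
    history = []
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        responsibilities = []
        log_lik = 0.0
        for x in data:
            joint = [weights[j] * gaussian_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(joint)
            log_lik += math.log(total)
            responsibilities.append([p / total for p in joint])
        history.append(log_lik)
        # M-step: re-estimate weights, means and variances.
        for j in range(k):
            nj = sum(r[j] for r in responsibilities)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(responsibilities, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(responsibilities, data)) / nj
            variances[j] = max(var, 1e-6)  # floor keeps the density finite
    return weights, means, variances, history
```

The recorded log likelihood should not decrease from one iteration to the next, which is the convergence behaviour the abstract refers to.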
Abstract: In this paper, a clustering algorithm named K-Harmonic
Means (KHM) was employed in the training of Radial
Basis Function Networks (RBFNs). KHM organized the data in
clusters and determined the centres of the basis functions. The popular
clustering algorithms, namely K-means (KM) and Fuzzy c-means
(FCM), are highly dependent on the initial identification of elements
that represent the cluster well. In KHM, the problem can be avoided.
This leads to improvement in the classification performance when
compared to other clustering algorithms. A comparison of the
classification accuracy was performed between KM, FCM and KHM.
The classification performance was evaluated on the benchmark data sets
Iris Plant, Diabetes and Breast Cancer. RBFN training with the KHM
algorithm shows better accuracy in classification problems.
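As a rough illustration of why KHM depends less on initialization, the sketch below evaluates the K-Harmonic Means performance function: for each point, the harmonic average of its distances (raised to a power p) to all centres. The function name, the default p and the eps guard are assumptions made for this example; the centre update rule and the RBFN training used in the paper are not shown.

```python
import math


def khm_objective(points, centres, p=2.0, eps=1e-12):
    """K-Harmonic Means performance: sum over points of K / sum_k (1 / d_k^p).

    Hypothetical helper for illustration; `eps` avoids division by zero
    when a point coincides with a centre.
    """
    k = len(centres)
    total = 0.0
    for x in points:
        inverse_sum = 0.0
        for c in centres:
            dist = max(math.dist(x, c), eps)
            inverse_sum += 1.0 / dist ** p
        total += k / inverse_sum
    return total
```

Because the harmonic average is dominated by the nearest centre but still depends smoothly on all centres, every point influences every centre, unlike the hard minimum used by K-means.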
Abstract: The paper presents an analysis of linkages and
structures of co-operation and their intensity as the potential for the
establishment of clusters in Central and Eastern (Pannonian)
Croatia. It starts from a theoretical elaboration of the need for
entrepreneurs to organize through the cluster model and of the terms of
their self-actualization, relates these to the importance of traditional
values in terms of benefits and social capital, and assesses where the
companies now stand, in order to demonstrate the need to create their
own identity through clustering. The paper considers the institutional
dimension of social capital, where the public sector is best placed to
create the social structure of clusters, and the social dimension of
social capital in terms of trust, cooperation and networking. It
analyzes the extent to which trust and coherence are present between
companies in the Brod-Posavina and Požega-Slavonia counties, expressed
through their readiness to join clusters in the NUTS II region of
Central and Eastern (Pannonian) Croatia as a homogeneous economic
entity, with emphasis on the limiting factors that stand in the way of
greater competitiveness.
Abstract: Traffic lights provide protection against traffic congestion by reducing traffic jams and organizing traffic flow. Furthermore, the increasing congestion level in public road networks is a growing problem in many countries. Using Intelligent Transportation Systems to give emergency vehicles a green light at intersections can reduce driver confusion, reduce conflicts, and improve emergency response times. Nowadays, wireless sensor network technology can solve many problems and can offer good management of the crossroad. In this paper, we develop a new approach based on clustering and graphical possibilistic fusion modeling. The proposed model is elaborated in three phases. The first consists of decomposing the environment into clusters, followed by the intra-cluster and inter-cluster fusion processes. Finally, we present experimental simulation results that demonstrate the efficiency of the proposed approach.
Keywords: Traffic light, Wireless sensor network, Controller, Possibilistic network/Bayesian network.
Abstract: Increasing the detection rate and reducing the false
positive rate are important problems in Intrusion Detection Systems (IDS).
Although preventative techniques such as access control and
authentication attempt to prevent intruders, these can fail, and as a
second line of defence, intrusion detection has been introduced. Rare
events are events that occur very infrequently; the detection of rare
events is a common problem in many domains. In this paper we
propose an intrusion detection method that combines rough sets and
fuzzy clustering. Rough sets are used to decrease the amount of data and
remove redundancy. Fuzzy c-means clustering allows objects to
belong to several clusters simultaneously, with different degrees of
membership. Our approach allows us not only to recognize known
attacks but also to detect suspicious activity that may be the result of
a new, unknown attack. The experimental results on the Knowledge
Discovery and Data Mining Cup 1999 (KDD Cup 1999) dataset show that the
method is efficient and practical for intrusion detection systems.
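The graded membership mentioned above can be sketched with the standard fuzzy c-means membership formula, u_j = 1 / sum_k (d_j / d_k)^(2/(m-1)), where d_j is the distance from a point to centre j and m > 1 is the fuzzifier. The function name and the choice m = 2 are assumptions for illustration; the rough-set reduction step of the paper is not shown.

```python
import math


def fcm_memberships(point, centres, m=2.0):
    """Fuzzy c-means membership degrees of one point in every cluster.

    Illustrative helper (hypothetical name). The degrees sum to 1.
    """
    dists = [math.dist(point, c) for c in centres]
    # A point lying exactly on one or more centres belongs fully to them.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((dj / dk) ** exponent for dk in dists) for dj in dists]
```

A point that is twice as far from one centre as from another receives memberships of 0.8 and 0.2 with m = 2, rather than a hard assignment to the nearer cluster.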
Abstract: This paper presents a text clustering system based on a k-means type subspace clustering algorithm for clustering large, high-dimensional and sparse text data. In this algorithm, a new step is added to the k-means clustering process to automatically calculate the weights of keywords in each cluster, so that the important words of a cluster can be identified by their weight values. To support understanding and interpretation of the clustering results, a few keywords that best represent the semantic topic are extracted from each cluster. Two methods are used to extract the representative words. The candidate words are first selected according to the weights calculated by our new algorithm. Then, the candidates are fed to WordNet to identify the set of nouns and to consolidate synonyms and hyponyms. Experimental results show that the clustering algorithm is superior to other subspace clustering algorithms, such as PROCLUS and HARP, and to k-means type algorithms, e.g., Bisecting-KMeans. Furthermore, the word extraction method is effective in selecting words that represent the topics of the clusters.
Abstract: The article presents a new method for detection of
artificial objects and materials from images of the environmental
(non-urban) terrain. Our approach uses the hue and saturation (or Cb
and Cr) components of the image as the input to the segmentation
module that uses the mean shift method. The clusters obtained as the
output of this stage are processed by the decision-making
module in order to find the regions of the image with a significant
possibility of representing a human. Although this method will detect
various non-natural objects, it is primarily intended and optimized for
the detection of humans, i.e. for search and rescue purposes in non-urban
terrain where, in normal circumstances, non-natural objects should not
be present. Real-world images are used for the evaluation of the
method.
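The mean shift step can be pictured with a small sketch that moves a starting point to the local density mode of a set of (hue, saturation) samples. This is an assumption-laden illustration: it uses a flat kernel, a hypothetical function name and made-up sample values, and it is not the segmentation module described in the article.

```python
import math


def shift_to_mode(start, samples, bandwidth, tol=1e-6, max_iter=100):
    """Mean shift with a flat kernel: repeatedly move to the mean of the
    samples within `bandwidth` of the current position."""
    current = start
    for _ in range(max_iter):
        neighbours = [s for s in samples if math.dist(s, current) <= bandwidth]
        if not neighbours:
            break
        new = tuple(sum(coord) / len(neighbours) for coord in zip(*neighbours))
        if math.dist(new, current) < tol:
            return new
        current = new
    return current
```

Pixels whose colour components converge to the same mode are grouped into one cluster, so the number of clusters does not have to be fixed in advance.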
Abstract: The effect of SnO2 surface modification by Ag nanoclusters, synthesized by the SILD method, on the operating characteristics of thin film gas sensors was studied, and models for the promotional role of Ag additives were discussed. It was found that this approach can improve both the sensitivity and the rate of response of SnO2-based gas sensors to CO and H2. At the same time, the presence of Ag clusters on the surface of SnO2 depressed the sensor response to ozone.
Abstract: This paper presents the benchmarking results and
performance evaluation of different clusters built at the National Center
for High-Performance Computing in Taiwan. The performance of the
processor, memory subsystem and interconnect is a critical factor in the
overall performance of high performance computing platforms. The
evaluation compares different system architectures and software
platforms. Most supercomputers use HPL to benchmark their system
performance, in accordance with the requirements of the TOP500 List.
In this paper we consider system memory access factors that affect
benchmark performance, such as processor and memory
performance. We hope this work will provide useful information for
the future development and construction of cluster systems.
Abstract: Many experimental results suggest that more precise spike timing is significant in neural information processing. We construct a self-organization model using spatiotemporal patterns, where Spike-Timing Dependent Plasticity (STDP) tunes the conduction delays between neurons. We show that, for highly synchronized inputs, the fluctuation of conduction delays causes globally continuous and locally distributed firing patterns through the self-organization.
Abstract: To further advance research on immune-related genes
from T. molitor, we constructed a cDNA library and analyzed
expressed sequence tag (EST) sequences from 1,056 clones. After
removing vector sequence and quality checking through the Phred
program (trim_alt 0.05 (P-score>20)), 1039 sequences were generated.
The average length of insert was 792 bp. In addition, we identified 162
clusters, 167 contigs and 391 contigs after the clustering and assembly
process using the TGICL package. EST sequences were searched against the
NCBI nr database by local BLAST (blastx, E
Abstract: The deterministic quantum transfer-matrix (QTM)
technique and its mathematical background are presented. This
important tool in computational physics can be applied to a class of
real physical low-dimensional magnetic systems described by the
Heisenberg Hamiltonian, which includes macroscopic molecular-based
spin chains, small magnetic clusters embedded in some
supramolecules and other interesting compounds. Using QTM, the
spin degrees of freedom are accurately taken into account, yielding
the thermodynamical functions at finite temperatures.
To test how the susceptibility calculations run in a parallel
environment, the speed-up and efficiency of
parallelization are analyzed on our platform, an SGI Origin 3800 with
p = 128 processor units. Using Message Passing Interface (MPI)
system libraries, we find a code efficiency of 94% for
p = 128, which makes our application highly scalable.
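For readers unfamiliar with the metrics, the sketch below uses the usual definitions, speed-up S = T1 / Tp and efficiency E = S / p. The function names are hypothetical and the abstract gives no timings, so only the implied relation is computed: an efficiency of 94% on p = 128 processors corresponds to a speed-up of about 0.94 × 128 ≈ 120.3.

```python
def speedup(t_serial, t_parallel):
    """Speed-up S = T1 / Tp."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, processors):
    """Parallel efficiency E = S / p."""
    return speedup(t_serial, t_parallel) / processors


def implied_speedup(eff, processors):
    """Speed-up implied by a reported efficiency: S = E * p."""
    return eff * processors
```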
Abstract: Most fuzzy clustering algorithms have some
shortcomings, e.g. they are not able to detect clusters with convex
shapes, the number of clusters must be known a priori, and they
suffer from numerical problems such as sensitivity to
initialization. This paper studies the synergistic combination of
the hierarchical and graph theoretic minimal spanning tree based
clustering algorithm with the partitional Gath-Geva fuzzy clustering
algorithm. The aim of this hybridization is to increase the robustness
and consistency of the clustering results and to decrease the number
of heuristically defined parameters of these algorithms, thereby
decreasing the influence of the user on the clustering results. For the
analysis of the resulting fuzzy clusters, a new tool based on a fuzzy
similarity measure is presented. The calculated similarities of the
clusters can be used for the hierarchical clustering of the resulting
fuzzy clusters; this information is useful for cluster merging and
for the visualization of the clustering results. As the examples used
to illustrate the operation of the new algorithm show,
the proposed algorithm can detect clusters of arbitrary
shape in data and does not suffer from the numerical problems of the
classical Gath-Geva fuzzy clustering algorithm.
Abstract: The Cluster Dimension of a network is defined as the minimum cardinality of a subset S of the set of nodes having the property that for any two distinct nodes x and y, there exist nodes s1, s2 (not necessarily distinct) in S such that |d(x,s1) - d(y,s1)| > 1 and d(x,s2) < d(x,s) for all s ∈ S - {s2}. In this paper, strictly non-overlapping clusters are constructed. The concept of the LandMarks for Unique Addressing and Clustering (LMUAC) routing scheme is developed. With the help of the LMUAC routing scheme, it is shown that path length (upper bound) PLN,d < PLD, maximum memory space requirement for the network MSLmuAc(Az) < MSEmuAc < MSH3L < MSric and maximum link utilization factor MLLMUAC(i=3) < MLLMUAC(z03) < M Lc
Abstract: In this paper we used data mining techniques to
identify outlier patients who use large amounts of drugs over a
long period of time. Any healthcare or health insurance system
must deal with the quantities of drugs utilized by chronic disease
patients. In the Kingdom of Bahrain, about 20% of the health budget is
spent on medications. Managers of healthcare systems do not have
enough information about how chronic disease patients utilize drugs,
whether there is any misuse, or whether there are outlier patients. In
this work, which was done in cooperation with the information
department of the Bahrain Defence Force hospital, we selected the data
for cardiac patients in the period from 1/1/2008 to
31/12/2008 as the data for the model in this paper. We
used three techniques to analyze drug utilization by cardiac
patients. First we applied a clustering technique, followed by
a measurement of clustering validity, and finally we applied a decision
tree as the classification algorithm. The clustering partitions the
1603 patients, who received 15,806 prescriptions during this period,
into three groups according to drug utilization, where 23 patients
(2.59%) who received 1316 prescriptions (8.32%) are classified as
outliers. The classification
algorithm shows that average drug utilization and the age
and gender of the patient can be considered the main
predictive factors in the induced model.
Abstract: Solving the facility location problem for every
location in a country simultaneously is not easy. This paper describes
how to solve the problem using cluster computing. The technique is to
design a parallel algorithm that uses local search with the single swap
method in order to solve the problem on clusters. The parallel
implementation uses portable parallel programming with the
Message Passing Interface (MPI) on Microsoft Windows Compute
Cluster. The paper presents the algorithm that uses local search
with the single swap method and the implementation of the system for
selecting a facility to be opened, using MPI on a cluster. If large
datasets are considered, the process of calculating a reasonable cost
for a facility becomes time consuming. The results show that parallel
computation of the facility location problem on a cluster achieves
speedup and scales well as the problem size increases.
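Local search with single swap can be sketched serially as follows. This is a simplified illustration under stated assumptions: exactly k facilities are opened, the cost is the sum of Euclidean distances from each demand point to its nearest open facility, and the function names are hypothetical. The MPI parallelization described in the paper, which would distribute the evaluation of candidate swaps across cluster nodes, is not shown.

```python
import math


def assignment_cost(points, facilities):
    """Total distance from each demand point to its nearest open facility."""
    return sum(min(math.dist(p, f) for f in facilities) for p in points)


def single_swap_local_search(points, candidates, k):
    """Start from the first k candidates, then repeatedly swap one open
    facility for one closed candidate while the swap lowers the cost."""
    open_set = list(candidates[:k])
    best = assignment_cost(points, open_set)
    improved = True
    while improved:
        improved = False
        for i in range(k):
            for cand in candidates:
                if cand in open_set:
                    continue
                trial = open_set[:i] + [cand] + open_set[i + 1:]
                cost = assignment_cost(points, trial)
                if cost < best - 1e-12:  # accept strictly improving swaps only
                    open_set, best, improved = trial, cost, True
                    break
    return open_set, best
```

Each trial swap requires a full cost evaluation over all demand points, which is why the search becomes time consuming on large datasets and benefits from parallel evaluation.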
Abstract: The aim of this article is to assess the existing
business models used by banks operating in the CEE countries
from 2006 to 2011.
In order to obtain research results, the authors performed
qualitative analysis of the scientific literature on bank business
models, which have been grouped into clusters that consist of such
components as: 1) capital and reserves; 2) assets; 3) deposits, and 4)
loans.
In turn, bank business models have been developed based on
the types of core activities of the banks, and have been divided into
four groups: Wholesale, Investment, Retail and Universal Banks.
Descriptive statistics have been used to analyse the models,
determining mean, minimal and maximal values of constituent
cluster components, as well as standard deviation. The analysis of
the data is based on such bank variable indices as Return on Assets
(ROA) and Return on Equity (ROE).
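The descriptive statistics and ratios named above can be computed as in the sketch below. It assumes the common definitions ROA = net income / total assets and ROE = net income / equity, and it uses the sample standard deviation; the abstract does not state which variant the authors used, and the function names are hypothetical.

```python
import statistics


def return_on_assets(net_income, total_assets):
    """ROA under the common definition: net income / total assets."""
    return net_income / total_assets


def return_on_equity(net_income, equity):
    """ROE under the common definition: net income / equity."""
    return net_income / equity


def describe(values):
    """Mean, minimum, maximum and sample standard deviation of a component."""
    return {
        "mean": statistics.mean(values),
        "min": min(values),
        "max": max(values),
        "stdev": statistics.stdev(values),  # sample standard deviation (assumption)
    }
```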
Abstract: Terminal localization for indoor Wireless Local Area
Networks (WLANs) is critical for the deployment of location-aware
computing inside buildings. A major challenge is obtaining high
localization accuracy in the presence of fluctuations in the received signal
strength (RSS) measurements caused by multipath fading. This paper
focuses on reducing the effect of the distance-varying noise by spatial
filtering of the measured RSS. Two different survey point geometries
are tested with the noise reduction technique: survey points arranged
in sets of clusters and survey points uniformly distributed over the
network area. The results show that the location accuracy improves
by 16% when the filter is used and by 18% when the filter is applied
to a clustered survey set as opposed to a straight-line survey set.
The estimated locations are within 2 m of the true location, which
indicates that clustering the survey points provides better localization
accuracy due to superior noise removal.
Abstract: Clustering large populations is an important problem
when the data contain noise and clusters of different shapes. A good
clustering algorithm or approach should be efficient enough to detect
clusters sensitively. Besides space complexity, time complexity also
gains importance as the data size grows. Using hierarchies, we developed
a new algorithm that splits attributes according to their values and
chooses the splitting dimension so as to divide the database
into parts that are as equal as possible. At each node we
calculate certain descriptive statistical features of the data
residing there, and by pruning we generate the natural clusters with a
complexity of O(n).
Abstract: A wireless sensor network with a large number of tiny sensor nodes can be used as an effective tool for gathering data in various situations. One of the major issues in wireless sensor networks is developing an energy-efficient routing protocol, which has a significant impact on the overall lifetime of the sensor network. In this paper, we propose a novel hierarchical routing protocol with static clustering called Energy-Efficient Protocol with Static Clustering (EEPSC). EEPSC partitions the network into static clusters, eliminates the overhead of dynamic clustering and utilizes temporary cluster heads to distribute the energy load among high-power sensor nodes, thereby extending network lifetime. We have conducted simulation-based evaluations to compare the performance of EEPSC against Low-Energy Adaptive Clustering Hierarchy (LEACH). Our experimental results show that EEPSC outperforms LEACH in terms of network lifetime and power consumption minimization.