Abstract: The deterministic quantum transfer-matrix (QTM)
technique and its mathematical background are presented. This
important tool of computational physics can be applied to a class of
real low-dimensional magnetic systems described by the Heisenberg
Hamiltonian, which includes macroscopic molecular-based spin
chains, small magnetic clusters embedded in supramolecules, and
other compounds of interest. Using QTM, the spin degrees of
freedom are taken into account accurately, yielding the
thermodynamic functions at finite temperatures.
To test the susceptibility-calculation application in a parallel
environment, the speed-up and efficiency of the parallelization are
analyzed on our SGI Origin 3800 platform with p = 128 processor
units. Using Message Passing Interface (MPI) system libraries, we
find a code efficiency of 94% for p = 128, which makes our
application highly scalable.
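The speed-up and efficiency figures quoted above follow from the standard definitions S(p) = T(1)/T(p) and E(p) = S(p)/p; a minimal sketch (the timing values below are illustrative assumptions, not measurements from the paper):

```python
def speedup(t_serial, t_parallel):
    """Speed-up S(p) = T(1) / T(p)."""
    return t_serial / t_parallel

def efficiency(t_serial, t_parallel, p):
    """Parallel efficiency E(p) = S(p) / p."""
    return speedup(t_serial, t_parallel) / p

# Hypothetical timings: a run taking 1204.8 s serially and 10.0 s on
# p = 128 processors corresponds to an efficiency of about 94%.
t1, t128 = 1204.8, 10.0
print(f"S = {speedup(t1, t128):.1f}, E = {efficiency(t1, t128, 128):.2%}")
```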
Abstract: Most fuzzy clustering algorithms have shortcomings:
they cannot detect clusters with non-convex shapes, the number of
clusters must be known a priori, and they suffer from numerical
problems such as sensitivity to initialization. This paper studies the
synergistic combination of a hierarchical, graph-theoretic minimal
spanning tree (MST) based clustering algorithm with the partitional
Gath-Geva fuzzy clustering algorithm. The aim of this hybridization
is to increase the robustness and consistency of the clustering results
and to reduce the number of heuristically defined parameters of these
algorithms, thereby decreasing the influence of the user on the
clustering results. For the analysis of the resulting fuzzy clusters, a
new tool based on a fuzzy similarity measure is presented. The
calculated similarities of the clusters can be used for hierarchical
clustering of the resulting fuzzy clusters, which is useful for cluster
merging and for visualization of the clustering results. As the
examples illustrating the operation of the new algorithm show, the
proposed algorithm can detect clusters of arbitrary shape and does
not suffer from the numerical problems of the classical Gath-Geva
fuzzy clustering algorithm.
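The MST side of such a hybrid can be sketched generically: build a minimal spanning tree over the data and cut its longest edges to obtain an initial partition. The following is an illustrative implementation of that generic idea (Prim's algorithm on toy 2-D points), not the paper's exact algorithm:

```python
import math

def mst_clusters(points, n_clusters):
    """Cluster points by building a minimal spanning tree (Prim's
    algorithm) and cutting its n_clusters - 1 longest edges."""
    n = len(points)
    dist = lambda a, b: math.dist(points[a], points[b])
    # Prim's algorithm: grow the tree from node 0.
    in_tree, edges = {0}, []
    best = {v: (dist(0, v), 0) for v in range(1, n)}
    while len(in_tree) < n:
        v = min(best, key=lambda u: best[u][0])
        w, parent = best.pop(v)
        in_tree.add(v)
        edges.append((w, parent, v))
        for u in best:
            d = dist(v, u)
            if d < best[u][0]:
                best[u] = (d, v)
    # Drop the longest edges, then label the connected components.
    edges.sort()
    keep = edges[: n - n_clusters] if n_clusters > 1 else edges
    adj = {v: [] for v in range(n)}
    for _, a, b in keep:
        adj[a].append(b)
        adj[b].append(a)
    labels, cur = [-1] * n, 0
    for start in range(n):
        if labels[start] == -1:
            stack = [start]
            while stack:
                x = stack.pop()
                if labels[x] == -1:
                    labels[x] = cur
                    stack.extend(adj[x])
            cur += 1
    return labels

# Two well-separated toy blobs split along the longest MST edge.
pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels = mst_clusters(pts, 2)
```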
Abstract: Hand gesture recognition is an active area of research in
the vision community, mainly for the purposes of sign language
recognition and Human Computer Interaction. In this paper, we
propose a system to recognize alphabet characters (A-Z) and
numbers (0-9) in real time from stereo color image sequences using
Hidden Markov Models (HMMs). Our system is based on three main
stages: automatic segmentation and preprocessing of the hand
regions, feature extraction, and classification. In the automatic
segmentation and preprocessing stage, color and a 3D depth map are
used to detect the hands, whose trajectory is then tracked using the
Mean-shift algorithm and a Kalman filter. In the feature extraction
stage, 3D combined features of location, orientation and velocity
with respect to Cartesian coordinate systems are used. Then, k-means
clustering is employed to build the HMM codewords. In the final,
classification stage, the Baum-Welch algorithm is used to fully train
the HMM parameters. Alphabet and number gestures are recognized
using a Left-Right Banded model in conjunction with the Viterbi
algorithm. Experimental results demonstrate that our system can
successfully recognize hand gestures with a 98.33% recognition rate.
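The codeword step above is a standard vector-quantization pattern: k-means maps continuous feature vectors onto a discrete codebook whose indices serve as HMM observation symbols. A minimal sketch with hypothetical 2-D (orientation, velocity) features, not the paper's implementation:

```python
def kmeans_codebook(vectors, k, iters=20):
    """Quantize feature vectors into k codewords (discrete HMM symbols).
    Centers are seeded greedily: each new center is the vector farthest
    from the centers chosen so far, which keeps the sketch deterministic."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    centers = [vectors[0]]
    while len(centers) < k:
        centers.append(max(vectors,
                           key=lambda v: min(sqdist(v, c) for c in centers)))
    for _ in range(iters):
        # Assign each vector to its nearest center, then recompute means.
        groups = [[] for _ in range(k)]
        for v in vectors:
            groups[min(range(k), key=lambda i: sqdist(v, centers[i]))].append(v)
        centers = [tuple(sum(x) / len(g) for x in zip(*g)) if g else centers[i]
                   for i, g in enumerate(groups)]
    def encode(v):
        return min(range(k), key=lambda i: sqdist(v, centers[i]))
    return centers, encode

# Hypothetical gesture features; the symbol sequence feeds an HMM.
feats = [(0.1, 0.0), (0.2, 0.1), (5.0, 5.1), (5.2, 4.9), (0.0, 0.2), (5.1, 5.0)]
centers, encode = kmeans_codebook(feats, 2)
symbols = [encode(v) for v in feats]  # discrete observation sequence
```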
Abstract: In this paper, a novel algorithm based on the Ridgelet
transform and support vector machines is proposed for human action
recognition. The Ridgelet transform is a directional multi-resolution
transform, well suited to describing human actions by extracting
their directional information to form spatial feature vectors. The
dynamic transition between the spatial features is handled using both
Principal Component Analysis and the K-means clustering
algorithm. First, Principal Component Analysis is used to reduce the
dimensionality of the obtained vectors. Then, the k-means algorithm
is used to process the obtained vectors into a spatio-temporal pattern,
called a set-of-labels, according to the given periodicity of the human
action. Finally, a Support Vector Machine classifier is used to
discriminate between the different human actions. Various tests were
conducted on popular datasets, such as Weizmann and KTH. The
obtained results show that the proposed method achieves a higher
accuracy rate and greater robustness in very challenging situations
such as lighting changes, scaling and dynamic environments.
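The PCA reduction step described above projects each feature vector onto the leading principal components; a minimal sketch via the covariance eigendecomposition, run on synthetic data (not the paper's Ridgelet features):

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project the rows of X onto the top principal components."""
    Xc = X - X.mean(axis=0)                    # center the data
    cov = np.cov(Xc, rowvar=False)             # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigenvalues in ascending order
    top = eigvecs[:, ::-1][:, :n_components]   # leading components first
    return Xc @ top

# Synthetic 5-D vectors whose variance lies almost entirely in 2 directions.
rng = np.random.default_rng(0)
base = rng.normal(size=(100, 2))
X = np.hstack([base, 0.01 * rng.normal(size=(100, 3))])
Z = pca_reduce(X, 2)   # 100 x 2 reduced representation
```

The reduced vectors `Z` would then be fed to k-means, as in the abstract's pipeline.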
Abstract: The rapid improvement of microprocessors and networks has made it possible for PC clusters to compete with conventional supercomputers. Many high-throughput applications can be served by current desktop PCs, especially those in PC classrooms, leaving the supercomputers for the demands of large-scale high-performance parallel computations. This paper presents our development of an automated deployment mechanism for cluster computing that utilizes the computing power of PCs such as those residing in PC classrooms. Once deployed, these PCs can be transformed immediately into a pre-configured cluster computing resource without touching the existing education/training environment installed on them. Thus, training activities are not affected by this additional activity of harvesting idle computing cycles. The time and manpower required to build and manage a computing platform in geographically distributed PC classrooms can also be reduced by this development.
Abstract: The vast amount of information hidden in huge
databases has created tremendous interest in the field of data
mining. This paper examines the possibility of using data clustering
techniques in oral medicine to identify functional relationships
between different attributes and to classify similar patient
examinations. Commonly used data clustering algorithms have been
reviewed, and several interesting results have been gathered.
Abstract: Lung cancer accounts for the most cancer-related deaths among men as well as among women. The identification of cancer-associated genes and the related pathways is essential, as it offers an important possibility for the prevention of many types of cancer. In this work two filter approaches, namely the information gain and the biomarker identifier (BMI), are used for the identification of different types of small-cell and non-small-cell lung cancer. A new method to determine the BMI thresholds is proposed that prioritizes genes (i.e., primary, secondary and tertiary) using a k-means clustering approach. Sets of key genes were identified that can be found in several pathways. It turned out that the modified BMI is well suited for microarray data, and BMI is therefore proposed as a powerful tool in the search for new and so far undiscovered cancer-related genes.
Abstract: Despite the relatively large number of studies that
have examined the use of appeals in advertisements, research on the
use of appeals in green advertisements is still underdeveloped and
needs to be investigated further, as it is a valuable tool for marketers
to create distinctive ads. In this study, content analysis was employed
to examine the nature of green advertising appeals and to match the
appeals with the green advertisements. Two different types of green
print advertisements, product orientation and organizational image
orientation, were used. Thirty highly educated participants with
different backgrounds were asked individually to ascertain three
appeals out of thirty-four given appeals found among forty real green
advertisements. To analyze participant responses and to group them
based on common appeals, two-step K-means clustering was used. The
clustering solution indicates that eye-catching graphics and
imaginative appeals are highly notable in both types of green ads.
Depressed, meaningful and sad appeals are found to be highly used in
organizational image orientation ads, whereas, corporate image,
informative and natural appeals are found to be essential for product
orientation ads.
Abstract: This study proposes a novel hybrid social network analysis and collaborative filtering (CF) approach to enhance the performance of recommender systems. The proposed model selects subgroups of users in an Internet community through social network analysis (SNA) and then performs clustering analysis using the information about those subgroups. Finally, it makes recommendations using cluster-indexing CF based on the clustering results. The study uses the cores of the subgroups as initial seeds for a conventional clustering algorithm: the model chooses the five cores with the highest degree centrality from the SNA and uses them as initial centroids (cluster centers) for the clustering analysis. The model then amplifies the impact of friends in the social network in the process of cluster-indexing CF.
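The seeding idea above, using high-degree-centrality cores as initial cluster centers, can be sketched generically. The graph and the per-user rating vectors below are toy assumptions, not the paper's data:

```python
def degree_centrality(adj):
    """Degree centrality: degree / (n - 1) for each node."""
    n = len(adj)
    return {v: len(nbrs) / (n - 1) for v, nbrs in adj.items()}

def top_cores(adj, k):
    """The k nodes with the highest degree centrality."""
    cent = degree_centrality(adj)
    return sorted(cent, key=cent.get, reverse=True)[:k]

# Toy friendship graph: users 0 and 3 are the hubs of two subgroups.
adj = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0, 4, 5], 4: [3], 5: [3]}
cores = top_cores(adj, 2)

# Hypothetical per-user rating vectors; the cores seed the cluster centers.
ratings = {0: (5.0, 1.0), 1: (4.8, 1.2), 2: (4.9, 0.8),
           3: (1.0, 5.0), 4: (1.2, 4.7), 5: (0.9, 5.1)}

def assign(v, centers):
    """Index of the nearest center (one k-means assignment step)."""
    return min(range(len(centers)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(v, centers[i])))

centers = [ratings[c] for c in cores]      # centrality-based initial centroids
labels = {u: assign(v, centers) for u, v in ratings.items()}
```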
Abstract: The interdependences among stock market indices
have long been studied by academics worldwide. The current
financial crisis has opened the door to a wide range of opinions
concerning the understanding and measurement of the connections
that constitute the controversial phenomenon of market
integration. Using data on the log-returns of 17 stock market indices
that include most of the CEE markets, from 2005 until 2009, our
paper studies the problem of these dependences using a new
methodological tool that takes into account both the volatility
clustering effect and the stochastic properties of these linkages
through a Dynamic Conditional System of Simultaneous Equations.
We find that the crisis is well captured by our model as it provides
evidence for the high volatility – high dependence effect.
Abstract: This paper makes a contribution to the on-going
debate on conceptualization and lexicalization of cutting and
breaking (C&B) verbs by discussing data from Telugu, a language of
India belonging to the Dravidian family. Five Telugu native speakers'
verbalizations of agentive actions depicted in 43 short video-clips
were analyzed. It was noted that verbalization of C&B events in
Telugu requires formal units such as simple lexical verbs, explicator
compound verbs, and other complex verb forms. The properties of
the objects involved, the kind of instruments used, and the manner of
action had differential influence on the lexicalization patterns.
Further, it was noted that all the complex verb forms encode 'result'
and 'cause' sub-events in that order. Due to the polysemy associated
with some of the verb forms, our data does not support the
straightforward bipartition of this semantic domain.
Abstract: Clustering is an interesting data mining topic
that can be applied in many fields. Recently, the problem of cluster
analysis is formulated as a problem of nonsmooth, nonconvex optimization,
and an algorithm for solving the cluster analysis problem
based on nonsmooth optimization techniques is developed. This
optimization problem has a number of characteristics that make it
challenging: it has many local minima, the optimization variables
can be either continuous or categorical, and there are no exact
analytical derivatives. In this study we show how to apply a particular
class of optimization methods known as pattern search methods
to address these challenges. These methods do not explicitly use
derivatives, an important feature that has not been addressed in
previous studies. Results of numerical experiments are presented
which demonstrate the effectiveness of the proposed method.
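A compass-style pattern search of the kind referred to above polls axis-aligned trial points around the current iterate and halves the step when no poll improves the objective. A minimal sketch on a smooth test function (not the clustering objective itself):

```python
def pattern_search(f, x0, step=1.0, tol=1e-6, max_iter=10000):
    """Derivative-free compass search: poll +/- step along each axis,
    move to any improving point, otherwise halve the step size."""
    x, fx = list(x0), f(x0)
    for _ in range(max_iter):
        improved = False
        for i in range(len(x)):
            for d in (step, -step):
                trial = list(x)
                trial[i] += d
                ft = f(trial)
                if ft < fx:
                    x, fx, improved = trial, ft, True
        if not improved:
            step /= 2          # no poll improved: refine the mesh
            if step < tol:
                break
    return x, fx

# Minimize a shifted quadratic; the minimum lies at (3, -1).
f = lambda v: (v[0] - 3) ** 2 + (v[1] + 1) ** 2
xmin, fmin = pattern_search(f, [0.0, 0.0])
```

No derivatives are evaluated at any point, which is exactly the property that makes this class of methods applicable to the nonsmooth clustering objective.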
Abstract: The cluster dimension of a network is defined as the minimum cardinality of a subset S of the set of nodes having the property that for any two distinct nodes x and y there exist nodes s1, s2 (not necessarily distinct) in S such that |d(x,s1) − d(y,s1)| ≥ 1 and d(x,s2) < d(x,s) for all s ∈ S − {s2}. In this paper, strictly non-overlapping clusters are constructed. The concept of a LandMarks for Unique Addressing and Clustering (LMUAC) routing scheme is developed. With the help of the LMUAC routing scheme, it is shown that the path length satisfies the upper bound PL_{N,d} < PL_D, the maximum memory space requirement for the network satisfies MS_{LMUAC}(λ_2) < MS_{LMUAC} < MS_{H3L} < MS_{HC}, and the maximum link utilization factor satisfies ML_{LMUAC}(λ=3) < ML_{LMUAC}(λ≠3) < ML_C.
Abstract: In this paper we used data mining techniques to
identify outlier patients who use large amounts of drugs over a
long period of time. Any healthcare or health insurance system
should deal with the quantities of drugs utilized by chronic-disease
patients. In the Kingdom of Bahrain, about 20% of the health budget
is spent on medications. The managers of healthcare systems do not
have enough information about how drugs are utilized by
chronic-disease patients, whether there is any misuse, or whether
there are outlier patients. In this work, which has been done in
cooperation with the information department of the Bahrain Defence
Force hospital, we selected the data for cardiac patients in the period
from 1/1/2008 to 31/12/2008 as the data for the model in this paper.
We used three techniques to analyze the drug utilization of cardiac
patients: first we applied a clustering technique, followed by a
measure of clustering validity, and finally we applied a decision
tree as a classification algorithm. The clustering partitions the 1603
patients, who received 15,806 prescriptions during this period, into
three groups according to drug utilization; 23 patients (2.59%), who
received 1316 prescriptions (8.32%), are classified as outliers. The
classification algorithm shows that average drug utilization and the
age and gender of the patient can be considered the main
predictive factors in the induced model.
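The cluster-validity step in a pipeline like the one above can be illustrated with the silhouette coefficient, a common validity measure (the data below are toy points, not the patient records):

```python
import math

def silhouette(points, labels):
    """Mean silhouette coefficient: (b - a) / max(a, b) per point, where
    a is the mean intra-cluster distance and b is the mean distance to
    the nearest other cluster."""
    def mean_dist(p, group):
        return sum(math.dist(p, q) for q in group) / len(group)
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    scores = []
    for p, l in zip(points, labels):
        own = [q for q in clusters[l] if q != p]
        if not own:                 # singleton cluster: score 0 by convention
            scores.append(0.0)
            continue
        a = mean_dist(p, own)
        b = min(mean_dist(p, clusters[m]) for m in clusters if m != l)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
good = silhouette(pts, [0, 0, 0, 1, 1, 1])   # well-separated partition
bad = silhouette(pts, [0, 1, 0, 1, 0, 1])    # labels that mix the blobs
```

A higher mean silhouette indicates a more valid partition, which is how such a measure can arbitrate between candidate clusterings.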
Abstract: Computing the facility location problem for every
location in a country simultaneously is not easy. We describe how to
solve the problem using cluster computing. The technique is to
design a parallel algorithm using local search with a single-swap
method to solve the problem on clusters. The parallel
implementation uses a portable parallel programming standard, the
Message Passing Interface (MPI), on a Microsoft Windows Compute
Cluster. This paper presents the algorithm, which uses local search
with the single-swap method, and the implementation that decides
which facilities to open using MPI on the cluster. For large datasets,
the process of calculating a reasonable cost for a facility becomes
time-consuming. The results show that parallel computation of the
facility location problem on the cluster speeds up and scales well as
the problem size increases.
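The single-swap local search named above can be sketched serially: starting from a set of open facilities, repeatedly try exchanging one open facility for a closed candidate and keep any swap that lowers the total assignment cost. This is a generic k-median-style sketch with toy coordinates; the paper's contribution is distributing the swap evaluations over MPI ranks:

```python
import math
from itertools import product

def total_cost(clients, open_facs):
    """Each client is served by its nearest open facility."""
    return sum(min(math.dist(c, f) for f in open_facs) for c in clients)

def single_swap_search(clients, facilities, k):
    """Local search: swap one open facility for a closed one for as long
    as some swap improves the total cost (first-improvement rule)."""
    open_facs = list(facilities[:k])
    cost = total_cost(clients, open_facs)
    improved = True
    while improved:
        improved = False
        closed = [f for f in facilities if f not in open_facs]
        for i, f_new in product(range(k), closed):
            trial = open_facs[:i] + [f_new] + open_facs[i + 1:]
            c = total_cost(clients, trial)
            if c < cost:
                open_facs, cost, improved = trial, c, True
                break
    return open_facs, cost

clients = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
facilities = [(5, 5), (0, 0), (10, 10), (20, 20)]
opened, cost = single_swap_search(clients, facilities, 2)
```

In a parallel version, each MPI rank would evaluate a disjoint subset of the candidate swaps, and the ranks would agree on the best improving swap per iteration.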
Abstract: In the semiconductor manufacturing process, large
amounts of data are collected from various sensors of multiple
facilities. The collected data from sensors have several different characteristics
due to variables such as types of products, former processes
and recipes. In general, Statistical Quality Control (SQC) methods
assume normality of the data to detect out-of-control states of
processes. Because the collected data have different characteristics,
using them directly as inputs of SQC increases the variation of the
data, requires wide control limits, and decreases the ability to detect
out-of-control states. Therefore, it is necessary to separate similar
data groups from the mixed data for more accurate process control.
In this paper, we propose a regression tree using a split algorithm
based on the Pearson distribution system to handle non-normal
distributions parametrically. The regression tree finds similar
properties of data across different variables. Experiments using real
semiconductor manufacturing process data show improved
fault-detection performance.
Abstract: The aim of this article is to assess the existing
business models used by the banks operating in the CEE countries in
the time period from 2006 to 2011.
In order to obtain research results, the authors performed
qualitative analysis of the scientific literature on bank business
models, which have been grouped into clusters that consist of such
components as: 1) capital and reserves; 2) assets; 3) deposits, and 4)
loans.
In turn, bank business models have been developed based on
the types of core activities of the banks and have been divided into
four groups: Wholesale, Investment, Retail and Universal Banks.
Descriptive statistics have been used to analyse the models,
determining mean, minimal and maximal values of constituent
cluster components, as well as standard deviation. The analysis of
the data is based on such bank variable indices as Return on Assets
(ROA) and Return on Equity (ROE).
Abstract: Terminal localization for indoor Wireless Local Area
Networks (WLANs) is critical for the deployment of location-aware
computing inside buildings. A major challenge is obtaining high
localization accuracy in the presence of fluctuations of the received signal
strength (RSS) measurements caused by multipath fading. This paper
focuses on reducing the effect of the distance-varying noise by spatial
filtering of the measured RSS. Two different survey point geometries
are tested with the noise reduction technique: survey points arranged
in sets of clusters and survey points uniformly distributed over the
network area. The results show that the location accuracy improves
by 16% when the filter is used and by 18% when the filter is applied
to a clustered survey set as opposed to a straight-line survey set.
The estimated locations are within 2 m of the true location, which
indicates that clustering the survey points provides better localization
accuracy due to superior noise removal.
Abstract: This paper describes the optimization of a complex
dairy farm simulation model using two quite different methods of
optimization, the Genetic algorithm (GA) and the Lipschitz
Branch-and-Bound (LBB) algorithm. These techniques have been
used to improve an agricultural system model developed by Dexcel
Limited, New Zealand, which describes a detailed representation of
pastoral dairying scenarios and contains an 8-dimensional parameter
space. The model incorporates the sub-models of pasture growth and
animal metabolism, which are themselves complex in many cases.
Each evaluation of the objective function, a composite 'Farm
Performance Index (FPI)', requires simulation of at least a one-year
period of farm operation with a daily time-step, and is therefore
computationally expensive. The problem of visualization of the
objective function (response surface) in high-dimensional spaces is
also considered in the context of the farm optimization problem.
Adaptations of the Sammon mapping and parallel coordinates
visualization are described which help visualize some important
properties of the model's output topography. From this study, it is
found that GA requires fewer function evaluations in optimization
than the LBB algorithm.
Abstract: Clustering large populations is an important problem
when the data contain noise and clusters of different shapes. A good
clustering algorithm or approach should be efficient and sensitive
enough to detect clusters. Besides space complexity, time complexity
also gains importance as the data size grows. Using hierarchies, we
developed a new algorithm that splits attributes according to their
values, choosing the splitting dimension so as to divide the database
into parts as close to equal as possible. At each node we calculate
certain descriptive statistical features of the resident data, and by
pruning we generate the natural clusters with a complexity of O(n).
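The splitting rule described, choosing the dimension whose split divides the database into roughly equal parts, resembles a kd-tree-style median split. A minimal sketch of that splitting idea on toy points (an illustration of the rule, not the authors' full algorithm with its descriptive statistics and pruning):

```python
def median_split(points, min_size=2):
    """Recursively split on the widest dimension at the median, so each
    split divides the data into two roughly equal halves."""
    if len(points) <= min_size:
        return [points]
    dims = len(points[0])
    # Choose the dimension with the largest value spread.
    spreads = [max(p[d] for p in points) - min(p[d] for p in points)
               for d in range(dims)]
    d = spreads.index(max(spreads))
    pts = sorted(points, key=lambda p: p[d])
    mid = len(pts) // 2
    return median_split(pts[:mid], min_size) + median_split(pts[mid:], min_size)

pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
leaves = median_split(pts, min_size=3)   # two leaves of three points each
```

Because every node is visited a bounded number of times and each split halves its subset, the tree over n points is built with near-linear work, in line with the complexity claim above.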