Abstract: An effective approach for extracting document images from a noisy background is introduced. The entire scheme is divided into three sub-techniques: the initial preprocessing operations for noise cluster tightening; a new thresholding method that maximizes the ratio of standard deviations of the combined effect on the image to the sum of weighted classes; and finally the image restoration phase, in which the image is binarized using the proposed optimum threshold level. The proposed method is found to be efficient compared to the existing schemes in terms of computational complexity as well as speed, with better noise rejection.
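The abstract above does not spell out its thresholding criterion, so the following is only a reference sketch of the general idea (pick the gray level that best separates two classes, then binarize). It uses the classic Otsu between-class-variance criterion as a stand-in, not the proposed method; the function names and the toy pixel list are illustrative assumptions.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes Otsu's between-class separation."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_score = 0, -1.0
    w0 = 0    # number of pixels with value <= t
    sum0 = 0  # sum of those pixel values
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, no separation to measure
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Proportional to the between-class variance at threshold t.
        score = w0 * w1 * (mu0 - mu1) ** 2
        if score > best_score:
            best_t, best_score = t, score
    return best_t


def binarize(pixels, t):
    """Map pixels above the threshold to 1 and the rest to 0."""
    return [1 if p > t else 0 for p in pixels]


# Toy data (hypothetical): a dark cluster and a bright cluster.
pixels = [10, 12, 11, 13] * 5 + [200, 205, 198, 210] * 5
t = otsu_threshold(pixels)
binary = binarize(pixels, t)
```

Any criterion that scores a candidate threshold from the two class statistics could replace the `score` line without changing the surrounding loop.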
Abstract: This paper presents an algorithm based on the
wavelet decomposition, for feature extraction from the ECG signal
and recognition of three types of Ventricular Arrhythmias using
neural networks. A set of Discrete Wavelet Transform (DWT)
coefficients, which contain the maximum information about the
arrhythmias, is selected from the wavelet decomposition. After that, a
novel clustering algorithm based on a nature-inspired algorithm (Ant
Colony Optimization) is developed for classifying arrhythmia types.
The algorithm is applied to ECG recordings from the MIT-BIH
arrhythmia and malignant ventricular arrhythmia databases. We
applied the Daubechies 4 wavelet in our algorithm. The wavelet
decomposition enabled us to perform the task efficiently and
produced reliable results.
Abstract: The statistical analysis of medical data often requires
special techniques because of the particularities of
these data. Principal components analysis and data clustering
are two statistical data mining methods that are very useful in the medical
field, the first one as a method to decrease the number of studied
parameters, and the second one as a method to analyze the
connections between diagnosis and the data about the patient's
condition. In this paper we investigate the implications obtained from
a specific data analysis technique: the data clustering preceded by a
selection of the most relevant parameters, made using the principal
components analysis. Our assumption was that using principal
components analysis before data clustering, in order to select and
classify only the most relevant parameters, would improve the accuracy of
clustering, but the practical results showed the opposite:
the clustering accuracy decreases by a percentage
approximately equal to the percentage of information loss reported
by the principal components analysis.
Abstract: Wireless Sensor Networks consist of inexpensive, low power sensor nodes deployed to monitor the environment and collect data. Gathering information in an energy efficient manner is critical to prolonging the network lifetime. Clustering algorithms have the advantage of enhancing the network lifetime. Current clustering algorithms usually focus on global re-clustering and local re-clustering separately. This paper proposes a combination of those two re-clustering methods to reduce the energy consumption of the network. Furthermore, the proposed algorithm can be applied to homogeneous as well as heterogeneous wireless sensor networks. In addition, cluster head rotation happens only when the cluster head's energy drops below a dynamic threshold value computed by the algorithm. The simulation results show that the proposed algorithm prolongs the network lifetime compared to existing algorithms.
Abstract: Many advanced routing protocols for wireless sensor networks have been implemented for the effective routing of data. Energy awareness is an essential design issue, and almost all of these routing protocols are considered energy efficient; their ultimate objective is to maximize the lifetime of the whole network. However, the introduction of video and imaging sensors has posed additional challenges. Transmission of video and imaging data requires both energy-aware and QoS-aware routing in order to ensure efficient usage of the sensors and effective access to the gathered measurements. In this paper, the performance of the energy-aware QoS routing protocol is analyzed using different performance metrics such as average lifetime of a node, average delay per packet and network throughput. The parameters considered in this study are end-to-end delay, real-time data generation/capture rates, packet drop probability and buffer size. The network throughput for real-time and non-real-time data has also been analyzed. The simulation was done in the NS2 simulation environment and the simulation results were analyzed with respect to different metrics.
Abstract: Due to the constant increase in the volume of information available to applications in fields varying from medical diagnosis to web search engines, accurate support of similarity becomes an important task. This is also the case for spam filtering techniques, where the similarities between the known and incoming messages are the basis for making the spam/not-spam decision. We present a novel approach to filtering based solely on layout, whose goal is not only to correctly identify spam, but also to warn about major emerging threats. We propose a mathematical formulation of the email message layout and, based on it, we elaborate an algorithm to separate different types of emails and find the new, numerically relevant spam types.
Abstract: A computer cluster is a group of tightly coupled
computers that work together closely so that in many respects they
can be viewed as though they are a single computer. The components
of a cluster are commonly, but not always, connected to each other
through fast local area networks. Clusters are usually deployed to
improve performance and/or availability over that provided by a
single computer, while typically being much more cost-effective than
single computers of comparable speed or availability. This paper
proposes a way to implement a Beowulf cluster in order to
achieve high performance as well as high availability.
Abstract: A new approach to predicting the 3D structures of proteins by combining a knowledge-based method and Molecular Dynamics Simulation is presented on the chicken villin headpiece subdomain (HP-36). Comparative modeling is employed as the knowledge-based method to predict the core region (Ala9-Asn28) of the protein, while the remaining residues are built as extended regions (Met1-Lys8; Leu29-Phe36), which are then further refined using Molecular Dynamics Simulation for 120 ns. Since the core region is built based on a high sequence identity to the template (65%), resulting in an RMSD of 1.39 Å from the native structure, it is believed that this well-developed core region can act as a 'nucleation center' for subsequent rapid downhill folding. Results also demonstrate that the formation of non-native contacts, which tend to hamper the folding rate, can be avoided. The best 3D model, which exhibits most of the native characteristics, is identified using a clustering method and then further ranked based on the conformational free energies. It is found that the backbone RMSD of the best model compared to the NMR-MDavg is 1.01 Å and 3.53 Å for the core region and the complete protein, respectively. In addition, the conformational free energy of the best model is lower by 5.85 kcal/mol compared to the NMR-MDavg. This structure prediction protocol is shown to be effective in predicting the 3D structure of small globular proteins with considerable accuracy in a much shorter time than conventional Molecular Dynamics simulation alone.
Abstract: Wireless sensor networks can be applied in both harsh
and military environments. A primary goal in the design of
wireless sensor networks is lifetime maximization, constrained by
the energy capacity of batteries. One well-known method to reduce
energy consumption in such networks is data aggregation. Providing
efficient data aggregation while preserving data privacy is a challenging
problem in wireless sensor networks research. In this paper,
we present a privacy-preserving data aggregation scheme for additive
aggregation functions. The Cluster-based Private Data Aggregation
(CPDA) scheme leverages a clustering protocol and algebraic properties of
polynomials. It has the advantage of incurring less communication
overhead. The goal of our work is to bridge the gap between
collaborative data collection by wireless sensor networks and data
privacy. We present simulation results of our schemes and compare
their performance to a typical data aggregation scheme TAG, where
no data privacy protection is provided. Results show the efficacy and
efficiency of our schemes.
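The abstract above only states that CPDA relies on algebraic properties of polynomials for additive aggregation. The sketch below illustrates one way that algebra can work, under loudly stated assumptions: each cluster member hides its reading as the constant term of a random polynomial, evaluates it at every member's public, distinct, non-zero seed, and the cluster head interpolates the summed polynomial at zero. Encryption of the exchanged shares and cluster formation are omitted, and all names (`make_shares`, `cluster_sum`, the seed values) are hypothetical.

```python
import random
from fractions import Fraction


def make_shares(value, seeds, rng):
    """Evaluate value + r1*x + r2*x^2 + ... at each public seed x."""
    coeffs = [rng.randrange(1, 1000) for _ in range(len(seeds) - 1)]
    return [
        value + sum(c * x ** (k + 1) for k, c in enumerate(coeffs))
        for x in seeds
    ]


def cluster_sum(readings, seeds, rng):
    """Recover sum(readings) without any member revealing its own reading."""
    # Row i holds the shares produced by member i; column j goes to member j.
    all_shares = [make_shares(v, seeds, rng) for v in readings]
    # Member j adds the shares it received: a point on the summed polynomial.
    assembled = [sum(row[j] for row in all_shares) for j in range(len(seeds))]
    # The cluster head evaluates the summed polynomial at 0 by Lagrange
    # interpolation; its constant term is the sum of all readings.
    total = Fraction(0)
    for j, xj in enumerate(seeds):
        weight = Fraction(1)
        for m, xm in enumerate(seeds):
            if m != j:
                weight *= Fraction(-xm, xj - xm)
        total += weight * assembled[j]
    return total


# Hypothetical three-node cluster with public seeds 1, 2, 3.
result = cluster_sum([21, 17, 30], [1, 2, 3], random.Random(0))
```

Exact rational arithmetic (`Fraction`) is used so that the recovered sum is not affected by floating-point rounding.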
Abstract: We compare three categorical data clustering
algorithms with respect to the problem of classifying cultural data
related to the aesthetic judgment of comics artists. Such a
classification is very important in Comics Art theory, since the
identification of classes of similarity in such data will
provide art historians with very fruitful information about Comics Art's
evolution. To establish this, we use a categorical data set and
study it by employing three categorical data clustering algorithms.
The performances of these algorithms are compared with each other,
and interpretations of the clustering results are also given.
Abstract: In this paper, we propose a fast and efficient method for drawing very large-scale graph data. The conventional force-directed method proposed by Fruchterman and Rheingold (FR method) is well-known. It defines repulsive forces between every pair of nodes and attractive forces between nodes connected by an edge, and calculates the corresponding potential energy. An optimal layout is obtained by iteratively updating node positions to minimize the potential energy. Here, the positions of all nodes are updated at the same time at every global time step. In the proposed method, each node has its own individual time and time step, and nodes are updated at different frequencies depending on the local situation. The proposed method is inspired by the hierarchical individual time step method used for high-accuracy calculations of dense particle fields such as star clusters in astrophysical dynamics. Experiments show that the proposed method outperforms the original FR method in both speed and accuracy. We implement the proposed method on the MDGRAPE-3 PCI-X special purpose parallel computer and realize a speed enhancement of several hundred times.
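For orientation, the conventional global update described in the abstract above can be sketched as follows. This is a minimal sketch of one Fruchterman-Reingold iteration with the usual force laws (repulsion k²/d between every pair, attraction d²/k along edges) and a temperature cap on the step length. It is not the proposed individual-time-step variant, and the function name, parameter defaults, and toy graph are assumptions.

```python
import math


def fr_step(pos, edges, k=1.0, temperature=0.1):
    """One global FR update: every node moves in the same time step."""
    nodes = list(pos)
    disp = {v: [0.0, 0.0] for v in nodes}

    # Repulsive force k^2 / d between every pair of nodes.
    for i, u in enumerate(nodes):
        for v in nodes[i + 1:]:
            dx = pos[u][0] - pos[v][0]
            dy = pos[u][1] - pos[v][1]
            d = math.hypot(dx, dy) or 1e-9
            f = k * k / d
            disp[u][0] += dx / d * f
            disp[u][1] += dy / d * f
            disp[v][0] -= dx / d * f
            disp[v][1] -= dy / d * f

    # Attractive force d^2 / k between nodes connected by an edge.
    for u, v in edges:
        dx = pos[u][0] - pos[v][0]
        dy = pos[u][1] - pos[v][1]
        d = math.hypot(dx, dy) or 1e-9
        f = d * d / k
        disp[u][0] -= dx / d * f
        disp[u][1] -= dy / d * f
        disp[v][0] += dx / d * f
        disp[v][1] += dy / d * f

    # Move each node along its net force, limited by the temperature.
    new_pos = {}
    for v in nodes:
        length = math.hypot(disp[v][0], disp[v][1]) or 1e-9
        step = min(length, temperature)
        new_pos[v] = (
            pos[v][0] + disp[v][0] / length * step,
            pos[v][1] + disp[v][1] / length * step,
        )
    return new_pos


# Two connected nodes placed far apart are pulled toward each other.
layout = fr_step({"a": (0.0, 0.0), "b": (5.0, 0.0)}, [("a", "b")])
```

The all-pairs repulsion loop is the quadratic cost that makes the global update expensive on very large graphs.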
Abstract: Essential hypertension (HTN) usually clusters with other cardiovascular risk factors such as age, overweight, diabetes, insulin resistance and dyslipidemia. Target organ damage (TOD) such as left ventricular hypertrophy, microalbuminuria (MA), acute coronary syndrome (ACS), stroke and cognitive dysfunction takes place early in the course of hypertension. Though the prevalence of hypertension is high in India, the relationship between microalbuminuria and target organ damage in hypertension is not well studied. This study aims at detecting MA in essential hypertension and its relation to severity of HTN, duration of HTN, body mass index (BMI), age and TOD such as hypertensive retinopathy and acute coronary syndrome. The present study was done in 100 non-diabetic patients with essential hypertension admitted to B.L.D.E. University's Sri B.M. Patil Medical College, Bijapur, from October 2008 to April 2011. The patients underwent detailed history taking and clinical examination. An early morning urine sample of 5 ml was collected and MA was estimated by the immunoturbidometry method. The relationship of MA with the duration and severity of HTN, BMI, age, sex and TODs such as hypertensive retinopathy and ACS was assessed by univariate analysis. The prevalence of MA in this study was found to be 63% (42% male and 21% female). A significant association was found between MA and the duration of hypertension (p = 0.036, OR = 0.438): the longer the duration of hypertension, the greater the possibility of microalbumin in urine. There was also a significant association between severity of hypertension and MA (p = 0.045, OR = 0.093). MA was positive in 50 (79.4%) patients out of 63, whose blood pressure was >160/100 mm Hg. A significant association was also found between MA and the grades of hypertensive retinopathy (p = 0.011) and acute coronary syndrome (p = 0.041, OR = 2.805).
Gender and BMI did not pose a high risk for MA in this study. The prevalence of MA in essential hypertension is high in this part of the community, and MA will increase the risk of developing target organ damage. Early screening of patients with essential hypertension for MA and aggressive management of positive cases might reduce the burden of chronic kidney diseases and cardiovascular diseases in the community.
Abstract: As the number of networked computers grows,
intrusion detection is an essential component in keeping networks
secure. Various approaches for intrusion detection are currently
in use, each with its own merits and demerits. This
paper presents our work to test and improve the performance of a
new class of decision tree, the c-fuzzy decision tree, in detecting intrusions.
The work also includes identifying the best candidate feature subset to
build an efficient c-fuzzy decision tree based Intrusion Detection
System (IDS). We investigated the usefulness of the c-fuzzy decision
tree for developing an IDS with a data partition based on horizontal
fragmentation. Empirical results indicate the usefulness of our
approach in developing an efficient IDS.
Abstract: In this paper a Pattern Recognition algorithm based on
a constrained version of the k-means clustering algorithm will be
presented. The proposed algorithm is a nonparametric supervised
statistical pattern recognition algorithm, i.e. it works under very mild
assumptions on the dataset. The performance of the algorithm will
be tested, together with a feature extraction technique that captures
the information on the closed two-dimensional contour of an image,
on images of industrial mineral ores.
Abstract: Wireless mesh networks based on IEEE 802.11
technology are a scalable and efficient solution for next generation
wireless networking to provide wide-area wideband internet access to
a significant number of users. These wireless mesh
networks may be deployed by different authorities and without any
planning, so they potentially overlap, partially or completely, in
the same service area. The aim of this work is to design a new
model that enhances the throughput of unplanned wireless mesh
network (WMN) deployments using a partitioning hierarchical cluster
(PHC), since the unplanned deployment of WMNs determines their
performance. We use a throughput optimization approach to model the
unplanned WMN deployment problem based on a partitioning
hierarchical cluster (PHC) architecture. In this paper,
bridge nodes that allow interworking traffic between
these WMNs are used as a solution to performance degradation.
Abstract: The classical temporal scan statistic is often used to
identify disease clusters. In recent years, this method has become a
very popular technique and its field of application has notably
widened. Many bioinformatic problems have been solved with this
technique. In this paper, a new fuzzy scan method is proposed. The
behaviors of the classic and fuzzy scan techniques are studied with
simulated data. ROC curves are calculated, demonstrating the
superiority of the fuzzy scan technique.
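The abstract above does not say which formulation of the temporal scan statistic it uses. As a minimal sketch, assuming the fixed-width version (the largest number of events falling in any time window of a given length), the classic statistic can be computed with a sliding window over sorted event times. The function name and the sample event times are hypothetical; significance testing and the fuzzy variant are not shown.

```python
def temporal_scan(event_times, window):
    """Return (max_count, window_start) over all windows of the given width."""
    times = sorted(event_times)
    best_count, best_start = 0, None
    left = 0
    for right, t in enumerate(times):
        # Shrink the window until it spans at most `window` time units.
        while t - times[left] > window:
            left += 1
        count = right - left + 1
        if count > best_count:
            best_count, best_start = count, times[left]
    return best_count, best_start


# Hypothetical case onset days; the densest 5-day window starts at day 30.
count, start = temporal_scan([1, 2, 30, 31, 32, 33, 60, 90], window=5)
```

It is enough to consider windows that start at an event, since any other window can be shifted forward to the next event without losing cases.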
Abstract: A dissimilarity measure between the empirical
characteristic functions of the subsamples associated with the different
classes in a multivariate data set is proposed. This measure can be
efficiently computed, and it depends on all the cases of each class. It
may be used to find groups of similar classes, which could be joined
for further analysis, or it could be employed to perform an
agglomerative hierarchical cluster analysis of the set of classes. The
final tree can serve to build a family of binary classification models,
offering an alternative approach to the multi-class SVM problem. We
have compared this dendrogram-based SVM approach with the
one-against-one SVM approach on four publicly available data sets,
three of them being microarray data. Both performances have been
found equivalent, but the first solution requires a smaller number of
binary SVM models.
Abstract: Psoriasis is a chronic inflammatory skin condition
which affects 2-3% of the population around the world. The Psoriasis Area
and Severity Index (PASI) is a gold standard for assessing psoriasis
severity as well as treatment efficacy. Although a gold standard,
PASI is rarely used because it is tedious and complex. In practice,
the PASI score is determined subjectively by dermatologists; therefore
inter- and intra-rater variations in assessment can occur even
among expert dermatologists. This research develops an algorithm to
assess psoriasis lesions objectively for PASI scoring. The focus of this
research is thickness assessment, one of the four PASI parameters
besides area, erythema and scaliness. Psoriasis lesion thickness is
measured by averaging the total elevation from lesion base to lesion
surface. Thickness values of 122 3D images taken from 39 patients
are grouped into 4 PASI thickness scores using K-means clustering.
Validation of the lesion base construction is performed using twelve
body curvature models and shows good results, with a coefficient of
determination (R2) equal to 1.
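The grouping step mentioned above (scalar thickness values clustered into four groups with K-means) can be sketched as follows. This is a minimal one-dimensional Lloyd's iteration under stated assumptions: the initialization at evenly spaced positions of the sorted data, the function name, and the thickness values are all illustrative and are not taken from the study.

```python
def kmeans_1d(values, k=4, iterations=100):
    """Cluster scalar values into k groups; return (centroids, clusters)."""
    data = sorted(values)
    # Assumed initialization: evenly spaced positions in the sorted data.
    centroids = [data[(2 * i + 1) * len(data) // (2 * k)] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for v in data:
            nearest = min(range(k), key=lambda c: abs(v - centroids[c]))
            clusters[nearest].append(v)
        # Update step: each centroid becomes the mean of its cluster.
        new_centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # assignments are stable
        centroids = new_centroids
    return centroids, clusters


# Hypothetical lesion thickness values (arbitrary units).
thickness = [0.10, 0.12, 0.11, 0.50, 0.52, 0.48,
             1.00, 1.10, 0.95, 2.00, 2.10, 1.90]
centroids, clusters = kmeans_1d(thickness, k=4)
```

Because the centroids start in ascending order and the data are one-dimensional, the cluster index can serve directly as an ordinal group label.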
Abstract: Much research has been carried out on the analysis of
traces in digital learning environments. These studies produce large
volumes of usage traces from the various actions performed by a
user. However, exploiting these data to compare and improve
performance raises several issues, which a number of recent works
have addressed. This research studies a series of
questions about the format and description of the data to be shared. Our
goal is to share thoughts on these issues by presenting our experience
in the analysis of trace-based log files, comparing several approaches
to automatic classification applied to e-learning platforms.
Finally, the obtained results are discussed.
Abstract: Consider the mass production of HDD arms, where
hundreds of CNC machines are used to manufacture the HDD arms.
Owing to the overwhelming number of machines and arm
models, constructing a separate control chart to monitor each HDD
arm model on each machine is not feasible. This research proposes a
strategy to optimize SPC management on the shop floor. The
procedure starts by identifying clusters of machines with
similar manufacturing performance using a clustering technique. The
three-way control chart (I-MR-R) is then applied to each
clustered group of machines. The proposed approach is
advantageous to the manufacturer in terms of not only better
SPC performance but also the quality management paradigm.
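As an illustration of the charting step above, the sketch below computes limits for the individuals (I) and moving-range (MR) parts of such a chart using the standard constants for moving ranges of size two (2.66 = 3/1.128 and 3.267). In a three-way chart these would typically be applied to subgroup means, with a separate range (R) chart for within-subgroup variation, which is not shown. The function name and the sample values are hypothetical.

```python
def i_mr_limits(values):
    """Return (LCL, center, UCL) for the I chart and the MR chart."""
    # Moving ranges: absolute differences between consecutive points.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    center = sum(values) / len(values)
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    return {
        # I chart: center line +/- 2.66 * average moving range.
        "I": (center - 2.66 * mr_bar, center, center + 2.66 * mr_bar),
        # MR chart: lower limit 0, upper limit 3.267 * average moving range.
        "MR": (0.0, mr_bar, 3.267 * mr_bar),
    }


# Hypothetical subgroup means from one cluster of machines.
limits = i_mr_limits([10, 12, 11, 13, 12])
```

Computing one set of limits per machine cluster, rather than per machine and arm model, is what keeps the number of charts manageable.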