Abstract: Studying DNA (deoxyribonucleic acid) sequence is useful in biological processes and it is applied in the fields such as diagnostic and forensic research. DNA is the hereditary information in human and almost all other organisms. It is passed to their generations. Earlier stage detection of defective DNA sequence may lead to many developments in the field of Bioinformatics. Nowadays various tedious techniques are used to identify defective DNA. The proposed work is to analyze and identify the cancer-causing DNA motif in a given sequence. Initially the human DNA sequence is separated as k-mers using k-mer separation rule. The separated k-mers are clustered using Self Organizing Map (SOM). Using Levenshtein distance measure, cancer associated DNA motif is identified from the k-mer clusters. Experimental results of this work indicate the presence or absence of cancer causing DNA motif. If the cancer associated DNA motif is found in DNA, it is declared as the cancer disease causing DNA sequence. Otherwise the input human DNA is declared as normal sequence. Finally, elapsed time is calculated for finding the presence of cancer causing DNA motif using clustering formation. It is compared with normal process of finding cancer causing DNA motif. Locating cancer associated motif is easier in cluster formation process than the other one. The proposed work will be an initiative aid for finding genetic disease related research.
Abstract: Customer churn prediction is one of the most useful
areas of study in customer analytics. Due to the enormous amount
of data available for such predictions, machine learning and data
mining have been heavily used in this domain. There exist many
machine learning algorithms directly applicable for the problem of
customer churn prediction, and here, we attempt to experiment on
a novel approach by using a cognitive learning based technique in
an attempt to improve the results obtained by using a combination
of supervised learning methods, with cognitive unsupervised learning
methods.
Abstract: Textual data plays an important role in the modern
world. The possibilities of applying data mining techniques to
uncover hidden information present in large volumes of text
collections is immense. The Growing Self Organizing Map (GSOM)
is a highly successful member of the Self Organising Map family
and has been used as a clustering and visualisation tool across wide
range of disciplines to discover hidden patterns present in the data.
A comprehensive analysis of the GSOM’s capabilities as a text
clustering and visualisation tool has so far not been published. These
functionalities, namely map visualisation capabilities, automatic
cluster identification and hierarchical clustering capabilities are
presented in this paper and are further demonstrated with experiments
on a benchmark text corpus.
Abstract: Several studies have been carried out, using various techniques, including neural networks, to discriminate vigilance states in humans from electroencephalographic (EEG) signals, but we are still far from results satisfactorily useable results. The work presented in this paper aims at improving this status with regards to 2 aspects. Firstly, we introduce an original procedure made of the association of two neural networks, a self organizing map (SOM) and a learning vector quantization (LVQ), that allows to automatically detect artefacted states and to separate the different levels of vigilance which is a major breakthrough in the field of vigilance. Lastly and more importantly, our study has been oriented toward real-worked situation and the resulting model can be easily implemented as a wearable device. It benefits from restricted computational and memory requirements and data access is very limited in time. Furthermore, some ongoing works demonstrate that this work should shortly results in the design and conception of a non invasive electronic wearable device.
Abstract: A set of Artificial Neural Network (ANN) based methods
for the design of an effective system of speech recognition of
numerals of Assamese language captured under varied recording
conditions and moods is presented here. The work is related to
the formulation of several ANN models configured to use Linear
Predictive Code (LPC), Principal Component Analysis (PCA) and
other features to tackle mood and gender variations uttering numbers
as part of an Automatic Speech Recognition (ASR) system in
Assamese. The ANN models are designed using a combination of
Self Organizing Map (SOM) and Multi Layer Perceptron (MLP)
constituting a Learning Vector Quantization (LVQ) block trained in a
cooperative environment to handle male and female speech samples
of numerals of Assamese- a language spoken by a sizable population
in the North-Eastern part of India. The work provides a comparative
evaluation of several such combinations while subjected to handle
speech samples with gender based differences captured by a microphone
in four different conditions viz. noiseless, noise mixed, stressed
and stress-free.
Abstract: In this paper a one-dimension Self Organizing Map
algorithm (SOM) to perform feature selection is presented. The
algorithm is based on a first classification of the input dataset on a
similarity space. From this classification for each class a set of
positive and negative features is computed. This set of features is
selected as result of the procedure. The procedure is evaluated on an
in-house dataset from a Knowledge Discovery from Text (KDT)
application and on a set of publicly available datasets used in
international feature selection competitions. These datasets come
from KDT applications, drug discovery as well as other applications.
The knowledge of the correct classification available for the training
and validation datasets is used to optimize the parameters for positive
and negative feature extractions. The process becomes feasible for
large and sparse datasets, as the ones obtained in KDT applications,
by using both compression techniques to store the similarity matrix
and speed up techniques of the Kohonen algorithm that take
advantage of the sparsity of the input matrix. These improvements
make it feasible, by using the grid, the application of the
methodology to massive datasets.
Abstract: With the explosive growth of data available on the
Internet, personalization of this information space become a
necessity. At present time with the rapid increasing popularity of the
WWW, Websites are playing a crucial role to convey knowledge and
information to the end users. Discovering hidden and meaningful
information about Web users usage patterns is critical to determine
effective marketing strategies to optimize the Web server usage for
accommodating future growth. The task of mining useful information
becomes more challenging when the Web traffic volume is enormous
and keeps on growing. In this paper, we propose a intelligent model
to discover and analyze useful knowledge from the available Web
log data.
Abstract: In this paper, a model for an information retrieval
system is proposed which takes into account that knowledge about
documents and information need of users are dynamic. Two
methods are combined, one qualitative or symbolic and the other
quantitative or numeric, which are deemed suitable for many
clustering contexts, data analysis, concept exploring and
knowledge discovery. These two methods may be classified as
inductive learning techniques. In this model, they are introduced to
build “long term" knowledge about past queries and concepts in a
collection of documents. The “long term" knowledge can guide
and assist the user to formulate an initial query and can be
exploited in the process of retrieving relevant information. The
different kinds of knowledge are organized in different points of
view. This may be considered an enrichment of the exploration
level which is coherent with the concept of document/query
structure.
Abstract: This work presents a neural network model for the
clustering analysis of data based on Self Organizing Maps (SOM).
The model evolves during the training stage towards a hierarchical
structure according to the input requirements. The hierarchical structure
symbolizes a specialization tool that provides refinements of the
classification process. The structure behaves like a single map with
different resolutions depending on the region to analyze. The benefits
and performance of the algorithm are discussed in application to the
Iris dataset, a classical example for pattern recognition.
Abstract: Cosmic showers, during the transit through space, produce
sub - products as a result of interactions with the intergalactic
or interstellar medium which after entering earth generate secondary
particles called Extensive Air Shower (EAS). Detection and analysis
of High Energy Particle Showers involve a plethora of theoretical and
experimental works with a host of constraints resulting in inaccuracies
in measurements. Therefore, there exist a necessity to develop a
readily available system based on soft-computational approaches
which can be used for EAS analysis. This is due to the fact that soft
computational tools such as Artificial Neural Network (ANN)s can be
trained as classifiers to adapt and learn the surrounding variations. But
single classifiers fail to reach optimality of decision making in many
situations for which Multiple Classifier System (MCS) are preferred
to enhance the ability of the system to make decisions adjusting
to finer variations. This work describes the formation of an MCS
using Multi Layer Perceptron (MLP), Recurrent Neural Network
(RNN) and Probabilistic Neural Network (PNN) with data inputs
from correlation mapping Self Organizing Map (SOM) blocks and
the output optimized by another SOM. The results show that the setup
can be adopted for real time practical applications for prediction
of primary energy and location of EAS from density values captured
using detectors in a circular grid.
Abstract: Embedded systems need to respect stringent real
time constraints. Various hardware components included in such
systems such as cache memories exhibit variability and therefore
affect execution time. Indeed, a cache memory access from an
embedded microprocessor might result in a cache hit where the
data is available or a cache miss and the data need to be fetched
with an additional delay from an external memory. It is therefore
highly desirable to predict future memory accesses during
execution in order to appropriately prefetch data without incurring
delays. In this paper, we evaluate the potential of several artificial
neural networks for the prediction of instruction memory
addresses. Neural network have the potential to tackle the nonlinear
behavior observed in memory accesses during program
execution and their demonstrated numerous hardware
implementation emphasize this choice over traditional forecasting
techniques for their inclusion in embedded systems. However,
embedded applications execute millions of instructions and
therefore millions of addresses to be predicted. This very
challenging problem of neural network based prediction of large
time series is approached in this paper by evaluating various neural
network architectures based on the recurrent neural network
paradigm with pre-processing based on the Self Organizing Map
(SOM) classification technique.
Abstract: Face Recognition has always been a fascinating research area. It has drawn the attention of many researchers because of its various potential applications such as security systems, entertainment, criminal identification etc. Many supervised and unsupervised learning techniques have been reported so far. Principal Component Analysis (PCA), Self Organizing Maps (SOM) and Independent Component Analysis (ICA) are the three techniques among many others as proposed by different researchers for Face Recognition, known as the unsupervised techniques. This paper proposes integration of the two techniques, SOM and PCA, for dimensionality reduction and feature selection. Simulation results show that, though, the individual techniques SOM and PCA itself give excellent performance but the combination of these two can also be utilized for face recognition. Experimental results also indicate that for the given face database and the classifier used, SOM performs better as compared to other unsupervised learning techniques. A comparison of two proposed methodologies of SOM, Local and Global processing, shows the superiority of the later but at the cost of more computational time.
Abstract: Integration of system process information obtained
through an image processing system with an evolving knowledge
database to improve the accuracy and predictability of wear particle
analysis is the main focus of the paper. The objective is to automate
intelligently the analysis process of wear particle using classification
via self organizing maps. This is achieved using relationship
measurements among corresponding attributes of various
measurements for wear particle. Finally, visualization technique is
proposed that helps the viewer in understanding and utilizing these
relationships that enable accurate diagnostics.