Abstract: The use of neural networks for recognition applications is generally constrained by the inflexibility of their parameters after the training phase. This means that no adaptation is made for input variations that affect the network parameters. In this work, we attempt to design a neural network that includes an additional mechanism to adjust the threshold values according to variations in the input pattern. The new approach is based on splitting the whole network into two subnets: a traditional main net and a supportive net. The first produces the required output for trained patterns with predefined settings, while the second generates output dynamically, with a tuning capability for any newly applied input. This tuning takes the form of an adjustment to the threshold values. Two levels of supportive net were studied: one implements an additional layer with an adjustable neuronal threshold-setting mechanism, while the second implements an auxiliary net with a traditional architecture that dynamically adjusts the threshold values of the main net, which is constructed in a dual-layer architecture. Experimental results and analysis of the proposed designs were quite satisfactory. The supportive-layer approach achieved a recognition rate of over 90%, while the multiple-network technique showed a more effective and acceptable level of recognition. However, this is achieved at the price of network complexity and computation time. Recognition generalization may also be improved by accommodating capabilities involving all the innate structures in conjunction with intelligence abilities, which would require further advanced learning phases.
Abstract: The goal of a network-based intrusion detection
system is to classify network traffic activities into two major
categories: normal and attack (intrusive) activities. Nowadays, data
mining and machine learning play an important role in many
sciences, including intrusion detection systems (IDS), using both
supervised and unsupervised techniques. However, one of the
essential steps of data mining is feature selection, which helps
improve the efficiency, performance and prediction rate of the
proposed approach. This paper applies the unsupervised K-means
clustering algorithm with information gain (IG) for feature selection
and reduction to build a network intrusion detection system. For our
experimental analysis, we used the new NSL-KDD dataset,
a modified version of the KDDCup 1999 intrusion detection
benchmark dataset. With a split of 60.0% for the training set and the
remainder for the testing set, a two-class classification (Normal,
Attack) was implemented. The Weka framework, a Java-based
open-source software package consisting of a collection of machine
learning algorithms for data mining tasks, was used in the testing
process. The experimental results show that the proposed approach is
very accurate with low false positive rate and high true positive rate
and it takes less learning time in comparison with using the full
features of the dataset with the same algorithm.
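
As an illustration of the information gain criterion named in the abstract above, the following minimal sketch ranks discrete features by how much they reduce label entropy. It is not the authors' Weka pipeline: the function names (`entropy`, `information_gain`, `rank_features`) and the four toy records are assumptions made up for this example, and no NSL-KDD data is used.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Label entropy minus the weighted label entropy after splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def rank_features(rows, labels):
    """Return (feature indices sorted by decreasing gain, list of gains per feature)."""
    n_features = len(rows[0])
    gains = [information_gain([r[j] for r in rows], labels) for j in range(n_features)]
    order = sorted(range(n_features), key=lambda j: gains[j], reverse=True)
    return order, gains


# Hypothetical toy records (protocol, flag); these are NOT taken from NSL-KDD.
rows = [("tcp", "SF"), ("tcp", "S0"), ("udp", "SF"), ("udp", "S0")]
labels = ["normal", "attack", "normal", "attack"]
order, gains = rank_features(rows, labels)
```

In this toy data the second feature separates the two classes perfectly and the first carries no information, so a reduced feature set would keep only the second one before clustering.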
Abstract: Artificial neural networks (ANNs) are able to model input-output relationships by processing raw data. This characteristic makes them invaluable in industry domains where such knowledge is scarce at best. In recent decades, in order to overcome the black-box character of ANNs, researchers have attempted to extract the knowledge embedded within ANNs in the form of rules that can be used in inference systems. This paper presents a new technique that is able to extract a small set of rules from a two-layer ANN. The extracted rules yield high classification accuracy when implemented within a fuzzy inference system. The technique targets industry domains with less complex problems for which no expert knowledge exists and for which a simpler solution is preferred to a complex one. The proposed technique is simpler, more efficient, and more widely applicable than most previously proposed techniques.
Abstract: The most important subtype of non-Hodgkin's
lymphoma is Diffuse Large B-Cell Lymphoma. Approximately
40% of the patients suffering from it respond well to therapy,
whereas the remainder need more aggressive treatment in order to
improve their chances of survival. Data mining techniques have helped
to identify the class of the lymphoma in an efficient manner. Despite
that, thousands of genes should be processed to obtain the results.
This paper presents a comparison of the use of various attribute
selection methods aiming to reduce the number of genes to be
searched, looking for a more effective procedure as a whole.
Abstract: In this paper, several improvements are proposed to
previous work on automated classification of alcoholics and
non-alcoholics. In the previous paper, a multilayer perceptron neural
network classifying the energy of gamma band Visual Evoked Potential
(VEP) signals gave the best classification performance using 800
VEP signals from 10 alcoholics and 10 non-alcoholics. Here, the
dataset is extended to include 3560 VEP signals from 102 subjects:
62 alcoholics and 40 non-alcoholics. Three modifications are
introduced to improve the classification performance: i) increasing
the gamma band spectral range by increasing the pass-band width of
the filter used; ii) the use of the Multiple Signal Classification
algorithm to obtain the power of the dominant frequency in gamma band
VEP signals as features; and iii) the use of the simple but effective
k-nearest neighbour classifier. To validate that these modifications
do give improved performance, a 10-fold cross validation
classification (CVC) scheme is used. Repeat experiments of the
previously used methodology on the extended dataset are performed
here, and an improvement from 94.49% to 98.71% in maximum
averaged CVC accuracy is obtained using the modifications. These
latest results show that VEP-based classification of alcoholics is
worth exploring further for system development.
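
The combination of a k-nearest neighbour classifier with 10-fold cross validation mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the two-dimensional toy features stand in for the gamma band power features, the fold assignment (sample i goes to fold i mod 10) is a simplification, and the names `knn_predict` and `cross_val_accuracy` are hypothetical rather than taken from the paper.

```python
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Majority label among the k training points closest to `query` (squared Euclidean distance)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, query)), y)
        for x, y in zip(train_x, train_y)
    )
    votes = Counter(y for _, y in dists[:k])
    return votes.most_common(1)[0][0]


def cross_val_accuracy(xs, ys, folds=10, k=3):
    """Mean accuracy over `folds` folds; sample i is placed in fold i % folds."""
    accuracies = []
    for f in range(folds):
        test_idx = [i for i in range(len(xs)) if i % folds == f]
        train_idx = [i for i in range(len(xs)) if i % folds != f]
        train_x = [xs[i] for i in train_idx]
        train_y = [ys[i] for i in train_idx]
        correct = sum(knn_predict(train_x, train_y, xs[i], k) == ys[i] for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / folds


# Made-up, well-separated toy features; they are not VEP measurements.
xs = [(0.1 * i, 0.0) for i in range(10)] + [(5.0 + 0.1 * i, 5.0) for i in range(10)]
ys = ["alcoholic"] * 10 + ["non-alcoholic"] * 10
mean_accuracy = cross_val_accuracy(xs, ys, folds=10, k=3)
```

Averaging the per-fold accuracies gives a single figure comparable in kind to the averaged CVC accuracy reported in the abstract, although the numbers there come from real VEP data.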
Abstract: Safe drinking water is one of the biggest issues facing
the planet this century. The primary aim of this paper is to present our
research focused on theoretical and experimental analysis of potable
water and in-building water distribution systems from the point of
view of microbiological risk on the basis of confrontation between
the theoretical analysis and synthesis of gathered information in
conditions of the Slovak Republic. The presence of Legionella
bacteria in water systems, especially in hot water distribution
systems, represents a crucial problem for the health protection of
inhabitants and cannot be overlooked. The discovery of Legionella
pneumophila, its classification and its influence on
installations inside buildings are relatively new. Many
guidelines and regulations have been developed in individual countries
for the design, operation and maintenance of tap water systems to avoid
the growth of Legionella pneumophila bacteria, but in Slovakia we
do not have any. The goal of this paper is to show the necessity of
prevention and regulations for installations inside buildings verified
by simulation methods.
Abstract: Identifying cancer genes that might predict
the clinical behavior of different types of cancer is
challenging due to the huge number of genes and the small number of
patient samples. A new method is proposed based on
supervised classification learning, such as support vector machines
(SVMs). A new solution is described by introducing the
Maximized Margin (MM) into the subset criterion, which makes it
possible to approach the lowest generalization error rate. In class
prediction problems, gene selection is essential to improve accuracy
and to identify genes for cancer disease. The performance of the new
method was evaluated in experiments on real-world data. It gives
better classification accuracy.
Abstract: This paper examines the available experimental data for a copper bromide vapor laser (CuBr laser) emitting at two wavelengths, 510.6 and 578.2 nm. Laser output power is estimated based on 10 independent input physical parameters. A classification and regression tree (CART) model is obtained which describes 97% of the data. The resulting binary CART tree specifies which input parameters considerably influence each of the classification groups. This allows for a technical assessment that indicates which of these are the most significant for the manufacture and operation of the type of laser under consideration. The predicted values of the laser output power are also obtained depending on the classification. This aids the design and development processes considerably.
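
To make the CART idea concrete, the sketch below finds the single binary split on one numeric input that minimises the summed squared error of the output, which is the basic step a regression tree repeats recursively. It is a simplified illustration, not the model from the paper: `best_split` is a hypothetical helper, and the input and output values are invented toy numbers rather than CuBr laser measurements.

```python
def best_split(xs, ys):
    """Return (threshold, cost) for the binary split on one numeric input that
    minimises the summed squared error of the two resulting groups."""

    def sse(values):
        # Sum of squared deviations from the group mean.
        if not values:
            return 0.0
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    pairs = sorted(zip(xs, ys))
    best = (None, sse(ys))  # no split: cost of predicting the overall mean
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # cannot split between identical input values
        threshold = (pairs[i][0] + pairs[i - 1][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        cost = sse(left) + sse(right)
        if cost < best[1]:
            best = (threshold, cost)
    return best


# Invented toy values for one input parameter and the output power.
inputs = [1, 2, 3, 10, 11, 12]
power = [5, 5, 5, 20, 20, 20]
threshold, cost = best_split(inputs, power)
```

Each leaf of a full tree would then predict the mean output of the samples that reach it, which is how group-wise predicted output values of the kind described in the abstract arise.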
Abstract: A novel application of the neural network approach to
fault classification and fault location in medium-voltage cables is
demonstrated in this paper. Different faults on a protected cable
should be classified and located correctly. This paper presents the use
of neural networks as a pattern classifier algorithm to perform these
tasks. The proposed scheme is insensitive to variation of different
parameters such as fault type, fault resistance, and fault inception
angle. Studies show that the proposed technique is able to offer high
accuracy in both the fault classification and fault location tasks.
Abstract: This paper presents an optimal, unsupervised satellite image segmentation approach based on the Pearson system and k-means clustering algorithm initialisation. The method is original in that it uses the k-means clustering algorithm for an optimal initialisation of the number of image classes on the one hand, and exploits the Pearson system for an optimal assignment of statistical distributions to each considered class on the other. Satellite image exploitation requires the use of different approaches, especially those founded on the principle of unsupervised statistical segmentation. Such approaches require the definition of several parameters, such as the number of image classes, the estimation of class variables, and generalised mixture distributions. The use of statistical image attributes gave convincing and promising results, provided that an optimal initialisation step with an appropriate assignment of statistical distributions is available. The Pearson system, associated with a k-means clustering algorithm and the Stochastic Expectation-Maximization (SEM) algorithm, can be adapted to this problem. For each image class, the Pearson system assigns one distribution type according to different parameters, especially the skewness 'β1' and the kurtosis 'β2'. The adapted algorithms (the k-means clustering algorithm, the SEM algorithm and the Pearson system algorithm) are then applied to the satellite image segmentation problem. The efficiency of the combined algorithms was validated first with the Mean Quadratic Error (MQE) evaluation, and second by visual inspection along with several comparisons of these unsupervised image segmentations.
Abstract: In this paper we introduce some subspaces of the fuzzy entire sequence space. Some general properties of these sequence spaces are discussed. Some inclusion relations involving these spaces are also obtained. Mathematics Subject Classification: 40A05, 40D25.
Abstract: In this paper, we present a new multivariate fuzzy time series forecasting method. The method assumes m factors, with one main factor of interest. The history of the past three years is used to make new forecasts. The new method is applied to forecasting the total number of car accidents in Belgium using four secondary factors. We also compare our proposed method with existing fuzzy time series forecasting methods. Experimentally, it is shown that our proposed method performs better than existing fuzzy time series forecasting methods. Practically, actuaries are interested in analysing the patterns of casualties in road accidents. Thus, using fuzzy time series, actuaries can define fuzzy premium and fuzzy underwriting of car insurance and life insurance for car insurance. The National Institute of Statistics, Belgium, provides a region-of-risk classification for each road. Using this risk classification, we can predict the premium rate and underwriting of insurance policy holders.
Abstract: Urban problems are problems of organized complexity. Thus, many models and scientific methods for resolving urban problems have failed. This study proposes a fuzzy-system-driven approach for classifying and solving urban problems. The study mainly investigated the selection of the inputs and outputs of urban systems for the classification of urban problems. In this research, five categories of urban problems were recognized with respect to the fuzzy system approach: control, polytely, optimizing, open and decision-making problems. Grounded Theory techniques were then applied to analyze the data and develop a new solving method for each category. The findings indicate that fuzzy system methods are powerful processes and analytic tools for helping planners to resolve complex urban problems. These tools can succeed where others have failed because they address uncertainty and risk, complexity, and systems interacting with other systems.
Abstract: Fishing has always been an essential component of
the Polynesians' life. Fishhooks, mostly made of pearl shell, found
during archaeological excavations are the most numerous artifacts
related to this activity. Thanks to them, we try to reconstruct the
ancient techniques of resource exploitation, inside the lagoons and
offshore. They can also be used as chronological and cultural
indicators. The shapes and dimensions of these artifacts allow
comparisons and classifications used in both the functional approach
and the chrono-cultural perspective. Hence it is very important for
ethno-archaeologists to have reliable methods and standardized
measurements of these artifacts. Such a reliable, objective and
standardized method has been previously proposed. However, this method
cannot realistically be applied manually because of the considerable
time required to measure each fishhook and the quantity of fishhooks
to measure (many hundreds). We propose in this paper a detailed
acquisition protocol for fishhooks and an automation of every step of
this method. We also provide some experimental results obtained on
fishhooks from three archaeological excavation sites.
Abstract: This paper examines the phenomena of intensive suburbanization and urbanization in Olomouc city and the Olomouc region in general for the period 1986–2009. A remote sensing approach that involves tracking changes in Land Cover units is proposed to quantify the state and trends of urbanization in their temporal and spatial aspects. It consisted of two approaches, Experiment 1 and Experiment 2, which used two different image classification solutions to provide Land Cover maps for each 1986–2009 time split available in the Landsat image set. Experiment 1 dealt with unsupervised classification, while Experiment 2 involved semi-supervised classification, using a combination of object-based and pixel-based classifiers. The resulting Land Cover maps were subsequently quantified for the proportion of the urban area unit and its trend through time, and also for the stability of the urban area unit, yielding the relation between the spatial and temporal development of the urban area unit. Some outcomes seem promising, but there is indisputably room for improvement in the source data and in the processing and filtering.
Abstract: This paper proposes classification models to
be used as a proxy for hard disk drive (HDD) functional test equipment,
which requires more than two weeks to perform the
HDD status classification as either "Pass" or "Fail". These models
were constructed using a committee network consisting of a
number of single neural networks. This paper also includes a
method to solve the problem of data sparseness in the failed parts,
called the "enforce learning method". Our results reveal that the
classification models constructed with the proposed method
perform well under sparse data conditions, and thus the models,
which take a few seconds for HDD classification, could be used to
substitute for the HDD functional tests.
Abstract: Learning using labeled and unlabeled data has
received a considerable amount of attention in the machine learning
community due to its potential for reducing the need for expensive
labeled data. In this work we present a new method for combining
labeled and unlabeled data based on classifier ensembles. The model
we propose assumes that each classifier in the ensemble observes the
input using a different set of features. Classifiers are initially
trained using some labeled samples. The trained classifiers learn
further by labeling the unknown patterns using a teaching signal that
is generated from the decision of the classifier ensemble, i.e. the
classifiers self-supervise each other. Experiments on a set of object
images are presented. Our experiments investigate different classifier
models, different fusing techniques, different training sizes and
different input features. Experimental results reveal that the proposed
self-supervised ensemble learning approach reduces classification
error compared with the single classifier and the traditional ensemble
classifier approaches.
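
The self-supervision step described above, in which the ensemble decision becomes a teaching signal for unlabeled patterns, might look roughly like the sketch below. Majority voting with an agreement threshold is an assumption chosen for illustration (the abstract mentions several fusing techniques without specifying them), and the helper names, the threshold classifiers and the toy samples are all hypothetical.

```python
from collections import Counter


def teaching_signal(predictions, min_agreement=0.75):
    """Return the majority label if a large enough share of classifiers agree, else None."""
    label, count = Counter(predictions).most_common(1)[0]
    return label if count / len(predictions) >= min_agreement else None


def pseudo_label(unlabeled, classifiers, min_agreement=0.75):
    """Label each unlabeled sample with the ensemble decision; skip low-agreement samples."""
    labeled = []
    for sample in unlabeled:
        votes = [clf(sample) for clf in classifiers]
        label = teaching_signal(votes, min_agreement)
        if label is not None:
            labeled.append((sample, label))
    return labeled


# Hypothetical classifiers, each observing a different feature of the input.
classifiers = [
    lambda s: "cup" if s[0] > 0.5 else "box",
    lambda s: "cup" if s[1] > 0.5 else "box",
    lambda s: "cup" if s[2] > 0.5 else "box",
]
unlabeled = [(0.9, 0.8, 0.1), (0.2, 0.1, 0.3), (0.9, 0.7, 0.8)]
new_training_pairs = pseudo_label(unlabeled, classifiers)
```

The pairs returned here would be added to each classifier's training set for the next round; samples on which the ensemble disagrees are left unlabeled so that uncertain decisions are not reinforced.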
Abstract: The HIV-1 genome is highly heterogeneous. Due to this
variation, the features of the HIV-1 genome vary over a wide range.
For this reason, the infectivity of the virus changes depending on
different chemokine receptors. From this point of view, R5 HIV
viruses use the CCR5 coreceptor, X4 viruses use CXCR4, and
R5X4 viruses can utilize both coreceptors. Recently, in
bioinformatics, studies have sought to classify R5X4 viruses using
experiments on the HIV-1 genome.
In this study, R5X4-type HIV viruses were classified using an
Auto Regressive (AR) model together with Artificial Neural Networks
(ANNs). The statistical data of R5X4, R5 and X4 viruses were
analyzed using signal processing methods and ANNs. Accessible
residues of these virus sequences were obtained and modeled by an AR
model, since the dimensions of the residues are large and differ from
each other. Finally, the pre-processed data were used to develop various
ANN structures for identifying R5X4 viruses. Furthermore, ROC
analysis was applied to the ANNs to show their real performance. The
results indicate that R5X4 viruses were successfully classified, with
high sensitivity and specificity values in the training and testing ROC
analyses for RBF, which gives the best performance among the ANN structures.
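
The sensitivity and specificity values that the ROC analysis above relies on can be computed from predicted and true labels as in this minimal sketch. The function name and the five toy predictions are assumptions made for illustration and do not reproduce any result from the study; R5X4 is treated as the positive class.

```python
def sensitivity_specificity(y_true, y_pred, positive="R5X4"):
    """Return (sensitivity, specificity), treating `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # missed positives
    tn = sum(t != positive and p != positive for t, p in pairs)  # true negatives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false alarms
    return tp / (tp + fn), tn / (tn + fp)


# Made-up labels for illustration only.
y_true = ["R5X4", "R5X4", "R5", "X4", "R5"]
y_pred = ["R5X4", "R5", "R5", "R5X4", "R5"]
sens, spec = sensitivity_specificity(y_true, y_pred)
```

Sweeping the decision threshold of a network and plotting sensitivity against one minus specificity at each setting yields the ROC curve.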
Abstract: In this study, an inland metropolitan area in Korea, Gwangju, was selected to assess the amplification potential of earthquake motion and to provide information for regional seismic countermeasures. A geographic information system-based expert system was implemented to reliably predict the spatial geotechnical layers across the entire region of interest by building a geo-knowledge database. In particular, the database consists of existing boring data gathered from prior geotechnical projects and surface geo-knowledge data acquired from site visits. For practical application of the geo-knowledge database to estimating the earthquake hazard potential related to site amplification effects in the study area, seismic zoning maps of geotechnical parameters, such as the bedrock depth and the site period, were created within a GIS framework. In addition, seismic zonation of site classification was performed to determine the site amplification coefficients for seismic design at any site in the study area. Keywords: Earthquake hazard, geo-knowledge, geographic information system, seismic zonation, site period.
Abstract: Sonogram images of normal and lymphocytic thyroid tissues overlap considerably, which makes them difficult to interpret and distinguish. Classification of thyroid gland sonogram images is tackled in a semiautomatic way. When a manual diagnosis is made from images, some relevant information may not be recognized by the human visual system. Quantitative image analysis could assist the manual diagnostic process so far carried out by physicians. Two classes are considered: normal tissue and chronic lymphocytic thyroiditis (Hashimoto's thyroiditis). The data structure is analyzed using K-nearest-neighbors classification. This paper shows that, unlike the wavelet sub-band energies, histograms and Haralick features are not appropriate for distinguishing between normal tissue and Hashimoto's thyroiditis.