Dynamic Threshold Adjustment Approach For Neural Networks

The use of neural networks for recognition application is generally constrained by their inherent parameters inflexibility after the training phase. This means no adaptation is accommodated for input variations that have any influence on the network parameters. Attempts were made in this work to design a neural network that includes an additional mechanism that adjusts the threshold values according to the input pattern variations. The new approach is based on splitting the whole network into two subnets; main traditional net and a supportive net. The first deals with the required output of trained patterns with predefined settings, while the second tolerates output generation dynamically with tuning capability for any newly applied input. This tuning comes in the form of an adjustment to the threshold values. Two levels of supportive net were studied; one implements an extended additional layer with adjustable neuronal threshold setting mechanism, while the second implements an auxiliary net with traditional architecture performs dynamic adjustment to the threshold value of the main net that is constructed in dual-layer architecture. Experiment results and analysis of the proposed designs have given quite satisfactory conducts. The supportive layer approach achieved over 90% recognition rate, while the multiple network technique shows more effective and acceptable level of recognition. However, this is achieved at the price of network complexity and computation time. Recognition generalization may be also improved by accommodating capabilities involving all the innate structures in conjugation with Intelligence abilities with the needs of further advanced learning phases.

Feature Based Unsupervised Intrusion Detection

The goal of a network-based intrusion detection system is to classify activities of network traffics into two major categories: normal and attack (intrusive) activities. Nowadays, data mining and machine learning plays an important role in many sciences; including intrusion detection system (IDS) using both supervised and unsupervised techniques. However, one of the essential steps of data mining is feature selection that helps in improving the efficiency, performance and prediction rate of proposed approach. This paper applies unsupervised K-means clustering algorithm with information gain (IG) for feature selection and reduction to build a network intrusion detection system. For our experimental analysis, we have used the new NSL-KDD dataset, which is a modified dataset for KDDCup 1999 intrusion detection benchmark dataset. With a split of 60.0% for the training set and the remainder for the testing set, a 2 class classifications have been implemented (Normal, Attack). Weka framework which is a java based open source software consists of a collection of machine learning algorithms for data mining tasks has been used in the testing process. The experimental results show that the proposed approach is very accurate with low false positive rate and high true positive rate and it takes less learning time in comparison with using the full features of the dataset with the same algorithm.

An Efficient Technique for Extracting Fuzzy Rulesfrom Neural Networks

Artificial neural networks (ANN) have the ability to model input-output relationships from processing raw data. This characteristic makes them invaluable in industry domains where such knowledge is scarce at best. In the recent decades, in order to overcome the black-box characteristic of ANNs, researchers have attempted to extract the knowledge embedded within ANNs in the form of rules that can be used in inference systems. This paper presents a new technique that is able to extract a small set of rules from a two-layer ANN. The extracted rules yield high classification accuracy when implemented within a fuzzy inference system. The technique targets industry domains that possess less complex problems for which no expert knowledge exists and for which a simpler solution is preferred to a complex one. The proposed technique is more efficient, simple, and applicable than most of the previously proposed techniques.

Attribute Selection Methods Comparison for Classification of Diffuse Large B-Cell Lymphoma

The most important subtype of non-Hodgkin-s lymphoma is the Diffuse Large B-Cell Lymphoma. Approximately 40% of the patients suffering from it respond well to therapy, whereas the remainder needs a more aggressive treatment, in order to better their chances of survival. Data Mining techniques have helped to identify the class of the lymphoma in an efficient manner. Despite that, thousands of genes should be processed to obtain the results. This paper presents a comparison of the use of various attribute selection methods aiming to reduce the number of genes to be searched, looking for a more effective procedure as a whole.

Improved Automated Classification of Alcoholics and Non-alcoholics

In this paper, several improvements are proposed to previous work of automated classification of alcoholics and nonalcoholics. In the previous paper, multiplayer-perceptron neural network classifying energy of gamma band Visual Evoked Potential (VEP) signals gave the best classification performance using 800 VEP signals from 10 alcoholics and 10 non-alcoholics. Here, the dataset is extended to include 3560 VEP signals from 102 subjects: 62 alcoholics and 40 non-alcoholics. Three modifications are introduced to improve the classification performance: i) increasing the gamma band spectral range by increasing the pass-band width of the used filter ii) the use of Multiple Signal Classification algorithm to obtain the power of the dominant frequency in gamma band VEP signals as features and iii) the use of the simple but effective knearest neighbour classifier. To validate that these two modifications do give improved performance, a 10-fold cross validation classification (CVC) scheme is used. Repeat experiments of the previously used methodology for the extended dataset are performed here and improvement from 94.49% to 98.71% in maximum averaged CVC accuracy is obtained using the modifications. This latest results show that VEP based classification of alcoholics is worth exploring further for system development.

Negative Impact of Bacteria Legionella Pneumophila in Hot Water Distribution Systems on Human Health

Safe drinking water is one of the biggest issues facing the planet this century. The primary aim of this paper is to present our research focused on theoretical and experimental analysis of potable water and in-building water distribution systems from the point of view of microbiological risk on the basis of confrontation between the theoretical analysis and synthesis of gathered information in conditions of the Slovak Republic. The presence of the bacteria Legionella in water systems, especially in hot water distribution system, represents in terms of health protection of inhabitants the crucial problem which cannot be overlooked. Legionella pneumophila discovery, its classification and its influence on installations inside buildings are relatively new. There are a lot of guidelines and regulations developed in many individual countries for the design, operation and maintenance for tap water systems to avoid the growth of bacteria Legionella pneumophila, but in Slovakia we don-t have any. The goal of this paper is to show the necessity of prevention and regulations for installations inside buildings verified by simulation methods.

Feature Subset Selection approach based on Maximizing Margin of Support Vector Classifier

Identification of cancer genes that might anticipate the clinical behaviors from different types of cancer disease is challenging due to the huge number of genes and small number of patients samples. The new method is being proposed based on supervised learning of classification like support vector machines (SVMs).A new solution is described by the introduction of the Maximized Margin (MM) in the subset criterion, which permits to get near the least generalization error rate. In class prediction problem, gene selection is essential to improve the accuracy and to identify genes for cancer disease. The performance of the new method was evaluated with real-world data experiment. It can give the better accuracy for classification.

CART Method for Modeling the Output Power of Copper Bromide Laser

This paper examines the available experiment data for a copper bromide vapor laser (CuBr laser), emitting at two wavelengths - 510.6 and 578.2nm. Laser output power is estimated based on 10 independent input physical parameters. A classification and regression tree (CART) model is obtained which describes 97% of data. The resulting binary CART tree specifies which input parameters influence considerably each of the classification groups. This allows for a technical assessment that indicates which of these are the most significant for the manufacture and operation of the type of laser under consideration. The predicted values of the laser output power are also obtained depending on classification. This aids the design and development processes considerably.

A Novel Approach to Fault Classification and Fault Location for Medium Voltage Cables Based on Artificial Neural Network

A novel application of neural network approach to fault classification and fault location of Medium voltage cables is demonstrated in this paper. Different faults on a protected cable should be classified and located correctly. This paper presents the use of neural networks as a pattern classifier algorithm to perform these tasks. The proposed scheme is insensitive to variation of different parameters such as fault type, fault resistance, and fault inception angle. Studies show that the proposed technique is able to offer high accuracy in both of the fault classification and fault location tasks.

An Optimal Unsupervised Satellite image Segmentation Approach Based on Pearson System and k-Means Clustering Algorithm Initialization

This paper presents an optimal and unsupervised satellite image segmentation approach based on Pearson system and k-Means Clustering Algorithm Initialization. Such method could be considered as original by the fact that it utilised K-Means clustering algorithm for an optimal initialisation of image class number on one hand and it exploited Pearson system for an optimal statistical distributions- affectation of each considered class on the other hand. Satellite image exploitation requires the use of different approaches, especially those founded on the unsupervised statistical segmentation principle. Such approaches necessitate definition of several parameters like image class number, class variables- estimation and generalised mixture distributions. Use of statistical images- attributes assured convincing and promoting results under the condition of having an optimal initialisation step with appropriated statistical distributions- affectation. Pearson system associated with a k-means clustering algorithm and Stochastic Expectation-Maximization 'SEM' algorithm could be adapted to such problem. For each image-s class, Pearson system attributes one distribution type according to different parameters and especially the Skewness 'β1' and the kurtosis 'β2'. The different adapted algorithms, K-Means clustering algorithm, SEM algorithm and Pearson system algorithm, are then applied to satellite image segmentation problem. Efficiency of those combined algorithms was firstly validated with the Mean Quadratic Error 'MQE' evaluation, and secondly with visual inspection along several comparisons of these unsupervised images- segmentation.

Multivariate High Order Fuzzy Time Series Forecasting for Car Road Accidents

In this paper, we have presented a new multivariate fuzzy time series forecasting method. This method assumes mfactors with one main factor of interest. History of past three years is used for making new forecasts. This new method is applied in forecasting total number of car accidents in Belgium using four secondary factors. We also make comparison of our proposed method with existing methods of fuzzy time series forecasting. Experimentally, it is shown that our proposed method perform better than existing fuzzy time series forecasting methods. Practically, actuaries are interested in analysis of the patterns of causalities in road accidents. Thus using fuzzy time series, actuaries can define fuzzy premium and fuzzy underwriting of car insurance and life insurance for car insurance. National Institute of Statistics, Belgium provides region of risk classification for each road. Thus using this risk classification, we can predict premium rate and underwriting of insurance policy holders.

Classification and Resolving Urban Problems by Means of Fuzzy Approach

Urban problems are problems of organized complexity. Thus, many models and scientific methods to resolve urban problems are failed. This study is concerned with proposing of a fuzzy system driven approach for classification and solving urban problems. The proposed study investigated mainly the selection of the inputs and outputs of urban systems for classification of urban problems. In this research, five categories of urban problems, respect to fuzzy system approach had been recognized: control, polytely, optimizing, open and decision making problems. Grounded Theory techniques were then applied to analyze the data and develop new solving method for each category. The findings indicate that the fuzzy system methods are powerful processes and analytic tools for helping planners to resolve urban complex problems. These tools can be successful where as others have failed because both incorporate or address uncertainty and risk; complexity and systems interacting with other systems.

Automation of Fishhooks Objective Measures

Fishing has always been an essential component of the Polynesians- life. Fishhooks, mostly in pearl shell, found during archaeological excavations are the artifacts related to this activity the most numerous. Thanks to them, we try to reconstruct the ancient techniques of resources exploitation, inside the lagoons and offshore. They can also be used as chronological and cultural indicators. The shapes and dimensions of these artifacts allow comparisons and classifications used in both functional approach and chrono-cultural perspective. Hence it is very important for the ethno-archaeologists to dispose of reliable methods and standardized measurement of these artifacts. Such a reliable objective and standardized method have been previously proposed. But this method cannot be envisaged manually because of the very important time required to measure each fishhook manually and the quantity of fishhooks to measure (many hundreds). We propose in this paper a detailed acquisition protocol of fishhooks and an automation of every step of this method. We also provide some experimental results obtained on the fishhooks coming from three archaeological excavations sites.

Urban Land Cover Change of Olomouc City Using LANDSAT Images

This paper regards the phenomena of intensive suburbanization and urbanization in Olomouc city and in Olomouc region in general for the period of 1986–2009. A Remote Sensing approach that involves tracking of changes in Land Cover units is proposed to quantify the urbanization state and trends in temporal and spatial aspects. It actually consisted of two approaches, Experiment 1 and Experiment 2 which implied two different image classification solutions in order to provide Land Cover maps for each 1986–2009 time split available in the Landsat image set. Experiment 1 dealt with the unsupervised classification, while Experiment 2 involved semi- supervised classification, using a combination of object-based and pixel-based classifiers. The resulting Land Cover maps were subsequently quantified for the proportion of urban area unit and its trend through time, and also for the urban area unit stability, yielding the relation of spatial and temporal development of the urban area unit. Some outcomes seem promising but there is indisputably room for improvements of source data and also processing and filtering.

The Classification Model for Hard Disk Drive Functional Tests under Sparse Data Conditions

This paper proposed classification models that would be used as a proxy for hard disk drive (HDD) functional test equitant which required approximately more than two weeks to perform the HDD status classification in either “Pass" or “Fail". These models were constructed by using committee network which consisted of a number of single neural networks. This paper also included the method to solve the problem of sparseness data in failed part, which was called “enforce learning method". Our results reveal that the constructed classification models with the proposed method could perform well in the sparse data conditions and thus the models, which used a few seconds for HDD classification, could be used to substitute the HDD functional tests.

An Experimental Study of a Self-Supervised Classifier Ensemble

Learning using labeled and unlabelled data has received considerable amount of attention in the machine learning community due its potential in reducing the need for expensive labeled data. In this work we present a new method for combining labeled and unlabeled data based on classifier ensembles. The model we propose assumes each classifier in the ensemble observes the input using different set of features. Classifiers are initially trained using some labeled samples. The trained classifiers learn further through labeling the unknown patterns using a teaching signals that is generated using the decision of the classifier ensemble, i.e. the classifiers self-supervise each other. Experiments on a set of object images are presented. Our experiments investigate different classifier models, different fusing techniques, different training sizes and different input features. Experimental results reveal that the proposed self-supervised ensemble learning approach reduces classification error over the single classifier and the traditional ensemble classifier approachs.

Analysis and Classification of Hiv-1 Sub- Type Viruses by AR Model through Artificial Neural Networks

HIV-1 genome is highly heterogeneous. Due to this variation, features of HIV-I genome is in a wide range. For this reason, the ability to infection of the virus changes depending on different chemokine receptors. From this point of view, R5 HIV viruses use CCR5 coreceptor while X4 viruses use CXCR5 and R5X4 viruses can utilize both coreceptors. Recently, in Bioinformatics, R5X4 viruses have been studied to classify by using the experiments on HIV-1 genome. In this study, R5X4 type of HIV viruses were classified using Auto Regressive (AR) model through Artificial Neural Networks (ANNs). The statistical data of R5X4, R5 and X4 viruses was analyzed by using signal processing methods and ANNs. Accessible residues of these virus sequences were obtained and modeled by AR model since the dimension of residues is large and different from each other. Finally the pre-processed data was used to evolve various ANN structures for determining R5X4 viruses. Furthermore ROC analysis was applied to ANNs to show their real performances. The results indicate that R5X4 viruses successfully classified with high sensitivity and specificity values training and testing ROC analysis for RBF, which gives the best performance among ANN structures.

Implementation of Geo-knowledge Based Geographic Information System for Estimating Earthquake Hazard Potential at a Metropolitan Area, Gwangju, in Korea

In this study, an inland metropolitan area, Gwangju, in Korea was selected to assess the amplification potential of earthquake motion and provide the information for regional seismic countermeasure. A geographic information system-based expert system was implemented for reliably predicting the spatial geotechnical layers in the entire region of interesting by building a geo-knowledge database. Particularly, the database consists of the existing boring data gathered from the prior geotechnical projects and the surface geo-knowledge data acquired from the site visit. For practical application of the geo-knowledge database to estimate the earthquake hazard potential related to site amplification effects at the study area, seismic zoning maps on geotechnical parameters, such as the bedrock depth and the site period, were created within GIS framework. In addition, seismic zonation of site classification was also performed to determine the site amplification coefficients for seismic design at any site in the study area. KeywordsEarthquake hazard, geo-knowledge, geographic information system, seismic zonation, site period.

Analysis of Sonogram Images of Thyroid Gland Based on Wavelet Transform

Sonogram images of normal and lymphocyte thyroid tissues have considerable overlap which makes it difficult to interpret and distinguish. Classification from sonogram images of thyroid gland is tackled in semiautomatic way. While making manual diagnosis from images, some relevant information need not to be recognized by human visual system. Quantitative image analysis could be helpful to manual diagnostic process so far done by physician. Two classes are considered: normal tissue and chronic lymphocyte thyroid (Hashimoto's Thyroid). Data structure is analyzed using K-nearest-neighbors classification. This paper is mentioned that unlike the wavelet sub bands' energy, histograms and Haralick features are not appropriate to distinguish between normal tissue and Hashimoto's thyroid.