Integration of Support Vector Machine and Bayesian Neural Network for Data Mining and Classification

Several combinations of preprocessing algorithms, feature selection techniques, and classifiers can be applied to data classification tasks. This study introduces a new, accurate classifier consisting of four components: signal-to-noise ratio as a feature selection technique, a support vector machine, a Bayesian neural network, and AdaBoost as an ensemble algorithm. To verify the effectiveness of the proposed classifier, seven well-known classifiers are applied to four datasets. The experiments show that the proposed classifier improves the classification rates on all datasets.
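As a rough sketch of the signal-to-noise feature selection step (the score definition and top-k cutoff below are standard choices we assume, not details taken from the abstract), features can be ranked by the separation of class means relative to class spreads:

```python
import numpy as np

def s2n_scores(X, y):
    """Signal-to-noise score per feature for a binary-labeled dataset.

    S2N(f) = |mean_pos(f) - mean_neg(f)| / (std_pos(f) + std_neg(f))
    """
    pos, neg = X[y == 1], X[y == 0]
    num = np.abs(pos.mean(axis=0) - neg.mean(axis=0))
    den = pos.std(axis=0) + neg.std(axis=0) + 1e-12  # avoid divide-by-zero
    return num / den

def select_top_k(X, y, k):
    """Indices of the k highest-scoring features."""
    return np.argsort(s2n_scores(X, y))[::-1][:k]

# Example: keep the 10 most discriminative of 50 random features.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 50))
y = rng.integers(0, 2, size=100)
print(select_top_k(X, y, 10))
```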

Selecting Negative Examples for Protein-Protein Interaction

Proteomics is one of the largest areas of research for bioinformatics and medical science. An ambitious goal of proteomics is to elucidate the structure, interactions and functions of all proteins within cells and organisms. Predicting Protein-Protein Interaction (PPI) is one of the crucial problems in current research. Genomic data offer a great opportunity, and at the same time many challenges, for the identification of these interactions. Many methods have already been proposed in this regard. For in-silico identification, most methods require both positive and negative examples of protein interaction, and the quality of these examples is crucial for the final prediction accuracy. Positive examples are relatively easy to obtain from well-known databases, but the generation of negative examples is not a trivial task. Current PPI identification methods generate negative examples based on assumptions that are likely to affect their prediction accuracy. Hence, if more reliable negative examples are used, PPI prediction methods may achieve even higher accuracy. Focusing on this issue, a graph-based negative example generation method is proposed, which is simple and more accurate than existing approaches. An interaction graph of the protein sequences is created. The basic assumption is that the longer the shortest path between two protein sequences in the interaction graph, the lower the possibility of their interaction. A well-established PPI detection algorithm is employed with our negative examples, and in most cases it increases the accuracy by more than 10% in comparison with the negative pair selection method used in the original work.
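A minimal sketch of the graph-based idea, assuming the interaction network is stored as an undirected adjacency map and that pairs whose shortest-path (hop) distance meets a chosen cutoff are taken as negatives; the cutoff value and data structures are illustrative, not the paper's exact procedure:

```python
from collections import deque

def bfs_distances(graph, source):
    """Shortest-path (hop) distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def negative_pairs(graph, min_distance=4):
    """Protein pairs whose shortest path is at least min_distance:
    the farther apart two proteins lie, the less likely they interact."""
    nodes = sorted(graph)
    negatives = []
    for i, u in enumerate(nodes):
        dist = bfs_distances(graph, u)
        for v in nodes[i + 1:]:
            if dist.get(v, float("inf")) >= min_distance:
                negatives.append((u, v))
    return negatives

# Toy interaction graph: a chain A-B-C-D-E plus an edge B-E.
g = {"A": ["B"], "B": ["A", "C", "E"], "C": ["B", "D"],
     "D": ["C", "E"], "E": ["D", "B"]}
print(negative_pairs(g, min_distance=3))  # [('A', 'D')]
```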

The Transfer of Energy Technologies in a Developing Country Context: Towards Improved Practice from Past Successes and Failures

Technology transfer of renewable energy technologies (RETs) is very often unsuccessful in the developing world. Aside from challenges with social, economic, financial, institutional and environmental dimensions, technology transfer has generally been misunderstood and largely seen as the mere delivery of high-tech equipment from developed to developing countries, or, within the developing world, from R&D institutions to society. Technology transfer entails much more, including, but not limited to: entire systems and their component parts, know-how, goods and services, equipment, and organisational and managerial procedures. Means to facilitate the successful transfer of energy technologies, including the sharing of lessons, are therefore extremely important for developing countries as they grapple with increasing energy needs to sustain adequate economic growth and development. Improving the success of technology transfer is an ongoing process as more projects are implemented, new problems are encountered and new lessons are learnt. Renewable energy is also critical to improving the quality of life of the majority of people in developing countries. In rural areas, energy comes primarily from traditional biomass, and it is typically consumed inefficiently, working against the notion of sustainable development. This paper explores the implementation of technology transfer in the developing world (sub-Saharan Africa). The focus is necessarily on RETs, since most rural energy initiatives are RETs-based. Additionally, it aims to highlight lessons drawn from the cited renewable energy projects and identifies notable differences where energy technology transfer was judged to be successful. This is done through a literature review based on a selection of documented case studies, which are judged against the definition provided for technology transfer. This paper also puts forth research recommendations that might contribute to improved technology transfer in the developing world. Key findings include: technology transfer cannot be complete without satisfying preconditions such as affordability, maintenance (and associated plans), knowledge and skills transfer, appropriate know-how, ownership and commitment, the ability to adapt the technology, sound business principles such as financial viability and sustainability, project management, and relevance. It is also shown that lessons are learnt in both successful and unsuccessful projects.

A Methodology for Creating a Conceptual Model Under Uncertainty

This article deals with conceptual modeling under uncertainty. First, a classification of information systems, together with their definitions, is described, focusing on those for which the construction of a conceptual model is suitable for the design of the future information system database. Furthermore, the disadvantages of the traditional approach to creating a conceptual model and database design are analyzed. A comprehensive methodology for the creation of a conceptual model, based on an analysis of client requirements and the selection of a suitable domain model, is proposed here. This article also presents an expert system used for the construction of a conceptual model, which is a suitable tool for database designers creating a conceptual model.

A Markov Chain Model for Load-Balancing Based and Service Based RAT Selection Algorithms in Heterogeneous Networks

Next Generation Wireless Network (NGWN) is expected to be a heterogeneous network which integrates different Radio Access Technologies (RATs) through a common platform. A major challenge is how to allocate users to the RAT most suitable for them. An optimized solution can maximize the efficient use of radio resources, achieve better performance for service providers and provide Quality of Service (QoS) at low cost to users. Currently, Radio Resource Management (RRM) is implemented efficiently only for the RAT for which it was developed, and it is not suitable for a heterogeneous network. Common RRM (CRRM) was proposed to manage radio resource utilization in heterogeneous networks. This paper presents a user-level Markov model for a network of three co-located RATs. The load-balancing based and service based CRRM algorithms are studied using the presented Markov model, and their performance is compared in terms of traffic distribution, new call blocking probability, vertical handover (VHO) call dropping probability and throughput.
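The load-balancing policy studied with the Markov model can be caricatured as a simple admission rule: route a new call to the least-loaded RAT that supports the requested service, and block it if none has free capacity. The sketch below is an assumed toy rendering of such a rule, not the Markov model itself:

```python
def select_rat(rats, service):
    """Load-balancing RAT selection: among RATs that support the service
    and have free capacity, pick the least relatively loaded one; return
    None to signal that the new call is blocked."""
    candidates = [r for r in rats
                  if service in r["services"] and r["load"] < r["capacity"]]
    if not candidates:
        return None  # new call blocking
    return min(candidates, key=lambda r: r["load"] / r["capacity"])

rats = [
    {"name": "UMTS", "services": {"voice", "data"}, "load": 8, "capacity": 10},
    {"name": "WLAN", "services": {"data"},          "load": 2, "capacity": 20},
    {"name": "GSM",  "services": {"voice"},         "load": 5, "capacity": 8},
]
chosen = select_rat(rats, "voice")
print(chosen["name"] if chosen else "blocked")  # GSM (lower relative load)
```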

Combined Feature Based Hyperspectral Image Classification Technique Using Support Vector Machines

A spatial classification technique incorporating a state-of-the-art feature extraction algorithm is proposed in this paper for classifying the heterogeneous classes present in hyperspectral images. Classification accuracy can be improved only if both the feature extraction and the classifier selection are appropriate. As the classes in hyperspectral images are assumed to have different textures, textural classification is pursued. Run Length feature extraction is employed along with Principal Components and Independent Components. A hyperspectral image of the Indiana site taken by AVIRIS is used for the experiment. Among the original 220 bands, a subset of 120 bands is selected and divided into three groups of forty bands each. A Gray Level Run Length Matrix (GLRLM) is calculated for the first forty bands, and from the GLRLMs the Run Length features for individual pixels are computed. Principal Components are calculated for the second group of forty bands, and Independent Components for the remaining forty bands. As Principal and Independent Components have the ability to represent the textural content of pixels, they are treated as features. Together, the Run Length features, Principal Components, and Independent Components form the combined features used for classification. An SVM with a Binary Hierarchical Tree is used to classify the hyperspectral image. Results are validated against ground truth and accuracies are calculated.
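A compressed sketch of the combined-feature construction, assuming scikit-learn is available: one band group contributes run-length texture features (stubbed out here, since a full GLRLM implementation is lengthy), a second contributes Principal Components, a third contributes Independent Components, and the three blocks are concatenated per pixel. Component counts and the stub statistics are illustrative:

```python
import numpy as np
from sklearn.decomposition import PCA, FastICA

rng = np.random.default_rng(1)
n_pixels = 1000
group1 = rng.normal(size=(n_pixels, 40))  # bands for run-length features
group2 = rng.normal(size=(n_pixels, 40))  # bands for PCA
group3 = rng.normal(size=(n_pixels, 40))  # bands for ICA

def run_length_features(bands):
    """Placeholder for GLRLM-derived per-pixel texture features; a real
    implementation would quantize gray levels and count run lengths."""
    return np.stack([bands.mean(axis=1), bands.std(axis=1)], axis=1)

pca_feats = PCA(n_components=5).fit_transform(group2)
ica_feats = FastICA(n_components=5, random_state=1).fit_transform(group3)
rl_feats = run_length_features(group1)

# Combined per-pixel feature vector, fed to the SVM classifier.
combined = np.hstack([rl_feats, pca_feats, ica_feats])
print(combined.shape)  # (1000, 12)
```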

Secure Resource Selection in Computational Grid Based on Quantitative Execution Trust

Grid computing provides a virtual framework for the controlled sharing of resources across institutional boundaries. Recently, trust has been recognised as an important factor in the selection of optimal resources in a grid. We introduce a new method that provides a quantitative trust value based on past interactions and present environment characteristics. This quantitative trust value is used to select a suitable resource for a job and eliminates run-time failures arising from incompatible user-resource pairs. The proposed work acts as a tool to calculate the trust values of the various components of the grid, thereby improving the success rate of jobs submitted to resources on the grid. Access to a resource depends not only on the identity and behaviour of the resource but also on its context of transaction, time of transaction, connectivity bandwidth, availability and load. The quality of a recommender is also evaluated, based on the accuracy of the feedback provided about a resource. Jobs are submitted for execution to the selected resource after finding its overall trust value, which is computed with respect to both subjective and objective parameters.
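One plausible, hedged rendering of how subjective and objective evidence might combine into an overall trust value; the weights and individual scores below are assumptions for illustration, not the paper's formulas:

```python
def overall_trust(direct, recommendation, objective, weights=(0.5, 0.2, 0.3)):
    """Weighted combination of trust evidence, each score in [0, 1].

    direct         -- trust from the user's own past interactions
    recommendation -- feedback from other users, scaled by recommender quality
    objective      -- score from bandwidth, availability, and current load
    """
    w_d, w_r, w_o = weights
    assert abs(w_d + w_r + w_o - 1.0) < 1e-9
    return w_d * direct + w_r * recommendation + w_o * objective

def objective_score(bandwidth_util, availability, load):
    """Simple average of normalized environment characteristics."""
    return (bandwidth_util + availability + (1.0 - load)) / 3.0

obj = objective_score(bandwidth_util=0.9, availability=0.95, load=0.4)
print(round(overall_trust(direct=0.8, recommendation=0.7, objective=obj), 3))
```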

Emotion Recognition Using Neural Network: A Comparative Study

Emotion recognition is an important research field with many applications nowadays. This work focuses on recognizing different emotions from the speech signal. The extracted features relate to statistics of pitch, formants, and energy contours, as well as spectral, perceptual and temporal features, jitter, and shimmer. An Artificial Neural Network (ANN) was chosen as the classifier. Our concern is to find a robust and fast ANN classifier suitable for different real-life applications. Several experiments were carried out on different ANNs to investigate the factors that affect the classification success rate. Using a database containing 7 different emotions, it is shown that with proper and careful adjustment of the feature format, training data sorting, number of features selected, and even the ANN type and architecture used, a success rate of 85% or more can be achieved without increasing the system complexity or the computation time.
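As a minimal, assumed illustration of the classification stage (not the paper's exact network), scikit-learn's MLP can map a vector of acoustic features to one of seven emotion labels; note the feature scaling step, echoing the abstract's point that feature format matters:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Stand-in data: 300 utterances, 30 acoustic features, 7 emotion classes.
rng = np.random.default_rng(8)
X = rng.normal(size=(300, 30))
y = rng.integers(0, 7, size=300)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=8)
scaler = StandardScaler().fit(X_tr)  # scale features before training
clf = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500,
                    random_state=8).fit(scaler.transform(X_tr), y_tr)
print(clf.score(scaler.transform(X_te), y_te))  # chance level on random data
```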

On the Efficient Implementation of a Serial and Parallel Decomposition Algorithm for Fast Support Vector Machine Training Including a Multi-Parameter Kernel

This work deals with aspects of support vector machine learning for large-scale data mining tasks. Based on a decomposition algorithm for support vector machine training that can be run in serial as well as shared-memory parallel mode, we introduce a transformation of the training data that allows for the use of an expensive generalized kernel without additional cost. We present experiments for the Gaussian kernel, but other kernel functions can be used as well. In order to further speed up the decomposition algorithm, we analyze the critical problem of working set selection for large training data sets. In addition, we analyze the influence of the working set size on the scalability of the parallel decomposition scheme. Our tests and conclusions led to several modifications of the algorithm and to improved overall support vector machine learning performance. Our method allows the use of extensive parameter search methods to optimize classification accuracy.
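Working set selection in SVM decomposition methods commonly picks the pair of examples that most violates the KKT conditions; the sketch below shows the standard "maximal violating pair" rule (a common baseline, not necessarily the exact rule analyzed in the paper):

```python
import numpy as np

def maximal_violating_pair(alpha, y, grad, C):
    """Pick indices (i, j) of the maximal violating pair for the dual SVM:
    i maximizes -y*grad over the 'up' set, j minimizes it over the 'down'
    set, where the sets encode which alphas may still move up or down."""
    up = ((alpha < C) & (y == 1)) | ((alpha > 0) & (y == -1))
    down = ((alpha < C) & (y == -1)) | ((alpha > 0) & (y == 1))
    score = -y * grad
    i = np.where(up, score, -np.inf).argmax()
    j = np.where(down, score, np.inf).argmin()
    return int(i), int(j)

rng = np.random.default_rng(10)
n = 8
alpha = rng.uniform(0, 1, n)
y = rng.choice([-1.0, 1.0], n)
grad = rng.normal(size=n)
print(maximal_violating_pair(alpha, y, grad, C=1.0))
```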

Unsupervised Feature Selection Using Feature Density Functions

Since dealing with high-dimensional data is computationally complex and sometimes even intractable, several feature reduction methods have recently been developed to reduce the dimensionality of the data and simplify the analysis in applications such as text categorization, signal processing, image retrieval, and gene expression analysis. Among feature reduction techniques, feature selection is one of the most popular methods because it preserves the original features. In this paper, we propose a new unsupervised feature selection method which removes redundant features from the original feature space by using the probability density functions of the features. To show the effectiveness of the proposed method, popular feature selection methods have been implemented and compared. Experimental results on several datasets from the UCI repository illustrate the effectiveness of our proposed method in comparison with the other methods in terms of both classification accuracy and the number of selected features.
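A hedged sketch of the density-function idea: estimate each feature's probability density with a kernel density estimator and greedily drop features whose densities overlap too strongly with an already-kept feature. The overlap measure and threshold are our illustration, not the paper's exact criterion:

```python
import numpy as np
from scipy.stats import gaussian_kde

def density_overlap(a, b, grid_size=200):
    """Overlap of two features' estimated density functions in [0, 1];
    values near 1 suggest the features carry redundant information."""
    grid = np.linspace(min(a.min(), b.min()), max(a.max(), b.max()), grid_size)
    pa, pb = gaussian_kde(a)(grid), gaussian_kde(b)(grid)
    return np.minimum(pa, pb).sum() * (grid[1] - grid[0])

def select_features(X, threshold=0.9):
    """Greedily keep a feature only if its density does not overlap
    too strongly with any already-selected feature."""
    kept = []
    for j in range(X.shape[1]):
        if all(density_overlap(X[:, j], X[:, k]) < threshold for k in kept):
            kept.append(j)
    return kept

rng = np.random.default_rng(2)
base = rng.normal(size=(300, 1))
X = np.hstack([base, base + 0.01 * rng.normal(size=(300, 1)),  # near-duplicate
               rng.uniform(-3, 3, size=(300, 1))])             # distinct
print(select_features(X))  # the near-duplicate column is dropped
```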

Stock Portfolio Selection Using Chemical Reaction Optimization

Stock portfolio selection is a classic problem in finance: it involves deciding how to allocate an institution's or an individual's wealth among a number of stocks, given certain investment objectives (return and risk). In this paper, we adopt the classical Markowitz mean-variance model and consider an additional common, realistic constraint, namely the cardinality constraint. Stock portfolio optimization thus becomes a mixed-integer quadratic programming problem that is difficult to solve with exact optimization algorithms. Chemical Reaction Optimization (CRO), which mimics the molecular interactions in a chemical reaction process, is a population-based metaheuristic method. Two variants of CRO, canonical CRO and Super Molecule-based CRO (S-CRO), are proposed to solve the stock portfolio selection problem. We test both canonical CRO and S-CRO on a benchmark and compare their performance under two criteria: the Markowitz efficient frontier (Pareto frontier) and the Sharpe ratio. Computational experiments suggest that S-CRO is promising for the stock portfolio optimization problem.
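The cardinality-constrained mean-variance objective that CRO searches over can be stated compactly; the sketch below evaluates one candidate portfolio under synthetic data, with a simple additive penalty for violating the cardinality constraint (the penalty form is our illustration):

```python
import numpy as np

def portfolio_objective(w, mu, sigma, risk_aversion, max_assets, penalty=1e3):
    """Markowitz mean-variance objective with a cardinality penalty.

    Minimize: risk_aversion * w'Sigma w - (1 - risk_aversion) * mu'w
    subject to sum(w) = 1, w >= 0, and at most max_assets nonzero weights.
    """
    risk = w @ sigma @ w
    ret = mu @ w
    cost = risk_aversion * risk - (1 - risk_aversion) * ret
    if np.count_nonzero(w > 1e-9) > max_assets:
        cost += penalty  # infeasible under the cardinality constraint
    return cost

rng = np.random.default_rng(3)
n = 6
mu = rng.uniform(0.02, 0.12, n)        # expected returns
A = rng.normal(size=(n, n))
sigma = A @ A.T / n                    # positive semi-definite covariance
w = np.zeros(n); w[[0, 2, 4]] = 1 / 3  # candidate holding exactly 3 assets
print(portfolio_objective(w, mu, sigma, risk_aversion=0.5, max_assets=3))
```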

Decision Support System for Flood Crisis Management using Artificial Neural Network

This paper presents an alternative approach that uses an artificial neural network to simulate flood level dynamics in a river basin. The algorithm was developed in a decision support system environment in order to enable users to process the data. The decision support system is found to be useful due to its interactive nature, flexible approach and evolving graphical features, and it can be adopted for any similar situation to predict the flood level. The main data processing includes gauging station selection, input generation, lead-time selection/generation, and length of prediction. The program enables users to process the flood level data, to train/test the model using various inputs and to visualize the results. The program code consists of a set of files which can be modified to suit other purposes. The results indicate that the decision support system applied to flood level prediction achieves encouraging results for the river basin under examination. The comparison of the model predictions with the observed data was satisfactory: the model is able to forecast the flood level up to 5 hours in advance with reasonable accuracy. Finally, the program may also serve as a tool for real-time flood monitoring and process control.
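The input-generation and lead-time-selection steps amount to turning the gauging-station record into supervised pairs: lagged readings as inputs and the level several hours ahead as the target. A sketch under assumed lag and lead values:

```python
import numpy as np

def make_supervised(series, n_lags=6, lead=5):
    """Turn a flood-level series into (X, y) pairs: X holds the last
    n_lags hourly readings, y the level `lead` hours ahead."""
    X, y = [], []
    for t in range(n_lags, len(series) - lead):
        X.append(series[t - n_lags:t])
        y.append(series[t + lead])
    return np.array(X), np.array(y)

# Synthetic hourly water-level record.
rng = np.random.default_rng(9)
level = np.cumsum(rng.normal(0, 0.1, 500)) + 10.0
X, y = make_supervised(level, n_lags=6, lead=5)
print(X.shape, y.shape)  # inputs for an ANN that forecasts 5 h ahead
```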

Upgrading Performance of DSR Routing Protocol in Mobile Ad Hoc Networks

Routing in mobile ad hoc networks is a challenging task because nodes are free to move randomly. In DSR, as in all on-demand routing algorithms, the route discovery mechanism is associated with significant delay: when the current route breaks, the destination must seek a new route before it can send a route reply packet. In this paper we change the route selection mechanism to work proactively. We also define a link stability parameter, in which a stability value is assigned to each link. Given this feature, the destination node can estimate the stability of routes and select the best and most stable route. We can therefore reduce the delay and jitter of sending data packets.
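The stability-based route selection can be illustrated with a bottleneck criterion: score each route by its weakest link's stability value and pick the route maximizing that score. This rendering, including the stability values, is our assumption of how the parameter could be used:

```python
def route_stability(route, link_stability):
    """Stability of a route = stability of its weakest link."""
    links = zip(route, route[1:])
    return min(link_stability[frozenset(l)] for l in links)

def best_route(routes, link_stability):
    """Pick the route whose weakest link is the most stable."""
    return max(routes, key=lambda r: route_stability(r, link_stability))

stab = {frozenset({"S", "A"}): 0.9, frozenset({"A", "D"}): 0.4,
        frozenset({"S", "B"}): 0.7, frozenset({"B", "D"}): 0.8}
routes = [["S", "A", "D"], ["S", "B", "D"]]
print(best_route(routes, stab))  # ['S', 'B', 'D'] (bottleneck 0.7 vs 0.4)
```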

A Comparison of Some Thresholding Selection Methods for Wavelet Regression

In wavelet regression, choosing the threshold value is a crucial issue. A threshold that is too large cuts too many coefficients, resulting in oversmoothing; conversely, a threshold that is too small allows many coefficients into the reconstruction, giving a wiggly estimate that results in undersmoothing. The proper choice of threshold is therefore a careful balance of these principles. This paper gives a very brief introduction to some threshold selection methods: universal, SURE, EBayes, two-fold cross-validation and level-dependent cross-validation. A simulation study on a variety of sample sizes, test functions and signal-to-noise ratios is conducted to compare their numerical performance using three different noise structures. For Gaussian noise, EBayes performs best in all cases for all test functions, while two-fold cross-validation provides the best results in the case of long-tailed noise. For large signal-to-noise ratios, level-dependent cross-validation works well in the correlated noise case. As expected, increasing both the sample size and the signal-to-noise ratio increases estimation efficiency.
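For concreteness, the universal threshold is λ = σ√(2 log n) applied to the detail coefficients, with σ estimated by the usual MAD rule; a minimal soft-thresholding sketch follows, assuming the PyWavelets package:

```python
import numpy as np
import pywt  # assumes PyWavelets is installed

def universal_soft_threshold(signal, wavelet="db4", level=4):
    """Wavelet regression with the universal threshold sigma*sqrt(2 ln n);
    sigma is estimated from the finest-scale coefficients via the MAD rule."""
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745
    lam = sigma * np.sqrt(2 * np.log(len(signal)))
    denoised = [coeffs[0]] + [pywt.threshold(c, lam, mode="soft")
                              for c in coeffs[1:]]
    return pywt.waverec(denoised, wavelet)

# Noisy sine test function with Gaussian noise.
rng = np.random.default_rng(4)
x = np.linspace(0, 2 * np.pi, 512)
noisy = np.sin(x) + 0.3 * rng.normal(size=x.size)
estimate = universal_soft_threshold(noisy)
print(float(np.mean((estimate[:x.size] - np.sin(x)) ** 2)))  # reduced error
```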

M-ary Chaotic Sequence Based SLM-OFDM System for PAPR Reduction without Side-Information

Selected Mapping (SLM) is a PAPR reduction technique which converts the OFDM signal into several independent signals by multiplication with a phase sequence set and transmits the signal with the lowest PAPR. However, it requires the index of the selected signal, i.e. side information (SI), to be transmitted with each OFDM symbol. The PAPR reduction capability of the SLM scheme depends on the selection of the phase sequence set. In this paper, we propose a new phase sequence set generation scheme based on an M-ary chaotic sequence, together with a mapping scheme that maps quaternary data to a concentric circle constellation (CCC). It is shown that this method does not require SI and provides better SER performance with good PAPR reduction capability compared to existing SLM-OFDM methods.
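The core SLM step, independent of how the phase sequences are generated, is: multiply the frequency-domain symbol by each candidate phase sequence, take the IFFT, and keep the candidate with the lowest PAPR. A sketch with random stand-in phases (the paper's chaotic, SI-free construction is not reproduced here):

```python
import numpy as np

def papr_db(x):
    """Peak-to-average power ratio of a time-domain signal, in dB."""
    p = np.abs(x) ** 2
    return 10 * np.log10(p.max() / p.mean())

def slm_select(symbol_freq, phase_set):
    """Multiply the OFDM symbol by each phase sequence, IFFT,
    and return the candidate with the lowest PAPR."""
    candidates = [np.fft.ifft(symbol_freq * p) for p in phase_set]
    paprs = [papr_db(c) for c in candidates]
    best = int(np.argmin(paprs))
    return candidates[best], paprs[best], best

rng = np.random.default_rng(5)
N, U = 64, 8                                       # subcarriers, candidates
qpsk = rng.choice([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j], size=N) / np.sqrt(2)
phases = np.exp(2j * np.pi * rng.random((U, N)))   # stand-in for chaotic set
_, best_papr, idx = slm_select(qpsk, phases)
print(f"selected candidate {idx} with PAPR {best_papr:.2f} dB")
```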

Automatic Generation Control of Multi-Area Electric Energy Systems Using Modified GA

A modified Genetic Algorithm (GA) based optimal selection of parameters for Automatic Generation Control (AGC) of multi-area electric energy systems is proposed in this paper. Simulations on a multi-area reheat thermal system, with and without consideration of nonlinearities such as governor dead band, followed by a 1% step load perturbation, are performed to exemplify the optimum parameter search. In the proposed method, a modified one-point crossover is employed: positional dependency of the crossover site helps to maintain the diversity of search points as well as the exploitation of already known optimum values. This provides a trade-off between exploration and exploitation of the search space, finding the global optimum in fewer generations. The proposed GA, along with the decomposition technique developed, has been used to obtain optimum megawatt-frequency control of multi-area electric energy systems. Time-domain simulations are conducted with trapezoidal integration along with the decomposition technique. The superiority of the proposed method over an existing one is verified through simulations and comparisons.
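The abstract does not spell out the crossover modification, but the flavor of a position-dependent one-point crossover can be sketched as below, where the crossover site is drawn from a distribution biased toward later positions so that an already well-tuned prefix is preserved more often; this is our reading, not the paper's exact operator:

```python
import random

def biased_one_point_crossover(parent_a, parent_b, bias=2.0):
    """One-point crossover whose cut point is biased toward later
    positions, so early (already well-tuned) genes are preserved
    more often while later genes keep exploring."""
    n = len(parent_a)
    # Position-dependent weights: later cut points are more likely.
    weights = [(i + 1) ** bias for i in range(1, n)]
    cut = random.choices(range(1, n), weights=weights, k=1)[0]
    child1 = parent_a[:cut] + parent_b[cut:]
    child2 = parent_b[:cut] + parent_a[cut:]
    return child1, child2

random.seed(6)
a = [0.2, 0.5, 0.1, 0.9, 0.3]  # e.g. candidate AGC controller gains
b = [0.4, 0.6, 0.8, 0.2, 0.7]
print(biased_one_point_crossover(a, b))
```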

Fish Marketing: A Panacea towards Sustainable Agriculture in Ogun State, Nigeria

This study assessed fish marketing as a panacea towards sustainable agriculture in Ogun State, Nigeria. A multi-stage sampling technique was used in the selection of 150 fish marketers for this study. Descriptive statistics were used for the objectives, while Pearson Product-Moment Correlation was used to test the hypothesis. The findings revealed that the mean age of the respondents was 38.60 years. The majority (93.33%) of the respondents had acceptable levels of formal education. Many (44.00%) of the respondents had spent 1-5 years in fish marketing. The average quantity of fish sold in a day was 94.10 kg. However, efficient fish marketing was hindered by inadequate processing equipment, storage rooms and ice holding facilities (86.67%). There was a significant relationship between socio-economic characteristics and profit realized from fish marketing (p < 0.05). It was recommended that storage and warehousing facilities be provided to the fish marketers in the study area.

Classifier Combination Approach in Motion Imagery Signals Processing for Brain Computer Interface

In this study we focus on improving the performance of a cue-based Motor Imagery Brain-Computer Interface (BCI). For this purpose, a data fusion approach is applied to the results of different classifiers to make the best decision. As a first step, the Distinction Sensitive Learning Vector Quantization method is used as a feature selection technique to determine the most informative frequencies in the recorded signals, and its performance is evaluated by a frequency search method. Informative features are then extracted by the wavelet packet transform. In the next step, five different classification methods are applied. The methodologies are tested on BCI Competition II dataset III; the best obtained accuracy is 85% and the best kappa value is 0.8. In the final step, the ordered weighted averaging (OWA) method is used to properly aggregate the classifier outputs. Using OWA enhances the system accuracy to 95% and the kappa value to 0.9, and applying OWA takes just 50 milliseconds of computation.
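Ordered weighted averaging sorts the classifier outputs before weighting them, so the weights attach to ranks rather than to particular classifiers. A small sketch under assumed weights:

```python
import numpy as np

def owa(scores, weights):
    """Ordered weighted average: sort scores in descending order, then
    apply rank-based weights (weights must sum to 1)."""
    scores = np.sort(np.asarray(scores))[::-1]
    weights = np.asarray(weights)
    assert abs(weights.sum() - 1.0) < 1e-9
    return float(scores @ weights)

# Five classifiers' confidence scores for one motor-imagery class.
scores = [0.90, 0.75, 0.80, 0.60, 0.85]
weights = [0.3, 0.25, 0.2, 0.15, 0.1]  # emphasize the strongest votes
print(owa(scores, weights))  # aggregated confidence: 0.815
```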

Two-Phase Optimization for Selecting Materialized Views in a Data Warehouse

A data warehouse (DW) is a system whose value and role lie in supporting decision-making through querying. Queries to a DW are critical with regard to their complexity and length: they often access millions of tuples and involve joins between relations and aggregations. Materialized views can provide better performance for DW queries. However, these views have a maintenance cost, so materializing all views is not possible. An important challenge of the DW environment is materialized view selection, because we have to balance the trade-off between query performance and view maintenance cost. Therefore, in this paper, we introduce a new approach to this challenge based on Two-Phase Optimization (2PO), a combination of Simulated Annealing (SA) and Iterative Improvement (II), with the use of a Multiple View Processing Plan (MVPP). Our experiments show that 2PO outperforms the original algorithms in terms of both query processing cost and view maintenance cost.
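A condensed sketch of the two-phase idea: run iterative improvement to reach a good local optimum, then refine it with a simulated annealing pass that can escape local optima. The bit-tuple encoding and toy cost below stand in for the MVPP-derived query-processing and maintenance costs:

```python
import math, random

def two_phase_optimize(cost, neighbor, initial_states,
                       ii_steps=200, sa_steps=200, temp=1.0, cooling=0.95):
    """Phase 1: iterative improvement (II) from several random starts.
    Phase 2: simulated annealing (SA) seeded with the phase-1 optimum."""
    # Phase 1: accept only improving moves.
    best = min(initial_states, key=cost)
    for _ in range(ii_steps):
        cand = neighbor(best)
        if cost(cand) < cost(best):
            best = cand
    # Phase 2: annealing may accept uphill moves to escape local optima.
    state = best
    for _ in range(sa_steps):
        cand = neighbor(state)
        delta = cost(cand) - cost(state)
        if delta < 0 or random.random() < math.exp(-delta / temp):
            state = cand
            if cost(state) < cost(best):
                best = state
        temp *= cooling
    return best

# Toy cost over view subsets encoded as bit tuples.
random.seed(7)
target = (1, 0, 1, 1, 0, 1)
cost = lambda s: sum(a != b for a, b in zip(s, target))
def flip_one(s):
    i = random.randrange(len(s))
    return s[:i] + (1 - s[i],) + s[i + 1:]
starts = [tuple(random.randint(0, 1) for _ in range(6)) for _ in range(5)]
print(two_phase_optimize(cost, flip_one, starts))
```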

Quantitative Estimation of Periodicities in Lyari River Flow Routing

Hydrologic time series data display a periodic structure, and the periodic autoregressive (PAR) process receives considerable attention in the modeling of such series. In this communication, a long-term record of the monthly waste flow of the Lyari river is used to quantify these periodicities using the PAR modeling technique. The parameters of the model are estimated using the Franses and Paap methodology. This study shows that a periodic autoregressive model of order 2 is the most parsimonious model for assessing periodicity in the waste flow of the river. A careful statistical analysis of the residuals of the PAR(2) model is used to establish goodness of fit. Forecasts using the proposed model confirm its significance and effectiveness.
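For reference, a periodic autoregressive model of order 2 lets the autoregressive coefficients vary with the month s(t) in which observation t falls; a standard formulation consistent with the abstract (notation ours) is:

```latex
% PAR(2): month-dependent coefficients \phi_{1,s}, \phi_{2,s}, s = 1,\dots,12
y_t = \mu_{s(t)} + \phi_{1,s(t)}\, y_{t-1} + \phi_{2,s(t)}\, y_{t-2} + \varepsilon_t,
\qquad \varepsilon_t \overset{\text{iid}}{\sim} \mathcal{N}\!\left(0, \sigma_{s(t)}^2\right).
```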