Development of Subjective Measures of Interestingness: From Unexpectedness to Shocking

Knowledge Discovery of Databases (KDD) is the process of extracting previously unknown but useful and significant information from large massive volume of databases. Data Mining is a stage in the entire process of KDD which applies an algorithm to extract interesting patterns. Usually, such algorithms generate huge volume of patterns. These patterns have to be evaluated by using interestingness measures to reflect the user requirements. Interestingness is defined in different ways, (i) Objective measures (ii) Subjective measures. Objective measures such as support and confidence extract meaningful patterns based on the structure of the patterns, while subjective measures such as unexpectedness and novelty reflect the user perspective. In this report, we try to brief the more widely spread and successful subjective measures and propose a new subjective measure of interestingness, i.e. shocking.

On the EM Algorithm and Bootstrap Approach Combination for Improving Satellite Image Fusion

This paper discusses EM algorithm and Bootstrap approach combination applied for the improvement of the satellite image fusion process. This novel satellite image fusion method based on estimation theory EM algorithm and reinforced by Bootstrap approach was successfully implemented and tested. The sensor images are firstly split by a Bayesian segmentation method to determine a joint region map for the fused image. Then, we use the EM algorithm in conjunction with the Bootstrap approach to develop the bootstrap EM fusion algorithm, hence producing the fused targeted image. We proposed in this research to estimate the statistical parameters from some iterative equations of the EM algorithm relying on a reference of representative Bootstrap samples of images. Sizes of those samples are determined from a new criterion called 'hybrid criterion'. Consequently, the obtained results of our work show that using the Bootstrap EM (BEM) in image fusion improve performances of estimated parameters which involve amelioration of the fused image quality; and reduce the computing time during the fusion process.

Analysis of Equal cost Adaptive Routing Algorithms using Connection-Oriented and Connectionless Protocols

This research paper evaluates and compares the performance of equal cost adaptive multi-path routing algorithms taking the transport protocols TCP (Transmission Control Protocol) and UDP (User Datagram Protocol) using network simulator ns2 and concludes which one is better.

New Enhanced Hexagon-Based Search Using Point-Oriented Inner Search for Fast Block Motion Estimation

Recently, an enhanced hexagon-based search (EHS) algorithm was proposed to speedup the original hexagon-based search (HS) by exploiting the group-distortion information of some evaluated points. In this paper, a second version of the EHS is proposed with a new point-oriented inner search technique which can further speedup the HS in both large and small motion environments. Experimental results show that the enhanced hexagon-based search version-2 (EHS2) is faster than the HS up to 34% with negligible PSNR degradation.

Evolutionary Techniques Based Combined Artificial Neural Networks for Peak Load Forecasting

This paper presents a new approach using Combined Artificial Neural Network (CANN) module for daily peak load forecasting. Five different computational techniques –Constrained method, Unconstrained method, Evolutionary Programming (EP), Particle Swarm Optimization (PSO), and Genetic Algorithm (GA) – have been used to identify the CANN module for peak load forecasting. In this paper, a set of neural networks has been trained with different architecture and training parameters. The networks are trained and tested for the actual load data of Chennai city (India). A set of better trained conventional ANNs are selected to develop a CANN module using different algorithms instead of using one best conventional ANN. Obtained results using CANN module confirm its validity.

Implementation and Analysis of Elliptic Curve Cryptosystems over Polynomial basis and ONB

Polynomial bases and normal bases are both used for elliptic curve cryptosystems, but field arithmetic operations such as multiplication, inversion and doubling for each basis are implemented by different methods. In general, it is said that normal bases, especially optimal normal bases (ONB) which are special cases on normal bases, are efficient for the implementation in hardware in comparison with polynomial bases. However there seems to be more examined by implementing and analyzing these systems under similar condition. In this paper, we designed field arithmetic operators for each basis over GF(2233), which field has a polynomial basis recommended by SEC2 and a type-II ONB both, and analyzed these implementation results. And, in addition, we predicted the efficiency of two elliptic curve cryptosystems using these field arithmetic operators.

Combined Sewer Overflow forecasting with Feed-forward Back-propagation Artificial Neural Network

A feed-forward, back-propagation Artificial Neural Network (ANN) model has been used to forecast the occurrences of wastewater overflows in a combined sewerage reticulation system. This approach was tested to evaluate its applicability as a method alternative to the common practice of developing a complete conceptual, mathematical hydrological-hydraulic model for the sewerage system to enable such forecasts. The ANN approach obviates the need for a-priori understanding and representation of the underlying hydrological hydraulic phenomena in mathematical terms but enables learning the characteristics of a sewer overflow from the historical data. The performance of the standard feed-forward, back-propagation of error algorithm was enhanced by a modified data normalizing technique that enabled the ANN model to extrapolate into the territory that was unseen by the training data. The algorithm and the data normalizing method are presented along with the ANN model output results that indicate a good accuracy in the forecasted sewer overflow rates. However, it was revealed that the accurate forecasting of the overflow rates are heavily dependent on the availability of a real-time flow monitoring at the overflow structure to provide antecedent flow rate data. The ability of the ANN to forecast the overflow rates without the antecedent flow rates (as is the case with traditional conceptual reticulation models) was found to be quite poor.

Experimental and Theoretical Investigation of Rough Rice Drying in Infrared-assisted Hot Air Dryer Using Artificial Neural Network

Drying characteristics of rough rice (variety of lenjan) with an initial moisture content of 25% dry basis (db) was studied in a hot air dryer assisted by infrared heating. Three arrival air temperatures (30, 40 and 500C) and four infrared radiation intensities (0, 0.2 , 0.4 and 0.6 W/cm2) and three arrival air speeds (0.1, 0.15 and 0.2 m.s-1) were studied. Bending strength of brown rice kernel, percentage of cracked kernels and time of drying were measured and evaluated. The results showed that increasing the drying arrival air temperature and radiation intensity of infrared resulted decrease in drying time. High bending strength and low percentage of cracked kernel was obtained when paddy was dried by hot air assisted infrared dryer. Between this factors and their interactive effect were a significant difference (p

Efficient Dimensionality Reduction of Directional Overcurrent Relays Optimal Coordination Problem

Directional over current relays (DOCR) are commonly used in power system protection as a primary protection in distribution and sub-transmission electrical systems and as a secondary protection in transmission systems. Coordination of protective relays is necessary to obtain selective tripping. In this paper, an approach for efficiency reduction of DOCRs nonlinear optimum coordination (OC) is proposed. This was achieved by modifying the objective function and relaxing several constraints depending on the four constraints classification, non-valid, redundant, pre-obtained and valid constraints. According to this classification, the far end fault effect on the objective function and constraints, and in consequently on relay operating time, was studied. The study was carried out, firstly by taking into account the near-end and far-end faults in DOCRs coordination problem formulation; and then faults very close to the primary relays (nearend faults). The optimal coordination (OC) was achieved by simultaneously optimizing all variables (TDS and Ip) in nonlinear environment by using of Genetic algorithm nonlinear programming techniques. The results application of the above two approaches on 6-bus and 26-bus system verify that the far-end faults consideration on OC problem formulation don-t lose the optimality.

Improving Image Quality in Remote Sensing Satellites using Channel Coding

Among other factors that characterize satellite communication channels is their high bit error rate. We present a system for still image transmission over noisy satellite channels. The system couples image compression together with error control codes to improve the received image quality while maintaining its bandwidth requirements. The proposed system is tested using a high resolution satellite imagery simulated over the Rician fading channel. Evaluation results show improvement in overall system including image quality and bandwidth requirements compared to similar systems with different coding schemes.

Economic Evaluations Using Genetic Algorithms to Determine the Territorial Impact Caused by High Speed Railways

The evolution of technology and construction techniques has enabled the upgrading of transport networks. In particular, the high-speed rail networks allow convoys to peak at above 300 km/h. These structures, however, often significantly impact the surrounding environment. Among the effects of greater importance are the ones provoked by the soundwave connected to train transit. The wave propagation affects the quality of life in areas surrounding the tracks, often for several hundred metres. There are substantial damages to properties (buildings and land), in terms of market depreciation. The present study, integrating expertise in acoustics, computering and evaluation fields, outlines a useful model to select project paths so as to minimize the noise impact and reduce the causes of possible litigation. It also facilitates the rational selection of initiatives to contain the environmental damage to the already existing railway tracks. The research is developed with reference to the Italian regulatory framework (usually more stringent than European and international standards) and refers to a case study concerning the high speed network in Italy.

Identification, Prediction and Detection of the Process Fault in a Cement Rotary Kiln by Locally Linear Neuro-Fuzzy Technique

In this paper, we use nonlinear system identification method to predict and detect process fault of a cement rotary kiln. After selecting proper inputs and output, an input-output model is identified for the plant. To identify the various operation points in the kiln, Locally Linear Neuro-Fuzzy (LLNF) model is used. This model is trained by LOLIMOT algorithm which is an incremental treestructure algorithm. Then, by using this method, we obtained 3 distinct models for the normal and faulty situations in the kiln. One of the models is for normal condition of the kiln with 15 minutes prediction horizon. The other two models are for the two faulty situations in the kiln with 7 minutes prediction horizon are presented. At the end, we detect these faults in validation data. The data collected from White Saveh Cement Company is used for in this study.

A Review of Methods for 2D/3D Registration

2D/3D registration is a special case of medical image registration which is of particular interest to surgeons. Applications of 2D/3D registration are [1] radiotherapy planning and treatment verification, spinal surgery, hip replacement, neurointerventions and aortic stenting. The purpose of this paper is to provide a literature review of the main methods for image registration for the 2D/3D case. At the end of the paper an algorithm is proposed for 2D/3D registration based on the Chebyssev polynomials iteration loop.

Actionable Rules: Issues and New Directions

Knowledge Discovery in Databases (KDD) is the process of extracting previously unknown, hidden and interesting patterns from a huge amount of data stored in databases. Data mining is a stage of the KDD process that aims at selecting and applying a particular data mining algorithm to extract an interesting and useful knowledge. It is highly expected that data mining methods will find interesting patterns according to some measures, from databases. It is of vital importance to define good measures of interestingness that would allow the system to discover only the useful patterns. Measures of interestingness are divided into objective and subjective measures. Objective measures are those that depend only on the structure of a pattern and which can be quantified by using statistical methods. While, subjective measures depend only on the subjectivity and understandability of the user who examine the patterns. These subjective measures are further divided into actionable, unexpected and novel. The key issues that faces data mining community is how to make actions on the basis of discovered knowledge. For a pattern to be actionable, the user subjectivity is captured by providing his/her background knowledge about domain. Here, we consider the actionability of the discovered knowledge as a measure of interestingness and raise important issues which need to be addressed to discover actionable knowledge.

Active and Reactive Power Control of a DFIG with MPPT for Variable Speed Wind Energy Conversion using Sliding Mode Control

This paper presents the study of a variable speed wind energy conversion system based on a Doubly Fed Induction Generator (DFIG) based on a sliding mode control applied to achieve control of active and reactive powers exchanged between the stator of the DFIG and the grid to ensure a Maximum Power Point Tracking (MPPT) of a wind energy conversion system. The proposed control algorithm is applied to a DFIG whose stator is directly connected to the grid and the rotor is connected to the PWM converter. To extract a maximum of power, the rotor side converter is controlled by using a stator flux-oriented strategy. The created decoupling control between active and reactive stator power allows keeping the power factor close to unity. Simulation results show that the wind turbine can operate at its optimum energy for a wide range of wind speed.

Evaluating some Feature Selection Methods for an Improved SVM Classifier

Text categorization is the problem of classifying text documents into a set of predefined classes. After a preprocessing step the documents are typically represented as large sparse vectors. When training classifiers on large collections of documents, both the time and memory restrictions can be quite prohibitive. This justifies the application of features selection methods to reduce the dimensionality of the document-representation vector. Four feature selection methods are evaluated: Random Selection, Information Gain (IG), Support Vector Machine (called SVM_FS) and Genetic Algorithm with SVM (GA_FS). We showed that the best results were obtained with SVM_FS and GA_FS methods for a relatively small dimension of the features vector comparative with the IG method that involves longer vectors, for quite similar classification accuracies. Also we present a novel method to better correlate SVM kernel-s parameters (Polynomial or Gaussian kernel).

Identification of Aircraft Gas Turbine Engines Temperature Condition

Groundlessness of application probability-statistic methods are especially shown at an early stage of the aviation GTE technical condition diagnosing, when the volume of the information has property of the fuzzy, limitations, uncertainty and efficiency of application of new technology Soft computing at these diagnosing stages by using the fuzzy logic and neural networks methods. It is made training with high accuracy of multiple linear and nonlinear models (the regression equations) received on the statistical fuzzy data basis. At the information sufficiency it is offered to use recurrent algorithm of aviation GTE technical condition identification on measurements of input and output parameters of the multiple linear and nonlinear generalized models at presence of noise measured (the new recursive least squares method (LSM)). As application of the given technique the estimation of the new operating aviation engine D30KU-154 technical condition at height H=10600 m was made.

Software Maintenance Severity Prediction with Soft Computing Approach

As the majority of faults are found in a few of its modules so there is a need to investigate the modules that are affected severely as compared to other modules and proper maintenance need to be done on time especially for the critical applications. In this paper, we have explored the different predictor models to NASA-s public domain defect dataset coded in Perl programming language. Different machine learning algorithms belonging to the different learner categories of the WEKA project including Mamdani Based Fuzzy Inference System and Neuro-fuzzy based system have been evaluated for the modeling of maintenance severity or impact of fault severity. The results are recorded in terms of Accuracy, Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE). The results show that Neuro-fuzzy based model provides relatively better prediction accuracy as compared to other models and hence, can be used for the maintenance severity prediction of the software.

Digital filters for Hot-Mix Asphalt Complex Modulus Test Data Using Genetic Algorithm Strategies

The dynamic or complex modulus test is considered to be a mechanistically based laboratory test to reliably characterize the strength and load-resistance of Hot-Mix Asphalt (HMA) mixes used in the construction of roads. The most common observation is that the data collected from these tests are often noisy and somewhat non-sinusoidal. This hampers accurate analysis of the data to obtain engineering insight. The goal of the work presented in this paper is to develop and compare automated evolutionary computational techniques to filter test noise in the collection of data for the HMA complex modulus test. The results showed that the Covariance Matrix Adaptation-Evolutionary Strategy (CMA-ES) approach is computationally efficient for filtering data obtained from the HMA complex modulus test.

An Exact Solution of Axi-symmetric Conductive Heat Transfer in Cylindrical Composite Laminate under the General Boundary Condition

This study presents an exact general solution for steady-state conductive heat transfer in cylindrical composite laminates. Appropriate Fourier transformation has been obtained using Sturm-Liouville theorem. Series coefficients are achieved by solving a set of equations that related to thermal boundary conditions at inner and outer of the cylinder, also related to temperature continuity and heat flux continuity between each layer. The solution of this set of equations are obtained using Thomas algorithm. In this paper, the effect of fibers- angle on temperature distribution of composite laminate is investigated under general boundary conditions. Here, we show that the temperature distribution for any composite laminates is between temperature distribution for laminates with θ = 0° and θ = 90° .