DWM-CDD: Dynamic Weighted Majority Concept Drift Detection for Spam Mail Filtering

Although e-mail is the most efficient and popular communication method, unwanted and mass unsolicited e-mails, also called spam mail, endanger the existence of the mail system. This paper proposes a new algorithm called Dynamic Weighted Majority Concept Drift Detection (DWM-CDD) for content-based filtering. The design purposes of DWM-CDD are first to accurate the performance of the previously proposed algorithms, and second to speed up the time to construct the model. The results show that DWM-CDD can detect both sudden and gradual changes quickly and accurately. Moreover, the time needed for model construction is less than previously proposed algorithms.

Anti-Counterfeiting Solution Employing Mobile RFID Environment

EPC Class-1 Generation-2 UHF tags, one of Radio frequency identification or RFID tag types, is expected that most companies are planning to use it in the supply chain in the short term and in consumer packaging in the long term due to its inexpensive cost. Because of the very cost, however, its resources are extremely scarce and it is hard to have any valuable security algorithms in it. It causes security vulnerabilities, in particular cloning the tags for counterfeits. In this paper, we propose a product authentication solution for anti-counterfeiting at application level in the supply chain and mobile RFID environment. It aims to become aware of distribution of spurious products with fake RFID tags and to provide a product authentication service to general consumers with mobile RFID devices like mobile phone or PDA which has a mobile RFID reader. We will discuss anti-counterfeiting mechanisms which are required to our proposed solution and address requirements that the mechanisms should have.

Simulation Tools for Fixed Point DSP Algorithms and Architectures

This paper presents software tools that convert the C/Cµ floating point source code for a DSP algorithm into a fixedpoint simulation model that can be used to evaluate the numericalperformance of the algorithm on several different fixed pointplatforms including microprocessors, DSPs and FPGAs. The tools use a novel system for maintaining binary point informationso that the conversion from floating point to fixed point isautomated and the resulting fixed point algorithm achieves maximum possible precision. A configurable architecture is used during the simulation phase so that the algorithm can produce a bit-exact output for several different target devices.

Dynamic Decompression for Text Files

Compression algorithms reduce the redundancy in data representation to decrease the storage required for that data. Lossless compression researchers have developed highly sophisticated approaches, such as Huffman encoding, arithmetic encoding, the Lempel-Ziv (LZ) family, Dynamic Markov Compression (DMC), Prediction by Partial Matching (PPM), and Burrows-Wheeler Transform (BWT) based algorithms. Decompression is also required to retrieve the original data by lossless means. A compression scheme for text files coupled with the principle of dynamic decompression, which decompresses only the section of the compressed text file required by the user instead of decompressing the entire text file. Dynamic decompressed files offer better disk space utilization due to higher compression ratios compared to most of the currently available text file formats.

CAPWAP Status and Design Considerations for Seamless Roaming Support

Wireless LAN technologies have picked up momentum in the recent years due to their ease of deployment, cost and availability. The era of wireless LAN has also given rise to unique applications like VOIP, IPTV and unified messaging. However, these real-time applications are very sensitive to network and handoff latencies. To successfully support these applications, seamless roaming during the movement of mobile station has become crucial. Nowadays, centralized architecture models support roaming in WLANs. They have the ability to manage, control and troubleshoot large scale WLAN deployments. This model is managed by Control and Provision of Wireless Access Point protocol (CAPWAP). This paper covers the CAPWAP architectural solution along with its proposals that have emerged. Based on the literature survey conducted in this paper, we found that the proposed algorithms to reduce roaming latency in CAPWAP architecture do not support seamless roaming. Additionally, they are not sufficient during the initial period of the network. This paper also suggests important design consideration for mobility support in future centralized IEEE 802.11 networks.

Correlation-based Feature Selection using Ant Colony Optimization

Feature selection has recently been the subject of intensive research in data mining, specially for datasets with a large number of attributes. Recent work has shown that feature selection can have a positive effect on the performance of machine learning algorithms. The success of many learning algorithms in their attempts to construct models of data, hinges on the reliable identification of a small set of highly predictive attributes. The inclusion of irrelevant, redundant and noisy attributes in the model building process phase can result in poor predictive performance and increased computation. In this paper, a novel feature search procedure that utilizes the Ant Colony Optimization (ACO) is presented. The ACO is a metaheuristic inspired by the behavior of real ants in their search for the shortest paths to food sources. It looks for optimal solutions by considering both local heuristics and previous knowledge. When applied to two different classification problems, the proposed algorithm achieved very promising results.

A Pairwise-Gaussian-Merging Approach: Towards Genome Segmentation for Copy Number Analysis

Segmentation, filtering out of measurement errors and identification of breakpoints are integral parts of any analysis of microarray data for the detection of copy number variation (CNV). Existing algorithms designed for these tasks have had some successes in the past, but they tend to be O(N2) in either computation time or memory requirement, or both, and the rapid advance of microarray resolution has practically rendered such algorithms useless. Here we propose an algorithm, SAD, that is much faster and much less thirsty for memory – O(N) in both computation time and memory requirement -- and offers higher accuracy. The two key ingredients of SAD are the fundamental assumption in statistics that measurement errors are normally distributed and the mathematical relation that the product of two Gaussians is another Gaussian (function). We have produced a computer program for analyzing CNV based on SAD. In addition to being fast and small it offers two important features: quantitative statistics for predictions and, with only two user-decided parameters, ease of use. Its speed shows little dependence on genomic profile. Running on an average modern computer, it completes CNV analyses for a 262 thousand-probe array in ~1 second and a 1.8 million-probe array in 9 seconds

Analysis of Modified Heap Sort Algorithm on Different Environment

In field of Computer Science and Mathematics, sorting algorithm is an algorithm that puts elements of a list in a certain order i.e. ascending or descending. Sorting is perhaps the most widely studied problem in computer science and is frequently used as a benchmark of a system-s performance. This paper presented the comparative performance study of four sorting algorithms on different platform. For each machine, it is found that the algorithm depends upon the number of elements to be sorted. In addition, as expected, results show that the relative performance of the algorithms differed on the various machines. So, algorithm performance is dependent on data size and there exists impact of hardware also.

Performance Analysis of a Series of Adaptive Filters in Non-Stationary Environment for Noise Cancelling Setup

One of the essential components of much of DSP application is noise cancellation. Changes in real time signals are quite rapid and swift. In noise cancellation, a reference signal which is an approximation of noise signal (that corrupts the original information signal) is obtained and then subtracted from the noise bearing signal to obtain a noise free signal. This approximation of noise signal is obtained through adaptive filters which are self adjusting. As the changes in real time signals are abrupt, this needs adaptive algorithm that converges fast and is stable. Least mean square (LMS) and normalized LMS (NLMS) are two widely used algorithms because of their plainness in calculations and implementation. But their convergence rates are small. Adaptive averaging filters (AFA) are also used because they have high convergence, but they are less stable. This paper provides the comparative study of LMS and Normalized NLMS, AFA and new enhanced average adaptive (Average NLMS-ANLMS) filters for noise cancelling application using speech signals.

Dissipation of Higher Mode using Numerical Integration Algorithm in Dynamic Analysis

In general dynamic analyses, lower mode response is of interest, however the higher modes of spatially discretized equations generally do not represent the real behavior and not affects to global response much. Some implicit algorithms, therefore, are introduced to filter out the high-frequency modes using intended numerical error. The objective of this study is to introduce the P-method and PC α-method to compare that with dissipation method and Newmark method through the stability analysis and numerical example. PC α-method gives more accuracy than other methods because it based on the α-method inherits the superior properties of the implicit α-method. In finite element analysis, the PC α-method is more useful than other methods because it is the explicit scheme and it achieves the second order accuracy and numerical damping simultaneously.

A Bi-Objective Model for Location-Allocation Problem within Queuing Framework

This paper proposes a bi-objective model for the facility location problem under a congestion system. The idea of the model is motivated by applications of locating servers in bank automated teller machines (ATMS), communication networks, and so on. This model can be specifically considered for situations in which fixed service facilities are congested by stochastic demand within queueing framework. We formulate this model with two perspectives simultaneously: (i) customers and (ii) service provider. The objectives of the model are to minimize (i) the total expected travelling and waiting time and (ii) the average facility idle-time. This model represents a mixed-integer nonlinear programming problem which belongs to the class of NP-hard problems. In addition, to solve the model, two metaheuristic algorithms including nondominated sorting genetic algorithms (NSGA-II) and non-dominated ranking genetic algorithms (NRGA) are proposed. Besides, to evaluate the performance of the two algorithms some numerical examples are produced and analyzed with some metrics to determine which algorithm works better.

Evolutionary Approach for Automated Discovery of Censored Production Rules

In the recent past, there has been an increasing interest in applying evolutionary methods to Knowledge Discovery in Databases (KDD) and a number of successful applications of Genetic Algorithms (GA) and Genetic Programming (GP) to KDD have been demonstrated. The most predominant representation of the discovered knowledge is the standard Production Rules (PRs) in the form If P Then D. The PRs, however, are unable to handle exceptions and do not exhibit variable precision. The Censored Production Rules (CPRs), an extension of PRs, were proposed by Michalski & Winston that exhibit variable precision and supports an efficient mechanism for handling exceptions. A CPR is an augmented production rule of the form: If P Then D Unless C, where C (Censor) is an exception to the rule. Such rules are employed in situations, in which the conditional statement 'If P Then D' holds frequently and the assertion C holds rarely. By using a rule of this type we are free to ignore the exception conditions, when the resources needed to establish its presence are tight or there is simply no information available as to whether it holds or not. Thus, the 'If P Then D' part of the CPR expresses important information, while the Unless C part acts only as a switch and changes the polarity of D to ~D. This paper presents a classification algorithm based on evolutionary approach that discovers comprehensible rules with exceptions in the form of CPRs. The proposed approach has flexible chromosome encoding, where each chromosome corresponds to a CPR. Appropriate genetic operators are suggested and a fitness function is proposed that incorporates the basic constraints on CPRs. Experimental results are presented to demonstrate the performance of the proposed algorithm.

Non-negative Principal Component Analysis for Face Recognition

Principle component analysis is often combined with the state-of-art classification algorithms to recognize human faces. However, principle component analysis can only capture these features contributing to the global characteristics of data because it is a global feature selection algorithm. It misses those features contributing to the local characteristics of data because each principal component only contains some levels of global characteristics of data. In this study, we present a novel face recognition approach using non-negative principal component analysis which is added with the constraint of non-negative to improve data locality and contribute to elucidating latent data structures. Experiments are performed on the Cambridge ORL face database. We demonstrate the strong performances of the algorithm in recognizing human faces in comparison with PCA and NREMF approaches.

Solving Part Type Selection and Loading Problem in Flexible Manufacturing System Using Real Coded Genetic Algorithms – Part I: Modeling

This paper and its companion (Part 2) deal with modeling and optimization of two NP-hard problems in production planning of flexible manufacturing system (FMS), part type selection problem and loading problem. The part type selection problem and the loading problem are strongly related and heavily influence the system-s efficiency and productivity. The complexity of the problems is harder when flexibilities of operations such as the possibility of operation processed on alternative machines with alternative tools are considered. These problems have been modeled and solved simultaneously by using real coded genetic algorithms (RCGA) which uses an array of real numbers as chromosome representation. These real numbers can be converted into part type sequence and machines that are used to process the part types. This first part of the papers focuses on the modeling of the problems and discussing how the novel chromosome representation can be applied to solve the problems. The second part will discuss the effectiveness of the RCGA to solve various test bed problems.

Investigating Simple Multipath Compensation for Frequency Modulated Signals at Lower Frequencies

Radio propagation from point-to-point is affected by the physical channel in many ways. A signal arriving at a destination travels through a number of different paths which are referred to as multi-paths. Research in this area of wireless communications has progressed well over the years with the research taking different angles of focus. By this is meant that some researchers focus on ways of reducing or eluding Multipath effects whilst others focus on ways of mitigating the effects of Multipath through compensation schemes. Baseband processing is seen as one field of signal processing that is cardinal to the advancement of software defined radio technology. This has led to wide research into the carrying out certain algorithms at baseband. This paper considers compensating for Multipath for Frequency Modulated signals. The compensation process is carried out at Radio frequency (RF) and at Quadrature baseband (QBB) and the results are compared. Simulations are carried out using MatLab so as to show the benefits of working at lower QBB frequencies than at RF.

Modeling of Dielectric Heating in Radio- Frequency Applicator Optimized for Uniform Temperature by Means of Genetic Algorithms

The paper presents an optimization study based on genetic algorithms (GA-s) for a radio-frequency applicator used in heating dielectric band products. The weakly coupled electro-thermal problem is analyzed using 2D-FEM. The design variables in the optimization process are: the voltage of a supplementary “guard" electrode and six geometric parameters of the applicator. Two objective functions are used: temperature uniformity and total active power absorbed by the dielectric. Both mono-objective and multiobjective formulations are implemented in GA optimization.

Intelligent Heart Disease Prediction System Using CANFIS and Genetic Algorithm

Heart disease (HD) is a major cause of morbidity and mortality in the modern society. Medical diagnosis is an important but complicated task that should be performed accurately and efficiently and its automation would be very useful. All doctors are unfortunately not equally skilled in every sub specialty and they are in many places a scarce resource. A system for automated medical diagnosis would enhance medical care and reduce costs. In this paper, a new approach based on coactive neuro-fuzzy inference system (CANFIS) was presented for prediction of heart disease. The proposed CANFIS model combined the neural network adaptive capabilities and the fuzzy logic qualitative approach which is then integrated with genetic algorithm to diagnose the presence of the disease. The performances of the CANFIS model were evaluated in terms of training performances and classification accuracies and the results showed that the proposed CANFIS model has great potential in predicting the heart disease.

An Adaptive Model for Blind Image Restoration using Bayesian Approach

Image restoration involves elimination of noise. Filtering techniques were adopted so far to restore images since last five decades. In this paper, we consider the problem of image restoration degraded by a blur function and corrupted by random noise. A method for reducing additive noise in images by explicit analysis of local image statistics is introduced and compared to other noise reduction methods. The proposed method, which makes use of an a priori noise model, has been evaluated on various types of images. Bayesian based algorithms and technique of image processing have been described and substantiated with experimentation using MATLAB.

Data Preprocessing for Supervised Leaning

Many factors affect the success of Machine Learning (ML) on a given task. The representation and quality of the instance data is first and foremost. If there is much irrelevant and redundant information present or noisy and unreliable data, then knowledge discovery during the training phase is more difficult. It is well known that data preparation and filtering steps take considerable amount of processing time in ML problems. Data pre-processing includes data cleaning, normalization, transformation, feature extraction and selection, etc. The product of data pre-processing is the final training set. It would be nice if a single sequence of data pre-processing algorithms had the best performance for each data set but this is not happened. Thus, we present the most well know algorithms for each step of data pre-processing so that one achieves the best performance for their data set.

A Codebook-based Redundancy Suppression Mechanism with Lifetime Prediction in Cluster-based WSN

Wireless Sensor Network (WSN) comprises of sensor nodes which are designed to sense the environment, transmit sensed data back to the base station via multi-hop routing to reconstruct physical phenomena. Since physical phenomena exists significant overlaps between temporal redundancy and spatial redundancy, it is necessary to use Redundancy Suppression Algorithms (RSA) for sensor node to lower energy consumption by reducing the transmission of redundancy. A conventional algorithm of RSAs is threshold-based RSA, which sets threshold to suppress redundant data. Although many temporal and spatial RSAs are proposed, temporal-spatial RSA are seldom to be proposed because it is difficult to determine when to utilize temporal or spatial RSAs. In this paper, we proposed a novel temporal-spatial redundancy suppression algorithm, Codebookbase Redundancy Suppression Mechanism (CRSM). CRSM adopts vector quantization to generate a codebook, which is easily used to implement temporal-spatial RSA. CRSM not only achieves power saving and reliability for WSN, but also provides the predictability of network lifetime. Simulation result shows that the network lifetime of CRSM outperforms at least 23% of that of other RSAs.