Genetic Mining: Using Genetic Algorithm for Topic based on Concept Distribution

Today, Genetic Algorithm has been used to solve wide range of optimization problems. Some researches conduct on applying Genetic Algorithm to text classification, summarization and information retrieval system in text mining process. This researches show a better performance due to the nature of Genetic Algorithm. In this paper a new algorithm for using Genetic Algorithm in concept weighting and topic identification, based on concept standard deviation will be explored.

A new Adaptive Approach for Histogram based Mouth Segmentation

The segmentation of mouth and lips is a fundamental problem in facial image analyisis. In this paper we propose a method for lip segmentation based on rg-color histogram. Statistical analysis shows, using the rg-color-space is optimal for this purpose of a pure color based segmentation. Initially a rough adaptive threshold selects a histogram region, that assures that all pixels in that region are skin pixels. Based on that pixels we build a gaussian model which represents the skin pixels distribution and is utilized to obtain a refined, optimal threshold. We are not incorporating shape or edge information. In experiments we show the performance of our lip pixel segmentation method compared to the ground truth of our dataset and a conventional watershed algorithm.

Breast Skin-Line Estimation and Breast Segmentation in Mammograms using Fast-Marching Method

Breast skin-line estimation and breast segmentation is an important pre-process in mammogram image processing and computer-aided diagnosis of breast cancer. Limiting the area to be processed into a specific target region in an image would increase the accuracy and efficiency of processing algorithms. In this paper we are presenting a new algorithm for estimating skin-line and breast segmentation using fast marching algorithm. Fast marching is a partial-differential equation based numerical technique to track evolution of interfaces. We have introduced some modifications to the traditional fast marching method, specifically to improve the accuracy of skin-line estimation and breast tissue segmentation. Proposed modifications ensure that the evolving front stops near the desired boundary. We have evaluated the performance of the algorithm by using 100 mammogram images taken from mini-MIAS database. The results obtained from the experimental evaluation indicate that this algorithm explains 98.6% of the ground truth breast region and accuracy of the segmentation is 99.1%. Also this algorithm is capable of partially-extracting nipple when it is available in the profile.

Multidimensional Visualization Tools for Analysis of Expression Data

Expression data analysis is based mostly on the statistical approaches that are indispensable for the study of biological systems. Large amounts of multidimensional data resulting from the high-throughput technologies are not completely served by biostatistical techniques and are usually complemented with visual, knowledge discovery and other computational tools. In many cases, in biological systems we only speculate on the processes that are causing the changes, and it is the visual explorative analysis of data during which a hypothesis is formed. We would like to show the usability of multidimensional visualization tools and promote their use in life sciences. We survey and show some of the multidimensional visualization tools in the process of data exploration, such as parallel coordinates and radviz and we extend them by combining them with the self-organizing map algorithm. We use a time course data set of transitional cell carcinoma of the bladder in our examples. Analysis of data with these tools has the potential to uncover additional relationships and non-trivial structures.

A Novel In-Place Sorting Algorithm with O(n log z) Comparisons and O(n log z) Moves

In-place sorting algorithms play an important role in many fields such as very large database systems, data warehouses, data mining, etc. Such algorithms maximize the size of data that can be processed in main memory without input/output operations. In this paper, a novel in-place sorting algorithm is presented. The algorithm comprises two phases; rearranging the input unsorted array in place, resulting segments that are ordered relative to each other but whose elements are yet to be sorted. The first phase requires linear time, while, in the second phase, elements of each segment are sorted inplace in the order of z log (z), where z is the size of the segment, and O(1) auxiliary storage. The algorithm performs, in the worst case, for an array of size n, an O(n log z) element comparisons and O(n log z) element moves. Further, no auxiliary arithmetic operations with indices are required. Besides these theoretical achievements of this algorithm, it is of practical interest, because of its simplicity. Experimental results also show that it outperforms other in-place sorting algorithms. Finally, the analysis of time and space complexity, and required number of moves are presented, along with the auxiliary storage requirements of the proposed algorithm.

Factoring a Polynomial with Multiple-Roots

A given polynomial, possibly with multiple roots, is factored into several lower-degree distinct-root polynomials with natural-order-integer powers. All the roots, including multiplicities, of the original polynomial may be obtained by solving these lowerdegree distinct-root polynomials, instead of the original high-degree multiple-root polynomial directly. The approach requires polynomial Greatest Common Divisor (GCD) computation. The very simple and effective process, “Monic polynomial subtractions" converted trickily from “Longhand polynomial divisions" of Euclidean algorithm is employed. It requires only simple elementary arithmetic operations without any advanced mathematics. Amazingly, the derived routine gives the expected results for the test polynomials of very high degree, such as p( x) =(x+1)1000.

DCBOR: A Density Clustering Based on Outlier Removal

Data clustering is an important data exploration technique with many applications in data mining. We present an enhanced version of the well known single link clustering algorithm. We will refer to this algorithm as DCBOR. The proposed algorithm alleviates the chain effect by removing the outliers from the given dataset. So this algorithm provides outlier detection and data clustering simultaneously. This algorithm does not need to update the distance matrix, since the algorithm depends on merging the most k-nearest objects in one step and the cluster continues grow as long as possible under specified condition. So the algorithm consists of two phases; at the first phase, it removes the outliers from the input dataset. At the second phase, it performs the clustering process. This algorithm discovers clusters of different shapes, sizes, densities and requires only one input parameter; this parameter represents a threshold for outlier points. The value of the input parameter is ranging from 0 to 1. The algorithm supports the user in determining an appropriate value for it. We have tested this algorithm on different datasets contain outlier and connecting clusters by chain of density points, and the algorithm discovers the correct clusters. The results of our experiments demonstrate the effectiveness and the efficiency of DCBOR.

Complexity Analysis of Some Known Graph Coloring Instances

Graph coloring is an important problem in computer science and many algorithms are known for obtaining reasonably good solutions in polynomial time. One method of comparing different algorithms is to test them on a set of standard graphs where the optimal solution is already known. This investigation analyzes a set of 50 well known graph coloring instances according to a set of complexity measures. These instances come from a variety of sources some representing actual applications of graph coloring (register allocation) and others (mycieleski and leighton graphs) that are theoretically designed to be difficult to solve. The size of the graphs ranged from ranged from a low of 11 variables to a high of 864 variables. The method used to solve the coloring problem was the square of the adjacency (i.e., correlation) matrix. The results show that the most difficult graphs to solve were the leighton and the queen graphs. Complexity measures such as density, mobility, deviation from uniform color class size and number of block diagonal zeros are calculated for each graph. The results showed that the most difficult problems have low mobility (in the range of .2-.5) and relatively little deviation from uniform color class size.

Self-Adaptive Differential Evolution Based Power Economic Dispatch of Generators with Valve-Point Effects and Multiple Fuel Options

This paper presents the solution of power economic dispatch (PED) problem of generating units with valve point effects and multiple fuel options using Self-Adaptive Differential Evolution (SDE) algorithm. The global optimal solution by mathematical approaches becomes difficult for the realistic PED problem in power systems. The Differential Evolution (DE) algorithm is found to be a powerful evolutionary algorithm for global optimization in many real problems. In this paper the key parameters of control in DE algorithm such as the crossover constant CR and weight applied to random differential F are self-adapted. The PED problem formulation takes into consideration of nonsmooth fuel cost function due to valve point effects and multi fuel options of generator. The proposed approach has been examined and tested with the numerical results of PED problems with thirteen-generation units including valve-point effects, ten-generation units with multiple fuel options neglecting valve-point effects and ten-generation units including valve-point effects and multiple fuel options. The test results are promising and show the effectiveness of proposed approach for solving PED problems.

Genetic Programming Approach to Hierarchical Production Rule Discovery

Automated discovery of hierarchical structures in large data sets has been an active research area in the recent past. This paper focuses on the issue of mining generalized rules with crisp hierarchical structure using Genetic Programming (GP) approach to knowledge discovery. The post-processing scheme presented in this work uses flat rules as initial individuals of GP and discovers hierarchical structure. Suitable genetic operators are proposed for the suggested encoding. Based on the Subsumption Matrix(SM), an appropriate fitness function is suggested. Finally, Hierarchical Production Rules (HPRs) are generated from the discovered hierarchy. Experimental results are presented to demonstrate the performance of the proposed algorithm.

On Analysis of Boundness Property for ECATNets by Using Rewriting Logic

To analyze the behavior of Petri nets, the accessibility graph and Model Checking are widely used. However, if the analyzed Petri net is unbounded then the accessibility graph becomes infinite and Model Checking can not be used even for small Petri nets. ECATNets [2] are a category of algebraic Petri nets. The main feature of ECATNets is their sound and complete semantics based on rewriting logic [8] and its language Maude [9]. ECATNets analysis may be done by using techniques of accessibility analysis and Model Checking defined in Maude. But, these two techniques supported by Maude do not work also with infinite-states systems. As a category of Petri nets, ECATNets can be unbounded and so infinite systems. In order to know if we can apply accessibility analysis and Model Checking of Maude to an ECATNet, we propose in this paper an algorithm allowing the detection if the ECATNet is bounded or not. Moreover, we propose a rewriting logic based tool implementing this algorithm. We show that the development of this tool using the Maude system is facilitated thanks to the reflectivity of the rewriting logic. Indeed, the self-interpretation of this logic allows us both the modelling of an ECATNet and acting on it.

Efficient DTW-Based Speech Recognition System for Isolated Words of Arabic Language

Despite the fact that Arabic language is currently one of the most common languages worldwide, there has been only a little research on Arabic speech recognition relative to other languages such as English and Japanese. Generally, digital speech processing and voice recognition algorithms are of special importance for designing efficient, accurate, as well as fast automatic speech recognition systems. However, the speech recognition process carried out in this paper is divided into three stages as follows: firstly, the signal is preprocessed to reduce noise effects. After that, the signal is digitized and hearingized. Consequently, the voice activity regions are segmented using voice activity detection (VAD) algorithm. Secondly, features are extracted from the speech signal using Mel-frequency cepstral coefficients (MFCC) algorithm. Moreover, delta and acceleration (delta-delta) coefficients have been added for the reason of improving the recognition accuracy. Finally, each test word-s features are compared to the training database using dynamic time warping (DTW) algorithm. Utilizing the best set up made for all affected parameters to the aforementioned techniques, the proposed system achieved a recognition rate of about 98.5% which outperformed other HMM and ANN-based approaches available in the literature.

A Real-Time Signal Processing Technique for MIDI Generation

This paper presents a new hardware interface using a microcontroller which processes audio music signals to standard MIDI data. A technique for processing music signals by extracting note parameters from music signals is described. An algorithm to convert the voice samples for real-time processing without complex calculations is proposed. A high frequency microcontroller as the main processor is deployed to execute the outlined algorithm. The MIDI data generated is transmitted using the EIA-232 protocol. The analyses of data generated show the feasibility of using microcontrollers for real-time MIDI generation hardware interface.

On the Sphere Method of Linear Programming Using Multiple Interior Points Approach

The Sphere Method is a flexible interior point algorithm for linear programming problems. This was developed mainly by Professor Katta G. Murty. It consists of two steps, the centering step and the descent step. The centering step is the most expensive part of the algorithm. In this centering step we proposed some improvements such as introducing two or more initial feasible solutions as we solve for the more favorable new solution by objective value while working with the rigorous updates of the feasible region along with some ideas integrated in the descent step. An illustration is given confirming the advantage of using the proposed procedure.

Design of Multiplier-free State-Space Digital Filters

In this paper, a novel approach is presented for designing multiplier-free state-space digital filters. The multiplier-free design is obtained by finding power-of-2 coefficients and also quantizing the state variables to power-of-2 numbers. Expressions for the noise variance are derived for the quantized state vector and the output of the filter. A “structuretransformation matrix" is incorporated in these expressions. It is shown that quantization effects can be minimized by properly designing the structure-transformation matrix. Simulation results are very promising and illustrate the design algorithm.

Induction of Expressive Rules using the Binary Coding Method

In most rule-induction algorithms, the only operator used against nominal attributes is the equality operator =. In this paper, we first propose the use of the inequality operator, ≠, in addition to the equality operator, to increase the expressiveness of induced rules. Then, we present a new method, Binary Coding, which can be used along with an arbitrary rule-induction algorithm to make use of the inequality operator without any need to change the algorithm. Experimental results suggest that the Binary Coding method is promising enough for further investigation, especially in cases where the minimum number of rules is desirable.

Fuzzy Sliding Mode Speed Controller for a Vector Controlled Induction Motor

This paper presents a speed fuzzy sliding mode controller for a vector controlled induction machine (IM) fed by a voltage source inverter (PWM). The sliding mode based fuzzy control method is developed to achieve fast response, a best disturbance rejection and to maintain a good decoupling. The problem with sliding mode control is that there is high frequency switching around the sliding mode surface. The FSMC is the combination of the robustness of Sliding Mode Control (SMC) and the smoothness of Fuzzy Logic (FL). To reduce the torque fluctuations (chattering), the sign function used in the conventional SMC is substituted with a fuzzy logic algorithm. The proposed algorithm was simulated by Matlab/Simulink software and simulation results show that the performance of the control scheme is robust and the chattering problem is solved.

Pipelined Control-Path Effects on Area and Performance of a Wormhole-Switched Network-on-Chip

This paper presents design trade-off and performance impacts of the amount of pipeline phase of control path signals in a wormhole-switched network-on-chip (NoC). The numbers of the pipeline phase of the control path vary between two- and one-cycle pipeline phase. The control paths consist of the routing request paths for output selection and the arbitration paths for input selection. Data communications between on-chip routers are implemented synchronously and for quality of service, the inter-router data transports are controlled by using a link-level congestion control to avoid lose of data because of an overflow. The trade-off between the area (logic cell area) and the performance (bandwidth gain) of two proposed NoC router microarchitectures are presented in this paper. The performance evaluation is made by using a traffic scenario with different number of workloads under 2D mesh NoC topology using a static routing algorithm. By using a 130-nm CMOS standard-cell technology, our NoC routers can be clocked at 1 GHz, resulting in a high speed network link and high router bandwidth capacity of about 320 Gbit/s. Based on our experiments, the amount of control path pipeline stages gives more significant impact on the NoC performance than the impact on the logic area of the NoC router.

Scale Time Offset Robust Modulation (STORM) in a Code Division Multiaccess Environment

Scale Time Offset Robust Modulation (STORM) [1]– [3] is a high bandwidth waveform design that adds time-scale to embedded reference modulations using only time-delay [4]. In an environment where each user has a specific delay and scale, identification of the user with the highest signal power and that user-s phase is facilitated by the STORM processor. Both of these parameters are required in an efficient multiuser detection algorithm. In this paper, the STORM modulation approach is evaluated with a direct sequence spread quadrature phase shift keying (DS-QPSK) system. A misconception of the STORM time scale modulation is that a fine temporal resolution is required at the receiver. STORM will be applied to a QPSK code division multiaccess (CDMA) system by modifying the spreading codes. Specifically, the in-phase code will use a typical spreading code, and the quadrature code will use a time-delayed and time-scaled version of the in-phase code. Subsequently, the same temporal resolution in the receiver is required before and after the application of STORM. In this paper, the bit error performance of STORM in a synchronous CDMA system is evaluated and compared to theory, and the bit error performance of STORM incorporated in a single user WCDMA downlink is presented to demonstrate the applicability of STORM in a modern communication system.

Enhanced Genetic Algorithm Approach for Security Constrained Optimal Power Flow Including FACTS Devices

This paper presents a genetic algorithm based approach for solving security constrained optimal power flow problem (SCOPF) including FACTS devices. The optimal location of FACTS devices are identified using an index called overload index and the optimal values are obtained using an enhanced genetic algorithm. The optimal allocation by the proposed method optimizes the investment, taking into account its effects on security in terms of the alleviation of line overloads. The proposed approach has been tested on IEEE-30 bus system to show the effectiveness of the proposed algorithm for solving the SCOPF problem.