The Influence of Preprocessing Parameters on Text Categorization

Text categorization (the assignment of texts in natural language into predefined categories) is an important and extensively studied problem in Machine Learning. Currently, popular techniques developed to deal with this task include many preprocessing and learning algorithms, many of which in turn require tuning nontrivial internal parameters. Although partial studies are available, many authors fail to report values of the parameters they use in their experiments, or reasons why these values were used instead of others. The goal of this work then is to create a more thorough comparison of preprocessing parameters and their mutual influence, and report interesting observations and results.

A Hybrid Approach for Color Image Quantization Using K-means and Firefly Algorithms

Color Image quantization (CQ) is an important problem in computer graphics, image and processing. The aim of quantization is to reduce colors in an image with minimum distortion. Clustering is a widely used technique for color quantization; all colors in an image are grouped to small clusters. In this paper, we proposed a new hybrid approach for color quantization using firefly algorithm (FA) and K-means algorithm. Firefly algorithm is a swarmbased algorithm that can be used for solving optimization problems. The proposed method can overcome the drawbacks of both algorithms such as the local optima converge problem in K-means and the early converge of firefly algorithm. Experiments on three commonly used images and the comparison results shows that the proposed algorithm surpasses both the base-line technique k-means clustering and original firefly algorithm.

Multiple Moving Talker Tracking by Integration of Two Successive Algorithms

In this paper, an estimation accuracy of multiple moving talker tracking using a microphone array is improved. The tracking can be achieved by the adaptive method in which two algorithms are integrated, namely, the PAST (Projection Approximation Subspace Tracking) algorithm and the IPLS (Interior Point Least Square) algorithm. When either talker begins to speak again after a silent period, an appropriate feasible region for an evaluation function of the IPLS algorithm might not be set. Then, the tracking fails due to the incorrect updating. Therefore, if an increment of the number of active talkers is detected, the feasible region must be reset. Then, a low cost realization is required for the high speed tracking and a high accuracy realization is desired for the precise tracking. In this paper, the directions roughly estimated using the delayed-sum-array method are used for the resetting. Several results of experiments performed in an actual room environment show the effectiveness of the proposed method.

Weka Based Desktop Data Mining as Web Service

Data mining is the process of sifting through large volumes of data, analyzing data from different perspectives and summarizing it into useful information. One of the widely used desktop applications for data mining is the Weka tool which is nothing but a collection of machine learning algorithms implemented in Java and open sourced under the General Public License (GPL). A web service is a software system designed to support interoperable machine to machine interaction over a network using SOAP messages. Unlike a desktop application, a web service is easy to upgrade, deliver and access and does not occupy any memory on the system. Keeping in mind the advantages of a web service over a desktop application, in this paper we are demonstrating how this Java based desktop data mining application can be implemented as a web service to support data mining across the internet.

Motion Prediction and Motion Vector Cost Reduction during Fast Block Motion Estimation in MCTF

In 3D-wavelet video coding framework temporal filtering is done along the trajectory of motion using Motion Compensated Temporal Filtering (MCTF). Hence computationally efficient motion estimation technique is the need of MCTF. In this paper a predictive technique is proposed in order to reduce the computational complexity of the MCTF framework, by exploiting the high correlation among the frames in a Group Of Picture (GOP). The proposed technique applies coarse and fine searches of any fast block based motion estimation, only to the first pair of frames in a GOP. The generated motion vectors are supplied to the next consecutive frames, even to subsequent temporal levels and only fine search is carried out around those predicted motion vectors. Hence coarse search is skipped for all the motion estimation in a GOP except for the first pair of frames. The technique has been tested for different fast block based motion estimation algorithms over different standard test sequences using MC-EZBC, a state-of-the-art scalable video coder. The simulation result reveals substantial reduction (i.e. 20.75% to 38.24%) in the number of search points during motion estimation, without compromising the quality of the reconstructed video compared to non-predictive techniques. Since the motion vectors of all the pair of frames in a GOP except the first pair will have value ±1 around the motion vectors of the previous pair of frames, the number of bits required for motion vectors is also reduced by 50%.

STRPRO Tool for Manipulation of Stratified Programs Based on SEPN

Negation is useful in the majority of the real world applications. However, its introduction leads to semantic and canonical problems. SEPN nets are well adapted extension of predicate nets for the definition and manipulation of stratified programs. This formalism is characterized by two main contributions. The first concerns the management of the whole class of stratified programs. The second contribution is related to usual operations optimization (maximal stratification, incremental updates ...). We propose, in this paper, useful algorithms for manipulating stratified programs using SEPN. These algorithms were implemented and validated with STRPRO tool.

Adaptive Pulse Coupled Neural Network Parameters for Image Segmentation

For over a decade, the Pulse Coupled Neural Network (PCNN) based algorithms have been successfully used in image interpretation applications including image segmentation. There are several versions of the PCNN based image segmentation methods, and the segmentation accuracy of all of them is very sensitive to the values of the network parameters. Most methods treat PCNN parameters like linking coefficient and primary firing threshold as global parameters, and determine them by trial-and-error. The automatic determination of appropriate values for linking coefficient, and primary firing threshold is a challenging problem and deserves further research. This paper presents a method for obtaining global as well as local values for the linking coefficient and the primary firing threshold for neurons directly from the image statistics. Extensive simulation results show that the proposed approach achieves excellent segmentation accuracy comparable to the best accuracy obtainable by trial-and-error for a variety of images.

On-line Image Mosaicing of Live Stem Cells

Image mosaicing is a technique that permits to enlarge the field of view of a camera. For instance, it is employed to achieve panoramas with common cameras or even in scientific applications, to achieve the image of a whole culture in microscopical imaging. Usually, a mosaic of cell cultures is achieved through using automated microscopes. However, this is often performed in batch, through CPU intensive minimization algorithms. In addition, live stem cells are studied in phase contrast, showing a low contrast that cannot be improved further. We present a method to study the flat field from live stem cells images even in case of 100% confluence, this permitting to build accurate mosaics on-line using high performance algorithms.

Iterative Clustering Algorithm for Analyzing Temporal Patterns of Gene Expression

Microarray experiments are information rich; however, extensive data mining is required to identify the patterns that characterize the underlying mechanisms of action. For biologists, a key aim when analyzing microarray data is to group genes based on the temporal patterns of their expression levels. In this paper, we used an iterative clustering method to find temporal patterns of gene expression. We evaluated the performance of this method by applying it to real sporulation data and simulated data. The patterns obtained using the iterative clustering were found to be superior to those obtained using existing clustering algorithms.

A Hybrid Genetic Algorithm for the Sequence Dependent Flow-Shop Scheduling Problem

Flow-shop scheduling problem (FSP) deals with the scheduling of a set of jobs that visit a set of machines in the same order. The FSP is NP-hard, which means that an efficient algorithm for solving the problem to optimality is unavailable. To meet the requirements on time and to minimize the make-span performance of large permutation flow-shop scheduling problems in which there are sequence dependent setup times on each machine, this paper develops one hybrid genetic algorithms (HGA). Proposed HGA apply a modified approach to generate population of initial chromosomes and also use an improved heuristic called the iterated swap procedure to improve initial solutions. Also the author uses three genetic operators to make good new offspring. The results are compared to some recently developed heuristics and computational experimental results show that the proposed HGA performs very competitively with respect to accuracy and efficiency of solution.

Authenticated Mobile Device Proxy Service

In the current study we present a system that is capable to deliver proxy based differentiated service. It will help the carrier service node to sell a prepaid service to clients and limit the use to a particular mobile device or devices for a certain time. The system includes software and hardware architecture for a mobile device with moderate computational power, and a secure protocol for communication between it and its carrier service node. On the carrier service node a proxy runs on a centralized server to be capable of implementing cryptographic algorithms, while the mobile device contains a simple embedded processor capable of executing simple algorithms. One prerequisite is needed for the system to run efficiently that is a presence of Global Trusted Verification Authority (GTVA) which is equivalent to certifying authority in IP networks. This system appears to be of great interest for many commercial transactions, business to business electronic and mobile commerce, and military applications.

Blind Identification of MA Models Using Cumulants

In this paper, many techniques for blind identification of moving average (MA) process are presented. These methods utilize third- and fourth-order cumulants of the noisy observations of the system output. The system is driven by an independent and identically distributed (i.i.d) non-Gaussian sequence that is not observed. Two nonlinear optimization algorithms, namely the Gradient Descent and the Gauss-Newton algorithms are exposed. An algorithm based on the joint-diagonalization of the fourth-order cumulant matrices (FOSI) is also considered, as well as an improved version of the classical C(q, 0, k) algorithm based on the choice of the Best 1-D Slice of fourth-order cumulants. To illustrate the effectiveness of our methods, various simulation examples are presented.

An Edge-based Text Region Extraction Algorithm for Indoor Mobile Robot Navigation

Using bottom-up image processing algorithms to predict human eye fixations and extract the relevant embedded information in images has been widely applied in the design of active machine vision systems. Scene text is an important feature to be extracted, especially in vision-based mobile robot navigation as many potential landmarks such as nameplates and information signs contain text. This paper proposes an edge-based text region extraction algorithm, which is robust with respect to font sizes, styles, color/intensity, orientations, and effects of illumination, reflections, shadows, perspective distortion, and the complexity of image backgrounds. Performance of the proposed algorithm is compared against a number of widely used text localization algorithms and the results show that this method can quickly and effectively localize and extract text regions from real scenes and can be used in mobile robot navigation under an indoor environment to detect text based landmarks.

Optimal Image Compression Based on Sign and Magnitude Coding of Wavelet Coefficients

Wavelet transforms is a very powerful tools for image compression. One of its advantage is the provision of both spatial and frequency localization of image energy. However, wavelet transform coefficients are defined by both a magnitude and sign. While algorithms exist for efficiently coding the magnitude of the transform coefficients, they are not efficient for the coding of their sign. It is generally assumed that there is no compression gain to be obtained from the coding of the sign. Only recently have some authors begun to investigate the sign of wavelet coefficients in image coding. Some authors have assumed that the sign information bit of wavelet coefficients may be encoded with the estimated probability of 0.5; the same assumption concerns the refinement information bit. In this paper, we propose a new method for Separate Sign Coding (SSC) of wavelet image coefficients. The sign and the magnitude of wavelet image coefficients are examined to obtain their online probabilities. We use the scalar quantization in which the information of the wavelet coefficient to belong to the lower or to the upper sub-interval in the uncertainly interval is also examined. We show that the sign information and the refinement information may be encoded by the probability of approximately 0.5 only after about five bit planes. Two maps are separately entropy encoded: the sign map and the magnitude map. The refinement information of the wavelet coefficient to belong to the lower or to the upper sub-interval in the uncertainly interval is also entropy encoded. An algorithm is developed and simulations are performed on three standard images in grey scale: Lena, Barbara and Cameraman. Five scales are performed using the biorthogonal wavelet transform 9/7 filter bank. The obtained results are compared to JPEG2000 standard in terms of peak signal to noise ration (PSNR) for the three images and in terms of subjective quality (visual quality). It is shown that the proposed method outperforms the JPEG2000. The proposed method is also compared to other codec in the literature. It is shown that the proposed method is very successful and shows its performance in term of PSNR.

Feature Point Reduction for Video Stabilization

Corner detection and optical flow are common techniques for feature-based video stabilization. However, these algorithms are computationally expensive and should be performed at a reasonable rate. This paper presents an algorithm for discarding irrelevant feature points and maintaining them for future use so as to improve the computational cost. The algorithm starts by initializing a maintained set. The feature points in the maintained set are examined against its accuracy for modeling. Corner detection is required only when the feature points are insufficiently accurate for future modeling. Then, optical flows are computed from the maintained feature points toward the consecutive frame. After that, a motion model is estimated based on the simplified affine motion model and least square method, with outliers belonging to moving objects presented. Studentized residuals are used to eliminate such outliers. The model estimation and elimination processes repeat until no more outliers are identified. Finally, the entire algorithm repeats along the video sequence with the points remaining from the previous iteration used as the maintained set. As a practical application, an efficient video stabilization can be achieved by exploiting the computed motion models. Our study shows that the number of times corner detection needs to perform is greatly reduced, thus significantly improving the computational cost. Moreover, optical flow vectors are computed for only the maintained feature points, not for outliers, thus also reducing the computational cost. In addition, the feature points after reduction can sufficiently be used for background objects tracking as demonstrated in the simple video stabilizer based on our proposed algorithm.

Orthogonal Functions Approach to LQG Control

In this paper a unified approach via block-pulse functions (BPFs) or shifted Legendre polynomials (SLPs) is presented to solve the linear-quadratic-Gaussian (LQG) control problem. Also a recursive algorithm is proposed to solve the above problem via BPFs. By using the elegant operational properties of orthogonal functions (BPFs or SLPs) these computationally attractive algorithms are developed. To demonstrate the validity of the proposed approaches a numerical example is included.

Optimal Placement of Processors based on Effective Communication Load

This paper presents a new technique for the optimum placement of processors to minimize the total effective communication load under multi-processor communication dominated environment. This is achieved by placing heavily loaded processors near each other and lightly loaded ones far away from one another in the physical grid locations. The results are mathematically proved for the Algorithms are described.

Improved Artificial Immune System Algorithm with Local Search

The Artificial immune systems algorithms are Meta heuristic optimization method, which are used for clustering and pattern recognition applications are abundantly. These algorithms in multimodal optimization problems are more efficient than genetic algorithms. A major drawback in these algorithms is their slow convergence to global optimum and their weak stability can be considered in various running of these algorithms. In this paper, improved Artificial Immune System Algorithm is introduced for the first time to overcome its problems of artificial immune system. That use of the small size of a local search around the memory antibodies is used for improving the algorithm efficiently. The credibility of the proposed approach is evaluated by simulations, and it is shown that the proposed approach achieves better results can be achieved compared to the standard artificial immune system algorithms

New DES based on Elliptic Curves

It is known that symmetric encryption algorithms are fast and easy to implement in hardware. Also elliptic curves have proved to be a good choice for building encryption system. Although most of the symmetric systems have been broken, we can create a hybrid system that has the same properties of the symmetric encryption systems and in the same time, it has the strength of elliptic curves in encryption. As DES algorithm is considered the core of all successive symmetric encryption systems, we modified DES using elliptic curves and built a new DES algorithm that is hard to be broken and will be the core for all other symmetric systems.

Two Wheels Balancing Robot with Line Following Capability

This project focuses on the development of a line follower algorithm for a Two Wheels Balancing Robot. In this project, ATMEGA32 is chosen as the brain board controller to react towards the data received from Balance Processor Chip on the balance board to monitor the changes of the environment through two infra-red distance sensor to solve the inclination angle problem. Hence, the system will immediately restore to the set point (balance position) through the implementation of internal PID algorithms at the balance board. Application of infra-red light sensors with the PID control is vital, in order to develop a smooth line follower robot. As a result of combination between line follower program and internal self balancing algorithms, we are able to develop a dynamically stabilized balancing robot with line follower function.