Evaluating Some Feature Selection Methods for an Improved SVM Classifier

Text categorization is the problem of classifying text documents into a set of predefined classes. After a preprocessing step, the documents are typically represented as large sparse vectors. When training classifiers on large collections of documents, both time and memory restrictions can be quite prohibitive. This justifies the application of feature selection methods to reduce the dimensionality of the document-representation vector. Four feature selection methods are evaluated: Random Selection, Information Gain (IG), Support Vector Machine (SVM_FS) and Genetic Algorithm with SVM (GA_FS). We show that the SVM_FS and GA_FS methods obtain the best results, achieving classification accuracies comparable to the IG method while using substantially shorter feature vectors. We also present a novel method for better correlating the parameters of the SVM kernels (polynomial or Gaussian).
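
As a point of reference for the IG-based reduction described above, the sketch below approximates information gain with scikit-learn's mutual-information ranking and trains a linear SVM on the reduced vectors; the corpus, vocabulary size and the 500 retained features are illustrative assumptions, not values from the paper.

```python
# Hedged sketch: IG-style feature selection (via mutual information) before
# an SVM classifier. Dataset and all sizes are placeholders for illustration.
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

docs = fetch_20newsgroups(subset="train")
pipeline = make_pipeline(
    TfidfVectorizer(max_features=20000),       # large sparse document vectors
    SelectKBest(mutual_info_classif, k=500),   # keep the 500 most informative terms
    LinearSVC(),                               # SVM trained on the reduced vectors
)
pipeline.fit(docs.data, docs.target)
```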

XML Schema Automatic Matching Solution

Schema matching plays a key role in many different applications, such as schema integration, data integration, data warehousing, data transformation, e-commerce, peer-to-peer data management, ontology matching and integration, the semantic Web, and semantic query processing. Manual matching is expensive and error-prone, so it is important to develop techniques that automate the schema matching process. In this paper, we present a solution to the automated XML schema matching problem that produces semantic mappings between corresponding elements of given source and target schemas. The solution contributes to solving this problem more comprehensively and efficiently. It is based on combining the linguistic similarity, data type compatibility and structural similarity of XML schema elements. After describing our solution, we present experimental results that demonstrate the effectiveness of this approach.
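
The abstract names three similarity measures but not how they are merged; below is a minimal sketch of one plausible combination step, with invented weights and an invented 0.5 match threshold.

```python
# Hypothetical combination of the three per-pair similarity measures,
# each assumed to lie in [0, 1]. Weights and threshold are illustrative.
def combined_similarity(ling: float, dtype: float, struct: float,
                        w_ling: float = 0.4, w_dtype: float = 0.2,
                        w_struct: float = 0.4) -> float:
    """Weighted mean of linguistic, data-type and structural similarity."""
    return w_ling * ling + w_dtype * dtype + w_struct * struct

# Propose a mapping between a source and a target element when the
# combined score clears the threshold.
if combined_similarity(0.9, 1.0, 0.7) > 0.5:
    print("propose mapping between the two elements")
```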

Edge-end Pixel Extraction for Edge-based Image Segmentation

Extraction of edge-end pixels is an important step in the edge linking process used to achieve edge-based image segmentation. This paper presents an algorithm to extract edge-end pixels together with their directional sensitivities as an augmentation to the currently available mathematical models. The algorithm is implemented in the Java environment because of its inherent compatibility with web interfaces, since its main use is envisaged to be remote image analysis on a virtual instrumentation platform.
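
For illustration only (the paper's algorithm is implemented in Java and augments existing mathematical models), a common way to locate edge-end pixels is to find edge pixels with exactly one 8-connected neighbour and to take the direction towards that neighbour as the directional sensitivity:

```python
# Sketch: edge-end pixels as edge pixels with exactly one 8-neighbour.
import numpy as np
from scipy.ndimage import convolve

def edge_end_pixels(edges):
    """edges: 2-D binary edge map; returns a list of (row, col, angle_deg)."""
    e = np.pad((edges > 0).astype(np.uint8), 1)   # zero border avoids edge cases
    kernel = np.ones((3, 3), np.uint8)
    kernel[1, 1] = 0
    nbrs = convolve(e, kernel, mode="constant")   # 8-neighbour count per pixel
    ends = []
    for r, c in zip(*np.nonzero((e == 1) & (nbrs == 1))):
        window = e[r - 1:r + 2, c - 1:c + 2] * kernel  # only the lone neighbour survives
        dr, dc = np.argwhere(window)[0] - 1            # offset towards that neighbour
        angle = np.degrees(np.arctan2(-dr, dc))        # image rows grow downwards
        ends.append((r - 1, c - 1, angle))             # undo the padding offset
    return ends
```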

Detection and Pose Estimation of People in Images

Detection, feature extraction and pose estimation of people in images and video are made challenging by the variability of human appearance, the complexity of natural scenes and the high dimensionality of articulated body models; they have nonetheless become important topics in image, signal and vision computing in recent years. In this paper, a system is proposed and tested that classifies people in 2D images into four body types: tall fat, short fat, tall thin and short thin. The system extracts the human body from the image, determines whether the person is fat or thin and tall or short from the extracted measurements, and outputs the dimensions of the body, such as its length and width.
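
A hedged sketch of the final labelling step, assuming the body has already been segmented from the image: bounding-box height and width are compared against thresholds, whose values are invented here since they are not stated above.

```python
# Hypothetical classification into the four body types from pixel
# measurements of the extracted body; thresholds are placeholders.
def body_type(height_px: int, width_px: int,
              tall_threshold: int = 400, fat_ratio: float = 0.35) -> str:
    stature = "tall" if height_px >= tall_threshold else "short"
    build = "fat" if width_px / height_px >= fat_ratio else "thin"
    return f"{stature} {build}"

print(body_type(450, 180))   # -> "tall fat"
```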

Multi-model Approach for Describing and Verifying Constraint-Based Interactive Systems

Requirements analysis, modeling, and simulation have consistently been among the main challenges during the development of complex systems. Scenarios and state machines are two successful models for describing the behavior of an interactive system. Scenarios represent examples of system execution in the form of sequences of messages exchanged between objects and offer only a partial view of the system; in contrast, state machines can represent the overall system behavior. Automating the translation of scenarios into state machines provides answers to various problems such as system behavior validation and scenario consistency checking. In this paper, we propose a method for translating scenarios into state machines represented in the Discrete EVent system Specification (DEVS) formalism, together with a procedure to detect implied scenarios. Each induced DEVS model represents the behavior of one object of the system. The global system behavior is described by coupling the atomic DEVS models and validated through simulation. We improve the validation process by integrating formal methods to eliminate logical inconsistencies in the global model; to that end, we use the Z notation.
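
For readers unfamiliar with the formalism, a hand-written atomic DEVS model is sketched below in Python; the paper's models are instead induced from scenarios, so this only illustrates the DEVS ingredients (state, time advance, and the internal/external transition and output functions).

```python
# Minimal atomic DEVS sketch: an object that serves a request for two
# time units and then returns to idle. Purely illustrative.
class AtomicDEVS:
    def __init__(self):
        self.state = "idle"
        self.sigma = float("inf")   # time until the next internal event

    def time_advance(self):
        return self.sigma

    def ext_transition(self, event, elapsed):
        if self.state == "idle" and event == "request":
            self.state, self.sigma = "busy", 2.0   # serve the request

    def int_transition(self):
        if self.state == "busy":
            self.state, self.sigma = "idle", float("inf")

    def output(self):
        return "response" if self.state == "busy" else None
```

Coupling such atomic models yields the coupled model whose simulation validates the global behavior.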

Power Line Carrier for Power Telemetering

This paper presents an application of power line carrier (PLC) to electrical power telemetering. The system has the special capability of transmitting measured values to a centralized computer via the power lines themselves. The PLC modem, designed as a passive high-pass filter, transmits and receives information by superimposing the information carrier, together with the transmitted data, on the 50 Hz power frequency signal. A microcontroller serves as the main processing unit of the modem and is programmed for PLC control and for interfacing with other devices. Each power meter, connected via a PLC modem, is assigned a unique identification number (address) that distinguishes it from the other devices.
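
The byte layout of the telemetering messages is not specified above; the following hypothetical framing sketch shows how an address byte lets the central computer tell meters apart, with an invented payload encoding and checksum.

```python
# Hypothetical frame: [address | 4-byte reading | 8-bit checksum].
def build_frame(address: int, reading: float) -> bytes:
    payload = int(round(reading * 100)).to_bytes(4, "big")  # e.g. kWh, 2 decimals
    body = bytes([address]) + payload
    checksum = sum(body) & 0xFF                             # simple 8-bit checksum
    return body + bytes([checksum])

print(build_frame(address=0x17, reading=1234.56).hex())
```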

Model Development for Allocation of Raw Material in Timber Processing Industry in Indonesia

This research develops a raw material allocation model for the timber processing industry at Perum Perhutani Unit I, Central Java, Indonesia. The model can be used to determine the quantity of timber allocated between chains in the supply chain and to select suppliers, considering both log price and distance. In determining the allocated quantities, the model takes into account the optimal inventory at each chain, which is in turn determined from demand forecasts, capacity and safety stock. The allocation problem is solved with a linear programming model that minimizes the total purchase, transportation and storage costs at each chain. Numerical examples show that the proposed model can generate purchase cost savings of 20.84% and select suppliers over shorter distances.
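
A minimal sketch of the linear-programming core using SciPy follows; the two suppliers, prices, transport costs (standing in for distance), capacities and demand are invented numbers, not data from the Perum Perhutani case.

```python
# Illustrative allocation LP: minimize purchase + transport + storage cost
# subject to meeting demand within supplier capacities.
from scipy.optimize import linprog

prices = [55.0, 58.0]        # log price per m^3 for suppliers A and B
transport = [8.0, 3.0]       # transport cost per m^3 (grows with distance)
storage = 1.5                # storage cost per m^3 at the receiving chain
demand = 1000.0              # m^3 required by the downstream chain

c = [p + t + storage for p, t in zip(prices, transport)]  # total unit costs
res = linprog(c,
              A_eq=[[1.0, 1.0]], b_eq=[demand],           # allocations meet demand
              bounds=[(0, 700), (0, 700)])                # supplier capacities
print(res.x)  # optimal allocation per supplier, here favouring the nearer one
```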

Effective Sonar Target Classification via Parallel Structure of Minimal Resource Allocation Network

In this paper, sonar signals are processed using a Minimal Resource Allocation Network (MRAN) and a Probabilistic Neural Network (PNN) to differentiate features commonly encountered in indoor environments. The stability-plasticity behaviors of both networks are investigated. The experimental results show that MRAN possesses lower network complexity but higher plasticity than PNN. An enhanced version called parallel MRAN (pMRAN) is proposed to solve this problem; it proves stable in prediction and outperforms the original MRAN.

Performance Analysis of Digital Signal Processors Using SMV Benchmark

Unlike general-purpose processors, digital signal processors (DSP processors) are strongly application-dependent. To meet the needs of diverse applications, a wide variety of DSP processors based on architectures ranging from the traditional to VLIW have been introduced to the market over the years. The functionality, performance, and cost of these processors vary over a wide range. When selecting a processor that meets the design criteria for an application, processor performance is usually the major concern for digital signal processing (DSP) application developers, and performance data are also essential for the designers of DSP processors to improve their designs. Consequently, several DSP performance benchmarks have been proposed over the past decade or so. However, none of these benchmarks seems to include recent new DSP applications. In this paper, we use a new benchmark that we recently developed to compare the performance of popular DSP processors from Texas Instruments and StarCore. The new benchmark is based on the Selectable Mode Vocoder (SMV), a speech-coding program from recent third-generation (3G) wireless voice applications. All benchmark kernels are compiled by the compilers of the respective DSP processors and run on their simulators. The weighted arithmetic mean of clock cycles and the arithmetic mean of code size are used to compare the performance of five DSP processors. In addition, we study how the performance of a processor is affected by code structure, processor architecture features and compiler optimizations. The extensive experimental data gathered, analyzed, and presented in this paper should help DSP processor and compiler designers meet their specific design goals.
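
The two summary statistics named above are straightforward to compute; the sketch below does so with NumPy over invented per-kernel measurements and weights.

```python
# Weighted arithmetic mean of clock cycles (weights reflecting how often
# each kernel runs inside SMV) and plain mean of code size. All numbers
# are placeholders, not benchmark results.
import numpy as np

cycles = np.array([12000.0, 8500.0, 40000.0])   # per-kernel clock cycles
weights = np.array([0.2, 0.3, 0.5])             # kernel execution frequencies
code_size = np.array([1.2, 0.8, 2.4])           # per-kernel code size in KB

print(np.average(cycles, weights=weights))      # weighted cycle score
print(code_size.mean())                         # mean code size
```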

Insights into Smoothies with High Levels of Fibre and Polyphenols: Factors Influencing Chemical, Rheological and Sensory Properties

Attempts to add fibre and polyphenols (PPs) to popular beverages present challenges related to the properties of the finished products, such as smoothies. The consumer acceptability, viscosity and phenolic composition of smoothies containing high levels of fruit fibre (2.5-7.5 g per 300 mL serve) and PPs (250-750 mg per 300 mL serve) were examined. The changes in total extractable PPs, vitamin C content and colour of selected smoothies were compared over a storage stability trial (4°C, 14 days). A set of acidic aqueous model beverages was prepared to further examine the effect of two different heat treatments on the stability and extractability of PPs. Results show that overall consumer acceptability of high-fibre, high-PP smoothies was low, with average hedonic scores ranging from 3.9 to 6.4 (on a 1-9 scale). Flavour, texture and overall acceptability decreased as fibre and polyphenol contents increased, with fibre content exerting the stronger effect. Higher fibre content resulted in greater viscosity, with an elevated PP content increasing viscosity only slightly. The presence of fibre also aided the stability and extractability of PPs after heating. A reduction in extractable PPs, vitamin C content and colour intensity of the smoothies was observed after 14 days of storage at 4°C. Two heat treatments normally used in beverage production (75°C for 45 min or 85°C for 1 min) did not cause a significant reduction in total extracted PPs. It is clear that high levels of added fibre and PPs greatly influence the consumer appeal of smoothies, suggesting the need to develop novel formulation and processing methods if a satisfactory functional beverage incorporating these ingredients is to be developed.

A Hybrid Approach to Fault Detection and Diagnosis in a Diesel Fuel Hydrotreatment Process

It is estimated that abnormal conditions cost US process industries around $20 billion in annual losses. The hydrotreatment (HDT) of diesel fuel in petroleum refineries is a conversion process that yields highly profitable economic returns. However, the process is difficult to control because it is operated continuously at high hydrogen pressures and is subject to disturbances in feed properties and catalyst performance, so automatic fault detection and diagnosis plays an important role in this context. In this work, a hybrid approach based on neural networks combined with a post-processing classification algorithm is used to detect faults in a simulated HDT unit. Nine classes (8 faults and normal operation) were correctly classified by the proposed approach within a maximum time of 5 minutes, based on on-line process measurements.
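
As a hedged sketch of one plausible post-processing step of the kind described, the network can emit a class label per sample and a diagnosis be confirmed only once the same label persists over a window of consecutive samples; the window length below is an assumption.

```python
# Hypothetical post-processing of per-sample neural network labels:
# confirm a class only after `window` consecutive agreeing samples.
from collections import deque

def declare_class(labels, window=5):
    recent = deque(maxlen=window)
    for lab in labels:
        recent.append(lab)
        if len(recent) == window and len(set(recent)) == 1:
            yield lab                      # confirmed diagnosis

nn_outputs = [0, 0, 3, 3, 3, 3, 3, 3]      # 0 = normal, 3 = one of the 8 faults
print(next(declare_class(nn_outputs)))     # -> 3
```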

Neutralization of Alkaline Waste-Waters using a Blend of Microorganisms

The efficient operation of any biological treatment process requires pre-treatment of incompatible pollutants such as acids, bases, oil and toxic substances, which hamper the treatment of other major components that are otherwise degradable. Pre-treatment of alkaline waste-waters generated by industries such as textile, paper & pulp and potato processing, which have a pH of 10 or higher, is therefore essential. Neutralization of such alkaline waste-waters can be achieved by chemical as well as biological means; however, biological pre-treatment offers a better package than chemical means, being both safer and more economical. Biological pre-treatment can be accomplished using a blend of microorganisms able to withstand such harsh alkaline conditions. In the present study, a package of alkalophilic bacteria is formulated to neutralise the alkaline pH of industrial waste-waters. The developed microbial package is cost-effective as well as environmentally friendly.

Query Optimization Techniques for XML Databases

Over the past few years, XML (eXtensible Mark-up Language) has emerged as the standard for information representation and data exchange over the Internet. This paper provides a kick-start for new researchers venturing into the field of XML databases. We survey storage representations for XML documents and review XML query processing and optimization techniques with respect to each particular storage instance. Various optimization technologies have been developed to solve the query retrieval and updating problems, and in recent years most researchers have proposed hybrid optimization techniques. Hybrid systems open the possibility of covering each technology's weaknesses with another's strengths. This paper reviews the advantages and limitations of these optimization techniques.

Automated Thickness Measurement of Retinal Blood Vessels for Implementation of Clinical Decision Support Systems in Diagnostic Diabetic Retinopathy

The structure of retinal vessels is a prominent feature that reveals information on the state of disease, reflected in the form of measurable abnormalities in thickness and colour. The analysis of retinal vascular structure for the implementation of a clinical diabetic retinopathy decision-making system is presented in this paper. The retinal vascular structure consists of thin blood vessels, so the accuracy of the analysis is highly dependent on vessel segmentation. In this paper, blood vessel thickness is automatically detected using preprocessing techniques and a vessel segmentation algorithm. First the captured image is binarized to expose the blood vessel structure clearly; it is then skeletonised to obtain the overall structure with all the terminal and branching nodes of the blood vessels. By identifying the terminal nodes and branching points automatically, the thickness of the main and branching blood vessels is estimated. Results are presented and compared with clinical classifications of 50 vessels collected from Bejan Singh Eye Hospital.
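
A minimal sketch of the measurement idea, assuming a fixed binarization threshold: after skeletonization, local vessel thickness can be estimated as twice the distance-transform value at skeleton pixels. The preprocessing and node detection described above are more involved.

```python
# Sketch: thickness from binarization + skeleton + distance transform.
import numpy as np
from scipy.ndimage import distance_transform_edt
from skimage.morphology import skeletonize

def vessel_thickness(image: np.ndarray, threshold: float = 0.5):
    vessels = image > threshold               # binarized vessel map
    skeleton = skeletonize(vessels)           # 1-pixel-wide centrelines
    dist = distance_transform_edt(vessels)    # distance to background
    return 2.0 * dist[skeleton]               # per-pixel diameter estimates
```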

A Text Mining Technique Using Association Rules Extraction

This paper describes a text mining technique for automatically extracting association rules from collections of textual documents. The technique, called Extracting Association Rules from Text (EART), relies on keyword features to discover association rules amongst the keywords labeling the documents. The EART system ignores the order in which words occur, focusing instead on the words and their statistical distributions in documents. The main contributions of the technique are that it integrates XML technology with an information retrieval scheme (TF-IDF) for keyword/feature selection, automatically selecting the most discriminative keywords for use in association rule generation, and employs a data mining technique for association rule discovery. It consists of three phases: a text preprocessing phase (transformation, filtration, stemming and indexing of the documents), an association rule mining (ARM) phase (applying our algorithm for Generating Association Rules based on a Weighting scheme, GARW) and a visualization phase (visualization of results). Experiments were carried out on web-page news documents related to the outbreak of the bird flu disease. The extracted association rules contain important features and describe the informative news included in the document collection. The performance of the EART system was compared with that of a system using the Apriori algorithm, in terms of execution time and the evaluation of the extracted association rules.
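
The GARW algorithm itself is not reproduced here; the sketch below only illustrates the underlying idea with plain support/confidence filtering over per-document keyword sets, using invented thresholds.

```python
# Sketch: association rules X -> Y over document keyword sets.
from itertools import combinations

def keyword_rules(doc_keywords, min_support=0.1, min_conf=0.6):
    """doc_keywords: one set of selected keywords per document."""
    n = len(doc_keywords)
    def support(items):
        return sum(items <= d for d in doc_keywords) / n
    rules = []
    for x, y in combinations(sorted(set().union(*doc_keywords)), 2):
        s = support({x, y})
        if s >= min_support and s / support({x}) >= min_conf:
            rules.append((x, y, s))
    return rules

docs = [{"flu", "bird", "outbreak"}, {"flu", "bird"}, {"vaccine", "flu"}]
print(keyword_rules(docs))   # -> [('bird', 'flu', 0.666...)]
```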

A New Ridge Orientation based Method of Computation for Feature Extraction from Fingerprint Images

An important step in studying the statistics of fingerprint minutiae features is to reliably extract those features from fingerprint images. A new, reliable method of computation for minutiae feature extraction from fingerprint images is presented. A fingerprint image is treated as a textured image, and an orientation flow field of the ridges is computed for it. To accurately locate ridges, a new ridge orientation based computation method is proposed, and after ridge segmentation a new method of computation is proposed for smoothing the ridges. The ridge skeleton image is obtained and then smoothed using morphological operators to detect the features. A post-processing stage eliminates a large number of false features from the detected set of minutiae. The detected features are observed to be reliable and accurate.
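
As background for the orientation flow field (the paper proposes its own ridge-orientation computation), the standard gradient-tensor baseline is sketched below, with an arbitrary 16-pixel block size.

```python
# Block-wise ridge orientation from the averaged gradient tensor; the
# ridge direction is perpendicular to the dominant gradient direction.
import numpy as np

def orientation_field(img: np.ndarray, block: int = 16) -> np.ndarray:
    gy, gx = np.gradient(img.astype(float))
    theta = np.zeros((img.shape[0] // block, img.shape[1] // block))
    for i in range(theta.shape[0]):
        for j in range(theta.shape[1]):
            sx = gx[i*block:(i+1)*block, j*block:(j+1)*block]
            sy = gy[i*block:(i+1)*block, j*block:(j+1)*block]
            theta[i, j] = 0.5 * np.arctan2(2 * (sx * sy).sum(),
                                           (sx**2 - sy**2).sum()) + np.pi / 2
    return theta
```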

Complex-Valued Neural Network in Signal Processing: A Study on the Effectiveness of Complex Valued Generalized Mean Neuron Model

A complex-valued neural network is a neural network whose inputs and/or weights and/or thresholds and/or activation functions are complex-valued. Complex-valued neural networks have been widening the scope of applications not only in electronics and informatics, but also in social systems, and one of their most important applications is signal processing. Among neuron models, the generalized mean neuron (GMN) model is often discussed and studied; it includes a new aggregation function based on the concept of the generalized mean of all the inputs to the neuron. This paper presents exhaustive results of using the generalized mean neuron model in a complex-valued neural network that learns with the back-propagation algorithm (called 'Complex-BP'). Our experimental results demonstrate the effectiveness of the generalized mean neuron model in the complex plane for signal processing over a real-valued neural network. We report various observations, such as the effect of learning rates, the ranges from which the initial weights are randomly selected, the error functions used and the number of iterations required for error convergence in the generalized mean neural network model. Some inherent properties of this complex back-propagation algorithm are also studied and discussed.
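
The aggregation at the heart of the GMN is a power (generalized) mean of the neuron's inputs; a real-valued sketch follows, the weighted arithmetic mean being recovered at r = 1. The complex-valued extension studied above replaces these real quantities with complex ones.

```python
# Weighted power mean: the GMN aggregation in its real-valued form.
import numpy as np

def generalized_mean(x: np.ndarray, w: np.ndarray, r: float) -> float:
    w = w / w.sum()
    return float((w @ (x ** r)) ** (1.0 / r))

x, w = np.array([1.0, 2.0, 4.0]), np.ones(3)
print(generalized_mean(x, w, 1.0))    # arithmetic mean -> 2.333...
print(generalized_mean(x, w, -1.0))   # harmonic mean  -> 1.714...
```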

Detection of Ultrasonic Images in the Presence of a Random Number of Scatterers: A Statistical Learning Approach

Support Vector Machine (SVM) is a statistical learning tool initially developed by Vapnik in 1979 and later extended into the more general concept of structural risk minimization (SRM). SVM is playing an increasing role in detection problems across various engineering fields, notably in statistical signal processing, pattern recognition, image analysis, and communication systems. In this paper, SVM was applied to the detection of medical ultrasound images in the presence of partially developed speckle noise. The simulation was carried out for single-look and multi-look speckle models to give a complete overview of, and insight into, the newly proposed SVM-based detector. The structure of the SVM was derived and applied to clinical ultrasound images, and its performance was calculated in terms of the mean square error (MSE) metric. We show that the SVM-detected ultrasound images have a very low MSE and are of good quality, and that the quality of the processed speckled images improves under the multi-look model. Furthermore, the contrast of the SVM-detected images was higher than that of the original non-noisy images, indicating that the SVM approach increases the distance between the pixel reflectivity levels (the detection hypotheses) in the original images.
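
The evaluation metric is standard; for completeness, a one-function sketch of the MSE between a reference image and the SVM-detected image:

```python
# Mean square error between a noise-free reference and a detected image.
import numpy as np

def mse(reference: np.ndarray, detected: np.ndarray) -> float:
    return float(np.mean((reference.astype(float) - detected.astype(float)) ** 2))
```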

Pulsed Multi-Layered Image Filtering: A VLSI Implementation

Image convolution similar to the receptive fields found in mammalian visual pathways has long been used in conventional image processing in the form of Gabor masks. However, no VLSI implementation of parallel, multi-layered pulsed processing that emulates this property has previously been brought forward. We present a technical realization of such a pulsed image processing scheme. The discussed IC also serves as a general testbed for VLSI-based pulsed information processing, which is of interest especially with regard to the robustness of representing an analog signal in the phase or duration of a pulsed, quasi-digital signal, as well as the possibility of direct digital manipulation of such an analog signal. The network connectivity and processing properties are reconfigurable so as to allow adaptation to various processing tasks.
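
A sketch of the conventional Gabor-mask convolution that the pulsed VLSI scheme emulates (the chip itself operates on pulsed, quasi-digital signals; the kernel parameters below are illustrative):

```python
# Gabor mask (Gaussian envelope times a cosine carrier) and its
# convolution with a stand-in image.
import numpy as np
from scipy.signal import convolve2d

def gabor_kernel(size=15, sigma=3.0, theta=0.0, wavelength=6.0):
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)    # rotate to mask orientation
    return (np.exp(-(x**2 + y**2) / (2 * sigma**2))
            * np.cos(2 * np.pi * xr / wavelength))

image = np.random.rand(64, 64)
response = convolve2d(image, gabor_kernel(), mode="same")
```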

Identification of Wideband Sources Using Higher Order Statistics in Noisy Environment

This paper deals with the localization of wideband sources. We develop a new approach for estimating wideband source parameters, based on the higher-order statistics of the recorded data, in order to eliminate the Gaussian components from the signals received at the various hydrophones; indeed, the sea-bottom noise is regarded as Gaussian. Thanks to a coherent signal subspace algorithm based on the cumulant matrix of the received data instead of the cross-spectral matrix, correlated wideband sources are accurately located even in a very noisy environment. We demonstrate the performance of the proposed algorithm on real data recorded during an underwater acoustics experiment.
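
The key ingredient is the fourth-order cumulant, which vanishes for Gaussian processes and therefore suppresses the sea-bottom noise; below is a sketch of its sample estimate for zero-mean complex sequences, with a quick numerical check on synthetic Gaussian noise.

```python
# Sample 4th-order cumulant cum(a, b*, c, d*) of zero-mean complex data.
import numpy as np

def cum4(a, b, c, d):
    return (np.mean(a * np.conj(b) * c * np.conj(d))
            - np.mean(a * np.conj(b)) * np.mean(c * np.conj(d))
            - np.mean(a * c) * np.mean(np.conj(b) * np.conj(d))
            - np.mean(a * np.conj(d)) * np.mean(np.conj(b) * c))

noise = (np.random.randn(100_000) + 1j * np.random.randn(100_000)) / np.sqrt(2)
print(abs(cum4(noise, noise, noise, noise)))   # near 0: Gaussian terms cancel
```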