A Text Mining Technique Using Association Rules Extraction

This paper describes text mining technique for automatically extracting association rules from collections of textual documents. The technique called, Extracting Association Rules from Text (EART). It depends on keyword features for discover association rules amongst keywords labeling the documents. In this work, the EART system ignores the order in which the words occur, but instead focusing on the words and their statistical distributions in documents. The main contributions of the technique are that it integrates XML technology with Information Retrieval scheme (TFIDF) (for keyword/feature selection that automatically selects the most discriminative keywords for use in association rules generation) and use Data Mining technique for association rules discovery. It consists of three phases: Text Preprocessing phase (transformation, filtration, stemming and indexing of the documents), Association Rule Mining (ARM) phase (applying our designed algorithm for Generating Association Rules based on Weighting scheme GARW) and Visualization phase (visualization of results). Experiments applied on WebPages news documents related to the outbreak of the bird flu disease. The extracted association rules contain important features and describe the informative news included in the documents collection. The performance of the EART system compared with another system that uses the Apriori algorithm throughout the execution time and evaluating extracted association rules.

A New Ridge Orientation based Method of Computation for Feature Extraction from Fingerprint Images

An important step in studying the statistics of fingerprint minutia features is to reliably extract minutia features from the fingerprint images. A new reliable method of computation for minutiae feature extraction from fingerprint images is presented. A fingerprint image is treated as a textured image. An orientation flow field of the ridges is computed for the fingerprint image. To accurately locate ridges, a new ridge orientation based computation method is proposed. After ridge segmentation a new method of computation is proposed for smoothing the ridges. The ridge skeleton image is obtained and then smoothed using morphological operators to detect the features. A post processing stage eliminates a large number of false features from the detected set of minutiae features. The detected features are observed to be reliable and accurate.

Screened Potential in a Reverse Monte Carlo (RMC) Simulation

A structural study of an aqueous electrolyte whose experimental results are available. It is a solution of LiCl-6H2O type at glassy state (120K) contrasted with pure water at room temperature by means of Partial Distribution Functions (PDF) issue from neutron scattering technique. Based on these partial functions, the Reverse Monte Carlo method (RMC) computes radial and angular correlation functions which allow exploring a number of structural features of the system. The obtained curves include some artifacts. To remedy this, we propose to introduce a screened potential as an additional constraint. Obtained results show a good matching between experimental and computed functions and a significant improvement in PDFs curves with potential constraint. It suggests an efficient fit of pair distribution functions curves.

An Approach to Solving a Permutation Problem of Frequency Domain Independent Component Analysis for Blind Source Separation of Speech Signals

Independent component analysis (ICA) in the frequency domain is used for solving the problem of blind source separation (BSS). However, this method has some problems. For example, a general ICA algorithm cannot determine the permutation of signals which is important in the frequency domain ICA. In this paper, we propose an approach to the solution for a permutation problem. The idea is to effectively combine two conventional approaches. This approach improves the signal separation performance by exploiting features of the conventional approaches. We show the simulation results using artificial data.

Human Verification in a Video Surveillance System Using Statistical Features

A human verification system is presented in this paper. The system consists of several steps: background subtraction, thresholding, line connection, region growing, morphlogy, star skelatonization, feature extraction, feature matching, and decision making. The proposed system combines an advantage of star skeletonization and simple statistic features. A correlation matching and probability voting have been used for verification, followed by a logical operation in a decision making stage. The proposed system uses small number of features and the system reliability is convincing.

A Novel Hybrid Mobile Agent Based Distributed Intrusion Detection System

The first generation of Mobile Agents based Intrusion Detection System just had two components namely data collection and single centralized analyzer. The disadvantage of this type of intrusion detection is if connection to the analyzer fails, the entire system will become useless. In this work, we propose novel hybrid model for Mobile Agent based Distributed Intrusion Detection System to overcome the current problem. The proposed model has new features such as robustness, capability of detecting intrusion against the IDS itself and capability of updating itself to detect new pattern of intrusions. In addition, our proposed model is also capable of tackling some of the weaknesses of centralized Intrusion Detection System models.

Analytical Study of Component Based Software Engineering

This paper is a survey of current component-based software technologies and the description of promotion and inhibition factors in CBSE. The features that software components inherit are also discussed. Quality Assurance issues in componentbased software are also catered to. The feat research on the quality model of component based system starts with the study of what the components are, CBSE, its development life cycle and the pro & cons of CBSE. Various attributes are studied and compared keeping in view the study of various existing models for general systems and CBS. When illustrating the quality of a software component an apt set of quality attributes for the description of the system (or components) should be selected. Finally, the research issues that can be extended are tabularized.

Molecular Dynamics of Fatty Acid Interacting with Carbon Nanotube as Selective Device

In this paper we study a system composed by carbon nanotube (CNT) and bundle of carbon nanotube (BuCNT) interacting with a specific fatty acid as molecular probe. Full system is represented by open nanotube (or nanotubes) and the linoleic acid (LA) relaxing due the interaction with CNT and BuCNT. The LA has in his form an asymmetric shape with COOH termination provoking a close BuCNT interaction mainly by van der Waals force field. The simulations were performed by classical molecular dynamics with standard parameterizations. Our results show that these BuCNT and CNT are dynamically stable and it shows a preferential interaction position with LA resulting in three features: (i) when the LA is interacting with CNT and BuCNT (including both termination, CH2 or COOH), the LA is repelled; (ii) when the LA terminated with CH2 is closer to open extremity of BuCNT, the LA is also repelled by the interaction between them; and (iii) when the LA terminated with COOH is closer to open extremity of BuCNT, the LA is encapsulated by the BuCNT. These simulations are part of a more extensive work on searching efficient selective molecular devices and could be useful to reach this goal.

The Characteristics of the Factors that Govern the Preferred Force in the Social Force Model of Pedestrian Movement

The social force model which belongs to the microscopic pedestrian studies has been considered as the supremacy by many researchers and due to the main feature of reproducing the self-organized phenomena resulted from pedestrian dynamic. The Preferred Force which is a measurement of pedestrian-s motivation to adapt his actual velocity to his desired velocity is an essential term on which the model was set up. This Force has gone through stages of development: first of all, Helbing and Molnar (1995) have modeled the original force for the normal situation. Second, Helbing and his co-workers (2000) have incorporated the panic situation into this force by incorporating the panic parameter to account for the panic situations. Third, Lakoba and Kaup (2005) have provided the pedestrians some kind of intelligence by incorporating aspects of the decision-making capability. In this paper, the authors analyze the most important incorporations into the model regarding the preferred force. They make comparisons between the different factors of these incorporations. Furthermore, to enhance the decision-making ability of the pedestrians, they introduce additional features such as the familiarity factor to the preferred force to let it appear more representative of what actually happens in reality.

Dynamic Load Balancing Strategy for Grid Computing

Workload and resource management are two essential functions provided at the service level of the grid software infrastructure. To improve the global throughput of these software environments, workloads have to be evenly scheduled among the available resources. To realize this goal several load balancing strategies and algorithms have been proposed. Most strategies were developed in mind, assuming homogeneous set of sites linked with homogeneous and fast networks. However for computational grids we must address main new issues, namely: heterogeneity, scalability and adaptability. In this paper, we propose a layered algorithm which achieve dynamic load balancing in grid computing. Based on a tree model, our algorithm presents the following main features: (i) it is layered; (ii) it supports heterogeneity and scalability; and, (iii) it is totally independent from any physical architecture of a grid.

Distinguishing Innocent Murmurs from Murmurs caused by Aortic Stenosis by Recurrence Quantification Analysis

It is sometimes difficult to differentiate between innocent murmurs and pathological murmurs during auscultation. In these difficult cases, an intelligent stethoscope with decision support abilities would be of great value. In this study, using a dog model, phonocardiographic recordings were obtained from 27 boxer dogs with various degrees of aortic stenosis (AS) severity. As a reference for severity assessment, continuous wave Doppler was used. The data were analyzed with recurrence quantification analysis (RQA) with the aim to find features able to distinguish innocent murmurs from murmurs caused by AS. Four out of eight investigated RQA features showed significant differences between innocent murmurs and pathological murmurs. Using a plain linear discriminant analysis classifier, the best pair of features (recurrence rate and entropy) resulted in a sensitivity of 90% and a specificity of 88%. In conclusion, RQA provide valid features which can be used for differentiation between innocent murmurs and murmurs caused by AS.

In Search of Robustness and Efficiency via l1− and l2− Regularized Optimization for Physiological Motion Compensation

Compensating physiological motion in the context of minimally invasive cardiac surgery has become an attractive issue since it outperforms traditional cardiac procedures offering remarkable benefits. Owing to space restrictions, computer vision techniques have proven to be the most practical and suitable solution. However, the lack of robustness and efficiency of existing methods make physiological motion compensation an open and challenging problem. This work focusses on increasing robustness and efficiency via exploration of the classes of 1−and 2−regularized optimization, emphasizing the use of explicit regularization. Both approaches are based on natural features of the heart using intensity information. Results pointed out the 1−regularized optimization class as the best since it offered the shortest computational cost, the smallest average error and it proved to work even under complex deformations.

Investigation of Some Methodologies in Providing Erosion Maps of Surface, Rill and Gully and Erosion Features

Some methodologies were compared in providing erosion maps of surface, rill and gully and erosion features, in research which took place in the Varamin sub-basin, north-east Tehran, Iran. A photomorphic unit map was produced from processed satellite images, and four other maps were prepared by the integration of different data layers, including slope, plant cover, geology, land use, rocks erodibility and land units. Comparison of ground truth maps of erosion types and working unit maps indicated that the integration of land use, land units and rocks erodibility layers with satellite image photomorphic units maps provide the best methods in producing erosion types maps.

Evaluation of Haar Cascade Classifiers Designed for Face Detection

In the past years a lot of effort has been made in the field of face detection. The human face contains important features that can be used by vision-based automated systems in order to identify and recognize individuals. Face location, the primary step of the vision-based automated systems, finds the face area in the input image. An accurate location of the face is still a challenging task. Viola-Jones framework has been widely used by researchers in order to detect the location of faces and objects in a given image. Face detection classifiers are shared by public communities, such as OpenCV. An evaluation of these classifiers will help researchers to choose the best classifier for their particular need. This work focuses of the evaluation of face detection classifiers minding facial landmarks.

Face Recognition using Radial Basis Function Network based on LDA

This paper describes a method to improve the robustness of a face recognition system based on the combination of two compensating classifiers. The face images are preprocessed by the appearance-based statistical approaches such as Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA). LDA features of the face image are taken as the input of the Radial Basis Function Network (RBFN). The proposed approach has been tested on the ORL database. The experimental results show that the LDA+RBFN algorithm has achieved a recognition rate of 93.5%

Energy Distribution of EEG Signals: EEG Signal Wavelet-Neural Network Classifier

In this paper, a wavelet-based neural network (WNN) classifier for recognizing EEG signals is implemented and tested under three sets EEG signals (healthy subjects, patients with epilepsy and patients with epileptic syndrome during the seizure). First, the Discrete Wavelet Transform (DWT) with the Multi-Resolution Analysis (MRA) is applied to decompose EEG signal at resolution levels of the components of the EEG signal (δ, θ, α, β and γ) and the Parseval-s theorem are employed to extract the percentage distribution of energy features of the EEG signal at different resolution levels. Second, the neural network (NN) classifies these extracted features to identify the EEGs type according to the percentage distribution of energy features. The performance of the proposed algorithm has been evaluated using in total 300 EEG signals. The results showed that the proposed classifier has the ability of recognizing and classifying EEG signals efficiently.

Modeling a Multinomial Logit Model of Intercity Travel Mode Choice Behavior for All Trips in Libya

In the planning point of view, it is essential to have mode choice, due to the massive amount of incurred in transportation systems. The intercity travellers in Libya have distinct features, as against travellers from other countries, which includes cultural and socioeconomic factors. Consequently, the goal of this study is to recognize the behavior of intercity travel using disaggregate models, for projecting the demand of nation-level intercity travel in Libya. Multinomial Logit Model for all the intercity trips has been formulated to examine the national-level intercity transportation in Libya. The Multinomial logit model was calibrated using nationwide revealed preferences (RP) and stated preferences (SP) survey. The model was developed for deference purpose of intercity trips (work, social and recreational). The variables of the model have been predicted based on maximum likelihood method. The data needed for model development were obtained from all major intercity corridors in Libya. The final sample size consisted of 1300 interviews. About two-thirds of these data were used for model calibration, and the remaining parts were used for model validation. This study, which is the first of its kind in Libya, investigates the intercity traveler’s mode-choice behavior. The intercity travel mode-choice model was successfully calibrated and validated. The outcomes indicate that, the overall model is effective and yields higher precision of estimation. The proposed model is beneficial, due to the fact that, it is receptive to a lot of variables, and can be employed to determine the impact of modifications in the numerous characteristics on the need for various travel modes. Estimations of the model might also be of valuable to planners, who can estimate possibilities for various modes and determine the impact of unique policy modifications on the need for intercity travel.

Analytical Study of Sedimentation Formation in Lined Canals using the SHARC Software- A Case Study of the Sabilli Canal in Dezful, Iran

Sediment formation and its transport along the river course is considered as important hydraulic consideration in river engineering. Their impact on the morphology of rivers on one hand and important considerations of which in the design and construction of the hydraulic structures on the other has attracted the attention of experts in arid and semi-arid regions. Under certain conditions where the momentum energy of the flow stream reaches a specific rate, the sediment materials start to be transported with the flow. This can usually be analyzed in two different categories of suspended and bed load materials. Sedimentation phenomenon along the waterways and the conveyance of vast volume of materials into the canal networks can potentially influence water abstraction in the intake structures. This can pose a serious threat to operational sustainability and water delivery performance in the canal networks. The situation is serious where ineffective watershed management (poor vegetation cover in the water basin) is the underlying cause of soil erosion which feeds the materials into the waterways that intern would necessitate comprehensive study. The present paper aims to present an analytical investigation of the sediment process in the waterways on one hand and estimation of the sediment load transport into the lined canals using the SHARC software on the other. For this reason, the paper focuses on the comparative analysis of the hydraulic behaviors of the Sabilli main canal that feeds the pumping station with that of the Western canal in the Greater Dezful region to identify effective factors in sedimentation and ways of mitigating their impact on water abstraction in the canal systems. The method involved use of observational data available in the Dezful Dastmashoon hydrometric station along a 6 km waterway of the Sabilli main canal using the SHARC software to estimate the suspended load concentration and bed load materials. Results showed the transport of a significant volume of sediment loads from the waterways into the canal system which is assumed to have arisen from the absence of stilling basin on one hand and the gravity flow on the other has caused serious challenges. This is contrary to what occurs in the Sabilli canal, where the design feature which incorporates a settling basin just before the pumping station is the major cause of reduced sediment load transport into the canal system.Results showed that modification of the present design features by constructing a settling basin just upstream of the western intake structure can considerably reduce the entry of sediment materials into the canal system. Not only this can result in the sustainability of the hydraulic structures but can also improve operational performance of water conveyance and distribution system, all of which are the pre-requisite to secure reliable and equitable water delivery regime for the command area.

The Performance Improvement of Automatic Modulation Recognition Using Simple Feature Manipulation, Analysis of the HOS, and Voted Decision

The use of High Order Statistics (HOS) analysis is expected to provide so many candidates of features that can be selected for pattern recognition. More candidates of the feature can be extracted using simple manipulation through a specific mathematical function prior to the HOS analysis. Feature extraction method using HOS analysis combined with Difference to the Nth-Power manipulation has been examined in application for Automatic Modulation Recognition (AMR) to perform scheme recognition of three digital modulation signal, i.e. QPSK-16QAM-64QAM in the AWGN transmission channel. The simulation results is reported when the analysis of HOS up to order-12 and the manipulation of Difference to the Nth-Power up to N = 4. The obtained accuracy rate of AMR using the method of Simple Decision obtained 90% in SNR > 10 dB in its classifier, while using the method of Voted Decision is 96% in SNR > 2 dB.

Named Entity Recognition using Support Vector Machine: A Language Independent Approach

Named Entity Recognition (NER) aims to classify each word of a document into predefined target named entity classes and is now-a-days considered to be fundamental for many Natural Language Processing (NLP) tasks such as information retrieval, machine translation, information extraction, question answering systems and others. This paper reports about the development of a NER system for Bengali and Hindi using Support Vector Machine (SVM). Though this state of the art machine learning technique has been widely applied to NER in several well-studied languages, the use of this technique to Indian languages (ILs) is very new. The system makes use of the different contextual information of the words along with the variety of features that are helpful in predicting the four different named (NE) classes, such as Person name, Location name, Organization name and Miscellaneous name. We have used the annotated corpora of 122,467 tokens of Bengali and 502,974 tokens of Hindi tagged with the twelve different NE classes 1, defined as part of the IJCNLP-08 NER Shared Task for South and South East Asian Languages (SSEAL) 2. In addition, we have manually annotated 150K wordforms of the Bengali news corpus, developed from the web-archive of a leading Bengali newspaper. We have also developed an unsupervised algorithm in order to generate the lexical context patterns from a part of the unlabeled Bengali news corpus. Lexical patterns have been used as the features of SVM in order to improve the system performance. The NER system has been tested with the gold standard test sets of 35K, and 60K tokens for Bengali, and Hindi, respectively. Evaluation results have demonstrated the recall, precision, and f-score values of 88.61%, 80.12%, and 84.15%, respectively, for Bengali and 80.23%, 74.34%, and 77.17%, respectively, for Hindi. Results show the improvement in the f-score by 5.13% with the use of context patterns. Statistical analysis, ANOVA is also performed to compare the performance of the proposed NER system with that of the existing HMM based system for both the languages.