Text Mining Technique for Data Mining Application

Applying knowledge discovery techniques to unstructured text is termed knowledge discovery in text (KDT), text data mining, or text mining. The decision tree approach is particularly useful in classification problems: with this technique, a tree is constructed to model the classification process, and there are two basic steps, building the tree and applying it to the database. This paper describes a proposed C5.0 classifier that applies rulesets, cross-validation and boosting to the original C5.0 in order to reduce the error rate. The feasibility and the benefits of the proposed approach are demonstrated on a medical data set, hypothyroid. It is shown that the performance of a classifier on the training cases from which it was constructed gives a poor estimate of its accuracy on new cases; by sampling or by using a separate test file, the classifier is instead evaluated on cases that were not used to build it, and either way the estimate is reliable only when both the training and evaluation sets are large. If the cases in hypothyroid.data and hypothyroid.test were shuffled and divided into a new 2772-case training set and a 1000-case test set, C5.0 might construct a different classifier with a lower or higher error rate on the test cases. An important feature of See5 is its ability to generate classifiers called rulesets; here the ruleset has an error rate of 0.5% on the test cases. The standard errors of the means provide an estimate of the variability of the results. One way to get a more reliable estimate of predictive accuracy is f-fold cross-validation, in which the error rate of a classifier produced from all the cases is estimated as the ratio of the total number of errors on the hold-out cases to the total number of cases. Boosting generates several classifiers, each of which pays more attention to the cases misclassified by its predecessors; the Boost option with x trials instructs See5 to construct up to x classifiers in this manner. Trials over numerous datasets, large and small, show that on average 10-classifier boosting reduces the error rate for test cases by about 25%.
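
As a hedged sketch of the two evaluation mechanisms described above: C5.0/See5 itself is proprietary, so scikit-learn's decision tree and AdaBoost stand in here for the tree builder and the Boost option, and the data are placeholders.

```python
# Sketch: f-fold cross-validation error estimate, then boosting with x trials.
# scikit-learn stands in for C5.0/See5; data are random placeholders.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X, y = rng.random((300, 5)), rng.integers(0, 2, 300)   # placeholder data

# f-fold CV: error rate = total hold-out errors / total cases
acc = cross_val_score(DecisionTreeClassifier(), X, y, cv=10)
print("10-fold CV error rate: %.3f" % (1.0 - acc.mean()))

# Boost with x = 10 trials: up to 10 classifiers, each refocused on the
# cases its predecessors misclassified
boosted = AdaBoostClassifier(DecisionTreeClassifier(max_depth=3),
                             n_estimators=10).fit(X, y)
print("training error: %.3f" % (1.0 - boosted.score(X, y)))
```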

Self-Organization of Radiation Defects: Temporal Dissipative Structures

A theoretical approach to radiation damage evolution is developed. Stable temporal behaviors taking place in solids under irradiation are examined as phenomena of self-organization in nonequilibrium systems. Experimental effects of temporal self-organization in solids under irradiation are reviewed, and their essential common properties and features are highlighted and analyzed. A dynamical model is proposed to describe the development of self-oscillations of the point-defect density under stationary irradiation. The emphasis is on the nonlinear couplings between the annealing rate and the defect density, which determine the kind and parameters of the arising self-oscillations. The region of parameters (defect generation rate and ambient temperature) in which self-oscillations develop is found, and the bifurcation curve and the self-oscillation period near it are obtained.
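
The abstract does not reproduce the model equations. As a hedged illustration only, rate-equation models of this family are commonly written for the vacancy and interstitial densities $c_v$ and $c_i$, with temperature-dependent annealing supplying the nonlinearity (the paper's specific coupling terms may differ):

$$\frac{dc_v}{dt} = K - \alpha\, c_i c_v - \beta_v(T)\, c_v, \qquad \frac{dc_i}{dt} = K - \alpha\, c_i c_v - \beta_i(T)\, c_i,$$

where $K$ is the defect generation rate, $\alpha$ the recombination coefficient, and $\beta_{v,i}(T)$ the annealing rates. Self-oscillations set in where the stationary solution of such a system loses stability, which is what a bifurcation curve in the $(K, T)$ plane delineates.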

A Modularized Design for Multi-Driver Off-Road Vehicle Driving-Line and its Performance Assessment

A modularized design approach can facilitate the modeling of complex systems and support behavior analysis and simulation in an iterative, and thus complex, engineering process by using encapsulated submodels of components and of their interfaces. It can therefore improve design efficiency and simplify the solution of complicated problems. A multi-driver off-road vehicle is comparatively complicated. The driving-line is a core part of a vehicle and contributes significantly to its performance; multi-driver off-road vehicles have complex driving-lines, so their performance depends heavily on them. A typical off-road vehicle's driving-line system consists of a torque converter, transmission, transfer case and driving axles, which transfer the power generated by the engine and distribute it effectively to the driving wheels according to the road condition. Based on these main functions, this paper puts forward a modularized approach to the design and evaluation of a vehicle's driving-line, which can be used to estimate the performance of the driving-line effectively during the concept design stage. Through appropriate analysis and assessment methods, an optimal design can be reached. The method has been applied to practical vehicle design; it improves design efficiency and makes it convenient to assess and validate the performance of a vehicle, especially a multi-driver off-road vehicle.
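
As a minimal sketch of the modular idea (component names follow the abstract; all ratios, efficiencies and the engine torque are invented for illustration), each driveline element can be encapsulated behind a single torque-in/torque-out interface:

```python
# Sketch: each driveline component is an encapsulated submodel with one
# interface (torque in -> torque out); parameter values are made up.
class Module:
    def __init__(self, ratio, efficiency):
        self.ratio, self.efficiency = ratio, efficiency
    def transmit(self, torque):
        return torque * self.ratio * self.efficiency

driveline = [
    Module(2.1, 0.95),   # torque converter (rough stall ratio)
    Module(3.5, 0.97),   # transmission, first gear
    Module(1.9, 0.98),   # transfer case, low range
    Module(4.1, 0.96),   # driving axle final drive
]

torque = 500.0           # assumed engine output torque, N*m
for m in driveline:      # chain the encapsulated submodels
    torque = m.transmit(torque)
print("wheel torque estimate: %.0f N*m" % torque)
```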

Stress Relaxation of Date at Different Temperatures and Moisture Contents of Product: A New Approach

Iran is one of the largest producers of dates in the world. However, due to a lack of information about their viscoelastic properties, much of the production is downgraded during harvesting and postharvest processes. In this study, the effects of temperature and moisture content of the product on stress relaxation characteristics were investigated. Freshly harvested dates (cv. Kabkab) at the tamar stage were placed in a controlled-environment chamber to obtain different temperature levels (25, 35, 45, and 55 °C) and moisture contents (8.5, 8.7, 9.2, 15.3, 20, and 32.2% d.b.). A TA.XT2 texture analyzer (Stable Micro Systems, UK) was used to apply uniaxial compression tests, and a chamber capable of controlling temperature was designed and fabricated around the plunger of the texture analyzer to control the temperature during the experiments. As a new approach, a CCD camera (A4Tech, 30 fps) was mounted on a cylindrical glass probe to scan and record the contact area between the date and the disk; the pictures were then analyzed using the Image Processing Toolbox of MATLAB. Individual date fruits were uniaxially compressed at a speed of 1 mm/s, with a constant strain of 30% of the thickness of the date applied to the horizontally oriented fruit. To select a suitable model for describing the stress relaxation of dates, the experimental data were fitted with three well-known stress relaxation models: the generalized Maxwell, Nussinovitch, and Peleg models. The constants in these models were determined and correlated with the temperature and moisture content of the product using non-linear regression analysis. It was found that the generalized Maxwell and Nussinovitch models describe the viscoelastic characteristics of date fruits more appropriately than the Peleg model.
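
For instance, a generalized Maxwell fit can be set up by non-linear regression as below; this is a hedged sketch with synthetic data and a two-term model, with SciPy's curve_fit standing in for the study's actual fitting procedure.

```python
# Sketch: fit a two-term generalized Maxwell model
#   sigma(t) = s_e + s1*exp(-t/tau1) + s2*exp(-t/tau2)
# to a relaxation curve; the "measured" data below are synthetic.
import numpy as np
from scipy.optimize import curve_fit

def maxwell(t, s_e, s1, tau1, s2, tau2):
    return s_e + s1 * np.exp(-t / tau1) + s2 * np.exp(-t / tau2)

rng = np.random.default_rng(5)
t = np.linspace(0, 60, 200)                    # time, s
sigma = maxwell(t, 10, 8, 2.0, 5, 20.0)        # synthetic relaxation curve
sigma += rng.normal(0, 0.1, t.size)            # measurement noise

p0 = [5, 5, 1.0, 5, 10.0]                      # initial parameter guesses
popt, _ = curve_fit(maxwell, t, sigma, p0=p0)
print("fitted constants:", np.round(popt, 2))
```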

Neural Network Based Icing Identification and Fault Tolerant Control of an A340 Aircraft

This paper presents a Neural Network (NN) identification of icing parameters in an A340 aircraft and a reconfiguration technique to keep the A/C performance close to its performance prior to icing. Five aircraft parameters are assumed to be considerably affected by icing. The off-line training for identifying the clean and iced dynamics is based on the Levenberg-Marquardt backpropagation algorithm. The icing parameters are located in the system matrix, and the physical locations of the icing are assumed to be the right and left wings. The reconfiguration is based on the technique known as the control mixer approach, or pseudo-inverse technique, which generates a new control input vector such that the A/C dynamics are not much affected by icing. In the simulations, the longitudinal and lateral dynamics of an Airbus A340 aircraft model are considered, and the stability derivatives affected by icing are identified. The simulation results show successful NN identification of the icing parameters, and the reconfigured flight dynamics exhibit performance similar to that before icing; in other words, the destabilizing icing effect is compensated.
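
A minimal numpy sketch of the control mixer (pseudo-inverse) step follows; the effectiveness matrices are invented, and a simple uniform degradation stands in for the identified icing parameters.

```python
# Sketch of the control-mixer (pseudo-inverse) idea: choose u_new so the
# iced plant reproduces the nominal control effect, B_iced @ u_new ~ B_nom @ u.
import numpy as np

B_nom = np.array([[1.0, 0.4],
                  [0.2, 1.0],
                  [0.1, 0.3]])        # nominal control effectiveness (made up)
B_iced = 0.7 * B_nom                  # icing degrades effectiveness (assumed)

u = np.array([0.5, -0.2])             # pilot/controller command
u_new = np.linalg.pinv(B_iced) @ (B_nom @ u)   # mixer reallocation

print("reallocated input:", u_new)
print("effect error:", np.linalg.norm(B_iced @ u_new - B_nom @ u))
```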

Evolutionary Approach for Automated Discovery of Censored Production Rules

In the recent past, there has been increasing interest in applying evolutionary methods to Knowledge Discovery in Databases (KDD), and a number of successful applications of Genetic Algorithms (GA) and Genetic Programming (GP) to KDD have been demonstrated. The most predominant representation of the discovered knowledge is the standard Production Rule (PR) in the form If P Then D. PRs, however, are unable to handle exceptions and do not exhibit variable precision. Censored Production Rules (CPRs), an extension of PRs proposed by Michalski & Winston, exhibit variable precision and support an efficient mechanism for handling exceptions. A CPR is an augmented production rule of the form If P Then D Unless C, where C (the Censor) is an exception to the rule. Such rules are employed in situations in which the conditional statement 'If P Then D' holds frequently and the assertion C holds rarely. By using a rule of this type we are free to ignore the exception conditions when the resources needed to establish their presence are tight or there is simply no information available as to whether they hold or not. Thus, the 'If P Then D' part of the CPR expresses important information, while the Unless C part acts only as a switch that changes the polarity of D to ~D. This paper presents a classification algorithm, based on an evolutionary approach, that discovers comprehensible rules with exceptions in the form of CPRs. The proposed approach has a flexible chromosome encoding, where each chromosome corresponds to a CPR. Appropriate genetic operators are suggested, and a fitness function is proposed that incorporates the basic constraints on CPRs. Experimental results are presented to demonstrate the performance of the proposed algorithm.
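
A hedged sketch of what such a chromosome encoding and fitness might look like; the attributes, gene layout and penalty weight are illustrative assumptions, not the paper's exact design.

```python
# Sketch: a chromosome encodes one CPR "If P Then D Unless C"; the fitness
# rewards accuracy of "If P Then D" and penalizes a censor that fires often
# (a CPR requires C to hold rarely).  All values below are toy choices.
import random

VALUES = {"outlook": ["sun", "rain"], "wind": ["weak", "strong"]}

def random_cond():
    attr = random.choice(list(VALUES))
    return (attr, random.choice(VALUES[attr]))

def random_cpr():
    # chromosome = (premise P, decision D, censor C)
    return {"P": [random_cond()], "D": random.choice(["play", "stay"]),
            "C": [random_cond()]}

def fitness(cpr, records):
    covered = [r for r in records if all(r.get(a) == v for a, v in cpr["P"])]
    if not covered:
        return 0.0
    correct = sum(r["class"] == cpr["D"] for r in covered) / len(covered)
    censored = sum(all(r.get(a) == v for a, v in cpr["C"]) for r in covered)
    return correct - 0.5 * censored / len(covered)

records = [{"outlook": "sun", "wind": "weak", "class": "play"},
           {"outlook": "sun", "wind": "strong", "class": "stay"}]
print(fitness(random_cpr(), records))
```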

A Heuristic Algorithm Approach for Scheduling of Multi-criteria Unrelated Parallel Machines

In this paper we address a multi-objective scheduling problem for unrelated parallel machines, in which the processing cost/time of a given job may vary from machine to machine. The objective of scheduling is to simultaneously determine the job-machine assignment and the job sequence on each machine such that the total cost of the schedule is minimized. The cost function consists of three components: machining cost, earliness/tardiness penalties and makespan-related cost. Such a scheduling problem is combinatorial in nature; therefore, a simulated annealing approach is employed to provide good solutions within reasonable computational times. Computational results show that the proposed approach can efficiently solve such complicated problems.
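
A minimal simulated-annealing sketch for the job-machine assignment part follows; the cost function below keeps only the machining-cost and makespan components, and the cooling schedule is a placeholder rather than the paper's tuned choice.

```python
# Sketch: SA over job-machine assignments on unrelated parallel machines.
# Earliness/tardiness penalties are omitted for brevity.
import math, random

M, J = 3, 10
proc = [[random.randint(1, 9) for _ in range(M)] for _ in range(J)]  # job x machine

def cost(assign):
    loads = [0] * M
    for j, m in enumerate(assign):
        loads[m] += proc[j][m]
    machining = sum(proc[j][assign[j]] for j in range(J))
    return machining + max(loads)          # machining cost + makespan term

cur = [random.randrange(M) for _ in range(J)]
best, T = list(cur), 10.0
while T > 0.01:
    cand = list(cur)
    cand[random.randrange(J)] = random.randrange(M)   # move one job
    d = cost(cand) - cost(cur)
    if d < 0 or random.random() < math.exp(-d / T):   # Metropolis rule
        cur = cand
        if cost(cur) < cost(best):
            best = list(cur)
    T *= 0.99                                         # geometric cooling
print("best assignment:", best, "cost:", cost(best))
```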

Non-negative Principal Component Analysis for Face Recognition

Principal component analysis (PCA) is often combined with state-of-the-art classification algorithms to recognize human faces. However, as a global feature selection algorithm, PCA captures only the features contributing to the global characteristics of the data and misses those contributing to its local characteristics, because each principal component contains only some level of the data's global characteristics. In this study, we present a novel face recognition approach using non-negative principal component analysis, which adds a non-negativity constraint to improve data locality and help elucidate latent data structures. Experiments are performed on the Cambridge ORL face database. We demonstrate the strong performance of the algorithm in recognizing human faces in comparison with the PCA and NREMF approaches.
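
One generic way to impose such a constraint is projected gradient ascent on the variance objective, sketched below; this formulation is chosen for illustration and need not match the paper's solver.

```python
# Sketch of non-negative PCA via projected gradient ascent:
# maximize w^T X^T X w subject to ||w|| = 1 and w >= 0.
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((100, 50))                  # rows = vectorized face images (toy)
X = X - X.mean(axis=0)                     # center the data
C = X.T @ X

w = rng.random(50)
w /= np.linalg.norm(w)
for _ in range(500):
    w = w + 0.001 * (C @ w)                # gradient step on the variance
    w = np.maximum(w, 0.0)                 # project onto non-negative orthant
    w /= np.linalg.norm(w) + 1e-12         # renormalize to the unit sphere

print("captured variance:", float(w @ C @ w))
```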

Intelligent Heart Disease Prediction System Using CANFIS and Genetic Algorithm

Heart disease (HD) is a major cause of morbidity and mortality in modern society. Medical diagnosis is an important but complicated task that should be performed accurately and efficiently, and its automation would be very useful. Unfortunately, not all doctors are equally skilled in every subspecialty, and in many places they are a scarce resource. A system for automated medical diagnosis would enhance medical care and reduce costs. In this paper, a new approach based on the coactive neuro-fuzzy inference system (CANFIS) is presented for the prediction of heart disease. The proposed CANFIS model combines the adaptive capabilities of neural networks with the qualitative approach of fuzzy logic, and is then integrated with a genetic algorithm to diagnose the presence of the disease. The performance of the CANFIS model was evaluated in terms of training performance and classification accuracy, and the results show that the proposed model has great potential in predicting heart disease.
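
A toy sketch of the hybrid idea only: a genetic-style search tunes the membership-function centers of a minimal fuzzy model (real CANFIS couples fuzzy inference with neural learning over many clinical attributes, and the data, memberships and operators below are invented).

```python
# Toy sketch: evolutionary tuning of two fuzzy membership centers.
import math, random

data = [(120, 0), (200, 1), (140, 0), (220, 1)]   # (cholesterol, disease)

def predict(c_low, c_high, x):
    mu_low = math.exp(-((x - c_low) / 30.0) ** 2)    # "low risk" membership
    mu_high = math.exp(-((x - c_high) / 30.0) ** 2)  # "high risk" membership
    return mu_high / (mu_low + mu_high + 1e-9)

def fitness(genes):
    return -sum((predict(*genes, x) - y) ** 2 for x, y in data)

pop = [(random.uniform(100, 180), random.uniform(180, 260)) for _ in range(20)]
for _ in range(50):                      # evolve: truncation selection + mutation
    pop.sort(key=fitness, reverse=True)
    pop = pop[:10] + [(a + random.gauss(0, 5), b + random.gauss(0, 5))
                      for a, b in pop[:10]]
best = max(pop, key=fitness)
print("tuned centers:", [round(g, 1) for g in best])
```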

An Adaptive Model for Blind Image Restoration using Bayesian Approach

Image restoration involves the elimination of noise; filtering techniques have been used to restore images for the last five decades. In this paper, we consider the problem of restoring an image degraded by a blur function and corrupted by random noise. A method for reducing additive noise in images by explicit analysis of local image statistics is introduced and compared with other noise reduction methods. The proposed method, which makes use of an a priori noise model, has been evaluated on various types of images. The Bayesian algorithms and image processing techniques are described and substantiated with experiments using MATLAB.
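
As a hedged sketch of the Bayesian flavor of such restoration, a MAP estimate under Gaussian image and noise priors reduces to Wiener-like filtering in the Fourier domain; the kernel, noise level and SNR below are illustrative assumptions, not the paper's model.

```python
# Sketch: MAP deblurring with a Gaussian prior = Wiener-like filtering.
import numpy as np

rng = np.random.default_rng(1)
img = rng.random((64, 64))                        # placeholder image
kernel = np.zeros_like(img)
kernel[:3, :3] = 1.0 / 9.0                        # assumed 3x3 box blur

H = np.fft.fft2(kernel)
blurred = np.real(np.fft.ifft2(np.fft.fft2(img) * H))
noisy = blurred + rng.normal(0, 0.01, img.shape)  # additive Gaussian noise

snr = 100.0                                       # a priori signal-to-noise guess
G = np.conj(H) / (np.abs(H) ** 2 + 1.0 / snr)     # MAP/Wiener restoration filter
restored = np.real(np.fft.ifft2(np.fft.fft2(noisy) * G))
print("MSE: %.5f" % np.mean((restored - img) ** 2))
```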

Modeling of Reinforcement in Concrete Beams Using Machine Learning Tools

The paper discusses the results obtained in predicting the reinforcement of singly reinforced beams using Neural Networks (NN), Support Vector Machines (SVMs) and tree-based models. A major advantage of SVMs over NN is that they minimize a bound on the generalization error of the model rather than a bound on the mean square error over the data set, as done in NN. The tree-based approach divides the problem into a small number of subproblems to reach a conclusion. A data set covering different beam parameters was created for model building and validation, with the reinforcement calculated using the limit state method. The results from this study suggest remarkably good performance of the tree-based and SVM models; these two techniques work well, and even better than the neural network methods. A comparison of predicted values with actual values suggests very good correlation coefficients for all four techniques.
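
A minimal scikit-learn sketch of such a comparison follows; the feature ranges (b, d, fck, fy) and the target formula are simplified stand-ins for the limit-state reinforcement calculation.

```python
# Sketch: compare NN, SVM and tree regressors on synthetic beam data.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(2)
X = rng.uniform([200, 300, 20, 415], [400, 700, 40, 500], (500, 4))
y = 0.5 * X[:, 0] * X[:, 1] * X[:, 2] / X[:, 3] * 1e-3   # toy steel-area target

Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)
for name, model in (("NN", make_pipeline(StandardScaler(), MLPRegressor(max_iter=2000))),
                    ("SVM", make_pipeline(StandardScaler(), SVR())),
                    ("Tree", DecisionTreeRegressor())):
    model.fit(Xtr, ytr)
    print(name, "R^2 = %.3f" % model.score(Xte, yte))
```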

A New Heuristic Approach for the Stock-Cutting Problems

This paper addresses a stock-cutting problem with rotation of items and without the guillotine cutting constraint. In order to solve large-scale problems effectively and efficiently, we propose a simple but fast heuristic algorithm. It is shown that this heuristic outperforms the latest published algorithms on large-scale problem instances.
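
For concreteness, a generic shelf heuristic with 90-degree rotation is sketched below as a baseline of this problem type; it is not the paper's specific algorithm.

```python
# Sketch: first-fit-decreasing shelf packing with rotation of items.
def shelf_pack(items, sheet_w):
    """items: list of (w, h); returns placements and used sheet height."""
    items = sorted(items, key=lambda wh: -max(wh))   # larger side first
    placements, x, shelf_y, shelf_h = [], 0, 0, 0
    for w, h in items:
        if h > w:
            w, h = h, w                  # rotate so the long side lies flat
        if x + w > sheet_w:              # item does not fit: open a new shelf
            shelf_y += shelf_h
            x, shelf_h = 0, 0
        placements.append((x, shelf_y, w, h))
        x += w
        shelf_h = max(shelf_h, h)
    return placements, shelf_y + shelf_h

pts, height = shelf_pack([(4, 2), (3, 5), (6, 1), (2, 2)], sheet_w=8)
print("used sheet height:", height)
```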

The Design of Axisymmetric Ducts for Incompressible Flow with a Parabolic Axial Velocity Inlet Profile

In this paper a numerical algorithm is described for solving the boundary value problem associated with axisymmetric, inviscid, incompressible, rotational (and irrotational) flow in order to obtain duct wall shapes from prescribed wall velocity distributions. The governing equations are formulated in terms of the stream function ψ(x, y) and the function φ(x, y) as independent variables, where for irrotational flow φ(x, y) can be recognized as the velocity potential function; for rotational flow φ(x, y) ceases to be the velocity potential function but remains orthogonal to the streamlines. A numerical method based on a finite difference scheme on a uniform mesh is employed. The technique described is capable of tackling the so-called inverse problem, where the wall velocity distributions are prescribed and the duct wall shape is calculated, as well as the direct problem, where the velocity distribution on the duct walls is calculated from prescribed duct geometries. The two cases outlined in this paper are in fact boundary value problems with Neumann and Dirichlet boundary conditions, respectively. Even though both approaches are discussed, numerical results are given only for the case of Dirichlet boundary conditions. A downstream condition is prescribed such that cylindrical flow, that is, flow independent of the axial coordinate, exists.
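
The numerical backbone can be illustrated by a plain 5-point finite-difference relaxation with Dirichlet data, sketched below; the actual governing equation in the (φ, ψ) variables is more involved and is simplified away here.

```python
# Sketch: Jacobi relaxation of a Laplace-type equation on a uniform mesh
# with Dirichlet values prescribed on the two walls.
import numpy as np

n = 21
u = np.zeros((n, n))
u[0, :], u[-1, :] = 1.0, 0.0          # prescribed wall values (assumed)

for _ in range(2000):                  # relax interior nodes toward convergence
    u[1:-1, 1:-1] = 0.25 * (u[2:, 1:-1] + u[:-2, 1:-1]
                            + u[1:-1, 2:] + u[1:-1, :-2])
print("centerline value: %.3f" % u[n // 2, n // 2])
```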

A New Approach to Solve Blasius Equation using Parameter Identification of Nonlinear Functions based on the Bees Algorithm (BA)

In this paper, a new approach is introduced to solve the Blasius equation using parameter identification of a nonlinear function that serves as an approximation function. The Bees Algorithm (BA) is applied to find the adjustable parameters of the approximation function by minimizing a fitness function of those parameters, which are determined such that the approximation function satisfies the boundary conditions. To demonstrate the presented method, the obtained results are compared with another numerical method. The present method can easily be extended to solve a wide range of problems.
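
A hedged sketch of the idea: choose a candidate f(η) that satisfies the Blasius boundary conditions f(0) = f'(0) = 0, f'(∞) = 1 by construction, then identify its parameter by minimizing the collocation residual of f''' + ½ f f'' = 0. A SciPy optimizer stands in for the Bees Algorithm here, and the candidate form is an illustrative choice.

```python
# Sketch: one-parameter approximation of the Blasius solution by residual
# minimization; f(eta) = eta - (1 - exp(-a*eta))/a meets the BCs for any a.
import numpy as np
from scipy.optimize import minimize_scalar

eta = np.linspace(0.0, 10.0, 200)

def residual(a):
    f = eta - (1.0 - np.exp(-a * eta)) / a       # candidate approximation
    f2 = a * np.exp(-a * eta)                    # f''
    f3 = -a**2 * np.exp(-a * eta)                # f'''
    return np.sum((f3 + 0.5 * f * f2) ** 2)      # collocation residual

res = minimize_scalar(residual, bounds=(0.1, 3.0), method="bounded")
print("identified parameter a = %.4f" % res.x)
```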

Novelty as a Measure of Interestingness in Knowledge Discovery

Rule discovery is an important technique for mining knowledge from large databases. The use of objective measures for discovering interesting rules leads to another data mining problem, although of reduced complexity. Data mining researchers have studied subjective measures of interestingness to reduce the volume of discovered rules and ultimately improve the overall efficiency of the KDD process. In this paper we study the novelty of discovered rules as a subjective measure of interestingness. We propose a hybrid approach, based on both objective and subjective measures, to quantify the novelty of discovered rules in terms of their deviations from known rules (knowledge). We analyze the types of deviation that can arise between two rules and categorize the discovered rules according to a user-specified threshold. We implement the proposed framework and experiment with some public datasets, and the experimental results are promising.
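
A toy sketch of the deviation idea follows; the similarity measure and threshold are illustrative choices, not the paper's exact quantification.

```python
# Sketch: categorize a discovered rule by its deviation from a known rule,
# counting antecedent differences and a consequent mismatch.
def deviation(rule_a, rule_b):
    ant_diff = rule_a["if"] ^ rule_b["if"]           # symmetric difference
    same_head = rule_a["then"] == rule_b["then"]
    return len(ant_diff) + (0 if same_head else 2)

known = {"if": {"age>60", "smoker"}, "then": "high_risk"}
found = {"if": {"age>60", "diabetic"}, "then": "high_risk"}

d = deviation(known, found)
print("novel" if d > 1 else "known", "(deviation =", d, ")")
```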

TRS: System for Recommending Semantic Web Service Composition Approaches

A large number of semantic web service composition approaches have been developed by the research community, and one is more efficient than another depending on the particular situation of use. A close look at the requirements of one's particular situation is therefore necessary to find a suitable approach. In this paper, we present a Technique Recommendation System (TRS) which, using a classification of state-of-the-art semantic web service composition approaches, provides the user of the system with recommendations on which service composition approach to use, based on parameters describing the situation of use. TRS has a modular architecture and uses production rules for knowledge representation.
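
A minimal sketch of the production-rule core follows; the rule conditions and approach names are invented for illustration.

```python
# Sketch: situation parameters in, recommended composition approach out,
# driven by an ordered list of production rules.
RULES = [
    (lambda s: s["qos_critical"], "QoS-aware composition approach"),
    (lambda s: s["dynamic_env"], "agent-based dynamic composition approach"),
    (lambda s: True, "template-based static composition approach"),  # default
]

def recommend(situation):
    for condition, approach in RULES:    # first matching rule fires
        if condition(situation):
            return approach

print(recommend({"qos_critical": False, "dynamic_env": True}))
```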

A Rough-set Based Approach to Design an Expert System for Personnel Selection

Effective employee selection is a critical component of a successful organization. Many important criteria for personnel selection, such as decision-making ability, adaptability, ambition, and self-organization, are naturally vague and imprecise to evaluate. Rough set theory (RST), as a relatively new mathematical approach to vagueness and uncertainty, is a well-suited tool for dealing with qualitative data and various decision problems. This paper provides conceptual, descriptive, and simulation results, concentrating chiefly on human resources and personnel selection factors. The research derives decision rules that are able to facilitate personnel selection and identifies several significant features, based on an empirical study conducted in an IT company in Iran.
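
A minimal sketch of the rough-set machinery behind such rule derivation: lower and upper approximations of a decision class under the indiscernibility relation. The candidate records are toy data.

```python
# Sketch: lower/upper approximations of the "hire" class; certain rules come
# from the lower approximation, possible rules from the upper one.
candidates = [
    ("high", "high", "hire"), ("high", "low", "hire"),
    ("high", "low", "no"),    ("low", "low", "no"),
]

def approximations(records, target):
    classes = {}
    for cond1, cond2, decision in records:        # indiscernibility classes
        classes.setdefault((cond1, cond2), []).append(decision)
    lower = {k for k, v in classes.items() if all(d == target for d in v)}
    upper = {k for k, v in classes.items() if any(d == target for d in v)}
    return lower, upper

lo, up = approximations(candidates, "hire")
print("certainly hire:", lo)
print("possibly hire:", up)
```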

Rapid Frequency Response Measurement of Power Conversion Products with Coherence-Based Confidence Analysis

Switched-mode converters now play a significant role in modern society. Their operation is often crucial in various electrical applications affecting everyday life; therefore, the quality of the converters needs to be reliably verified. Recent studies have shown that converters can be fully characterized by a set of frequency responses, which can be efficiently used to validate their proper operation. Consequently, several methods have been proposed to measure the frequency responses quickly and accurately, most often based on correlation techniques. These measurement methods are highly sensitive to external errors and system nonlinearities, a fact that has often been overlooked, and the necessary uncertainty analysis of the measured responses has been neglected. This paper presents a simple approach to analyzing the noise and nonlinearities in frequency-response measurements of switched-mode converters. Coherence analysis is applied to form a confidence interval characterizing the noise and nonlinearities involved in the measurements. The presented method is verified by practical measurements from a high-frequency switched-mode converter.
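
A hedged SciPy sketch of the analysis step: a Welch-averaged H1 frequency-response estimate, its magnitude-squared coherence, and the standard coherence-based random-error bound. The plant here is a toy filter, not a real converter, and the averaging count is a rough estimate.

```python
# Sketch: H1 frequency response + coherence-based confidence analysis.
import numpy as np
from scipy import signal

rng = np.random.default_rng(3)
fs, n = 1e4, 2 ** 16
x = rng.normal(size=n)                              # broadband excitation
b, a = signal.butter(2, 0.1)                        # stand-in "converter"
y = signal.lfilter(b, a, x) + 0.05 * rng.normal(size=n)  # output + noise

f, Pxy = signal.csd(x, y, fs=fs, nperseg=1024)
_, Pxx = signal.welch(x, fs=fs, nperseg=1024)
_, Cxy = signal.coherence(x, y, fs=fs, nperseg=1024)  # gamma^2

H = Pxy / Pxx                                       # H1 estimate
navg = n // 1024                                    # rough number of averages
rel_err = np.sqrt((1 - Cxy) / (2 * Cxy * navg))     # random-error bound on |H|
print("worst relative error: %.3f" % rel_err.max())
```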

Cognitive Landscape of Values – Understanding the Information Contents of Mental Representations

The values of managers and employees in organizations are phenomena that have captured the interest of researchers at large. Despite this attention, there continues to be a lack of agreement on what values are, how they influence individuals, and how they are constituted in individuals' minds. In this article, the content-based approach is presented as an alternative reference frame for exploring values. The content-based approach places human thinking in different contexts at its focal point: differences in valuations can be explained through the information contents of mental representations. In addition to the information contents, attention is devoted to the cognitive processes through which mental representations of values are constructed. Such informational contents play a decisive role in understanding human behavior. By applying content-based analysis to an examination of values as mental representations, it is possible to reach deeper into the motivational foundation of behaviors, such as decision making in organizational procedures, through understanding the structure and meanings of the specific values at play.

Robust Clustering with Dimension Reduction

Clustering is the process of identifying homogeneous groups of objects, called clusters, and is an interesting topic in data mining; the objects in a group behave similarly in their characteristics. This paper discusses a robust clustering process for image data with two dimension-reduction approaches: two-dimensional principal component analysis (2DPCA) and principal component analysis (PCA). A standard way to cope with high-dimensional data is dimension reduction, which transforms the data into a lower-dimensional space with limited loss of information, and one of the most common forms of dimensionality reduction is PCA. 2DPCA, often called a variant of PCA, treats the image matrices directly as 2D matrices: they do not need to be transformed into vectors, so the image covariance matrix can be constructed directly from the original image matrices. The decomposed classical covariance matrix is, however, very sensitive to outlying observations. The objective of this paper is to compare the performance of robust minimizing vector variance (MVV) in the two-dimensional projection (2DPCA) and in PCA for clustering arbitrary image data when outliers are hidden in the data set. Simulation results on robustness and an illustration of image clustering are discussed at the end of the paper.
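
A minimal numpy sketch of plain 2DPCA feature extraction follows (without the robust MVV step the paper adds); the images are toy data.

```python
# Sketch: 2DPCA builds the image covariance directly from 2D matrices and
# projects each image onto the top eigenvectors, with no vectorization.
import numpy as np

rng = np.random.default_rng(4)
images = rng.random((30, 32, 32))              # 30 toy images, 32x32 each
mean = images.mean(axis=0)

G = np.zeros((32, 32))                         # image covariance matrix
for A in images:
    G += (A - mean).T @ (A - mean)
G /= len(images)

vals, vecs = np.linalg.eigh(G)                 # ascending eigenvalues
X = vecs[:, -5:]                               # top 5 projection axes
features = [A @ X for A in images]             # each image -> 32x5 feature
print("feature shape:", features[0].shape)
```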