Text Mining Technique for Data Mining Application

Applying knowledge discovery techniques to unstructured text is termed knowledge discovery in text (KDT), text data mining, or text mining. The decision tree approach is most useful in classification problems. With this technique, a tree is constructed to model the classification process. There are two basic steps in the technique: building the tree and applying the tree to the database. This paper describes a proposed C5.0 classifier that applies rulesets, cross-validation and boosting to the original C5.0 in order to reduce the error rate. The feasibility and benefits of the proposed approach are demonstrated on a medical data set, hypothyroid. It is shown that the performance of a classifier on the training cases from which it was constructed gives a poor estimate of its accuracy on new cases. A better estimate is obtained by sampling or by using a separate test file; either way, the classifier is evaluated on cases that were not used to build it, and the estimate is reliable only when the numbers of cases used to build and to evaluate the classifier are both large. If the cases in hypothyroid.data and hypothyroid.test were shuffled and divided into a new 2772-case training set and a 1000-case test set, C5.0 might construct a different classifier with a lower or higher error rate on the test cases. An important feature of See5 is its ability to generate classifiers called rulesets. The ruleset has an error rate of 0.5% on the test cases. The standard errors of the means provide an estimate of the variability of results. One way to get a more reliable estimate of predictive accuracy is f-fold cross-validation. The error rate of a classifier produced from all the cases is estimated as the ratio of the total number of errors on the hold-out cases to the total number of cases. The Boost option with x trials instructs See5 to construct up to x classifiers in this manner. Trials over numerous datasets, large and small, show that on average 10-classifier boosting reduces the error rate for test cases by about 25%.
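
The cross-validation estimate described above (total hold-out errors divided by total cases) can be sketched in a few lines of Python. This is only an illustration: the function names, the way cases are assigned to folds, and the majority-class learner are assumptions made here, and the learner is a stand-in, not C5.0 or See5.

```python
from collections import Counter

def fold_indices(n_cases, f):
    """Split case indices 0..n_cases-1 into f blocks of nearly equal size."""
    return [list(range(i, n_cases, f)) for i in range(f)]

def cross_validation_error(cases, f, build, classify):
    """Estimate the error rate as total hold-out errors / total cases."""
    errors = 0
    for held_out in fold_indices(len(cases), f):
        held = set(held_out)
        # Build the classifier from every case outside the hold-out block.
        training = [c for i, c in enumerate(cases) if i not in held]
        model = build(training)
        # Count mistakes only on the cases that were held out.
        errors += sum(1 for i in held_out
                      if classify(model, cases[i][0]) != cases[i][1])
    return errors / len(cases)

# Stand-in learner (NOT C5.0): always predicts the most common training class.
def build_majority(training):
    return Counter(label for _, label in training).most_common(1)[0][0]

def classify_majority(model, attributes):
    return model

# Invented toy data: 8 "negative" cases followed by 2 "positive" cases.
cases = [((i,), "negative") for i in range(8)] + \
        [((i,), "positive") for i in range(2)]
error_rate = cross_validation_error(cases, 5, build_majority, classify_majority)
```

Each case is held out exactly once, so the denominator is the full number of cases, as in the definition quoted in the abstract.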

Electronic Voting System using Mobile Terminal

Electronic voting (e-voting) over the Internet has recently been carried out in some nations and regions. It removes the spatial restriction that a voter must visit the polling place in person, but it requires a computer with an Internet connection. It also requires an access code for e-voting, obtained through the voter's advance registration. To minimize these disadvantages, we propose a method in which a voter who has a wireless certificate issued in advance uses his or her own cellular phone for e-voting without special registration for the vote. Our proposal allows a voter to cast a vote simply and conveniently without restrictions of time and location, thereby increasing the voting rate while ensuring confidentiality and anonymity.

A Hidden Markov Model for Modeling Pavement Deterioration under Incomplete Monitoring Data

In this paper, the potential use of an exponential hidden Markov model to model a hidden pavement deterioration process, i.e. one that is not directly measurable, is investigated. It is assumed that the evolution of the physical condition, which is the hidden process, and the evolution of the values of pavement distress indicators can be adequately described using discrete condition states and modeled as Markov processes. It is also assumed that condition data can be collected by visual inspections over time and represented continuously using an exponential distribution. The advantage of using such a model in the decision-making process is illustrated through an empirical study using real-world data.

Kinetics of Hydrodesulphurization of Diesel: Mass Transfer Aspects

In order to meet environmental norms, Indian fuel policy aims at producing ultra low sulphur diesel (ULSD) in the near future. A catalyst for meeting such requirements has been developed, and the kinetics of this catalytic process are being investigated. In the present investigation, the effect of mass transfer on the kinetics of ultra deep hydrodesulphurization (UDHDS) to produce ULSD has been studied to determine the intrinsic kinetics over a pre-sulphided catalyst. Experiments have been carried out in a continuous flow micro reactor operated in the temperature range of 330 to 360 °C, at a WHSV of 1 hr⁻¹ and a pressure of 35 bar; a rate expression has been derived and its parameters estimated. Based on the derived rate expression and estimated parameters, the optimum operating range has been determined for this UDHDS catalyst to obtain ULSD product.

A Modularized Design for Multi-Drivers Off-Road Vehicle Driving-Line and its Performance Assessment

A modularized design approach can facilitate the modeling of complex systems and support behavior analysis and simulation in an iterative, and thus complex, engineering process by using encapsulated submodels of components and of their interfaces. It can therefore improve design efficiency and simplify the solution of complicated problems. A multi-driver off-road vehicle is comparatively complicated. The driving-line is a core part of a vehicle and contributes significantly to its performance. Multi-driver off-road vehicles have a complex driving-line, so their performance is heavily dependent on it. A typical off-road vehicle's driving-line system consists of a torque converter, transmission, transfer case and driving axles, which transfer the power generated by the engine and distribute it effectively to the driving wheels according to the road condition. Based on these main functions, this paper puts forward a modularized approach for the design and evaluation of a vehicle's driving-line. It can be used to estimate the performance of the driving-line effectively during the concept design stage. Through appropriate analysis and assessment methods, an optimal design can be reached. The method has been applied to practical vehicle design; it improves design efficiency and makes it convenient to assess and validate the performance of a vehicle, especially a multi-driver off-road vehicle.

Probabilistic Modelling of Marine Bridge Deterioration

Chloride-induced corrosion of steel reinforcement is the main cause of deterioration of reinforced concrete marine structures. This paper investigates the relative performance of alternative repair options with respect to the deterioration of reinforced concrete bridge elements in marine environments. Focus is placed on the initiation phase of reinforcement corrosion. A laboratory study is described which involved exposing concrete samples to accelerated chloride-ion ingress. The study examined the relative efficiencies of two repair methods, namely Ordinary Portland Cement (OPC) concrete and a concrete which utilised Ground Granulated Blastfurnace Slag (GGBS) as a partial cement replacement. The mix designs and materials utilised were identical to those implemented in the repair of a marine bridge on the South East coast of Ireland in 2007. The results of this testing regime serve to inform input variables employed in probabilistic modelling of deterioration for subsequent reliability-based analysis to compare the relative performance of the studied repair options.

Parallel Direct Integration Variable Step Block Method for Solving Large System of Higher Order Ordinary Differential Equations

The aim of this paper is to investigate the performance of the developed two-point block method, designed for two processors, for directly solving large non-stiff systems of higher order ordinary differential equations (ODEs). The method calculates the numerical solution at two points simultaneously and produces two new equally spaced solution values within a block, and it is possible to assign the computational tasks at each time step to a single processor. The algorithm was implemented in the C language, and the parallel computation was done in a parallel shared-memory environment. Numerical results are given to compare the efficiency of the developed method with the sequential timing. For large problems, the parallel implementation produced a speed-up of 1.95 and 98% efficiency on the two processors.
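
The relationship between the reported speed-up and efficiency can be checked with a short Python sketch. It assumes the usual definitions (speed-up as sequential time over parallel time, efficiency as speed-up over processor count); the abstract does not state which definitions the authors used, and the function names are chosen here for illustration.

```python
def speedup(sequential_time, parallel_time):
    """Speed-up = sequential execution time / parallel execution time."""
    return sequential_time / parallel_time

def efficiency(speedup_value, processors):
    """Efficiency = speed-up / number of processors."""
    return speedup_value / processors

# With the reported speed-up of 1.95 on two processors:
reported_efficiency = efficiency(1.95, 2)
```

Under these definitions a speed-up of 1.95 on two processors corresponds to an efficiency of about 97.5%, which is consistent with the 98% quoted above after rounding.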

Neural Network Based Icing Identification and Fault Tolerant Control of an A340 Aircraft

This paper presents a Neural Network (NN) identification of icing parameters in an A340 aircraft and a reconfiguration technique to keep the aircraft (A/C) performance close to the performance prior to icing. Five aircraft parameters are assumed to be considerably affected by icing. The off-line training for identifying the clear and iced dynamics is based on the Levenberg-Marquardt backpropagation algorithm. The icing parameters are located in the system matrix. The icing is assumed to be physically located on the right and left wings. The reconfiguration is based on the technique known as the control mixer approach or pseudo-inverse technique. This technique generates the new control input vector such that the A/C dynamics are not much affected by icing. In the simulations, the longitudinal and lateral dynamics of an Airbus A340 aircraft model are considered, and the stability derivatives affected by icing are identified. The simulation results show successful NN identification of the icing parameters and reconfigured flight dynamics with performance similar to that before icing. In other words, the destabilizing icing effect is compensated.

From Hype to Ignorance – A Review of 30 Years of Lean Production

Lean production (or lean management) gained popularity in several waves. The last three decades have been filled with numerous attempts to apply these concepts in companies. However, this has only been partially successful. The roots of lean production can be traced back to Toyota's just-in-time production. This concept, which according to Womack's, Jones' and Roos' research at MIT was employed by Japanese car manufacturers, became popular under its international names “lean production” and “lean manufacturing”, and was termed “Schlanke Produktion” in Germany. This contribution reviews lean production in Germany over the last thirty years: development, trial and error, and implementation.

Capacity Enhancement in Wireless Networks using Directional Antennas

One of the biggest drawbacks of the wireless environment is its limited bandwidth, yet the number of users sharing this bandwidth has been increasing considerably. The SDMA technique, which entails using directional antennas, increases the capacity of a wireless network by separating users in the medium. This paper shows how capacity can be enhanced while the mean delay is reduced by using directional antennas in wireless networks employing a TDMA/FDD MAC. Computer modeling and simulation of the wireless system studied are carried out using OPNET Modeler. Preliminary simulation results are presented, and the performance of the model using directional antennas is evaluated and compared with that of the model using omnidirectional antennas.

Leaf Chlorophyll of Corn, Sweet basil and Borage under Intercropping System in Weed Interference

Intercropping is one of the practices of sustainable agriculture. The SPAD meter can be used to predict the nitrogen index reliably; it may also be a useful tool for assessing the relative impact of weeds on crops. In order to study the effect of weeds on SPAD in corn (Zea mays L.), sweet basil (Ocimum basilicum L.) and borage (Borago officinalis L.) in an intercropping system, a factorial experiment was conducted in three replications in 2011. The experimental factors were intercropping of corn with sweet basil and borage in different ratios (100:0, 75:25, 50:50, 25:75 and 0:100 corn: borage or sweet basil) and weed infestation (weed control and weed interference). The results showed that intercropping of corn with sweet basil and borage increased the SPAD value of corn compared to monoculture under weed interference. The sweet basil SPAD value in the weed control treatments (43.66) was higher than in the weed interference treatments (40.17). Corn increased the borage SPAD value compared to monoculture in the weed interference treatments.

Evolutionary Approach for Automated Discovery of Censored Production Rules

In the recent past, there has been an increasing interest in applying evolutionary methods to Knowledge Discovery in Databases (KDD), and a number of successful applications of Genetic Algorithms (GA) and Genetic Programming (GP) to KDD have been demonstrated. The most predominant representation of the discovered knowledge is the standard Production Rule (PR) in the form If P Then D. PRs, however, are unable to handle exceptions and do not exhibit variable precision. Censored Production Rules (CPRs), an extension of PRs proposed by Michalski & Winston, exhibit variable precision and support an efficient mechanism for handling exceptions. A CPR is an augmented production rule of the form If P Then D Unless C, where C (Censor) is an exception to the rule. Such rules are employed in situations in which the conditional statement 'If P Then D' holds frequently and the assertion C holds rarely. By using a rule of this type we are free to ignore the exception conditions when the resources needed to establish their presence are tight or there is simply no information available as to whether they hold or not. Thus, the 'If P Then D' part of the CPR expresses important information, while the Unless C part acts only as a switch and changes the polarity of D to ~D. This paper presents a classification algorithm based on an evolutionary approach that discovers comprehensible rules with exceptions in the form of CPRs. The proposed approach has a flexible chromosome encoding, where each chromosome corresponds to a CPR. Appropriate genetic operators are suggested and a fitness function is proposed that incorporates the basic constraints on CPRs. Experimental results are presented to demonstrate the performance of the proposed algorithm.
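
The behaviour of a rule of the form If P Then D Unless C can be illustrated with a small Python sketch. The class name, the `check_censor` flag and the bird/penguin example are all hypothetical, introduced here only to show the "switch" role of the censor; this is not the chromosome encoding or the algorithm proposed in the paper.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class CensoredRule:
    """If premise Then decision Unless censor (illustrative structure)."""
    premise: Callable[[dict], bool]
    decision: str
    # The censor may return True, False, or None when nothing is known.
    censor: Callable[[dict], Optional[bool]]

    def apply(self, facts, check_censor=True):
        """Return the decision, its negation, or None if the premise fails."""
        if not self.premise(facts):
            return None
        # The censor flips D to ~D only when it is checked and known to hold.
        if check_censor and self.censor(facts) is True:
            return "not " + self.decision
        # Censor ignored (resources tight) or unknown: fall back to 'If P Then D'.
        return self.decision

# Hypothetical rule: If bird Then flies Unless penguin.
rule = CensoredRule(
    premise=lambda facts: facts.get("bird", False),
    decision="flies",
    censor=lambda facts: facts.get("penguin"),
)
quick_answer = rule.apply({"bird": True, "penguin": True}, check_censor=False)
careful_answer = rule.apply({"bird": True, "penguin": True})
```

Skipping the censor gives the frequent-case answer cheaply, while checking it yields the exception, which mirrors the variable-precision behaviour described above.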

Blind Channel Estimation Based on URV Decomposition Technique for Uplink of MC-CDMA

In this paper, we investigate a blind channel estimation method for multi-carrier CDMA systems that uses a subspace decomposition technique. This technique exploits the orthogonality between the noise subspace and the received user codes to obtain the channel of each user. Previously we used the Singular Value Decomposition (SVD) for this purpose, but the SVD has high computational complexity. In this paper we therefore replace the SVD with the URV decomposition, which serves as an intermediary between the QR decomposition and the SVD, to track the noise subspace of the received data. The URV decomposition has almost the same estimation performance as the SVD but lower computational complexity.

The Analysis of the Software Industry in Thailand

The software industry has been considered a critical infrastructure for any nation. Several studies have indicated that national competitiveness increasingly depends upon Information and Communication Technology (ICT), and software is one of the major components of ICT, important for both large and small enterprises. Even though there has been strong growth in the software industry in Thailand, the industry has faced many challenges and problems that need to be resolved. For example, the amount of pirated software has been rising, and Thailand still has a large gap in the digital divide. Additionally, the adoption among SMEs has been slow. This paper investigates various issues in the software industry in Thailand, using information acquired through analysis of secondary sources, observation, and focus groups. The results of this study can be used as “lessons learned” for the development of the software industry in any developing country.

A High Performance Technique in Harmonic Omitting Based on Predictive Current Control of a Shunt Active Power Filter

The correct operation of conventional active filters depends on the accuracy with which system distortion is identified. In addition, a suitable method for current injection and reactive power compensation increases filter performance. For this reason, this paper presents a method based on predictive current control theory for shunt active filter applications. The harmonics of the load current are identified by applying the 0–d–q reference frame to the load current and eliminating the DC part of the d–q components. The remaining components are then delivered to the predictive current controller as a three-phase reference current by using the inverse Park transformation. The system is modeled in the discrete time domain. The proposed method has been tested using a MATLAB model for a nonlinear load (with Total Harmonic Distortion = 20%). The simulation results indicate that the proposed filter results in a sinusoidal current (THD = 0.15%) flowing through the source. In addition, the results show that the filter tracks the reference current accurately.
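
For readers unfamiliar with the THD figures quoted above, the following Python sketch shows the standard definition of total harmonic distortion as the ratio of the combined harmonic RMS content to the fundamental RMS value. The function name and the harmonic amplitudes are invented for illustration; they are not taken from the paper's MATLAB model.

```python
import math

def total_harmonic_distortion(fundamental_rms, harmonic_rms):
    """THD = sqrt(sum of squared harmonic RMS values) / fundamental RMS."""
    if fundamental_rms <= 0:
        raise ValueError("fundamental RMS must be positive")
    return math.sqrt(sum(h * h for h in harmonic_rms)) / fundamental_rms

# Invented example: a 25 A fundamental with 4 A and 3 A harmonic components.
example_thd = total_harmonic_distortion(25.0, [4.0, 3.0])
```

With these invented values the ratio is 0.2, the same order as the 20% load distortion mentioned in the abstract.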

Probabilistic Model Development for Project Performance Forecasting

In this paper, a model for forecasting project cost performance is developed based on past project cost and time performance. The study presents a probabilistic project control concept to assure an acceptable forecast of project cost performance. In this concept, project activities are classified into sub-groups called control accounts. A Stochastic S-Curve (SS-Curve) is then obtained for each sub-group, and the project SS-Curve is obtained by summing the sub-groups' SS-Curves. In this model, project cost uncertainties are considered through Beta distribution functions of the activity costs required to complete the project at each selected time section over the course of the project; these are extracted from a variety of sources. Based on this model, after a percentage of project progress, project performance is measured via Earned Value Management to adjust the initial cost probability distribution functions. The future project cost performance is then predicted using the Monte Carlo simulation method.
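
The combination of Beta-distributed costs and Monte Carlo simulation can be sketched in Python as follows. This is a minimal illustration at a single time section only: the control account names, cost ranges and shape parameters are invented, and the Earned Value Management update of the distributions is not modeled.

```python
import random

# Hypothetical control accounts: (name, min cost, max cost, alpha, beta).
# All figures are invented for illustration only.
CONTROL_ACCOUNTS = [
    ("earthworks", 80.0, 140.0, 2.0, 5.0),
    ("structure", 200.0, 320.0, 2.0, 3.0),
    ("finishing", 50.0, 90.0, 3.0, 3.0),
]

def sample_project_cost(rng, accounts):
    """Draw one total project cost by summing one Beta draw per account."""
    total = 0.0
    for _name, low, high, alpha, beta in accounts:
        # betavariate returns a value in [0, 1]; rescale it to [low, high].
        total += low + (high - low) * rng.betavariate(alpha, beta)
    return total

def simulate(accounts, runs=10_000, seed=1):
    """Return the sorted total costs from `runs` Monte Carlo draws."""
    rng = random.Random(seed)
    return sorted(sample_project_cost(rng, accounts) for _ in range(runs))

def percentile(sorted_values, p):
    """Nearest-rank percentile of an already sorted list, 0 < p <= 100."""
    rank = max(1, -(-len(sorted_values) * p // 100))  # ceiling division
    return sorted_values[int(rank) - 1]

totals = simulate(CONTROL_ACCOUNTS)
p50 = percentile(totals, 50)
p80 = percentile(totals, 80)
```

Repeating the simulation at each selected time section would give one cost distribution per section, from which percentile curves comparable to an SS-Curve could be drawn.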

Non-negative Principal Component Analysis for Face Recognition

Principal component analysis is often combined with state-of-the-art classification algorithms to recognize human faces. However, principal component analysis can only capture those features contributing to the global characteristics of data because it is a global feature selection algorithm. It misses those features contributing to the local characteristics of data because each principal component only contains some levels of global characteristics of data. In this study, we present a novel face recognition approach using non-negative principal component analysis, which adds a non-negativity constraint to improve data locality and help elucidate latent data structures. Experiments are performed on the Cambridge ORL face database. We demonstrate the strong performance of the algorithm in recognizing human faces in comparison with PCA and NREMF approaches.

Designing and Implementing an Innovative Course about World Wide Web, Based on the Conceptual Representations of Students

The Internet is nowadays included in all national curricula of the elementary school. A comparative study of their goals leads to the conclusion that a complete curriculum should aim at students' acquisition of the abilities to navigate and search for information, and should additionally emphasize the evaluation of the information provided by the World Wide Web. In a constructivist knowledge framework, the design of a course has to take into consideration the conceptual representations of students. This paper presents the conceptual representations of the World Wide Web held by eleven-year-old students attending the sixth grade of Greek elementary school, and their use in the design and implementation of an innovative course.

Solving Part Type Selection and Loading Problem in Flexible Manufacturing System Using Real Coded Genetic Algorithms – Part I: Modeling

This paper and its companion (Part 2) deal with the modeling and optimization of two NP-hard problems in production planning of a flexible manufacturing system (FMS): the part type selection problem and the loading problem. The two problems are strongly related and heavily influence the system's efficiency and productivity. They become more complex when operation flexibilities, such as the possibility of processing an operation on alternative machines with alternative tools, are considered. These problems have been modeled and solved simultaneously by using real coded genetic algorithms (RCGA), which use an array of real numbers as the chromosome representation. These real numbers can be converted into a part type sequence and the machines that are used to process the part types. This first part focuses on the modeling of the problems and discusses how the novel chromosome representation can be applied to solve them. The second part will discuss the effectiveness of the RCGA in solving various test bed problems.
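
One way an array of real numbers could be converted into a part type sequence and machine choices is sketched below in Python. The abstract does not give the actual decoding rule, so this sketch assumes a random-key style scheme: the first half of the genes is sorted to order the part types, and the second half is scaled to pick a machine index. The function name and gene layout are hypothetical.

```python
def decode_chromosome(genes, n_part_types, n_machines):
    """Decode real-valued genes in [0, 1] into (part type, machine) pairs.

    Assumed layout (not from the paper): genes[:n_part_types] are sort keys
    for the part type sequence; genes[n_part_types:] select a machine for
    each part type.
    """
    if len(genes) != 2 * n_part_types:
        raise ValueError("expected two genes per part type")
    order_keys = genes[:n_part_types]
    machine_keys = genes[n_part_types:]
    # A smaller key means the part type is placed earlier in the sequence.
    sequence = sorted(range(n_part_types), key=lambda p: order_keys[p])
    plan = []
    for part in sequence:
        # Scale the gene to a machine index; clamp so a gene of 1.0 stays valid.
        machine = min(int(machine_keys[part] * n_machines), n_machines - 1)
        plan.append((part, machine))
    return plan

# Three part types (0..2) and two machines (0..1), with invented gene values.
plan = decode_chromosome([0.7, 0.1, 0.4, 0.95, 0.30, 0.55], 3, 2)
```

A decoding of this kind keeps every real-valued chromosome feasible as a sequence, so standard real-coded crossover and mutation can be applied without repair steps.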

Quasi Multi-Pulse Back-to-Back Static Synchronous Compensator Employing Line Frequency Switching 2-Level GTO Inverters

A back-to-back static synchronous compensator (BtBSTATCOM) consists of two back-to-back voltage-source converters (VSCs) with a common DC link in a substation. This configuration extends the capabilities of the conventional STATCOM so that bidirectional active power transfer from one bus to another is possible. In this paper, the VSCs are designed in PSCAD/EMTDC in quasi multi-pulse form, in which GTOs are triggered only once per cycle. The design details of the VSCs as well as the gate switching circuits and controllers are fully presented. The regulation modes of the BtBSTATCOM are verified and tested on a multi-machine power system through different simulation cases. The results, presented in the form of typical time responses, show that practical PI controllers are largely robust and stable during start-up, set-point changes, and line faults.