Text Mining Technique for Data Mining Application

Applying knowledge discovery techniques to unstructured text is termed knowledge discovery in text (KDT), text data mining, or text mining. The decision tree approach is most useful in classification problems. With this technique, a tree is constructed to model the classification process. There are two basic steps in the technique: building the tree and applying the tree to the database. This paper describes a proposed C5.0 classifier that adds rulesets, cross-validation and boosting to the original C5.0 in order to reduce the error rate. The feasibility and the benefits of the proposed approach are demonstrated by means of a medical data set, hypothyroid. It is shown that the performance of a classifier on the training cases from which it was constructed gives a poor estimate of its accuracy on new cases. Accuracy can instead be estimated by sampling or by using a separate test file; either way, the classifier is evaluated on cases that were not used to build it. This estimate is reliable only when the numbers of cases used to build and to evaluate the classifier are both large. If the cases in hypothyroid.data and hypothyroid.test were to be shuffled and divided into a new 2772-case training set and a 1000-case test set, C5.0 might construct a different classifier with a lower or higher error rate on the test cases. An important feature of See5 is its ability to generate classifiers called rulesets. The ruleset has an error rate of 0.5 % on the test cases. The standard errors of the means provide an estimate of the variability of results. One way to get a more reliable estimate of predictive accuracy is f-fold cross-validation. The error rate of a classifier produced from all the cases is estimated as the ratio of the total number of errors on the hold-out cases to the total number of cases. The Boost option with x trials instructs See5 to construct up to x classifiers in this manner. Trials over numerous datasets, large and small, show that on average 10-classifier boosting reduces the error rate for test cases by about 25%.
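The cross-validation estimate described in this abstract can be illustrated with a minimal Python sketch. It is not See5/C5.0 code: the function names, the toy data and the majority-class learner (a stand-in for a real tree learner) are all assumptions made for illustration. Only the final ratio, total hold-out errors over total cases, follows the abstract.

```python
import random


def cross_validation_error(cases, labels, train_fn, folds=10, seed=0):
    """Estimate the error rate by f-fold cross-validation.

    Each case is held out exactly once; the estimate is the total number
    of hold-out errors divided by the total number of cases.
    """
    indices = list(range(len(cases)))
    random.Random(seed).shuffle(indices)
    # Split the shuffled indices into `folds` blocks of near-equal size.
    blocks = [indices[i::folds] for i in range(folds)]
    total_errors = 0
    for held_out in blocks:
        held = set(held_out)
        train_idx = [i for i in indices if i not in held]
        classifier = train_fn([cases[i] for i in train_idx],
                              [labels[i] for i in train_idx])
        total_errors += sum(1 for i in held_out
                            if classifier(cases[i]) != labels[i])
    return total_errors / len(cases)


def train_majority(train_cases, train_labels):
    """Hypothetical stand-in learner: always predicts the majority class."""
    majority = max(set(train_labels), key=train_labels.count)
    return lambda case: majority


# Toy data (assumed): 90 "negative" and 10 "positive" cases.
toy_cases = list(range(100))
toy_labels = ["negative"] * 90 + ["positive"] * 10
estimate = cross_validation_error(toy_cases, toy_labels, train_majority)
print(f"cross-validated error rate: {estimate:.3f}")
```

Because every case is held out exactly once, the denominator is the full data set rather than the size of a single test file, which is what makes the estimate less dependent on one particular train/test split.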

Probabilistic Modelling of Marine Bridge Deterioration

Chloride-induced corrosion of steel reinforcement is the main cause of deterioration of reinforced concrete marine structures. This paper investigates the relative performance of alternative repair options with respect to the deterioration of reinforced concrete bridge elements in marine environments. Focus is placed on the initiation phase of reinforcement corrosion. A laboratory study is described which involved exposing concrete samples to accelerated chloride-ion ingress. The study examined the relative efficiencies of two repair materials, namely Ordinary Portland Cement (OPC) concrete and a concrete which utilised Ground Granulated Blastfurnace Slag (GGBS) as a partial cement replacement. The mix designs and materials utilised were identical to those implemented in the repair of a marine bridge on the South East coast of Ireland in 2007. The results of this testing regime inform the input variables of a probabilistic deterioration model, which is then used in a reliability-based analysis to compare the relative performance of the studied repair options.

Evolutionary Approach for Automated Discovery of Censored Production Rules

In the recent past, there has been an increasing interest in applying evolutionary methods to Knowledge Discovery in Databases (KDD), and a number of successful applications of Genetic Algorithms (GA) and Genetic Programming (GP) to KDD have been demonstrated. The predominant representation of the discovered knowledge is the standard Production Rule (PR) of the form If P Then D. PRs, however, are unable to handle exceptions and do not exhibit variable precision. Censored Production Rules (CPRs), an extension of PRs proposed by Michalski & Winston, exhibit variable precision and support an efficient mechanism for handling exceptions. A CPR is an augmented production rule of the form If P Then D Unless C, where C (the censor) is an exception to the rule. Such rules are employed in situations in which the conditional statement 'If P Then D' holds frequently and the assertion C holds rarely. With a rule of this type we are free to ignore the exception condition when the resources needed to establish its presence are tight or there is simply no information available as to whether it holds or not. Thus, the 'If P Then D' part of the CPR expresses the important information, while the 'Unless C' part acts only as a switch that changes the polarity of D to ~D. This paper presents a classification algorithm based on an evolutionary approach that discovers comprehensible rules with exceptions in the form of CPRs. The proposed approach has a flexible chromosome encoding, where each chromosome corresponds to a CPR. Appropriate genetic operators are suggested and a fitness function is proposed that incorporates the basic constraints on CPRs. Experimental results are presented to demonstrate the performance of the proposed algorithm.
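The semantics of a rule of the form If P Then D Unless C can be sketched in a few lines of Python. This is only an illustration of the rule form described in the abstract, not the paper's chromosome encoding or evolutionary algorithm; the class name, the `check_censor` flag and the bird/penguin example are assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CensoredProductionRule:
    """If P Then D Unless C (illustrative structure, not the paper's encoding)."""
    premise: Callable[[dict], bool]            # P
    decision: str                              # D
    censor: Callable[[dict], Optional[bool]]   # C; None means "unknown"

    def infer(self, facts: dict, check_censor: bool = True) -> Optional[str]:
        if not self.premise(facts):
            return None  # the rule does not fire
        # The censor is consulted only when resources allow it; if it is
        # known to hold, the polarity of D flips to ~D.
        if check_censor and self.censor(facts) is True:
            return "~" + self.decision
        return self.decision


# Hypothetical example: If bird Then flies Unless penguin.
rule = CensoredProductionRule(
    premise=lambda f: f.get("bird", False),
    decision="flies",
    censor=lambda f: f.get("penguin"),  # missing key -> None (no information)
)

print(rule.infer({"bird": True}))                                      # censor unknown
print(rule.infer({"bird": True, "penguin": True}))                     # censor holds
print(rule.infer({"bird": True, "penguin": True}, check_censor=False)) # censor ignored
```

Skipping the censor check trades certainty for cost: the 'If P Then D' part still yields the answer that holds in most cases.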

Non-negative Principal Component Analysis for Face Recognition

Principal component analysis is often combined with state-of-the-art classification algorithms to recognize human faces. However, principal component analysis can only capture those features contributing to the global characteristics of data because it is a global feature selection algorithm. It misses those features contributing to the local characteristics of data because each principal component only contains some levels of global characteristics of data. In this study, we present a novel face recognition approach using non-negative principal component analysis, which adds a non-negativity constraint to improve data locality and to help elucidate latent data structures. Experiments are performed on the Cambridge ORL face database. We demonstrate the strong performance of the algorithm in recognizing human faces in comparison with PCA and NREMF approaches.

Designing and Implementing an Innovative Course about World Wide Web, Based on the Conceptual Representations of Students

The Internet is nowadays included in all national curricula of the elementary school. A comparative study of their goals leads to the conclusion that a complete curriculum should aim at students' acquisition of the abilities to navigate and search for information, and should additionally emphasize the evaluation of the information provided by the World Wide Web. In a constructivist knowledge framework, the design of a course has to take into consideration the conceptual representations of students. This paper presents the conceptual representations of the World Wide Web held by eleven-year-old students attending the Sixth Grade of Greek Elementary School, and their use in the design and implementation of an innovative course.

Technique for Processing and Preservation of Human Amniotic Membrane for Ocular Surface Reconstruction

Human amniotic membrane (HAM) is a useful biological material for the reconstruction of damaged ocular surface. The processing and preservation of HAM is critical to protect patients undergoing amniotic membrane transplant (AMT) from cross infections. For HAM preparation, human placenta is obtained after an elective cesarean delivery. Before collection, the donor is screened for seronegativity for HCV, HBsAg, HIV and syphilis. After collection, the placenta is washed in balanced salt solution (BSS) in a sterile environment. The amniotic membrane is then separated from the placenta as well as the chorion while keeping the preparation in BSS. Scraping of the HAM is then carried out manually until all the debris is removed and a clear, transparent membrane is acquired. Nitrocellulose membrane filters are then placed on the stromal side of the HAM and cut around the edges, with a little membrane folded towards the other side to make it easy to separate during surgery. The HAM is finally stored in a solution of glycerine and Dulbecco's Modified Eagle Medium (DMEM) in a 1:1 ratio containing antibiotics. The capped borosil vials containing HAM are kept at -80°C until use. At the time of surgery, a vial is thawed to room temperature and opened under sterile operation theatre conditions.

3D Modeling of Temperature by Finite Element in Machining with Experimental Authorization

In the present paper, the three-dimensional temperature field of the tool during machining is determined and compared with experimental work on a C45 workpiece using carbide cutting tool inserts. During metal cutting operations, high temperature is generated at the tool cutting edge, which influences the rate of tool wear. Temperature is the most important characteristic of machining processes, since many parameters such as cutting speed, surface quality and cutting forces depend on the temperature, and high temperatures can cause high mechanical stresses which lead to early tool wear and reduce tool life. Therefore, considerable attention is paid to determining tool temperatures. The experiments are carried out under dry, orthogonal machining conditions. The results show that the increase of tool temperature depends on depth of cut and especially on cutting speed in the high range of cutting conditions.

Data Preprocessing for Supervised Learning

Many factors affect the success of Machine Learning (ML) on a given task. The representation and quality of the instance data is first and foremost. If much irrelevant and redundant information is present, or the data are noisy and unreliable, then knowledge discovery during the training phase is more difficult. It is well known that data preparation and filtering steps take a considerable amount of processing time in ML problems. Data pre-processing includes data cleaning, normalization, transformation, feature extraction and selection, etc. The product of data pre-processing is the final training set. It would be nice if a single sequence of data pre-processing algorithms had the best performance for every data set, but this is not the case. Thus, we present the best-known algorithms for each step of data pre-processing so that one can achieve the best performance for a given data set.
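Two of the pre-processing steps listed in this abstract, data cleaning and normalization, can be sketched as follows. This is a minimal illustration, not the survey's own algorithms: the function names and the sample column are assumptions, and min-max and z-score scaling are shown simply as two widely used normalization choices.

```python
from statistics import mean, pstdev


def drop_missing(values):
    """Cleaning step (simplest possible): discard missing entries."""
    return [v for v in values if v is not None]


def min_max_normalize(values):
    """Rescale values linearly to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant feature carries no information
    return [(v - lo) / (hi - lo) for v in values]


def z_score_normalize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical feature column with one missing value.
raw_column = [2.0, None, 4.0, 6.0]
clean_column = drop_missing(raw_column)
print(min_max_normalize(clean_column))
print(z_score_normalize(clean_column))
```

Which scaling works better depends on the learner and the data, which is in line with the abstract's point that no single pre-processing sequence is best for every data set.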

Effect of COD Loading Rate on Hydrogen Production from Alcohol Wastewater

The objective of this study was to investigate hydrogen production from alcohol wastewater by an anaerobic sequencing batch reactor (ASBR) under thermophilic operation. The ASBR unit used in this study had a liquid holding volume of 4 L and was operated at 6 cycles per day. The seed sludge, taken from an upflow anaerobic sludge blanket unit treating the same wastewater, was boiled at 95 °C for 15 min before being fed to the ASBR unit. The ASBR system was operated at different COD loading rates at a thermophilic temperature (55 °C) and a controlled pH of 5.5. When the system was operated under optimum conditions (providing maximum hydrogen production performance) at a feed COD of 60 000 mg/l and a COD loading rate of 68 kg/m3 d, the produced gas contained 43 % H2. Moreover, the hydrogen yield and the specific hydrogen production rate (SHPR) were 130 ml H2/g COD removed and 2100 ml H2/l d, respectively.

Novelty as a Measure of Interestingness in Knowledge Discovery

Rule discovery is an important technique for mining knowledge from large databases. Use of objective measures for discovering interesting rules leads to another data mining problem, although of reduced complexity. Data mining researchers have studied subjective measures of interestingness to reduce the volume of discovered rules and ultimately improve the overall efficiency of the KDD process. In this paper we study novelty of the discovered rules as a subjective measure of interestingness. We propose a hybrid approach based on both objective and subjective measures to quantify the novelty of the discovered rules in terms of their deviations from the known rules (knowledge). We analyze the types of deviation that can arise between two rules and categorize the discovered rules according to a user-specified threshold. We implement the proposed framework and experiment with some public datasets. The experimental results are promising.

Study on the Effect of Weight Percentage Variation and Size Variation of Magnesium Ferrosilicon Added, Gating System Design and Reaction Chamber Design on Inmold Process

This research focuses on the effect of weight percentage variation and size variation of the MgFeSi added, gating system design and reaction chamber design on the inmold process. With the inmold process, the well-known problem of fading is avoided because the liquid iron reacts with magnesium in the mold and not, as usual, in the ladle. During the pouring operation, liquid metal passes through the chamber containing the magnesium, where the reaction of the metal with magnesium proceeds in the absence of atmospheric oxygen [1]. In this paper, the microstructural characteristics of ductile iron obtained under these parameters are reported. The mechanisms of the inmold process are also described [2]. The data obtained from this research will assist in producing vehicle parts and other machinery parts for different industrial zones and government industries, and in transferring the technology to all industrial zones in Myanmar. The inmold technology offers many advantages over traditional treatment methods from a technical, an environmental and an economic point of view. The main objective of this research is to make the production of ductile iron castings easier and cheaper in all industrial sectors in Myanmar. It will also assist the sharing of knowledge and experience related to ductile iron production.

Product Configuration Strategy Based On Product Family Similarity

Platform-based product development offers substantial benefits in many industries for offering a large variety of products while maintaining low costs, high speed, and high quality in a mass customization product development environment. This paper proposes a product configuration strategy based on a similarity measure, incorporating knowledge engineering principles such as the product information model, ontology engineering, and formal concept analysis.

TRS: System for Recommending Semantic Web Service Composition Approaches

A large number of semantic web service composition approaches have been developed by the research community, and which one is more efficient depends on the particular situation of use. A close look at the requirements of one's particular situation is therefore necessary to find a suitable approach. In this paper, we present a Technique Recommendation System (TRS) which, using a classification of state-of-the-art semantic web service composition approaches, can provide the user with recommendations on which service composition approach to use, based on parameters describing the situation of use. TRS has a modular architecture and uses production rules for knowledge representation.

Solver for a Magnetic Equivalent Circuit and Modeling the Inrush Current of a 3-Phase Transformer

Knowledge about the magnetic quantities in a magnetic circuit is always of great interest. On the one hand, this information is needed for the simulation of a transformer. On the other hand, parameter studies are more reliable, if the magnetic quantities are derived from a well established model. One possibility to model the 3-phase transformer is by using a magnetic equivalent circuit (MEC). Though this is a well known system, it is often not an easy task to set up such a model for a large number of lumped elements which additionally includes the nonlinear characteristic of the magnetic material. Here we show the setup of a solver for a MEC and the results of the calculation in comparison to measurements taken. The equations of the MEC are based on a rearranged system of the nodal analysis. Thus it is possible to achieve a minimum number of equations, and a clear and simple structure. Hence, it is uncomplicated in its handling and it supports the iteration process. Additional helpful tasks are implemented within the solver to enhance the performance. The electric circuit is described by an electric equivalent circuit (EEC). Our results for the 3-phase transformer demonstrate the computational efficiency of the solver, and show the benefit of the application of a MEC.

Dynamic Action Induced By Walking Pedestrian

The main focus of this paper is on human-induced forces. Almost all existing force models for this type of load (defined either in the time or frequency domain) are developed from the assumption of perfect periodicity of the force and are based on force measurements conducted on rigid (i.e. high frequency) surfaces. To verify the conclusions of different authors, vertical pressure measurements during walking were performed using pressure gauges in various configurations. The obtained forces are analyzed using the Fourier transform. This load is often decisive in the design of footbridges. Design criteria and load models proposed by widely used standards and other researchers are introduced and compared.
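The kind of Fourier analysis mentioned in this abstract can be sketched with a plain discrete Fourier transform. The signal below is synthetic and entirely assumed (a 700 N static weight plus harmonics at an assumed 2 Hz pacing rate, sampled at 50 Hz); it is not the paper's measured data, and it is perfectly periodic, which is exactly the idealisation the abstract questions.

```python
import cmath
import math


def dft_amplitudes(samples):
    """Single-sided amplitude spectrum of a real signal via a direct DFT."""
    n = len(samples)
    amplitudes = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(samples))
        # DC and Nyquist bins are not doubled in a single-sided spectrum.
        is_edge = k == 0 or (n % 2 == 0 and k == n // 2)
        amplitudes.append(abs(coeff) * (1.0 if is_edge else 2.0) / n)
    return amplitudes


# Synthetic vertical force (all values assumed for illustration).
sample_rate = 50.0   # Hz
n_samples = 100      # 2 s record -> 0.5 Hz frequency resolution
pacing_rate = 2.0    # Hz
force = [
    700.0
    + 280.0 * math.sin(2 * math.pi * pacing_rate * t / sample_rate)
    + 70.0 * math.sin(2 * math.pi * 2 * pacing_rate * t / sample_rate)
    for t in range(n_samples)
]

amps = dft_amplitudes(force)
# Skip the DC bin (static weight) when looking for the dominant harmonic.
dominant_bin = max(range(1, len(amps)), key=lambda k: amps[k])
dominant_frequency = dominant_bin * sample_rate / n_samples
print(f"dominant frequency: {dominant_frequency} Hz")
```

For measured, imperfectly periodic walking forces the spectral peaks would be spread over neighbouring bins rather than concentrated in single lines as they are here.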

Segmentation of Breast Lesions in Ultrasound Images Using Spatial Fuzzy Clustering and Structure Tensors

Segmentation in ultrasound images is challenging due to the interference from speckle noise and the fuzziness of boundaries. In this paper, a segmentation scheme using fuzzy c-means (FCM) clustering that incorporates both intensity and texture information of images is proposed to extract breast lesions in ultrasound images. First, the nonlinear structure tensor, which helps refine the edges detected by intensity, is used to extract speckle texture. Then, a spatial FCM clustering is applied on the image feature space for segmentation. In experiments with simulated and clinical ultrasound images, the spatial FCM clustering with both intensity and texture information gives more accurate results than the conventional FCM or the spatial FCM without texture information.
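As background for the clustering step, the following is a minimal sketch of conventional fuzzy c-means on scalar features, i.e. the baseline the abstract compares against. It does not include the paper's spatial term or structure-tensor texture features; the function name, the toy intensity values and the initial centers are assumptions.

```python
def fuzzy_c_means(points, centers, m=2.0, iterations=50):
    """Conventional FCM on scalar features.

    Alternates the standard membership update
        u_ik = 1 / sum_j (d_ik / d_ij) ** (2 / (m - 1))
    with the weighted-mean center update. Returns (centers, memberships).
    """
    centers = list(centers)
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(iterations):
        memberships = []
        for x in points:
            dists = [abs(x - c) for c in centers]
            if 0.0 in dists:
                # A point sitting exactly on a center belongs fully to it.
                row = [1.0 if d == 0.0 else 0.0 for d in dists]
                total = sum(row)
                row = [r / total for r in row]
            else:
                row = [1.0 / sum((d_i / d_k) ** exponent for d_k in dists)
                       for d_i in dists]
            memberships.append(row)
        centers = [
            sum((u[k] ** m) * x for u, x in zip(memberships, points))
            / sum(u[k] ** m for u in memberships)
            for k in range(len(centers))
        ]
    return centers, memberships


# Toy "pixel intensities" (assumed): a dark group and a bright group.
intensities = [0.0, 0.1, 0.2, 0.9, 1.0, 1.1]
final_centers, final_memberships = fuzzy_c_means(intensities, centers=[0.0, 1.0])
labels = [max(range(len(row)), key=row.__getitem__) for row in final_memberships]
print(final_centers, labels)
```

Each pixel keeps a graded membership in every cluster, which is what makes the method tolerant of blurred lesion boundaries; a hard label is taken only at the end.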

Knowledge Continuity as a Part of Business Continuity Management

Today, intangible assets in the form of knowledge capital are the most important and most valuable resource for organizations. All employees have knowledge, regardless of the kind of job they do. Knowledge is thus an asset which influences business operations. The objective of this article is to identify knowledge continuity as an objective of business continuity management. The article has been prepared based on the analysis of secondary sources and the evaluation of primary data obtained by means of a quantitative survey conducted in the Czech Republic. The conclusion of the article is that organizations that apply business continuity management do not focus on the preservation of the knowledge of key employees. Organizations ensure knowledge continuity only intuitively, on a random basis, non-systematically and discontinuously. Failure to ensure knowledge continuity exposes organizations to the loss of key knowledge and can also negatively affect business continuity.

Trust Building Mechanisms for Electronic Business Networks and Their Relation to eSkills

Globalization, supported by information and communication technologies, changes the rules of competitiveness and increases the significance of information, knowledge and network cooperation. In line with this trend, the need for efficient trust-building tools has emerged. The absence of trust-building mechanisms and strategies has been identified in several studies. Through trust development, participation in e-business networks and usage of network services will increase and provide new economic benefits to SMEs. This work is focused on the development of effective trust-building strategies for electronic business network platforms. Based on the identification of trust-building mechanisms, a questionnaire-based analysis of their significance and minimum level of requirements was conducted. In the paper, we confirm the dependency of trust on e-Skills, which play a crucial role in achieving a higher level of trust in more sophisticated and complex trust-building ICT solutions.

Project Management in Student Satellite Projects: A University – Industry Collaboration View

This research contribution puts forward the idea of a collaborative environment for the execution of student satellite projects, against the backdrop of project management principles. The recent past has witnessed a technological shift in the aerospace industry from big satellite projects to small spacecraft, especially for earth observation and communication purposes. This shift has encouraged academia and industry to share their resources and to create a win-win paradigm of mutual success and technological development, along with human resource development in the field of aerospace. Small student satellites are the latest trend in academia: more than 100 CubeSat projects have been executed successfully all over the globe, and many new student satellite projects are in the development phase. Small satellite project management requires the application of specific knowledge, skills, tools and techniques to achieve the defined mission requirements. The authors present a detailed outline for the project management of student satellites and describe the role of industry in collaborating with academia to get optimized results in an academic environment.

Increasing Fishery Economic Added Value through Post Fishing Program: Cold Storage Program

The purpose of this paper is to guide efforts to improve the economic added value of Indonesian fisheries products through a post-fishing program, namely a cold storage program. Indonesia's fisheries potential has been acknowledged by the world. FAO (2009) stated that Indonesia is one of the ten largest producers of fishery products in the world. Based on BPS (Statistics Indonesia) data, national fisheries production in 2011 reached 5.714 million tons, of which 93.55% came from marine fisheries and 6.45% from open waters. Two-thirds of Indonesian territory consists of water, which has given enormous benefits to Indonesia, especially to its fishermen. Improving the economic level of fishermen requires efforts to develop fisheries business units. One of these efforts is improving the quality of products marketed at the regional and international levels. This certainly needs the support of various fishery facilities (from infrastructure to superstructure), one of which is cold storage. Given the many benefits of cold storage as a means of processing fishery resources, the Indonesia Maritime Security Coordinating Board (IMSCB), as one of the maritime institutions for maritime security and safety, has a program to empower coastal communities by encouraging the development of cold storage in middle and lower fishery business units. Developing cold storage facilities that are able to fulfil their role to the full requires synergistic efforts from various parties.