Prediction of a Human Facial Image by ANN using Image Data and its Content on Web Pages

Choosing the right metadata is critical, as good information (metadata) attached to an image makes it easier to find among a pile of other images. The image's value is enhanced not only by the quality of the attached metadata but also by the search technique. This study proposes a simple but efficient technique to predict a single human image from a website using the basic image data and the embedded metadata of the image's content appearing on web pages. The result is very encouraging, with a prediction accuracy of 95%. This technique may be of great assistance to librarians, researchers and many others in automatically and efficiently identifying a set of human images within a larger set of images.

Social and Economic Effects of Mining Industry Restructuring in Romania - Case Studies

As in other countries of Central and Eastern Europe, the economic restructuring that occurred in the last decade of the twentieth century affected the mining industry in Romania, an oversized and heavily subsidized sector before 1989. More than a decade after the beginning of mining restructuring, an evaluation of the current social implications of the process is required, together with an efficiency analysis of the adaptation mechanisms developed at governmental level. This article aims to provide an insight into these issues through case studies conducted in the most important coal basin of Romania, the Petroşani Depression.

Use of Regression Analysis in Determining the Length of Plastic Hinge in Reinforced Concrete Columns

The basic objective of this study is to develop a regression analysis method that can estimate the length of a plastic hinge, an important design parameter, by making use of the outcomes (lateral load-lateral displacement hysteretic curves) of experimental studies conducted on square reinforced concrete columns. To this end, the results of 170 different square reinforced concrete column tests have been collected from the existing literature. The parameters thought to affect the plastic hinge length, such as cross-section properties, material properties, axial load level, confinement of the column, and longitudinal reinforcement bars, have been obtained from these 170 tests. In the study, regression analyses for determining the plastic hinge length from the experimental test results have been separately tested and compared with each other. In addition, the outcomes of these methods for determining the plastic hinge length of reinforced concrete columns have been compared with other methods available in the literature.

Idiopathic Constipation can be Subdivided in Clinical Subtypes: Data Mining by Cluster Analysis on a Population based Study

The prevalence of non-organic constipation differs from country to country, and the reliability of the estimated rates is uncertain. Moreover, the clinical relevance of subdividing the heterogeneous functional constipation disorders into pre-defined subgroups is largely unknown. Aim: to estimate the prevalence of constipation in a population-based sample and determine whether clinical subgroups can be identified. An age- and gender-stratified sample population from 5 Italian cities was evaluated using a previously validated questionnaire. Data mining by cluster analysis was used to determine constipation subgroups. Results: 1,500 complete interviews were obtained from 2,083 contacted households (72%). Self-reported constipation correlated poorly with symptom-based constipation, which was found in 496 subjects (33.1%). Cluster analysis identified four constipation subgroups, which correlated with subgroups identified according to pre-defined symptom criteria. Significant differences in socio-demographics and lifestyle were observed among subgroups.
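The two percentages quoted in the abstract can be checked from the reported counts. The short sketch below only reproduces that arithmetic; the variable names are illustrative and do not come from the study.

```python
# Counts as reported in the abstract; names are illustrative only.
contacted_households = 2083
complete_interviews = 1500
symptom_based_cases = 496

# Share of contacted households that yielded a complete interview.
response_rate = complete_interviews / contacted_households
# Share of interviewed subjects with symptom-based constipation.
prevalence = symptom_based_cases / complete_interviews

print(f"response rate: {response_rate:.1%}")
print(f"symptom-based prevalence: {prevalence:.1%}")
```

Both values round to the figures given in the text (72% and 33.1%).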

Combining Bagging and Boosting

Bagging and boosting are among the most popular resampling ensemble methods that generate and combine a diversity of classifiers using the same learning algorithm for the base classifiers. Boosting algorithms are considered stronger than bagging on noise-free data. However, there are strong empirical indications that bagging is much more robust than boosting in noisy settings. For this reason, in this work we built an ensemble using a voting methodology of bagging and boosting ensembles with 10 sub-classifiers in each one. We performed a comparison with simple bagging and boosting ensembles with 25 sub-classifiers, as well as other well-known combining methods, on standard benchmark datasets, and the proposed technique was the most accurate.
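To make the idea of voting between a bagging ensemble and a boosting ensemble concrete, here is a minimal pure-Python sketch. It is not the authors' implementation: the base learner (a decision stump), the AdaBoost-style weighting, the averaging of normalized ensemble scores as the voting rule, and the tiny noise-free toy data are all assumptions chosen to keep the example short.

```python
import math
import random

def stump_predict(stump, row):
    """A stump predicts `pol` when the feature is >= threshold, else -pol."""
    f, t, pol = stump
    return pol if row[f] >= t else -pol

def fit_stump(X, y, w):
    """Return (stump, weighted_error) for the best one-feature threshold rule."""
    best_err, best_stump = None, None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            for pol in (1, -1):
                stump = (f, t, pol)
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if stump_predict(stump, row) != yi)
                if best_err is None or err < best_err:
                    best_err, best_stump = err, stump
    return best_stump, best_err

def bagging(X, y, n_models, rng):
    """Train each stump on a bootstrap sample; every stump gets weight 1."""
    n = len(X)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        stump, _ = fit_stump([X[i] for i in idx], [y[i] for i in idx], [1.0 / n] * n)
        models.append((1.0, stump))
    return models

def boosting(X, y, n_models):
    """AdaBoost-style loop: reweight examples, weight each stump by alpha."""
    n = len(X)
    w = [1.0 / n] * n
    models = []
    for _ in range(n_models):
        stump, err = fit_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0) on perfect stumps
        alpha = 0.5 * math.log((1 - err) / err)
        models.append((alpha, stump))
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return models

def score(models, row):
    """Normalized weighted vote of one ensemble, in [-1, 1]."""
    return sum(a * stump_predict(s, row) for a, s in models) / sum(a for a, _ in models)

def vote(ensembles, row):
    """Assumed voting rule: average the ensemble scores and take the sign."""
    avg = sum(score(m, row) for m in ensembles) / len(ensembles)
    return 1 if avg >= 0 else -1

# Toy one-feature data with labels in {-1, +1}; 10 sub-classifiers per ensemble.
X = [[i / 20] for i in range(20)]
y = [1 if row[0] >= 0.5 else -1 for row in X]
ensembles = [bagging(X, y, 10, random.Random(0)), boosting(X, y, 10)]
accuracy = sum(vote(ensembles, row) == yi for row, yi in zip(X, y)) / len(X)
print(f"training accuracy of the voted ensemble: {accuracy:.2f}")
```

On real, noisy data the two ensembles would disagree more often, which is where a combined vote could plausibly help; the toy data here only shows the mechanics.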

Data Mining Techniques in Computer-Aided Diagnosis: Non-Invasive Cancer Detection

Diagnosis can be achieved by building a model of a certain organ under surveillance and comparing it with the real-time physiological measurements taken from the patient. This paper presents the benefits of using data mining techniques in computer-aided diagnosis (CAD), focusing on cancer detection, in order to help doctors make optimal decisions quickly and accurately. In the field of non-invasive diagnosis techniques, endoscopic ultrasound elastography (EUSE) is a recent elasticity imaging technique that allows the difference between malignant and benign tumors to be characterized. Digitizing and summarizing the main features of the EUSE sample movies in vector form involves the use of exploratory data analysis (EDA). Neural networks are then trained on the corresponding EUSE sample movie vectors in such a way that these intelligent systems are able to offer a very precise and objective diagnosis, discriminating between benign and malignant tumors. A concrete application of these data mining techniques illustrates the suitability and the reliability of this methodology in CAD.

A Comparison between Heuristic and Meta-Heuristic Methods for Solving the Multiple Traveling Salesman Problem

The multiple traveling salesman problem (mTSP) can be used to model many practical problems. The mTSP is more complicated than the traveling salesman problem (TSP) because it requires determining which cities to assign to each salesman, as well as the optimal ordering of the cities within each salesman's tour. Previous studies proposed that Genetic Algorithm (GA), Integer Programming (IP) and several neural network (NN) approaches could be used to solve the mTSP. This paper compares the results for the mTSP solved with the Genetic Algorithm (GA) and the Nearest Neighbor Algorithm (NNA). The cities are clustered into a few groups using the k-means clustering technique. The number of groups depends on the number of salesmen. Then, each group is solved with NNA and GA as an independent TSP. It is found that k-means clustering combined with NNA is superior to GA in terms of performance (evaluated by the fitness function) and computing time.
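The cluster-then-route scheme described above can be sketched in a few lines of Python. This is an illustrative sketch, not the paper's code: the shared depot, the deterministic choice of initial k-means centers, the Euclidean distance and the toy coordinates are all assumptions made for the example.

```python
import math

def kmeans(points, k, iters=20):
    """Plain k-means; initial centers are evenly spaced input points (a simplification)."""
    step = len(points) // k
    centers = [points[i * step] for i in range(k)]
    groups = []
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centers[j]))
            groups[nearest].append(p)
        # Recompute each center as the mean of its group; keep the old one if empty.
        centers = [tuple(sum(axis) / len(g) for axis in zip(*g)) if g else centers[j]
                   for j, g in enumerate(groups)]
    return groups

def nearest_neighbor_tour(start, cities):
    """Greedy tour: from the current city, always move to the closest unvisited city."""
    tour = [start]
    unvisited = list(cities)
    while unvisited:
        nxt = min(unvisited, key=lambda c: math.dist(tour[-1], c))
        unvisited.remove(nxt)
        tour.append(nxt)
    return tour

def tour_length(tour):
    """Length of the closed tour, including the return leg to the start."""
    return sum(math.dist(tour[i], tour[(i + 1) % len(tour)]) for i in range(len(tour)))

# Hypothetical instance: 8 cities, 2 salesmen, one shared depot.
cities = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
depot = (5, 5)
groups = kmeans(cities, k=2)
tours = [nearest_neighbor_tour(depot, g) for g in groups]
for i, tour in enumerate(tours):
    print(f"salesman {i}: {tour} (length {tour_length(tour):.2f})")
```

Each group is treated as an independent TSP, so a GA (or any other TSP solver) could be substituted for `nearest_neighbor_tour` to reproduce the kind of comparison the abstract describes.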

Shunt Power Active Filter Control under Non-Ideal Voltages Conditions

In this paper, we propose the Modified Synchronous Detection (MSD) method for determining the reference compensating currents of the shunt active power filter under non-sinusoidal voltage conditions. A PI regulator is used to control the inverter switching. The numerical simulation results for the complete structure, obtained using the Power System Blockset (PSB) toolbox of Matlab, are presented and discussed.

A Multi-layer Artificial Neural Network Architecture Design for Load Forecasting in Power Systems

In this paper, the modelling and design of an artificial neural network architecture for load forecasting purposes is investigated. The primary prerequisite for power system planning is to arrive at realistic estimates of future power demand, which is known as load forecasting. Short Term Load Forecasting (STLF) helps in determining economic, reliable and secure operating strategies for the power system. The dependence of load on several factors makes load forecasting a very challenging job. An overestimation of the load may cause premature investment and unnecessary blocking of capital, whereas an underestimation of the load may result in a shortage of equipment and circuits. It is always better to plan the system for a load slightly higher than the expected one so that no exigency may arise. In this paper, a load-forecasting model is proposed using a multilayer neural network with an appropriately modified back-propagation learning algorithm. Once the neural network model is designed and trained, it can forecast the load of the power system 24 hours ahead on a daily basis and can also forecast the cumulative load on a daily basis. The real load data used for training the Artificial Neural Network was taken from LDC, Gujarat Electricity Board, Jambuva, Gujarat, India. The results show that the load forecast of the ANN model follows the actual load pattern accurately throughout the forecasted period.

Mathematical Model and Solution Algorithm for Containership Operation/Maintenance Scheduling

This study considers the problem of determining operation and maintenance schedules for a containership equipped with components during its sailing according to a pre-determined navigation schedule. The operation schedule, which specifies work time of each component, determines the due-date of each maintenance activity, and the maintenance schedule specifies the actual start time of each maintenance activity. The main constraints are component requirements, workforce availability, working time limitation, and inter-maintenance time. To represent the problem mathematically, a mixed integer programming model is developed. Then, due to the problem complexity, we suggest a heuristic for the objective of minimizing the sum of earliness and tardiness between the due-date and the starting time of each maintenance activity. Computational experiments were done on various test instances and the results are reported.
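The objective named above, the sum of earliness and tardiness between each due-date and the actual start time, can be written out directly. The sketch below is only an illustration of that objective; the class name, the field names, the time unit and the sample activities are hypothetical and not taken from the study.

```python
from dataclasses import dataclass

@dataclass
class MaintenanceActivity:
    name: str
    due: float    # due-date implied by the operation schedule (hours, assumed unit)
    start: float  # actual start time chosen by the maintenance schedule

def earliness_tardiness(activities):
    """Sum of earliness (start before due) and tardiness (start after due)."""
    total = 0.0
    for a in activities:
        earliness = max(0.0, a.due - a.start)
        tardiness = max(0.0, a.start - a.due)
        total += earliness + tardiness
    return total

# Hypothetical schedule: one early, one late, one on-time activity.
schedule = [
    MaintenanceActivity("pump overhaul", due=10.0, start=8.0),      # 2 h early
    MaintenanceActivity("filter change", due=20.0, start=25.0),     # 5 h late
    MaintenanceActivity("generator check", due=30.0, start=30.0),   # on time
]
print(earliness_tardiness(schedule))  # 7.0
```

A heuristic of the kind the abstract mentions would search over the `start` values, subject to the workforce and working-time constraints, to make this total as small as possible.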

An Identification Method of Geological Boundary Using Elastic Waves

This paper focuses on a technique for identifying the geological boundary of the ground strata in front of a tunnel excavation site using the first order adjoint method based on optimal control theory. The geological boundary is defined as the boundary between layers with different elastic moduli. In tunnel excavation, it is important to estimate the ground conditions ahead of the cutting face beforehand. Excavating into weak strata or fault fracture zones may prolong the construction work and cause human suffering. A theory for determining the geological boundary of the ground numerically is investigated, employing excavation blasts and their vibration waves as the observation references. According to optimal control theory, the performance function, described by the square sum of the residuals between computed and observed velocities, is minimized. The boundary layer is determined by minimizing the performance function. The elastic analysis governed by the Navier equation is carried out, treating the ground as an elastic body with linear viscous damping. To identify the boundary, the gradient of the performance function with respect to the geological boundary is calculated using the adjoint equation. The weighted gradient method is applied as the minimization algorithm. To solve the governing and adjoint equations, the Galerkin finite element method and the average acceleration method are employed for the spatial and temporal discretizations, respectively. With the method presented in this paper, the boundaries between three strata can be identified. For the numerical studies, the Suemune tunnel excavation site is employed. First, the blasting force is identified in order to improve the accuracy of the analysis. The geological boundary is then identified after the estimation of the blasting force. With this identification procedure, numerical analysis results that almost correspond with the observation data were obtained.
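As a hedged illustration of the "square sum of residuals" performance function mentioned in this abstract, one common way to write such a function is shown below. The symbols (J, the computed velocity v, the observed velocity v-hat, the time interval from t_0 to t_f, and the factor 1/2) are assumptions made for this sketch, not notation taken from the paper.

```latex
J = \frac{1}{2} \int_{t_0}^{t_f}
    \left( \mathbf{v}(t) - \hat{\mathbf{v}}(t) \right)^{\mathsf{T}}
    \left( \mathbf{v}(t) - \hat{\mathbf{v}}(t) \right) \, dt
```

Here v(t) would be the velocity computed at the observation points for a trial boundary position and v-hat(t) the measured velocity; the boundary is sought by minimizing J, with the gradient of J supplied by the adjoint equation.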

Powerful Tool to Expand Business Intelligence: Text Mining

With the extensive inclusion of documents, especially text, in business systems, data mining does not cover the full scope of Business Intelligence. Data mining cannot deliver its impact when extracting useful details from large collections of unstructured and semi-structured written materials based on natural languages. The most pressing issue is to draw the potential business intelligence from text. In order to gain competitive advantages for the business, it is necessary to develop a powerful new tool, text mining, to expand the scope of business intelligence. In this paper, we work out the strong points of text mining in extracting business intelligence from huge amounts of textual information sources within business systems. We apply text mining to each stage of Business Intelligence systems to show that text mining is a powerful tool for expanding the scope of BI. After reviewing basic definitions and some related technologies, we discuss their relationship and benefits to text mining. Some examples and applications of text mining are also given. The motivation is to develop a new approach to effective and efficient textual information analysis, and thus to expand the scope of Business Intelligence using text mining.

Determining the Minimum Threshold for the Functional Relatedness of Inner-Outer Class

An inner class is a specialized class that is defined within a regular outer class. It is used in some programming languages, such as Java, to carry out tasks related to its outer class. The functional relatedness between the inner class and the outer class is always the main concern when defining an inner class. However, excessive use of inner classes could undermine class cohesiveness. In addition, excessive inner classes make software maintenance and comprehension difficult. Our research aims at determining the minimum threshold for the functional relatedness of inner-outer classes. Such a minimum threshold is a guideline for removing or relocating excessive inner classes. Our research provides a feasible way for software developers to define inner classes which are functionally related to the outer class.

Order Partitioning in Hybrid MTS/MTO Contexts using Fuzzy ANP

The hybrid MTS/MTO production context is a novel concept for balancing the trade-off between make-to-stock and make-to-order. One of the most important decisions in the hybrid MTS/MTO environment is determining whether a product is manufactured to stock, to order, or under a hybrid MTS/MTO strategy. In this paper, a model based on the analytic network process is developed to tackle this decision. Since the decision deals with the uncertainty and ambiguity of data as well as experts' and managers' linguistic judgments, the proposed model is equipped with fuzzy set theory. An important attribute of the model is its generality, owing to the diverse decision factors which are elicited from the literature and developed by the authors. Finally, the model is validated by applying it to a real case study to show how it can actually be implemented.

Performance Comparison of Particle Swarm Optimization with Traditional Clustering Algorithms used in Self-Organizing Map

The self-organizing map (SOM) is a well-known data reduction technique used in data mining. It can reveal structure in data sets through data visualization that is otherwise hard to detect from raw data alone. However, interpretation through visual inspection is prone to errors and can be very tedious. There are several techniques for the automatic detection of clusters of code vectors found by SOM, but they generally do not take into account the distribution of code vectors; this may lead to unsatisfactory clustering and poor definition of cluster boundaries, particularly where the density of data points is low. In this paper, we propose the use of an adaptive heuristic particle swarm optimization (PSO) algorithm for finding cluster boundaries directly from the code vectors obtained from SOM. The application of our method to several standard data sets demonstrates its feasibility. The PSO algorithm utilizes the so-called U-matrix of SOM to determine cluster boundaries; the results of this novel automatic method compare very favorably to boundary detection through traditional algorithms, namely k-means and hierarchical approaches, which are normally used to interpret the output of SOM.

Classifier Based Text Mining for Neural Network

Applying knowledge discovery techniques to unstructured text is termed knowledge discovery in text (KDT), text data mining, or text mining. In neural networks that address classification problems, the training set, the testing set and the learning rate are key considerations: these are the collections of input/output patterns used to train the network and to assess its performance, and the rate at which adjustments are made. This paper describes a proposed back-propagation neural network classifier that performs cross-validation for the original neural network, in order to optimize classification accuracy and reduce training time. The feasibility and benefits of the proposed approach are demonstrated by means of five data sets: contact-lenses, cpu, weather symbolic, Weather, and labor-nega-data. It is shown that, compared to the existing neural network, training is more than 10 times faster when the data set is larger than cpu or the network has many hidden units, while accuracy ('percent correct') was the same for all data sets but contact-lenses, which is the only one with missing attributes. For contact-lenses, the accuracy of the proposed neural network was on average around 0.3% lower than that of the original neural network. This algorithm is independent of specific data sets, so many ideas and solutions can be transferred to other classifier paradigms.

Modified Data Mining Approach for Defective Diagnosis in Hard Disk Drive Industry

As the slider process in the hard disk drive industry becomes more complex, defect diagnosis for yield improvement becomes more complicated and time-consuming. Manufacturing data analysis with a data mining approach is widely used to solve this problem. The existing mining approach, which combines K-means clustering, the machine-oriented Kruskal-Wallis test and the multivariate chart, has been applied to defect diagnosis, but it is still a semi-automatic diagnosis system. This article aims to modify the algorithm so that the existing approach supports automatic decisions. Based on the research framework, the new approach can perform an automatic diagnosis and helps engineers find the defect factors about 50% faster than the existing approach.

Application of the Improved QFD Method Case Study: Kitchen Utensils Rack Design

This paper presents an application of the improved QFD method for determining the specifications of a kitchen utensils rack. By using the improved method, the subjectivity of the original QFD was reduced, particularly in defining the relationship between customer requirements and engineering characteristics. The regression analysis used to obtain the relationship functions between customer requirements and engineering characteristics also accommodated the inaccuracy of the competitive assessment results. The improved method, which is represented in the form of a mathematical model, served as formal guidance for allocating resources to improve the specifications of the kitchen utensils rack. The specifications obtained led to the highest feasible customer satisfaction.

Explorative Data Mining of Constructivist Learning Experiences and Activities with Multiple Dimensions

This paper discusses the use of explorative data mining tools that allow the educator to explore new relationships between reported learning experiences and actual activities, even if there are multiple dimensions with a large number of measured items. The underlying technology is based on the so-called Compendium Platform for Reproducible Computing (http://www.freestatistics.org), which was built on top of the computational R Framework (http://www.wessa.net).

Housing Defect of Newly Completed House: An Analysis Using Condition Survey Protocol (CSP) 1 Matrix

Housing is a basic human right. A new house should be free from any defects, even those that people normally consider 'cosmetic defects'. This paper studies the building defects of newly completed houses in 72 units of double-storey terraced houses located in Bangi, Selangor. The building survey was carried out using protocol 1 (visual inspection). For new houses, the survey work is very stringent in determining defect condition and priority. The survey and reporting procedure is based on the CSP1 Matrix, which involves a scoring system, photographs and plan tagging. The analysis is done using Statistical Package for Social Sciences (SPSS). The findings reveal that 2119 defects were recorded in the 72 terraced houses. The cumulative score obtained was 27644, while the overall rating was 13.05. These results indicate that the construction quality of the newly completed terraced houses is low and below the standard expected of a new house.
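The abstract does not state how the overall rating is derived, but the reported figures are consistent with dividing the cumulative score by the number of defects. The sketch below checks that arithmetic; treating the rating as this ratio is an assumption, and the variable names are illustrative.

```python
# Figures as reported in the abstract; names are illustrative only.
houses = 72
defects = 2119
cumulative_score = 27644

# Assumption: overall rating = cumulative score / number of defects.
overall_rating = cumulative_score / defects
defects_per_house = defects / houses

print(f"overall rating: {overall_rating:.2f}")
print(f"average defects per house: {defects_per_house:.1f}")
```

Under this assumption the ratio rounds to the reported 13.05, and the survey found roughly 29 defects per house on average.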