Abstract: Choosing the right metadata is critical, as good
information (metadata) attached to an image makes it easier to find
among a pile of other images. The image's value is enhanced
not only by the quality of the attached metadata but also by the
search technique. This study proposes a simple but efficient
technique to predict a single human image from a website using the
basic image data and the embedded metadata of the image's content
appearing on web pages. The result is very encouraging, with a
prediction accuracy of 95%. This technique may become a great
aid to librarians, researchers and many others for automatically and
efficiently identifying a set of human images out of a larger set of
images.
Abstract: As in other countries of Central and Eastern Europe,
the economic restructuring that occurred in the last decade of the
twentieth century affected the mining industry in Romania, an
oversized and heavily subsidized sector before 1989. More than
a decade after the beginning of mining restructuring, an evaluation
of the current social implications of the process is required, together
with an efficiency analysis of the adaptation mechanisms developed
at governmental level. This article aims to provide an insight into
these issues through case studies conducted in the most important
coal basin of Romania, the Petroşani Depression.
Abstract: The basic objective of this study is to create a regression
analysis method that can estimate the length of a plastic hinge, which
is an important design parameter, by making use of the outcomes
(lateral load-lateral displacement hysteretic curves) of experimental
studies conducted on square reinforced concrete columns. For
this purpose, the results of 170 different square reinforced concrete
column tests have been collected from the existing literature. The
parameters thought to affect the plastic hinge length, such as
cross-section properties, features of the materials used, axial loading
level, confinement of the column, and longitudinal reinforcement bars
in the columns, have been obtained from these 170 tests. In the study,
regression analyses for determining the plastic hinge length from the
experimental test results have been separately tested and compared
with each other. In addition, the outcomes of these methods for
determining the plastic hinge length of reinforced concrete
columns have been compared with other methods available in the
literature.
Abstract: The prevalence of non-organic constipation differs
from country to country and the reliability of the estimated rates is
uncertain. Moreover, the clinical relevance of subdividing the
heterogeneous functional constipation disorders into pre-defined
subgroups is largely unknown. Aim: to estimate the prevalence of
constipation in a population-based sample and determine whether
clinical subgroups can be identified. An age- and gender-stratified
sample population from 5 Italian cities was evaluated using a
previously validated questionnaire. Data mining by cluster analysis
was used to determine constipation subgroups. Results: 1,500
complete interviews were obtained from 2,083 contacted households
(72%). Self-reported constipation correlated poorly with
symptom-based constipation, which was found in 496 subjects (33.1%).
Cluster analysis identified four constipation subgroups which
correlated with subgroups identified according to pre-defined symptom
criteria. Significant differences in socio-demographics and lifestyle
were observed among subgroups.
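The two percentages reported above can be rechecked directly from the stated counts. The short sketch below does only that arithmetic; the variable names are illustrative and not taken from the study.

```python
def percentage(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)

# Counts as stated in the abstract.
contacted_households = 2083
complete_interviews = 1500
symptom_based_cases = 496

# Share of contacted households that yielded a complete interview.
response_rate = percentage(complete_interviews, contacted_households)
# Share of interviewed subjects with symptom-based constipation.
prevalence = percentage(symptom_based_cases, complete_interviews)

print(response_rate, prevalence)
```

Both values agree with the figures in the abstract (72% and 33.1%).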
Abstract: Bagging and boosting are among the most popular resampling ensemble methods that generate and combine a diversity of classifiers using the same learning algorithm for the base classifiers. Boosting algorithms are considered stronger than bagging on noise-free data. However, there are strong empirical indications that bagging is much more robust than boosting in noisy settings. For this reason, in this work we built an ensemble using a voting methodology of bagging and boosting ensembles with 10 sub-classifiers in each one. We compared it with simple bagging and boosting ensembles with 25 sub-classifiers, as well as with other well-known combining methods, on standard benchmark datasets, and the proposed technique was the most accurate.
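As a rough illustration of combining a bagging ensemble and a boosting ensemble by voting, the sketch below pools the class predictions of the sub-classifiers from both ensembles and takes a plurality vote. The abstract does not state which voting rule was used, so the pooling rule, the function names and the example predictions are all assumptions made for illustration.

```python
from collections import Counter

def majority_vote(labels):
    """Return the most common label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]

def vote_bagging_boosting(bagging_votes, boosting_votes):
    """Pool the member predictions of both ensembles and take a plurality vote."""
    return majority_vote(list(bagging_votes) + list(boosting_votes))

# Hypothetical predictions for one test instance from 10 + 10 sub-classifiers.
bagging_votes = ["spam"] * 7 + ["ham"] * 3
boosting_votes = ["spam"] * 4 + ["ham"] * 6

combined = vote_bagging_boosting(bagging_votes, boosting_votes)
print(combined)
```

In this made-up case the boosting members alone would predict "ham", but the pooled vote follows the larger overall count.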
Abstract: Diagnosis can be achieved by building a model of a
certain organ under surveillance and comparing it with the real-time
physiological measurements taken from the patient. This paper
presents the benefits of using data mining techniques
in computer-aided diagnosis (CAD), focusing on cancer
detection, in order to help doctors make optimal decisions quickly
and accurately. In the field of noninvasive diagnosis techniques,
endoscopic ultrasound elastography (EUSE) is a recent elasticity
imaging technique that allows characterizing the difference between
malignant and benign tumors. Digitizing and summarizing the main
features of the EUSE sample movies in vector form involves the use
of exploratory data analysis (EDA). Neural networks are then
trained on the corresponding EUSE sample movie vectors in
such a way that these intelligent systems are able to offer a very
precise and objective diagnosis, discriminating between benign and
malignant tumors. A concrete application of these data mining
techniques illustrates the suitability and the reliability of this
methodology in CAD.
Abstract: The multiple traveling salesman problem (mTSP) can be used to model many practical problems. The mTSP is more complicated than the traveling salesman problem (TSP) because it requires determining which cities to assign to each salesman, as well as the optimal ordering of the cities within each salesman's tour. Previous studies proposed that Genetic Algorithm (GA), Integer Programming (IP) and several neural network (NN) approaches could be used to solve the mTSP. This paper compares the results for the mTSP solved with the Genetic Algorithm (GA) and the Nearest Neighbor Algorithm (NNA). The cities are clustered into a few groups using the k-means clustering technique. The number of groups depends on the number of salesmen. Then, each group is solved with NNA and GA as an independent TSP. It is found that k-means clustering combined with NNA is superior to GA in terms of performance (evaluated by the fitness function) and computing time.
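The cluster-then-route idea described above can be sketched as follows: the cities are split into one k-means group per salesman, and each group is then toured with the nearest neighbor heuristic. This is a minimal sketch only. The common depot, the deterministic seeding of centroids with the first k cities, and the example coordinates are assumptions, and the GA side of the comparison is not shown.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means; the first k points seed the centroids."""
    centroids = [points[i] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        for c, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return clusters

def nearest_neighbor_tour(depot, cities):
    """Greedy tour: start at the depot, always visit the closest unvisited city."""
    tour, remaining, current = [depot], list(cities), depot
    while remaining:
        current = min(remaining, key=lambda c: math.dist(current, c))
        remaining.remove(current)
        tour.append(current)
    tour.append(depot)  # return to the depot to close the tour
    return tour

def tour_length(tour):
    """Total Euclidean length of a closed tour."""
    return sum(math.dist(a, b) for a, b in zip(tour, tour[1:]))

# Hypothetical instance: two salesmen, six cities, shared depot at the origin.
depot = (0.0, 0.0)
cities = [(1, 1), (9, 9), (2, 1), (8, 9), (1, 2), (9, 8)]
tours = [nearest_neighbor_tour(depot, group) for group in kmeans(cities, k=2)]

for tour in tours:
    print(tour, round(tour_length(tour), 2))
```

Each group is solved as an independent TSP, so the total cost is simply the sum of the individual tour lengths.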
Abstract: In this paper, we propose the Modified Synchronous Detection (MSD) method for determining the reference compensating currents of the shunt active power filter under non-sinusoidal voltage conditions. A PI regulator is used to control the inverter switching. Numerical simulation results for the complete structure, obtained with the Power System Blockset (PSB) toolbox of Matlab, are presented and discussed.
Abstract: In this paper, the modelling and design of an artificial neural network architecture for load forecasting purposes is investigated. The primary prerequisite for power system planning is to arrive at realistic estimates of future power demand, which is known as load forecasting. Short Term Load Forecasting (STLF) helps in determining economic, reliable and secure operating strategies for the power system. The dependence of load on several factors makes load forecasting a very challenging job. An overestimation of the load may cause premature investment and unnecessary blocking of capital, whereas an underestimation of the load may result in a shortage of equipment and circuits. It is always better to plan the system for a load slightly higher than the expected one so that no exigency may arise. In this paper, a load forecasting model is proposed using a multilayer neural network with an appropriately modified back propagation learning algorithm. Once the neural network model is designed and trained, it can forecast the load of the power system 24 hours ahead on a daily basis and can also forecast the cumulative load on a daily basis. The real load data used for training the Artificial Neural Network (ANN) were taken from LDC, Gujarat Electricity Board, Jambuva, Gujarat, India. The results show that the load forecast of the ANN model follows the actual load pattern accurately throughout the forecasted period.
Abstract: This study considers the problem of determining
operation and maintenance schedules for a containership equipped
with components during its sailing according to a pre-determined
navigation schedule. The operation schedule, which specifies work
time of each component, determines the due-date of each maintenance
activity, and the maintenance schedule specifies the actual start
time of each maintenance activity. The main constraints are component
requirements, workforce availability, working time limitation,
and inter-maintenance time. To represent the problem mathematically,
a mixed integer programming model is developed. Then,
due to the problem complexity, we suggest a heuristic for the objective
of minimizing the sum of earliness and tardiness between the
due-date and the starting time of each maintenance activity. Computational
experiments were done on various test instances and the
results are reported.
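The objective stated above, the sum of earliness and tardiness between each due-date and the corresponding start time, can be written in a few lines. The sketch below only evaluates that objective for a given schedule; it is not the authors' heuristic, and the component names and time values are hypothetical.

```python
def earliness_tardiness(due_dates, start_times):
    """Sum of earliness plus tardiness over all maintenance activities."""
    total = 0
    for activity, due in due_dates.items():
        start = start_times[activity]
        earliness = max(0, due - start)   # started before the due-date
        tardiness = max(0, start - due)   # started after the due-date
        total += earliness + tardiness
    return total

# Hypothetical due-dates and planned start times, in hours from departure.
due_dates = {"pump": 40, "generator": 72, "boiler": 100}
start_times = {"pump": 36, "generator": 72, "boiler": 109}

objective = earliness_tardiness(due_dates, start_times)
print(objective)  # 4 + 0 + 9 = 13
```

A scheduling heuristic would search over feasible start times to make this value as small as possible while respecting the workforce and working-time constraints.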
Abstract: This paper focuses on a technique for identifying the geological boundary of the ground strata in front of a tunnel excavation site using the first order adjoint method based on optimal control theory. The geological boundary is defined as the boundary between layers with different elastic moduli. In tunnel excavation, it is important to estimate in advance the ground conditions ahead of the cutting face. Excavating into weak strata or fault fracture zones may prolong the construction work and cause human suffering. A theory for determining the geological boundary of the ground numerically is investigated, employing excavation blasts and their vibration waves as the observation references. According to optimal control theory, the performance function, described by the square sum of the residuals between computed and observed velocities, is minimized. The boundary layer is determined by minimizing the performance function. The elastic analysis governed by the Navier equation is carried out, assuming the ground to be an elastic body with linear viscous damping. To identify the boundary, the gradient of the performance function with respect to the geological boundary is calculated using the adjoint equation. The weighted gradient method is applied as the minimization algorithm. To solve the governing and adjoint equations, the Galerkin finite element method and the average acceleration method are employed for the spatial and temporal discretizations, respectively. With the method presented in this paper, the boundaries between three different strata can be identified. For the numerical studies, the Suemune tunnel excavation site is employed. First, the blasting force is identified in order to improve the accuracy of the analysis. The geological boundary is then identified after the estimation of the blasting force. With this identification procedure, numerical analysis results that closely match the observation data were obtained.
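The performance function mentioned above can be written out as a formula. This is only a sketch: the symbols, the factor 1/2 and the time integral over the observation window are assumptions, since the abstract states only that the function is the square sum of the residuals between computed and observed velocities.

```latex
J = \frac{1}{2} \int_{t_0}^{t_f} \sum_{k=1}^{N_{\mathrm{obs}}}
    \bigl( v_k(t) - \hat{v}_k(t) \bigr)^2 \, dt
```

Here $v_k$ denotes the computed velocity at observation point $k$, $\hat{v}_k$ the observed velocity at the same point, and $[t_0, t_f]$ the observation window. The boundary position is the one that minimizes $J$.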
Abstract: With the extensive inclusion of documents, especially
text, in business systems, data mining does not cover the full
scope of Business Intelligence (BI). Data mining cannot deliver its
impact on extracting useful details from large collections of
unstructured and semi-structured written materials based on natural
languages. The most pressing issue is to draw the potential business
intelligence from text. In order to gain competitive advantages for
the business, it is necessary to develop a powerful new tool, text
mining, to expand the scope of business intelligence.
In this paper, we work out the strong points of text mining in
extracting business intelligence from the huge amounts of textual
information sources within business systems. We apply text
mining to each stage of Business Intelligence systems to show that
text mining is a powerful tool for expanding the scope of BI. After
reviewing basic definitions and some related technologies, we
discuss their relationship to text mining and the benefits they
bring to it. Some examples and applications of text mining are also
given. The motivation is to develop a new approach to effective and
efficient textual information analysis, and thus to expand the scope
of Business Intelligence using text mining.
Abstract: An inner class is a specialized class that is defined within a
regular outer class. It is used in some programming languages such as
Java to carry out tasks related to its outer class. The
functional relatedness between the inner class and the outer class is
always the main concern when defining an inner class. However,
excessive use of inner classes can undermine class cohesiveness. In
addition, excessive inner classes make software maintenance
and comprehension difficult. Our research aims at determining the
minimum threshold for the functional relatedness of inner and outer
classes. Such a minimum threshold is a guideline for removing or
relocating excessive inner classes. Our research provides a feasible
way for software developers to define inner classes which are
functionally related to the outer class.
Abstract: The hybrid MTS/MTO production context is a novel concept
for balancing the trade-off between make-to-stock (MTS) and
make-to-order (MTO). One of the most important decisions in
the hybrid MTS/MTO environment is determining whether a product
is manufactured to stock, to order, or under a hybrid MTS/MTO
strategy. In this paper, a model based on the analytic network
process is developed to tackle this decision. Since the decision
deals with the uncertainty and ambiguity of data as well as experts'
and managers' linguistic judgments, the proposed model is equipped
with fuzzy set theory. An important attribute of the model is its
generality, due to the diverse decision factors which are elicited
from the literature and developed by the authors. Finally, the model
is validated by applying it to a real case study to reveal how the
proposed model can actually be implemented.
Abstract: The self-organizing map (SOM) is a well-known data
reduction technique used in data mining. It can reveal structure in
data sets through data visualization that is otherwise hard to detect
from raw data alone. However, interpretation through visual
inspection is prone to errors and can be very tedious. There are
several techniques for the automatic detection of clusters of code
vectors found by SOM, but they generally do not take into account
the distribution of code vectors; this may lead to unsatisfactory
clustering and poor definition of cluster boundaries, particularly
where the density of data points is low. In this paper, we propose the
use of an adaptive heuristic particle swarm optimization (PSO)
algorithm for finding cluster boundaries directly from the code
vectors obtained from SOM. The application of our method to
several standard data sets demonstrates its feasibility. The PSO
algorithm utilizes the so-called U-matrix of the SOM to determine
cluster boundaries; the results of this novel automatic method compare
very favorably to boundary detection through traditional algorithms,
namely k-means and hierarchical approaches, which are normally used
to interpret the output of SOM.
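The U-matrix mentioned above assigns to each map node a measure of how far its code vector lies from those of its grid neighbours, so that high values mark likely cluster boundaries. The sketch below computes one simple variant, the average Euclidean distance to the 4-connected neighbours on a rectangular grid. The choice of variant, the function name and the toy codebook are assumptions, and the PSO boundary search itself is not shown.

```python
import math

def u_matrix(codebook):
    """Average distance from each SOM node to its 4-connected grid neighbours."""
    rows, cols = len(codebook), len(codebook[0])
    result = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            dists = []
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    dists.append(math.dist(codebook[r][c], codebook[nr][nc]))
            # A 1x1 map has no neighbours; leave its value at 0.0.
            result[r][c] = sum(dists) / len(dists) if dists else 0.0
    return result

# Hypothetical 2x4 map: the two left columns lie near (0, 0),
# the two right columns near (5, 5).
codebook = [
    [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)],
    [(1.0, 0.0), (1.0, 1.0), (6.0, 5.0), (6.0, 6.0)],
]
u = u_matrix(codebook)

for row in u:
    print([round(value, 2) for value in row])
```

In this toy map the two middle columns receive the largest values, which is where the boundary between the two groups of code vectors lies.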
Abstract: Applying knowledge discovery techniques to unstructured text is termed knowledge discovery in text (KDT), text data mining, or text mining. In neural networks that address classification problems, the training set, the testing set and the learning rate are key elements: they are, respectively, the collections of input/output patterns used to train the network and to assess its performance, and the rate at which adjustments are made. This paper describes a proposed back propagation neural network classifier that performs cross-validation for the original neural network in order to optimize classification accuracy and reduce training time. The feasibility and the benefits of the proposed approach are demonstrated by means of five data sets: contact-lenses, cpu, weather symbolic, Weather and labor-nega-data. It is shown that, compared to the existing neural network, training is more than 10 times faster when the dataset is larger than the cpu dataset or the network has many hidden units, while accuracy ('percent correct') was the same for all datasets except contact-lenses, which is the only one with missing attributes. For contact-lenses, the accuracy with the proposed neural network was on average around 0.3% lower than with the original neural network. The algorithm is independent of specific data sets, so many of its ideas and solutions can be transferred to other classifier paradigms.
Abstract: As the slider process in the hard disk drive industry
becomes more complex, defect diagnosis for yield improvement
becomes more complicated and time-consuming. Manufacturing data
analysis with a data mining approach is widely used for solving this
problem. An existing mining approach that combines K-Means
clustering, the machine-oriented Kruskal-Wallis test and the
multivariate chart has been applied to defect diagnosis, but it is
still a semi-automatic diagnosis system. This article aims to modify
an algorithm to support automatic decisions in the existing approach.
Based on the research framework, the new approach can perform an
automatic diagnosis and helps engineers find the defect
factors about 50% faster than the existing approach.
Abstract: This paper presents an application of the improved
QFD method for determining the specifications of a kitchen utensil
rack. By using the improved method, the subjectivity of the original
QFD was reduced, particularly in defining the relationship between
customer requirements and engineering characteristics. The regression
analysis used for obtaining the relationship functions
between customer requirements and engineering characteristics also
accommodated the inaccuracy of the competitive assessment
results. The improved method, which is represented in the form of a
mathematical model, became a formal guide for allocating
resources to improve the specifications of the kitchen utensil rack.
The specifications obtained led to the achievement of the highest
feasible customer satisfaction.
Abstract: This paper discusses the use of explorative data
mining tools that allow the educator to explore new relationships
between reported learning experiences and actual activities,
even if there are multiple dimensions with a large number
of measured items. The underlying technology is based on
the so-called Compendium Platform for Reproducible Computing
(http://www.freestatistics.org), which was built on top of the
computational R Framework (http://www.wessa.net).
Abstract: Housing is a basic human right. A newly provided
house should be free from any defects, even those that people
normally consider 'cosmetic defects'. This paper studies
the building defects of 72 newly completed double-storey
terraced houses located in Bangi, Selangor. The building
survey was implemented using Protocol 1 (visual inspection). For new
houses, the survey work is very stringent in determining the defect
condition and priority. The survey and reporting procedure is carried
out based on the CSP1 Matrix, which involves a scoring system,
photographs and plan tagging. The analysis is done using the
Statistical Package for Social Sciences (SPSS). The findings reveal
that 2119 defects were recorded in the 72 terraced houses. The
cumulative score obtained was 27644, while the overall rating is
13.05. These results indicate that the construction quality of the new
terraced houses is low and not up to the standard that should be
acceptable for a new house.
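The reported figures are consistent with the overall rating being the cumulative score divided by the number of defects. The sketch below checks that arithmetic; this reading of the rating is an assumption inferred from the numbers, and the defects-per-house figure is derived here rather than reported in the abstract.

```python
# Figures as stated in the abstract.
total_defects = 2119
cumulative_score = 27644
houses = 72

# Assumed definition: overall rating = cumulative score / number of defects.
overall_rating = round(cumulative_score / total_defects, 2)
# Derived for context only: average number of defects per surveyed house.
defects_per_house = round(total_defects / houses, 1)

print(overall_rating, defects_per_house)
```

The computed rating matches the reported 13.05, and the survey averages roughly 29 recorded defects per house.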