Abstract: Factors affecting construction unit cost vary
depending on a country’s political, economic, social and
technological inclinations. Factors affecting construction costs have
been studied from various perspectives. Analysis of cost factors
requires an appreciation of a country’s practices. Identified cost
factors provide an indication of a country’s construction economic
strata. The purpose of this paper is to identify the essential factors
that affect unit cost estimation and their breakdown using artificial
neural networks. Twenty-five (25) identified cost factors in road
construction were subjected to a questionnaire survey and, using
SPSS factor analysis, the factors were reduced to eight. The 8 factors
were analysed using a neural network (NN) to determine the
proportionate breakdown of the cost factors in a given construction
unit rate. The NN predicted that political environment accounted for 44% of
the unit rate followed by contractor capacity at 22% and financial
delays, project feasibility and overhead & profit each at 11%. Project
location, material availability and corruption perception index had
minimal impact on the unit cost from the training data provided.
Quantified cost factors can be incorporated in unit cost estimation
models (UCEM) to produce more accurate estimates. This can create
improvements in the cost estimation of infrastructure projects and
establish a benchmark standard to assist the process of alignment of
work practices and training of new staff, permitting the on-going
development of best practices in cost estimation to become more
effective.
Abstract: Nowadays social media information, such as news,
links, images, or videos, is shared extensively. However, information
disseminated through social media often lacks quality: less fact
checking, more biases, and several rumors. Many researchers have
investigated credibility on Twitter, but there is no research report
about information credibility on Facebook. This paper proposes
features for measuring the credibility of Facebook information. We
developed a system for measuring credibility on Facebook. First, we
developed an FB credibility evaluator for measuring the credibility
of each post through manual human labelling. We then collected the
training data for creating a model using Support Vector Machine
(SVM). Secondly, we developed a Chrome extension of FB credibility
for Facebook users to evaluate the credibility of each post. Based on
the usage analysis of our FB credibility Chrome extension, about 81%
of users’ responses agree with the suggested credibility
automatically computed by the proposed system.
Abstract: Accurate Short Term Load Forecasting (STLF) is essential for a variety of decision-making processes. However, forecasting accuracy can drop due to the presence of uncertainty in the operation of energy systems or unexpected behavior of exogenous variables. An Interval Type 2 Fuzzy Logic System (IT2 FLS), with its additional degrees of freedom, is an excellent tool for handling uncertainties and improves the prediction accuracy. The training data used in this study cover the period from January 1, 2012 to February 1, 2012 for the winter season and the period from July 1, 2012 to August 1, 2012 for the summer season. The actual load forecasting period runs from January 22 to 28, 2012 for the winter model and from July 22 to 28, 2012 for the summer model. The study uses real data from the Iraqi power system, which belongs to the Ministry of Electricity.
Abstract: This work concerns on-line Arabic character recognition; the principal motivation is to study Arabic handwriting with on-line technology.
The system is a Markovian system, which can be viewed as a Dynamic Bayesian Network (DBN). One of the major advantages of such systems is that the complete model (topology and parameters) can be learned from training data.
Our approach is based on the dynamic Bayesian network formalism. DBN theory is a generalization of Bayesian networks to dynamic processes. One of our objectives is to find better parameters, which represent the links (dependences) between the variables of the dynamic network.
In pattern recognition applications, the structure is usually fixed in advance, which obliges us to accept some strong assumptions (for example, independence between some variables). Our application is the on-line recognition of isolated Arabic characters using our laboratory database, NOUN. A neural tester is proposed for external optimization of the DBN.
The DBN scores and DBN mixed approaches reach 70.24% and 62.50% respectively, which points to their further development; other approaches that take time into account were considered and implemented until a significant recognition rate of 94.79% was obtained.
Abstract: This work presents a proposal to perform contextual sentiment analysis using a supervised learning algorithm without the extensive training of annotators. To achieve this goal, a web platform was developed to perform the entire procedure outlined in this paper. The main contribution of the pipeline described in this article is to simplify and automate the annotation process through a system that analyses the congruence between the annotations. This ensured satisfactory results even without using annotators specialized in the context of the research, avoiding the generation of biased training data for the classifiers. For this, a case study was conducted on an entrepreneurship blog. The experimental results were consistent with the literature on annotation using a formalized process with experts.
Abstract: Fully reusable spaceplanes do not exist as yet. This implies that design qualification for an optimized, highly integrated forebody-inlet configuration of a booster-stage vehicle cannot be based on archival data of other spaceplanes. Therefore, this paper proposes a novel TIPSO-SVM expert system methodology. A non-trivial problem related to optimization and classification of the hypersonic forebody-inlet configuration in conjunction with the mass-model of the two-stage-to-orbit (TSTO) vehicle is solved. The hybrid-heuristic machine learning methodology is based on a two-step improved particle swarm optimizer (TIPSO) algorithm and a two-step support vector machine (SVM) data classification method. The efficacy of the method is tested by first evolving an optimal configuration for the hypersonic compression system using the TIPSO algorithm and thereafter classifying the results using the two-step SVM method. In the first step, extensive but non-classified mass-model training data for multiple optimized configurations are segregated and pre-classified for learning by the SVM algorithm. In the second step, the TIPSO-optimized mass-model data are classified using SVM classification. Results showed remarkable improvement in configuration and mass-model along with sizing parameters.
Abstract: The author proposes an extension of genetic algorithm (GA) for solving fuzzy-valued optimization problems. In the proposed GA, values in the genotypes are not real numbers but fuzzy numbers. Evolutionary processes in GA are extended so that GA can handle genotype instances with fuzzy numbers. The proposed method is applied to evolving neural networks with fuzzy weights and biases. Experimental results showed that fuzzy neural networks evolved by the fuzzy GA could model hidden target fuzzy functions well despite the fact that no training data was explicitly provided.
Abstract: The author proposes an extension of particle swarm optimization (PSO) for solving interval-valued optimization problems and applies the extended PSO to evolutionary training of neural networks (NNs) with interval weights. In the proposed PSO, values in the genotypes are not real numbers but intervals. Experimental results show that interval-valued NNs trained by the proposed method could well approximate hidden target functions despite the fact that no training data was explicitly provided.
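To make the notion of a neural network with interval weights concrete, the following is a minimal sketch of the forward pass of a single neuron whose weights and bias are intervals. It is an illustration under stated assumptions, not the author's formulation: the function names `interval_scale` and `interval_neuron`, the use of real-valued inputs, and the sigmoid activation are all hypothetical choices.

```python
import math

def interval_scale(weight, x):
    """Multiply an interval weight (lo, hi) by a real input x.

    A negative x swaps the endpoints, so take min/max explicitly.
    """
    lo, hi = weight
    a, b = lo * x, hi * x
    return (min(a, b), max(a, b))

def interval_neuron(inputs, weights, bias):
    """Forward pass of one neuron with interval weights and interval bias.

    inputs:  list of real numbers (assumption: inputs are not intervals)
    weights: list of (lo, hi) tuples, one per input
    bias:    (lo, hi) tuple
    Returns the output interval (lo, hi).
    """
    lo, hi = bias
    for x, w in zip(inputs, weights):
        p_lo, p_hi = interval_scale(w, x)
        lo += p_lo  # interval addition: add lower bounds
        hi += p_hi  # and upper bounds separately
    # The sigmoid is monotonically increasing, so applying it to both
    # endpoints yields the exact image of the interval.
    sigmoid = lambda s: 1.0 / (1.0 + math.exp(-s))
    return (sigmoid(lo), sigmoid(hi))
```

For example, `interval_neuron([1.0, -1.0], [(0.0, 1.0), (0.0, 1.0)], (0.0, 0.0))` sums to the pre-activation interval (-1, 1) and returns its sigmoid image. An evolutionary trainer such as the extended PSO would then search over the interval endpoints.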
Abstract: In this study, a fuzzy similarity approach for Arabic
web pages classification is presented. The approach uses a fuzzy
term-category relation by manipulating membership degree for the
training data and the degree value for a test web page. Six measures
are used and compared in this study. These measures include:
Einstein, Algebraic, Hamacher, MinMax, Special case fuzzy and
Bounded Difference approaches. These measures are applied and
compared using 50 different Arabic web pages. The Einstein measure
gave the best performance among the measures. An analysis of
these measures and concluding remarks are drawn in this study.
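Several of the measures named above correspond to standard fuzzy conjunction operators (t-norms). The sketch below gives their textbook definitions for two membership degrees in [0, 1]. How the paper combines them into a page-to-category similarity is not stated in the abstract, so the `page_score` helper is a hypothetical aggregation, and the "Special case fuzzy" measure is omitted because its definition is not given.

```python
def algebraic_product(a, b):
    """Algebraic product t-norm."""
    return a * b

def einstein_product(a, b):
    """Einstein product t-norm; the denominator lies in [1, 2]."""
    return (a * b) / (2.0 - (a + b - a * b))

def hamacher_product(a, b):
    """Hamacher product t-norm, defined as 0 when a = b = 0."""
    denom = a + b - a * b
    return 0.0 if denom == 0.0 else (a * b) / denom

def bounded_difference(a, b):
    """Bounded difference (Lukasiewicz) t-norm."""
    return max(0.0, a + b - 1.0)

def minimum(a, b):
    """Minimum t-norm, the conjunction half of a min/max pair."""
    return min(a, b)

def page_score(page_degrees, category_memberships, tnorm):
    """Hypothetical aggregation: sum the t-norm over terms that occur
    both in the test page and in the category's term relation."""
    shared = page_degrees.keys() & category_memberships.keys()
    return sum(tnorm(page_degrees[t], category_memberships[t]) for t in shared)
```

Under this sketch a test page would be assigned to the category with the highest `page_score`, and swapping the `tnorm` argument gives one classifier per measure for comparison.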
Abstract: This paper presents the development of a wavelet
based algorithm for distinguishing between magnetizing inrush
currents and power system fault currents, which is an adequate,
reliable, fast and computationally efficient tool. The proposed
technique consists of a preprocessing unit based on discrete wavelet
transform (DWT) in combination with an artificial neural network
(ANN) for detecting and classifying fault currents. The DWT acts as
an extractor of distinctive features in the input signals at the relay
location. This information is then fed into an ANN for classifying
fault and magnetizing inrush conditions. A 220/55/55 V, 50 Hz
laboratory transformer connected to a 380 V power system was
simulated using ATP-EMTP. The DWT was implemented using
Matlab, and a Coiflet mother wavelet was used to analyze primary
currents and generate training data. The simulated results presented
clearly show that the proposed technique can accurately discriminate
between magnetizing inrush and fault currents in transformer
protection.
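The feature-extraction step described above can be sketched as follows. The paper uses a Coiflet mother wavelet in Matlab; to stay self-contained this sketch swaps in the much simpler Haar wavelet, and the choice of per-level detail energy as the feature is an assumption, since the abstract does not say which features are fed to the ANN. The names `haar_dwt` and `detail_energies` are hypothetical.

```python
import math

def haar_dwt(signal):
    """One level of the orthonormal Haar DWT.

    Returns (approximation, detail) coefficient lists, each half the
    length of the input. The input length must be even.
    """
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def detail_energies(signal, levels):
    """Energy of the detail coefficients at each decomposition level.

    The signal length must be divisible by 2**levels. The resulting
    list could serve as an input vector for a classifier.
    """
    features = []
    current = list(signal)
    for _ in range(levels):
        current, detail = haar_dwt(current)
        features.append(sum(d * d for d in detail))
    return features
```

A sampled primary-current window would be passed to `detail_energies`, and the returned list used as one training vector labelled as either inrush or fault.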
Abstract: In this paper, a wavelet-based ANFIS for detecting
inter-turn faults of a generator is proposed. The detector uniquely
responds to the winding inter-turn fault with remarkably high
sensitivity. Discrimination of different percentages of winding
affected by the inter-turn fault is provided via an ANFIS having an
eight-dimensional input vector. This input vector is obtained from
features extracted from the DWT of the inter-turn fault current
leaving the generator phase winding. Training data for the ANFIS are
generated via a simulation of a generator with an inter-turn fault
using MATLAB. The proposed algorithm using ANFIS gives more
satisfactory performance than an ANN with selected statistical data
of the decomposed levels of the fault current.
Abstract: Applying knowledge discovery techniques to unstructured
text is termed knowledge discovery in text (KDT), text data mining,
or text mining. The decision tree approach is most useful in
classification problems. With this technique, a tree is constructed
to model the classification process. There are two basic steps in
the technique: building the tree and applying the tree to the
database. This paper describes a proposed C5.0 classifier that uses
rulesets, cross-validation and boosting on top of the original C5.0
in order to reduce the error ratio. The feasibility and the benefits
of the proposed approach are demonstrated by means of a medical data
set, hypothyroid. It is shown that the performance of a classifier
on the training cases from which it was constructed gives a poor
estimate of its accuracy. Accuracy can instead be estimated by
sampling or by using a separate test file; either way, the
classifier is evaluated on cases that were not used to build it, and
the estimate is reliable only if the numbers of cases used to build
and to evaluate the classifier are both large. If the cases in
hypothyroid.data and hypothyroid.test were to be shuffled and divided
into a new 2772-case training set and a 1000-case test set, C5.0
might construct a different classifier with a lower or higher error
rate on the test cases. An important feature of See5 is its ability
to generate classifiers called rulesets. The ruleset has an error
rate of 0.5% on the test cases. The standard errors of the means
provide an estimate of the variability of results. One way to get a
more reliable estimate of predictive accuracy is f-fold
cross-validation. The error rate of a classifier produced from all
the cases is estimated as the ratio of the total number of errors on
the hold-out cases to the total number of cases. The Boost option
with x trials instructs See5 to construct up to x classifiers in
this manner. Trials over numerous datasets, large and small, show
that on average 10-classifier boosting reduces the error rate for
test cases by about 25%.
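The f-fold cross-validation estimate described above (total errors on the hold-out cases divided by the total number of cases) can be sketched in a few lines. This is a simplified illustration, not See5 itself: the folds are taken by striding rather than by random or stratified assignment, and `majority_classifier` is a hypothetical stand-in for the decision-tree learner.

```python
from collections import Counter

def majority_classifier(train):
    """Stand-in learner: always predicts the most common training label."""
    label = Counter(y for _, y in train).most_common(1)[0][0]
    return lambda x: label

def cross_validation_error(cases, f, build=majority_classifier):
    """Estimate the error rate by f-fold cross-validation.

    cases: list of (features, label) pairs
    f:     number of folds
    build: function that takes training cases and returns a classifier
    Each fold is held out once while a classifier is built from the
    remaining folds; errors on all hold-out folds are pooled.
    """
    folds = [cases[i::f] for i in range(f)]
    errors = 0
    for i, hold_out in enumerate(folds):
        train = [c for j, fold in enumerate(folds) if j != i for c in fold]
        classifier = build(train)
        errors += sum(1 for x, y in hold_out if classifier(x) != y)
    return errors / len(cases)
```

Every case is used exactly once as a hold-out case, which is why the denominator is the total number of cases rather than the size of a single fold.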
Abstract: This paper proposes an efficient learning method for layered neural networks based on the selection of training data and the input characteristics of an output layer unit. Compared with more recent neural networks, such as pulse neural networks and quantum neuro-computation, the multilayer network is widely used due to its simple structure. When the learning objects are complicated, problems such as unsuccessful learning or a significant learning time remain unsolved. Focusing on the input data during the learning stage, we undertook an experiment to identify the data that produce large errors and interfere with the learning process. Our method divides the learning process into several stages. In general, the input characteristics of an output layer unit oscillate during the learning process for complicated problems. The authors propose a multi-stage learning method for function approximation problems that classifies the learning data in a phased manner according to their learnability before training the multilayer neural network, and demonstrate the validity of the method. Specifically, this paper verifies by computer experiments that both the learning accuracy and the learning time of the BP method are improved when it is used as the learning rule of the multi-stage learning method. Oscillatory phenomena of the learning curve play an important role in learning performance. The authors also discuss the mechanisms by which oscillatory phenomena occur during learning. Furthermore, by observing behavior during learning, the authors discuss the reasons why the errors of some data remain large even after learning.
Abstract: This paper presents an exact pruning algorithm with
adaptive pruning interval for general dynamic neural networks
(GDNN). GDNNs are artificial neural networks with internal dynamics.
All layers have feedback connections with time delays to the
same and to all other layers. The structure of the plant is unknown, so
the identification process is started with a larger network architecture
than necessary. During parameter optimization with the Levenberg-
Marquardt (LM) algorithm irrelevant weights of the dynamic neural
network are deleted in order to find a model for the plant as
simple as possible. The weights to be pruned are found by direct
evaluation of the training data within a sliding time window. The
influence of pruning on the identification system depends on the
network architecture at pruning time and the selected weight to be
deleted. As the architecture of the model is changed drastically during
the identification and pruning process, it is suggested to adapt the
pruning interval online. Two system identification examples show
the architecture selection ability of the proposed pruning approach.
Abstract: We present a hybrid architecture of recurrent neural
networks (RNNs) inspired by hidden Markov models (HMMs). We
train the hybrid architecture using genetic algorithms to learn and
represent dynamical systems. We train the hybrid architecture on a
set of deterministic finite-state automata strings and observe the
generalization performance of the hybrid architecture when presented
with a new set of strings which were not present in the training data
set. In this way, we show that the hybrid system of HMM and RNN
can learn and represent deterministic finite-state automata. We ran
experiments with different sets of population sizes in the genetic
algorithm; we also ran experiments to find out which weight
initializations were best for training the hybrid architecture. The
results show that the hybrid architecture of recurrent neural networks
inspired by hidden Markov models can train and represent dynamical
systems. The best training and generalization performance is
achieved when the hybrid architecture is initialized with random real
weight values of range -15 to 15.
Abstract: This paper explores the scalability issues associated
with solving the Named Entity Recognition (NER) problem using
Support Vector Machines (SVM) and high-dimensional features. The
performance results of a set of experiments conducted using binary
and multi-class SVM with increasing training data sizes are
examined. The NER domain chosen for these experiments is the
biomedical publications domain, especially selected due to its
importance and inherent challenges. A simple machine learning
approach is used that eliminates prior language knowledge such as
part-of-speech or noun phrase tagging thereby allowing for its
applicability across languages. No domain-specific knowledge is
included. The accuracy measures achieved are comparable to those
obtained using more complex approaches, which constitutes a
motivation to investigate ways to improve the scalability of multiclass
SVM in order to make the solution more practical and useable.
Improving training time of multi-class SVM would make support
vector machines a more viable and practical machine learning
solution for real-world problems with large datasets. An initial
prototype results in great improvement of the training time at the
expense of memory requirements.
Abstract: As a popular rank-reduced vector space approach,
Latent Semantic Indexing (LSI) has been used in information
retrieval and other applications. In this paper, an LSI-based content
vector model for text classification is presented, which constructs
multiple augmented category LSI spaces and classifies text by their
content. The model integrates the class discriminative information
from the training data and is equipped with several pertinent feature
selection and text classification algorithms. The proposed classifier
has been applied to email classification and its experiments on a
benchmark spam testing corpus (PU1) have shown that the approach
represents a competitive alternative to other email classifiers based
on the well-known SVM and naïve Bayes algorithms.
Abstract: Proteomics is one of the largest areas of research for
bioinformatics and medical science. An ambitious goal of proteomics
is to elucidate the structure, interactions and functions of all proteins
within cells and organisms. Predicting Protein-Protein Interaction
(PPI) is one of the crucial and decisive problems in current research.
Genomic data offer a great opportunity and at the same time a lot of
challenges for the identification of these interactions. Many methods
have already been proposed in this regard. In the case of in-silico
identification, most of the methods require both positive and negative
examples of protein interaction, and the quality of these examples
is crucial for the final prediction accuracy. Positive
examples are relatively easy to obtain from well known databases. But
the generation of negative examples is not a trivial task. Current PPI
identification methods generate negative examples based on some
assumptions, which are likely to affect their prediction accuracy.
Hence, if more reliable negative examples are used, the PPI prediction
methods may achieve even higher accuracy. Focusing on this issue, a
graph based negative example generation method is proposed, which
is simple and more accurate than the existing approaches. An
interaction graph of the protein sequences is created. The basic
assumption is that the longer the shortest path between two
protein-sequences in the interaction graph, the less is the possibility of
their interaction. A well established PPI detection algorithm is
employed with our negative examples and in most cases the accuracy
increases by more than 10% in comparison with the negative pair
selection method in that paper.
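The shortest-path assumption stated above can be illustrated with a small sketch that builds an undirected interaction graph from known positive pairs and keeps, as candidate negatives, the pairs whose shortest-path length is at least a threshold. The function names, the `min_distance` parameter, and the decision to skip pairs with no connecting path are assumptions made for illustration; the abstract does not specify these details.

```python
from collections import deque

def shortest_path_lengths(graph, source):
    """Breadth-first search: hop distance from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist

def negative_pairs(interactions, min_distance):
    """Candidate non-interacting pairs from known interactions.

    interactions: iterable of (protein_a, protein_b) positive pairs
    min_distance: hypothetical threshold on shortest-path length
    Returns (a, b, distance) triples with distance >= min_distance.
    Pairs with no connecting path are skipped (an assumption).
    """
    graph = {}
    for a, b in interactions:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    nodes = sorted(graph)
    negatives = []
    for i, a in enumerate(nodes):
        dist = shortest_path_lengths(graph, a)  # one BFS per source node
        for b in nodes[i + 1:]:
            d = dist.get(b)
            if d is not None and d >= min_distance:
                negatives.append((a, b, d))
    return negatives
```

On a chain of interactions P1-P2, P2-P3, P3-P4 with `min_distance=3`, only the pair (P1, P4) qualifies, since it is the only pair separated by three hops.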
Abstract: Emotion recognition is an important research field with many applications nowadays. This work focuses on recognizing different emotions from the speech signal. The extracted features are related to statistics of pitch, formants, and energy contours, as well as spectral, perceptual and temporal features, jitter, and shimmer. An Artificial Neural Network (ANN) was chosen as the classifier. Our concern is finding a robust and fast ANN classifier suitable for different real-life applications. Several experiments were carried out on different ANNs to investigate the factors that affect the classification success rate. Using a database containing 7 different emotions, it is shown that with proper and careful adjustment of the feature format, training data sorting, number of features selected and even the ANN type and architecture used, a success rate of 85% or more can be achieved without increasing the system complexity or the computation time.