Abstract: This paper presents a computational methodology based on matrix operations for a computer-based solution to the problem of performance analysis of software reliability models (SRMs). A set of seven comparison criteria has been formulated to rank various non-homogeneous Poisson process software reliability models proposed during the past 30 years for estimating software reliability measures such as the number of remaining faults, the software failure rate, and software reliability. Selecting the optimal SRM for a particular application has long been an area of interest for researchers in the field of software reliability. The tools and techniques for software reliability model selection found in the literature cannot be used with a high level of confidence because they rely on a limited number of model selection criteria. A real data set from a medium-sized software project, taken from published papers, is used to demonstrate the matrix method. The result of this study is a ranking of SRMs based on the permanent of the criteria matrix formed for each model from the comparison criteria. The software reliability model with the highest permanent value is ranked first, and so on.
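The ranking step relies on the matrix permanent, which, unlike the determinant, sums the products over all permutations with no sign changes. A minimal sketch of a permanent computation by Laplace expansion; the criteria matrices here are invented for illustration, not taken from the paper:

```python
def permanent(m):
    """Permanent by Laplace expansion along the first row:
    like the determinant, but every term is added (no alternating sign)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += m[0][j] * permanent(minor)
    return total

# Hypothetical criteria matrices for two models; a higher permanent means a better rank.
model_a = [[1, 2], [3, 4]]
model_b = [[2, 1], [1, 2]]
print(permanent(model_a))  # 1*4 + 2*3 = 10
print(permanent(model_b))  # 2*2 + 1*1 = 5
```

Laplace expansion is exponential in the matrix size; for the small criteria matrices implied here (seven criteria) that cost is negligible.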
Abstract: A neurofuzzy approach for a given set of input-output training data is proposed in two phases. First, the data set is automatically partitioned into a set of clusters, and a fuzzy if-then rule is extracted from each cluster to form a fuzzy rule base. Second, a fuzzy neural network is constructed accordingly and its parameters are tuned to increase the precision of the fuzzy rule base. This network is able to learn and optimize the rule base of a Sugeno-type fuzzy inference system using a hybrid learning algorithm that combines gradient descent and the least mean square algorithm. The proposed neurofuzzy system has the advantages of determining the number of rules automatically, reducing the number of rules, decreasing computational time, learning faster, and consuming less memory. The authors also investigate how neurofuzzy techniques can be applied in the area of control theory to design a fuzzy controller for linear and nonlinear dynamic system modelling from a set of input/output data. A simulation analysis is carried out on a wide range of processes, including the online identification of nonlinear components in a control system and a benchmark problem involving the prediction of a chaotic time series. Furthermore, well-known examples of linear and nonlinear systems are also simulated under the Matlab/Simulink environment. The above combination is also illustrated by modeling the relationship between automobile trips and demographic factors.
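The two-phase idea above (cluster the data, then extract one Sugeno rule per cluster) can be sketched as follows. This is a simplified illustration with fixed cluster centres and a weighted least-squares fit of each rule's linear consequent, not the paper's exact hybrid algorithm; the data, centres, and membership width are all invented:

```python
import numpy as np

# Phase 1: partition the data into clusters (fixed centres here for brevity;
# the paper derives the clusters, and hence the number of rules, automatically).
x = np.linspace(0.0, 1.0, 21)
y = 2.0 * x + 1.0                      # toy input-output training data
centres = np.array([0.25, 0.75])       # assumed cluster centres
sigma = 0.3                            # assumed Gaussian membership width

# Phase 2: one first-order Sugeno rule per cluster:
# IF x is near c_k THEN y = a_k * x + b_k, consequents fitted by weighted least squares.
params = []
for c in centres:
    w = np.exp(-((x - c) ** 2) / (2 * sigma ** 2))   # Gaussian membership degrees
    A = np.stack([x, np.ones_like(x)], axis=1)
    W = np.diag(w)
    a_k, b_k = np.linalg.solve(A.T @ W @ A, A.T @ W @ y)
    params.append((a_k, b_k))

def sugeno_predict(x_new):
    w = np.exp(-((x_new - centres) ** 2) / (2 * sigma ** 2))
    rule_out = np.array([a * x_new + b for a, b in params])
    return float(np.dot(w, rule_out) / w.sum())      # weighted-average defuzzification

print(sugeno_predict(0.5))   # close to 2*0.5 + 1 = 2.0
```

On this linear toy data both rules recover the exact consequent, so the defuzzified output reproduces the target; the interesting behaviour, and the reason for the second tuning phase, appears on nonlinear data.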
Abstract: As email communications have no consistent authentication procedure to ensure authenticity, we present an investigative analysis approach for detecting forged emails based on Random Forest and Naïve Bayes classifiers. Instead of investigating the email headers, we use the body content to extract a unique writing style for each possible suspect. Our approach consists of four main steps: (1) the cybercrime investigator extracts effective features, including structural, lexical, linguistic, and syntactic evidence, from previous emails of all the possible suspects; (2) the extracted feature vectors are normalized to increase the accuracy rate; (3) the normalized features are then used to train the learning engine; (4) upon receiving the anonymous email (M), we apply the same feature extraction process to produce a feature vector, and the machine learning classifiers assign the email to the suspect whose writing style most closely matches M. Experimental results on real data sets show the improved performance of the proposed method and its ability to identify authors with a very limited number of features.
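Steps (1)-(4) can be sketched with simple word-count features and a multinomial Naïve Bayes classifier (one of the two classifiers named above; the Random Forest and the richer stylometric features are omitted, and all emails and suspect names here are invented):

```python
import math
from collections import Counter

# Step 1: feature extraction -- here just lowercase word counts per suspect.
training = {
    "suspect_a": "regards please kindly find attached the report regards",
    "suspect_b": "hey dude check this out lol see ya dude",
}
counts = {s: Counter(t.split()) for s, t in training.items()}
vocab = set(w for c in counts.values() for w in c)

def log_posterior(message, suspect):
    """Steps 2-4: multinomial Naive Bayes score with Laplace smoothing
    (the normalization step reduces to smoothed relative frequencies)."""
    c = counts[suspect]
    total = sum(c.values())
    score = 0.0
    for w in message.split():
        score += math.log((c[w] + 1) / (total + len(vocab)))
    return score

# Step 4: attribute the anonymous email M to the closest writing style.
anonymous = "kindly find the attached report"
best = max(counts, key=lambda s: log_posterior(anonymous, s))
print(best)  # -> suspect_a
```

Real stylometric pipelines would add structural features (greetings, signatures, punctuation habits) to the vector before training, as the abstract describes.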
Abstract: This paper discusses the classification process for medical data. We use the data from the ACM KDD Cup 2008 to demonstrate our classification process based on latent topic discovery. In this data set, the target set and the outliers are quite different in nature: the target set makes up only 0.6% of the data, while the outliers constitute the remaining 99.4%. We use this data set as an example to show how we dealt with an extremely biased data set using latent topic discovery and noise reduction techniques. Our experiment faces two major challenges: (1) the outliers are extremely dispersed, and (2) the positive samples are far fewer than the negative ones. We propose a suitable process flow to deal with these issues and achieve a best AUC result of 0.98.
Abstract: This paper uses the radial basis function neural network (RBFNN) for system identification of nonlinear systems. Five nonlinear systems are used to examine the performance of the RBFNN in system modeling: a dual-tank system, a single-tank system, a DC motor system, and two academic models. The feed-forward method is considered in this work for modelling the nonlinear dynamic models. The K-means clustering algorithm is used to select the centers of the radial basis function network because it is reliable, offers fast convergence, and can handle large data sets. The least mean square method is used to adjust the weights of the output layer, and the Euclidean distance method is used to set the widths of the Gaussian functions.
Abstract: This paper presents an exploration into the structure of the corporate governance network and interlocking directorates in the Czech Republic. First, a literature overview and the basic terminology of network theory are presented. Further in the text, statistics and other calculations relevant to corporate governance networks are presented. For this purpose, an empirical data set consisting of 2,906 joint-stock companies in the Czech Republic was examined. The industries with the highest average number of interlocks per company were healthcare, and energy and utilities. There is no observable link between a company's financial performance and the number of its interlocks. Interlocks with financial companies are also very rare.
Abstract: Data Envelopment Analysis (DEA) is a methodology that computes efficiency values for decision making units (DMUs) in a given period by comparing their outputs with their inputs. In many cases, there is some time lag between the consumption of inputs and the production of outputs; for a long-term research project, the production lead-time phenomenon is hard to avoid. This time lag effect should be considered in evaluating the performance of organizations. This paper suggests a model to calculate efficiency values for the performance evaluation problem with time lag. In the experimental part, the proposed methods are compared with the CCR model and an existing time lag model using the data set of the 21st Century Frontier R&D Program, a long-term national R&D program of Korea.
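In the simplest single-input, single-output case the CCR efficiency reduces to each DMU's output/input ratio normalized by the best observed ratio, and the time-lag idea amounts to matching each period's input with the output it actually produces later. A toy sketch of that simplification (all numbers invented; the paper's model handles the general multi-input, multi-output LP):

```python
def ccr_efficiency(inputs, outputs):
    """Single-input single-output CCR: each DMU's output/input ratio
    normalized by the best ratio in the sample."""
    ratios = [o / i for i, o in zip(inputs, outputs)]
    best = max(ratios)
    return [r / best for r in ratios]

# Inputs consumed in three periods; outputs appear one period later (lag = 1).
inputs = [2.0, 4.0, 5.0]
outputs_by_period = [0.0, 4.0, 4.0, 10.0]   # period 0 has no lagged output yet
lag = 1
lagged_outputs = outputs_by_period[lag:]    # match input at t with output at t+lag

print(ccr_efficiency(inputs, lagged_outputs))  # [1.0, 0.5, 1.0]
```

Ignoring the lag (pairing input t with output t) would here score the first period as fully inefficient, which is exactly the distortion the proposed model is meant to remove.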
Abstract: The problem of estimating time-varying regression is
inevitably concerned with the necessity to choose the appropriate
level of model volatility - ranging from the full stationarity of instant
regression models to their absolute independence of each other. In the
stationary case the number of regression coefficients to be estimated
equals that of regressors, whereas the absence of any smoothness
assumptions augments the dimension of the unknown vector by the
factor of the time-series length. The Akaike Information Criterion
is a commonly adopted means of adjusting a model to the given
data set within a succession of nested parametric model classes,
but its crucial restriction is that the classes are rigidly defined by
the growing integer-valued dimension of the unknown vector. To
make the Kullback information maximization principle underlying the
classical AIC applicable to the problem of time-varying regression
estimation, we extend it to a wider class of data models in which
the dimension of the parameter is fixed, but the freedom of its values
is softly constrained by a family of continuously nested a priori
probability distributions.
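For reference, the classical criterion being extended selects, among a succession of nested model classes, the one minimizing (standard textbook form, not specific to this paper):

```latex
\mathrm{AIC} = 2k - 2\ln \hat{L},
```

where $k$ is the integer number of freely estimated parameters and $\hat{L}$ the maximized likelihood. The abstract's proposal replaces the rigid integer $k$ by a fixed-dimensional parameter whose freedom is softly constrained through a continuously nested family of priors.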
Abstract: Natural resources management, including water resources, requires reliable estimation of time-variant environmental parameters. Small improvements in the estimation of environmental parameters would have great effects on managing decisions. Noise reduction using wavelet techniques is an effective approach for preprocessing practical data sets. The predictability enhancement of river flow time series is assessed using fractal approaches before and after applying wavelet-based preprocessing. Time series correlation and persistency, the minimum sufficient length for training the predicting model, and the maximum valid length of predictions were also investigated through a fractal assessment.
Abstract: This article proposes an Ant Colony Optimization (ACO) metaheuristic to minimize the total makespan when scheduling a set of jobs and assigning workers on uniformly related parallel machines. An algorithm based on ACO has been developed and coded in Matlab® to solve this problem. The paper explains the various steps of applying the Ant Colony approach to the problem of minimizing the makespan for the worker assignment and job scheduling problem in a parallel machine model, and is aimed at evaluating the strength of ACO compared with other conventional approaches. One data set containing 100 problems (12 jobs, 3 machines, and 10 workers), which is available on the internet, has been solved with this ACO algorithm. Our ACO-based algorithm shows drastically improved results, especially in terms of the negligible CPU effort needed to reach the optimal solution. In our case, the time taken to solve all 100 problems is less than the average time taken to solve a single problem in the data set by other conventional approaches such as the GA algorithm and the SPT-A/LMC heuristics.
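The ACO steps the paper applies (probabilistic solution construction from pheromone, evaporation, and deposit on the best solution) can be sketched for the simpler job-to-machine assignment below; the worker dimension and the Matlab implementation are omitted, and the processing times are invented:

```python
import random

random.seed(1)
p = [4, 3, 7, 2, 5, 6]                  # job processing times (hypothetical)
n_machines = 3
tau = [[1.0] * n_machines for _ in p]   # pheromone trail: job x machine

def build_assignment():
    """One ant builds a full assignment job by job."""
    loads = [0.0] * n_machines
    assign = []
    for j, pj in enumerate(p):
        # desirability: pheromone times a greedy preference for the lightest machine
        weights = [tau[j][m] / (1.0 + loads[m]) for m in range(n_machines)]
        m = random.choices(range(n_machines), weights=weights)[0]
        assign.append(m)
        loads[m] += pj
    return assign, max(loads)           # makespan = heaviest machine load

best_assign, best_makespan = build_assignment()
for _ in range(50):                     # colony iterations
    for _ in range(10):                 # ants per iteration
        assign, makespan = build_assignment()
        if makespan < best_makespan:
            best_assign, best_makespan = assign, makespan
    for j in range(len(p)):             # evaporate, then reinforce the best solution
        for m in range(n_machines):
            tau[j][m] *= 0.9
        tau[j][best_assign[j]] += 1.0 / best_makespan

print(best_assign, best_makespan)
```

Uniformly related machines would additionally divide each processing time by the machine's speed, and the worker assignment would add a second pheromone matrix; the evaporate/deposit cycle is unchanged.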
Abstract: Gasoline octane number, the standard measure of the anti-knock properties of a motor fuel, is an important quantity in platforming processes, one of the key unit operations in oil refineries. It can be determined by online measurement or with CFR (Cooperative Fuel Research) engines. Online measurement of the octane number can be done using direct octane number analyzers, but these are too expensive, so a feasible alternative such as an ANFIS estimator has to be found.
ANFIS is a system in which a neural network is incorporated into a fuzzy system, exploiting data automatically through the learning algorithms of neural networks. ANFIS constructs an input-output mapping based both on human knowledge and on generated input-output data pairs.
In this research, 31 industrial data sets are used (21 for training and the rest for generalization). Results show that, in this simulation, the hybrid training algorithm in ANFIS yields good agreement between the industrial data and the simulated results.
Abstract: In this paper, a clustering algorithm named K-harmonic means (KHM) was employed in the training of radial basis function networks (RBFNs). KHM organizes the data into clusters and determines the centres of the basis functions. The popular clustering algorithms K-means (KM) and fuzzy c-means (FCM) are highly dependent on the initial choice of elements that represent the clusters well; in KHM this problem is avoided. This leads to improved classification performance compared with the other clustering algorithms. A comparison of classification accuracy was performed between KM, FCM, and KHM on the benchmark data sets Iris Plant, Diabetes, and Breast Cancer. RBFN training with the KHM algorithm shows better accuracy on these classification problems.
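K-harmonic means replaces K-means' hard minimum distance with a harmonic average over all centres, which is what makes it insensitive to initialization. A minimal 1-D sketch of the standard KHM centre update (synthetic data; p = 3.5 is a commonly recommended exponent, not a value from this paper):

```python
import numpy as np

def khm(x, centres, p=3.5, iters=30, eps=1e-8):
    """K-harmonic means centre updates (soft memberships and point weights
    derived from the harmonic-mean performance function)."""
    c = centres.astype(float).copy()
    for _ in range(iters):
        d = np.abs(x[:, None] - c[None, :]) + eps          # point-centre distances
        dm = d ** (-p - 2)
        m = dm / dm.sum(axis=1, keepdims=True)             # soft memberships
        w = dm.sum(axis=1) / (d ** (-p)).sum(axis=1) ** 2  # per-point weights
        mw = m * w[:, None]
        c = (mw * x[:, None]).sum(axis=0) / mw.sum(axis=0)
    return c

x = np.array([0.0, 0.1, 0.2, 10.0, 10.1, 10.2])
centres = khm(x, centres=np.array([4.0, 6.0]))   # deliberately poor initialization
print(np.sort(centres))  # close to the two cluster means, 0.1 and 10.1
```

The recovered centres would then serve directly as the RBFN basis-function centres, as in the abstract's training scheme.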
Abstract: In the present study, position estimation of a switched reluctance motor (SRM) has been achieved on the basis of artificial neural networks (ANNs). The ANNs can estimate the rotor position without an extra rotor position sensor, by measuring the phase flux linkages and phase currents. A flux linkage-phase current-rotor position data set and a supervised backpropagation learning algorithm are used in training the ANN-based position estimator. A 4-phase SRM has been used to verify the accuracy and feasibility of the proposed position estimator. Simulation results show that the proposed position estimator gives precise and accurate position estimates under both low and high reference speeds of the SRM.
Abstract: Using the logarithmic mean Divisia decomposition technique, this paper analyzes the change in the industrial energy intensity of Fujian Province in China, based on data sets of added value and energy consumption for 35 selected industrial sub-sectors from 1999 to 2009. The change in industrial energy intensity is decomposed into an intensity effect and a structure effect. Results show that the industrial energy intensity of Fujian Province achieved a reduction of 51% over the past ten years. The structural change, a shift in the mix of industrial sub-sectors, made an overwhelming contribution to the reduction, whereas the impact of improved energy efficiency was relatively small. However, the aggregate industrial energy intensity was very sensitive to changes both in the energy intensity and in the production share of energy-intensive sub-sectors, such as the production and supply of electric power, steam, and hot water. A pathway to reduce industrial energy intensity for energy conservation in Fujian Province is proposed at the end.
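The LMDI technique rests on the logarithmic mean L(a, b) = (a - b)/(ln a - ln b), which makes the decomposition exact: the structure and intensity effects sum to the total change with no residual. A generic two-sector sketch of the additive LMDI identity (numbers invented, not the Fujian data):

```python
import math

def logmean(a, b):
    """Logarithmic mean; continuous at a == b."""
    return a if a == b else (a - b) / (math.log(a) - math.log(b))

# Two hypothetical sub-sectors: (activity, energy intensity) in base and final year.
base  = [(10.0, 2.0), (5.0, 4.0)]   # E0 = 10*2 + 5*4 = 40
final = [(12.0, 1.5), (6.0, 4.0)]   # ET = 12*1.5 + 6*4 = 42

activity_effect = intensity_effect = 0.0
for (a0, i0), (a1, i1) in zip(base, final):
    L = logmean(a1 * i1, a0 * i0)
    activity_effect  += L * math.log(a1 / a0)   # structure/activity contribution
    intensity_effect += L * math.log(i1 / i0)   # efficiency contribution

total_change = sum(a * i for a, i in final) - sum(a * i for a, i in base)
print(total_change, activity_effect + intensity_effect)  # identical: LMDI is exact
```

The exactness follows because L(E1, E0) * ln(E1/E0) = E1 - E0 for each sub-sector, and the log of a product splits into the factor logs.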
Abstract: Response surface methodology (RSM) is a very efficient tool for providing good practical insight into developing new processes and optimizing them. This methodology can help engineers build a mathematical model representing the behavior of a system as a convincing function of the process parameters.
In this paper, the sequential nature of RSM is surveyed for process engineers, and its relationship to design of experiments (DOE), regression analysis, and robust design is reviewed. The proposed four-step procedure in two different phases can help system analysts resolve parameter design problems involving responses. To check the accuracy of the designed model, residual analysis and the prediction error sum of squares (PRESS) are described.
It is believed that the proposed procedure can resolve a complex parameter design problem with one or more responses. It can be applied in areas where there are large data sets and a number of responses are to be optimized simultaneously. In addition, the proposed procedure is relatively simple and can be implemented easily using ready-made standard statistical packages.
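The PRESS statistic mentioned above need not be computed by refitting the model n times: for linear least squares, the leave-one-out residual equals the ordinary residual divided by (1 - h_ii), where h_ii is the leverage from the hat matrix. A sketch verifying the shortcut against explicit leave-one-out refits (synthetic data):

```python
import numpy as np

def press_hat(X, y):
    """PRESS via the hat-matrix shortcut: sum of (e_i / (1 - h_ii))^2."""
    H = X @ np.linalg.solve(X.T @ X, X.T)     # hat matrix X (X'X)^-1 X'
    e = y - H @ y                              # ordinary residuals
    return float(np.sum((e / (1.0 - np.diag(H))) ** 2))

def press_loo(X, y):
    """PRESS by explicitly refitting with each observation left out."""
    total = 0.0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        beta, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        total += (y[i] - X[i] @ beta) ** 2
    return float(total)

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
X = np.stack([np.ones_like(x), x], axis=1)    # intercept + slope design
y = np.array([0.1, 1.9, 4.2, 5.8, 8.1, 9.7])  # roughly y = 2x

print(press_hat(X, y), press_loo(X, y))       # the two agree
```

A response surface model would use a quadratic design matrix in several factors, but the same identity applies to any linear-in-parameters fit.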
Abstract: This study introduces a new method for detecting,
sorting, and localizing spikes from multiunit EEG recordings. The
method combines the wavelet transform, which localizes distinctive
spike features, with the Super-Paramagnetic Clustering (SPC) algorithm, which allows automatic classification of the data without assumptions such as low variance or Gaussian distributions. Moreover, the method is capable of setting amplitude thresholds for spike detection. The method is applied to several real EEG data sets, and the spikes are detected, clustered, and their occurrence times determined.
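The automatic amplitude threshold mentioned above is commonly set from a robust noise estimate, sigma = median(|x|)/0.6745, with spikes flagged at a multiple of sigma; the exact rule used here is not stated in the abstract, so this sketch follows the standard choice from the spike-sorting literature:

```python
from statistics import median

def detect_spikes(x, k=4.0):
    """Flag samples exceeding k * sigma, with sigma estimated robustly from
    the median absolute value (insensitive to the spikes themselves)."""
    sigma = median(abs(v) for v in x) / 0.6745
    thr = k * sigma
    return [i for i, v in enumerate(x) if abs(v) > thr]

# Toy trace: low-amplitude noise with one large spike at index 6.
trace = [0.1, -0.2, 0.15, -0.1, 0.2, -0.15, 9.0, 0.1, -0.2, 0.15]
print(detect_spikes(trace))  # [6]
```

The 0.6745 factor converts the median absolute value of Gaussian noise into its standard deviation, so the threshold adapts to each recording's noise level.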
Abstract: Emerging bio-engineering fields such as brain-computer interfaces, neuroprosthesis devices, and the modeling and simulation of neural networks have led to increased research activity in algorithms for the detection, isolation, and classification of action potentials (APs) from noisy data trains. Current techniques in the field of 'unsupervised no-prior-knowledge' biosignal processing include energy operators, wavelet detection, and adaptive thresholding. These tend to be biased towards larger AP waveforms; APs may be missed owing to deviations in spike shape and frequency, and correlated noise spectra can cause false detections. Such algorithms also tend to suffer from large computational expense.
A new signal detection technique based upon the ideas of phase-space diagrams and trajectories is proposed, using a delayed copy of the AP to highlight discontinuities relative to the background noise. This idea has been used to create algorithms that are computationally inexpensive and address the above problems. Distinct APs have been picked out and manually classified from real physiological data recorded from a cockroach. To facilitate testing of the new technique, an Auto-Regressive Moving Average (ARMA) noise model has been constructed based upon the background noise of the recordings. Together with the AP classification, this model enables generation of realistic neuronal data sets at arbitrary signal-to-noise ratios (SNR).
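The delayed-copy idea can be sketched as follows: plotting x[n] against x[n - τ] traces a tight diagonal for the slowly varying background, while an AP's fast transition leaves the diagonal, so the distance |x[n] - x[n - τ]| is a computationally cheap detection statistic. The delay and threshold below are hypothetical, not values from the paper:

```python
def delay_detector(x, tau=2, thr=1.0):
    """Distance from the phase-space diagonal: a large |x[n] - x[n - tau]|
    marks a discontinuity relative to the slowly varying background."""
    return [n for n in range(tau, len(x)) if abs(x[n] - x[n - tau]) > thr]

# Slowly drifting background with one abrupt AP-like transient at index 8.
signal = [0.0, 0.05, 0.1, 0.1, 0.15, 0.2, 0.2, 0.25, 4.0, 0.3, 0.3, 0.35]
print(delay_detector(signal))  # indices around the transient
```

The per-sample cost is one subtraction and one comparison, which is why such detectors avoid the computational expense of the wavelet and energy-operator methods cited above.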
Abstract: This article presents a short discussion on optimum neighborhood size selection in a spherical self-organizing feature map (SOFM). The majority of the literature on SOFMs has addressed the issue of selecting optimal learning parameters for Cartesian-topology SOFMs. However, experience with the spherical SOFM suggests that the learning aspects of the Cartesian-topology SOFM do not translate directly. This article presents an approach for estimating the neighborhood size of a spherical SOFM from the data. It adopts the L-curve criterion, previously suggested for choosing the regularization parameter in linear equation problems whose right-hand side is contaminated with noise. Simulation results are presented on two artificial 4D data sets of the coupled Hénon-Ikeda map.
Abstract: An Artificial Neural Network based modeling
technique has been used to study the influence of different
combinations of meteorological parameters on evaporation from a
reservoir. The data set used is taken from an earlier reported study.
Several input combinations were tried so as to find out the importance
of different input parameters in predicting the evaporation. The
prediction accuracy of Artificial Neural Network has also been
compared with the accuracy of linear regression for predicting
evaporation. The comparison demonstrated superior performance of
Artificial Neural Network over linear regression approach. The
findings of the study also revealed the requirement of all input
parameters considered together, instead of individual parameters
taken one at a time as reported in earlier studies, in predicting the
evaporation. The highest correlation coefficient (0.960) along with
lowest root mean square error (0.865) was obtained with the input
combination of air temperature, wind speed, sunshine hours and
mean relative humidity. A graph between the actual and predicted
values of evaporation suggests that most of the values lie within a
scatter of ±15% with all input parameters. The findings of this study
suggest the usefulness of ANN technique in predicting the
evaporation losses from reservoirs.
Abstract: This paper tries to shed light on the existence of a bank lending channel (BLC) in the South Eastern European (SEE) countries. Based on a VAR framework, we test the responsiveness of credit supply to monetary policy shocks. By compiling a new data set and using the reserve requirement ratio, among others, as the policy instrument, we measure the effectiveness of the BLC and the buffering effect of the banks in the SEE countries. The results indicate that loan supply is significantly affected by shifts in monetary policy once demand factors are controlled for. Furthermore, by analyzing the effect of the Greek banks in the region, we conclude that Greek banks do buffer the negative effects of monetary policy transmission. Given their significant share of the SEE banking markets, we argue that Greek banks positively influence the economic growth of the SEE countries.