Abstract: To date, researchers have developed various
tools and methodologies for effective clinical decision-making.
Among those decisions, the diagnosis of chest pain diseases has been one of the
important issues, especially in the emergency department. To
improve the ability of physicians in diagnosis, many researchers have
developed diagnosis intelligence by using machine learning and data
mining. However, most of the conventional methodologies have been
generally based on a single classifier for disease classification and
prediction, which shows moderate performance. This study utilizes an
ensemble strategy to combine multiple different classifiers to help
physicians diagnose chest pain diseases more accurately than ever.
Specifically, the ensemble strategy is applied by integrating
decision trees, neural networks, and support vector machines. The
ensemble models are applied to real-world emergency data. This study
shows that the performance of the ensemble models is superior to that of
each single classifier.
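The idea of combining several classifiers can be illustrated with a minimal sketch. The abstract does not say how the outputs are combined, so plain majority voting is an assumption here, and the three functions below are hypothetical stand-ins for a trained decision tree, neural network and support vector machine, not the models of the study.

```python
from collections import Counter
from typing import Callable, Hashable, Sequence

Classifier = Callable[[Sequence[float]], Hashable]


def majority_vote(classifiers: Sequence[Classifier], x: Sequence[float]) -> Hashable:
    """Return the label predicted by the largest number of base classifiers."""
    votes = Counter(clf(x) for clf in classifiers)
    # most_common(1) returns [(label, count)] for the most frequent label.
    return votes.most_common(1)[0][0]


# Hypothetical stand-ins for trained base models (thresholds are made up).
def tree_like(x):
    return "cardiac" if x[0] > 0.5 else "non-cardiac"


def net_like(x):
    return "cardiac" if x[0] + x[1] > 1.0 else "non-cardiac"


def svm_like(x):
    return "cardiac" if x[1] > 0.7 else "non-cardiac"


ensemble = [tree_like, net_like, svm_like]
prediction = majority_vote(ensemble, [0.6, 0.3])
```

With an odd number of base classifiers and two classes, a vote can never tie, which is one reason three-member ensembles are a common starting point.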
Abstract: This paper investigates the problem of tracking spatiotemporal changes of a satellite image through the use of Knowledge Discovery in Databases (KDD). The purpose of this study is to help a given user effectively discover interesting knowledge and then build prediction and decision models. Unfortunately, the KDD process for spatiotemporal data is always marked by several types of imperfections. In our paper, we take these imperfections into consideration in order to provide more accurate decisions. To achieve this objective, different KDD methods are used to discover knowledge in satellite image databases. Each method presents a different point of view of the spatiotemporal evolution of a query model (which represents an object extracted from a satellite image). In order to combine these methods, we use the evidence fusion theory, which considerably improves the spatiotemporal knowledge discovery process and increases our belief in the spatiotemporal model change. Experimental results on satellite images representing the region of Auckland in New Zealand show the improvement in the overall change detection as compared to using classical methods.
Abstract: Purpose: To explore the use of the Curvelet transform to
extract texture features of pulmonary nodules in CT images and of a support
vector machine to establish a prediction model for small solitary
pulmonary nodules, in order to improve the rate of detection and
diagnosis of early-stage lung cancer. Methods: 2461 benign or
malignant small solitary pulmonary nodules in CT images from 129
patients were collected. Fourteen Curvelet transform textural features
were used as parameters to establish the support vector machine prediction
model. Results: Compared with other methods, using 252 texture
features as parameters to establish the prediction model is more appropriate.
The classification consistency, sensitivity, and specificity of the
model are 81.5%, 93.8%, and 38.0%, respectively. Conclusion: Based
on texture features extracted with the Curvelet transform, the support vector
machine prediction model is sensitive to lung cancer and can
improve the rate of diagnosis of early-stage lung cancer to some
extent.
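The three reported figures can be related to a confusion matrix with a short sketch. It assumes that "classification consistency" means overall agreement (accuracy); the counts in the example are hypothetical and are not the study's data.

```python
def diagnostic_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Compute accuracy, sensitivity and specificity from confusion-matrix counts.

    tp/fn: malignant nodules classified correctly / incorrectly
    tn/fp: benign nodules classified correctly / incorrectly
    """
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,    # assumed meaning of "consistency"
        "sensitivity": tp / (tp + fn),    # share of malignant cases detected
        "specificity": tn / (tn + fp),    # share of benign cases recognised
    }


# Hypothetical counts chosen only to illustrate the calculation.
metrics = diagnostic_metrics(tp=90, fn=10, tn=40, fp=60)
```

A model with high sensitivity and low specificity, as in the abstract, misses few malignant nodules but flags many benign ones.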
Abstract: Scale defects are common surface defects in hot steel rolling. The modelling of such defects is problematic and their causes are not straightforward. In this study, we investigated genetic algorithms in search of a mathematical solution to scale formation. For this research, a high-dimensional data set from a hot steel rolling process was gathered. The synchronisation of the variables as well as the allocation of the measurements made on the steel strip were solved before the modelling phase.
Abstract: In this study, a mathematical model was proposed and
the accuracy of this model was assessed to predict the growth of
Pseudomonas aeruginosa and rhamnolipid production under nitrogen
limiting (sodium nitrate) fed-batch fermentation. All of the
parameters used in this model were obtained individually without
using any data from the literature.
The overall growth kinetics of the strain were evaluated using a
dual-parallel substrate Monod equation which was described by
several batch experimental data. Fed-batch data under different
glycerol (as the sole carbon source, C/N=10) concentrations and feed
flow rates were used to describe the proposed fed-batch model and
other parameters. To verify the accuracy of the proposed
model, several verification experiments were performed over a wide
range of initial glycerol concentrations. While the results showed an
acceptable prediction for rhamnolipid production (less than 10%
error), the errors for biomass prediction were less than 23%. It
was also found that the rhamnolipid production by P. aeruginosa was
more sensitive at low glycerol concentrations.
Based on the findings of this work, it was concluded that the
proposed model could effectively be employed for rhamnolipid
production by this strain under fed-batch fermentation on up to 80 g l⁻¹
glycerol.
Abstract: Dynamic bandwidth allocation in EPONs can be
generally separated into inter-ONU scheduling and intra-ONU scheduling. In our previous work, the active intra-ONU scheduling
(AS) utilizes multiple queue reports (QRs) in each report message to cooperate with the inter-ONU scheduling and makes the granted
bandwidth fully utilized without leaving unused slot remainder (USR).
This scheme successfully solves the USR problem originating from the
inseparability of Ethernet frames. However, without a proper setting of the
threshold value in AS, the number of QRs allowed by the IEEE
802.3ah standard is not enough, especially in an unbalanced traffic
environment. This limitation may be addressed by enlarging the threshold
value. However, a large threshold implies a large gap between adjacent QRs, which results in a large difference between the best granted bandwidth and the actual granted bandwidth. In this paper, we integrate
AS with a cooperative prediction mechanism and distribute multiple
QRs to reduce the penalty brought by the prediction error.
Furthermore, to improve the QoS and reduce the use of queue reports,
the highest-priority (EF) traffic that arrives during the waiting time is
granted automatically by the OLT and is not included in the requested
bandwidth of the ONU. The simulation results show that the proposed
scheme has better performance metrics in terms of bandwidth
utilization and average delay for different classes of packets.
Abstract: Compression algorithms reduce the redundancy in
data representation to decrease the storage required for that data.
Lossless compression researchers have developed highly
sophisticated approaches, such as Huffman encoding, arithmetic
encoding, the Lempel-Ziv (LZ) family, Dynamic Markov
Compression (DMC), Prediction by Partial Matching (PPM), and
Burrows-Wheeler Transform (BWT) based algorithms.
Decompression is also required to retrieve the original data by
lossless means. A compression scheme for text files is presented, coupled with
the principle of dynamic decompression, which decompresses only
the section of the compressed text file required by the user instead of
decompressing the entire text file. Dynamically decompressed files offer
better disk space utilization due to higher compression ratios
compared to most of the currently available text file formats.
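The principle of decompressing only the requested section can be sketched with independently compressed blocks. This is not the scheme of the paper, whose details the abstract does not give; it swaps in zlib from the Python standard library, and the block size and function names are hypothetical choices made for illustration.

```python
import zlib
from typing import List

BLOCK_SIZE = 4096  # bytes of original text per block (hypothetical choice)


def compress_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> List[bytes]:
    """Compress the text as a list of independently decodable blocks."""
    return [zlib.compress(data[i:i + block_size])
            for i in range(0, len(data), block_size)]


def read_range(blocks: List[bytes], start: int, length: int,
               block_size: int = BLOCK_SIZE) -> bytes:
    """Return data[start:start + length], decompressing only the blocks it touches."""
    if length <= 0:
        return b""
    first = start // block_size
    last = (start + length - 1) // block_size
    chunk = b"".join(zlib.decompress(b) for b in blocks[first:last + 1])
    offset = start - first * block_size
    return chunk[offset:offset + length]


text = b"The quick brown fox jumps over the lazy dog. " * 500
blocks = compress_blocks(text)
section = read_range(blocks, 5000, 100)  # touches a single block only
```

Smaller blocks make random access cheaper but usually lower the compression ratio, because each block is compressed without the context of its neighbours.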
Abstract: Segmentation, filtering out of measurement errors and
identification of breakpoints are integral parts of any analysis of
microarray data for the detection of copy number variation (CNV).
Existing algorithms designed for these tasks have had some successes
in the past, but they tend to be O(N^2) in either computation time or
memory requirement, or both, and the rapid advance of microarray
resolution has practically rendered such algorithms useless. Here we
propose an algorithm, SAD, that is much faster and much less thirsty
for memory (O(N) in both computation time and memory requirement)
and offers higher accuracy. The two key ingredients of SAD are the
fundamental assumption in statistics that measurement errors are
normally distributed and the mathematical relation that the product of
two Gaussians is another Gaussian (function). We have produced a
computer program for analyzing CNV based on SAD. In addition to
being fast and small it offers two important features: quantitative
statistics for predictions and, with only two user-decided parameters,
ease of use. Its speed shows little dependence on genomic profile.
Running on an average modern computer, it completes CNV analyses
for a 262 thousand-probe array in ~1 second and a 1.8 million-probe
array in 9 seconds.
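The mathematical relation mentioned above can be checked numerically. The sketch below is not the SAD algorithm; it only illustrates the identity that the product of two Gaussian densities is proportional to a third Gaussian, using the standard formulas for the combined mean and variance. Function names are hypothetical.

```python
import math


def gaussian(x: float, mean: float, var: float) -> float:
    """Normal probability density with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fuse(m1: float, v1: float, m2: float, v2: float):
    """Mean and variance of the Gaussian proportional to N(m1, v1) * N(m2, v2)."""
    var = v1 * v2 / (v1 + v2)
    mean = (m1 * v2 + m2 * v1) / (v1 + v2)
    return mean, var


mean, var = fuse(0.0, 1.0, 2.0, 1.0)


def ratio(x: float) -> float:
    """Product of the two inputs divided by the fused Gaussian; constant in x."""
    return gaussian(x, 0.0, 1.0) * gaussian(x, 2.0, 1.0) / gaussian(x, mean, var)
```

Because the combined variance is always smaller than either input variance, combining two noisy measurements of the same quantity in this way narrows the estimate.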
Abstract: Software maintenance is an extremely important activity in the software development life cycle. It involves considerable human effort, cost and time. Software maintenance may be further subdivided into different activities such as fault prediction, fault detection, fault prevention, fault correction, etc. This topic has gained substantial attention due to sophisticated and complex applications, commercial hardware, clustered architectures and artificial intelligence. In this paper, we survey the work done in the field of software maintenance. Software fault prediction has been studied in the context of fault-prone modules, self-healing systems, developer information, maintenance models, etc. Still, many aspects, such as the modeling and weighting of the impact of different kinds of faults in various types of software systems, need to be explored in the field of fault severity.
Abstract: Time series models have been used to make predictions of academic enrollments, weather, road accident casualties, stock prices, etc. Based on the concepts of quantile regression models, we have developed a simple time-variant quantile-based fuzzy time series forecasting method. The proposed method bases the forecast on the prediction of the future trend of the data. In place of the actual quantiles of the data at each point, we have converted the statistical concept into a fuzzy concept by using fuzzy quantiles based on a fuzzy membership function ensemble. We have given a fuzzy metric to use the trend forecast and calculate the future value. The proposed model is applied to TAIFEX forecasting. It is shown that the proposed method works best compared to other models with respect to model complexity and forecasting accuracy.
Abstract: In this paper, we propose a new hybrid learning model for stock market index prediction by adding a passive congregation term to the standard hybrid model comprising Particle Swarm Optimization (PSO) with Genetic Algorithm (GA) operators in training Neural Networks (NN). This new passive congregation term is based on the cooperation between different particles in determining new positions, rather than depending on each particle's selfish thinking without considering other particles' positions; thus it enables PSO to perform both the local and the global search instead of only the local search. An experimental study carried out on the most famous European stock market indices, in both long-term and short-term prediction, shows the significant influence of the passive congregation term in improving the prediction accuracy compared to the standard hybrid model.
Abstract: Heart disease (HD) is a major cause of morbidity and mortality in modern society. Medical diagnosis is an important but complicated task that should be performed accurately and efficiently, and its automation would be very useful. Unfortunately, not all doctors are equally skilled in every subspecialty, and in many places they are a scarce resource. A system for automated medical diagnosis would enhance medical care and reduce costs. In this paper, a new approach based on a coactive neuro-fuzzy inference system (CANFIS) is presented for the prediction of heart disease. The proposed CANFIS model combines the adaptive capabilities of neural networks and the qualitative approach of fuzzy logic, and is then integrated with a genetic algorithm to diagnose the presence of the disease. The performance of the CANFIS model was evaluated in terms of training performance and classification accuracy, and the results showed that the proposed CANFIS model has great potential in predicting heart disease.
Abstract: A Wireless Sensor Network (WSN) comprises sensor
nodes which are designed to sense the environment and transmit sensed
data back to the base station via multi-hop routing to reconstruct
physical phenomena. Since physical phenomena exhibit significant
overlaps between temporal redundancy and spatial redundancy, it is
necessary to use Redundancy Suppression Algorithms (RSAs) for sensor
nodes to lower energy consumption by reducing the transmission
of redundant data. A conventional RSA is the threshold-based
RSA, which sets a threshold to suppress redundant data. Although
many temporal and spatial RSAs have been proposed, temporal-spatial RSAs
are seldom proposed because it is difficult to determine when
to utilize temporal or spatial RSAs. In this paper, we propose a
novel temporal-spatial redundancy suppression algorithm, the Codebook-based
Redundancy Suppression Mechanism (CRSM). CRSM adopts
vector quantization to generate a codebook, which is easily used to
implement a temporal-spatial RSA. CRSM not only achieves power
saving and reliability for WSNs, but also provides predictability
of network lifetime. Simulation results show that the network lifetime
of CRSM exceeds that of other RSAs by at least 23%.
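The conventional threshold-based approach mentioned above can be sketched as follows. This illustrates only temporal suppression at a single node, not CRSM itself; the rule of comparing each reading with the last transmitted value is an assumption, and the readings are hypothetical.

```python
from typing import List, Sequence, Tuple


def suppress(readings: Sequence[float], threshold: float) -> List[Tuple[int, float]]:
    """Return the (time index, value) pairs a node would actually transmit.

    A reading is sent only if it differs from the last transmitted value by
    more than the threshold; otherwise it is treated as redundant.
    """
    sent: List[Tuple[int, float]] = []
    last = None
    for t, value in enumerate(readings):
        if last is None or abs(value - last) > threshold:
            sent.append((t, value))
            last = value  # the base station now holds this value
    return sent


# Hypothetical temperature readings from one sensor node.
readings = [20.0, 20.1, 20.2, 21.0, 21.1, 19.5]
transmitted = suppress(readings, threshold=0.5)
```

Here three of the six readings are transmitted; the base station can reconstruct the others to within the threshold by holding the last received value.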
Abstract: Protein structure determination and prediction have
been a focal research subject in the field of bioinformatics due to the
importance of protein structure in understanding the biological and
chemical activities of organisms. The experimental methods used by
biotechnologists to determine the structures of proteins demand
sophisticated equipment and time. A host of computational methods
have been developed to predict the location of secondary structure elements
in proteins for complementing or creating insights into experimental
results. However, prediction accuracies of these methods rarely
exceed 70%.
Abstract: The main objectives of this paper are to measure
pollutant concentrations in the oil refinery area in Kuwait over three
periods during one year, obtain a recent emission inventory for the
three refineries of Kuwait, use AERMOD and the emission inventory
to predict pollutant concentrations and distribution, compare model
predictions against measured data, and perform numerical
experiments to determine the conditions at which emission rates and the
resulting pollutant dispersion are below the maximum allowable limits.
Abstract: The simulation of the extrusion process is studied widely
in order to both increase production and improve quality, with broad
application in wire coating. The annular tube-tooling extrusion was
modelled by the Navier-Stokes equations in
addition to a rheological model of differential form based on the single-mode
exponential Phan-Thien/Tanner constitutive equation in a two-dimensional
cylindrical coordinate system for predicting the
contraction point of the polymer melt beyond the die. Numerical
solutions are sought through a semi-implicit Taylor-Galerkin pressure-correction
finite element scheme. The investigation was focused on
incompressible creeping flow with long relaxation time in terms of
Weissenberg numbers up to 200. The isothermal case was considered
with a surface tension effect on the free surface in the extrudate flow and no
slip at the die wall. The Streamline Upwind Petrov-Galerkin method has been
proposed to stabilize the solution. The structure of the mesh after the die exit
was adjusted following prediction of both top and bottom free
surfaces so as to keep the location of the contraction point around one
unit length which is close to experimental results.
Abstract: Diabetes mellitus is a chronic metabolic disorder in which improper management of the blood glucose level of diabetic patients leads to the risk of heart attack, kidney disease and renal failure. This paper attempts to enhance the accuracy of predicting the upcoming blood glucose levels of diabetic patients by combining principal component analysis and a wavelet neural network. The proposed system makes separate blood glucose predictions for the morning, afternoon, evening and night intervals, using a dataset from one patient covering a period of 77 days. The diagnostic accuracy is compared with that of other neural network models that use the same dataset. The comparison results show an overall improvement in accuracy, which indicates the effectiveness of the proposed system.
Abstract: This paper focuses on the data-driven generation
of fuzzy IF...THEN rules. The resulting fuzzy rule base can be
applied to build a classifier, a model used for prediction, or
it can be applied to form a decision support system. Among
the wide range of possible approaches, the decision tree and
the association rule based algorithms are overviewed, and two
new approaches are presented based on the a priori fuzzy
clustering based partitioning of the continuous input variables.
An application study is also presented, where the developed
methods are tested on the well-known Wisconsin Breast Cancer
classification problem.
Abstract: Wind is among the potential energy resources which
can be harnessed to generate wind energy for conversion into
electrical power. Due to the variability of wind speed with time and
height, it is difficult to predict the generated wind energy
accurately. In this paper, an attempt is made to establish a
probabilistic model fitting the wind speed data recorded at
Makambako site in Tanzania. Wind speed and direction were
measured using an anemometer (type AN1) and a wind vane
(type WD1), respectively, both supplied by Delta-T-Devices, at a measurement
height of 2 m. Wind speeds were then extrapolated to a height of
10 m using the power law equation with an exponent of 0.47. Data were
analysed using MINITAB statistical software to show the variability
of wind speeds with time and height, and to determine the underlying
probability model of the extrapolated wind speed data. The results
show that wind speeds at Makambako site vary cyclically over time;
and they conform to the Weibull probability distribution. From these
results, the Weibull probability density function can be used to predict
the wind energy.
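The two calculations named in this abstract can be sketched in a few lines. The power law form v2 = v1 (h2/h1)^alpha and the two-parameter Weibull density are the standard textbook expressions; the heights and exponent are taken from the abstract, while the function names, the example speed and the Weibull shape and scale values are hypothetical, since the fitted parameters are not reported here.

```python
import math


def extrapolate_speed(v_ref: float, h_ref: float = 2.0,
                      h_target: float = 10.0, alpha: float = 0.47) -> float:
    """Power law extrapolation of wind speed from height h_ref to h_target."""
    return v_ref * (h_target / h_ref) ** alpha


def weibull_pdf(v: float, shape: float, scale: float) -> float:
    """Two-parameter Weibull probability density for wind speed v.

    In this sketch non-positive speeds are simply given zero density.
    """
    if v <= 0.0:
        return 0.0
    return (shape / scale) * (v / scale) ** (shape - 1.0) * math.exp(-(v / scale) ** shape)


# Hypothetical 2 m reading of 4.0 m/s extrapolated to 10 m.
v10 = extrapolate_speed(4.0)

# Hypothetical Weibull parameters, used only to show the call.
density = weibull_pdf(v10, shape=2.0, scale=8.0)
```

With an exponent of 0.47, the speed at 10 m is a little more than twice the speed measured at 2 m, so the choice of exponent strongly affects the estimated energy.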
Abstract: Nowadays, precipitation prediction is required for proper planning and management of water resources. Prediction with neural network models has received increasing interest in various research and application domains. However, it is difficult to determine the best neural network architecture for prediction, since it is not immediately obvious how many input or hidden nodes should be used in the model. In this paper, a neural network model is used as a forecasting tool. The major aim is to evaluate a suitable neural network model for monthly precipitation mapping of Myanmar. Using 3-layered neural network models, 100 cases are tested by varying the numbers of input and hidden nodes from 1 to 10 each, with only one output node. The optimum model with a suitable number of nodes is selected in accordance with the minimum forecast error. Measuring network performance by Root Mean Square Error (RMSE), the experimental results show that the architecture with 3 input nodes, 10 hidden nodes, and 1 output node gives the best prediction result for monthly precipitation in Myanmar.
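The selection procedure described above can be sketched without training any network. The code below only shows the RMSE calculation, the enumeration of the 100 candidate architectures, and the choice of the one with the smallest error; the function names and the error values in the example are hypothetical, not results from the study.

```python
import math
from itertools import product
from typing import Dict, Sequence, Tuple


def rmse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Root Mean Square Error between observed and predicted values."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


# 10 input-node counts x 10 hidden-node counts = 100 candidate architectures.
candidates = list(product(range(1, 11), range(1, 11)))


def select_best(errors: Dict[Tuple[int, int], float]) -> Tuple[int, int]:
    """Return the (input nodes, hidden nodes) pair with the smallest RMSE."""
    return min(errors, key=errors.get)


# Hypothetical forecast errors for three of the candidates.
example_errors = {(1, 1): 0.50, (3, 10): 0.20, (10, 10): 0.35}
best = select_best(example_errors)
```

In practice each candidate would be trained and evaluated on held-out months before its RMSE enters the comparison.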