Clinical Decision Support for Disease Classification based on the Tests Association

Researchers have developed various tools and methodologies for effective clinical decision-making. Among such decisions, chest pain diseases have been one of the important diagnostic issues, especially in emergency departments. To improve physicians' diagnostic ability, many researchers have developed diagnostic intelligence using machine learning and data mining. However, most conventional methodologies rely on a single classifier for disease classification and prediction, which yields only moderate performance. This study uses an ensemble strategy that combines multiple different classifiers to help physicians diagnose chest pain diseases more accurately. Specifically, the ensemble strategy integrates decision trees, neural networks, and support vector machines. The ensemble models are applied to real-world emergency data. The study shows that the ensemble models outperform each of the single classifiers.
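
The abstract does not say how the three classifiers are combined. As a minimal sketch, assuming simple majority voting over per-sample labels (one common ensemble rule, not necessarily the one used in the study), the combination step could look like this. The function name and the 0/1 labels are hypothetical.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine label lists from several classifiers, one label per sample.

    predictions: list of equal-length label lists, one list per classifier.
    Ties are resolved in favour of the label seen first for that sample.
    """
    combined = []
    for labels in zip(*predictions):
        # most_common(1) yields the label with the highest vote count
        combined.append(Counter(labels).most_common(1)[0][0])
    return combined

# Hypothetical outputs of a decision tree, a neural network and an SVM
tree_pred = [1, 0, 1, 0]
nn_pred = [1, 1, 0, 0]
svm_pred = [0, 1, 1, 0]
ensemble_pred = majority_vote([tree_pred, nn_pred, svm_pred])
```

With an odd number of binary classifiers no tie can occur, which is one reason three base models are a convenient choice for this rule.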

Improving Spatiotemporal Change Detection: A High Level Fusion Approach for Discovering Uncertain Knowledge from Satellite Image Database

This paper investigates the problem of tracking spatiotemporal changes of a satellite image through the use of Knowledge Discovery in Databases (KDD). The purpose of this study is to help a given user effectively discover interesting knowledge and then build prediction and decision models. Unfortunately, the KDD process for spatiotemporal data is always marked by several types of imperfections. In this paper, we take these imperfections into consideration in order to provide more accurate decisions. To achieve this objective, different KDD methods are used to discover knowledge in satellite image databases. Each method presents a different point of view on the spatiotemporal evolution of a query model (which represents an object extracted from a satellite image). To combine these methods, we use evidence fusion theory, which considerably improves the spatiotemporal knowledge discovery process and increases our belief in the spatiotemporal model change. Experimental results on satellite images of the Auckland region in New Zealand show an improvement in overall change detection compared with classical methods.
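
The abstract names "evidence fusion theory" without giving the combination rule. A minimal sketch, assuming this refers to Dempster-Shafer evidence theory and its standard rule of combination (the abstract does not confirm this), is shown below. The mass values and the two-hypothesis frame are hypothetical.

```python
def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of hypotheses to its mass.
    Mass that falls on empty intersections is treated as conflict
    and removed by renormalisation.
    """
    combined = {}
    conflict = 0.0
    for set_a, mass_a in m1.items():
        for set_b, mass_b in m2.items():
            intersection = set_a & set_b
            if intersection:
                combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
            else:
                conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict and cannot be combined")
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

# Hypothetical frame of discernment: the query model changed or it did not
CHANGE = frozenset({"change"})
EITHER = frozenset({"change", "no_change"})  # ignorance: mass on the whole frame

source_a = {CHANGE: 0.6, EITHER: 0.4}  # hypothetical output of one KDD method
source_b = {CHANGE: 0.7, EITHER: 0.3}  # hypothetical output of another
fused = dempster_combine(source_a, source_b)
```

In this example the fused mass on "change" is higher than either source alone, which illustrates how agreement between methods can increase belief in a change.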

Support Vector Machine Prediction Model of Early-stage Lung Cancer Based on Curvelet Transform to Extract Texture Features of CT Image

Purpose: To explore the use of the Curvelet transform to extract texture features of pulmonary nodules in CT images, and of a support vector machine to establish a prediction model for small solitary pulmonary nodules, in order to improve the detection and diagnosis rate of early-stage lung cancer. Methods: 2461 benign or malignant small solitary pulmonary nodules in CT images from 129 patients were collected. Fourteen Curvelet transform textural features were used as parameters to establish the support vector machine prediction model. Results: Compared with other methods, using 252 texture features as parameters to establish the prediction model is more appropriate. The classification consistency, sensitivity and specificity of the model are 81.5%, 93.8% and 38.0%, respectively. Conclusion: Based on texture features extracted with the Curvelet transform, the support vector machine prediction model is sensitive to lung cancer and can improve the diagnosis rate for early-stage lung cancer to some extent.
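
For readers unfamiliar with the reported metrics, the sketch below shows the usual definitions of sensitivity and specificity from confusion-matrix counts. It assumes that "classification consistency" corresponds to overall accuracy, which the abstract does not state. The counts are hypothetical and are not the study's data.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts.

    tp/fn: malignant nodules classified correctly/incorrectly.
    tn/fp: benign nodules classified correctly/incorrectly.
    """
    sensitivity = tp / (tp + fn)   # fraction of malignant cases detected
    specificity = tn / (tn + fp)   # fraction of benign cases correctly rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return accuracy, sensitivity, specificity

# Hypothetical counts for illustration only
accuracy, sensitivity, specificity = classification_metrics(tp=90, fn=10, tn=40, fp=60)
```

A model can combine high sensitivity with low specificity, as in the reported results: it rarely misses a malignant nodule but flags many benign ones.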

Genetic Algorithms in Hot Steel Rolling for Scale Defect Prediction

Scale defects are common surface defects in hot steel rolling. The modelling of such defects is problematic and their causes are not straightforward. In this study, we investigated genetic algorithms in the search for a mathematical solution to scale formation. For this research, a high-dimensional data set from a hot steel rolling process was gathered. The synchronisation of the variables and the allocation of the measurements made on the steel strip were resolved before the modelling phase.

A Mathematical Modelling to Predict Rhamnolipid Production by Pseudomonas aeruginosa under Nitrogen Limiting Fed-Batch Fermentation

In this study, a mathematical model was proposed and its accuracy was assessed for predicting the growth of Pseudomonas aeruginosa and rhamnolipid production under nitrogen-limiting (sodium nitrate) fed-batch fermentation. All of the parameters used in this model were obtained individually without using any data from the literature. The overall growth kinetics of the strain were evaluated using a dual-parallel substrate Monod equation, which was fitted to data from several batch experiments. Fed-batch data under different glycerol concentrations (glycerol as the sole carbon source, C/N=10) and feed flow rates were used to determine the proposed fed-batch model and its other parameters. To verify the accuracy of the proposed model, several verification experiments were performed over a wide range of initial glycerol concentrations. The results showed an acceptable prediction for rhamnolipid production (less than 10% error), while for biomass prediction the errors were less than 23%. It was also found that rhamnolipid production by P. aeruginosa was more sensitive at low glycerol concentrations. Based on the findings of this work, it was concluded that the proposed model could effectively be employed for rhamnolipid production by this strain under fed-batch fermentation on up to 80 g/l glycerol.
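
The abstract does not give the form of the dual-parallel substrate Monod equation. One possible reading, used here purely as an assumption, is a sum of two single-substrate Monod terms, one for the carbon source and one for the nitrogen source. All parameter names and values below are hypothetical.

```python
def monod_term(mu_max, k_s, s):
    """Single-substrate Monod term: mu_max * S / (K_s + S)."""
    return mu_max * s / (k_s + s)

def dual_parallel_growth_rate(s_carbon, s_nitrogen,
                              mu_max_c, k_c, mu_max_n, k_n):
    """Specific growth rate as the sum of two Monod terms.

    Assumption: 'parallel' means the two substrate contributions add up.
    The model in the paper may differ.
    """
    return (monod_term(mu_max_c, k_c, s_carbon)
            + monod_term(mu_max_n, k_n, s_nitrogen))

# Hypothetical concentrations (g/l) and kinetic parameters (1/h, g/l)
mu = dual_parallel_growth_rate(s_carbon=20.0, s_nitrogen=0.5,
                               mu_max_c=0.3, k_c=5.0,
                               mu_max_n=0.1, k_n=0.2)
```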

Active Intra-ONU Scheduling with Cooperative Prediction Mechanism in EPONs

Dynamic bandwidth allocation in EPONs can generally be separated into inter-ONU scheduling and intra-ONU scheduling. In our previous work, active intra-ONU scheduling (AS) used multiple queue reports (QRs) in each report message to cooperate with the inter-ONU scheduling, so that the granted bandwidth is fully utilized without leaving an unused slot remainder (USR). This scheme successfully solves the USR problem, which originates from the inseparability of Ethernet frames. However, without a proper setting of the threshold value in AS, the number of QRs allowed by the IEEE 802.3ah standard is not sufficient, especially in an unbalanced traffic environment. This limitation may be addressed by enlarging the threshold value, but a large threshold implies a large gap between adjacent QRs and thus a large difference between the best granted bandwidth and the actual granted bandwidth. In this paper, we integrate AS with a cooperative prediction mechanism and distribute multiple QRs to reduce the penalty caused by prediction error. Furthermore, to improve QoS and reduce the usage of queue reports, the highest-priority (EF) traffic that arrives during the waiting time is granted automatically by the OLT and is not included in the bandwidth requested by the ONU. Simulation results show that the proposed scheme performs better in terms of bandwidth utilization and average delay for different classes of packets.

Dynamic Decompression for Text Files

Compression algorithms reduce the redundancy in data representation to decrease the storage required for that data. Lossless compression researchers have developed highly sophisticated approaches, such as Huffman encoding, arithmetic encoding, the Lempel-Ziv (LZ) family, Dynamic Markov Compression (DMC), Prediction by Partial Matching (PPM), and Burrows-Wheeler Transform (BWT) based algorithms. Decompression is also required to retrieve the original data by lossless means. This work describes a compression scheme for text files coupled with the principle of dynamic decompression, which decompresses only the section of the compressed text file required by the user instead of decompressing the entire file. Dynamically decompressed files offer better disk space utilization due to higher compression ratios compared with most currently available text file formats.
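
The abstract does not describe how partial decompression is achieved. A minimal sketch of one generic way to do it, assuming the text is split into fixed-size blocks that are compressed independently with zlib, is given below. This is an illustration of the principle, not the scheme proposed in the paper, and the function names and block size are hypothetical.

```python
import zlib

def compress_blocks(text, block_size=1024):
    """Split UTF-8 encoded text into fixed-size blocks and compress each one."""
    data = text.encode("utf-8")
    return [zlib.compress(data[i:i + block_size])
            for i in range(0, len(data), block_size)]

def read_range(blocks, start, end, block_size=1024):
    """Return bytes [start, end) of the original data.

    Only the blocks that overlap the requested range are decompressed.
    """
    if start >= end:
        return b""
    first = start // block_size
    last = (end - 1) // block_size
    chunk = b"".join(zlib.decompress(b) for b in blocks[first:last + 1])
    offset = first * block_size  # position of chunk[0] in the original data
    return chunk[start - offset:end - offset]

sample_text = "abcdefghij" * 500          # 5000 bytes of ASCII text
blocks = compress_blocks(sample_text)
section = read_range(blocks, 1020, 1030)   # touches only blocks 0 and 1
```

Independent blocks usually compress somewhat worse than the whole file, so the block size trades random-access cost against compression ratio.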

A Pairwise-Gaussian-Merging Approach: Towards Genome Segmentation for Copy Number Analysis

Segmentation, filtering out of measurement errors and identification of breakpoints are integral parts of any analysis of microarray data for the detection of copy number variation (CNV). Existing algorithms designed for these tasks have had some success in the past, but they tend to be O(N^2) in computation time, memory requirement, or both, and the rapid advance of microarray resolution has rendered such algorithms practically useless. Here we propose an algorithm, SAD, that is much faster and much less demanding of memory (O(N) in both computation time and memory requirement) and offers higher accuracy. The two key ingredients of SAD are the fundamental assumption in statistics that measurement errors are normally distributed and the mathematical relation that the product of two Gaussians is another Gaussian (function). We have produced a computer program for analyzing CNV based on SAD. In addition to being fast and small, it offers two important features: quantitative statistics for predictions and, with only two user-decided parameters, ease of use. Its speed shows little dependence on genomic profile. Running on an average modern computer, it completes CNV analyses for a 262 thousand-probe array in ~1 second and a 1.8 million-probe array in 9 seconds.
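
The relation that the product of two Gaussians is again a Gaussian (up to a constant factor) can be made concrete with a short sketch. The formulas below are the standard precision-weighted result; how SAD actually decides which neighbouring segments to merge is not given in the abstract, so this shows only the underlying identity. The function name is hypothetical.

```python
def merge_gaussians(mean1, var1, mean2, var2):
    """Mean and variance of the Gaussian proportional to the product
    N(mean1, var1) * N(mean2, var2).

    Precisions (inverse variances) add, and the mean is the
    precision-weighted average of the two means.
    """
    precision = 1.0 / var1 + 1.0 / var2
    var = 1.0 / precision
    mean = var * (mean1 / var1 + mean2 / var2)
    return mean, var

# Two unit-variance Gaussians centred at 0 and 2
merged_mean, merged_var = merge_gaussians(0.0, 1.0, 2.0, 1.0)
```

The merged variance is always smaller than either input variance, which is why repeated pairwise merging sharpens the estimate of a segment's level.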

Impact of Faults in Different Software Systems: A Survey

Software maintenance is an extremely important activity in the software development life cycle. It involves considerable human effort, cost and time. Software maintenance may be further subdivided into activities such as fault prediction, fault detection, fault prevention and fault correction. This topic has gained substantial attention due to sophisticated and complex applications, commercial hardware, clustered architectures and artificial intelligence. In this paper, we survey the work done in the field of software maintenance. Software fault prediction has been studied in the context of fault-prone modules, self-healing systems, developer information, maintenance models, etc. Still, in the field of fault severity, much remains to be explored, such as modelling and weighting the impact of different kinds of faults in various types of software systems.

A New Quantile Based Fuzzy Time Series Forecasting Model

Time series models have been used to predict academic enrollments, weather, road accidents, casualties, stock prices, etc. Based on the concepts of quantile regression models, we have developed a simple time-variant quantile-based fuzzy time series forecasting method. The proposed method bases the forecast on a prediction of the future trend of the data. In place of the actual quantiles of the data at each point, we convert the statistical concept into a fuzzy concept by using fuzzy quantiles obtained from an ensemble of fuzzy membership functions. We give a fuzzy metric that uses the trend forecast to calculate the future value. The proposed model is applied to TAIFEX forecasting. The proposed method is shown to work best among the compared models with respect to model complexity and forecasting accuracy.

A New Hybrid Model with Passive Congregation for Stock Market Indices Prediction

In this paper, we propose a new hybrid learning model for stock market indices prediction by adding a passive congregation term to the standard hybrid model, which combines Particle Swarm Optimization (PSO) with Genetic Algorithm (GA) operators for training Neural Networks (NN). The new passive congregation term is based on cooperation between different particles in determining new positions, rather than on each particle's selfish thinking without regard to other particles' positions. It thus enables PSO to perform both local and global search instead of local search only. An experimental study carried out on the best-known European stock market indices, for both long-term and short-term prediction, shows that the passive congregation term significantly improves prediction accuracy compared with the standard hybrid model.
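
The abstract does not give the update equations. As a hedged sketch, a commonly cited form of PSO with passive congregation adds a fourth term that pulls each particle toward another, randomly chosen particle of the swarm. The version below assumes that form and omits the GA operators and the neural network training entirely. All coefficient values and names are hypothetical.

```python
import random

def pc_velocity(v, x, pbest, gbest, other,
                w=0.7, c1=1.5, c2=1.5, c3=0.6, rng=random):
    """One velocity update for a single particle.

    v, x:   current velocity and position (lists of floats)
    pbest:  the particle's own best position (cognitive term)
    gbest:  the swarm's best position (social term)
    other:  position of another, randomly selected particle
            (assumed form of the passive congregation term)
    """
    new_v = []
    for vi, xi, pi, gi, oi in zip(v, x, pbest, gbest, other):
        r1, r2, r3 = rng.random(), rng.random(), rng.random()
        new_v.append(w * vi
                     + c1 * r1 * (pi - xi)
                     + c2 * r2 * (gi - xi)
                     + c3 * r3 * (oi - xi))
    return new_v

# With all attraction coefficients set to zero only the inertia term remains
inertia_only = pc_velocity([1.0, 2.0], [0.0, 0.0], [1.0, 1.0], [1.0, 1.0],
                           [1.0, 1.0], w=0.5, c1=0.0, c2=0.0, c3=0.0)
```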

Intelligent Heart Disease Prediction System Using CANFIS and Genetic Algorithm

Heart disease (HD) is a major cause of morbidity and mortality in modern society. Medical diagnosis is an important but complicated task that should be performed accurately and efficiently, and its automation would be very useful. Unfortunately, not all doctors are equally skilled in every subspecialty, and in many places they are a scarce resource. A system for automated medical diagnosis would enhance medical care and reduce costs. In this paper, a new approach based on the coactive neuro-fuzzy inference system (CANFIS) is presented for the prediction of heart disease. The proposed CANFIS model combines the adaptive capabilities of neural networks with the qualitative approach of fuzzy logic, and is then integrated with a genetic algorithm to diagnose the presence of the disease. The performance of the CANFIS model was evaluated in terms of training performance and classification accuracy, and the results showed that the proposed model has great potential for predicting heart disease.

A Codebook-based Redundancy Suppression Mechanism with Lifetime Prediction in Cluster-based WSN

A Wireless Sensor Network (WSN) comprises sensor nodes that are designed to sense the environment and transmit the sensed data back to the base station via multi-hop routing in order to reconstruct physical phenomena. Since significant overlaps exist between temporal redundancy and spatial redundancy in physical phenomena, sensor nodes need Redundancy Suppression Algorithms (RSAs) to lower energy consumption by reducing the transmission of redundant data. A conventional RSA is the threshold-based RSA, which sets a threshold to suppress redundant data. Although many temporal and spatial RSAs have been proposed, temporal-spatial RSAs are seldom proposed because it is difficult to determine when to use temporal or spatial suppression. In this paper, we propose a novel temporal-spatial redundancy suppression algorithm, the Codebook-based Redundancy Suppression Mechanism (CRSM). CRSM adopts vector quantization to generate a codebook, which makes a temporal-spatial RSA easy to implement. CRSM not only achieves power saving and reliability for the WSN, but also makes the network lifetime predictable. Simulation results show that the network lifetime of CRSM exceeds that of other RSAs by at least 23%.
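
To illustrate the vector quantization step in general terms, the sketch below maps a sensor reading to the index of its nearest codeword, so that a node could transmit a short index instead of the full reading. How CRSM builds its codebook and decides when to suppress a transmission is not described in the abstract. The codebook values and names here are hypothetical.

```python
def nearest_codeword(vector, codebook):
    """Return the index of the codeword closest to vector (squared Euclidean distance)."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)),
               key=lambda i: squared_distance(vector, codebook[i]))

# Hypothetical codebook of (temperature, humidity) pairs
codebook = [(20.0, 40.0), (25.0, 55.0), (30.0, 70.0)]
reading = (24.2, 53.0)
index = nearest_codeword(reading, codebook)

# One possible suppression rule (an assumption): skip the transmission
# when the index is unchanged since the last report.
last_sent_index = 1
should_transmit = index != last_sent_index
```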

Protein Secondary Structure Prediction

Protein structure determination and prediction has been a focal research subject in the field of bioinformatics due to the importance of protein structure in understanding the biological and chemical activities of organisms. The experimental methods used by biotechnologists to determine the structures of proteins demand sophisticated equipment and time. A host of computational methods have been developed to predict the location of secondary structure elements in proteins, either to complement experimental results or to provide insight into them. However, the prediction accuracies of these methods rarely exceed 70%.

Oil Refineries Emissions: Source and Impact: A Study using AERMOD

The main objectives of this paper are to measure pollutant concentrations in the oil refinery area in Kuwait over three periods during one year, obtain a recent emission inventory for the three refineries of Kuwait, use AERMOD and the emission inventory to predict pollutant concentrations and distribution, compare model predictions against measured data, and perform numerical experiments to determine the conditions under which emission rates and the resulting pollutant dispersion are below the maximum allowable limits.

The Contraction Point for Phan-Thien/Tanner Model of Tube-Tooling Wire-Coating Flow

The simulation of the extrusion process is widely studied in order to increase production and improve quality, with broad application in wire coating. The annular tube-tooling extrusion was modelled by the Navier-Stokes equations together with a rheological model of differential form based on the single-mode exponential Phan-Thien/Tanner constitutive equation in a two-dimensional cylindrical coordinate system, in order to predict the contraction point of the polymer melt beyond the die. Numerical solutions are sought through a semi-implicit Taylor-Galerkin pressure-correction finite element scheme. The investigation focused on incompressible creeping flow with long relaxation times, in terms of Weissenberg numbers up to 200. The isothermal case was considered, with a surface tension effect on the free surface in the extrudate flow and no slip at the die wall. The Streamline Upwind Petrov-Galerkin method was used to stabilize the solution. The structure of the mesh after the die exit was adjusted following the prediction of both the top and bottom free surfaces so as to keep the location of the contraction point around one unit length, which is close to experimental results.

A Neural Network Approach in Predicting the Blood Glucose Level for Diabetic Patients

Diabetes Mellitus is a chronic metabolic disorder in which improper management of the blood glucose level in diabetic patients leads to a risk of heart attack, kidney disease and renal failure. This paper attempts to enhance the diagnostic accuracy for the advancing blood glucose levels of diabetic patients by combining principal component analysis and a wavelet neural network. The proposed system makes separate blood glucose predictions for the morning, afternoon, evening and night intervals, using a dataset from one patient covering a period of 77 days. The diagnostic accuracy is compared with that of other neural network models that use the same dataset. The comparison shows an overall improvement in accuracy, which indicates the effectiveness of the proposed system.

Association Rule and Decision Tree based Methods for Fuzzy Rule Base Generation

This paper focuses on the data-driven generation of fuzzy IF...THEN rules. The resulting fuzzy rule base can be applied to build a classifier or a model used for prediction, or to form a decision support system. Among the wide range of possible approaches, the decision tree and association rule based algorithms are reviewed, and two new approaches are presented that rely on a priori fuzzy-clustering-based partitioning of the continuous input variables. An application study is also presented, in which the developed methods are tested on the well-known Wisconsin Breast Cancer classification problem.

Establishing a Probabilistic Model of Extrapolated Wind Speed Data for Wind Energy Prediction

Wind is among the potential energy resources that can be harnessed for conversion into electrical power. Because wind speed varies with time and height, it is difficult to predict the generated wind energy accurately. In this paper, an attempt is made to establish a probabilistic model fitting the wind speed data recorded at the Makambako site in Tanzania. Wind speed and direction were measured using an anemometer (type AN1) and a wind vane (type WD1), respectively, both supplied by Delta-T-Devices, at a measurement height of 2 m. Wind speeds were then extrapolated to a height of 10 m using the power law equation with an exponent of 0.47. Data were analysed using MINITAB statistical software to show the variability of wind speeds with time and height, and to determine the underlying probability model of the extrapolated wind speed data. The results show that wind speeds at the Makambako site vary cyclically over time and conform to the Weibull probability distribution. From these results, the Weibull probability density function can be used to predict the wind energy.
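
The two formulas named in the abstract can be sketched as follows: the power law extrapolation from 2 m to 10 m with the stated exponent of 0.47, and the two-parameter Weibull probability density. The Weibull shape and scale values below are hypothetical placeholders, since the abstract does not report the fitted parameters.

```python
import math

def extrapolate_wind_speed(v_ref, h_ref=2.0, h_target=10.0, alpha=0.47):
    """Power law: v_target = v_ref * (h_target / h_ref) ** alpha."""
    return v_ref * (h_target / h_ref) ** alpha

def weibull_pdf(v, shape, scale):
    """Two-parameter Weibull density for wind speed v >= 0."""
    if v < 0:
        return 0.0
    ratio = v / scale
    return (shape / scale) * ratio ** (shape - 1) * math.exp(-ratio ** shape)

# A 3 m/s reading at 2 m, extrapolated to 10 m
v10 = extrapolate_wind_speed(3.0)

# Density of that speed under hypothetical Weibull parameters
density = weibull_pdf(v10, shape=2.0, scale=7.0)
```

With an exponent of 0.47, speeds at 10 m are roughly 2.1 times those measured at 2 m, so the choice of exponent strongly affects the energy estimate.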

Optimum Neural Network Architecture for Precipitation Prediction of Myanmar

Nowadays, precipitation prediction is required for proper planning and management of water resources. Prediction with neural network models has received increasing interest in various research and application domains. However, it is difficult to determine the best neural network architecture for prediction, since it is not immediately obvious how many input or hidden nodes should be used in the model. In this paper, a neural network model is used as a forecasting tool. The major aim is to find a suitable neural network model for monthly precipitation mapping of Myanmar. Using 3-layered neural network models, 100 cases are tested by varying the number of input nodes and the number of hidden nodes from 1 to 10 each, with a single output node. The optimum model, with the most suitable number of nodes, is selected according to the minimum forecast error. Measuring network performance by Root Mean Square Error (RMSE), the experimental results show that the architecture with 3 input nodes, 10 hidden nodes and 1 output node gives the best prediction result for monthly precipitation in Myanmar.
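
The selection procedure described above (10 x 10 = 100 candidate architectures ranked by RMSE) can be sketched as follows. Training the networks is outside the scope of this sketch, so `toy_error` is a placeholder error surface standing in for "train the network and return its forecast RMSE". It is not real data and does not reproduce the paper's results.

```python
import math
from itertools import product

def rmse(observed, predicted):
    """Root Mean Square Error between two equal-length sequences."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def select_architecture(evaluate, max_inputs=10, max_hidden=10):
    """Score every (inputs, hidden) pair and return the one with minimum error."""
    candidates = product(range(1, max_inputs + 1), range(1, max_hidden + 1))
    scores = {(n_in, n_hid): evaluate(n_in, n_hid) for n_in, n_hid in candidates}
    best = min(scores, key=scores.get)
    return best, scores

def toy_error(n_inputs, n_hidden):
    """Placeholder for training a network and measuring its forecast RMSE."""
    return (n_inputs - 2) ** 2 + (n_hidden - 5) ** 2 + 1.0

best, scores = select_architecture(toy_error)
example_rmse = rmse([1.0, 2.0], [1.0, 4.0])
```

In practice `evaluate` would train a 3-layer network with the given node counts and return its RMSE on held-out months, so that the comparison reflects forecast error rather than training fit.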