Empirical Statistical Modeling of Rainfall Prediction over Myanmar

Agriculture, one of the essential sectors of Myanmar's economy, is sensitive to climate variation. The climatic element with the greatest impact on the agricultural sector is rainfall, so rainfall prediction is an important issue for an agricultural country. Multi-variable polynomial regression (MPR) provides an effective way to describe complex nonlinear input-output relationships, so that an outcome variable can be predicted from one or more other variables. In this paper, the modeling of monthly rainfall prediction over Myanmar is described in detail by applying the polynomial regression equation. The results of the proposed model are compared with those produced by a multiple linear regression (MLR) model. Experiments indicate that the prediction model based on MPR has higher accuracy than the one based on MLR.
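
The contrast between MLR and MPR described in this abstract can be illustrated with a minimal sketch. It is not the authors' model: the data are synthetic (not rainfall records), the polynomial order of 2 is an assumption, and the helper names `poly_features`, `solve`, `fit`, `predict` and `rmse` are hypothetical. Both models are fitted by ordinary least squares; the only difference is whether squared and cross terms are added to the inputs.

```python
from itertools import combinations_with_replacement


def poly_features(row, degree):
    """Expand one input row into all monomials up to `degree`, plus a bias term."""
    feats = [1.0]
    for d in range(1, degree + 1):
        for combo in combinations_with_replacement(range(len(row)), d):
            term = 1.0
            for i in combo:
                term *= row[i]
            feats.append(term)
    return feats


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit(X, y, degree):
    """Least-squares fit via the normal equations (adequate for tiny examples)."""
    F = [poly_features(row, degree) for row in X]
    k = len(F[0])
    ata = [[sum(f[i] * f[j] for f in F) for j in range(k)] for i in range(k)]
    aty = [sum(f[i] * t for f, t in zip(F, y)) for i in range(k)]
    return solve(ata, aty)


def predict(coef, X, degree):
    return [sum(c * f for c, f in zip(coef, poly_features(row, degree)))
            for row in X]


def rmse(y_true, y_pred):
    return (sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)) ** 0.5


# Synthetic two-predictor data with a known nonlinear relationship.
X = [[x1, x2] for x1 in range(-2, 3) for x2 in range(-2, 3)]
y = [1.0 + 2.0 * a - b + 0.5 * a * b + 0.3 * a * a for a, b in X]

mlr = fit(X, y, degree=1)  # multiple linear regression: bias, x1, x2
mpr = fit(X, y, degree=2)  # polynomial regression: adds x1^2, x1*x2, x2^2
rmse_mlr = rmse(y, predict(mlr, X, 1))
rmse_mpr = rmse(y, predict(mpr, X, 2))
```

On data of this kind the second-order fit recovers the generating relationship while the linear fit leaves a residual. Whether the same holds for real rainfall data depends on the predictors, and a fair comparison would use held-out months rather than the training fit shown here.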

A Genetic Algorithm Based Classification Approach for Finding Fault Prone Classes

Fault-proneness of a software module is the probability that the module contains faults. A correlation exists between the fault-proneness of the software and the measurable attributes of the code (i.e. the static metrics) and of the testing (i.e. the dynamic metrics). Early detection of fault-prone software components enables verification experts to concentrate their time and resources on the problem areas of the software system under development. This paper introduces genetic-algorithm-based software fault prediction models that use object-oriented metrics. The contribution of this paper is that it uses metric values of the JEdit open source software to generate rules for classifying software modules as faulty or non-faulty, after which empirical validation is performed. The results show that the genetic algorithm approach can be used for finding fault proneness in object-oriented software components.

Support Vector Machine Prediction Model of Early-stage Lung Cancer Based on Curvelet Transform to Extract Texture Features of CT Image

Purpose: To explore the use of the Curvelet transform to extract texture features of pulmonary nodules in CT images, and of a support vector machine to establish a prediction model for small solitary pulmonary nodules, in order to improve the rate of detection and diagnosis of early-stage lung cancer. Methods: 2461 benign or malignant small solitary pulmonary nodules in CT images from 129 patients were collected. Fourteen Curvelet transform textural features were used as parameters to establish the support vector machine prediction model. Results: Compared with other methods, using 252 texture features as parameters to establish the prediction model is more appropriate. The classification consistency, sensitivity and specificity of the model are 81.5%, 93.8% and 38.0%, respectively. Conclusion: Based on texture features extracted with the Curvelet transform, the support vector machine prediction model is sensitive to lung cancer and can improve the rate of diagnosis of early-stage lung cancer to some extent.
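
The reported pattern of high sensitivity with low specificity is easier to read with the definitions written out. The sketch below assumes that "classification consistency" means overall accuracy, which the abstract does not state. The confusion-matrix counts are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts.

    tp / fn: malignant nodules classified correctly / missed
    tn / fp: benign nodules classified correctly / flagged as malignant
    """
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    sensitivity = tp / (tp + fn)  # share of malignant nodules that are detected
    specificity = tn / (tn + fp)  # share of benign nodules that are cleared
    return accuracy, sensitivity, specificity


# Hypothetical counts chosen only to show the pattern: most malignant nodules
# are caught, but many benign nodules are flagged as malignant as well.
acc, sens, spec = classification_metrics(tp=90, fn=10, tn=20, fp=30)
```

When malignant cases dominate the data set, accuracy can stay high even if specificity is low, which is one reason to report the three figures together.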

Development of Predictive Model for Surface Roughness in End Milling of Al-SiCp Metal Matrix Composites using Fuzzy Logic

Metal matrix composites are increasingly used as materials for components in the automotive and aerospace industries because of their improved properties compared with non-reinforced alloys. During machining, the selection of appropriate machining parameters to produce a job with the desired surface roughness is of great concern for the economy of the manufacturing process. In this study, a surface roughness prediction model using fuzzy logic is developed for end milling of an Al-SiCp metal matrix composite component with a carbide end mill cutter. The surface roughness is modeled as a function of spindle speed (N), feed rate (f), depth of cut (d) and SiCp percentage (S). The predicted surface roughness values are compared with experimental results. The model gives an average percentage error of 4.56% and a mean square error of 0.0729. Surface roughness is observed to be most influenced by feed rate, spindle speed and SiC percentage; depth of cut has the least influence.

Development of Neural Network Prediction Model of Energy Consumption

In the oil and gas industry, energy prediction can help the distributor and the customer to forecast the outgoing and incoming gas through the pipeline. It also helps to eliminate uncertainties in gas metering for billing purposes. The objective of this paper is to develop a neural network model for energy consumption and to analyze the model's performance. The paper provides a comprehensive review of published research on energy consumption prediction, focusing on the structures and parameters used in developing neural network models. It then focuses on parameter selection in the development of the neural network prediction model for energy consumption and on analysis of the results. The most reliable model, the one that gives the most accurate result, is proposed for the prediction. The results show that the proposed neural network energy prediction model demonstrates adequate performance with the lowest root mean square error.

Performance Analysis of Evolutionary ANN for Output Prediction of a Grid-Connected Photovoltaic System

This paper presents a performance analysis of the Evolutionary Programming-Artificial Neural Network (EPANN) based technique used to optimize the architecture and training parameters of a single-hidden-layer feedforward ANN model for predicting the energy output of a grid-connected photovoltaic system. The ANN uses solar radiation and ambient temperature as its inputs, while the output is the total watt-hour energy produced by the grid-connected PV system. EP is used to optimize the regression performance of the ANN model by determining the optimum number of nodes in the hidden layer as well as the optimal momentum rate and learning rate for training. The EPANN model is tested using two types of transfer function for the hidden layer, namely the tangent sigmoid and the logarithmic sigmoid. The best transfer function, neural topology and learning parameters were selected based on the highest regression performance obtained during the ANN training and testing process. It is observed that the best transfer function configuration for the prediction model is [logarithmic sigmoid, purely linear].

Energy Map Construction using Adaptive Alpha Grey Prediction Model in WSNs

Wireless Sensor Networks (WSNs) can be used to monitor physical phenomena in areas where human access is nearly impossible. Because sensor nodes use non-rechargeable batteries, the limited power supply is the major constraint of WSNs. Much research is under way to reduce the energy consumption of sensor nodes. An energy map can be used with clustering, data dissemination and routing techniques to reduce the power consumption of WSNs. An energy map can also be used to identify which part of the network is going to fail in the near future. In this paper, the energy map is constructed using a prediction-based approach, with the adaptive alpha GM(1,1) model as the prediction model. GM(1,1) is used worldwide in many applications for predicting future values of a time series from some past values because of its high computational efficiency and accuracy.
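
For readers unfamiliar with grey prediction, the following is a minimal sketch of the standard GM(1,1) model with a tunable background-value weight `alpha` (the classical choice is 0.5). The abstract does not say how alpha is adapted, so the grid search in `gm11_best_alpha` is an assumption made purely for illustration, and all function names are hypothetical. The energy readings are synthetic.

```python
import math


def gm11_fit(x0, alpha=0.5):
    """Estimate the GM(1,1) parameters (a, b) for a positive series x0."""
    # Accumulated generating operation: x1(k) = x0(1) + ... + x0(k)
    x1, total = [], 0.0
    for v in x0:
        total += v
        x1.append(total)
    # Background values: z(k) = alpha * x1(k) + (1 - alpha) * x1(k - 1)
    z = [alpha * x1[k] + (1 - alpha) * x1[k - 1] for k in range(1, len(x0))]
    y = x0[1:]
    # Least squares for the grey equation x0(k) = -a * z(k) + b
    n = len(z)
    mz, my = sum(z) / n, sum(y) / n
    slope = (sum((zi - mz) * (yi - my) for zi, yi in zip(z, y))
             / sum((zi - mz) ** 2 for zi in z))
    return -slope, my - slope * mz


def gm11_predict(x0, a, b, steps):
    """Return fitted values for x0 followed by `steps` forecasts (needs a != 0)."""
    c = x0[0] - b / a

    def x1_hat(k):
        # Time response of the whitening equation dx1/dt + a * x1 = b
        return c * math.exp(-a * k) + b / a

    out = [x0[0]]
    for k in range(1, len(x0) + steps):
        out.append(x1_hat(k) - x1_hat(k - 1))  # inverse accumulation
    return out


def gm11_best_alpha(x0, candidates):
    """Pick the alpha with the smallest mean absolute fitting error (assumed rule)."""
    best = None
    for alpha in candidates:
        a, b = gm11_fit(x0, alpha)
        fitted = gm11_predict(x0, a, b, steps=0)
        err = sum(abs(f - v) for f, v in zip(fitted, x0)) / len(x0)
        if best is None or err < best[0]:
            best = (err, alpha, a, b)
    return best


# Synthetic residual-energy readings of one node, decaying by 5% per period.
readings = [100.0 * 0.95 ** k for k in range(5)]
a, b = gm11_fit(readings, alpha=0.5)
forecast = gm11_predict(readings, a, b, steps=1)[-1]
err, alpha, a_best, b_best = gm11_best_alpha(readings, [0.3, 0.4, 0.5, 0.6, 0.7])
```

Only a handful of past readings and a two-parameter fit are needed per node, which is consistent with the computational efficiency the abstract attributes to GM(1,1).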

Development of Accident Predictive Model for Rural Roadway

This paper presents a study of accident analysis and black spots, and the development of accident predictive models based on data collected on a rural roadway, Federal Route 50 (F050), Malaysia. The road accident trends and black spot ranking were established for the F050. The development of the accident prediction model concentrates on the Parit Raja area from KM 19 to KM 23. A multiple non-linear regression method was used to relate the discrete accident data to the road and traffic flow explanatory variables. The dependent variable was modeled as the number of crashes, namely accident point weighting; however, accident point weighting has rarely been accounted for in road accident prediction models. The results show that the existing number of major access points without traffic lights, rises in speed, increasing Annual Average Daily Traffic (AADT), growing numbers of motorcycles and motorcars, and reduced time gaps are potential contributors to increased accident rates on multiple rural roadways.

Examination of Flood Runoff Reproductivity for Different Rainfall Sources in Central Vietnam

This paper combines different precipitation data sets with a distributed hydrological model in order to examine the flood runoff reproductivity of scattered observation catchments. The precipitation data sets were obtained from rain gauge observations, a satellite-based estimate (TRMM), and a numerical weather prediction (NWP) model, and were then coupled with the super tank model. The case study was conducted in three basins (small, medium, and large) located in Central Vietnam. Calculated hydrographs based on ground-observed rainfall showed the best fit to measured stream flow, while those obtained from TRMM and NWP showed high uncertainty in peak discharges. However, calculated hydrographs using the adjusted rainfield offered a promising alternative for applying TRMM and NWP in flood modeling for scattered observation catchments, especially for extending forecast lead time.

A Study on Early Prediction of Fault Proneness in Software Modules using Genetic Algorithm

Fault-proneness of a software module is the probability that the module contains faults. Different techniques have been proposed to predict the fault-proneness of modules, including statistical methods, machine learning techniques, neural network techniques and clustering techniques. The aim of the proposed study is to explore whether metrics available in the early lifecycle (i.e. requirement metrics), metrics available in the late lifecycle (i.e. code metrics), and the combination of the two can be used to identify fault-prone modules using the genetic algorithm technique. This approach has been tested with real-time defect datasets from NASA software projects written in the C programming language. The results show that the fusion of requirement and code metrics gives the best prediction model for detecting faults, compared with the commonly used code-based model.

Improving Air Temperature Prediction with Artificial Neural Networks

The mitigation of crop loss due to damaging freezes requires accurate air temperature prediction models. Previous work established that the Ward-style artificial neural network (ANN) is a suitable tool for developing such models. The current research focused on developing ANN models with reduced average prediction error by increasing the number of distinct observations used in training, adding additional input terms that describe the date of an observation, increasing the duration of prior weather data included in each observation, and reexamining the number of hidden nodes used in the network. Models were created to predict air temperature at hourly intervals from one to 12 hours ahead. Each ANN model, consisting of a network architecture and set of associated parameters, was evaluated by instantiating and training 30 networks and calculating the mean absolute error (MAE) of the resulting networks for some set of input patterns. The inclusion of seasonal input terms, up to 24 hours of prior weather information, and a larger number of processing nodes were some of the improvements that reduced average prediction error compared to previous research across all horizons. For example, the four-hour MAE of 1.40°C was 0.20°C, or 12.5%, less than the previous model. Prediction MAEs eight and 12 hours ahead improved by 0.17°C and 0.16°C, respectively, improvements of 7.4% and 5.9% over the existing model at these horizons. Networks instantiating the same model but with different initial random weights often led to different prediction errors. These results strongly suggest that ANN model developers should consider instantiating and training multiple networks with different initial weights to establish preferred model parameters.
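
The improvement figures quoted in this abstract can be checked with a few lines of arithmetic. The sketch below is only a consistency check; the previous-model MAE for the four-hour horizon is inferred as 1.40 + 0.20 = 1.60 °C, and the function names are hypothetical.

```python
def relative_improvement(previous_mae, new_mae):
    """Percentage reduction in MAE relative to the previous model."""
    return (previous_mae - new_mae) / previous_mae * 100.0


def implied_previous_mae(reduction, percent):
    """Previous-model MAE implied by an absolute and a percentage reduction."""
    return reduction / (percent / 100.0)


# Four-hour horizon: new MAE 1.40 degC, which is 0.20 degC below the previous model.
four_hour = relative_improvement(1.40 + 0.20, 1.40)

# Eight- and twelve-hour horizons: only the reductions are quoted, so the
# previous-model MAEs are backed out from them (approximate, due to rounding).
prev_8h = implied_previous_mae(0.17, 7.4)
prev_12h = implied_previous_mae(0.16, 5.9)
```

The four-hour figures reproduce the stated 12.5%, and the implied previous MAEs grow with the horizon (roughly 2.3 °C at eight hours and 2.7 °C at twelve), as one would expect for longer lead times.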

Model Predictive Fuzzy Control of Air-ratio for Automotive Engines

Among all engine control variables, the automotive engine air-ratio plays an important role in reducing emissions and fuel consumption while maintaining satisfactory engine power. In order to control the air-ratio effectively, this paper presents a model predictive fuzzy control algorithm based on an online least-squares support vector machine prediction model and a fuzzy logic optimizer. The proposed control algorithm was also implemented on a real car for testing, and the results are highly satisfactory. Experimental results show that the proposed control algorithm can regulate the engine air-ratio to the stoichiometric value, 1.0, under external disturbance with less than 5% tolerance.

Improving University Operations with Data Mining: Predicting Student Performance

The purpose of this paper is to develop models that enable the prediction of student success. These models could improve the allocation of students among colleges and optimize the newly introduced model of government subsidies for higher education. To collect data, an anonymous survey was carried out among final-year undergraduate students using a random sampling method. Several decision trees were created, and the two that were most successful in predicting student success were chosen, based on two criteria: Grade Point Average (GPA) and the time a student needs to finish the undergraduate program (time-to-degree). Decision trees proved to be a good method for classifying student success, and they could be improved further by increasing the survey sample and developing specialized decision trees for each type of college. Methods of this type have great potential for use in decision support systems.

Alternative Methods to Rank the Impact of Object Oriented Metrics in Fault Prediction Modeling using Neural Networks

The aim of this paper is to rank the impact of Object Oriented (OO) metrics in fault prediction modeling using Artificial Neural Networks (ANNs). Past studies on the empirical validation of object-oriented metrics as fault predictors using ANNs have focused on the predictive quality of neural networks versus standard statistical techniques. In this empirical study we turn our attention to the capability of ANNs to rank the impact of these explanatory metrics on fault proneness. In the ANN data analysis approach, there is no clear method of ranking the impact of individual metrics. Five ANN-based techniques that rank object-oriented metrics in predicting the fault proneness of classes are studied: (i) the overall connection weights method, (ii) Garson's method, (iii) the partial derivatives method, (iv) the input perturb method and (v) the classical stepwise method. We develop and evaluate different prediction models based on the ranking of the metrics by the individual techniques. The models based on the overall connection weights and partial derivatives methods were found to be the most accurate.
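
Two of the five ranking techniques named in this abstract work directly on the trained weights, so they can be sketched in a few lines. The sketch assumes a network with one hidden layer and a single output, uses one commonly cited form of Garson's algorithm, and runs on hypothetical weights; it is not taken from the paper, and the function names are illustrative.

```python
def garson_importance(w_ih, w_ho):
    """Relative importance of each input (one common form of Garson's algorithm).

    w_ih[i][j]: weight from input i to hidden node j
    w_ho[j]:    weight from hidden node j to the single output
    Only absolute values are used, so the direction of an effect is lost.
    """
    n_in, n_hid = len(w_ih), len(w_ho)
    scores = [0.0] * n_in
    for j in range(n_hid):
        column_total = sum(abs(w_ih[k][j]) for k in range(n_in))
        for i in range(n_in):
            # Share of input i in hidden node j, weighted by that node's output weight
            scores[i] += abs(w_ih[i][j]) / column_total * abs(w_ho[j])
    total = sum(scores)
    return [s / total for s in scores]


def connection_weight_importance(w_ih, w_ho):
    """Overall connection weights: signed sum over hidden nodes of w_ih * w_ho."""
    return [sum(w_ih[i][j] * w_ho[j] for j in range(len(w_ho)))
            for i in range(len(w_ih))]


# Hypothetical weights for two input metrics and two hidden nodes.
w_ih = [[1.0, -2.0],
        [3.0, 0.5]]
w_ho = [0.5, -1.0]

garson = garson_importance(w_ih, w_ho)
overall = connection_weight_importance(w_ih, w_ho)
ranking = sorted(range(len(overall)), key=lambda i: abs(overall[i]), reverse=True)
```

The main design difference is that Garson's method discards signs, while the overall connection weights method keeps them, so opposing paths through the hidden layer can cancel in the latter.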

Recurrent Radial Basis Function Network for Failure Time Series Prediction

An adaptive software reliability prediction model using an evolutionary connectionist approach based on the Recurrent Radial Basis Function architecture is proposed. Based on the currently available software failure time data, the Fuzzy Min-Max algorithm is used to globally optimize the number k of Gaussian nodes. The corresponding optimized neural network architecture is iteratively and dynamically reconfigured in real time as new actual failure time data arrive. The performance of the proposed approach has been tested using sixteen real-time software failure data sets. Numerical results show that the proposed approach is robust across different software projects and performs better with respect to next-step predictability than existing neural network models for failure time prediction.

Protein Residue Contact Prediction using Support Vector Machine

A protein residue contact map is a compact representation of the secondary structure of a protein. Because of the information held in the contact map, it has drawn the attention of researchers in related fields, and much work has been done over the past decade. Artificial intelligence approaches such as neural networks, genetic programming, Hidden Markov models and support vector machines have been widely adopted in related work. However, prediction performance has not generalized, which probably depends on the data used to train and generate the prediction model. This situation shows the importance of the features or information used in affecting prediction performance. In this research, a support vector machine was used to predict protein residue contact maps on different combinations of features in order to show and analyze the effectiveness of the features.

Novel Hybrid Method for Gene Selection and Cancer Prediction

Microarray data profile gene expression on a whole-genome scale and therefore provide a good way to study associations between gene expression and the occurrence or progression of cancer. More and more researchers have realized that microarray data are helpful for predicting cancer samples. However, the dimension of the gene expression data is much larger than the sample size, which makes this task very difficult. Therefore, how to identify the significant genes causing cancer has become an urgent, popular and difficult research topic. Many feature selection algorithms proposed in the past focus on improving cancer predictive accuracy at the expense of ignoring the correlations between features. In this work, a novel framework (named SGS) is presented for stable gene selection and efficient cancer prediction. The proposed framework first performs a clustering algorithm to find gene groups in which the genes have higher correlation coefficients, then selects the significant genes in each group with Bayesian Lasso and the important gene groups with group Lasso, and finally builds a prediction model on the shrunken gene space with an efficient classification algorithm (such as SVM, 1NN or regression). Experimental results on real-world data show that the proposed framework often outperforms existing feature selection and prediction methods such as SAM, IG and Lasso-type prediction models.

A Hybrid Model of ARIMA and Multiple Polynomial Regression for Uncertainties Modeling of a Serial Production Line

Uncertainties in a serial production line affect the production throughput. These uncertainties cannot be prevented in a real production line; however, the uncertain conditions can be controlled by a robust prediction model. Thus, a hybrid model combining autoregressive integrated moving average (ARIMA) and multiple polynomial regression is proposed to model the nonlinear relationship between production uncertainties and throughput. The uncertainties considered in this study are demand, breaktime, scrap, and lead-time. The nonlinear relationship between production uncertainties and throughput is examined in the form of quadratic and cubic regression models, for which the adjusted R-squared values were 98.3% and 98.2%, respectively. We optimized the multiple quadratic regression (MQR) by considering the time series trend of the uncertainties using an ARIMA model. Finally, the hybrid model of ARIMA and MQR achieves a better adjusted R-squared of 98.9%.
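
The abstract compares models by adjusted R-squared rather than raw R-squared, which matters because a cubic model always has at least as many terms as a quadratic one. The sketch below shows the standard adjustment; the sample size, predictor counts and raw R-squared values are hypothetical and are not taken from the study.

```python
def adjusted_r2(r2, n_samples, n_predictors):
    """Adjusted R-squared: penalizes R-squared for the number of predictors."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)


# Hypothetical case with four inputs and 100 observations. A full quadratic
# model has 14 non-constant terms and a full cubic model has 34.
quadratic = adjusted_r2(0.985, n_samples=100, n_predictors=14)
cubic = adjusted_r2(0.988, n_samples=100, n_predictors=34)
```

With these illustrative numbers the cubic model has the higher raw R-squared but the lower adjusted value, the same ordering as the 98.3% versus 98.2% reported above.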

Development of a Real-Time Energy Models for Photovoltaic Water Pumping System

The purpose of this paper is to develop and validate a model that accurately predicts the cell temperature of a PV module and adapts to various mounting configurations, mounting locations, and climates while requiring only readily available data from the module manufacturer. Results from this model are also compared with results from published cell temperature models. The models were used to predict the real-time performance of a PV water pumping system in the desert of Medenine, in the south of Tunisia, using 60-min intervals of measured performance data over one complete year. Statistical analysis of the predicted results and measured data highlights possible sources of error and the limitations and/or adequacy of existing models in describing the temperature and efficiency of PV cells and, consequently, the accuracy of performance prediction models for PV water pumping systems.

Software Reliability Prediction Model Analysis

Software reliability prediction offers an opportunity to measure the software failure rate at any point during system test. A software reliability prediction model provides a technique for improving reliability. Software reliability is a very important factor in estimating overall system reliability, which depends on the individual component reliabilities. It differs from hardware reliability in that it reflects design perfection. The main cause of software reliability problems is the high complexity of software. Various approaches can be used to improve the reliability of software. In this article we focus on a software reliability model, assuming that there is time redundancy whose value (the number of repeated transmissions of basic blocks) can be an optimization parameter. We consider the given mathematical model under the assumption that the system may experience not only irreversible failures but also failures that can be treated as self-repairing, which significantly affect the reliability and accuracy of information transfer. The main task of this paper is to find the time distribution function (DF) of the transmission of an instruction sequence that consists of a random number of basic blocks. We consider the system software unreliable; the time between adjacent failures has an exponential distribution.