Abstract: The development of allometric models is crucial to
accurate forest biomass/carbon stock assessment. The aim of this
study was to develop a set of biomass prediction models that will
enable the determination of total tree aboveground biomass for
savannah woodland area in Niger State, Nigeria. Based on the data
collected through biometric measurements of 1816 trees and
destructive sampling of 36 trees, five species-specific models and one
site-specific model were developed. The sample was distributed
equally among the five most dominant species in the study site
(Vitellaria paradoxa, Irvingia gabonensis, Parkia biglobosa,
Anogeissus leiocarpus, Pterocarpus erinaceus). First, equations
were developed for each of the five species. Second, data from the
five species were pooled to develop a mixed-species allometric
equation. Overall, there was a strong positive
relationship between total tree biomass and stem diameter.
Coefficients of determination (R2) ranging from 0.93 to 0.99 (P
< 0.001) were obtained for the models, with considerably low standard
errors of the estimate (SEE), which confirms that total tree aboveground
biomass has a significant relationship with diameter at breast height (dbh). F-test
values for the biomass prediction models were also significant at p
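The abstract does not state the functional form of the allometric equations. As a minimal sketch, assuming the common power-law form biomass = a * dbh^b fitted by least squares on log-transformed data, the fitting step could look like the following. The function name and the synthetic data are hypothetical and are not taken from the study.

```python
import math

def fit_power_law(dbh_cm, biomass_kg):
    """Fit biomass = a * dbh**b by ordinary least squares on log-log data.

    Assumed model form; returns (a, b, r2), where r2 is on the log scale.
    """
    xs = [math.log(d) for d in dbh_cm]
    ys = [math.log(m) for m in biomass_kg]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                      # slope = allometric exponent
    ln_a = mean_y - b * mean_x         # intercept = ln(a)
    ss_res = sum((y - (ln_a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r2 = 1.0 - ss_res / ss_tot
    return math.exp(ln_a), b, r2

# Synthetic, noise-free data generated from a known power law (hypothetical).
dbh = [10.0, 15.0, 20.0, 30.0, 40.0]
biomass = [0.1 * d ** 2.5 for d in dbh]
a, b, r2 = fit_power_law(dbh, biomass)
```

On real field data a back-transformation bias correction is often applied after a log-log fit; that step is omitted here.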
Abstract: Previous studies on financial distress prediction adopt
the conventional failing versus non-failing dichotomy; however, the
extent of distress differs substantially among financial
distress events. To address this, three categories, “non-distressed”,
“slightly-distressed” and “reorganization and bankruptcy”, are used in this article
to approximate the continuum of corporate financial health. This paper
explains different financial distress events using the two-stage method.
First, this investigation adopts firm-specific financial ratios, corporate
governance and market factors to measure the probability of various
financial distress events based on multinomial logit models.
Specifically, a bootstrapping simulation is performed to examine
differences in estimated misclassification cost (EMC). Second, this work
further applies macroeconomic factors to establish the credit cycle
index and determines the distressed cut-off indicator of the two-stage
models using this index. Two models, a one-stage and a
two-stage prediction model, are developed to forecast financial
distress, and their results are compared
with each other and with the collected data. The findings show that the
one-stage model has a lower misclassification error rate than the
two-stage model and is therefore the more accurate of the two.
Abstract: The Cone Penetration Test (CPT) is a common in-situ
test which generally investigates a much greater volume of soil more
quickly than possible from sampling and laboratory tests. Therefore,
it has the potential to realize both cost savings and assessment of soil
properties rapidly and continuously. The principal objective of this
paper is to demonstrate the feasibility and efficiency of using
artificial neural networks (ANNs) to predict the soil angle of internal
friction (Φ) and the soil modulus of elasticity (E) from CPT results
considering the uncertainties and non-linearities of the soil. In
addition, ANNs are used to study the influence of different
parameters and recommend which parameters should be included as
input parameters to improve the prediction. Neural networks discover
relationships in the input data sets through the iterative presentation
of the data and intrinsic mapping characteristics of neural topologies.
The General Regression Neural Network (GRNN), a powerful
neural network architecture, is used in this study. A large
amount of field and experimental data including CPT results, plate
load tests, direct shear box, grain size distribution and calculated data
of overburden pressure was obtained from a large project in the
United Arab Emirates. This data was used for the training and the
validation of the neural network. The ANN results were compared
with some common
traditional correlations that predict Φ and E from CPT results, using
the measured values in the collected data as the reference. The results show
that the ANN is a very powerful tool. Very good agreement was
obtained between the ANN estimates and the measured
results, compared with other correlations available in the
literature. The study recommends some easily available parameters
that should be included in the estimation of the soil properties to
improve the prediction models. It is shown that using the friction
ratio in the estimation of Φ and the fines content in the
estimation of E considerably improves the prediction models.
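For readers unfamiliar with the GRNN, its prediction step can be sketched as a Gaussian-kernel weighted average of the training targets. This is a generic illustration, not the network configuration used in the study; the feature vectors, targets and smoothing parameter `sigma` below are hypothetical.

```python
import math

def grnn_predict(x, train_x, train_y, sigma):
    """Predict a target for feature vector x with a basic GRNN.

    Each training pattern contributes its target value weighted by a
    Gaussian kernel of its squared Euclidean distance to x.
    """
    weights = []
    for xi in train_x:
        d2 = sum((a - b) ** 2 for a, b in zip(x, xi))
        weights.append(math.exp(-d2 / (2.0 * sigma ** 2)))
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all kernel weights underflowed; increase sigma")
    return sum(w * y for w, y in zip(weights, train_y)) / total

# Hypothetical one-feature training set (not data from the paper).
train_x = [[0.0], [1.0]]
train_y = [10.0, 20.0]
midpoint = grnn_predict([0.5], train_x, train_y, sigma=0.5)
near_first = grnn_predict([0.0], train_x, train_y, sigma=0.1)
```

The only free parameter is `sigma`: small values make the prediction follow the nearest training pattern, large values pull it toward the mean of all targets.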
Abstract: Urban areas have expanded throughout the
globe. Monitoring and modelling urban growth have become a
necessity for sustainable urban planning and decision making.
Urban prediction models are important tools for analyzing the causes
and consequences of urban land use dynamics. The objective of this
research paper is to analyze and model the urban change that
occurred from 1990 to 2000 using CORINE land cover maps.
The model was developed using drivers of urban changes (such as
road distance, slope, etc.) under an Artificial Neural Network
modelling approach. Validation was achieved by comparing a prediction map
for 2006 with the actual Urban Atlas map of
2006. The comparison yielded a Kappa index of agreement of 0.639
and a Cramer's V of 0.648. These encouraging results
indicate the value of the developed urban growth prediction
model, which, using a set of commonly available biophysical drivers,
could serve as a management tool for the assessment of urban
change.
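The Kappa index of agreement compares observed map agreement with the agreement expected by chance. A minimal sketch, assuming the standard Cohen's kappa computed from a confusion matrix of predicted versus reference classes (the paper may use a variant), follows. The matrices are hypothetical and are not the study's results.

```python
def cohen_kappa(confusion):
    """Cohen's kappa from a square confusion matrix.

    confusion[i][j] = number of cells predicted as class i whose
    reference class is j.
    """
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    # Chance agreement from the marginal class proportions.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    return (observed - expected) / (1.0 - expected)

# Hypothetical urban / non-urban cell counts.
kappa_partial = cohen_kappa([[40, 10], [10, 40]])
kappa_perfect = cohen_kappa([[50, 0], [0, 50]])
```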
Abstract: Near infrared (NIR) spectroscopy has always been of
great interest in the food and agriculture industries. The development
of prediction models has facilitated the estimation process in recent
years. In this study, 110 crude palm oil (CPO) samples were used to
build a free fatty acid (FFA) prediction model. 60% of the collected
data were used for training purposes and the remaining 40% used for
testing. The visible peaks on the NIR spectrum were at 1725 nm and
1760 nm, indicating the existence of the first overtone of C-H bands.
Principal component regression (PCR) was applied to the data in
order to build this mathematical prediction model. The optimal
number of principal components was 10. The results showed
R2=0.7147 for the training set and R2=0.6404 for the testing set.
Abstract: Characterization of the engineering behavior of
unsaturated soil is dependent on the soil-water characteristic curve
(SWCC), a graphical representation of the relationship between water
content or degree of saturation and soil suction. A reasonable
description of the SWCC is thus important for the accurate prediction
of unsaturated soil parameters. The measurement procedures for
determining the SWCC, however, are difficult, expensive, and time-consuming.
During the past few decades, researchers have
focused on developing empirical equations for predicting the
SWCC, with a large number of empirical models suggested. One of
the most crucial questions is how precisely existing equations can
represent the SWCC. As different models have different ranges of
capability, it is essential to evaluate the precision of the SWCC
models used for each particular soil type for better SWCC estimation.
It is expected that better estimation of SWCC would be achieved via
a thorough statistical analysis of its distribution within a particular
soil class. With this in view, a statistical analysis was conducted in
order to evaluate the reliability of the SWCC prediction models
against laboratory measurement. Optimization techniques were used
to obtain the best-fit of the model parameters in four forms of SWCC
equation, using laboratory data for relatively coarse-textured (i.e.,
sandy) soil. The four most prominent SWCC models were evaluated and
computed for each sample. The results show that the Brooks and
Corey model is the most consistent in describing the SWCC for sandy
soil. The Brooks and Corey model predictions also agree
with the samples evaluated in this study across the range from low
to high soil water content.
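As an illustration of the kind of equation being fitted, the Brooks and Corey model can be sketched as a piecewise power law in suction. The parameter names and values below are hypothetical; the abstract does not report the fitted parameters.

```python
def brooks_corey_theta(suction, theta_r, theta_s, air_entry, lam):
    """Volumetric water content from the Brooks and Corey SWCC.

    suction   : soil suction (same units as air_entry)
    theta_r   : residual water content
    theta_s   : saturated water content
    air_entry : air-entry suction
    lam       : pore-size distribution index
    """
    if suction <= air_entry:
        return theta_s  # soil stays saturated below the air-entry value
    effective_saturation = (air_entry / suction) ** lam
    return theta_r + (theta_s - theta_r) * effective_saturation

# Hypothetical parameters for a sandy soil (illustrative only).
theta_wet = brooks_corey_theta(1.0, 0.05, 0.45, air_entry=2.0, lam=1.0)
theta_dry = brooks_corey_theta(4.0, 0.05, 0.45, air_entry=2.0, lam=1.0)
```

Fitting the model to laboratory data then amounts to choosing `theta_r`, `air_entry` and `lam` to minimize the misfit between this function and the measured points.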
Abstract: Pasta is one of the most widely consumed food products around the world. Rapid determination of the moisture content in pasta will help food processors provide online quality control of pasta during large-scale production. A rapid Fourier transform near-infrared (FT-NIR) method was developed for determining moisture content in pasta. A calibration set of 150 samples, a validation set of 30 samples and a prediction set of 25 samples of pasta were used. The diffuse reflection spectra of different types of pasta were measured with an FT-NIR analyzer in the 4,000-12,000 cm-1 spectral range. Calibration and validation sets were designed for the conception and evaluation of the adequacy of the method in the moisture content range of 10 to 15 percent (w.b.). The prediction models, based on partial least squares (PLS) regression, were developed in the near-infrared. Conventional criteria such as the R2, the root mean square error of cross validation (RMSECV), the root mean square error of estimation (RMSEE) and the number of PLS factors were considered in selecting among three pre-processing methods (vector normalization, minimum-maximum normalization and multiplicative scatter correction). Spectra of pasta samples were treated with different mathematical pre-treatments before being used to build models relating the spectral information to moisture content. The moisture content in pasta predicted by FT-NIR methods correlated very well with the values determined via traditional methods (R2 = 0.983), which clearly indicated that FT-NIR methods could be used as an effective tool for rapid determination of moisture content in pasta. The best calibration model was developed with min-max normalization (MMN) spectral pre-processing (R2 = 0.9775). The MMN pre-processing method was found most suitable, and the maximum coefficient of determination (R2) value of 0.9875 was obtained for the calibration model developed.
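Min-max normalization rescales each spectrum so that its smallest and largest intensities map to fixed values. The sketch below assumes a target range of 0 to 1; instrument software may use a different range, and the abstract does not say which was applied. The sample spectrum is hypothetical.

```python
def min_max_normalize(spectrum):
    """Rescale one spectrum linearly so min -> 0.0 and max -> 1.0."""
    lo = min(spectrum)
    hi = max(spectrum)
    if hi == lo:
        raise ValueError("flat spectrum cannot be min-max normalized")
    return [(value - lo) / (hi - lo) for value in spectrum]

# Hypothetical absorbance readings at three wavenumbers.
normalized = min_max_normalize([2.0, 4.0, 6.0])
```

Because each spectrum is rescaled independently, the step removes sample-to-sample offset and scale differences before the PLS regression is fitted.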
Abstract: This study focuses on the development of prediction models for Ozone concentration time series. The prediction model is built using a chaotic approach. First, the chaotic nature of the time series is detected by means of a phase space plot and the Cao method. Then, the prediction model is built, and the local linear approximation method is used for forecasting. A traditional autoregressive linear prediction model is also built. Moreover, an improved version of the local linear approximation method is developed. The prediction models are applied to the hourly Ozone time series observed at the benchmark station in Malaysia. Comparison of all models in terms of mean absolute error, root mean squared error and correlation coefficient shows that the model using the improved prediction method performs best. Thus, the chaotic approach is well suited to developing a prediction model for Ozone concentration time series.
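A phase space plot is usually built from a scalar series by time-delay embedding. The sketch below assumes that standard construction; the embedding dimension `dim` and delay `tau` are hypothetical parameters (the abstract reports neither), and the sample series is illustrative.

```python
def delay_embed(series, dim, tau):
    """Return delay vectors [x(t), x(t+tau), ..., x(t+(dim-1)*tau)]."""
    n_vectors = len(series) - (dim - 1) * tau
    if n_vectors <= 0:
        raise ValueError("series too short for this dim and tau")
    return [[series[i + k * tau] for k in range(dim)]
            for i in range(n_vectors)]

# Hypothetical short series embedded in two dimensions with delay 2.
vectors = delay_embed([1, 2, 3, 4, 5], dim=2, tau=2)
```

Local linear approximation then forecasts the next value by fitting a linear map to the nearest neighbours of the most recent delay vector.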
Abstract: Fault-proneness of a software module is the
probability that the module contains faults. A correlation exists
between the fault-proneness of the software and the measurable
attributes of the code (i.e. the static metrics) and of the testing (i.e.
the dynamic metrics). Early detection of fault-prone software
components enables verification experts to concentrate their time and
resources on the problem areas of the software system under
development. This paper introduces Genetic Algorithm based
software fault prediction models using Object-Oriented metrics. The
contribution of this paper is that it uses metric values from the JEdit
open source software to generate rules for classifying
software modules as faulty or non-faulty,
followed by empirical validation. The
results show that the Genetic Algorithm approach can be used for
finding fault-proneness in object-oriented software components.
Abstract: This paper presents an accident analysis and black spot
study, and develops accident prediction models
based on data collected on a rural roadway, Federal Route 50 (F050),
Malaysia. Road accident trends and a black spot ranking were
established for F050. The accident prediction
model concentrates on the Parit Raja area, from KM 19 to KM 23.
Multiple non-linear regression was used to relate the discrete
accident data to the road and traffic flow explanatory variables. The
dependent variable was modeled as the number of crashes, expressed as
accident point weighting, although accident point weighting has
rarely been accounted for in road accident prediction models. The results
show that the number of existing major access points without traffic
lights, higher speeds, higher Annual Average Daily
Traffic (AADT), growing numbers of motorcycles and motorcars, and
shorter time gaps are potential contributors to increased
accident rates on multiple rural roadways.
Abstract: The mitigation of crop loss due to damaging freezes
requires accurate air temperature prediction models. Previous work
established that the Ward-style artificial neural network (ANN) is a
suitable tool for developing such models. The current research
focused on developing ANN models with reduced average prediction
error by increasing the number of distinct observations used in
training, adding additional input terms that describe the date of an
observation, increasing the duration of prior weather data included in
each observation, and reexamining the number of hidden nodes used
in the network. Models were created to predict air temperature at
hourly intervals from one to 12 hours ahead. Each ANN model,
consisting of a network architecture and set of associated parameters,
was evaluated by instantiating and training 30 networks and
calculating the mean absolute error (MAE) of the resulting networks
for some set of input patterns. The inclusion of seasonal input terms,
up to 24 hours of prior weather information, and a larger number of
processing nodes were some of the improvements that reduced
average prediction error compared to previous research across all
horizons. For example, the four-hour MAE of 1.40°C was 0.20°C, or
12.5%, less than the previous model. Prediction MAEs eight and 12
hours ahead improved by 0.17°C and 0.16°C, respectively,
improvements of 7.4% and 5.9% over the existing model at these
horizons. Networks instantiating the same model but with different
initial random weights often led to different prediction errors. These
results strongly suggest that ANN model developers should consider
instantiating and training multiple networks with different initial
weights to establish preferred model parameters.
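The percentages quoted above follow from dividing each reduction by the previous model's MAE, which is the new MAE plus the reduction. A minimal sketch of that arithmetic, with hypothetical function names, is:

```python
def relative_improvement(new_mae, reduction):
    """Percent improvement relative to the previous model's MAE."""
    previous_mae = new_mae + reduction
    return 100.0 * reduction / previous_mae

def implied_previous_mae(reduction, percent):
    """Previous MAE implied by an absolute and a relative reduction."""
    return reduction / (percent / 100.0)

# Four-hour horizon: new MAE 1.40 degC, reduction 0.20 degC.
four_hour_percent = relative_improvement(1.40, 0.20)
# Eight-hour horizon: only the reduction and percentage are reported,
# so the previous MAE can be backed out approximately.
eight_hour_previous = implied_previous_mae(0.17, 7.4)
```

For the four-hour horizon this gives 0.20 / 1.60, which reproduces the reported 12.5%.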
Abstract: The purpose of this paper is to develop models that predict student success. These models could improve the allocation of students among colleges and optimize the newly introduced model of government subsidies for higher education. To collect data, an anonymous survey was carried out among final-year undergraduate students using random sampling. Several decision trees were created, and the two most successful in predicting student success were chosen based on two criteria: Grade Point Average (GPA) and the time a student needs to finish the undergraduate program (time-to-degree). Decision trees proved to be a good method for classifying student success, and they could be improved further by increasing the survey sample and developing specialized decision trees for each type of college. Methods of this type have great potential for use in decision support systems.
Abstract: The aim of this paper is to rank the impact of Object
Oriented (OO) metrics in fault prediction modeling using Artificial
Neural Networks (ANNs). Past studies on empirical validation of
object oriented metrics as fault predictors using ANNs have focused
on the predictive quality of neural networks versus standard
statistical techniques. In this empirical study we turn our attention to
the capability of ANNs in ranking the impact of these explanatory
metrics on fault proneness. In the ANN data analysis approach, there is
no clear method of ranking the impact of individual metrics. Five
ANN-based techniques that rank object oriented
metrics in predicting fault proneness of classes are studied. These
techniques are i) the overall connection weights method, ii) Garson's
method, iii) the partial derivatives method, iv) the Input Perturb
method and v) the classical stepwise method. We develop and evaluate different
prediction models based on the ranking of the metrics by the
individual techniques. The models based on overall connection
weights and partial derivatives methods have been found to be most
accurate.
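The abstract does not define the overall connection weights method. One common formulation for a network with a single hidden layer and a single output sums, for each input, the products of its input-to-hidden weights and the corresponding hidden-to-output weights. The sketch below assumes that formulation; the weight values are hypothetical and do not come from the study.

```python
def connection_weight_importance(w_input_hidden, w_hidden_output):
    """Signed importance of each input.

    w_input_hidden[i][j] : weight from input i to hidden node j
    w_hidden_output[j]   : weight from hidden node j to the output
    """
    return [sum(w * v for w, v in zip(row, w_hidden_output))
            for row in w_input_hidden]

def rank_inputs(importance):
    """Input indices ordered from most to least influential."""
    return sorted(range(len(importance)),
                  key=lambda i: abs(importance[i]), reverse=True)

# Hypothetical trained weights: two input metrics, two hidden nodes.
w_ih = [[1.0, -2.0],
        [0.5, 0.5]]
w_ho = [2.0, 1.0]
importance = connection_weight_importance(w_ih, w_ho)
ranking = rank_inputs(importance)
```

In this toy case the first input's two paths cancel, so the second input ranks higher even though its individual weights are smaller.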
Abstract: The purpose of this paper is to develop and validate a
model to accurately predict the cell temperature of a PV module that
adapts to various mounting configurations, mounting locations, and
climates while only requiring readily available data from the module
manufacturer. Results from this model are also compared to results
from published cell temperature models. The models were used to
predict the real-time performance of a PV water pumping system in
the desert of Medenine, in southern Tunisia, using 60-min intervals of
measured performance data over one complete year. Statistical
analysis of the predicted results and measured data highlights possible
sources of error and the limitations and/or adequacy of existing
models in describing the temperature and efficiency of PV cells and,
consequently, the accuracy of performance prediction models for PV
water pumping systems.
Abstract: This paper presents a methodology based on machine
learning approaches for a short-term rain forecasting system. Decision
Tree, Artificial Neural Network (ANN), and Support Vector Machine
(SVM) were applied to develop classification and prediction models
for rainfall forecasts. The goals of this presentation are to
demonstrate (1) how feature selection can be used to identify the
relationships between rainfall occurrences and other weather
conditions and (2) what models can be developed and deployed for
producing accurate rainfall estimates to support decisions to
launch cloud seeding operations in the northeastern part of
Thailand. Datasets were collected during 2004-2006 from the
Chalermprakiat Royal Rain Making Research Center at Hua Hin,
Prachuap Khiri Khan, the Chalermprakiat Royal Rain Making
Research Center at Pimai, Nakhon Ratchasima, and the Thai
Meteorological Department (TMD). A total of 179 records with 57
features were merged and matched by unique date. There are three
main parts in this work. Firstly, a decision tree induction algorithm
(C4.5) was used to classify the rain status into either rain or no-rain.
The overall accuracy of the classification tree reaches 94.41% with
five-fold cross validation. The C4.5 algorithm was also used to
classify the rain amount into three classes, no-rain (0-0.1 mm),
few-rain (0.1-10 mm), and moderate-rain (>10 mm), for which the overall
accuracy of the classification tree reaches 62.57%. Secondly, an ANN
was applied to predict the rainfall amount, and the root mean square
error (RMSE) was used to measure the training and testing errors of
the ANN. The ANN yields a lower RMSE, 0.171, for
daily rainfall estimates than for next-day and next-2-day
estimates. Thirdly, the ANN and SVM techniques were also used to
classify the rain amount into the three classes, no-rain, few-rain, and
moderate-rain, defined above. The ANN and SVM models achieved
overall accuracies of 68.15% and 69.10%, respectively, for same-day
prediction. The results illustrate the relative
predictive power of the different methods for rainfall estimation.
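The three-class labelling and the overall accuracy measure can be sketched as follows. The class limits come from the abstract, but it does not say which class the boundary values 0.1 mm and 10 mm belong to, so the inclusive upper bounds here are an assumption. The sample amounts and predictions are hypothetical.

```python
def rain_class(amount_mm):
    """Map a daily rain amount in mm to one of the three classes."""
    if amount_mm <= 0.1:       # assumed: 0.1 mm counts as no-rain
        return "no-rain"
    if amount_mm <= 10.0:      # assumed: 10 mm counts as few-rain
        return "few-rain"
    return "moderate-rain"

def overall_accuracy(predicted, actual):
    """Fraction of records whose predicted class matches the actual one."""
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)

# Hypothetical measured amounts and model predictions.
actual = [rain_class(mm) for mm in [0.0, 0.05, 3.2, 12.5]]
predicted = ["no-rain", "few-rain", "few-rain", "moderate-rain"]
accuracy = overall_accuracy(predicted, actual)
```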
Abstract: The mitigation of crop loss due to damaging freezes requires accurate air temperature prediction models. An improved model for temperature prediction in Georgia was developed by including information on seasonality and modifying parameters of an existing artificial neural network model. Alternative models were compared by instantiating and training multiple networks for each model. The inclusion of up to 24 hours of prior weather information and inputs reflecting the day of year were among improvements that reduced average four-hour prediction error by 0.18°C compared to the prior model. Results strongly suggest model developers should instantiate and train multiple networks with different initial weights to establish appropriate model parameters.
Abstract: In recent years, the underground water sources in
southern Taiwan have become salinized because of saltwater
intrusion. This study explores the adsorption characteristics of
activated carbon for salinizing inorganic salts using isothermal
adsorption experiments and provides a model analysis. The
temperature for the isothermal adsorption experiments ranged
from 5 to 45 ℃, and the amount adsorbed varied from 28.21 to
33.87 mg/g. All experimental adsorption data can be fitted to both
the Langmuir and the Freundlich models. The thermodynamic
parameters for perchlorate adsorption onto granular activated carbon were
calculated as -0.99 to -1.11 kcal/mol for ΔG°, -0.6 kcal/mol for ΔH°,
and 1.21 to 1.84 kcal/mol for ΔS°. This shows that the adsorption
process on granular activated carbon is spontaneous and exothermic.
Observations of adsorption behavior indicate that low ionic strength, low pH
values, and low temperatures are beneficial to the adsorption removal of
perchlorate with granular activated carbon.
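For reference, the two isotherm models and the thermodynamic relation behind the reported parameters are usually written as below. The symbols are assumptions, since the abstract defines none of them: q_e is the amount adsorbed at equilibrium, C_e the equilibrium concentration, q_max and K_L the Langmuir parameters, K_F and n the Freundlich parameters, and T the absolute temperature.

```latex
\begin{align}
q_e &= \frac{q_{\max} K_L C_e}{1 + K_L C_e} && \text{(Langmuir)} \\
q_e &= K_F \, C_e^{1/n} && \text{(Freundlich)} \\
\Delta G^\circ &= \Delta H^\circ - T \, \Delta S^\circ
\end{align}
```

A negative ΔG° corresponds to a spontaneous process and a negative ΔH° to an exothermic one, which is the reading given in the abstract.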
Abstract: Predicting software quality during the development life cycle of a software project helps the development organization make efficient use of available resources to produce a product of the highest quality. A “whether a module is faulty or not” approach can be used to predict the quality of a software module. A number of software quality prediction models described in the literature are based on genetic algorithms, artificial neural networks and other data mining algorithms. One promising approach to quality prediction is based on clustering techniques. Most quality prediction models based on clustering techniques use the K-means, Mixture-of-Gaussians, Self-Organizing Map, Neural Gas or fuzzy K-means algorithms for prediction. All these techniques require a predefined structure; that is, the number of neurons or clusters must be known before the clustering process starts. With Growing Neural Gas, in contrast, there is no need to predetermine the number of neurons or the topology of the structure: it starts with a minimal neuron structure that is incremented during training until it reaches a user-defined maximum number of clusters. Hence, in this work we use Growing Neural Gas as the underlying clustering algorithm, which produces an initial set of labeled clusters from the training data set; this set of clusters is then used to predict the quality of the test data set of software modules. The best testing results show 80% accuracy in evaluating the quality of software modules. Hence, the proposed technique can be used by programmers to evaluate the quality of modules during software development.
Abstract: A novel typical day prediction model has been built and validated against measured data from a grid-connected solar photovoltaic (PV) system in Macau. Unlike the conventional statistical method used in previous studies of PV systems, which obtains results by averaging nearby continuous points, the present typical day statistical method obtains the value at every minute of a typical day by averaging discontinuous points at the same minute on different days. This typical day statistical method based on discontinuous point averaging makes it possible to obtain Gaussian-shaped dynamical distributions for solar irradiance and output power in a yearly or monthly typical day. Based on the yearly typical day statistical analysis, the maximum possible accumulated output energy in a year under on-site climate conditions and the corresponding optimal PV system running time are obtained. Periodic Gaussian-shaped prediction models for solar irradiance, output energy and system energy efficiency have been built, and their coefficients have been determined from the yearly, maximum monthly and minimum monthly typical day Gaussian distribution parameters, which are obtained by iterating to minimize the Root Mean Squared Deviation (RMSD). With the present model, the dynamical effects due to the time of day are retained, and the day-to-day uncertainty due to changing weather is smoothed but still included. The periodic Gaussian-shaped correlations for solar irradiance, output power and system energy efficiency compare favorably with data from the PV system in Macau and prove to be an improvement over previous models.
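The discontinuous point averaging described above can be sketched as a per-minute mean across days, followed by a Gaussian profile of the kind the abstract says is fitted to the result. The function names, Gaussian parameter names and sample readings are hypothetical; the paper's actual parameterization is not given in the abstract.

```python
import math

def typical_day(days):
    """Average the readings taken at the same minute on different days.

    days: list of equal-length lists, one reading per minute of a day.
    """
    n_days = len(days)
    n_minutes = len(days[0])
    return [sum(day[m] for day in days) / n_days for m in range(n_minutes)]

def gaussian_profile(minute, peak, peak_minute, width):
    """Gaussian-shaped daily curve (assumed form) evaluated at one minute."""
    return peak * math.exp(-((minute - peak_minute) ** 2) / (2.0 * width ** 2))

# Two hypothetical days with three readings each.
profile = typical_day([[0.0, 2.0, 4.0], [2.0, 4.0, 6.0]])
at_peak = gaussian_profile(720, peak=800.0, peak_minute=720, width=150.0)
```

Averaging across days at a fixed minute keeps the within-day shape intact, whereas averaging neighbouring minutes within one day would smooth that shape away.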