Abstract: This paper presents a method for evaluating the effect
of aggregate angularity on hot mix asphalt (HMA) properties and its
relationship to permanent deformation resistance. The research
concluded that aggregate particle angularity had a significant effect
on permanent deformation performance, and that increasing coarse
aggregate angularity increased the resistance of mixes to permanent
deformation. A comparison between the measured data and the
predictions of existing permanent deformation models showed the
limits of those models. The numerical analysis described the permanent
deformation zones and concluded that angularity affects the onset of
these zones. Prediction of permanent deformation helps road
agencies, and by extension economists and engineers, determine the
best approach for maintenance, rehabilitation, and new construction
of road infrastructure.
Abstract: Software fault prediction models are created by using
the source code, metrics computed from the same or a previous version
of the code, and related fault data. Some companies do not store and keep
track of all the artifacts required for software fault prediction.
To construct a fault prediction model for such companies, training
data from other projects can be one potential solution. The earlier a
fault is predicted, the less it costs to correct. The training
data consist of metrics data and related fault data at the function/module
level. This paper investigates fault prediction at an early stage using
cross-project data, focusing on the design metrics. In this study, an
empirical analysis is carried out to validate design metrics for cross-project
fault prediction. The machine learning technique used for
evaluation is Naïve Bayes. The design phase metrics of other projects
can be used as an initial guideline for projects where no previous
fault data is available. We analyze seven datasets from the NASA
Metrics Data Program which offer design as well as code metrics.
Overall, the results of cross-project prediction are comparable to
within-company learning.
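As a sketch of the evaluation step, a minimal Gaussian Naïve Bayes classifier can be trained on one project's metrics and applied to another. The three per-module design metrics and fault labels below are synthetic stand-ins, not the NASA MDP data:

```python
import numpy as np

def fit_gnb(X, y):
    """Store per-class feature means, variances and class priors."""
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        params[c] = (Xc.mean(axis=0), Xc.var(axis=0) + 1e-9, len(Xc) / len(X))
    return params

def predict_gnb(params, X):
    """Pick the class with the highest Gaussian log-posterior."""
    classes = sorted(params)
    scores = []
    for c in classes:
        mu, var, prior = params[c]
        ll = -0.5 * np.sum(np.log(2 * np.pi * var) + (X - mu) ** 2 / var, axis=1)
        scores.append(ll + np.log(prior))
    return np.array(classes)[np.argmax(np.array(scores), axis=0)]

# Train on metrics from "project A", predict fault-proneness in "project B".
rng = np.random.default_rng(0)
X_a = np.vstack([rng.normal(2, 1, (50, 3)), rng.normal(8, 1, (50, 3))])
y_a = np.array([0] * 50 + [1] * 50)        # 0 = fault-free, 1 = fault-prone
params = fit_gnb(X_a, y_a)
X_b = np.array([[2.1, 1.9, 2.2], [7.8, 8.3, 8.0]])   # two modules of project B
print(predict_gnb(params, X_b).tolist())   # -> [0, 1]
```

The cross-project setting shows up only in the data split: the model never sees project B's fault labels, which is exactly the situation of a company with no fault history.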
Abstract: Cloud computing is the innovative and leading
information technology model for enabling convenient, on-demand
network access to a shared pool of configurable computing resources
that can be rapidly provisioned and released with minimal
management effort. In this paper, we aim at the development of a
workflow management system for cloud computing platforms, based
on our previous research on the dynamic allocation of cloud
computing resources and its workflow process. We took advantage of
HTML5 technology and developed a web-based workflow interface.
In order to enable the combination of many tasks running on the cloud
platform in sequence, we designed a mechanism and developed an
execution engine for workflow management on clouds. We also
established a prediction model which was integrated with the job
queuing system to estimate the waiting time and cost of individual
tasks on different computing nodes, thereby helping users achieve
maximum performance at the lowest cost. This proposed effort has the
potential to provide an efficient, resilient, and elastic environment
for cloud computing platforms. This development also helps boost user
productivity by promoting a flexible workflow interface that lets users
design and control their tasks' flow from anywhere.
Abstract: The main objective of this paper is to provide a new
methodology for road safety assessment in Oman through the
development of suitable accident prediction models. A generalized
linear modeling (GLM) technique with Poisson or negative binomial
regression (NBR), implemented in the SAS package, was used to develop
these models. The paper utilized accident data from 31 un-signalized
T-intersections over three years. Five goodness-of-fit
measures were used to assess the overall quality of the developed
models. Two types of models were developed separately; the flow-based
models including only traffic exposure functions, and the full
models containing both exposure functions and other significant
geometry and traffic variables.
The results show that traffic exposure functions produced a much
better fit to the accident data. The most effective geometric variables
were major-road mean speed, minor-road 85th percentile speed,
major-road lane width, distance to the nearest junction, and right-turn
curb radius.
The developed models can be used for intersection treatment or
upgrading and specify the appropriate design parameters of T-intersections.
Finally, the models presented in this paper reflect intersection
conditions in Oman and could represent typical conditions in
several countries in the Middle East region, especially the Gulf countries.
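The fitting step can be sketched as a Poisson GLM estimated by iteratively reweighted least squares (IRLS), the standard fitting scheme for GLMs. The exposure form ln(AADT) in the linear predictor and all data below are illustrative assumptions, not the Omani intersection data:

```python
import numpy as np

def fit_poisson_glm(X, y, n_iter=25):
    """Poisson regression with log link, fitted by IRLS."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = np.exp(X @ beta)
        W = mu                              # Poisson working weights
        z = X @ beta + (y - mu) / mu        # working response
        beta = np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (W * z))
    return beta

# Synthetic intersections: accidents ~ Poisson(exp(b0 + b1 * ln(AADT)))
rng = np.random.default_rng(1)
log_aadt = rng.uniform(6, 9, 200)           # log traffic exposure
X = np.column_stack([np.ones(200), log_aadt])
y = rng.poisson(np.exp(-3.0 + 0.5 * log_aadt))
beta = fit_poisson_glm(X, y)
print(np.round(beta, 2))   # estimates approach the generating values (-3.0, 0.5)
```

A full model in the abstract's sense would simply add columns for speed, lane width and the other significant geometry variables to `X`.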
Abstract: The arm length, hand length, hand breadth and middle
finger length of 1540 right-handed industrial workers of Haryana
state were used to assess the relationship between upper limb
dimensions and stature. Initially, the data were analyzed using basic
univariate analysis and independent t-tests; then simple and multiple
linear regression models were used to estimate stature using SPSS
(version 17). There was a positive correlation between upper limb
measurements (hand length, hand breadth, arm length and middle
finger length) and stature (p < 0.01), which was highest for hand
length. The accuracy of stature prediction ranged from ± 54.897 mm
to ± 58.307 mm. The use of multiple regression equations gave better
results than simple regression equations. This study provides new
forensic standards for stature estimation from the upper limb
measurements of male industrial workers of Haryana (India). The
results of this research indicate that stature can be determined
accurately using hand dimensions when only the upper limb is
available, as in cases such as explosions, train/plane crashes,
mutilated bodies, etc. The regression formulae derived in this study
will be useful to anatomists, archaeologists, anthropologists, design
engineers and forensic scientists for reasonably accurate prediction
of stature.
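A simple regression of stature on hand length, the strongest correlate reported, can be sketched as follows. The coefficients are fitted to synthetic data and are not the paper's Haryana equations:

```python
import numpy as np

def ols(x, y):
    """Return (intercept, slope) minimizing squared error."""
    slope = np.cov(x, y, bias=True)[0, 1] / np.var(x)
    return y.mean() - slope * x.mean(), slope

rng = np.random.default_rng(2)
hand_mm = rng.normal(190, 10, 300)                        # hand length, mm
stature_mm = 600 + 5.8 * hand_mm + rng.normal(0, 55, 300) # illustrative relation
b0, b1 = ols(hand_mm, stature_mm)
pred = b0 + b1 * 185.0          # estimated stature for a 185 mm hand
print(round(pred))
```

A multiple-regression variant, as the abstract recommends, would stack hand length, hand breadth, arm length and middle finger length into a design matrix and solve the normal equations jointly.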
Abstract: Pulmonary Function Tests are important non-invasive
diagnostic tests to assess respiratory impairments and provide
quantifiable measures of lung function. Spirometry is the most
frequently used measure of lung function and plays an essential role
in the diagnosis and management of pulmonary diseases. However,
the test requires considerable patient effort and cooperation, which
vary markedly with patient age, resulting in incomplete data sets.
This paper presents a nonlinear model built using multivariate
adaptive regression splines (MARS) and a random forest (RF)
regression model to predict the missing spirometric features.
Random forest based feature
selection is used to enhance both the generalization capability and the
model interpretability. In the present study, flow-volume data are
recorded for N= 198 subjects. The ranked order of feature importance
index calculated by the random forests model shows that the
spirometric features FVC, FEF25, PEF, FEF25-75, FEF50 and the
demographic parameter height are the important descriptors. A
comparison of the performance of both models shows that the
prediction ability of MARS with the top two ranked features, namely
FVC and FEF25, is higher, yielding a model fit of R2 = 0.96 and
R2 = 0.99 for normal and abnormal subjects. The Root Mean Square
Error analysis of the RF model and the MARS model also shows that
the latter is capable of predicting the missing values of FEV1 with a
notably lower error value of 0.0191 (normal subjects) and 0.0106
(abnormal subjects) with the aforementioned input features. It is
concluded that combining feature selection with a prediction model
provides a minimum subset of predominant features to train the
model, as well as yielding better prediction performance. This
analysis can provide clinicians with an intelligent support system
for medical diagnosis and the improvement of clinical care.
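The feature-ranking step can be sketched with a random forest's impurity-based importances. The feature names and the synthetic relationship (FEV1 driven mainly by FVC) are assumptions for illustration only:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(3)
n = 198                                   # matches the study's subject count
X = rng.normal(size=(n, 4))
features = ["FVC", "FEF25", "PEF", "height"]
# Synthetic target: FEV1 depends mostly on FVC, weakly on FEF25.
fev1 = 0.8 * X[:, 0] + 0.3 * X[:, 1] + rng.normal(0, 0.05, n)

rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, fev1)
ranked = sorted(zip(rf.feature_importances_, features), reverse=True)
print([name for _, name in ranked])       # FVC ranks first
```

Restricting the downstream MARS model to the top-ranked features is then a simple column selection on `X`.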
Abstract: The development of allometric models is crucial to
accurate forest biomass/carbon stock assessment. The aim of this
study was to develop a set of biomass prediction models that will
enable the determination of total tree aboveground biomass for
savannah woodland area in Niger State, Nigeria. Based on the data
collected through biometric measurements of 1816 trees and
destructive sampling of 36 trees, five species specific and one site
specific models were developed. The sample size was distributed
equally between the five most dominant species in the study site
(Vitellaria paradoxa, Irvingia gabonensis, Parkia biglobosa,
Anogeissus leiocarpus, Pterocarpus erinaceus). First, equations
were developed for the five individual species. Second, the five
species were pooled to develop a mixed-species allometric equation.
Overall, there was a strong positive relationship between total tree
biomass and stem diameter. Coefficients of determination (R2)
ranging from 0.93 to 0.99 (P < 0.001) were realised for the models,
with considerably low standard errors of the estimates (SEE),
confirming that total tree aboveground biomass has a significant
relationship with dbh. F-test values for the biomass prediction
models were also significant at p
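Allometric biomass models of this kind are typically power laws, B = a * dbh^b, fitted as a straight line in log-log space. The sketch below fits synthetic trees, not the Niger State destructive sample:

```python
import numpy as np

rng = np.random.default_rng(4)
dbh = rng.uniform(5, 60, 36)        # stem diameter, cm (36 sampled trees)
# Generating power law with multiplicative noise; 0.12 and 2.4 are illustrative.
biomass = 0.12 * dbh ** 2.4 * np.exp(rng.normal(0, 0.1, 36))

# ln(B) = ln(a) + b * ln(dbh): ordinary least squares in log space.
b, ln_a = np.polyfit(np.log(dbh), np.log(biomass), 1)
a = np.exp(ln_a)
print(round(a, 2), round(b, 2))     # recovers roughly 0.12 and 2.4
```

A mixed-species equation is obtained by pooling the log-transformed observations of all five species before the same fit.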
Abstract: Previous studies on financial distress prediction choose
the conventional failing and non-failing dichotomy; however, the
distressed extent differs substantially among different financial
distress events. To solve the problem, “non-distressed”, “slightly-distressed”
and “reorganization and bankruptcy” are used in our article
to approximate the continuum of corporate financial health. This paper
explains different financial distress events using the two-stage method.
First, this investigation adopts firm-specific financial ratios, corporate
governance and market factors to measure the probability of various
financial distress events based on multinomial logit models.
Specifically, the bootstrapping simulation is performed to examine the
difference of estimated misclassifying cost (EMC). Second, this work
further applies macroeconomic factors to establish the credit cycle
index and determines the distressed cut-off indicator of the two-stage
models using such index. Two different models, one-stage and
two-stage prediction models are developed to forecast financial
distress, and the results acquired from different models are compared
with each other and with the collected data. The findings show that
the one-stage model has a lower misclassification error rate than the
two-stage model and is therefore more accurate.
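The multinomial logit core of the first stage can be sketched as softmax regression fitted by gradient descent. The two firm features (a leverage ratio and a market factor) and the class centers are illustrative assumptions:

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)      # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def fit_mnl(X, y, k, lr=0.1, n_iter=2000):
    """Multinomial logit: minimize cross-entropy by gradient descent."""
    W = np.zeros((X.shape[1], k))
    Y = np.eye(k)[y]                          # one-hot targets
    for _ in range(n_iter):
        P = softmax(X @ W)
        W -= lr * X.T @ (P - Y) / len(X)
    return W

rng = np.random.default_rng(5)
# Class 0 = non-distressed, 1 = slightly distressed, 2 = reorganization/bankruptcy
centers = np.array([[0.2, 0.1], [0.6, -0.1], [1.0, -0.3]])
X = np.vstack([rng.normal(c, 0.08, (60, 2)) for c in centers])
X = np.column_stack([np.ones(len(X)), X])     # add intercept
y = np.repeat([0, 1, 2], 60)
W = fit_mnl(X, y, 3)
probs = softmax(np.array([[1, 1.0, -0.3]]) @ W)
print(int(probs.argmax()))    # a highly levered firm falls in class 2
```

The predicted class probabilities, rather than a hard failing/non-failing label, are what allow the estimated misclassification cost comparison described in the abstract.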
Abstract: The Cone Penetration Test (CPT) is a common in-situ
test which generally investigates a much greater volume of soil more
quickly than possible from sampling and laboratory tests. Therefore,
it has the potential to realize both cost savings and assessment of soil
properties rapidly and continuously. The principal objective of this
paper is to demonstrate the feasibility and efficiency of using
artificial neural networks (ANNs) to predict the soil angle of internal
friction (Φ) and the soil modulus of elasticity (E) from CPT results
considering the uncertainties and non-linearities of the soil. In
addition, ANNs are used to study the influence of different
parameters and recommend which parameters should be included as
input parameters to improve the prediction. Neural networks discover
relationships in the input data sets through the iterative presentation
of the data and intrinsic mapping characteristics of neural topologies.
General Regression Neural Network (GRNN) is one of the powerful
neural network architectures which is utilized in this study. A large
amount of field and experimental data including CPT results, plate
load tests, direct shear box, grain size distribution and calculated data
of overburden pressure was obtained from a large project in the
United Arab Emirates. This data was used for the training and the
validation of the neural network. A comparison was made between
the obtained results from the ANN's approach, and some common
traditional correlations that predict Φ and E from CPT results with
respect to the actual results of the collected data. The results show
that the ANN is a very powerful tool. Very good agreement was
obtained between estimated results from ANN and actual measured
results with comparison to other correlations available in the
literature. The study recommends some easily available parameters
that should be included in the estimation of the soil properties to
improve the prediction models. It is shown that the use of the friction
ratio in the estimation of Φ and of the fines content in the
estimation of E considerably improves the prediction models.
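A GRNN is a one-pass kernel estimator: each training pattern contributes a Gaussian bump, and the prediction is the weighted mean of the stored targets. The CPT inputs (normalized tip resistance and friction ratio), the target relation and the smoothing width below are illustrative:

```python
import numpy as np

def grnn_predict(X_train, y_train, X_query, sigma=0.3):
    """General Regression Neural Network (Nadaraya-Watson form)."""
    d2 = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    w = np.exp(-d2 / (2 * sigma ** 2))            # Gaussian pattern weights
    return (w @ y_train) / w.sum(axis=1)

rng = np.random.default_rng(6)
X = rng.uniform(0, 1, (100, 2))                   # normalized qc, friction ratio
phi = 28 + 10 * X[:, 0] - 4 * X[:, 1]             # synthetic friction angle, deg
est = grnn_predict(X, phi, np.array([[0.5, 0.5]]))
print(round(float(est[0]), 1))                    # close to 28 + 5 - 2 = 31 deg
```

Because a GRNN stores its training data rather than fitting weights, adding a new input parameter (e.g. fines content for E) is just an extra column in `X`, which is what makes the input-selection study in the abstract straightforward.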
Abstract: Urban areas have expanded throughout the globe.
Monitoring and modelling urban growth have become a necessity for
sustainable urban planning and decision making. Urban prediction
models are important tools for analyzing the causes and consequences
of urban land use dynamics. The objective of this research paper is
to analyze and model the urban change that occurred from 1990 to
2000 using CORINE land cover maps.
The model was developed using drivers of urban changes (such as
road distance, slope, etc.) under an Artificial Neural Network
modelling approach. Validation was achieved using a prediction map
for 2006 which was compared with a real map of Urban Atlas of
2006. The accuracy assessment produced a Kappa index of agreement
of 0.639 and a Cramer's V value of 0.648. These encouraging results
indicate the importance of the developed urban growth prediction
model, which, using a set of commonly available biophysical drivers,
could serve as a management tool for the assessment of urban change.
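The Kappa validation statistic can be computed directly from the predicted-vs-reference confusion matrix; the 2x2 counts below are illustrative, not the Urban Atlas comparison:

```python
import numpy as np

def kappa(conf):
    """Cohen's Kappa: observed agreement corrected for chance agreement."""
    n = conf.sum()
    po = np.trace(conf) / n                                  # observed agreement
    pe = (conf.sum(axis=0) * conf.sum(axis=1)).sum() / n**2  # chance agreement
    return (po - pe) / (1 - pe)

conf = np.array([[800, 120],    # rows: reference non-urban / urban
                 [90, 390]])    # cols: predicted non-urban / urban
print(round(kappa(conf), 3))    # -> 0.672
```

Values in this range indicate substantial agreement beyond chance, which is why a Kappa of 0.639 supports the model despite an imperfect raw accuracy.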
Abstract: Near infrared (NIR) spectroscopy has always been of
great interest in the food and agriculture industries. The development
of prediction models has facilitated the estimation process in recent
years. In this study, 110 crude palm oil (CPO) samples were used to
build a free fatty acid (FFA) prediction model. 60% of the collected
data were used for training purposes and the remaining 40% used for
testing. The visible peaks on the NIR spectrum were at 1725 nm and
1760 nm, indicating the existence of the first overtone of C-H bands.
Principal component regression (PCR) was applied to the data in
order to build this mathematical prediction model. The optimal
number of principal components was 10. The results showed
R2=0.7147 for the training set and R2=0.6404 for the testing set.
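The PCR pipeline has two steps: project the centered spectra onto the leading principal components, then regress FFA on the scores. The sketch below keeps the paper's 60/40 split and 10-component choice but uses synthetic spectra, so it does not reproduce the reported R2 values:

```python
import numpy as np

def pcr_fit(X, y, n_comp):
    """Principal component regression: PCA scores + least squares."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:n_comp].T                 # loadings
    T = Xc @ V                        # scores
    A = np.column_stack([np.ones(len(T)), T])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return X.mean(axis=0), V, coef

def pcr_predict(model, X):
    mu, V, coef = model
    return coef[0] + (X - mu) @ V @ coef[1:]

# Synthetic spectra: three latent factors drive both spectra and FFA.
rng = np.random.default_rng(7)
latent = rng.normal(size=(110, 3))
spectra = latent @ rng.normal(size=(3, 50)) + rng.normal(0, 0.05, (110, 50))
ffa = latent @ np.array([0.5, -0.2, 0.1]) + rng.normal(0, 0.02, 110)

model = pcr_fit(spectra[:66], ffa[:66], n_comp=10)   # 60% training
pred = pcr_predict(model, spectra[66:])              # 40% testing
r2 = 1 - np.sum((ffa[66:] - pred) ** 2) / np.sum((ffa[66:] - ffa[66:].mean()) ** 2)
print(round(r2, 3))
```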
Abstract: Characterization of the engineering behavior of
unsaturated soil is dependent on the soil-water characteristic curve
(SWCC), a graphical representation of the relationship between water
content or degree of saturation and soil suction. A reasonable
description of the SWCC is thus important for the accurate prediction
of unsaturated soil parameters. The measurement procedures for
determining the SWCC, however, are difficult, expensive, and
time-consuming. During the past few decades, researchers have laid a
major focus on developing empirical equations for predicting the
SWCC, with a large number of empirical models suggested. One of
the most crucial questions is how precisely existing equations can
represent the SWCC. As different models have different ranges of
capability, it is essential to evaluate the precision of the SWCC
models used for each particular soil type for better SWCC estimation.
It is expected that better estimation of SWCC would be achieved via
a thorough statistical analysis of its distribution within a particular
soil class. With this in view, a statistical analysis was conducted in
order to evaluate the reliability of the SWCC prediction models
against laboratory measurement. Optimization techniques were used
to obtain the best-fit of the model parameters in four forms of SWCC
equation, using laboratory data for relatively coarse-textured (i.e.,
sandy) soil. The four most prominent SWCCs were evaluated and
computed for each sample. The result shows that the Brooks and
Corey model is the most consistent in describing the SWCC for sand
soil type. The Brooks and Corey model predictions also exhibit
compatibility with samples ranging from low to high soil water
content among the samples evaluated in this study.
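The Brooks and Corey best-fit reduces to a straight line in log-log space, since Se = (psi_b / psi)^lambda for psi above the air-entry value implies ln(Se) is linear in ln(psi). The suction/saturation pairs below are a synthetic sand-like example, not the laboratory data:

```python
import numpy as np

def fit_brooks_corey(psi, Se):
    """Least-squares fit of the Brooks-Corey SWCC in log-log space."""
    mask = Se < 1.0                     # points past the air-entry value
    slope, intercept = np.polyfit(np.log(psi[mask]), np.log(Se[mask]), 1)
    lam = -slope                        # pore-size distribution index
    psi_b = np.exp(intercept / lam)     # air-entry (bubbling) pressure
    return psi_b, lam

psi = np.array([2.0, 4.0, 8.0, 16.0, 32.0])   # matric suction, kPa
Se = (1.5 / psi) ** 0.7                        # generated with psi_b=1.5, lam=0.7
psi_b, lam = fit_brooks_corey(psi, Se)
print(round(psi_b, 2), round(lam, 2))          # -> 1.5 0.7
```

The same two-parameter optimization, repeated per sample and per candidate equation, is the basis of the statistical comparison the abstract describes.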
Abstract: This work had three stages. In the first stage, the
pull-out process was examined for steel fibers embedded in concrete
at one end and pulled out of the concrete at an angle to the
pulling-force direction; the angle was varied. Jumps were observed
on the obtained force-displacement diagrams. To explain this
mechanical behavior, a microscopic experimental investigation of the
fiber channel in the concrete surface was performed using a
KEYENCE VHX2000 microscope.
At the second stage, load versus crack-opening-displacement
diagrams were obtained by breaking homogeneously reinforced and
layered fiberconcrete prisms (with dimensions 10x10x40 cm)
subjected to 4-point bending. After testing, the main crack was
analyzed.
At the third stage, a prediction model was elaborated for
fiberconcrete beam failure under bending, using the following data:
a) diagrams of fibers pulled out at different angles; b) experimental
data on the locations of straight steel fibers in the main crack.
Experimental and theoretical (modeling) data were compared.
Abstract: Pasta is one of the most widely consumed food products around the world. Rapid determination of the moisture content in pasta will assist food processors to provide online quality control of pasta during large-scale production. A rapid Fourier transform near-infrared (FT-NIR) method was developed for determining moisture content in pasta. A calibration set of 150 samples, a validation set of 30 samples and a prediction set of 25 samples of pasta were used. The diffuse reflection spectra of different types of pasta were measured by an FT-NIR analyzer in the 4,000-12,000 cm-1 spectral range. Calibration and validation sets were designed for the conception and evaluation of the method's adequacy in the moisture content range of 10 to 15 percent (w.b.) of the pasta. The prediction models, based on partial least squares (PLS) regression, were developed in the near-infrared. Conventional criteria such as R2, the root mean square error of cross validation (RMSECV) and the root mean square error of estimation (RMSEE), as well as the number of PLS factors, were considered for the selection among three pre-processing methods (vector normalization, minimum-maximum normalization and multiplicative scatter correction). Spectra of pasta samples were treated with different mathematical pre-treatments before being used to build models between the spectral information and the moisture content. The moisture content in pasta predicted by FT-NIR methods had very good correlation with the values determined via traditional methods (R2 = 0.983), which clearly indicated that FT-NIR methods could be used as an effective tool for rapid determination of moisture content in pasta. The best calibration model was developed with min-max normalization (MMN) spectral pre-processing (R2 = 0.9775). The MMN pre-processing method was found most suitable, and a maximum coefficient of determination (R2) value of 0.9875 was obtained for the calibration model developed.
Abstract: Conical sections and shells made from metal plates are widely used in various industrial applications. The 3-roller conical bending process is preferably used to produce such conical sections and shells. The bending mechanics involved in the process are complex, and little work has been done in this area. In the present paper, an analytical model is developed to predict the bending force acting during the 3-roller conical bending process. To verify the developed model, conical bending experiments were performed and the analytical and experimental results were compared. The force predicted by the analytical model is in close proximity to the experimental results, with a prediction error of ±10%; hence the model gives quite satisfactory results. The present model is also compared with a previously published bending force prediction model, and it is found that the present model gives better results. The developed model can be used to estimate the bending force during the 3-roller bending process and can be useful to designers of 3-roller conical bending machines.
Abstract: Most greenhouse growers desire a predictable amount of yield in order to accurately meet market requirements. The purpose of this paper is to model a simple but often satisfactory supervised classification method. The original naive Bayes has a serious weakness: it retains redundant predictors. In this paper, a regularization technique is used to obtain a computationally efficient classifier based on naive Bayes. The suggested construction, using an L1 penalty, is capable of clearing out redundant predictors, and a modification of the LARS algorithm is devised to solve the resulting problem, making the method applicable to a wide range of data. In the experimental section, a study is conducted to examine the effect of redundant and irrelevant predictors and to test the method on a WSG data set of tomato yields, where there are many more predictors than data points and the urgent need to predict weekly yield is the goal of this approach. Finally, the modified approach is compared with several naive Bayes variants and other classification algorithms (SVM and kNN), and is shown to be fairly good.
Abstract: Glass fiber reinforced polymer (GFRP) laminates have been widely used because of their unique mechanical and physical properties, such as high specific strength, stiffness and corrosion resistance. Accordingly, the demand for precise grinding of composites has been increasing enormously. Grinding is one of the obligatory methods for fabricating products with composite materials, and it is usually the final operation in the assembly of structural laminates. In this experimental study, an attempt has been made to develop an empirical model to predict the surface roughness of ground GFRP composite laminate with respect to the influencing grinding parameters by the factorial design approach of design of experiments (DOE). The significance of grinding parameters and their three-factor interaction effects on grinding of GFRP composite have been analyzed in detail. An empirical equation has been developed to attain minimum surface roughness in GFRP laminate grinding.
Abstract: Uncertain data is believed to be an important issue in building a prediction model. The main objective in time series uncertainty analysis is to formulate uncertain data in order to gain knowledge and fit a low-dimensional model prior to a prediction task. This paper discusses the performance of a number of techniques in dealing with uncertain data, specifically those which handle the uncertain-data condition by minimizing the loss of compression properties.
Abstract: This study is focused on the development of prediction models for the ozone concentration time series. The prediction model is built based on a chaotic approach. First, the chaotic nature of the time series is detected by means of the phase space plot and the Cao method. Then, the prediction model is built and the local linear approximation method is used for forecasting purposes. A traditional autoregressive linear prediction model is also built, and an improvement to the local linear approximation method is performed. The prediction models are applied to the hourly ozone time series observed at the benchmark station in Malaysia. Comparison of all models through the calculation of the mean absolute error, root mean squared error and correlation coefficient shows that the model with the improved prediction method is the best. Thus, the chaotic approach is a good approach for developing a prediction model for the ozone concentration time series.
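Local linear approximation can be sketched in a few lines: embed the series as delay vectors, find the nearest neighbors of the current state, and fit a linear map from those states to their successors. The logistic map stands in for the ozone series here, and the embedding dimension and neighbor count are assumptions:

```python
import numpy as np

def lla_forecast(x, dim=2, k=12):
    """One-step local linear forecast from a delay embedding."""
    n = len(x)
    emb = np.column_stack([x[i:n - dim + 1 + i] for i in range(dim)])
    states, query = emb[:-1], emb[-1]     # last delay vector is the query
    nxt = x[dim:]                         # successor of each state
    idx = np.argsort(((states - query) ** 2).sum(axis=1))[:k]
    A = np.column_stack([np.ones(k), states[idx]])
    coef, *_ = np.linalg.lstsq(A, nxt[idx], rcond=None)
    return float(coef[0] + query @ coef[1:])

# Chaotic test series: the logistic map x_{t+1} = 3.9 x_t (1 - x_t).
x = np.empty(501)
x[0] = 0.3
for t in range(500):
    x[t + 1] = 3.9 * x[t] * (1 - x[t])

pred = lla_forecast(x[:500])              # forecast the 501st value
print(round(pred, 3), round(x[500], 3))   # forecast vs true next value
```

The "improved" variants in such studies typically adjust the neighborhood size or weight the neighbors; both are one-line changes to the sketch above.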
Abstract: In this paper, we investigated the characteristics of a
clinical dataset with respect to the feature selection and
classification measurements that deal with the missing-values
problem, and we propose appropriate techniques to achieve the aims
of the activity; this research aims to find features that have a high
effect on mortality and on the mortality time frame. We quantify the
complexity of the clinical dataset. According to this complexity, we
propose a data mining process to cope with missing values, high
dimensionality, and the prediction problem, using the methods of
missing-value replacement, feature selection, and classification. The
experimental results will be extended to develop a prediction model
for cardiology.