Abstract: Hydrologic models are increasingly used as tools to
predict stormwater quantity and quality from urban catchments.
However, due to a range of practical issues, most models produce
gross errors in simulating complex hydraulic and hydrologic systems.
Difficulty in finding a robust approach for model calibration is one of
the main issues. Though automatic calibration techniques are
available, they are rarely used in common commercial hydraulic and
hydrologic modelling software, e.g. MIKE URBAN. This is partly
due to the need for a large number of parameters and large datasets in
the calibration process. To overcome this practical issue, a
framework for automatic calibration of a hydrologic model was
developed in R platform and presented in this paper. The model was
developed based on the time-area conceptualization. Four calibration
parameters, including initial loss, reduction factor, time of
concentration and time-lag were considered as the primary set of
parameters. Using these parameters, automatic calibration was
performed using Approximate Bayesian Computation (ABC). ABC is
a simulation-based technique for performing Bayesian inference
when the likelihood is intractable or computationally expensive to
compute. To test its performance and usefulness, the technique was used to simulate three small catchments in the Gold Coast. For comparison, simulation outcomes for the same three catchments were obtained using the commercial modelling software MIKE URBAN.
The graphical comparison shows strong agreement, with the MIKE URBAN results falling within the upper and lower 95% credible intervals of the posterior predictions obtained via ABC. Statistical validation of the posterior runoff predictions using the coefficient of determination (CD), root mean square error (RMSE) and maximum error (ME) gave reasonable results for the three study catchments. The main benefit of ABC over MIKE URBAN is that ABC provides a posterior distribution for the predicted runoff flow, so the associated prediction uncertainty can be quantified; MIKE URBAN, in contrast, provides only a point estimate. Based on the results of the analysis, the developed ABC framework appears to perform well for automatic calibration.
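To illustrate the mechanism named in this abstract, the following is a minimal ABC rejection sketch. The one-parameter linear-reservoir model, the rainfall series, and all numerical values are invented stand-ins for the paper's four-parameter time-area model; only the accept/reject step reflects ABC itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical rainfall series (mm) and a toy one-parameter runoff model:
# a linear reservoir whose single parameter c plays the role of the
# paper's "reduction factor". None of this is the paper's actual model.
rain = np.array([0.0, 5.0, 12.0, 7.0, 3.0, 1.0, 0.0, 0.0])

def simulate(c, k=0.5):
    """Runoff from c * rain routed through a linear store with recession k."""
    q, store = [], 0.0
    for r in rain:
        store = store * (1 - k) + c * r
        q.append(store * k)
    return np.array(q)

# Synthetic "observations": the model at c = 0.7 plus measurement noise.
observed = simulate(0.7) + rng.normal(0.0, 0.05, rain.size)

def abc_rejection(n_draws=20000, eps=0.1):
    """ABC rejection sampling: draw c from a uniform prior and keep draws
    whose simulated hydrograph is within eps (RMSE) of the observations."""
    accepted = []
    for _ in range(n_draws):
        c = rng.uniform(0.0, 1.0)                          # prior draw
        dist = np.sqrt(np.mean((simulate(c) - observed) ** 2))
        if dist < eps:
            accepted.append(c)
    return np.array(accepted)

posterior = abc_rejection()
lo, hi = np.percentile(posterior, [2.5, 97.5])   # 95% credible interval
```

The accepted draws approximate the posterior of the calibration parameter, from which credible intervals for predictions can be read off, which is exactly the uncertainty information a point-estimate calibration cannot provide.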
Abstract: The aim of this paper is to propose a general
framework for storing, analyzing, and extracting knowledge from
two-dimensional echocardiographic images, color Doppler images,
non-medical images, and general data sets. A number of high
performance data mining algorithms have been used to carry out this
task. Our framework encompasses four layers, namely physical storage, object identification, knowledge discovery, and the user level. Techniques such as the active contour model to identify cardiac chambers, pixel classification to segment the color Doppler echo image, a universal model for image retrieval, a Bayesian method for classification, and parallel algorithms for image segmentation were employed. Using the efficiently constructed feature vector database, one can perform various data mining tasks such as clustering and classification with efficient algorithms, as well as image mining given a query image. All these facilities are included in the framework, which is supported by a state-of-the-art user
interface (UI). The algorithms were tested with actual patient data and the Corel image database, and the results show that their performance is better than previously reported results.
Abstract: We present a probabilistic multinomial Dirichlet classification model for multidimensional data with Gaussian process priors. Here, we consider an efficient computational method that can be used to obtain the approximate posteriors for the latent variables and parameters needed to define the multiclass Gaussian process classification model. We first investigate the process of inducing a posterior distribution for the various parameters and the latent function by using variational Bayesian approximations and the importance sampling method, and then derive a predictive distribution of the latent function needed to classify new samples. The proposed model is applied to classify a synthetic multivariate dataset in order to verify its performance. Experimental results show that our model is more accurate than the other approximation methods.
Abstract: In this paper, we propose a variational EM inference algorithm for the multi-class Gaussian process classification model that can be used in the field of human behavior recognition. This algorithm can simultaneously derive both a posterior distribution of the latent function and estimators of the hyper-parameters in a multiclass Gaussian process classification model. Our algorithm is based on the Laplace approximation (LA) technique and the variational EM framework, and proceeds in two steps, the expectation and maximization steps. First, in the expectation step, using Bayes' formula and the LA technique, we derive an approximate posterior distribution of the latent function indicating the possibility that each observation belongs to a certain class in the Gaussian process classification model. Second, in the maximization step, using the derived posterior distribution of the latent function, we compute the maximum likelihood estimator for the hyper-parameters of the covariance matrix needed to define the prior distribution of the latent function. These two steps are repeated iteratively until a convergence condition is satisfied. Moreover, we apply the proposed algorithm to a human action classification problem using a public database, namely the KTH human action data set. Experimental results reveal that the proposed algorithm shows good performance on this data set.
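The Laplace-approximation step at the heart of the expectation step above can be sketched on the simpler binary case. This is a hedged stand-in for the paper's multiclass E-step, using the standard stabilized Newton iteration for the posterior mode under a logistic likelihood; the kernel, its hyper-parameter values, and the 1-D toy data are invented.

```python
import numpy as np

def rbf_kernel(X, lengthscale=1.0, variance=1.0):
    """Squared-exponential covariance; its hyper-parameters are what the
    M-step of the EM scheme would re-estimate."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

def laplace_posterior_mode(K, y, n_iter=50):
    """Newton iteration for the mode of p(f | y) with a logistic
    likelihood and labels y in {0, 1} (binary stand-in for the
    multiclass Laplace step)."""
    n = len(y)
    f = np.zeros(n)
    for _ in range(n_iter):
        pi = 1.0 / (1.0 + np.exp(-f))      # sigmoid(f)
        W = pi * (1 - pi)                  # negative Hessian of log-lik
        grad = y - pi                      # gradient of log-lik
        B = np.eye(n) + np.sqrt(W)[:, None] * K * np.sqrt(W)[None, :]
        b = W * f + grad
        a = b - np.sqrt(W) * np.linalg.solve(B, np.sqrt(W) * (K @ b))
        f = K @ a
    return f

# Toy 1-D data: the class depends on the sign of the input.
X = np.linspace(-3, 3, 40)[:, None]
y = (X[:, 0] > 0).astype(float)
K = rbf_kernel(X) + 1e-6 * np.eye(len(X))
f_hat = laplace_posterior_mode(K, y)
```

The resulting mode and the Hessian term W define the Gaussian approximation to the posterior of the latent function, which the M-step would then use to update the covariance hyper-parameters.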
Abstract: This paper presents an approach for the classification of an unstructured format description for the identification of file formats. The main contribution of this work is the employment of data mining techniques to support file format selection using just the unstructured text description that comprises the most important format features for a particular organisation. Subsequently, the file format identification method employs a file format classifier and associated configurations to support digital preservation experts with an estimate of the required file format. Our goal is to make use of a format specification knowledge base aggregated from different Web sources in order to select a file format for a particular institution. Using the naive Bayes method, the decision support system recommends a file format for the expert's institution. The proposed methods facilitate file format selection and improve the quality of the digital preservation process. The presented approach is meant to facilitate decision making for the preservation of digital content in libraries and archives using domain expert knowledge and specifications of file formats. To facilitate decision-making, the aggregated information about the file formats is presented as a file format vocabulary that comprises the most common terms characteristic of all researched formats. The goal is to suggest a particular file format based on this vocabulary for analysis by an expert. The sample file format calculation and the calculation results, including probabilities, are presented in the evaluation section.
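The naive Bayes recommendation over a term vocabulary can be sketched as follows. The format descriptions, labels, and vocabulary terms below are invented for illustration and are not taken from the paper's knowledge base; only the multinomial naive Bayes calculation with Laplace smoothing is the technique named in the abstract.

```python
import math
from collections import Counter

# Hypothetical training data: short unstructured descriptions labelled
# with the file format they characterise.
training = [
    ("lossless raster image transparency palette", "PNG"),
    ("raster image lossy compression photographic", "JPEG"),
    ("page layout fixed document printable embedded fonts", "PDF"),
    ("document text printable archival page", "PDF"),
]

def train_nb(samples):
    """Multinomial naive Bayes statistics over the vocabulary terms."""
    class_counts = Counter(label for _, label in samples)
    word_counts = {label: Counter() for label in class_counts}
    vocab = set()
    for text, label in samples:
        for w in text.split():
            word_counts[label][w] += 1
            vocab.add(w)
    return class_counts, word_counts, vocab

def classify(text, class_counts, word_counts, vocab):
    """Pick the format with the highest log posterior, using Laplace
    smoothing so unseen terms do not zero out a class."""
    total = sum(class_counts.values())
    scores = {}
    for label, c in class_counts.items():
        n_words = sum(word_counts[label].values())
        score = math.log(c / total)
        for w in text.split():
            score += math.log((word_counts[label][w] + 1)
                              / (n_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

model = train_nb(training)
predicted = classify("archival printable document", *model)
```

The per-class log scores are exactly the (log) probabilities that the abstract says are presented to the expert alongside the recommended format.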
Abstract: This paper presents a rank correlation curve. The traditional correlation coefficient is valid both for continuous variables and for integer variables using rank statistics. Since the correlation coefficient was already established in rank statistics by Spearman, such a calculation can be extended to a correlation curve.
This paper examines two survey questions for which the survey collected non-continuous variables. We show a weak to moderate correlation. Evidently, one question has a negative effect on the other; a review of the qualitative literature can answer which question, and why. The rank correlation curve shows which collection of responses has a positive slope and which collection of responses has a negative slope. Such information is unavailable from the flat, "first-glance" correlation statistics.
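The flat "first-glance" statistic referred to above is Spearman's rank correlation coefficient; a minimal sketch of computing it for tied, non-continuous responses follows (the curve itself is not reproduced here). The Likert-style responses are invented for illustration.

```python
import numpy as np

def rankdata(x):
    """Average ranks: tied values share the mean of their rank positions,
    which is the standard treatment for integer survey responses."""
    order = np.argsort(x, kind="stable")
    ranks = np.empty(len(x), dtype=float)
    ranks[order] = np.arange(1, len(x) + 1)
    for v in np.unique(x):
        mask = x == v
        ranks[mask] = ranks[mask].mean()
    return ranks

def spearman_rho(x, y):
    """Spearman's rho = Pearson correlation of the rank-transformed data,
    which is what makes it valid for ordinal variables."""
    rx = rankdata(np.asarray(x))
    ry = rankdata(np.asarray(y))
    rx = rx - rx.mean()
    ry = ry - ry.mean()
    return float((rx @ ry) / np.sqrt((rx @ rx) * (ry @ ry)))

# Hypothetical Likert-style responses (1-5) to two survey questions.
q1 = [1, 2, 2, 3, 4, 4, 5, 5]
q2 = [5, 4, 4, 4, 3, 2, 2, 1]
rho = spearman_rho(q1, q2)   # negative: q2 falls as q1 rises
```

A correlation curve would then resolve this single number into locally varying slopes across subsets of the responses.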
Abstract: Two finite element (FEM) models are presented in
this paper to address the random nature of the response of glued
timber structures made of wood segments with variable elastic
moduli evaluated from 3600 indentation measurements. This database served to create the same number of ensembles as there were segments in the tested beam. Statistics of these ensembles were then assigned to the given segments of the beams, and the Latin Hypercube Sampling (LHS) method was employed to perform 100 simulations, resulting in an ensemble of 100 deflections subjected to statistical evaluation. Here, a detailed geometrical arrangement of the individual segments in the laminated beam was considered in the construction of the two-dimensional FEM model subjected to four-point bending to comply with the laboratory tests. Since laboratory
measurements of local elastic moduli may in general suffer from a
significant experimental error, it appears advantageous to exploit the
full scale measurements of timber beams, i.e. deflections, to improve
their prior distributions with the help of the Bayesian statistical
method. This, however, requires an efficient computational model
when simulating the laboratory tests numerically. To this end, a
simplified model based on Mindlin’s beam theory was established.
The improved posterior distributions show that the most significant
change of the Young’s modulus distribution takes place in laminae in
the most strained zones, i.e. in the top and bottom layers within the
beam center region. Posterior distributions of moduli of elasticity
were subsequently utilized in the 2D FEM model and compared with
the original simulations.
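The Latin Hypercube Sampling step named above can be sketched as follows. The segment count, sample count, and the uniform 9-15 GPa modulus range are invented stand-ins for the paper's measured statistics; only the stratified sampling mechanism is the technique itself.

```python
import numpy as np

def latin_hypercube(n_samples, n_dims, rng):
    """Latin Hypercube Sampling on the unit cube: each dimension is
    split into n_samples equal strata and exactly one sample falls in
    each stratum of every dimension."""
    u = rng.uniform(size=(n_samples, n_dims))           # jitter inside strata
    samples = (np.arange(n_samples)[:, None] + u) / n_samples
    for d in range(n_dims):                             # decouple dimensions
        samples[:, d] = samples[rng.permutation(n_samples), d]
    return samples

rng = np.random.default_rng(42)

# Hypothetical stand-in: 100 realisations of elastic moduli (GPa) for a
# beam with 5 segments, mapped to a uniform 9-15 GPa range per segment.
n_sim, n_seg = 100, 5
u = latin_hypercube(n_sim, n_seg, rng)
E = 9.0 + 6.0 * u
```

Each of the 100 rows would parameterize one FEM run, giving the ensemble of 100 deflections that is then evaluated statistically.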
Abstract: In this paper, Bayesian online inference for models of data series is constructed using a change-point algorithm, which separates the observed time series into independent segments and studies the change and variation of the regime of the data together with related statistical characteristics. Variation in the statistical characteristics of time series data often represents distinct phenomena in the same dynamical system, such as a change in brain state reflected in measured EEG signal data, or a change in an important regime of the data in many dynamical systems. In this paper, a prediction algorithm for locating change points in time series data is simulated. It is verified that the assumed distribution of the data is an important factor for a simpler and smoother fluctuation of the hazard rate parameter, and for better identification of change-point locations. Finally, the conditions under which the time series distribution affects the factors in this approach are explained and validated with different time series databases for some dynamical systems.
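A hedged sketch of Bayesian online change-point detection in this spirit follows, using a constant hazard rate and a Gaussian observation model of known variance; it is a generic run-length filter in the Adams-MacKay style, not the paper's specific algorithm, and the data series is synthetic.

```python
import numpy as np

def bocpd_gaussian(data, hazard=1/50, mu0=0.0, kappa0=1.0, sigma2=1.0):
    """Online run-length filtering: R[t, r] is the posterior probability
    that the current regime at time t has lasted r observations, with a
    constant hazard rate governing change points."""
    T = len(data)
    R = np.zeros((T + 1, T + 1))
    R[0, 0] = 1.0
    mu = np.array([mu0])         # posterior mean of the regime mean
    kappa = np.array([kappa0])   # pseudo-observation counts, per run length
    for t, x in enumerate(data):
        # predictive density of x under each current run-length hypothesis
        var = sigma2 * (1 + 1 / kappa)
        pred = np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        growth = R[t, : t + 1] * pred * (1 - hazard)   # regime continues
        cp = (R[t, : t + 1] * pred * hazard).sum()     # regime resets
        R[t + 1, 1 : t + 2] = growth
        R[t + 1, 0] = cp
        R[t + 1] /= R[t + 1].sum()
        # update sufficient statistics for every run length
        mu = np.concatenate(([mu0], (kappa * mu + x) / (kappa + 1)))
        kappa = np.concatenate(([kappa0], kappa + 1))
    return R

rng = np.random.default_rng(3)
series = np.concatenate([rng.normal(0, 1, 60), rng.normal(4, 1, 60)])
R = bocpd_gaussian(series)
map_run = R.argmax(axis=1)   # MAP run length at each step
```

A collapse of the MAP run length back toward zero marks a detected change point, and the choice of hazard rate and observation distribution is exactly the factor whose influence the abstract discusses.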
Abstract: In this paper, we present a new maintenance model
for a partially observable system subject to two failure modes,
namely a catastrophic failure and a failure due to the system
degradation. The system is subject to condition monitoring and the
degradation process is described by a hidden Markov model. A
cost-optimal Bayesian control policy is developed for maintaining
the system. The control problem is formulated in the semi-Markov
decision process framework. An effective computational algorithm is developed and illustrated by a numerical example.
Abstract: This paper proposes a linear mixed model (LMM) with spatial effects to forecast rice and cassava yields in Thailand simultaneously. A multivariate conditional autoregressive (MCAR) model is assumed to represent the spatial effects. A Bayesian method is used for parameter estimation via Gibbs sampling Markov Chain Monte Carlo (MCMC). The model is applied to monthly rice and cassava yield data extracted from the Office of Agricultural Economics, Ministry of Agriculture and Cooperatives of Thailand. The results show that the proposed model performs better in most provinces, in both the fitting and validation parts, compared to the simple exponential smoothing and conditional autoregressive (CAR) models from our previous study.
Abstract: A forecasting model for steel demand uncertainty in Thailand is proposed. It consists of trend, autocorrelation, and outlier components in a hierarchical Bayesian framework. The proposed model uses a cumulative Weibull distribution function, latent first-order autocorrelation, and binary selection to account for trend, time-varying autocorrelation, and outliers, respectively. Gibbs sampling Markov Chain Monte Carlo (MCMC) is used for parameter estimation. The proposed model is applied to steel demand index data in Thailand. The root mean square error (RMSE), mean absolute percentage error (MAPE), and mean absolute error (MAE) criteria are used for model comparison. The study reveals that the proposed model is more appropriate than the exponential smoothing method.
Abstract: This paper proposes a GLMM with spatial and
temporal effects for malaria data in Thailand. A Bayesian method is
used for parameter estimation via Gibbs sampling MCMC. A
conditional autoregressive (CAR) model is assumed to represent the spatial effects. The temporal correlation is represented through the covariance matrix of the random effects. The quarterly malaria data
have been extracted from the Bureau of Epidemiology, Ministry of
Public Health of Thailand. The factors considered are rainfall and
temperature. The result shows that rainfall and temperature are
positively related to the malaria morbidity rate. The posterior means
of the estimated morbidity rates are used to construct the malaria
maps. The top 5 highest morbidity rates (per 100,000 population) are
in Trat (Q3, 111.70), Chiang Mai (Q3, 104.70), Narathiwat (Q4,
97.69), Chiang Mai (Q2, 88.51), and Chanthaburi (Q3, 86.82).
According to the DIC criterion, the proposed model has a better
performance than the GLMM with spatial effects but without
temporal terms.
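The Gibbs-sampling MCMC estimation used in this and the two preceding abstracts can be illustrated on the simplest conjugate case. The normal model and the synthetic data below are invented and far simpler than a GLMM with CAR effects; only the alternation between full conditionals is the mechanism itself.

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical stand-in data, e.g. log-scale rates for one location.
data = rng.normal(loc=2.0, scale=0.5, size=200)

def gibbs_normal(y, n_iter=5000, burn=1000):
    """Gibbs sampler for a Normal(mu, sigma^2) likelihood with a flat
    prior on mu and a Jeffreys-type prior on sigma^2: alternate draws
    from the two full conditional distributions."""
    n = len(y)
    sig2 = 1.0
    draws = []
    for it in range(n_iter):
        # full conditional of mu given sig2: Normal(ybar, sig2 / n)
        mu = rng.normal(y.mean(), np.sqrt(sig2 / n))
        # full conditional of sig2 given mu: scaled inverse chi-square
        ssq = ((y - mu) ** 2).sum()
        sig2 = ssq / rng.chisquare(n)
        if it >= burn:
            draws.append((mu, sig2))
    return np.array(draws)

draws = gibbs_normal(data)
mu_hat, sig2_hat = draws.mean(axis=0)   # posterior means
```

In the papers above the same alternation runs over many more blocks of parameters (fixed effects, spatial CAR effects, covariance terms), but each sweep is still a sequence of draws from full conditionals.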
Abstract: This work addresses on-line Arabic character recognition, and the principal motivation is to study Arabic manuscripts with on-line technology.
The system is a Markovian system, which can be viewed as a Dynamic Bayesian Network (DBN). One of the major interests of these systems resides in training complete models (topology and parameters) from training data.
Our approach is based on the dynamic Bayesian network formalism. DBN theory is a generalization of Bayesian networks to dynamic processes. Among our objectives is finding better parameters to represent the links (dependences) between the dynamic network variables.
In pattern recognition applications, the structure is fixed in advance, which obliges us to admit some strong assumptions (for example, independence between some variables). Our application concerns the on-line recognition of isolated Arabic characters using our laboratory database, NOUN. A neural tester is proposed for external optimization of the DBN.
The DBN scores and the mixed DBN achieve 70.24% and 62.50%, respectively, which suggests room for further development; other approaches taking time into account were considered and implemented until a significant recognition rate of 94.79% was obtained.
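As a minimal stand-in for inference in the Markovian systems described above, the following sketch implements the scaled forward pass of a two-state discrete HMM, the simplest model expressible as a DBN. All probabilities are invented for the sketch and are unrelated to the NOUN database.

```python
import numpy as np

# Invented two-state HMM parameters.
A = np.array([[0.7, 0.3],      # state transition probabilities
              [0.4, 0.6]])
B = np.array([[0.9, 0.1],      # per-state observation probabilities
              [0.2, 0.8]])
pi = np.array([0.5, 0.5])      # initial state distribution

def forward_loglik(obs):
    """Log-likelihood of an observation sequence via the scaled forward
    pass, the inference building block of HMM/DBN parameter training."""
    alpha = pi * B[:, obs[0]]
    c = alpha.sum()
    loglik = np.log(c)
    alpha = alpha / c
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
        c = alpha.sum()            # rescale to avoid underflow,
        loglik += np.log(c)        # accumulating the log of the norm
        alpha = alpha / c
    return loglik
```

In a recognizer, one such model per character class would be trained, and a pen trace would be assigned to the class whose model gives the highest likelihood.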
Abstract: A comprehensive Bayesian analysis has been carried out in the context of informative and non-informative priors for the shape parameter of the Burr type X distribution under different symmetric and asymmetric loss functions. Elicitation of the hyperparameter through the prior predictive approach is also discussed. We also derive expressions for the posterior predictive distributions, predictive intervals, and credible intervals. As an illustration, comparisons of these estimators are made through a simulation study.
Abstract: This paper discusses the effects of using progressive Type-I right censoring on the design of simple step-stress accelerated life testing, using a Bayesian approach for Weibull life products under the assumption of a cumulative exposure model. The optimization criterion used in this paper is to minimize the expected pre-posterior variance of the Pth percentile time to failure. The model variables are the stress changing time and the stress value for the first step. A comparison between conventional and progressive Type-I right censoring is provided. The results show that progressive Type-I right censoring reduces the cost of testing at the expense of test precision when the sample size is small. Moreover, the results show that using strong priors or a large sample size reduces the sensitivity of the test precision to the censoring proportion. Hence, progressive Type-I right censoring is recommended in these cases, as it reduces the cost of the test without greatly affecting its precision. Moreover, the results show that using direct or indirect priors affects the precision of the test.
Abstract: Sensor-based activity recognition systems usually account only for which sensors have been activated to perform an activity. The system then combines the conditional probabilities of those sensors to represent different activities and makes its decision on that basis. However, information about the sensors that are not activated may also be of great help in deciding which activity has been performed. This paper proposes an approach in which sensory data related to both the usage and non-usage of objects are utilized to classify activities. Experimental results also show the promising performance of the proposed method.
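The idea that silent sensors also carry evidence can be sketched with a Bernoulli naive Bayes classifier, in which every sensor contributes either a p (activated) or a (1 - p) (not activated) factor. The sensors, activities, and activation data below are invented and this is a generic illustration, not the paper's exact method.

```python
import math

# Hypothetical binary sensor activations (1 = used, 0 = not used) for
# two activities; sensor order: kettle, cup, stove, pan.
train = [
    ([1, 1, 0, 0], "make_tea"),
    ([1, 1, 0, 0], "make_tea"),
    ([0, 1, 1, 1], "cook_pasta"),
    ([0, 0, 1, 1], "cook_pasta"),
]

def fit_bernoulli_nb(samples, n_sensors, alpha=1.0):
    """Per-activity activation probabilities with Laplace smoothing."""
    labels = {lab for _, lab in samples}
    theta, prior = {}, {}
    for lab in labels:
        rows = [s for s, l in samples if l == lab]
        prior[lab] = len(rows) / len(samples)
        theta[lab] = [
            (sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
            for j in range(n_sensors)
        ]
    return prior, theta

def classify(obs, prior, theta):
    """Non-activated sensors contribute the (1 - p) factors, so silence
    is evidence too."""
    best, best_score = None, -math.inf
    for lab in prior:
        score = math.log(prior[lab])
        for j, v in enumerate(obs):
            p = theta[lab][j]
            score += math.log(p if v else 1 - p)
        if score > best_score:
            best, best_score = lab, score
    return best

prior, theta = fit_bernoulli_nb(train, 4)
pred = classify([1, 0, 0, 0], prior, theta)   # only the kettle fired
```

Here the unfired stove and pan actively lower the "cook_pasta" score rather than being ignored, which is the point the abstract makes.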
Abstract: This paper proposes a data-driven, biology-inspired neural segmentation method for 3D Drosophila Brainbow images. We use the Bayesian Sequential Partitioning algorithm for probabilistic modeling, which can be used to detect somas and to eliminate crosstalk effects. This work attempts to develop an automatic methodology for neuron image segmentation, which still lacks a complete solution due to the complexity of the images. The proposed method does not need any predetermined, risk-prone thresholds, since biological information is inherently included in the image processing procedure. Therefore, it is less sensitive to variations in neuron morphology; meanwhile, its flexibility is beneficial for tracing the intertwining structure of neurons.
Abstract: This article is concerned with the analysis of the failure rate (shape parameter) of the Topp-Leone distribution in a Bayesian framework. Different loss functions and a couple of noninformative priors have been assumed for posterior estimation. The posterior predictive distributions have also been derived. A simulation study has been carried out to compare the performance of the different estimators. A real-life example is used to illustrate the applicability of the results obtained. The findings of the study suggest that the precautionary loss function based on the Jeffreys prior and singly type-II censored samples can effectively be employed to obtain the Bayes estimate of the failure rate under the Topp-Leone distribution.
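The dependence of the Bayes estimate on the loss function can be sketched for the complete-sample case. Under a Jeffreys-type prior p(v) proportional to 1/v, the posterior of the Topp-Leone shape parameter given x_1..x_n is Gamma(n, T) with T = -sum log(2x_i - x_i^2); the squared-error Bayes estimate is the posterior mean, while the precautionary-loss estimate is sqrt(E[v^2]). The sample size and true parameter below are invented, and censoring is not modelled.

```python
import numpy as np

rng = np.random.default_rng(11)

# Simulate a complete Topp-Leone sample via inverse-CDF sampling,
# using F(x) = (2x - x^2)^v on (0, 1).
true_v = 1.5
n = 50
u = rng.uniform(size=n)
x = 1.0 - np.sqrt(1.0 - u ** (1.0 / true_v))

# Posterior under the Jeffreys-type prior: Gamma(n, T).
T = -np.log(2 * x - x ** 2).sum()
post = rng.gamma(shape=n, scale=1.0 / T, size=200_000)  # posterior draws

# Bayes estimators under two loss functions:
est_self = post.mean()                  # squared-error loss -> posterior mean
est_prec = np.sqrt((post ** 2).mean())  # precautionary loss -> sqrt(E[v^2])
```

The precautionary estimate always exceeds the posterior mean, reflecting the asymmetric penalty that loss function places on underestimating the failure rate.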
Abstract: Keys to high-quality face-to-face education are ensuring flexibility in the way lectures are given and providing care and responsiveness to learners. This paper describes a face-to-face education support system that is designed to raise the satisfaction of learners and reduce the workload on instructors. This system consists of a lecture adaptation assistance part, which assists instructors in adapting teaching content and strategy, and a Q&A assistance part, which provides learners with answers to their questions. The core component of the former part is a "learning achievement map", which is composed of a Bayesian network (BN). From learners' performance in exercises on relevant past lectures, the lecture adaptation assistance part obtains the information required to appropriately adapt the presentation of the next lecture. The core component of the Q&A assistance part is a case base, which accumulates cases consisting of questions expected from learners and answers to them. The Q&A assistance part is a case-based search system equipped with a search index that performs probabilistic inference. A prototype face-to-face education support system has been built, intended for the teaching of Java programming, and the approach was evaluated using this system. The expected degree of understanding of each learner for a future lecture was derived from his or her performance in exercises on past lectures, and this expected degree of understanding was used to select one of three adaptation levels. A model for determining the adaptation level most suitable for the individual learner has been identified. An experimental case base was built to examine the search performance of the Q&A assistance part, and it was found that the rate of successfully finding an appropriate case was 56%.
Abstract: In pattern recognition applications, low-level segmentation and high-level object recognition are generally considered as two separate steps. This paper presents a method that bridges the gap between low-level segmentation and high-level object recognition. It is based on a Bayesian network representation and a network propagation algorithm. At the low level, it uses a hierarchical structure of quadratic spline wavelet image bases. The method is demonstrated on a simple circuit diagram component identification problem.