Abstract: The objective of our work is to develop a new approach for discovering knowledge from a large mass of data. The result of applying this approach will be an expert system that serves as a diagnostic tool for a phenomenon related to a huge information system. We first recall the general problem of learning Bayesian network structure from data and suggest a solution for reducing its complexity by using methods for organizing and optimizing the data. We then propose a new heuristic for learning Multi-Entities Bayesian Network structures. We have applied our approach to biological facts concerning complex hereditary illnesses, for which the biology literature identifies the variables responsible for those diseases. Finally, we conclude with the limits reached by this work.
Abstract: In this paper we present a novel approach for human
body configuration based on the silhouette. We propose to address
this problem under the Bayesian framework. We use an effective
model-based MCMC (Markov Chain Monte Carlo) method to solve
the configuration problem, in which the best configuration is
defined as the MAP (maximum a posteriori probability) estimate in
the Bayesian model. This model-based MCMC uses the human body model
to drive the MCMC sampling from the solution space. It converts the
original high-dimensional space into a restricted sub-space constructed
by the human model and uses a hybrid sampling algorithm. We
choose an explicit human model and carefully select the likelihood
functions to represent the best configuration solution. The
experiments show that this method obtains an accurate
configuration and saves time for different humans from multiple views.
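The idea of using MCMC sampling to search for a MAP configuration can be illustrated with a minimal sketch. This is not the authors' model-based sampler: it assumes a toy one-dimensional posterior (the function `log_posterior` and all of its constants are hypothetical) and a plain random-walk Metropolis step, and it simply keeps the highest-posterior sample seen so far.

```python
import math
import random


def log_posterior(theta):
    # Toy unnormalized log-posterior (an illustrative assumption, not the
    # paper's model): Gaussian likelihood centred at 2.0 with standard
    # deviation 0.5, and a broad Gaussian prior centred at 0.0.
    log_likelihood = -0.5 * ((theta - 2.0) / 0.5) ** 2
    log_prior = -0.5 * (theta / 10.0) ** 2
    return log_likelihood + log_prior


def metropolis_map(log_post, start, steps=5000, step_size=0.5, seed=0):
    """Random-walk Metropolis that returns the best sample visited."""
    rng = random.Random(seed)
    current, current_lp = start, log_post(start)
    best, best_lp = current, current_lp
    for _ in range(steps):
        proposal = current + rng.gauss(0.0, step_size)
        proposal_lp = log_post(proposal)
        # Accept with probability min(1, p(proposal) / p(current)).
        # 1.0 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
            if current_lp > best_lp:
                best, best_lp = current, current_lp
    return best


map_estimate = metropolis_map(log_posterior, start=-5.0)
```

For this toy posterior the mode can be computed analytically (about 1.995), which makes it easy to check that the sampler ends up close to it. In a real body-configuration problem the state would be a high-dimensional pose vector and the proposal would be driven by the body model.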
Abstract: Recent years have seen a growing trend towards the
integration of multiple information sources to support large-scale
prediction of protein-protein interaction (PPI) networks in model
organisms. Despite advances in computational approaches, the
combination of multiple "omic" datasets representing the same type
of data, e.g. different gene expression datasets, has not been
rigorously studied. Furthermore, there is a need to further investigate
the inference capability of powerful approaches, such as fully connected
Bayesian networks, in the context of the prediction of PPI
networks. This paper addresses these limitations by proposing a
Bayesian approach to integrate multiple datasets, some of which
encode the same type of "omic" data, to support the identification of
PPI networks. The case study reported involved the combination of
three gene expression datasets relevant to human heart failure (HF).
In comparison with two traditional methods, Naive Bayesian and
maximum likelihood ratio approaches, the proposed technique can
accurately identify known PPI and can be applied to infer potentially
novel interactions.
Abstract: Image-based 3D scenes can now be found in many popular vision systems, computer games and virtual reality tours. It is therefore important to segment the ROI (region of interest) from input scenes as a preprocessing step for geometric structure detection in a 3D scene. In this paper, we propose a method for segmenting the ROI based on tensor voting and a Dirichlet process mixture model. In particular, to estimate geometric structure information for a 3D scene from a single outdoor image, we apply tensor voting and the Dirichlet process mixture model to image segmentation. Tensor voting is used based on the fact that homogeneous regions in an image are usually close together on a smooth region, and therefore the tokens corresponding to the centers of these regions have high saliency values. The proposed approach is a novel nonparametric Bayesian segmentation method that uses a Gaussian Dirichlet process mixture model to automatically segment various natural scenes. Finally, our method can label regions of the input image into coarse categories: "ground", "sky", and "vertical" for 3D applications. The experimental results show that our method successfully segments coarse regions in many complex natural scene images for 3D.
Abstract: Skin color is an important visual cue for computer
vision systems involving human users. In this paper we combine skin
color and optical flow for detection and tracking of skin regions. We
apply these techniques to gesture recognition with encouraging
results. We propose a novel skin similarity measure. For grouping
detected skin regions we propose a novel skin region grouping
mechanism. The proposed techniques work with any number of skin
regions making them suitable for a multiuser scenario.
Abstract: In this paper we investigate the influence of external
noise on the inference of network structures. The purpose of our
simulations is to gain insights into the experimental design of
microarray experiments used to infer, e.g., transcription regulatory
networks. Here, external noise means that the
dynamics of the system under investigation, e.g., temporal changes of
mRNA concentration, is affected by measurement errors. In addition
to external noise, another problem occurs in the context of microarray
experiments. In practice, it is not possible to monitor the mRNA
concentration over an arbitrarily long time period, as demanded by the
statistical methods used to learn the underlying network structure. For
this reason, we use only short time series to make our simulations
more biologically plausible.
Abstract: The sanitary sewerage connection rate has become an
important indicator of advanced cities. Following the construction of
sanitary sewerage systems, maintenance and management systems are
required to keep pipelines and facilities functioning well. These
maintenance tasks often require sewer workers to enter the manholes
and the pipelines, which are confined spaces short of natural
ventilation and full of hazardous substances. Workers in sewers can
easily be exposed to a risk of adverse health effects. This paper
proposes the use of Bayesian belief networks (BBN) as a higher-level
approach to noncarcinogenic health risk assessment of sewer workers. On the
basis of the epidemiological studies, the actual hospital attendance
records and expert experiences, the BBN is capable of capturing the
probabilistic relationships between the hazardous substances in sewers
and their adverse health effects, and accordingly inferring the
morbidity and mortality of the adverse health effects. The provision of
the morbidity and mortality rates of the related diseases is more
informative and can alleviate the drawbacks of conventional methods.
Abstract: In this paper, a new learning approach for network
intrusion detection using naïve Bayesian classifier and ID3 algorithm
is presented, which identifies effective attributes from the training
dataset, calculates the conditional probabilities for the best attribute
values, and then correctly classifies all the examples of training and
testing dataset. Most of the current intrusion detection datasets are
dynamic, complex and contain a large number of attributes. Some of
the attributes may be redundant or contribute little to detection.
It has been shown that significant attribute
selection is important for designing real-world intrusion detection
systems (IDS). The purpose of this study is to identify effective
attributes from the training dataset to build a classifier for network
intrusion detection using data mining algorithms. The experimental
results on the KDD99 benchmark intrusion detection dataset demonstrate
that this new approach achieves high classification rates and reduces
false positives using limited computational resources.
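The attribute-selection step can be sketched with the information-gain criterion that ID3 uses to rank attributes. This is a minimal illustration, not the paper's procedure: the records, the attribute names `protocol` and `flag`, and the labels are toy values invented for the example, and the naïve Bayesian classification stage is omitted.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(rows, labels, attr):
    """Reduction in label entropy obtained by splitting on one attribute."""
    total = len(rows)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def rank_attributes(rows, labels):
    """Attributes sorted from most to least informative."""
    return sorted(rows[0].keys(),
                  key=lambda a: information_gain(rows, labels, a),
                  reverse=True)


# Toy connection records (hypothetical values, for illustration only).
rows = [
    {"protocol": "tcp", "flag": "SF"},
    {"protocol": "tcp", "flag": "S0"},
    {"protocol": "udp", "flag": "SF"},
    {"protocol": "udp", "flag": "S0"},
]
labels = ["normal", "attack", "normal", "attack"]

ranking = rank_attributes(rows, labels)
```

In this toy data `flag` separates the two classes perfectly while `protocol` carries no information, so a classifier built only on the top-ranked attribute would lose nothing. On a real dataset one would keep the attributes whose gain exceeds some threshold chosen by validation.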
Abstract: This research investigates risk factors for defective products in auto parts factories. Under a Bayesian framework, a generalized linear mixed model (GLMM) is adopted in which the dependent variable, the number of defective products, has a Poisson distribution. Its performance is compared with that of the Poisson GLM under a Bayesian framework. The factors considered are production process, machine, and worker. The products coded RT50 are observed. The study found that the Poisson GLMM is more appropriate than the Poisson GLM. For the Process factor, the highest risk of producing defective products is associated with Process 1; for the Machine factor, with Machine 5; and for the Worker factor, with Worker 6.
Abstract: In this contribution, a newly developed e-learning environment is presented, which incorporates Intelligent Agents and Computational Intelligence Techniques. The new e-learning environment consists of three parts: the E-learning platform Front-End, the Student Questioner Reasoning and the Student Model Agent. These parts are distributed across geographically dispersed computer servers, with the main focus on the design and development of these subsystems through the use of new and emerging technologies. The parts are interconnected in an interoperable way, using web services for the integration of the subsystems, in order to enhance the user modelling procedure and achieve the goals of the learning process.
Abstract: The problem of spam has seriously troubled the Internet community during the last few years and has now reached an alarming scale. Observations made at CERN (European Organization for Nuclear Research, located in Geneva, Switzerland) show that spam mails can constitute up to 75% of daily SMTP traffic. A naïve Bayesian classifier (NBC) based on a Bag of Words representation of an email is widely used to stop this unwanted flood, as it combines good performance with simplicity of the training and classification processes. However, given the constantly changing patterns of spam, it is necessary to ensure online adaptability of the classifier. This work proposes combining such a classifier with another NBC based on pairs of adjacent words. Only the latter is retrained with examples of spam reported by users. Tests are performed on considerable sets of mails from both public spam archives and CERN mailboxes. They suggest that this architecture can increase spam recall without affecting the classifier precision, as happens when only the NBC based on single words is retrained.
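The two-classifier architecture described above can be sketched as follows. This is a minimal illustration under stated assumptions: the abstract does not say how the two classifiers are combined, so the sketch simply adds their log-odds; the class `NaiveBayes`, the helper functions, the Laplace smoothing and the training messages are all hypothetical.

```python
import math
from collections import Counter


class NaiveBayes:
    """Tiny multinomial naive Bayes with Laplace smoothing."""

    def __init__(self, featurize):
        self.featurize = featurize
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.docs = {"spam": 0, "ham": 0}

    def train(self, text, label):
        self.counts[label].update(self.featurize(text))
        self.docs[label] += 1

    def log_odds(self, text):
        """log P(spam | text) - log P(ham | text), up to smoothing."""
        vocab = len(set(self.counts["spam"]) | set(self.counts["ham"])) + 1
        n_spam = sum(self.counts["spam"].values())
        n_ham = sum(self.counts["ham"].values())
        score = math.log((self.docs["spam"] + 1) / (self.docs["ham"] + 1))
        for feat in self.featurize(text):
            p_spam = (self.counts["spam"][feat] + 1) / (n_spam + vocab)
            p_ham = (self.counts["ham"][feat] + 1) / (n_ham + vocab)
            score += math.log(p_spam / p_ham)
        return score


def words(text):
    return text.lower().split()


def word_pairs(text):
    tokens = words(text)
    return list(zip(tokens, tokens[1:]))  # pairs of adjacent words


def is_spam(text, word_nbc, pair_nbc):
    # Assumed combination rule: sum the log-odds of both classifiers.
    return word_nbc.log_odds(text) + pair_nbc.log_odds(text) > 0


word_nbc = NaiveBayes(words)
pair_nbc = NaiveBayes(word_pairs)
for nbc in (word_nbc, pair_nbc):
    nbc.train("buy cheap pills now", "spam")
    nbc.train("meeting agenda for monday", "ham")

# A user reports a new spam message: only the word-pair NBC is retrained.
pair_nbc.train("cheap pills online", "spam")
```

With this setup, `is_spam("cheap pills today", word_nbc, pair_nbc)` is true and `is_spam("meeting agenda today", word_nbc, pair_nbc)` is false, while the single-word classifier is left exactly as it was after initial training.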
Abstract: In this paper we address the problem of musical style
classification, which has a number of applications like indexing in
musical databases or automatic composition systems. Starting from
MIDI files of real-world improvisations, we extract the melody track
and cut it into overlapping segments of equal length. From these
fragments, some numerical features are extracted as descriptors of
style samples. We show that a standard Bayesian classifier can be
conveniently employed to build an effective musical style classifier,
once this set of features has been extracted from musical data.
Preliminary experimental results show the effectiveness of the
developed classifier that represents the first component of a musical
audio retrieval system.
Abstract: With the development of the Internet, E-commerce is
growing at an exponential rate, and many online stores are being set up
to sell their goods online. A major factor influencing the successful
adoption of E-commerce is consumers' trust. For new or unknown
Internet businesses, consumers' lack of trust has been cited as a major
barrier to their proliferation. As web sites provide the key interface for
consumer use of E-commerce, we investigate the design of web sites to
build trust in E-commerce from a design science approach. A
conceptual model is proposed in this paper to describe the ontology of
online transactions and human-computer interaction. Based on this
conceptual model, we provide a personalized webpage design
approach using a Bayesian network learning method. Experimental
evaluations are designed to show the effectiveness of web
personalization in improving consumers' trust in new or unknown
online stores.
Abstract: This paper presents a Faults Forecasting System (FFS)
that uses statistical forecasting techniques to analyze process
variable data in order to forecast fault occurrences. FFS
proposes a new idea in fault detection. Current techniques used in
fault detection are based on analyzing the current status of the
system variables in order to check whether the current status is faulty.
FFS uses forecasting techniques to predict the future timing of faults
before they happen. The proposed model applies a subset modeling
strategy and a Bayesian approach in order to decrease the dimensionality
of the process variables and improve fault forecasting accuracy. A
practical experiment was designed and implemented at Okayama
University, Japan, and the comparison shows that
our proposed model achieves high forecasting accuracy and
forecasts faults ahead of time.
Abstract: In pattern recognition applications, low-level
segmentation and high-level object recognition are generally
considered as two separate steps. The paper presents a method that
bridges the gap between low-level segmentation and high-level object
recognition. It is based on a Bayesian network representation and a
network propagation algorithm. At the low level it uses a hierarchical
structure of quadratic spline wavelet image bases. The method is
demonstrated for a simple circuit diagram component identification
problem.
Abstract: Trust is essential for further and wider acceptance of
contemporary e-services. It was first addressed almost thirty years
ago in the Trusted Computer System Evaluation Criteria standard by
the US DoD. But this and other approaches proposed in that
period were actually addressing security. Roughly ten years ago,
methodologies followed that addressed the trust phenomenon at its core,
and they were based on Bayesian statistics and its derivatives, while
some approaches were based on game theory. However, trust is a
manifestation of judgment and reasoning processes. It has to be dealt
with in accordance with this fact and adequately supported in the cyber
environment. On the basis of results in the field of psychology
and our own findings, a methodology called qualitative algebra has
been developed, which deals with so far overlooked elements of the trust
phenomenon. It complements existing methodologies and provides a
basis for a practical technical solution that supports management of
trust in contemporary computing environments. Such a solution is also
presented at the end of this paper.
Abstract: In large Internet backbones, Service Providers
typically have to explicitly manage the traffic flows in order to
optimize the use of network resources. This process is often referred
to as Traffic Engineering (TE). Common objectives of traffic
engineering include balancing traffic distribution across the network
and avoiding congestion hot spots. Raj P H and SVK Raja designed
the Bayesian network approach to identify congestion hot spots in
MPLS. In this approach, the Conditional Probability Distribution (CPD)
is specified for every node in the network. Based on
the CPD, the congestion hot spots are identified. Then the traffic can
be distributed so that no link in the network is either overutilized or
underutilized. Although the Bayesian network approach has been
implemented in operational networks, it has a number of well-known
scaling issues.
This paper proposes a new approach, which we call the Pragati
(meaning "Progress") Node Popularity (PNP) approach, to identify the
congestion hot spots from the network topology alone. In the new
Pragati Node Popularity approach, IP routing runs natively over the
physical topology rather than depending on the CPD of each node as
in the Bayesian network approach. We first illustrate our approach with
a simple network, then present a formal analysis of the Pragati Node
Popularity approach. We show that, for any network handled by the
Bayesian approach, the PNP approach identifies exactly the same result
with minimal effort. We further extend the result to a more
general one that holds for any network topology, even if the network
is loopy. A theoretical insight from our result is that the optimal routing
is always shortest-path routing with respect to some considerations of
hot spots in the network.
Abstract: One of the difficulties of vibration-based damage identification methods is the nonuniqueness of the damage identification results. Different damage locations and severities may cause an identical response signal, a problem that is even more severe when detecting multiple damage sites. This paper proposes a new damage detection strategy to avoid this nonuniqueness. The strategy first determines the approximate damage area based on a statistical pattern recognition method using the dynamic strain signal measured by a distributed fiber Bragg grating, and then accurately evaluates the damage information based on a Bayesian model updating method using the experimental modal data. A stochastic simulation method is then used to compute the high-dimensional integral in the Bayesian problem. Finally, an experiment on a plate structure, simulating one part of a mechanical structure, is used to verify the effectiveness of this approach.
Abstract: The use of a Bayesian Hierarchical Model (BHM) to interpret breath measurements obtained during a 13C Octanoic Breath Test (13COBT) is demonstrated. The statistical analysis was implemented using WinBUGS, a commercially available computer package for Bayesian inference. A hierarchical setting was adopted in which poorly defined parameters associated with delayed Gastric Emptying (GE) were able to "borrow" strength from global distributions. This proved sufficient to correct the model failures and data inconsistencies apparent in conventional analyses employing a Non-linear Least Squares (NLS) technique. Direct comparison of two parameters describing gastric emptying (t_lag, the lag phase, and t_1/2, the half-emptying time) revealed a strong correlation between the two methods. Despite our large dataset (n = 164), Bayesian modeling was fast and provided a successful fit for all subjects. In contrast, NLS failed to return acceptable estimates in cases where GE was delayed.
Abstract: Understanding protein functions is a major goal in
the post-genomic era. Proteins usually work in the context of other
proteins and rarely function alone. Therefore, it is highly relevant to
study the interaction partners of a protein in order to understand its
function. Machine learning techniques have been widely applied to
predict protein-protein interactions. Kernel functions play an
important role in a successful machine learning technique. Choosing
the appropriate kernel function can lead to better accuracy in a
binary classifier such as the support vector machine (SVM). In this
paper, we describe a Bayesian kernel for the support vector machine to
predict protein-protein interactions. The use of a Bayesian kernel can
improve the classifier performance by incorporating the probabilistic
characteristics of the available experimental protein-protein
interaction data that were compiled from different sources. In
addition, the probabilistic output from the Bayesian kernel can help
biologists to conduct more research on the highly ranked predicted
interactions. The results show that the accuracy of the classifier is
improved using the Bayesian kernel compared to the standard
SVM kernels. These results imply that protein-protein interactions can
be predicted using a Bayesian kernel with better accuracy than with
the standard SVM kernels.