Data-organization Before Learning Multi-Entity Bayesian Networks Structure

The objective of our work is to develop a new approach for discovering knowledge from a large mass of data. The result of applying this approach will be an expert system that serves as a diagnostic tool for a phenomenon related to a huge information system. We first recall the general problem of learning Bayesian network structure from data and suggest a solution for reducing its complexity by using data organization and optimization methods. We then propose a new heuristic for learning the structure of Multi-Entity Bayesian Networks. We have applied our approach to biological facts concerning complex hereditary illnesses, for which the biology literature identifies the variables responsible for those diseases. Finally, we conclude with the limits reached by this work.

Human Body Configuration using Bayesian Model

In this paper we present a novel approach for human body configuration based on the silhouette. We address this problem within the Bayesian framework. We use an effective model-based MCMC (Markov Chain Monte Carlo) method to solve the configuration problem, in which the best configuration is defined as the MAP (maximum a posteriori) estimate in the Bayesian model. This model-based MCMC uses the human body model to drive the MCMC sampling from the solution space. It converts the original high-dimensional space into a restricted subspace constructed by the human model and uses a hybrid sampling algorithm. We choose an explicit human model and carefully select the likelihood functions to represent the best configuration solution. The experiments show that this method obtains accurate configurations in a time-saving manner for different people from multiple views.
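The idea of using MCMC samples to locate a MAP estimate can be illustrated with a minimal sketch. The code below is not the model-based sampler of the paper: it is a generic random-walk Metropolis sampler on a one-dimensional toy posterior, and the names metropolis_map and toy_log_posterior, the step size, and the sample count are all assumptions made for illustration.

```python
import math
import random


def metropolis_map(log_posterior, start, step, n_samples, seed=0):
    """Random-walk Metropolis sampler that remembers the best state visited.

    Returns the visited state with the highest log-posterior, which serves
    as a rough MAP estimate.
    """
    rng = random.Random(seed)
    current, current_lp = start, log_posterior(start)
    best, best_lp = current, current_lp
    for _ in range(n_samples):
        # Propose a Gaussian perturbation of the current state.
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_posterior(proposal)
        # Metropolis acceptance probability: min(1, p(proposal) / p(current)).
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
            if current_lp > best_lp:
                best, best_lp = current, current_lp
    return best


def toy_log_posterior(x):
    """Hypothetical unnormalized log-posterior with its mode at x = 3."""
    return -0.5 * (x - 3.0) ** 2


estimate = metropolis_map(toy_log_posterior, start=0.0, step=1.0, n_samples=5000)
print(f"approximate MAP: {estimate:.3f}")
```

In a pose-estimation setting the scalar state would be replaced by a vector of body-model parameters and the toy posterior by the silhouette likelihood times a prior; those pieces are not specified in the abstract and are left out here.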

An Integrative Bayesian Approach to Supporting the Prediction of Protein-Protein Interactions: A Case Study in Human Heart Failure

Recent years have seen a growing trend towards the integration of multiple information sources to support large-scale prediction of protein-protein interaction (PPI) networks in model organisms. Despite advances in computational approaches, the combination of multiple “omic” datasets representing the same type of data, e.g. different gene expression datasets, has not been rigorously studied. Furthermore, there is a need to further investigate the inference capability of powerful approaches, such as fully connected Bayesian networks, in the context of the prediction of PPI networks. This paper addresses these limitations by proposing a Bayesian approach to integrate multiple datasets, some of which encode the same type of “omic” data, to support the identification of PPI networks. The case study reported involved the combination of three gene expression datasets relevant to human heart failure (HF). In comparison with two traditional methods, the Naive Bayesian and maximum likelihood ratio approaches, the proposed technique can accurately identify known PPIs and can be applied to infer potentially novel interactions.

Region Segmentation based on Gaussian Dirichlet Process Mixture Model and its Application to 3D Geometric Stricture Detection

Image-based 3D scenes can now be found in many popular vision systems, computer games and virtual reality tours. It is therefore important to segment the ROI (region of interest) from input scenes as a preprocessing step for geometric structure detection in a 3D scene. In this paper, we propose a method for segmenting the ROI based on tensor voting and a Dirichlet process mixture model. In particular, to estimate geometric structure information for a 3D scene from a single outdoor image, we apply tensor voting and the Dirichlet process mixture model to image segmentation. Tensor voting is used based on the fact that homogeneous regions in an image are usually close together on a smooth region, and therefore the tokens corresponding to the centers of these regions have high saliency values. The proposed approach is a novel nonparametric Bayesian segmentation method that uses a Gaussian Dirichlet process mixture model to automatically segment various natural scenes. Finally, our method can label regions of the input image with coarse categories, “ground”, “sky”, and “vertical”, for 3D applications. The experimental results show that our method successfully segments coarse regions in many complex natural scene images for 3D.

Combining Skin Color and Optical Flow for Computer Vision Systems

Skin color is an important visual cue for computer vision systems involving human users. In this paper we combine skin color and optical flow for detection and tracking of skin regions. We apply these techniques to gesture recognition with encouraging results. We propose a novel skin similarity measure. For grouping detected skin regions we propose a novel skin region grouping mechanism. The proposed techniques work with any number of skin regions making them suitable for a multiuser scenario.

Influence of Noise on the Inference of Dynamic Bayesian Networks from Short Time Series

In this paper we investigate the influence of external noise on the inference of network structures. The purpose of our simulations is to gain insights into the design of microarray experiments used to infer, e.g., transcription regulatory networks. Here external noise means that the dynamics of the system under investigation, e.g., temporal changes of mRNA concentration, are affected by measurement errors. In addition to external noise, another problem occurs in the context of microarray experiments. In practice, it is not possible to monitor the mRNA concentration over an arbitrarily long time period, as demanded by the statistical methods used to learn the underlying network structure. For this reason, we use only short time series to make our simulations more biologically plausible.

Health Risk Assessment for Sewer Workers using Bayesian Belief Networks

The sanitary sewerage connection rate has become an important indicator of advanced cities. Following the construction of sanitary sewerage, maintenance and management systems are required to keep pipelines and facilities functioning well. These maintenance tasks often require sewer workers to enter manholes and pipelines, which are confined spaces short of natural ventilation and full of hazardous substances. People working in sewers can easily be exposed to a risk of adverse health effects. This paper proposes the use of Bayesian belief networks (BBN) for a higher-level noncarcinogenic health risk assessment of sewer workers. On the basis of epidemiological studies, actual hospital attendance records and expert experience, the BBN is capable of capturing the probabilistic relationships between the hazardous substances in sewers and their adverse health effects, and accordingly of inferring the morbidity and mortality of the adverse health effects. The provision of the morbidity and mortality rates of the related diseases is more informative and can alleviate the drawbacks of conventional methods.

Adaptive Network Intrusion Detection Learning: Attribute Selection and Classification

In this paper, a new learning approach for network intrusion detection using a naïve Bayesian classifier and the ID3 algorithm is presented, which identifies effective attributes from the training dataset, calculates the conditional probabilities for the best attribute values, and then correctly classifies all the examples of the training and testing datasets. Most current intrusion detection datasets are dynamic, complex and contain a large number of attributes. Some of the attributes may be redundant or contribute little to detection. It has been successfully tested that the selection of significant attributes is important for designing real-world intrusion detection systems (IDS). The purpose of this study is to identify effective attributes from the training dataset to build a classifier for network intrusion detection using data mining algorithms. The experimental results on the KDD99 benchmark intrusion detection dataset demonstrate that this new approach achieves high classification rates and reduces false positives using limited computational resources.
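ID3 ranks attributes by information gain, which is one way to identify the effective attributes mentioned above. The following is a minimal sketch of that ranking step only, not the authors' full pipeline; the record fields (protocol, flag, label) and the four toy rows are hypothetical and are not taken from KDD99.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy reduction obtained by splitting the rows on one attribute."""
    base = entropy([row[target] for row in rows])
    remainder = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [row[target] for row in rows if row[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


def rank_attributes(rows, attributes, target):
    """Attributes sorted from most to least informative, as ID3 would pick them."""
    return sorted(attributes,
                  key=lambda a: information_gain(rows, a, target),
                  reverse=True)


# Hypothetical toy connection records (illustration only).
rows = [
    {"protocol": "tcp", "flag": "SF", "label": "normal"},
    {"protocol": "tcp", "flag": "S0", "label": "attack"},
    {"protocol": "udp", "flag": "SF", "label": "normal"},
    {"protocol": "udp", "flag": "S0", "label": "attack"},
]
ranking = rank_attributes(rows, ["protocol", "flag"], "label")
print(ranking)
```

In this toy data the flag attribute separates the two classes completely while protocol carries no information, so flag is ranked first. Attributes with gain near zero are candidates for removal before the naïve Bayesian classifier is trained.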

Risk Factors for Defective Autoparts Products Using Bayesian Method in Poisson Generalized Linear Mixed Model

This research investigates risk factors for defective products in autoparts factories. Under a Bayesian framework, a generalized linear mixed model (GLMM) is adopted in which the dependent variable, the number of defective products, has a Poisson distribution. Its performance is compared with that of the Poisson GLM under a Bayesian framework. The factors considered are production process, machines, and workers. The products coded RT50 are observed. The study found that the Poisson GLMM is more appropriate than the Poisson GLM. For the Process factor, the highest risk of producing defective products is Process 1; for the Machine factor, the highest risk is Machine 5; and for the Worker factor, the highest risk is Worker 6.

Computational Intelligence Techniques and Agents' Technology in E-learning Environments

In this contribution a newly developed e-learning environment is presented, which incorporates Intelligent Agents and Computational Intelligence Techniques. The new e-learning environment consists of three parts: the E-learning platform Front-End, the Student Questioner Reasoning and the Student Model Agent. These parts are distributed across geographically dispersed computer servers, with the main focus on the design and development of these subsystems through the use of new and emerging technologies. The parts are interconnected in an interoperable way, using web services for the integration of the subsystems, in order to enhance the user modelling procedure and achieve the goals of the learning process.

Adaptive Naïve Bayesian Anti-Spam Engine

The problem of spam has seriously troubled the Internet community during the last few years and has now reached an alarming scale. Observations made at CERN (European Organization for Nuclear Research, located in Geneva, Switzerland) show that spam mails can constitute up to 75% of daily SMTP traffic. A naïve Bayesian classifier based on a bag-of-words representation of an email is widely used to stop this unwanted flood, as it combines good performance with simplicity of the training and classification processes. However, given the constantly changing patterns of spam, it is necessary to ensure online adaptability of the classifier. This work proposes combining such a classifier with another NBC (naïve Bayesian classifier) based on pairs of adjacent words. Only the latter is retrained with examples of spam reported by users. Tests are performed on considerable sets of mails from both public spam archives and CERN mailboxes. They suggest that this architecture can increase spam recall without affecting the classifier precision, as happens when only the NBC based on single words is retrained.
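The two-classifier architecture can be sketched as follows. This is an illustration under stated assumptions, not the engine described in the abstract: the NaiveBayes class, the whitespace tokenizer, the Laplace smoothing, and in particular the rule that combines the two classifiers by summing their log-odds are all hypothetical choices, since the abstract does not say how the outputs are combined.

```python
import math
from collections import Counter


def single_words(text):
    """Bag-of-words features."""
    return text.lower().split()


def adjacent_pairs(text):
    """Features made of pairs of adjacent words."""
    words = text.lower().split()
    return list(zip(words, words[1:]))


class NaiveBayes:
    """Two-class naive Bayes with Laplace smoothing over arbitrary features."""

    def __init__(self, featurize):
        self.featurize = featurize
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.docs = {"spam": 0, "ham": 0}

    def train(self, text, label):
        self.counts[label].update(self.featurize(text))
        self.docs[label] += 1

    def log_odds(self, text):
        """log P(spam | text) - log P(ham | text); positive means spam."""
        vocab = len(set(self.counts["spam"]) | set(self.counts["ham"])) or 1
        spam_total = sum(self.counts["spam"].values())
        ham_total = sum(self.counts["ham"].values())
        # Smoothed class priors (the shared denominator cancels out).
        score = math.log((self.docs["spam"] + 1) / (self.docs["ham"] + 1))
        for feature in self.featurize(text):
            p_spam = (self.counts["spam"][feature] + 1) / (spam_total + vocab)
            p_ham = (self.counts["ham"][feature] + 1) / (ham_total + vocab)
            score += math.log(p_spam / p_ham)
        return score


words_nbc = NaiveBayes(single_words)
pairs_nbc = NaiveBayes(adjacent_pairs)
for nbc in (words_nbc, pairs_nbc):
    nbc.train("cheap pills buy now", "spam")
    nbc.train("meeting agenda for monday", "ham")


def is_spam(text):
    # Assumed combination rule: add the log-odds of both classifiers.
    return words_nbc.log_odds(text) + pairs_nbc.log_odds(text) > 0


# Online adaptation: only the pair-based classifier sees user reports.
before = pairs_nbc.log_odds("free prize")
pairs_nbc.train("win a free prize", "spam")
after = pairs_nbc.log_odds("free prize")
print(is_spam("buy cheap pills"), before < after)
```

Keeping the single-word classifier frozen means user reports can only shift the decision through word pairs, which are more specific than single words; this is one plausible reading of why precision is preserved.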

Feature-Driven Classification of Musical Styles

In this paper we address the problem of musical style classification, which has a number of applications such as indexing in musical databases or automatic composition systems. Starting from MIDI files of real-world improvisations, we extract the melody track and cut it into overlapping segments of equal length. From these fragments, some numerical features are extracted as descriptors of style samples. We show that a standard Bayesian classifier can be conveniently employed to build an effective musical style classifier once this set of features has been extracted from musical data. Preliminary experimental results show the effectiveness of the developed classifier, which represents the first component of a musical audio retrieval system.

Web Personalization to Build Trust in E-Commerce: A Design Science Approach

With the development of the Internet, E-commerce is growing at an exponential rate, and many online stores have been set up to sell goods online. A major factor influencing the successful adoption of E-commerce is consumers' trust. For new or unknown Internet businesses, consumers' lack of trust has been cited as a major barrier to their proliferation. As web sites provide the key interface for consumer use of E-commerce, we investigate the design of web sites to build trust in E-commerce from a design science approach. A conceptual model is proposed in this paper to describe the ontology of online transactions and human-computer interaction. Based on this conceptual model, we provide a personalized webpage design approach using a Bayesian network learning method. Experimental evaluations are designed to show the effectiveness of web personalization in improving consumers' trust in new or unknown online stores.

Faults Forecasting System

This paper presents the Faults Forecasting System (FFS), which uses statistical forecasting techniques to analyze process variable data in order to forecast fault occurrences. FFS proposes a new idea in fault detection. Current fault detection techniques analyze the current status of the system variables in order to check whether that status is faulty or not. FFS instead uses forecasting techniques to predict the future timing of faults before they happen. The proposed model applies a subset modeling strategy and a Bayesian approach in order to decrease the dimensionality of the process variables and improve fault forecasting accuracy. A practical experiment was designed and implemented at Okayama University, Japan, and the comparison shows that our proposed model achieves high forecasting accuracy and forecasts faults ahead of time.

Integrating Low and High Level Object Recognition Steps

In pattern recognition applications, low-level segmentation and high-level object recognition are generally considered as two separate steps. The paper presents a method that bridges the gap between low-level and high-level object recognition. It is based on a Bayesian network representation and a network propagation algorithm. At the low level it uses a hierarchical structure of quadratic spline wavelet image bases. The method is demonstrated on a simple circuit diagram component identification problem.

Trust Management for Pervasive Computing Environments

Trust is essential for further and wider acceptance of contemporary e-services. It was first addressed almost thirty years ago in the Trusted Computer System Evaluation Criteria standard by the US DoD. But this and other approaches proposed in that period were actually addressing security. Roughly ten years ago, methodologies followed that addressed the trust phenomenon at its core; they were based on Bayesian statistics and its derivatives, while some approaches were based on game theory. However, trust is a manifestation of judgment and reasoning processes. It has to be dealt with in accordance with this fact and adequately supported in the cyber environment. On the basis of results in the field of psychology and our own findings, a methodology called qualitative algebra has been developed, which deals with so far overlooked elements of the trust phenomenon. It complements existing methodologies and provides a basis for a practical technical solution that supports management of trust in contemporary computing environments. Such a solution is also presented at the end of this paper.

Pragati Node Popularity (PNP) Approach to Identify Congestion Hot Spots in MPLS

In large Internet backbones, Service Providers typically have to explicitly manage the traffic flows in order to optimize the use of network resources. This process is often referred to as Traffic Engineering (TE). Common objectives of traffic engineering include balancing traffic distribution across the network and avoiding congestion hot spots. Raj P H and SVK Raja designed the Bayesian network approach to identify congestion hot spots in MPLS. In this approach, the Conditional Probability Distribution (CPD) is specified for every node in the network. Based on the CPD, the congestion hot spots are identified. The traffic can then be distributed so that no link in the network is either overutilized or underutilized. Although the Bayesian network approach has been implemented in operational networks, it has a number of well-known scaling issues. This paper proposes a new approach, which we call the Pragati (meaning Progress) Node Popularity (PNP) approach, to identify the congestion hot spots from the network topology alone. In the new Pragati Node Popularity approach, IP routing runs natively over the physical topology rather than depending on the CPD of each node as in the Bayesian network. We first illustrate our approach with a simple network, then present a formal analysis of the Pragati Node Popularity approach. We show that, for any network handled by the Bayesian approach, our PNP approach identifies exactly the same result with minimum effort. We further extend the result to a more generic one that holds for any network topology, even if the network is loopy. A theoretical insight from our result is that the optimal routing is always shortest-path routing with respect to some considerations of hot spots in the network.

A New Damage Identification Strategy for SHM Based On FBGs and Bayesian Model Updating Method

One of the difficulties of vibration-based damage identification methods is the nonuniqueness of the damage identification results. Different damage locations and severities may cause an identical response signal, a problem that is even more severe when detecting multiple damage sites. This paper proposes a new strategy for damage detection to avoid this nonuniqueness. The strategy first determines the approximate damage area based on a statistical pattern recognition method using the dynamic strain signal measured by distributed fiber Bragg gratings, and then accurately evaluates the damage information based on the Bayesian model updating method using the experimental modal data. The stochastic simulation method is then used to compute the high-dimensional integral in the Bayesian problem. Finally, an experiment on a plate structure, simulating one part of a mechanical structure, is used to verify the effectiveness of this approach.

A Bayesian Hierarchical 13COBT to Correct Estimates Associated with a Delayed Gastric Emptying

The use of a Bayesian Hierarchical Model (BHM) to interpret breath measurements obtained during a 13C Octanoic Breath Test (13COBT) is demonstrated. The statistical analysis was implemented using WinBUGS, a commercially available computer package for Bayesian inference. A hierarchical setting was adopted in which poorly defined parameters associated with a delayed Gastric Emptying (GE) were able to "borrow" strength from global distributions. This proved to be a sufficient tool to correct the model failures and data inconsistencies apparent in conventional analyses employing a Non-linear Least Squares (NLS) technique. Direct comparison of two parameters describing gastric emptying (t_lag, the lag phase, and t_1/2, the half-emptying time) revealed a strong correlation between the two methods. Despite our large dataset (n = 164), Bayesian modeling was fast and provided a successful fit for all subjects. In contrast, NLS failed to return acceptable estimates in cases where GE was delayed.

A Bayesian Kernel for the Prediction of Protein-Protein Interactions

Understanding protein function is a major goal in the post-genomic era. Proteins usually work in the context of other proteins and rarely function alone. Therefore, it is highly relevant to study the interaction partners of a protein in order to understand its function. Machine learning techniques have been widely applied to predict protein-protein interactions. Kernel functions play an important role in a successful machine learning technique. Choosing the appropriate kernel function can lead to better accuracy in a binary classifier such as the support vector machine (SVM). In this paper, we describe a Bayesian kernel for the support vector machine to predict protein-protein interactions. The use of the Bayesian kernel can improve the classifier performance by incorporating the probabilistic characteristics of the available experimental protein-protein interaction data, which were compiled from different sources. In addition, the probabilistic output from the Bayesian kernel can help biologists to conduct more research on the highly predicted interactions. The results show that the accuracy of the classifier is improved by using the Bayesian kernel compared to the standard SVM kernels. These results imply that protein-protein interactions can be predicted using the Bayesian kernel with better accuracy than with the standard SVM kernels.