Abstract: Data fusion technology can be the best way to extract
useful information from multiple sources of data. It has been widely
applied in various applications. This paper presents a data fusion
approach in multimedia data for event detection in twitter by using
Dempster-Shafer evidence theory. The methodology applies a mining
algorithm to detect the event. There are two types of data in the
fusion. The first is features extracted from text by using the bag-ofwords
method which is calculated using the term frequency-inverse
document frequency (TF-IDF). The second is the visual features
extracted by applying scale-invariant feature transform (SIFT). The
Dempster - Shafer theory of evidence is applied in order to fuse the
information from these two sources. Our experiments have indicated
that comparing to the approaches using individual data source, the
proposed data fusion approach can increase the prediction accuracy
for event detection. The experimental result showed that the proposed
method achieved a high accuracy of 0.97, comparing with 0.93 with
texts only, and 0.86 with images only.
Abstract: XML is a markup language which is becoming the
standard format for information representation and data exchange. A
major purpose of XML is the explicit representation of the logical
structure of a document. Much research has been performed to
exploit logical structure of documents in information retrieval in
order to precisely extract user information need from large
collections of XML documents. In this paper, we describe an XML
information retrieval weighting scheme that tries to find the most
relevant elements in XML documents in response to a user query.
We present this weighting model for information retrieval systems
that utilize plausible inferences to infer the relevance of elements in
XML documents. We also add to this model the Dempster-Shafer
theory of evidence to express the uncertainty in plausible inferences
and Dempster-Shafer rule of combination to combine evidences
derived from different inferences.
Abstract: We study the problem of decision making with Dempster-Shafer belief structure. We analyze the previous work developed by Yager about using the ordered weighted averaging (OWA) operator in the aggregation of the Dempster-Shafer decision process. We discuss the possibility of aggregating with an ascending order in the OWA operator for the cases where the smallest value is the best result. We suggest the introduction of the ordered weighted geometric (OWG) operator in the Dempster-Shafer framework. In this case, we also discuss the possibility of aggregating with an ascending order and we find that it is completely necessary as the OWG operator cannot aggregate negative numbers. Finally, we give an illustrative example where we can see the different results obtained by using the OWA, the Ascending OWA (AOWA), the OWG and the Ascending OWG (AOWG) operator.
Abstract: In this paper we propose a new knowledge model using
the Dempster-Shafer-s evidence theory for image segmentation and
fusion. The proposed method is composed essentially of two steps.
First, mass distributions in Dempster-Shafer theory are obtained from
the membership degrees of each pixel covering the three image
components (R, G and B). Each membership-s degree is determined by
applying Fuzzy C-Means (FCM) clustering to the gray levels of the
three images. Second, the fusion process consists in defining three
discernment frames which are associated with the three images to be
fused, and then combining them to form a new frame of discernment.
The strategy used to define mass distributions in the combined
framework is discussed in detail. The proposed fusion method is
illustrated in the context of image segmentation. Experimental
investigations and comparative studies with the other previous methods
are carried out showing thus the robustness and superiority of the
proposed method in terms of image segmentation.
Abstract: The network traffic data provided for the design of
intrusion detection always are large with ineffective information and
enclose limited and ambiguous information about users- activities.
We study the problems and propose a two phases approach in our
intrusion detection design. In the first phase, we develop a
correlation-based feature selection algorithm to remove the worthless
information from the original high dimensional database. Next, we
design an intrusion detection method to solve the problems of
uncertainty caused by limited and ambiguous information. In the
experiments, we choose six UCI databases and DARPA KDD99
intrusion detection data set as our evaluation tools. Empirical studies
indicate that our feature selection algorithm is capable of reducing the
size of data set. Our intrusion detection method achieves a better
performance than those of participating intrusion detectors.
Abstract: The goal of this paper is to find Wardrop equilibrium
in transport networks at case of uncertainty situations, where the
uncertainty comes from lack of information. We use simulation tool
to find the equilibrium, which gives only approximate solution, but
this is sufficient for large networks as well. In order to take the
uncertainty into account we have developed an interval-based
procedure for finding the paths with minimal cost using the
Dempster-Shafer theory. Furthermore we have investigated the users-
behaviors using game theory approach, because their path choices
influence the costs of the other users- paths.
Abstract: Knowledge is indispensable but voluminous knowledge becomes a bottleneck for efficient processing. A great challenge for data mining activity is the generation of large number of potential rules as a result of mining process. In fact sometimes result size is comparable to the original data. Traditional data mining pruning activities such as support do not sufficiently reduce the huge rule space. Moreover, many practical applications are characterized by continual change of data and knowledge, thereby making knowledge voluminous with each change. The most predominant representation of the discovered knowledge is the standard Production Rules (PRs) in the form If P Then D. Michalski & Winston proposed Censored Production Rules (CPRs), as an extension of production rules, that exhibit variable precision and supports an efficient mechanism for handling exceptions. A CPR is an augmented production rule of the form: If P Then D Unless C, where C (Censor) is an exception to the rule. Such rules are employed in situations in which the conditional statement 'If P Then D' holds frequently and the assertion C holds rarely. By using a rule of this type we are free to ignore the exception conditions, when the resources needed to establish its presence, are tight or there is simply no information available as to whether it holds or not. Thus the 'If P Then D' part of the CPR expresses important information while the Unless C part acts only as a switch changes the polarity of D to ~D. In this paper a scheme based on Dempster-Shafer Theory (DST) interpretation of a CPR is suggested for discovering CPRs from the discovered flat PRs. The discovery of CPRs from flat rules would result in considerable reduction of the already discovered rules. The proposed scheme incrementally incorporates new knowledge and also reduces the size of knowledge base considerably with each episode. Examples are given to demonstrate the behaviour of the proposed scheme. The suggested cumulative learning scheme would be useful in mining data streams.