Abstract: Often the users of a semantic search application are facing the problem that they do not find appropriate terms for their search. This holds especially if the data to be searched is from a technical field in which the user does not have expertise. In order to support the user finding the results he seeks, we developed a domain-specific ontology and implemented it into a search application. The ontology serves as a knowledge base, suggesting technical terms to the user which he can add to his query. In this paper, we present the search application and the underlying ontology as well as the project EnArgus in which the application was developed.
Abstract: With a growing number of digital libraries and other
open education repositories being made available throughout the
world, effective search and retrieval tools are necessary to access the
desired materials that surpass the effectiveness of traditional, all-inclusive
search engines. This paper discusses the design and use of
Folksemantic, a platform that integrates OpenCourseWare search,
Open Educational Resource recommendations, and social network
functionality into a single open source project. The paper describes
how the system was originally envisioned, its goals for users, and
data that provides insight into how it is actually being used. Data
sources include website click-through data, query logs, web server
log files and user account data. Based on a descriptive analysis of its
current use, modifications to the platform's design are recommended
to better address goals of the system, along with recommendations
for additional phases of research.
Abstract: An FAQ system helps users find answers to the problems that puzzle them, but research on Chinese FAQ systems is still at the theoretical stage. This paper presents an approach to semantic inference for FAQ mining. To improve efficiency, a small pool of candidate question-answer pairs is first retrieved from the system for follow-up processing, according to the agriculture-domain concepts extracted from the user input. Input queries or questions are converted into four parts: the question word segment (QWS), the verb segment (VS), the agricultural concept segment (CS), and the auxiliary segment (AS). A semantic matching method is presented to estimate the similarity between the semantic segments of the query and those of the questions in the candidate pool. A thesaurus constructed from HowNet, a Chinese knowledge base, is adopted for the word similarity measure in the matcher. The questions are classified into eleven intension categories using predefined question stemming keywords. For FAQ mining, given a query, the question part and the answer part of an FAQ question-answer pair are each matched against the input query. Finally, the probabilities estimated from these two parts are integrated and used to choose the most likely answer for the input query. These approaches are evaluated on an agriculture FAQ system. Experimental results indicate that the proposed approach outperforms the FAQ-Finder system in agriculture FAQ retrieval.
Abstract: Most Question Answering systems are
composed of three main modules: question processing,
document processing and answer processing. The question
processing module plays an important role in QA systems. If
this module does not work properly, it causes problems for
the other sections. Moreover, the answer processing module is an
emerging topic in Question Answering, where these systems
are often required to rank and validate candidate answers.
Techniques that aim at finding short and precise answers
are often based on semantic classification.
This paper discusses a new model for question
answering which improves two main modules, question
processing and answer processing.
Two important components form the basis
of question processing. The first component is question
classification, which specifies the types of question and answer.
The second is reformulation, which converts the user's
question into a question that the QA system can understand in a
specific domain. The answer processing module consists of
candidate answer filtering and candidate answer ordering
components, and it also has a validation section for interacting
with the user. This module makes the system better suited to finding exact
answers. In this paper we describe the question and answer
processing modules, covering the modeling, implementation and
evaluation of the system. The system was implemented in two versions.
Results show that 'Version No.1' gave correct answers to 70%
of questions (30 correct answers to 50 asked questions) and
'Version No.2' gave correct answers to 94% of questions (47
correct answers to 50 asked questions).
Abstract: Wireless sensor networks can be applied in both harsh
and military environments. A primary goal in the design of
wireless sensor networks is lifetime maximization, constrained by
the energy capacity of batteries. One well-known method to reduce
energy consumption in such networks is data aggregation. Providing
efficient data aggregation while preserving data privacy is a challenging
problem in wireless sensor networks research. In this paper,
we present a privacy-preserving data aggregation scheme for additive
aggregation functions. The Cluster-based Private Data Aggregation
(CPDA) scheme leverages a clustering protocol and algebraic properties of
polynomials. It has the advantage of incurring less communication
overhead. The goal of our work is to bridge the gap between
collaborative data collection by wireless sensor networks and data
privacy. We present simulation results of our schemes and compare
their performance to a typical data aggregation scheme TAG, where
no data privacy protection is provided. Results show the efficacy and
efficiency of our schemes.
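The abstract above does not spell out how the algebraic properties of polynomials are used, so the following is only a minimal sketch of one way additive aggregation inside a cluster could hide individual readings. It assumes that every cluster member has a distinct, non-zero public seed, that each member hides its reading as the constant term of a random polynomial, and that shares are exchanged over pairwise-encrypted links (not modeled here). The names `share_value`, `recover_sum` and `cluster_sum` are hypothetical and do not come from the paper.

```python
import random
from fractions import Fraction


def share_value(value, seeds, rng):
    """Hide `value` as the constant term of a random polynomial and
    evaluate that polynomial at every cluster member's public seed."""
    degree = len(seeds) - 1
    # Assumption: small random integer coefficients are enough for a demo.
    coeffs = [value] + [rng.randrange(1, 1000) for _ in range(degree)]
    return [sum(c * x ** k for k, c in enumerate(coeffs)) for x in seeds]


def recover_sum(seeds, assembled):
    """Lagrange-interpolate the summed polynomial at x = 0.
    Its constant term is the sum of all hidden readings."""
    total = Fraction(0)
    for j, xj in enumerate(seeds):
        weight = Fraction(1)
        for k, xk in enumerate(seeds):
            if k != j:
                weight *= Fraction(-xk, xj - xk)
        total += assembled[j] * weight
    return total


def cluster_sum(readings, seeds, rng=None):
    """Simulate one cluster: share, assemble per node, recover the sum."""
    rng = rng or random.Random()
    shares = [share_value(v, seeds, rng) for v in readings]
    # Node j adds up the j-th share it received from every member,
    # so it only ever sees masked values, never a raw reading.
    assembled = [sum(s[j] for s in shares) for j in range(len(seeds))]
    return recover_sum(seeds, assembled)
```

For example, `cluster_sum([12, 7, 30], [1, 2, 3])` recovers the cluster total of 49 from the assembled values alone, because the random coefficients cancel out of the constant term during interpolation.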
Abstract: For a spatiotemporal database management system,
I/O cost of queries and other operations is an important performance
criterion. In order to optimize this cost, intense research on
designing robust index structures has been done in the past decade.
Beyond these major considerations, there are still other design issues
that deserve addressing due to their direct impact on the I/O cost.
In particular, an efficient buffer management strategy plays a key
role in reducing redundant disk accesses. In this paper, we propose an
efficient buffer strategy for a spatiotemporal database index
structure, specifically one indexing objects moving over a network of
roads. The proposed strategy, namely MONPAR, is based on the data
type (i.e. spatiotemporal data) and the structure of the
index. For the purpose of an experimental evaluation, we set up a
simulation environment that counts the number of disk accesses
while executing a number of spatiotemporal range-queries over the
index. We repeated the simulations with query sets with different
distributions, such as uniform query distribution and skewed query
distribution. Based on the comparison of our strategy with well-known
page-replacement techniques, like LRU-based and Priority-based
buffers, we conclude that MONPAR behaves better than its
competitors for small and medium size buffers under all used query distributions.
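The evaluation described above counts disk accesses for a given buffer policy. As a rough illustration of that measurement, the sketch below simulates the LRU baseline only. It is not MONPAR, whose replacement rules the abstract does not give. The function name `count_disk_accesses` and the representation of a query workload as a flat list of page identifiers are assumptions made for this example.

```python
from collections import OrderedDict


def count_disk_accesses(page_requests, buffer_size):
    """Replay a sequence of page requests against an LRU buffer and
    return how many of them had to be served from disk."""
    buffer = OrderedDict()  # keys are page ids, ordered oldest to newest use
    disk_accesses = 0
    for page in page_requests:
        if page in buffer:
            # Buffer hit: mark the page as most recently used.
            buffer.move_to_end(page)
        else:
            # Buffer miss: the page must be read from disk.
            disk_accesses += 1
            buffer[page] = True
            if len(buffer) > buffer_size:
                # Evict the least recently used page.
                buffer.popitem(last=False)
    return disk_accesses
```

Replaying the same request trace with different values of `buffer_size`, or with traces drawn from uniform and skewed query distributions, gives the kind of comparison the abstract reports.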
Abstract: In practice, we often come across situations where it is
necessary to make decisions based on incomplete or uncertain data.
In control systems, this may be due to an unknown exact mathematical
model, or to its excessive complexity (e.g. nonlinearity), in which case it is
necessary to simplify the model or to solve it using a rule base. In
the case of databases, when searching data we use a similarity
measure to compare the requirements of the selection with the stored data,
where both the select query and the data itself may contain vague
terms, for example in the form of linguistic qualifiers. In this paper,
we focus on the processing of uncertain data in databases and
demonstrate it on the example of multi-criteria decision making in the
selection of variants specified by a higher number of technical
parameters.
Abstract: Ever increasing capacities of contemporary storage devices
inspire the vision to accumulate (personal) information without
the need to delete old data over a long time span. Hence the target
of the SemanticLIFE project is to create a Personal Information Management
system for a human lifetime of data. One of the most important
characteristics of the system is its dedication to retrieve information
in a very efficient way. By adopting user demands regarding the
reduction of ambiguities, our approach aims at a user-oriented and
yet powerful enough system with a satisfactory query performance.
We introduce the query system of SemanticLIFE, the Virtual Query
System, which uses emerging Semantic Web technologies to fulfill
users' requirements.
Abstract: P2P networks are highly dynamic structures, since
their nodes (peer users) keep joining and leaving continuously. In the
paper, we study the effects of network change rates on query routing
efficiency. First we describe some background and an abstract system
model. The chosen routing technique makes use of cached metadata
from previous answer messages and also employs a mechanism for
broken path detection and metadata maintenance. Several metrics are
used to show that the protocol behaves quite well even with a high rate
of node departures, but above a certain threshold it literally breaks
down and exhibits considerable efficiency degradation.
Abstract: The information on the Web increases tremendously.
A number of search engines have been developed for searching Web
information and retrieving relevant documents that satisfy the
inquirers' needs. Search engines return irrelevant
documents among the search results, since the search is text-based rather
than semantic-based. The information retrieval research area has
presented a number of approaches and methodologies such as
profiling, feedback, query modification, human-computer interaction,
etc. for improving search results. Moreover, information retrieval has
employed artificial intelligence techniques and strategies such as
machine learning heuristics, tuning mechanisms, user and system
vocabularies, logical theory, etc. for capturing users' preferences and
using them for guiding the search based on semantic analysis
rather than syntactic analysis. Although a valuable improvement has
been recorded in search results, the survey has shown that
search engine users are still not really satisfied with their search results.
Using ontologies for semantic-based searching is likely the key
solution. Adopting a profiling approach and using ontology base
characteristics, this work proposes a strategy for finding the exact
meaning of the query terms in order to retrieve relevant information
according to user needs. The evaluation of the conducted experiments
has shown the effectiveness of the suggested methodology, and
conclusions are presented.
Abstract: A data warehouse is a dedicated database used for querying and reporting. Queries in this environment show special characteristics such as multidimensionality and aggregation. Exploiting the nature of these queries, in this paper we propose a query-driven design framework. The proposed framework is general and allows a designer to generate a schema based on a set of queries.
Abstract: This paper investigates the problem of tracking spatiotemporal changes of a satellite image through the use of Knowledge Discovery in Database (KDD). The purpose of this study is to help a given user effectively discover interesting knowledge and then build prediction and decision models. Unfortunately, the KDD process for spatiotemporal data is always marked by several types of imperfections. In our paper, we take these imperfections into consideration in order to provide more accurate decisions. To achieve this objective, different KDD methods are used to discover knowledge in satellite image databases. Each method presents a different point of view of spatiotemporal evolution of a query model (which represents an extracted object from a satellite image). In order to combine these methods, we use the evidence fusion theory which considerably improves the spatiotemporal knowledge discovery process and increases our belief in the spatiotemporal model change. Experimental results of satellite images representing the region of Auckland in New Zealand depict the improvement in the overall change detection as compared to using classical methods.
Abstract: XML is a markup language which is becoming the
standard format for information representation and data exchange. A
major purpose of XML is the explicit representation of the logical
structure of a document. Much research has been performed to
exploit logical structure of documents in information retrieval in
order to precisely extract user information need from large
collections of XML documents. In this paper, we describe an XML
information retrieval weighting scheme that tries to find the most
relevant elements in XML documents in response to a user query.
We present this weighting model for information retrieval systems
that utilize plausible inferences to infer the relevance of elements in
XML documents. We also add to this model the Dempster-Shafer
theory of evidence to express the uncertainty in plausible inferences
and Dempster-Shafer rule of combination to combine evidences
derived from different inferences.
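Dempster's rule of combination, mentioned above, merges two mass functions by multiplying the masses of every pair of focal sets, assigning each product to the intersection of the pair, and renormalizing by the mass that did not fall on the empty set. The sketch below shows the rule on its own. How the paper maps plausible inferences about XML elements onto mass functions is not stated in the abstract, so the two-hypothesis frame in the usage note and the function name `combine_evidence` are illustrative assumptions.

```python
from itertools import product


def combine_evidence(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of hypotheses to its mass,
    and the masses of each function are expected to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = (
                combined.get(intersection, 0.0) + mass_a * mass_b
            )
        else:
            # Mass that would land on the empty set counts as conflict.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; rule undefined")
    # Renormalize so the combined masses sum to 1 again.
    return {h: mass / (1.0 - conflict) for h, mass in combined.items()}
```

As a usage example, take `R = frozenset({"relevant"})`, `I = frozenset({"irrelevant"})` and `RI = R | I`. Combining `{R: 0.6, RI: 0.4}` with `{R: 0.5, I: 0.3, RI: 0.2}` gives a conflict of 0.18 and a combined mass of 0.62 / 0.82, about 0.756, on `R`.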
Abstract: A mobile ad hoc network (MANET) is a collection of
mobile devices which form a communication network with no pre-existing
wiring or infrastructure. Multiple routing protocols have
been developed for MANETs. As MANETs gain popularity, their
need to support real time applications is growing as well. Such
applications have stringent quality of service (QoS) requirements
such as throughput, end-to-end delay, and energy. Due to the dynamic
topology and bandwidth constraints, supporting QoS is a challenging
task. QoS aware routing is an important building block for QoS
support. The primary goal of a QoS aware protocol is to determine
a path from source to destination that satisfies the QoS
requirements. This paper proposes a new energy and delay aware
protocol called energy and delay aware TORA (EDTORA), based on an
extension of the Temporally Ordered Routing Algorithm (TORA). Energy
and delay verification of the query packet is performed at each node.
Simulation results show that the proposed protocol has a higher
performance than TORA in terms of network lifetime, packet
delivery ratio and end-to-end delay.
Abstract: Nowadays e-learning is increasingly popular, especially in
Vietnam. In e-learning, study materials are very important.
It is necessary to design knowledge base systems and expert
systems that support searching, querying and problem
solving. The ontology called Computational Object
Knowledge Base Ontology (COKB-ONT) is a useful tool for
designing knowledge base systems in practice. In this paper, a
design method for knowledge base systems in education using
COKB-ONT is presented. We also present the design of a
knowledge base system that supports studying knowledge and
solving problems in higher mathematics.
Abstract: Question answering (QA) aims at retrieving precise information from a large collection of documents. Most QA systems are composed of three main modules: question processing, document processing and answer processing. The question processing module plays an important role in QA systems by reformulating questions. Moreover, the answer processing module is an emerging topic in QA systems, where these systems are often required to rank and validate candidate answers. Techniques that aim at finding short and precise answers are often based on semantic relations and co-occurring keywords. This paper discusses a new model for question answering which improves two main modules, question processing and answer processing, both of which affect the evaluation of the system's operation. Two important components form the basis of question processing. The first is question classification, which specifies the types of question and answer. The second is reformulation, which converts the user's question into a question that the QA system can understand in a specific domain. The objective of an answer validation task is to judge the correctness of an answer returned by a QA system, according to the text snippet given to support it. To validate answers, we apply candidate answer filtering and candidate answer ranking, followed by a final validation step based on user voting. This paper also describes a new architecture for the question and answer processing modules, covering the modeling, implementation and evaluation of the system. The system differs from most question answering systems in its answer validation model, which makes it better suited to finding exact answers. Results show that, over a total of 50 asked questions, the model yields a 92% improvement in the decisions of the system.
Abstract: In this paper, an enhanced ground proximity warning simulation and validation system is designed and implemented. First, the global digital terrain database is designed and constructed based on a square grid and sub-grid structure. Terrain data searching is implemented by querying the latitude and longitude bands and the separated zones of the global terrain database with the current aircraft position. A combination of dynamic scheduling and hierarchical scheduling is adopted to schedule the terrain data, so that terrain data can be read into and deleted from memory dynamically. Second, collision threat detection is executed in real time using a security-profile calculation method, based on information such as the scope of, distance to and approach speed toward the dangerous terrain ahead, and it provides caution and warning alarms. The enhanced ground proximity warning simulation system is implemented according to this scheme. Simulations are carried out to verify good real-time performance in terrain display and alarm triggering, and the results show that the simulation system is implemented correctly, reasonably and stably.
Abstract: Object-oriented database management systems appeared
around 1986, yet five years after their birth they had not achieved
success. One of the major difficulties is query optimization.
In this paper we propose a new approach that enriches the
query optimization techniques existing in object-oriented
databases. Given the success of query optimization in the
relational model, our approach draws on these optimization
techniques and enriches them so that they can support the new concepts
introduced by object databases.
Abstract: In this paper we focus on event extraction from Tamil
news articles. The system utilizes a scoring scheme for extracting and
grouping event-specific sentences. Using this scoring scheme, event-specific
clustering is performed for multiple documents. Events are
extracted from each document using a scoring scheme based on
feature score and condition score. Similarly event specific sentences
are clustered from multiple documents using this scoring scheme.
The proposed system builds the Event Template based on user
specified query. The templates are filled with event specific details
like person, location and timeline extracted from the formed clusters.
The proposed system applies these methodologies for Tamil news
articles that have been enconverted into UNL graphs using a Tamil to
UNL-enconverter. The main intention of this work is to generate an
event based template.
Abstract: In the world of Peer-to-Peer (P2P) networking
different protocols have been developed to make the resource sharing
or information retrieval more efficient. The SemPeer protocol is a
new layer on Gnutella that transforms the connections of the nodes
based on semantic information to make information retrieval more
efficient. However, this transformation causes high clustering in the
network that decreases the number of nodes reached, therefore the
probability of finding a document is also decreased. In this paper we
describe a mathematical model for the Gnutella and SemPeer
protocols that captures clustering-related issues, followed by a
proposition to modify the SemPeer protocol to achieve moderate
clustering. This modification is a sort of link management for the
individual nodes that allows the SemPeer protocol to be more
efficient, because the probability of a successful query in the P2P
network is reasonably increased. For the validation of the models, we
evaluated a series of simulations that supported our results.