Abstract: Some properties of Intuitionistic Fuzzy (IF) rough relational algebraic operators under an IF rough relational data model are investigated and illustrated using diabetes and heart disease databases. These properties are important and desirable for processing queries in an effective and efficient manner.
Abstract: This paper investigates a new data mining capability that entails mining of High Utility Itemsets (HUI) in a distributed environment. Existing research in data mining deals with only presence or absence of an items and do not consider the semantic measures like weight or cost of the items. Thus, HUI mining algorithm has evolved. HUI mining is the one kind of utility mining concept, aims to identify itemsets whose utility satisfies a given threshold. Although, the approach of mining HUIs in a distributed environment and mining of the same from XML data have not explored yet. In this work, a novel approach is proposed to mine HUIs from the XML based data in a distributed environment. This work utilizes Service Oriented Computing (SOC) paradigm which provides Knowledge as a Service (KaaS). The interesting patterns are provided via the web services with the help of knowledge server to answer the queries of the consumers. The performance of the approach is evaluated on various databases using execution time and memory consumption.
Abstract: FAQ system can make user find answer to the problem that puzzles them. But now the research on Chinese FAQ system is still on the theoretical stage. This paper presents an approach to semantic inference for FAQ mining. To enhance the efficiency, a small pool of the candidate question-answering pairs retrieved from the system for the follow-up work according to the concept of the agriculture domain extracted from user input .Input queries or questions are converted into four parts, the question word segment (QWS), the verb segment (VS), the concept of agricultural areas segment (CS), the auxiliary segment (AS). A semantic matching method is presented to estimate the similarity between the semantic segments of the query and the questions in the pool of the candidate. A thesaurus constructed from the HowNet, a Chinese knowledge base, is adopted for word similarity measure in the matcher. The questions are classified into eleven intension categories using predefined question stemming keywords. For FAQ mining, given a query, the question part and answer part in an FAQ question-answer pair is matched with the input query, respectively. Finally, the probabilities estimated from these two parts are integrated and used to choose the most likely answer for the input query. These approaches are experimented on an agriculture FAQ system. Experimental results indicate that the proposed approach outperformed the FAQ-Finder system in agriculture FAQ retrieval.
Abstract: For a spatiotemporal database management system,
I/O cost of queries and other operations is an important performance
criterion. In order to optimize this cost, an intense research on
designing robust index structures has been done in the past decade.
With these major considerations, there are still other design issues
that deserve addressing due to their direct impact on the I/O cost.
Having said this, an efficient buffer management strategy plays a key
role on reducing redundant disk access. In this paper, we proposed an
efficient buffer strategy for a spatiotemporal database index
structure, specifically indexing objects moving over a network of
roads. The proposed strategy, namely MONPAR, is based on the data
type (i.e. spatiotemporal data) and the structure of the index
structure. For the purpose of an experimental evaluation, we set up a
simulation environment that counts the number of disk accesses
while executing a number of spatiotemporal range-queries over the
index. We reiterated simulations with query sets with different
distributions, such as uniform query distribution and skewed query
distribution. Based on the comparison of our strategy with wellknown
page-replacement techniques, like LRU-based and Prioritybased
buffers, we conclude that MONPAR behaves better than its
competitors for small and medium size buffers under all used query-distributions.
Abstract: The proliferation of web application and the pervasiveness of mobile technology make web-based attacks even more attractive and even easier to launch. Web Application Firewall (WAF) is an intermediate tool between web server and users that provides comprehensive protection for web application. WAF is a negative security model where the detection and prevention mechanisms are based on predefined or user-defined attack signatures and patterns. However, WAF alone is not adequate to offer best defensive system against web vulnerabilities that are increasing in number and complexity daily. This paper presents a methodology to automatically design a positive security based model which identifies and allows only legitimate web queries. The paper shows a true positive rate of more than 90% can be achieved.
Abstract: Data warehouse is a dedicated database used for querying and reporting. Queries in this environment show special characteristics such as multidimensionality and aggregation. Exploiting the nature of queries, in this paper we propose a query driven design framework. The proposed framework is general and allows a designer to generate a schema based on a set of queries.
Abstract: Starting from the basic pillars of the supportability
analysis this paper queries its characteristics in LCI (Life Cycle
Integration) environment. The research methodology contents a
review of modern logistics engineering literature with the objective to
collect and synthesize the knowledge relating to standards of
supportability design in e-logistics environment. The results show
that LCI framework has properties which are in fully compatibility
with the requirement of simultaneous logistics support and productservice
bundle design. The proposed approach is a contribution to the
more comprehensive and efficient supportability design process.
Also, contributions are reflected through a greater consistency of
collected data, automated creation of reports suitable for different
analysis, as well as the possibility of their customization according
with customer needs. In addition to this, convenience of this approach
is its practical use in real time. In a broader sense, LCI allows
integration of enterprises on a worldwide basis facilitating electronic
business.
Abstract: In this paper, we propose effective system for digital music retrieval. We divided proposed system into Client and Server. Client part consists of pre-processing and Content-based feature extraction stages. In pre-processing stage, we minimized Time code Gap that is occurred among same music contents. As content-based feature, first-order differentiated MFCC were used. These presented approximately envelop of music feature sequences. Server part included Music Server and Music Matching stage. Extracted features from 1,000 digital music files were stored in Music Server. In Music Matching stage, we found retrieval result through similarity measure by DTW. In experiment, we used 450 queries. These were made by mixing different compression standards and sound qualities from 50 digital music files. Retrieval accurate indicated 97% and retrieval time was average 15ms in every single query. Out experiment proved that proposed system is effective in retrieve digital music and robust at various user environments of web.
Abstract: XML data consists of a very flexible tree-structure
which makes it difficult to support the storing and retrieving of XML
data. The node numbering scheme is one of the most popular
approaches to store XML in relational databases. Together with the
node numbering storage scheme, structural joins can be used to
efficiently process the hierarchical relationships in XML. However, in
order to process a tree-structured XPath query containing several
hierarchical relationships and conditional sentences on XML data,
many structural joins need to be carried out, which results in a high
query execution cost. This paper introduces mechanisms to reduce the
XPath queries including branch nodes into a much more efficient form
with less numbers of structural joins. A two step approach is proposed.
The first step merges duplicate nodes in the tree-structured query and
the second step divides the query into sub-queries, shortens the paths
and then merges the sub-queries back together. The proposed
approach can highly contribute to the efficient execution of XML
queries. Experimental results show that the proposed scheme can
reduce the query execution cost by up to an order of magnitude of the
original execution cost.
Abstract: Due to new distributed database applications such as
huge deductive database systems, the search complexity is constantly
increasing and we need better algorithms to speedup traditional
relational database queries. An optimal dynamic programming
method for such high dimensional queries has the big disadvantage of
its exponential order and thus we are interested in semi-optimal but
faster approaches. In this work we present a multi-agent based
mechanism to meet this demand and also compare the result with
some commonly used query optimization algorithms.
Abstract: Semantic query optimization consists in restricting the
search space in order to reduce the set of objects of interest for a
query. This paper presents an indexing method based on UB-trees
and a static analysis of the constraints associated to the views of the
database and to any constraint expressed on attributes. The result of
the static analysis is a partitioning of the object space into disjoint
blocks. Through Space Filling Curve (SFC) techniques, each
fragment (block) of the partition is assigned a unique identifier,
enabling the efficient indexing of fragments by UB-trees. The search
space corresponding to a range query is restricted to a subset of the
blocks of the partition. This approach has been developed in the
context of a KB-DBMS but it can be applied to any relational
system.
Abstract: Content-Based Image Retrieval (CBIR) has been
one on the most vivid research areas in the field of computer vision
over the last 10 years. Many programs and tools have been
developed to formulate and execute queries based on the visual or
audio content and to help browsing large multimedia repositories.
Still, no general breakthrough has been achieved with respect to
large varied databases with documents of difering sorts and with
varying characteristics. Answers to many questions with respect to
speed, semantic descriptors or objective image interpretations are
still unanswered. In the medical field, images, and especially
digital images, are produced in ever increasing quantities and used
for diagnostics and therapy. In several articles, content based
access to medical images for supporting clinical decision making
has been proposed that would ease the management of clinical data
and scenarios for the integration of content-based access methods
into Picture Archiving and Communication Systems (PACS) have
been created. This paper gives an overview of soft computing
techniques. New research directions are being defined that can
prove to be useful. Still, there are very few systems that seem to be
used in clinical practice. It needs to be stated as well that the goal
is not, in general, to replace text based retrieval methods as they
exist at the moment.
Abstract: This paper proposes rough set models with three
different level knowledge granules in incomplete information system
under tolerance relation by similarity between objects according to
their attribute values. Through introducing dominance relation on the
discourse to decompose similarity classes into three subclasses: little
better subclass, little worse subclass and vague subclass, it dismantles
lower and upper approximations into three components. By using
these components, retrieving information to find naturally hierarchical
expansions to queries and constructing answers to elaborative queries
can be effective. It illustrates the approach in applying rough set
models in the design of information retrieval system to access different
granular expanded documents. The proposed method enhances rough
set model application in the flexibility of expansions and elaborative
queries in information retrieval.
Abstract: XML is becoming a de facto standard for online data exchange. Existing XML filtering techniques based on a publish/subscribe model are focused on the highly structured data marked up with XML tags. These techniques are efficient in filtering the documents of data-centric XML but are not effective in filtering the element contents of the document-centric XML. In this paper, we propose an extended XPath specification which includes a special matching character '%' used in the LIKE operation of SQL in order to solve the difficulty of writing some queries to adequately filter element contents using the previous XPath specification. We also present a novel technique for filtering a collection of document-centric XMLs, called Pfilter, which is able to exploit the extended XPath specification. We show several performance studies, efficiency and scalability using the multi-query processing time (MQPT).
Abstract: The issue of real-time and reliable report delivery is extremely important for taking effective decision in a real world mission critical Wireless Sensor Network (WSN) based application. The sensor data behaves differently in many ways from the data in traditional databases. WSNs need a mechanism to register, process queries, and disseminate data. In this paper we propose an architectural framework for data placement and management. We propose a reliable and real time approach for data placement and achieving data integrity using self organized sensor clusters. Instead of storing information in individual cluster heads as suggested in some protocols, in our architecture we suggest storing of information of all clusters within a cell in the corresponding base station. For data dissemination and action in the wireless sensor network we propose to use Action and Relay Stations (ARS). To reduce average energy dissipation of sensor nodes, the data is sent to the nearest ARS rather than base station. We have designed our architecture in such a way so as to achieve greater energy savings, enhanced availability and reliability.
Abstract: Database management systems that integrate user preferences promise better solution for personalization, greater flexibility and higher quality of query responses. This paper presents a tentative work that studies and investigates approaches to express user preferences in queries. We sketch an extend capabilities of SQLf language that uses the fuzzy set theory in order to define the user preferences. For that, two essential points are considered: the first concerns the expression of user preferences in SQLf by so-called fuzzy commensurable predicates set. The second concerns the bipolar way in which these user preferences are expressed on mandatory and/or optional preferences.
Abstract: Our work is part of the heterogeneous data
integration, with the definition of a structural and semantic mediation
model. Our aim is to propose architecture for the heterogeneous
sources metadata mediation, represented by XML, RDF and RuleML
models, providing to the user the metadata transparency. This, by
including data structures, of natures fundamentally different, and
allowing the decomposition of a query involving multiple sources, to
queries specific to these sources, then recompose the result.
Abstract: A data warehouse (DW) is a system which has value and role for decision-making by querying. Queries to DW are critical regarding to their complexity and length. They often access millions of tuples, and involve joins between relations and aggregations. Materialized views are able to provide the better performance for DW queries. However, these views have maintenance cost, so materialization of all views is not possible. An important challenge of DW environment is materialized view selection because we have to realize the trade-off between performance and view maintenance. Therefore, in this paper, we introduce a new approach aimed to solve this challenge based on Two-Phase Optimization (2PO), which is a combination of Simulated Annealing (SA) and Iterative Improvement (II), with the use of Multiple View Processing Plan (MVPP). Our experiments show that 2PO outperform the original algorithms in terms of query processing cost and view maintenance cost.
Abstract: Modern spatial database management systems require a unique Spatial Access Method (SAM) in order solve complex spatial quires efficiently. In this case the spatial data structure takes a prominent place in the SAM. Inadequate data structure leads forming poor algorithmic choices and forging deficient understandings of algorithm behavior on the spatial database. A key step in developing a better semantic spatial object data structure is to quantify the performance effects of semantic and outlier detections that are not reflected in the previous tree structures (R-Tree and its variants). This paper explores a novel SSRO-Tree on SAM to the Topo-Semantic approach. The paper shows how to identify and handle the semantic spatial objects with outlier objects during page overflow/underflow, using gain/loss metrics. We introduce a new SSRO-Tree algorithm which facilitates the achievement of better performance in practice over algorithms that are superior in the R*-Tree and RO-Tree by considering selection queries.
Abstract: This paper proposes a new model to support user
queries on postgraduate research information at Universiti Tenaga
Nasional. The ontology to be developed will contribute towards
shareable and reusable domain knowledge that makes knowledge
assets intelligently accessible to both people and software. This work
adapts a methodology for ontology development based on the
framework proposed by Uschold and King. The concepts and
relations in this domain are represented in a class diagram using the
Protégé software. The ontology will be used to support a menudriven
query system for assisting students in searching for
information related to postgraduate research at the university.