Abstract: A data warehouse (DW) is a system which has value and role for decision-making by querying. Queries to DW are critical regarding to their complexity and length. They often access millions of tuples, and involve joins between relations and aggregations. Materialized views are able to provide the better performance for DW queries. However, these views have maintenance cost, so materialization of all views is not possible. An important challenge of DW environment is materialized view selection because we have to realize the trade-off between performance and view maintenance cost. Therefore, in this paper, we introduce a new approach aimed at solve this challenge based on Two-Phase Optimization (2PO), which is a combination of Simulated Annealing (SA) and Iterative Improvement (II), with the use of Multiple View Processing Plan (MVPP). Our experiments show that our method provides a further improvement in term of query processing cost and view maintenance cost.
Abstract: XML has become a popular standard for information exchange via web. Each XML document can be presented as a rooted, ordered, labeled tree. The Node label shows the exact position of a node in the original document. Region and Dewey encoding are two famous methods of labeling trees. In this paper, we propose a new insert friendly labeling method named IFDewey based on recently proposed scheme, called Extended Dewey. In Extended Dewey many labels must be modified when a new node is inserted into the XML tree. Our method eliminates this problem by reserving even numbers for future insertion. Numbers generated by Extended Dewey may be even or odd. IFDewey modifies Extended Dewey so that only odd numbers are generated and even numbers can then be used for a much easier insertion of nodes.
Abstract: XML files contain data which is in well formatted manner. By studying the format or semantics of the grammar it will be helpful for fast retrieval of the data. There are many algorithms which describes about searching the data from XML files. There are no. of approaches which uses data structure or are related to the contents of the document. In these cases user must know about the structure of the document and information retrieval techniques using NLPs is related to content of the document. Hence the result may be irrelevant or not so successful and may take more time to search.. This paper presents fast XML retrieval techniques by using new indexing technique and the concept of RXML. When indexing an XML document, the system takes into account both the document content and the document structure and assigns the value to each tag from file. To query the system, a user is not constrained about fixed format of query.
Abstract: The main idea behind in network aggregation is that,
rather than sending individual data items from sensors to sinks,
multiple data items are aggregated as they are forwarded by the
sensor network. Existing sensor network data aggregation techniques
assume that the nodes are preprogrammed and send data to a central
sink for offline querying and analysis. This approach faces two major
drawbacks. First, the system behavior is preprogrammed and cannot
be modified on the fly. Second, the increased energy wastage due to
the communication overhead will result in decreasing the overall
system lifetime. Thus, energy conservation is of prime consideration
in sensor network protocols in order to maximize the network-s
operational lifetime. In this paper, we give an energy efficient
approach to query processing by implementing new optimization
techniques applied to in-network aggregation. We first discuss earlier
approaches in sensors data management and highlight their
disadvantages. We then present our approach “Energy Efficient
Indexed Aggregation" (EEIA) and evaluate it through several
simulations to prove its efficiency, competence and effectiveness.
Abstract: The majority of today's IR systems base the IR task on two main processes: indexing and searching. There exists a special group of dynamic IR systems where both processes (indexing and searching) happen simultaneously; such a system discards obsolete information, simultaneously dealing with the insertion of new in¬formation, while still answering user queries. In these dynamic, time critical text document databases, it is often important to modify index structures quickly, as documents arrive. This paper presents a method for dynamization which may be used for this task. Experimental results show that the dynamization process is possible and that it guarantees the response time for the query operation and index actualization.
Abstract: In the current age, retrieval of relevant information
from massive amount of data is a challenging job. Over the years,
precise and relevant retrieval of information has attained high
significance. There is a growing need in the market to build systems,
which can retrieve multimedia information that precisely meets the
user's current needs. In this paper, we have introduced a framework
for refining query results before showing it to the user, using ambient
intelligence, user profile, group profile, user location, time, day, user
device type and extracted features. A prototypic tool was also
developed to demonstrate the efficiency of the proposed approach.
Abstract: Databases have become ubiquitous. Almost all IT applications are storing into and retrieving information from databases. Retrieving information from the database requires knowledge of technical languages such as Structured Query Language (SQL). However majority of the users who interact with the databases do not have a technical background and are intimidated by the idea of using languages such as SQL. This has led to the development of a few Natural Language Database Interfaces (NLDBIs). A NLDBI allows the user to query the database in a natural language. This paper highlights on architecture of new NLDBI system, its implementation and discusses on results obtained. In most of the typical NLDBI systems the natural language statement is converted into an internal representation based on the syntactic and semantic knowledge of the natural language. This representation is then converted into queries using a representation converter. A natural language query is translated to an equivalent SQL query after processing through various stages. The work has been experimented on primitive database queries with certain constraints.
Abstract: Unstructured peer-to-peer networks are popular due to
its robustness and scalability. Query schemes that are being used in
unstructured peer-to-peer such as the flooding and interest-based
shortcuts suffer various problems such as using large communication
overhead long delay response. The use of routing indices has been a
popular approach for peer-to-peer query routing. It helps the query
routing processes to learn the routing based on the feedbacks
collected. In an unstructured network where there is no global
information available, efficient and low cost routing approach is
needed for routing efficiency.
In this paper, we propose a novel mechanism for query-feedback
oriented routing indices to achieve routing efficiency in unstructured
network at a minimal cost. The approach also applied information
retrieval technique to make sure the content of the query is
understandable and will make the routing process not just based to
the query hits but also related to the query content. Experiments have
shown that the proposed mechanism performs more efficient than
flood-based routing.
Abstract: The data exchanged on the Web are of different nature
from those treated by the classical database management systems;
these data are called semi-structured data since they do not have a
regular and static structure like data found in a relational database;
their schema is dynamic and may contain missing data or types.
Therefore, the needs for developing further techniques and
algorithms to exploit and integrate such data, and extract relevant
information for the user have been raised. In this paper we present
the system OSIX (Osiris based System for Integration of XML
Sources). This system has a Data Warehouse model designed for the
integration of semi-structured data and more precisely for the
integration of XML documents. The architecture of OSIX relies on
the Osiris system, a DL-based model designed for the representation
and management of databases and knowledge bases. Osiris is a viewbased
data model whose indexing system supports semantic query
optimization. We show that the problem of query processing on a
XML source is optimized by the indexing approach proposed by
Osiris.
Abstract: With the rapid development in the field of life
sciences and the flooding of genomic information, the need for faster
and scalable searching methods has become urgent. One of the
approaches that were investigated is indexing. The indexing methods
have been categorized into three categories which are the lengthbased
index algorithms, transformation-based algorithms and mixed
techniques-based algorithms. In this research, we focused on the
transformation based methods. We embedded the N-gram method
into the transformation-based method to build an inverted index
table. We then applied the parallel methods to speed up the index
building time and to reduce the overall retrieval time when querying
the genomic database. Our experiments show that the use of N-Gram
transformation algorithm is an economical solution; it saves time and
space too. The result shows that the size of the index is smaller than
the size of the dataset when the size of N-Gram is 5 and 6. The
parallel N-Gram transformation algorithm-s results indicate that the
uses of parallel programming with large dataset are promising which
can be improved further.
Abstract: In this paper we present semantic assistant agent
(SAA), an open source digital library agent which takes user query
for finding information in the digital library and takes resources-
metadata and stores it semantically. SAA uses Semantic Web to
improve browsing and searching for resources in digital library. All
metadata stored in the library are available in RDF format for
querying and processing by SemanSreach which is a part of SAA
architecture. The architecture includes a generic RDF-based model
that represents relationships among objects and their components.
Queries against these relationships are supported by an RDF triple
store.
Abstract: PPX(Pretty Printer for XML) is a query language that offers a concise description method of formatting the XML data into HTML. In this paper, we propose a simple specification of formatting method that is a combination description of automatic layout operators and variables in the layout expression of the GENERATE clause of PPX. This method can automatically format irregular XML data included in a part of XML with layout decision rule that is referred to DTD. In the experiment, a quick comparison shows that PPX requires far less description compared to XSLT or XQuery programs doing same tasks.
Abstract: Images are important in disease research, education,
and clinical medicine. This paper presents a Web Service Platform
(WSP) for support multiple programming languages to access image
from biomedical databases. The main function WSP is to allow web
users access image from biomedical databases. The WSP will
receive web user-s queries. After that, it will send to Querying
Server (QS) and the QS will search and retrieve data from
biomedical databases. Finally, the information will display to the
web users. Simple application is developed and tested for
experiment purpose. Result from experiment indicated WSP can be
used in biomedical environment.
Abstract: Some meta-schedulers query the information system of individual supercomputers in order to submit jobs to the least busy supercomputer on a computational Grid. However, this information can become outdated by the time a job starts due to changes in scheduling priorities. The MSR scheme is based on Multiple Simultaneous Requests and can take advantage of opportunities resulting from these priorities changes. This paper presents the SWARM meta-scheduler, which can speed up the execution of large sets of tasks by minimizing the job queuing time through the submission of multiple requests. Performance tests have shown that this new meta-scheduler is faster than an implementation of the MSR scheme and the gLite meta-scheduler. SWARM has been used through the GridQTL project beta-testing portal during the past year. Statistics are provided for this usage and demonstrate its capacity to achieve reliably a substantial reduction of the execution time in production conditions.
Abstract: In this paper, we propose a new approach to query-by-humming, focusing on MP3 songs database. Since MP3 songs are much more difficult in melody representation than symbolic performance data, we adopt to extract feature descriptors from the vocal sounds part of the songs. Our approach is based on signal filtering, sub-band spectral processing, MDCT coefficients analysis and peak energy detection by ignorance of the background music as much as possible. Finally, we apply dual dynamic programming algorithm for feature similarity matching. Experiments will show us its online performance in precision and efficiency.
Abstract: In this paper we propose a novel approach for
searching eCommerce products using a mobile phone, illustrated by a
prototype eCoMobile. This approach aims to globalize the mobile
search by integrating the concept of user multilinguism into it. To
show that, we particularly deal with English and Arabic languages.
Indeed the mobile user can formulate his query on a commercial
product in either language (English/Arabic). The description of his
information need on commercial products relies on the ontology that
represents the conceptualization of the product catalogue knowledge
domain defined in both English and Arabic languages. A query
expressed on a mobile device client defines the concept that
corresponds to the name of the product followed by a set of pairs
(property, value) specifying the characteristics of the product. Once a
query is submitted it is then communicated to the server side which
analyses it and in its turn performs an http request to an eCommerce
application server (like Amazon). This latter responds by returning
an XML file representing a set of elements where each element
defines an item of the searched product with its specific
characteristics. The XML file is analyzed on the server side and then
items are displayed on the mobile device client along with its
relevant characteristics in the chosen language.