Abstract: The volume of XML data exchange is explosively
increasing, and the need for efficient mechanisms of XML data
management is vital. Many XML storage models have been proposed
for storing XML DTD-independent documents in relational database
systems. Benchmarking is the best way to highlight pros and cons of
different approaches. In this study, we use a common benchmarking
scheme, known as XMark to compare the most cited and newly
proposed DTD-independent methods in terms of logical reads,
physical I/O, CPU time and duration. We show the effect of Label
Path, extracting values and storing in another table and type of join
needed for each method-s query answering.
Abstract: XML data consists of a very flexible tree-structure
which makes it difficult to support the storing and retrieving of XML
data. The node numbering scheme is one of the most popular
approaches to store XML in relational databases. Together with the
node numbering storage scheme, structural joins can be used to
efficiently process the hierarchical relationships in XML. However, in
order to process a tree-structured XPath query containing several
hierarchical relationships and conditional sentences on XML data,
many structural joins need to be carried out, which results in a high
query execution cost. This paper introduces mechanisms to reduce the
XPath queries including branch nodes into a much more efficient form
with less numbers of structural joins. A two step approach is proposed.
The first step merges duplicate nodes in the tree-structured query and
the second step divides the query into sub-queries, shortens the paths
and then merges the sub-queries back together. The proposed
approach can highly contribute to the efficient execution of XML
queries. Experimental results show that the proposed scheme can
reduce the query execution cost by up to an order of magnitude of the
original execution cost.
Abstract: XML is an important standard of data exchange and
representation. As a mature database system, using relational database
to support XML data may bring some advantages. But storing XML in
relational database has obvious redundancy that wastes disk space,
bandwidth and disk I/O when querying XML data. For the efficiency
of storage and query XML, it is necessary to use compressed XML
data in relational database. In this paper, a compressed relational
database technology supporting XML data is presented. Original
relational storage structure is adaptive to XPath query process. The
compression method keeps this feature. Besides traditional relational
database techniques, additional query process technologies on
compressed relations and for special structure for XML are presented.
In this paper, technologies for XQuery process in compressed
relational database are presented..
Abstract: With the tremendous growth of World Wide Web
(WWW) data, there is an emerging need for effective information
retrieval at the document level. Several query languages such as
XML-QL, XPath, XQL, Quilt and XQuery are proposed in recent
years to provide faster way of querying XML data, but they still lack of
generality and efficiency. Our approach towards evolving a framework
for querying semistructured documents is based on formal query
algebra. Two elements are introduced in the proposed framework:
first, a generic and flexible data model for logical representation of
semistructured data and second, a set of operators for the manipulation
of objects defined in the data model. In additional to accommodating
several peculiarities of semistructured data, our model offers novel
features such as bidirectional paths for navigational querying and
partitions for data transformation that are not available in other
proposals.
Abstract: XML is becoming a de facto standard for online data exchange. Existing XML filtering techniques based on a publish/subscribe model are focused on the highly structured data marked up with XML tags. These techniques are efficient in filtering the documents of data-centric XML but are not effective in filtering the element contents of the document-centric XML. In this paper, we propose an extended XPath specification which includes a special matching character '%' used in the LIKE operation of SQL in order to solve the difficulty of writing some queries to adequately filter element contents using the previous XPath specification. We also present a novel technique for filtering a collection of document-centric XMLs, called Pfilter, which is able to exploit the extended XPath specification. We show several performance studies, efficiency and scalability using the multi-query processing time (MQPT).
Abstract: The volume of XML data exchange is explosively increasing, and the need for efficient mechanisms of XML data management is vital. Many XML storage models have been proposed for storing XML DTD-independent documents in relational database systems. Benchmarking is the best way to highlight pros and cons of different approaches. In this study, we use a common benchmarking scheme, known as XMark to compare the most cited and newly proposed DTD-independent methods in terms of logical reads, physical I/O, CPU time and duration. We show the effect of Label Path, extracting values and storing in another table and type of join needed for each method's query answering.