A Two-Step Approach for Tree-structured XPath Query Reduction
XML data consists of a very flexible tree-structure
which makes it difficult to support the storing and retrieving of XML
data. The node numbering scheme is one of the most popular
approaches to store XML in relational databases. Together with the
node numbering storage scheme, structural joins can be used to
efficiently process the hierarchical relationships in XML. However, in
order to process a tree-structured XPath query containing several
hierarchical relationships and conditional sentences on XML data,
many structural joins need to be carried out, which results in a high
query execution cost. This paper introduces mechanisms to reduce the
XPath queries including branch nodes into a much more efficient form
with less numbers of structural joins. A two step approach is proposed.
The first step merges duplicate nodes in the tree-structured query and
the second step divides the query into sub-queries, shortens the paths
and then merges the sub-queries back together. The proposed
approach can highly contribute to the efficient execution of XML
queries. Experimental results show that the proposed scheme can
reduce the query execution cost by up to an order of magnitude of the
original execution cost.
[1] Quanzhong Li and Bongki Moon. Indexing and querying XML data for
regular path expressions. In Proc. of the 27th VLDB conference, Rome,
Italy, Sep. 2001.
[2] Chun Zhang, Jeffrey F. Naughton, Qiong Luo, and David J. DeWitt, and
Guy M. Lohman. On supporting containment queries in relational
database management systems. SIGMOD conference, Santa Barbara, CA,
USA, May 2001.
[3] Divesh Srivastava, Shurug Al-Khalifa, H. V. Jagadish, Nick Koudas,
Jinesh M. Patel, and Yuqing Wu. Structural joins: A primitive for efficient
XML query pattern matching. IEEE conference on Data Engineering, San
Jose, USA, Feb. 2002.
[4] Shu-Yao Chien, Zografoula Vagena, Donghui Zhang, vassilis J. Tsotras,
and Carlo Zaniolo. Efficient structural joins on indexed XML documents.
28th VLDB conference, p.263-274, Hong Kong, China, Aug. 2002.
[5] Yuqing Wu, Jignesh M. Patel, and H. V. Jagadish, Structural Join Order
Selection for XML Query Optimization. IEEE conference on Data
Engineering, p. 443-454, Bangalore, India, March 2003.
[6] Nicolas Bruno, Nick Koudas, and Divesh Srivastava, Holistic twig joins:
optimal XML pattern matching, SIGMOD conference, p. 311-321,
Madison, Wisconsin, USA, Jun. 2002.
[7] Haifeng Jiang, Wei Wang, Hongjun Lu, and Jeffrey Xu Yu, Holistic Twig
Joins on Indexed XML Documents, 29th VLDB conference, p. 273-284,
Berlin, Germany, Sep. 2003.
[8] Mary Fernandez and Dan Suciu. Optimizing regular path expressions
using graph schemas. IEEE Conference on Data Engineering, p. 4-13,
Orlando, Florida, Feb. 1998.
[9] Hyoseop Shin, Minsoo Lee, An Efficient Branch Query Rewriting
Algorithm for XML Query Optimization, ODBASE 2005, LNCS 3761,
Springer-Verlag, p. 1629-1639, Agia Napa, Cyprus, October 31 -
November 4, 2005.
[10] Albrecht Schmidt, Florian Waas, Martin L. Kersten, Michael J. Carey,
Ioana Manolescu, Ralph Busse. XMark: A Benchmark for XML Data
Management. In Proc. of the 28th VLDB conference, p. 974-985, Hong
Kong, China, Aug. 2002.
[11] Sleepycat Software Inc., http://www.sleepycat.com.
[1] Quanzhong Li and Bongki Moon. Indexing and querying XML data for
regular path expressions. In Proc. of the 27th VLDB conference, Rome,
Italy, Sep. 2001.
[2] Chun Zhang, Jeffrey F. Naughton, Qiong Luo, and David J. DeWitt, and
Guy M. Lohman. On supporting containment queries in relational
database management systems. SIGMOD conference, Santa Barbara, CA,
USA, May 2001.
[3] Divesh Srivastava, Shurug Al-Khalifa, H. V. Jagadish, Nick Koudas,
Jinesh M. Patel, and Yuqing Wu. Structural joins: A primitive for efficient
XML query pattern matching. IEEE conference on Data Engineering, San
Jose, USA, Feb. 2002.
[4] Shu-Yao Chien, Zografoula Vagena, Donghui Zhang, vassilis J. Tsotras,
and Carlo Zaniolo. Efficient structural joins on indexed XML documents.
28th VLDB conference, p.263-274, Hong Kong, China, Aug. 2002.
[5] Yuqing Wu, Jignesh M. Patel, and H. V. Jagadish, Structural Join Order
Selection for XML Query Optimization. IEEE conference on Data
Engineering, p. 443-454, Bangalore, India, March 2003.
[6] Nicolas Bruno, Nick Koudas, and Divesh Srivastava, Holistic twig joins:
optimal XML pattern matching, SIGMOD conference, p. 311-321,
Madison, Wisconsin, USA, Jun. 2002.
[7] Haifeng Jiang, Wei Wang, Hongjun Lu, and Jeffrey Xu Yu, Holistic Twig
Joins on Indexed XML Documents, 29th VLDB conference, p. 273-284,
Berlin, Germany, Sep. 2003.
[8] Mary Fernandez and Dan Suciu. Optimizing regular path expressions
using graph schemas. IEEE Conference on Data Engineering, p. 4-13,
Orlando, Florida, Feb. 1998.
[9] Hyoseop Shin, Minsoo Lee, An Efficient Branch Query Rewriting
Algorithm for XML Query Optimization, ODBASE 2005, LNCS 3761,
Springer-Verlag, p. 1629-1639, Agia Napa, Cyprus, October 31 -
November 4, 2005.
[10] Albrecht Schmidt, Florian Waas, Martin L. Kersten, Michael J. Carey,
Ioana Manolescu, Ralph Busse. XMark: A Benchmark for XML Data
Management. In Proc. of the 28th VLDB conference, p. 974-985, Hong
Kong, China, Aug. 2002.
[11] Sleepycat Software Inc., http://www.sleepycat.com.
@article{"International Journal of Information, Control and Computer Sciences:61919", author = "Minsoo Lee and Yun-mi Kim and Yoon-kyung Lee", title = "A Two-Step Approach for Tree-structured XPath Query Reduction", abstract = "XML data consists of a very flexible tree-structure
which makes it difficult to support the storing and retrieving of XML
data. The node numbering scheme is one of the most popular
approaches to store XML in relational databases. Together with the
node numbering storage scheme, structural joins can be used to
efficiently process the hierarchical relationships in XML. However, in
order to process a tree-structured XPath query containing several
hierarchical relationships and conditional sentences on XML data,
many structural joins need to be carried out, which results in a high
query execution cost. This paper introduces mechanisms to reduce the
XPath queries including branch nodes into a much more efficient form
with less numbers of structural joins. A two step approach is proposed.
The first step merges duplicate nodes in the tree-structured query and
the second step divides the query into sub-queries, shortens the paths
and then merges the sub-queries back together. The proposed
approach can highly contribute to the efficient execution of XML
queries. Experimental results show that the proposed scheme can
reduce the query execution cost by up to an order of magnitude of the
original execution cost.", keywords = "XML, Xpath, tree-structured query, query reduction.", volume = "1", number = "10", pages = "3237-6", }