An Efficient Graph Query Algorithm Based on Important Vertices and Decision Features
Graph has become increasingly important in modeling
complicated structures and schemaless data such as proteins, chemical
compounds, and XML documents. Given a graph query, it is desirable
to retrieve graphs quickly from a large database via graph-based
indices. Different from the existing methods, our approach, called
VFM (Vertex to Frequent Feature Mapping), makes use of vertices
and decision features as the basic indexing feature. VFM constructs
two mappings between vertices and frequent features to answer graph
queries. The VFM approach not only provides an elegant solution to
the graph indexing problem, but also demonstrates how database
indexing and query processing can benefit from data mining,
especially frequent pattern mining. The results show that the proposed
method not only avoids the enumeration method of getting subgraphs
of query graph, but also effectively reduces the subgraph isomorphism
tests between the query graph and graphs in candidate answer set in
verification stage.
[1] X. Yan, P. S. Yu, et al. "Graph Indexing: A Frequent Structure based
Approach", in Proc. SIGMOD'04, 2004, p. 335-346.
[2] S. Zhang, M. Hu, et al. "TreePi: A Novel Graph Indexing Method", in Proc.
ICDE'07, 2007, p. 966-975.
[3] J. Cheng, Y. Ke, et al. "FG-Index: Towards Verification Free Query
Processing on Graph Databases", in Proc. SIGMOD'07, 2007, p. 857-872.
[4] P. Zhao, J. X. Yu, et al. "Graph Indexing: Tree + Delta >= Graph", in Proc.
VLDB'07, 2007, p. 938-949.
[5] X. Yan, P. S. Yu, et al."Substructure Similarity Search in Graph Databases",
in Proc, SIGMOD'05, 2005,p.766-777.
[6] X. Yan, F. Zhu, et al. Searching Substructures with Superimposed
Distance", in Proc. ICDE'06, 2006, p.88.
[7] D. Shasha, J. T. Wang, et al. "Algorithmics and Applications of Tree and
Graph Searching", in Proc. PODS'02, 2002, p. 39-52.
[8] H. He, A. K. Singh. "Closure-Tree: An Index Structure for Graph Queries",
in Proc. ICDE'06, 2006, p. 38.
[9] D. W. Williams, J. Huan, et al. "Graph Database Indexing Using Structured
Graph Decomposition", in Proc. ICDE'07, 2007, p. 976-985.
[10] H. Jiang, H. Wang, et al. "GString: A Novel Approach for Efficient Search in
Graph Databases", inProc. ICDE'07, 2007, p. 566-575.
[11] L. Zou, L. Chen, et al. "A novel spectral coding in a large graph database", in
Proc. EDBT'08, 2008, p. 181-192.
[12] K. Gupta, D. Suciu. "Stream Processing of XPath Queries with Predicate"s,
in Proc. SIGMOD, 2003, p. 419-430.
[13] P. Bohannon, W. Fan, et al. Narayan, "Information Preserving XML
Schema Embedding", in Proc. VLDB, 2005, p. 85-96.
[14] M. Garey, D. Johnson, Javier Bezos. Computers and Intractability: A Guide
to the Theory of NP-Completeness. W. H. Freeman and Company, 1979.
[15] B. H. Bloom. Space/time trade-offs in hash coding with allowable errors.
Commun. ACM, 13(7):422-426, 1970.
[16] M. Kuramochi, and G. Karypis. Frequent subgraph discovery. ICDE, 2001
[1] X. Yan, P. S. Yu, et al. "Graph Indexing: A Frequent Structure based
Approach", in Proc. SIGMOD'04, 2004, p. 335-346.
[2] S. Zhang, M. Hu, et al. "TreePi: A Novel Graph Indexing Method", in Proc.
ICDE'07, 2007, p. 966-975.
[3] J. Cheng, Y. Ke, et al. "FG-Index: Towards Verification Free Query
Processing on Graph Databases", in Proc. SIGMOD'07, 2007, p. 857-872.
[4] P. Zhao, J. X. Yu, et al. "Graph Indexing: Tree + Delta >= Graph", in Proc.
VLDB'07, 2007, p. 938-949.
[5] X. Yan, P. S. Yu, et al."Substructure Similarity Search in Graph Databases",
in Proc, SIGMOD'05, 2005,p.766-777.
[6] X. Yan, F. Zhu, et al. Searching Substructures with Superimposed
Distance", in Proc. ICDE'06, 2006, p.88.
[7] D. Shasha, J. T. Wang, et al. "Algorithmics and Applications of Tree and
Graph Searching", in Proc. PODS'02, 2002, p. 39-52.
[8] H. He, A. K. Singh. "Closure-Tree: An Index Structure for Graph Queries",
in Proc. ICDE'06, 2006, p. 38.
[9] D. W. Williams, J. Huan, et al. "Graph Database Indexing Using Structured
Graph Decomposition", in Proc. ICDE'07, 2007, p. 976-985.
[10] H. Jiang, H. Wang, et al. "GString: A Novel Approach for Efficient Search in
Graph Databases", inProc. ICDE'07, 2007, p. 566-575.
[11] L. Zou, L. Chen, et al. "A novel spectral coding in a large graph database", in
Proc. EDBT'08, 2008, p. 181-192.
[12] K. Gupta, D. Suciu. "Stream Processing of XPath Queries with Predicate"s,
in Proc. SIGMOD, 2003, p. 419-430.
[13] P. Bohannon, W. Fan, et al. Narayan, "Information Preserving XML
Schema Embedding", in Proc. VLDB, 2005, p. 85-96.
[14] M. Garey, D. Johnson, Javier Bezos. Computers and Intractability: A Guide
to the Theory of NP-Completeness. W. H. Freeman and Company, 1979.
[15] B. H. Bloom. Space/time trade-offs in hash coding with allowable errors.
Commun. ACM, 13(7):422-426, 1970.
[16] M. Kuramochi, and G. Karypis. Frequent subgraph discovery. ICDE, 2001
@article{"International Journal of Information, Control and Computer Sciences:58899", author = "Xiantong Li and Jianzhong Li", title = "An Efficient Graph Query Algorithm Based on Important Vertices and Decision Features", abstract = "Graph has become increasingly important in modeling
complicated structures and schemaless data such as proteins, chemical
compounds, and XML documents. Given a graph query, it is desirable
to retrieve graphs quickly from a large database via graph-based
indices. Different from the existing methods, our approach, called
VFM (Vertex to Frequent Feature Mapping), makes use of vertices
and decision features as the basic indexing feature. VFM constructs
two mappings between vertices and frequent features to answer graph
queries. The VFM approach not only provides an elegant solution to
the graph indexing problem, but also demonstrates how database
indexing and query processing can benefit from data mining,
especially frequent pattern mining. The results show that the proposed
method not only avoids the enumeration method of getting subgraphs
of query graph, but also effectively reduces the subgraph isomorphism
tests between the query graph and graphs in candidate answer set in
verification stage.", keywords = "Decision Feature, Frequent Feature, Graph Dataset,Graph Query", volume = "3", number = "11", pages = "2650-8", }