An Efficient Graph Query Algorithm Based on Important Vertices and Decision Features

Graph has become increasingly important in modeling complicated structures and schemaless data such as proteins, chemical compounds, and XML documents. Given a graph query, it is desirable to retrieve graphs quickly from a large database via graph-based indices. Different from the existing methods, our approach, called VFM (Vertex to Frequent Feature Mapping), makes use of vertices and decision features as the basic indexing feature. VFM constructs two mappings between vertices and frequent features to answer graph queries. The VFM approach not only provides an elegant solution to the graph indexing problem, but also demonstrates how database indexing and query processing can benefit from data mining, especially frequent pattern mining. The results show that the proposed method not only avoids the enumeration method of getting subgraphs of query graph, but also effectively reduces the subgraph isomorphism tests between the query graph and graphs in candidate answer set in verification stage.




References:
[1] X. Yan, P. S. Yu, et al. "Graph Indexing: A Frequent Structure based
Approach", in Proc. SIGMOD'04, 2004, p. 335-346.
[2] S. Zhang, M. Hu, et al. "TreePi: A Novel Graph Indexing Method", in Proc.
ICDE'07, 2007, p. 966-975.
[3] J. Cheng, Y. Ke, et al. "FG-Index: Towards Verification Free Query
Processing on Graph Databases", in Proc. SIGMOD'07, 2007, p. 857-872.
[4] P. Zhao, J. X. Yu, et al. "Graph Indexing: Tree + Delta >= Graph", in Proc.
VLDB'07, 2007, p. 938-949.
[5] X. Yan, P. S. Yu, et al."Substructure Similarity Search in Graph Databases",
in Proc, SIGMOD'05, 2005,p.766-777.
[6] X. Yan, F. Zhu, et al. Searching Substructures with Superimposed
Distance", in Proc. ICDE'06, 2006, p.88.
[7] D. Shasha, J. T. Wang, et al. "Algorithmics and Applications of Tree and
Graph Searching", in Proc. PODS'02, 2002, p. 39-52.
[8] H. He, A. K. Singh. "Closure-Tree: An Index Structure for Graph Queries",
in Proc. ICDE'06, 2006, p. 38.
[9] D. W. Williams, J. Huan, et al. "Graph Database Indexing Using Structured
Graph Decomposition", in Proc. ICDE'07, 2007, p. 976-985.
[10] H. Jiang, H. Wang, et al. "GString: A Novel Approach for Efficient Search in
Graph Databases", inProc. ICDE'07, 2007, p. 566-575.
[11] L. Zou, L. Chen, et al. "A novel spectral coding in a large graph database", in
Proc. EDBT'08, 2008, p. 181-192.
[12] K. Gupta, D. Suciu. "Stream Processing of XPath Queries with Predicate"s,
in Proc. SIGMOD, 2003, p. 419-430.
[13] P. Bohannon, W. Fan, et al. Narayan, "Information Preserving XML
Schema Embedding", in Proc. VLDB, 2005, p. 85-96.
[14] M. Garey, D. Johnson, Javier Bezos. Computers and Intractability: A Guide
to the Theory of NP-Completeness. W. H. Freeman and Company, 1979.
[15] B. H. Bloom. Space/time trade-offs in hash coding with allowable errors.
Commun. ACM, 13(7):422-426, 1970.
[16] M. Kuramochi, and G. Karypis. Frequent subgraph discovery. ICDE, 2001