A New Model for Discovering XML Association Rules from XML Documents
The inherent flexibility of XML in both structure
and semantics makes mining XML data a complex task that poses
more challenges than traditional association rule mining in
relational databases. In this paper, we propose a new model for the
effective extraction of generalized association rules from an XML
document collection. We use frequent subtree mining techniques
directly in the discovery process and preserve the tree
structure of the data in the final rules. The subtrees that are frequent
with respect to the user-provided support threshold are split into
complement subtrees to form the rules. We explain our model in multiple
steps, from data preparation to rule generation.
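The abstract does not spell out how a frequent subtree is split into a rule, so the following is only a minimal sketch of the general idea under simplifying assumptions, not the authors' algorithm. It assumes each document can be reduced to its set of root-to-node tag paths (which ignores repeated siblings and text values), that a subtree pattern is a prefix-closed set of such paths, and that a rule's confidence is the support of the whole subtree divided by the support of its antecedent. The names `label_paths`, `support`, `is_prefix_closed`, and `split_rules` are hypothetical.

```python
import xml.etree.ElementTree as ET
from itertools import combinations


def label_paths(xml_text):
    """Return the set of root-to-node tag paths of one XML document."""
    paths = set()

    def walk(node, prefix):
        path = prefix + (node.tag,)
        paths.add(path)
        for child in node:
            walk(child, path)

    walk(ET.fromstring(xml_text), ())
    return paths


def support(pattern, docs):
    """Fraction of documents whose path set contains every path of the pattern."""
    return sum(1 for d in docs if pattern <= d) / len(docs)


def is_prefix_closed(paths):
    """True if every non-root path has its parent path in the set (a rooted subtree)."""
    return all(len(p) == 1 or p[:-1] in paths for p in paths)


def split_rules(pattern, docs, min_conf):
    """Split a frequent pattern into (antecedent, complement, confidence) triples.

    Assumption: the antecedent must itself be a rooted subtree of the pattern,
    and the consequent is whatever remains of the pattern (its complement).
    """
    pattern = frozenset(pattern)
    whole = support(pattern, docs)
    if whole == 0:
        return []
    rules = []
    items = sorted(pattern)
    for k in range(1, len(items)):
        for subset in combinations(items, k):
            antecedent = frozenset(subset)
            if not is_prefix_closed(antecedent):
                continue
            conf = whole / support(antecedent, docs)
            if conf >= min_conf:
                rules.append((antecedent, pattern - antecedent, conf))
    return rules


# Toy collection: four hypothetical order documents.
docs = [label_paths(x) for x in (
    "<order><book/><cd/></order>",
    "<order><book/><cd/></order>",
    "<order><book/></order>",
    "<order><cd/></order>",
)]
pattern = {("order",), ("order", "book"), ("order", "cd")}
rules = split_rules(pattern, docs, min_conf=0.6)
```

On this toy collection the pattern occurs in two of four documents, and the rules that pass the confidence threshold are the two whose antecedent is `order/book` or `order/cd` together with the root. Enumerating all subsets is exponential in the pattern size, so this form is only suitable for illustrating the split on small patterns.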
@article{"International Journal of Information, Control and Computer Sciences:59549", author = "R. AliMohammadzadeh and M. Rahgozar and A. Zarnani", title = "A New Model for Discovering XML Association Rules from XML Documents", abstract = "The inherent flexibility of XML in both structure
and semantics makes mining XML data a complex task that poses
more challenges than traditional association rule mining in
relational databases. In this paper, we propose a new model for the
effective extraction of generalized association rules from an XML
document collection. We use frequent subtree mining techniques
directly in the discovery process and preserve the tree
structure of the data in the final rules. The subtrees that are frequent
with respect to the user-provided support threshold are split into
complement subtrees to form the rules. We explain our model in multiple
steps, from data preparation to rule generation.", keywords = "XML, Data Mining, Association Rule Mining.", volume = "2", number = "9", pages = "3140-5", }