A Tree Based Association Rule Approach for XML Data with Semantic Integration

The use of eXtensible Markup Language (XML) in
web, business and scientific databases has led to the development of
methods, techniques and systems to manage and analyze XML data.
Semi-structured documents are difficult to process because of their
heterogeneity and high dimensionality. XML structure and content mining
represents a convergence of research in semi-structured data and text mining. As
the information available on the internet grows rapidly, extracting
knowledge from XML documents becomes harder. Indeed,
documents are often so large that the data set returned as the answer to a
query may itself be too big to convey the required information. To
improve query answering, a Semantic Tree Based Association
Rule (STAR) mining method is proposed. This method provides
intensional information by considering the structure, the content and the
semantics of the content. The method is applied to the Reuters dataset,
and the results show that the proposed method performs well.




