Elimination of Redundant Links in Web Pages: A Mathematical Approach

With the enormous growth of the web, users easily get lost in its rich hyperlink structure. The primary task for website owners is therefore to develop user-friendly, automated tools that give users the relevant information they need, free of redundant links. Most existing web mining algorithms concentrate on finding frequent patterns and neglect the less frequent ones, which are likely to contain outlying data such as noise and irrelevant or redundant data. This paper proposes a new algorithm for mining web content that detects redundant links in web documents using set-theoretic (classical mathematics) operations such as subset, union, and intersection. The redundant links are then removed from the original web content, leaving the information the user requires.
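The abstract does not spell out the algorithm, so the following is only a minimal sketch of how set operations could flag redundant links, not the authors' method. It assumes a link is redundant if it repeats an earlier link on the same page or already appears in a reference set of links; the names `LinkCollector`, `normalize`, `extract_links`, and `remove_redundant_links` are hypothetical, and the normalization rule (drop the fragment and a trailing slash) is an illustrative choice.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag


class LinkCollector(HTMLParser):
    """Collects the href values of <a> tags in document order."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def normalize(href):
    # Assumed rule: two hrefs denote the same link if they differ only
    # by a fragment or a trailing slash.
    url, _fragment = urldefrag(href.strip())
    return url.rstrip("/")


def extract_links(html):
    parser = LinkCollector()
    parser.feed(html)
    return [normalize(h) for h in parser.links]


def remove_redundant_links(page_links, reference_links=()):
    """Return (kept, redundant): kept preserves order, redundant is a set."""
    seen = set(reference_links)  # links the user has already been offered
    kept, redundant = [], set()
    for link in page_links:
        if link in seen:         # membership test: repeat or in reference set
            redundant.add(link)
        else:
            seen.add(link)       # union of what has been seen so far
            kept.append(link)
    return kept, redundant


html = (
    '<a href="http://a.com/x">1</a>'
    '<a href="http://a.com/x/">2</a>'
    '<a href="http://a.com/y#top">3</a>'
    '<a href="http://b.com">4</a>'
)
links = extract_links(html)
kept, redundant = remove_redundant_links(links, {"http://b.com"})
print(kept)
print(set(kept) <= set(links))               # kept links are a subset of the page
print(set(kept) & {"http://b.com"} == set()) # no overlap with the reference set
```

Keeping `kept` as a list preserves the order in which links appear on the page, while the subset and intersection checks at the end express the intended properties of the result in set terms.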



