Abstract: Demand over web services is in growing with increases number of Web users. Web service is applied by Web application. Web application size is affected by its user-s requirements and interests. Differential in requirements and interests lead to growing of Web application size. The efficient way to save store spaces for more data and information is achieved by implementing algorithms to compress the contents of Web application documents. This paper introduces an algorithm to reduce Web application size based on reduction of the contents of HTML files. It removes unimportant contents regardless of the HTML file size. The removing is not ignored any character that is predicted in the HTML building process.
Abstract: Text Mining is an important step of Knowledge
Discovery process. It is used to extract hidden information from notstructured
o semi-structured data. This aspect is fundamental because
much of the Web information is semi-structured due to the nested
structure of HTML code, much of the Web information is linked,
much of the Web information is redundant. Web Text Mining helps
whole knowledge mining process to mining, extraction and
integration of useful data, information and knowledge from Web
page contents.
In this paper, we present a Web Text Mining process able to
discover knowledge in a distributed and heterogeneous multiorganization
environment. The Web Text Mining process is based on
flexible architecture and is implemented by four steps able to
examine web content and to extract useful hidden information
through mining techniques. Our Web Text Mining prototype starts
from the recovery of Web job offers in which, through a Text Mining
process, useful information for fast classification of the same are
drawn out, these information are, essentially, job offer place and
skills.