Abstract: This research explores how well the extended model of internet commerce adoption (eMICA) can be used to determine the extent of internet commerce adoption in the travel agency sector in both Egypt and the Kingdom of Saudi Arabia (KSA). Web content analysis was used to analyze the level of adoption of Egyptian and Saudi travel agencies according to the data available on their websites. To this end, 120 websites were evaluated by the two authors over a three-month period, from August to October 2020, and each site was then categorized according to the phases and levels of eMICA. The results show deficiencies in the application of the eMICA model by both KSA and Egyptian travel agencies, generally in updating their websites, providing quality certification, and offering secure online payment, virtual tours, and videos using Flash animation. In general, the Egyptian companies slightly outperformed the KSA ones in applying the eMICA model.
Abstract: Nowadays, the Web has become one of the most
pervasive platforms for information exchange and retrieval. It gathers
the suitable, well-fitting information that one requires from
websites. Data mining is a way of extracting the data available on the
internet. Web mining is one of the elements of data mining
techniques, and it relates to various research communities such as
information retrieval, database management systems, and artificial
intelligence. In this paper we discuss the concepts of Web
mining. We focus mainly on one of the categories of
Web mining, namely Web Content Mining, and its various
tasks. Mining tools are essential for scanning the many
images, text, and HTML documents, and the result is then used by
the various search engines. We conclude by presenting a comparative
table of these tools based on some pertinent criteria.
Abstract: This article deals with the popularity of candidates for President of the United States of America. Popularity is assessed according to public comments on Web 2.0. Social networking, blogging, and online forums (collectively, Web 2.0) are the easiest way for ordinary Internet users to share their personal opinions, thoughts, and ideas with the entire world. However, the diversity of web content, the variety of technologies, and the differences in website structure make Web 2.0 a network of heterogeneous data in which ordinary users find things difficult to locate. The introductory part of the article describes the methodology for gathering and processing data from Web 2.0. The next part focuses on the evaluation and content analysis of the information obtained about the presidential candidates.
Abstract: A website plays a significant role in the success of an e-business. It is the main starting point of any organization or corporation for its customers, so it is important to customize and design it according to visitors' preferences. Websites are also a place to introduce an organization's services and to highlight new services to visitors and audiences. In this paper, we apply web usage mining techniques, a new field of research in data mining and knowledge discovery, to an Iranian government website. Using the results, a framework for web content layout is proposed. An agent is designed to dynamically update and improve the locations and layout of web links. We then explain how it enables the organization's top managers to directly influence the arrangement of web content and to enhance the customization of website navigation based on online users' behavior.
Abstract: The emergence of the Internet has brought about a
revolution in information storage and retrieval. As most of the
data on the web is unstructured and contains a mix of text,
video, audio, etc., there is a need to mine information to cater to
the specific needs of users without loss of important
hidden information. Thus, developing user-friendly and
automated tools for providing relevant information quickly
becomes a major challenge in web mining research. Most of
the existing web mining algorithms have concentrated on
finding frequent patterns while neglecting the less frequent
ones that are likely to contain outlying data such as noise,
irrelevant and redundant data. This paper mainly focuses on
the Signed approach and full word matching against the organized
domain dictionary for mining web content outliers. The
Signed approach yields the relevant web documents as well as
the outlying web documents. As the dictionary is organized based
on the number of characters in a word, searching and retrieval
of documents takes less time and less space.
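A minimal sketch of the length-organized dictionary and full word matching mentioned above. It does not reproduce the paper's Signed approach, whose details the abstract does not give; the function names, the relevance score, and the 0.3 threshold are all assumptions made for illustration.

```python
from collections import defaultdict


def build_length_index(domain_words):
    """Group domain dictionary words by their number of characters."""
    index = defaultdict(set)
    for word in domain_words:
        index[len(word)].add(word.lower())
    return index


def relevance(document, index):
    """Fraction of document words that fully match a dictionary word.

    Each word is looked up only in the bucket for its own length,
    so the other buckets are never scanned.
    """
    words = document.lower().split()
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in index.get(len(w), ()))
    return hits / len(words)


def split_outliers(documents, index, threshold=0.3):
    """Separate relevant documents from outliers (threshold is hypothetical)."""
    relevant, outliers = [], []
    for doc in documents:
        if relevance(doc, index) >= threshold:
            relevant.append(doc)
        else:
            outliers.append(doc)
    return relevant, outliers
```

Bucketing by word length means a lookup touches only words of the same length, which is one plausible reading of the time saving the abstract claims.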
Abstract: There are many problems associated with the World Wide
Web: users get lost in hyperspace, web content is still accessible only
to humans, and web administration is difficult. The solution to these
problems is the Semantic Web, which is considered to be an extension
of the current web that presents information in both human-readable and
machine-processable form. The aim of this study is to reach a new
generic foundation architecture for the Semantic Web, because there
is no clear architecture for it: there are four versions, but up to
now there is no agreement on any one of these versions, nor is there a
clear picture of the relation between the different layers and
technologies inside this architecture. This can be done by building on
the ideas of the previous versions as well as Gerber's evaluation method,
as a step toward agreement on one Semantic Web architecture.
Abstract: With the advent of emerging personal computing paradigms such as ubiquitous and mobile computing, Web contents are becoming accessible from a wide range of mobile devices. Since these devices do not have the same rendering capabilities, Web contents need to be adapted for transparent access from a variety of client agents. Such content adaptation results in better rendering and faster delivery to the client device. Nevertheless, Web content adaptation sets new challenges for semantic markup. This paper presents an advanced components platform, called MorfeoSMC, that enables the development of mobility applications and services according to a channel model based on Services Oriented Architecture (SOA) principles. It then goes on to describe the potential for integration with the Semantic Web through a novel framework of external semantic annotation of mobile Web contents. The role of semantic annotation in this framework is to describe the contents of the individual documents themselves, ensuring that the semantics are preserved while content rendering is adapted, and to exploit these semantic annotations in a novel user profile-aware content adaptation process. Semantic Web content adaptation is a way of adding value to Web contents and facilitating their repurposing (enhanced browsing, Web Services location and access, etc.).
Abstract: The rapid expansion of the web is causing
constant growth of information, leading to several problems such as
the increased difficulty of extracting potentially useful knowledge. Web
content mining confronts this problem by gathering explicit information
from different web sites for access and knowledge discovery.
Query interfaces of web databases share common building blocks.
After extracting information with parsing approach, we use a new
data mining algorithm to match a large number of schemas in
databases at a time. Using this algorithm increases the speed of
information matching. In addition, instead of simple 1:1 matching,
it performs complex (m:n) matching between query interfaces. In this
paper we present a novel correlation mining algorithm that matches
correlated attributes at a smaller cost. This algorithm uses the Jaccard
measure to distinguish positively and negatively correlated attributes.
After that, the system matches the user query with different query
interfaces in a specific domain and finally chooses the query
interface nearest to the user query to answer it.
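As an illustration of the Jaccard measure named above, the sketch below scores attribute pairs by how often they co-occur across query interface schemas. It is not the paper's algorithm: representing each schema as a set of attribute names, and the two thresholds, are assumptions made for this example.

```python
from itertools import combinations


def jaccard(attr_a, attr_b, schemas):
    """Jaccard measure of two attributes over a list of schemas.

    Each schema is a set of attribute names. The measure is the number of
    schemas containing both attributes divided by the number containing
    at least one of them.
    """
    both = sum(1 for s in schemas if attr_a in s and attr_b in s)
    either = sum(1 for s in schemas if attr_a in s or attr_b in s)
    return both / either if either else 0.0


def classify_pairs(schemas, pos_threshold=0.5, neg_threshold=0.1):
    """Split attribute pairs into positively and negatively correlated.

    Both thresholds are hypothetical; pairs that fall between them are
    left unclassified.
    """
    attributes = sorted(set().union(*schemas))
    positive, negative = [], []
    for a, b in combinations(attributes, 2):
        score = jaccard(a, b, schemas)
        if score >= pos_threshold:
            positive.append((a, b))
        elif score <= neg_threshold:
            negative.append((a, b))
    return positive, negative
```

In this reading, attributes that rarely appear together (for example two alternative names for the same field) receive a low score, while attributes that usually appear together receive a high one.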
Abstract: The evolution of information and communication
technology has provided powerful support for the improvement of
online learning platforms in the creation of courses. This paper presents a
study that explores a new web architecture for creating an
online learning system that adapts to learners' profiles, using the Web
as a source for the automatic creation of courses for the online
training platform. This architecture will reduce the time and
effort spent by course drafters on the current e-learning
platform, and direct adaptation of Web content will greatly enrich
the quality of online training courses.
Abstract: In the last decade digital watermarking procedures have
become increasingly applied to implement the copyright protection
of multimedia digital contents distributed on the Internet. To this
end, it is worth noting that a lot of watermarking procedures
for images and videos proposed in literature are based on spread
spectrum techniques. However, some scepticism about the robustness
and security of such watermarking procedures has arisen because
of some documented attacks which claim to render the inserted
watermarks undetectable. On the other hand, web content providers
wish to exploit watermarking procedures characterized by flexible and
efficient implementations and which can be easily integrated in their
existing web services frameworks or platforms. This paper presents
how a simple spread spectrum watermarking procedure for MPEG-2
videos can be modified to be exploited in web contexts. To this end,
the proposed procedure has been made secure and robust against some
well-known and dangerous attacks. Furthermore, its basic scheme
has been optimized by making the insertion procedure adaptive with
respect to the terminals used to open the videos and the network transactions
carried out to deliver them to buyers. Finally, two different
implementations of the procedure have been developed: the former
is a high performance parallel implementation, whereas the latter is
a portable Java and XML based implementation. Thus, the paper
demonstrates that a simple spread spectrum watermarking procedure,
with limited and appropriate modifications to the embedding scheme,
can still represent a valid alternative to many other well-known and
more recent watermarking procedures proposed in literature.
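For readers unfamiliar with spread spectrum watermarking, the following is a minimal sketch of the basic additive scheme on a plain list of coefficients. It is not the MPEG-2 procedure the abstract describes and includes none of its security or adaptivity modifications; the function names, the strength `alpha`, and the detection threshold are assumptions.

```python
import random


def chip_sequence(key, length):
    """Pseudo-random +1/-1 sequence derived from a secret key."""
    rng = random.Random(key)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]


def embed(coefficients, key, alpha=2.0):
    """Add the key's chip sequence, scaled by alpha, to the host coefficients."""
    chips = chip_sequence(key, len(coefficients))
    return [c + alpha * w for c, w in zip(coefficients, chips)]


def correlate(coefficients, key):
    """Average correlation between the coefficients and the key's chip sequence."""
    if not coefficients:
        return 0.0
    chips = chip_sequence(key, len(coefficients))
    return sum(c * w for c, w in zip(coefficients, chips)) / len(coefficients)


def detect(coefficients, key, alpha=2.0):
    """Report a watermark when the correlation exceeds half the strength.

    The alpha / 2 threshold is an illustrative choice, not a tuned value.
    """
    return correlate(coefficients, key) > alpha / 2
```

Because the watermark energy is spread thinly over many coefficients, detection relies on correlating with the same keyed sequence rather than on inspecting any single coefficient.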
Abstract: Online news websites are one of the main and widest areas of mass media. Since the nineties, several Jordanian newspapers have been introduced to the World Wide Web to reach large and varied audiences. Examples of newspapers that have an online version are Al-Rai, Ad-Dustor and AlGhad. Other, purely online news websites include Ammon and Rum. The main aim of this study is to evaluate online newspaper websites using two assessment measures: usability and web content. This aim is achieved by using a questionnaire-based evaluation built on the definitions of usability and web content in ISO standard 9241, part 11. The results are based on 204 audience responses. They show that the usability factor is relatively good for all Jordanian online newspapers, whereas the web content factor is moderate.
Abstract: This paper presents a watermarking protocol able to
solve the well-known “customer's right problem” and “unbinding
problem”. In particular, the protocol has been purposely designed
to be adopted in a web context, where users wanting to buy digital
contents are usually neither provided with digital certificates issued
by certification authorities (CAs) nor able to autonomously perform
specific security actions. Furthermore, the protocol enables users to
keep their identities unexposed during web transactions, while
allowing guilty buyers, i.e. those responsible for distributing illegal
replicas, to be unambiguously identified. Finally, the protocol has
been designed so that web content providers (CPs) can exploit
copyright protection services supplied by web service providers (SPs)
in a security context. Thus, CPs can take advantage of complex
services without having to directly implement them.
Abstract: The development of Internet technology in recent years has led to a more active role of users in creating Web content. This has significant effects both on individual learning and collaborative knowledge building. This paper will present an integrative framework model to describe and explain learning and knowledge building with shared digital artifacts on the basis of Luhmann's systems theory and Piaget's model of equilibration. In this model, knowledge progress is based on cognitive conflicts resulting from incongruities between an individual's prior knowledge and the information which is contained in a digital artifact. Empirical support for the model will be provided by 1) applying it descriptively to texts from Wikipedia, 2) examining knowledge-building processes using a social network analysis, and 3) presenting a survey of a series of experimental laboratory studies.
Abstract: Text Mining is an important step of the Knowledge
Discovery process. It is used to extract hidden information from unstructured
or semi-structured data. This aspect is fundamental because
much of the Web information is semi-structured due to the nested
structure of HTML code, much of it is linked, and
much of it is redundant. Web Text Mining supports
the whole knowledge mining process in the mining, extraction, and
integration of useful data, information, and knowledge from Web
page contents.
In this paper, we present a Web Text Mining process able to
discover knowledge in a distributed and heterogeneous multi-organization
environment. The Web Text Mining process is based on a
flexible architecture and is implemented in four steps that
examine web content and extract useful hidden information
through mining techniques. Our Web Text Mining prototype starts
from the retrieval of Web job offers, from which a Text Mining
process draws out information useful for their fast classification;
this information is, essentially, the place of the job offer and the
skills.
Abstract: The internet has become an attractive avenue for
global e-business, e-learning, knowledge sharing, etc. Due to
continuous increase in the volume of web content, it is not practically
possible for a user to extract information by browsing and integrating
data from a huge amount of web sources retrieved by the existing
search engines. The semantic web technology enables advancement
in information extraction by providing a suite of tools to integrate
data from different sources. To take full advantage of semantic web,
it is necessary to annotate existing web pages into semantic web
pages. This research develops a tool, named OWIE (Ontology-based
Web Information Extraction), for semantic web annotation using
domain specific ontologies. The tool automatically extracts
information from HTML pages with the help of pre-defined ontologies
and gives them semantic representation. Two case studies have been
conducted to analyze the accuracy of OWIE.
Abstract: With the proliferation of the World Wide Web, the
development of web-based technologies, and the growth in web
content, the structure of a website becomes more complex and web
navigation becomes a critical issue for both web designers and users.
In this paper we define content and web pages as two important
and influential factors in website navigation, and frame the
enhancement of website navigation as making useful
changes to the link structure of the website based on the
aforementioned factors. We then suggest a new method for
proposing the changes using a fuzzy approach to optimize the website
architecture. Applying the proposed method to the real case of the Iranian
Civil Aviation Organization (CAO) website, we discuss the results of
the novel approach in the final section.
Abstract: With the rapid growth in business size, today's businesses are orienting towards electronic technologies. Amazon.com and e-bay.com are some of the major stakeholders in this regard. Unfortunately, the enormous size and largely unstructured nature of data on the web, even for a single commodity, have become a cause of ambiguity for consumers. Extracting valuable information from such ever-increasing data is an extremely tedious task and is fast becoming critical to the success of businesses. Web content mining can play a major role in solving these issues. It involves using efficient algorithmic techniques to search and retrieve the desired information from seemingly impossible-to-search unstructured data on the Internet. Applying web content mining can be very promising in the areas of Customer Relations Modeling, billing records, logistics investigations, product cataloguing, and quality management. In this paper we present a review of some interesting, efficient, yet implementable techniques from the field of web content mining and study their impact on the needs of business users, focusing on both the customer and the producer. The techniques reviewed include mining by developing a knowledge-base repository of the domain, iterative refinement of user queries for personalized search, using a graph-based approach to develop a web crawler, and filtering information for personalized search using website captions. These techniques have been analyzed and compared on the basis of their execution time and the relevance of the results they produced for a particular search.
Abstract: The explosive growth of the World Wide Web has posed
a challenging problem in extracting relevant data. Traditional web
crawlers focus only on the surface web while the deep web keeps
expanding behind the scenes. Deep web pages are created
dynamically as a result of queries posed to specific web databases.
The structure of deep web pages makes it impossible for
traditional web crawlers to access deep web content. This paper
presents Deep iCrawl, a novel vision-based approach for extracting
data from the deep web. Deep iCrawl splits the process into two
phases. The first phase includes Query analysis and Query translation
and the second covers vision-based extraction of data from the
dynamically created deep web pages. There are several established
approaches for the extraction of deep web pages but the proposed
method aims at overcoming the inherent limitations of the former.
This paper also aims at comparing the data items and presenting them
in the required order.
Abstract: With the enormous growth of the web, users easily get
lost in the rich hyper structure. Thus, developing user-friendly and
automated tools that provide relevant information, without any
redundant links, to cater to users' needs is the primary task
for website owners. Most of the existing web mining algorithms
have concentrated on finding frequent patterns while neglecting the
less frequent ones that are likely to contain outlying data such as
noise, irrelevant and redundant data. This paper proposes a new
algorithm for mining web content by detecting the redundant
links in web documents using set-theoretic (classical
mathematics) operations such as subset, union, and intersection. The
redundant links are then removed from the original web content to obtain the
information required by the user.
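The set operations named above can be sketched as follows. This is an illustration under stated assumptions, not the paper's algorithm: a link is treated here as redundant when it already appeared on an earlier page or earlier on the same page, and the function name and input layout are hypothetical.

```python
def split_redundant_links(pages):
    """Separate each page's links into kept and redundant ones.

    `pages` maps a page name to its list of link URLs in page order.
    Returns (cleaned, redundant): cleaned keeps the first occurrence of
    every link, redundant holds the links already seen on earlier pages.
    """
    seen = set()
    cleaned, redundant = {}, {}
    for name, links in pages.items():
        link_set = set(links)
        redundant[name] = link_set & seen   # intersection: already seen elsewhere
        new_links = link_set - seen         # difference: first seen on this page
        kept = []
        for link in links:
            if link in new_links:
                kept.append(link)
                new_links.discard(link)     # drop repeats within the same page
        cleaned[name] = kept
        seen |= link_set                    # union: remember everything seen so far
    return cleaned, redundant
```

A page whose link set is a subset of `seen` ends up with an empty kept list, which is how the subset relation shows up in this sketch.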
Abstract: Currently, web usage generates a huge amount of data from
many users. In general, a proxy server is a system that supports users' web
usage, and it can be managed by using hit rates. This
research tries to improve hit rates in a proxy system by applying a data
mining technique. The data set is collected from proxy servers in the
university, and relationships are investigated based on several features.
The model is used to predict future website accesses. An association
rule technique is applied to obtain the relations among Date, Time, Main
Group web, Sub Group web, and Domain name for the created model.
The results showed that this technique can predict web content for the
next day; moreover, the future accesses of websites increased from
38.15% to 85.57%.
This model can predict web page accesses, which tends to increase
the efficiency of proxy servers as a result. In addition, the
performance of internet access will be improved, helping to reduce
traffic in networks.
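To make the association rule idea concrete, the sketch below computes the two standard rule measures, support and confidence, over proxy log records. The record layout, field names, and function names are assumptions for illustration; the abstract does not say which mining algorithm or thresholds the authors used.

```python
def support(records, items):
    """Fraction of records containing every (field, value) pair in `items`."""
    if not records:
        return 0.0
    items = set(items)
    matches = sum(1 for r in records if items <= set(r.items()))
    return matches / len(records)


def confidence(records, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent.

    This is support(antecedent and consequent) / support(antecedent),
    or 0.0 when the antecedent never occurs.
    """
    base = support(records, antecedent)
    if base == 0.0:
        return 0.0
    return support(records, set(antecedent) | set(consequent)) / base
```

A rule such as "Monday morning implies a given domain" with high confidence could then suggest which content a proxy might keep cached for the next day.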