Abstract: We address version control for documents in web
services and propose a new approach designed specifically for XML
documents. The approach is applied in a centralized repository that
coexists with other repositories in a decentralized system. To
express the activities of this approach in a standard model, we use
Event-Condition-Action (ECA) active rules, and we show how these
rules have been incorporated as a mechanism for the version control
of documents. ECA rules are integrated because they provide clear
declarative semantics and induce an immediate operational
realization in the system without the need for human intervention.
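As a minimal sketch of how an Event-Condition-Action rule could drive versioning, the following Python fragment stores a new version only when a saved document differs from the latest stored one. The names `Repository`, `EcaRule`, `fire` and the event name `document_saved` are illustrative assumptions and are not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Repository:
    """Hypothetical centralized repository: document id -> list of versions."""
    versions: Dict[str, List[str]] = field(default_factory=dict)


@dataclass
class EcaRule:
    event: str                                      # event the rule reacts to
    condition: Callable[[Repository, dict], bool]   # must hold for the action to run
    action: Callable[[Repository, dict], None]      # executed without human intervention


def fire(rules: List[EcaRule], repo: Repository, event: str, payload: dict) -> None:
    """Run every rule registered for `event` whose condition holds."""
    for rule in rules:
        if rule.event == event and rule.condition(repo, payload):
            rule.action(repo, payload)


def content_changed(repo: Repository, p: dict) -> bool:
    """Condition: the document is new or differs from its latest version."""
    history = repo.versions.get(p["doc_id"], [])
    return not history or history[-1] != p["xml"]


def store_version(repo: Repository, p: dict) -> None:
    """Action: append the saved content as a new version."""
    repo.versions.setdefault(p["doc_id"], []).append(p["xml"])


# Assumed rule set: ON document_saved IF content_changed DO store_version.
RULES = [EcaRule("document_saved", content_changed, store_version)]
```

Saving the same content twice leaves the history unchanged, because the condition fails on the second event.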
Abstract: Current tools for data migration between document-oriented
and relational databases have several disadvantages. We
propose a new approach for data migration between document-oriented
and relational databases. During data migration, the relational
schema of the target (relational) database is created automatically
from a collection of XML documents. The proposed approach is verified
on data migration between the document-oriented database IBM Lotus
Notes/Domino and a relational database implemented in the relational
database management system (RDBMS) MySQL.
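To illustrate the idea of deriving a relational schema from a collection of XML documents, here is a small sketch under strong simplifying assumptions: each document is flat (one level of child elements), every column is typed `TEXT`, and element names are assumed to be valid column names. The functions `infer_columns` and `create_table_sql` are hypothetical and do not reproduce the paper's method.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, List


def infer_columns(xml_docs: Iterable[str]) -> List[str]:
    """Collect the distinct child element names across all documents,
    in first-seen order."""
    columns: List[str] = []
    for doc in xml_docs:
        root = ET.fromstring(doc)
        for child in root:
            if child.tag not in columns:
                columns.append(child.tag)
    return columns


def create_table_sql(table: str, columns: List[str]) -> str:
    """Build a MySQL-style CREATE TABLE statement with one TEXT column
    per inferred element name (names are not sanitized in this sketch)."""
    cols = ", ".join(f"`{c}` TEXT" for c in columns)
    return f"CREATE TABLE `{table}` (id INT AUTO_INCREMENT PRIMARY KEY, {cols});"
```

Documents that lack an element simply leave the corresponding column empty, which is one plausible way to handle the irregular structure of document-oriented data.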
Abstract: Semantic Web Technologies enable machines to
interpret data published in a machine-interpretable form on the web.
At the present time, only human beings are able to understand the
product information published online. The emerging semantic Web
technologies have the potential to deeply influence the further
development of the Internet Economy. In this paper we propose a
scenario-based research approach to predict the effects of these new
technologies on electronic markets and on the business models of
traders, intermediaries and customers. Over 300 million searches are
conducted every day on the Internet by people trying to find what
they need. A majority of these searches are in the domain of
consumer ecommerce, where a web user is looking for something to
buy. This represents a huge cost in terms of people hours and an
enormous drain of resources. Agent-enabled semantic search will
have a dramatic impact on the precision of these searches. It will
reduce and possibly eliminate information asymmetry, where a better
informed buyer gets the best value. By affecting this key
determinant of market prices, the semantic web will foster the evolution
of different business and economic models. We submit that there is a
need for developing these futuristic models based on our current
understanding of e-commerce models and nascent semantic web
technologies. We believe these business models will encourage
mainstream web developers and businesses to join the "semantic web
revolution."
Abstract: Machine-understandable data, when strongly
interlinked, constitutes the basis for the Semantic Web. Annotating
web documents is one of the major techniques for creating metadata
on the Web. Annotating websites describes the data they contain in a
form that is suitable for interpretation by machines. In this paper,
we present an approach, improved over previous work [1], for
annotating the texts of websites based on a knowledge base.
Abstract: This paper presents EART (Extract Association Rules
from Text), a system for discovering association rules from
collections of unstructured documents. The EART system
processes text only, not images or figures. EART discovers association
rules amongst the keywords labeling the collection of textual documents.
The main characteristic of EART is that the system integrates XML
technology (to transform unstructured documents into structured
documents) with an Information Retrieval scheme (TF-IDF) and a Data
Mining technique for association rule extraction. EART relies on
word features to extract association rules. It consists of four phases:
a structure phase, an index phase, a text mining phase and a
visualization phase. Our work analyzes the keywords in the extracted
association rules in two cases: when the keywords co-occur in one
sentence of the original text, and when they appear in a sentence
without co-occurring. Experiments were conducted on a
collection of scientific documents selected from MEDLINE that are
related to the outbreak of the H5N1 avian influenza virus.
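The TF-IDF weighting used in the index phase can be sketched as follows. This is the textbook formulation (term frequency times the logarithm of inverse document frequency), assumed here for illustration; the exact variant used by EART is not stated in the abstract, and the function names are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Score each term of each tokenized document by tf * log(N / df)."""
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        scores.append({t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()})
    return scores


def top_keywords(docs: List[List[str]], k: int) -> List[List[str]]:
    """Keep the k highest-scoring terms per document (ties broken alphabetically)."""
    return [sorted(s, key=lambda t: (-s[t], t))[:k] for s in tf_idf(docs)]
```

A term that occurs in every document receives a score of zero, so it is never preferred as a keyword over a more discriminative term.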
Abstract: Mobile learning (m-learning) is a new method in the teaching and learning process that combines mobile device technology with learning materials. It can enhance students' engagement in learning activities and allow them to access the learning materials anytime and anywhere. At Kolej Poly-Tech Mara (KPTM), this method is seen as an important effort to improve teaching practice and student learning performance. The aim of this paper is to discuss the development of an m-learning application called Mobile EEF Learning System (MEEFLS), to be implemented for the Electric and Electronic Fundamentals course using Flash, XML (Extensible Markup Language) and J2ME (Java 2 Micro Edition). The System Development Life Cycle (SDLC) was used as the application development approach. The application has three modules: notes or course material, exercises and video. MEEFLS development is seen as a tool or pilot test for m-learning in KPTM.
Abstract: Schema matching plays a key role in many different
applications, such as schema integration, data integration, data
warehousing, data transformation, E-commerce, peer-to-peer data
management, ontology matching and integration, semantic Web,
semantic query processing, etc. Manual matching is expensive and
error-prone, so it is important to develop techniques to
automate the schema matching process. In this paper, we present a
solution to the automated XML schema matching problem that
produces semantic mappings between corresponding schema
elements of given source and target schemas. This solution
addresses the automated XML schema matching problem more
comprehensively and efficiently. Our solution is based on
combining the linguistic similarity, data type compatibility and
structural similarity of XML schema elements. After describing our solution,
we present experimental results that demonstrate the effectiveness of
this approach.
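One way to picture the combination of the three measures is a weighted sum, sketched below. Everything specific here is an assumption made for illustration: the string-ratio measure for linguistic similarity, the small numeric type table, the Jaccard overlap of child names for structural similarity, and the weights `(0.5, 0.2, 0.3)`. The paper's actual measures and combination function are not given in the abstract.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import FrozenSet, Tuple

# Assumed group of mutually convertible XSD numeric types.
NUMERIC = {"xs:int", "xs:integer", "xs:decimal", "xs:float", "xs:double"}


@dataclass(frozen=True)
class SchemaElement:
    name: str
    xsd_type: str
    children: FrozenSet[str] = frozenset()


def linguistic_sim(a: SchemaElement, b: SchemaElement) -> float:
    """Case-insensitive string similarity of the element names."""
    return SequenceMatcher(None, a.name.lower(), b.name.lower()).ratio()


def type_compat(a: SchemaElement, b: SchemaElement) -> float:
    """1.0 for identical types, 0.8 for two numeric types, else 0.0."""
    if a.xsd_type == b.xsd_type:
        return 1.0
    if a.xsd_type in NUMERIC and b.xsd_type in NUMERIC:
        return 0.8
    return 0.0


def structural_sim(a: SchemaElement, b: SchemaElement) -> float:
    """Jaccard overlap of child element names; two leaves count as identical."""
    if not a.children and not b.children:
        return 1.0
    return len(a.children & b.children) / len(a.children | b.children)


def combined_sim(a: SchemaElement, b: SchemaElement,
                 w: Tuple[float, float, float] = (0.5, 0.2, 0.3)) -> float:
    """Weighted sum of the three similarity measures."""
    return w[0] * linguistic_sim(a, b) + w[1] * type_compat(a, b) + w[2] * structural_sim(a, b)
```

A matcher built on this would score every source/target element pair and keep the pairs above a chosen threshold as candidate mappings.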
Abstract: Over the past few years, XML (eXtensible Mark-up
Language) has emerged as the standard for information
representation and data exchange over the Internet. This paper
provides a starting point for new researchers venturing into the XML
database field. We survey the storage representations for XML documents
and review XML query processing and optimization techniques with
respect to the particular storage instance. Various optimization
technologies have been developed to solve the query retrieval and
updating problems. In more recent years, most researchers have
proposed hybrid optimization techniques. A hybrid system opens the
possibility of covering each technology's weaknesses with the strengths of the others.
This paper reviews the advantages and limitations of optimization
techniques.
Abstract: This paper describes a text mining technique for automatically extracting association rules from collections of textual documents. The technique, called Extracting Association Rules from Text (EART), relies on keyword features to discover association rules amongst the keywords labeling the documents. In this work, the EART system ignores the order in which the words occur and focuses instead on the words and their statistical distributions in documents. The main contribution of the technique is that it integrates XML technology with an Information Retrieval scheme (TF-IDF), used for keyword/feature selection that automatically selects the most discriminative keywords for association rule generation, and uses a Data Mining technique for association rule discovery. It consists of three phases: a Text Preprocessing phase (transformation, filtration, stemming and indexing of the documents), an Association Rule Mining (ARM) phase (applying our algorithm for Generating Association Rules based on Weighting scheme, GARW) and a Visualization phase (visualization of results). Experiments were conducted on web page news documents related to the outbreak of the bird flu disease. The extracted association rules contain important features and describe the informative news included in the document collection. The performance of the EART system is compared with that of another system that uses the Apriori algorithm, in terms of execution time and an evaluation of the extracted association rules.
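For readers unfamiliar with association rules over keywords, the sketch below computes support and confidence for single-keyword rules `a -> b` by brute force over keyword sets. It is not the GARW algorithm and not Apriori; the function name `pair_rules` and its thresholds are assumptions chosen only to show what a rule and its two measures look like.

```python
from itertools import permutations
from typing import List, Set, Tuple


def pair_rules(doc_keywords: List[Set[str]], min_support: float,
               min_confidence: float) -> List[Tuple[str, str, float, float]]:
    """Return rules (a, b, support, confidence) for 'a -> b' where
    support = P(a and b) and confidence = P(b | a) over the documents."""
    n = len(doc_keywords)
    vocab = sorted(set().union(*doc_keywords))
    rules = []
    for a, b in permutations(vocab, 2):
        count_a = sum(1 for kw in doc_keywords if a in kw)
        count_ab = sum(1 for kw in doc_keywords if a in kw and b in kw)
        support = count_ab / n
        confidence = count_ab / count_a   # count_a >= 1 because a is in the vocabulary
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    return rules
```

Each input set would hold the keywords retained for one document after TF-IDF selection, so word order plays no role, in line with the abstract.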
Abstract: XML has become a popular standard for information exchange via the web. Each XML document can be represented as a rooted, ordered, labeled tree. The node label shows the exact position of a node in the original document. Region and Dewey encoding are two well-known methods of labeling trees. In this paper, we propose a new insert-friendly labeling method named IFDewey, based on a recently proposed scheme called Extended Dewey. In Extended Dewey, many labels must be modified when a new node is inserted into the XML tree. Our method eliminates this problem by reserving even numbers for future insertions. Numbers generated by Extended Dewey may be even or odd. IFDewey modifies Extended Dewey so that only odd numbers are generated, and even numbers can then be used for much easier insertion of nodes.
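The odd/even reservation idea can be shown on plain Dewey-style labels. The sketch below is a deliberate simplification: it does not implement Extended Dewey's encoding of element names into label components, and it does not handle repeated insertions into the same gap. The function names are hypothetical.

```python
from typing import List, Optional, Tuple

Label = Tuple[int, ...]


def initial_child_labels(parent: Label, n_children: int) -> List[Label]:
    """Label the i-th child (0-based) with the odd component 2*i + 1,
    leaving every even number free for later insertions."""
    return [parent + (2 * i + 1,) for i in range(n_children)]


def label_between(left: Label, right: Label) -> Optional[Label]:
    """Return a label for a node inserted between two adjacent siblings,
    or None when no integer is free between their last components."""
    if left[:-1] != right[:-1]:
        raise ValueError("nodes are not siblings")
    lo, hi = left[-1], right[-1]
    if hi - lo < 2:
        return None
    # For two consecutive odd components this is the reserved even number.
    return left[:-1] + (lo + 1,)
```

Because the new label sorts between its neighbours, document order is preserved and no existing label has to be rewritten for this first insertion.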
Abstract: Recent advances in the field of computing have
greatly increased the use of web-based electronic documents. Current
copyright protection laws are inadequate to prove the ownership of
electronic documents and do not provide strong protection against
copying and manipulating information from the web. This has
opened many avenues for securing information, and significant
progress has been made in the area of information security.
Digital Watermarking has developed into a very dynamic area of
research and has addressed challenging issues for digital content.
Watermarking can be visible (logos or signatures) or invisible
(encoding and decoding). Many visible watermarking techniques
have been studied for text documents, but there are very few for
web-based text. XML files are used to exchange information on the
internet and contain important information. In this paper, two invisible
watermarking techniques using synonyms and acronyms are
proposed for XML files to prove intellectual ownership and to
achieve security. Different attacks are analyzed, and the
capacity that can be embedded in an XML file is measured.
A comparative analysis of capacity is also made for both methods.
The system has been implemented in C#, and all tests were
carried out in practice to obtain the results.
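A rough sketch of synonym-based invisible watermarking follows, written in Python rather than the C# of the original system. The synonym table `PAIRS`, the convention that the first word of a pair encodes bit 0 and the second bit 1, and the restriction to lowercase words in element text (tails and attributes are ignored) are all assumptions; the paper's actual embedding scheme is not described in the abstract.

```python
import re
import xml.etree.ElementTree as ET
from typing import Dict, List

# Hypothetical synonym pairs: the first word encodes bit 0, the second bit 1.
PAIRS = [("big", "large"), ("quick", "fast")]
BIT_OF: Dict[str, int] = {w: bit for pair in PAIRS for bit, w in enumerate(pair)}
PARTNER: Dict[str, str] = {}
for first, second in PAIRS:
    PARTNER[first] = second
    PARTNER[second] = first
WORD = re.compile(r"[A-Za-z]+")


def embed(xml_text: str, bits: List[int]) -> str:
    """Rewrite carrier words in element text so that they spell out `bits`."""
    root = ET.fromstring(xml_text)
    remaining = iter(bits)

    def swap(match):
        word = match.group(0)
        if word not in BIT_OF:
            return word                      # not a carrier word
        bit = next(remaining, None)
        if bit is None or BIT_OF[word] == bit:
            return word                      # no bits left, or already correct
        return PARTNER[word]                 # switch to the synonym for this bit

    for elem in root.iter():
        if elem.text:
            elem.text = WORD.sub(swap, elem.text)
    return ET.tostring(root, encoding="unicode")


def extract(xml_text: str) -> List[int]:
    """Read one bit from every carrier word, in document order."""
    root = ET.fromstring(xml_text)
    return [BIT_OF[w] for elem in root.iter()
            for w in WORD.findall(elem.text or "") if w in BIT_OF]
```

The capacity of a file in this sketch is simply the number of carrier words it contains, and the verifier must know the watermark length, since carrier words beyond it still yield bits.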
Abstract: XML files contain data in a well-formatted manner. Studying the format or semantics of the grammar helps in retrieving the data quickly. Many algorithms have been described for searching data in XML files. A number of approaches use the data structure or relate to the contents of the document. In the former case the user must know the structure of the document, while information retrieval techniques using NLP relate to the content of the document. Hence the results may be irrelevant or unsuccessful, and the search may take more time. This paper presents fast XML retrieval techniques that use a new indexing technique and the concept of RXML. When indexing an XML document, the system takes into account both the document content and the document structure and assigns a value to each tag in the file. To query the system, a user is not constrained to a fixed query format.
Abstract: EGOTHOR is a search engine that indexes the Web
and allows us to search Web documents. Its hit list contains the URL
and title of each hit, along with a snippet that tries to briefly
show a match. The snippet can almost always be assembled by an
algorithm that has full knowledge of the original document (mostly
an HTML page). This implies that the search engine is required to store
the full text of the documents as a part of the index.
Such a requirement leads us to choose an appropriate compression
algorithm that reduces the space demand. One solution
could be to use common compression methods, for instance gzip or
bzip2, but it might be preferable to develop a new method that
takes advantage of the document structure, or rather, the textual
character of the documents.
Special text compression algorithms and methods for the
compression of XML documents already exist. The aim of this
paper is to integrate the two approaches to achieve an optimal
compression ratio.
Abstract: The data exchanged on the Web are of different nature
from those treated by the classical database management systems;
these data are called semi-structured data since they do not have a
regular and static structure like data found in a relational database;
their schema is dynamic and may contain missing data or types.
Therefore, there is a need to develop further techniques and
algorithms to exploit and integrate such data and to extract relevant
information for the user. In this paper we present
the system OSIX (Osiris based System for Integration of XML
Sources). This system has a Data Warehouse model designed for the
integration of semi-structured data and more precisely for the
integration of XML documents. The architecture of OSIX relies on
the Osiris system, a DL-based model designed for the representation
and management of databases and knowledge bases. Osiris is a view-based
data model whose indexing system supports semantic query
optimization. We show that query processing on an
XML source is optimized by the indexing approach proposed by
Osiris.
Abstract: The size, complexity and number of databases used
for protein information have caused bioinformatics to lag behind in
adapting to the need to handle this distributed information.
Integrating all the information from different databases into one
database is a challenging problem. Our main research aim is to develop a
tool which can be used to access and manipulate protein information
from different databases. In our approach, we have integrated
different databases such as Swiss-Prot, PDB, InterPro, and EMBL
and transformed these databases from flat file format into relational
form using XML and BioPerl. As a result, we show that this tool can
search protein information of different sizes stored in a relational
database and that results can be retrieved faster than from a flat file
database. A web-based user interface is provided to allow users to
access or search for protein information in the local database.
Abstract: PPX (Pretty Printer for XML) is a query language that offers a concise way to describe the formatting of XML data into HTML. In this paper, we propose a simple formatting specification that combines automatic layout operators and variables in the layout expression of the GENERATE clause of PPX. This method can automatically format irregular XML data contained in part of an XML document, using a layout decision rule that refers to the DTD. In the experiment, a quick comparison shows that PPX requires far less description than XSLT or XQuery programs that perform the same tasks.
Abstract: In this paper we propose a novel approach for
searching for eCommerce products using a mobile phone, illustrated by a
prototype, eCoMobile. This approach aims to globalize mobile
search by integrating the concept of user multilingualism into it. To
demonstrate this, we deal in particular with the English and Arabic languages.
Indeed the mobile user can formulate his query on a commercial
product in either language (English/Arabic). The description of his
information need on commercial products relies on the ontology that
represents the conceptualization of the product catalogue knowledge
domain defined in both English and Arabic languages. A query
expressed on a mobile device client defines the concept that
corresponds to the name of the product followed by a set of pairs
(property, value) specifying the characteristics of the product. Once a
query is submitted, it is communicated to the server side, which
analyses it and in turn performs an HTTP request to an eCommerce
application server (such as Amazon). The latter responds by returning
an XML file representing a set of elements, where each element
defines an item of the searched product with its specific
characteristics. The XML file is analyzed on the server side, and the
items are then displayed on the mobile device client along with their
relevant characteristics in the chosen language.
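The query shape described above (a product concept followed by property/value pairs) and the server-side analysis of the XML response might look roughly like this. The class `ProductQuery`, the parameter names `concept` and `lang`, and the `<item>` element layout are assumptions for illustration; no real eCommerce API is modeled and no HTTP request is made.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class ProductQuery:
    concept: str                           # product name, e.g. a catalogue concept
    constraints: List[Tuple[str, str]]     # (property, value) pairs
    language: str = "en"                   # "en" or "ar" in the described prototype

    def to_params(self) -> Dict[str, str]:
        """Flatten the query into request parameters (hypothetical names)."""
        params = {"concept": self.concept, "lang": self.language}
        params.update(self.constraints)
        return params


def parse_items(xml_response: str) -> List[Dict[str, str]]:
    """Turn each <item> element into a dict of its child tags and texts."""
    root = ET.fromstring(xml_response)
    return [{child.tag: (child.text or "") for child in item}
            for item in root.iter("item")]
```

The resulting list of dictionaries is what the server would translate into the user's chosen language before sending it to the mobile client.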