Data Extraction of XML Files using Searching and Indexing Techniques

XML files contain data which is in well formatted manner. By studying the format or semantics of the grammar it will be helpful for fast retrieval of the data. There are many algorithms which describes about searching the data from XML files. There are no. of approaches which uses data structure or are related to the contents of the document. In these cases user must know about the structure of the document and information retrieval techniques using NLPs is related to content of the document. Hence the result may be irrelevant or not so successful and may take more time to search.. This paper presents fast XML retrieval techniques by using new indexing technique and the concept of RXML. When indexing an XML document, the system takes into account both the document content and the document structure and assigns the value to each tag from file. To query the system, a user is not constrained about fixed format of query.

Tablet Computer as a User Interface: Intelligent Solutions for Multifunctional Hardcopy Devices

Tablet computers and Multifunctional Hardcopy Devices (MHDs) are common devices in daily life. Though, many scientific studies have not been published. The tablet computers are straightforward to use whereas the MHDs are comparatively difficult to use. Thus, to assist different levels of users, we propose combining these two devices to achieve straightforward intelligent user interface (UI) and versatile What You See Is What You Get (WYSIWYG) document management and production. Our approach to this issue is to design an intelligent user dependent UI for a MHD applying a tablet computer. Furthermore, we propose hardware interconnection and versatile intelligent software between these two devices. In this study, we first provide a state-of-the-art survey on MHDs and tablet computers, and their interconnections. Secondly we provide a comparative UI survey on two state-of-the-art MHDs with a proposal of a novel UI for the MHDs using Jakob Nielsen-s Ten Usability Heuristics Evaluation.

Prototype of a Federative Factory Data Management for the Support of Factory Planning Processes

Due to short product life cycles, increasing variety of products and short cycles of leap innovations manufacturing companies have to increase the flexibility of factory structures. Flexibility of factory structures is based on defined factory planning processes in which product, process and resource data of various partial domains have to be considered. Thus factory planning processes can be characterized as iterative, interdisciplinary and participative processes [1]. To support interdisciplinary and participative character of planning processes, a federative factory data management (FFDM) as a holistic solution will be described. FFDM is already implemented in form of a prototype. The interim results of the development of FFDM will be shown in this paper. The principles are the extracting of product, process and resource data from documents of various partial domains providing as web services on a server. The described data can be requested by the factory planner by using a FFDM-browser.

Support Vector Machine for Persian Font Recognition

In this paper we examine the use of global texture analysis based approaches for the purpose of Persian font recognition in machine-printed document images. Most existing methods for font recognition make use of local typographical features and connected component analysis. However derivation of such features is not an easy task. Gabor filters are appropriate tools for texture analysis and are motivated by human visual system. Here we consider document images as textures and use Gabor filter responses for identifying the fonts. The method is content independent and involves no local feature analysis. Two different classifiers Weighted Euclidean Distance and SVM are used for the purpose of classification. Experiments on seven different type faces and four font styles show average accuracy of 85% with WED and 82% with SVM classifier over typefaces

An Semantic Algorithm for Text Categoritation

Text categorization techniques are widely used to many Information Retrieval (IR) applications. In this paper, we proposed a simple but efficient method that can automatically find the relationship between any pair of terms and documents, also an indexing matrix is established for text categorization. We call this method Indexing Matrix Categorization Machine (IMCM). Several experiments are conducted to show the efficiency and robust of our algorithm.

Methodology Issues and Design Approach of VLE on Mathematical Concepts Acquisition within Secondary Education in England

This study used positivist quantitative approach to examine the mathematical concepts acquisition of- KS4 (14-16) Special Education Needs (SENs) students within the school sector education in England. The research is based on a pilot study and the design is completely holistic in its approach with mixing methodologies. The study combines the qualitative and quantitative methods of approach in gathering formative data for the design process. Although, the approach could best be described as a mix method, fundamentally with a strong positivist paradigm, hence my earlier understanding of the differentiation of the students, student – teacher body and the various elements of indicators that is being measured which will require an attenuated description of individual research subjects. The design process involves four phases with five key stages which are; literature review and document analysis, the survey, interview, and observation; then finally the analysis of data set. The research identified the need for triangulation with Reid-s phases of data management providing scaffold for the study. The study clearly identified the ideological and philosophical aspects of educational research design for the study of mathematics by the special education needs (SENs) students in England using the virtual learning environment (VLE) platform.

Barriers to Knowledge Management: A Theoretical Framework and a Review of Industrial Cases

Firms have invested heavily in knowledge management (KM) with the aim to build a knowledge capability and use it to achieve a competitive advantage. Research has shown, however, that not all knowledge management projects succeed. Some studies report that about 84% of knowledge management projects fail. This paper has integrated studies on the impediments to knowledge management into a theoretical framework. Based on this framework, five cases documenting failed KM initiatives were analysed. The analysis gave us a clear picture about why certain KM projects fail. The high failure rate of KM can be explained by the gaps that exist between users and management in terms of KM perceptions and objectives

Dynamic Inverted Index Maintenance

The majority of today's IR systems base the IR task on two main processes: indexing and searching. There exists a special group of dynamic IR systems where both processes (indexing and searching) happen simultaneously; such a system discards obsolete information, simultaneously dealing with the insertion of new in¬formation, while still answering user queries. In these dynamic, time critical text document databases, it is often important to modify index structures quickly, as documents arrive. This paper presents a method for dynamization which may be used for this task. Experimental results show that the dynamization process is possible and that it guarantees the response time for the query operation and index actualization.

High Securing Cover-File of Hidden Data Using Statistical Technique and AES Encryption Algorithm

Nowadays, the rapid development of multimedia and internet allows for wide distribution of digital media data. It becomes much easier to edit, modify and duplicate digital information Besides that, digital documents are also easy to copy and distribute, therefore it will be faced by many threatens. It-s a big security and privacy issue with the large flood of information and the development of the digital format, it become necessary to find appropriate protection because of the significance, accuracy and sensitivity of the information. Nowadays protection system classified with more specific as hiding information, encryption information, and combination between hiding and encryption to increase information security, the strength of the information hiding science is due to the non-existence of standard algorithms to be used in hiding secret messages. Also there is randomness in hiding methods such as combining several media (covers) with different methods to pass a secret message. In addition, there are no formal methods to be followed to discover the hidden data. For this reason, the task of this research becomes difficult. In this paper, a new system of information hiding is presented. The proposed system aim to hidden information (data file) in any execution file (EXE) and to detect the hidden file and we will see implementation of steganography system which embeds information in an execution file. (EXE) files have been investigated. The system tries to find a solution to the size of the cover file and making it undetectable by anti-virus software. The system includes two main functions; first is the hiding of the information in a Portable Executable File (EXE), through the execution of four process (specify the cover file, specify the information file, encryption of the information, and hiding the information) and the second function is the extraction of the hiding information through three process (specify the steno file, extract the information, and decryption of the information). The system has achieved the main goals, such as make the relation of the size of the cover file and the size of information independent and the result file does not make any conflict with anti-virus software.

RUPSec: An Extension on RUP for Developing Secure Systems - Requirements Discipline

The world is moving rapidly toward the deployment of information and communication systems. Nowadays, computing systems with their fast growth are found everywhere and one of the main challenges for these systems is increasing attacks and security threats against them. Thus, capturing, analyzing and verifying security requirements becomes a very important activity in development process of computing systems, specially in developing systems such as banking, military and e-business systems. For developing every system, a process model which includes a process, methods and tools is chosen. The Rational Unified Process (RUP) is one of the most popular and complete process models which is used by developers in recent years. This process model should be extended to be used in developing secure software systems. In this paper, the Requirement Discipline of RUP is extended to improve RUP for developing secure software systems. These proposed extensions are adding and integrating a number of Activities, Roles, and Artifacts to RUP in order to capture, document and model threats and security requirements of system. These extensions introduce a group of clear and stepwise activities to developers. By following these activities, developers assure that security requirements are captured and modeled. These models are used in design, implementation and test activitie

Impacts of Climate Change under the Threat of Global Warming for an Agricultural Watershed of the Kangsabati River

The effects of global warming on India vary from the submergence of low-lying islands and coastal lands to the melting of glaciers in the Indian Himalayas, threatening the volumetric flow rate of many of the most important rivers of India and South Asia. In India, such effects are projected to impact millions of lives. As a result of ongoing climate change, the climate of India has become increasingly volatile over the past several decades; this trend is expected to continue. Climate change is one of the most important global environmental challenges, with implications for food production, water supply, health, energy, etc. Addressing climate change requires a good scientific understanding as well as coordinated action at national and global level. The climate change issue is part of the larger challenge of sustainable development. As a result, climate policies can be more effective when consistently embedded within broader strategies designed to make national and regional development paths more sustainable. The impact of climate variability and change, climate policy responses, and associated socio-economic development will affect the ability of countries to achieve sustainable development goals. A very well calibrated Soil and Water Assessment Tool (R2 = 0.9968, NSE = 0.91) was exercised over the Khatra sub basin of the Kangsabati River watershed in Bankura district of West Bengal, India, in order to evaluate projected parameters for agricultural activities. Evapotranspiration, Transmission Losses, Potential Evapotranspiration and Lateral Flow to reach are evaluated from the years 2041-2050 in order to generate a picture for sustainable development of the river basin and its inhabitants. India has a significant stake in scientific advancement as well as an international understanding to promote mitigation and adaptation. This requires improved scientific understanding, capacity building, networking and broad consultation processes. This paper is a commitment towards the planning, management and development of the water resources of the Kangsabati River by presenting detailed future scenarios of the Kangsabati river basin, Khatra sub basin, over the mentioned time period. India-s economy and societal infrastructures are finely tuned to the remarkable stability of the Indian monsoon, with the consequence that vulnerability to small changes in monsoon rainfall is very high. In 2002 the monsoon rains failed during July, causing profound loss of agricultural production with a drop of over 3% in India-s GDP. Neither the prolonged break in the monsoon nor the seasonal rainfall deficit was predicted. While the general features of monsoon variability and change are fairly well-documented, the causal mechanisms and the role of regional ecosystems in modulating the changes are still not clear. Current climate models are very poor at modelling the Asian monsoon: this is a challenging and critical region where the ocean, atmosphere, land surface and mountains all interact. The impact of climate change on regional ecosystems is likewise unknown. The potential for the monsoon to become more volatile has major implications for India itself and for economies worldwide. Knowledge of future variability of the monsoon system, particularly in the context of global climate change, is of great concern for regional water and food security. The major findings of this paper were that of all the chosen projected parameters, transmission losses, soil water content, potential evapotranspiration, evapotranspiration and lateral flow to reach, display an increasing trend over the time period of years 2041- 2050.

Suitability of Entry into the Euro Area: An Excursion in Selected Economies

The current situation in the eurozone raises a number of topics for discussion and to help in finding an answer to the question of whether a common currency is a more suitable means of coping with the impact of the financial crisis or whether national currencies are better suited to this. The economic situation in the EU is now considerably volatile and, due to problems with the fulfilment of the Maastricht convergence criteria, it is now being considered whether, in their further development, new member states will decide to distance themselves from the euro or will, in an attempt to overcome the crisis, speed up the adoption of the euro. The Czech Republic is one country with little interest in adopting the euro, justified by the fact that a better alternative to dealing with this crisis is an independent monetary policy and its ability to respond flexibly to the economic situation not only in Europe, but around the world. One attribute of the crisis in the Czech Republic and its mitigation is the freely floating exchange rate of the national currency. It is not only the Czech Republic that is attempting to alleviate the impact of the crisis, but also new EU member countries facing fresh questions to which theory have yet to provide wholly satisfactory answers. These questions undoubtedly include the problem of inflation targeting and the choice of appropriate instruments for achieving financial stability. The difficulty lies in the fact that these objectives may be contradictory and may require more than one means of achieving them. In this respect we may assume that membership of the euro zone might not in itself mitigate the development of the recession or protect the nation from future crises. We are of the opinion that the decisive factor in the development of any economy will continue to be the domestic economic policy and the operability of market economic mechanisms. We attempt to document this fact using selected countries as examples, these being the Czech Republic, Poland, Hungary, and Slovakia.

Semi-Automatic Analyzer to Detect Authorial Intentions in Scientific Documents

Information Retrieval has the objective of studying models and the realization of systems allowing a user to find the relevant documents adapted to his need of information. The information search is a problem which remains difficult because the difficulty in the representing and to treat the natural languages such as polysemia. Intentional Structures promise to be a new paradigm to extend the existing documents structures and to enhance the different phases of documents process such as creation, editing, search and retrieval. The intention recognition of the author-s of texts can reduce the largeness of this problem. In this article, we present intentions recognition system is based on a semi-automatic method of extraction the intentional information starting from a corpus of text. This system is also able to update the ontology of intentions for the enrichment of the knowledge base containing all possible intentions of a domain. This approach uses the construction of a semi-formal ontology which considered as the conceptualization of the intentional information contained in a text. An experiments on scientific publications in the field of computer science was considered to validate this approach.

Tool Tracker: A Toolkit Ensembling Useful Online Networking Tools for Efficient Management and Operation of a Network

Tool Tracker is a client-server based application. It is essentially a catalogue of various network monitoring and management tools that are available online. There is a database maintained on the server side that contains the information about various tools. Several clients can access this information simultaneously and utilize this information. The various categories of tools considered are packet sniffers, port mappers, port scanners, encryption tools, and vulnerability scanners etc for the development of this application. This application provides a front end through which the user can invoke any tool from a central repository for the purpose of packet sniffing, port scanning, network analysis etc. Apart from the tool, its description and the help files associated with it would also be stored in the central repository. This facility will enable the user to view the documentation pertaining to the tool without having to download and install the tool. The application would update the central repository with the latest versions of the tools. The application would inform the user about the availability of a newer version of the tool currently being used and give the choice of installing the newer version to the user. Thus ToolTracker provides any network administrator that much needed abstraction and ease-ofuse with respect to the tools that he can use to efficiently monitor a network.

Compression of Semistructured Documents

EGOTHOR is a search engine that indexes the Web and allows us to search the Web documents. Its hit list contains URL and title of the hits, and also some snippet which tries to shortly show a match. The snippet can be almost always assembled by an algorithm that has a full knowledge of the original document (mostly HTML page). It implies that the search engine is required to store the full text of the documents as a part of the index. Such a requirement leads us to pick up an appropriate compression algorithm which would reduce the space demand. One of the solutions could be to use common compression methods, for instance gzip or bzip2, but it might be preferable if we develop a new method which would take advantage of the document structure, or rather, the textual character of the documents. There already exist a special compression text algorithms and methods for a compression of XML documents. The aim of this paper is an integration of the two approaches to achieve an optimal level of the compression ratio

A Materialized Approach to the Integration of XML Documents: the OSIX System

The data exchanged on the Web are of different nature from those treated by the classical database management systems; these data are called semi-structured data since they do not have a regular and static structure like data found in a relational database; their schema is dynamic and may contain missing data or types. Therefore, the needs for developing further techniques and algorithms to exploit and integrate such data, and extract relevant information for the user have been raised. In this paper we present the system OSIX (Osiris based System for Integration of XML Sources). This system has a Data Warehouse model designed for the integration of semi-structured data and more precisely for the integration of XML documents. The architecture of OSIX relies on the Osiris system, a DL-based model designed for the representation and management of databases and knowledge bases. Osiris is a viewbased data model whose indexing system supports semantic query optimization. We show that the problem of query processing on a XML source is optimized by the indexing approach proposed by Osiris.

Semantic Mobility Channel (SMC): Ubiquitous and Mobile Computing Meets the Semantic Web

With the advent of emerging personal computing paradigms such as ubiquitous and mobile computing, Web contents are becoming accessible from a wide range of mobile devices. Since these devices do not have the same rendering capabilities, Web contents need to be adapted for transparent access from a variety of client agents. Such content adaptation is exploited for either an individual element or a set of consecutive elements in a Web document and results in better rendering and faster delivery to the client device. Nevertheless, Web content adaptation sets new challenges for semantic markup. This paper presents an advanced components platform, called SMC, enabling the development of mobility applications and services according to a channel model based on the principles of Services Oriented Architecture (SOA). It then goes on to describe the potential for integration with the Semantic Web through a novel framework of external semantic annotation that prescribes a scheme for representing semantic markup files and a way of associating Web documents with these external annotations. The role of semantic annotation in this framework is to describe the contents of individual documents themselves, assuring the preservation of the semantics during the process of adapting content rendering. Semantic Web content adaptation is a way of adding value to Web contents and facilitates repurposing of Web contents (enhanced browsing, Web Services location and access, etc).

Ontology-based Concept Weighting for Text Documents

Documents clustering become an essential technology with the popularity of the Internet. That also means that fast and high-quality document clustering technique play core topics. Text clustering or shortly clustering is about discovering semantically related groups in an unstructured collection of documents. Clustering has been very popular for a long time because it provides unique ways of digesting and generalizing large amounts of information. One of the issues of clustering is to extract proper feature (concept) of a problem domain. The existing clustering technology mainly focuses on term weight calculation. To achieve more accurate document clustering, more informative features including concept weight are important. Feature Selection is important for clustering process because some of the irrelevant or redundant feature may misguide the clustering results. To counteract this issue, the proposed system presents the concept weight for text clustering system developed based on a k-means algorithm in accordance with the principles of ontology so that the important of words of a cluster can be identified by the weight values. To a certain extent, it has resolved the semantic problem in specific areas.

Color and Layout-based Identification of Documents Captured from Handheld Devices

This paper proposes a method, combining color and layout features, for identifying documents captured from low-resolution handheld devices. On one hand, the document image color density surface is estimated and represented with an equivalent ellipse and on the other hand, the document shallow layout structure is computed and hierarchically represented. Our identification method first uses the color information in the documents in order to focus the search space on documents having a similar color distribution, and finally selects the document having the most similar layout structure in the remaining of the search space.

Role-play Gaming Simulation for Flood Management on Cultural Heritage: A Case Study of Ayutthaya Historic City

The main aim of this research is to develop a methodology to encourage people's awareness, knowledge and understanding on the participation of flood management for cultural heritage, as the cooperation and interaction among government section, private section, and public section through role-play gaming simulation theory. The format of this research is to develop Role-play gaming simulation from existing documents, game or role-playing from several sources and existing data of the research site. We found that role-play gaming simulation can be implemented to help improving the understanding of the existing problem and the impact of the flood on cultural heritage, and the role-play game can be developed into the tool to improve people's knowledge, understanding and awareness about people's participation for flood management on cultural heritage, moreover the cooperation among the government, private section and public section will be improved through the theory of role-play gaming simulation.