Ontology-based Domain Modelling for Consistent Content Change Management

Ontology-based modelling of multi-formatted software application content is a challenging area in content management. When the number of software content unit is huge and in continuous process of change, content change management is important. The management of content in this context requires targeted access and manipulation methods. We present a novel approach to deal with model-driven content-centric information systems and access to their content. At the core of our approach is an ontology-based semantic annotation technique for diversely formatted content that can improve the accuracy of access and systems evolution. Domain ontologies represent domain-specific concepts and conform to metamodels. Different ontologies - from application domain ontologies to software ontologies - capture and model the different properties and perspectives on a software content unit. Interdependencies between domain ontologies, the artifacts and the content are captured through a trace model. The annotation traces are formalised and a graph-based system is selected for the representation of the annotation traces.

Data Annotation Models and Annotation Query Language

This paper presents data annotation models at five levels of granularity (database, relation, column, tuple, and cell) of relational data to address the problem of unsuitability of most relational databases to express annotations. These models do not require any structural and schematic changes to the underlying database. These models are also flexible, extensible, customizable, database-neutral, and platform-independent. This paper also presents an SQL-like query language, named Annotation Query Language (AnQL), to query annotation documents. AnQL is simple to understand and exploits the already-existent wide knowledge and skill set of SQL.

Using the Semantic Web in Ubiquitous and Mobile Computing: the Morfeo Experience

With the advent of emerging personal computing paradigms such as ubiquitous and mobile computing, Web contents are becoming accessible from a wide range of mobile devices. Since these devices do not have the same rendering capabilities, Web contents need to be adapted for transparent access from a variety of client agents. Such content adaptation results in better rendering and faster delivery to the client device. Nevertheless, Web content adaptation sets new challenges for semantic markup. This paper presents an advanced components platform, called MorfeoSMC, enabling the development of mobility applications and services according to a channel model based on Services Oriented Architecture (SOA) principles. It then goes on to describe the potential for integration with the Semantic Web through a novel framework of external semantic annotation of mobile Web contents. The role of semantic annotation in this framework is to describe the contents of individual documents themselves, assuring the preservation of the semantics during the process of adapting content rendering, as well as to exploit these semantic annotations in a novel user profile-aware content adaptation process. Semantic Web content adaptation is a way of adding value to and facilitates repurposing of Web contents (enhanced browsing, Web Services location and access, etc).

ORank: An Ontology Based System for Ranking Documents

Increasing growth of information volume in the internet causes an increasing need to develop new (semi)automatic methods for retrieval of documents and ranking them according to their relevance to the user query. In this paper, after a brief review on ranking models, a new ontology based approach for ranking HTML documents is proposed and evaluated in various circumstances. Our approach is a combination of conceptual, statistical and linguistic methods. This combination reserves the precision of ranking without loosing the speed. Our approach exploits natural language processing techniques for extracting phrases and stemming words. Then an ontology based conceptual method will be used to annotate documents and expand the query. To expand a query the spread activation algorithm is improved so that the expansion can be done in various aspects. The annotated documents and the expanded query will be processed to compute the relevance degree exploiting statistical methods. The outstanding features of our approach are (1) combining conceptual, statistical and linguistic features of documents, (2) expanding the query with its related concepts before comparing to documents, (3) extracting and using both words and phrases to compute relevance degree, (4) improving the spread activation algorithm to do the expansion based on weighted combination of different conceptual relationships and (5) allowing variable document vector dimensions. A ranking system called ORank is developed to implement and test the proposed model. The test results will be included at the end of the paper.

Research on Strategy for Automated Scaleless-Map Compilation

As a tool for human spatial cognition and thinking, the map has been playing an important role. Maps are perhaps as fundamental to society as language and the written word. Economic and social development requires extensive and in-depth understanding of their own living environment, from the scope of the overall global to urban housing. This has brought unprecedented opportunities and challenges for traditional cartography . This paper first proposed the concept of scaleless-map and its basic characteristics, through the analysis of the existing multi-scale representation techniques. Then some strategies are presented for automated mapping compilation. Taking into account the demand of automated map compilation, detailed proposed the software - WJ workstation must have four technical features, which are generalization operators, symbol primitives, dynamically annotation and mapping process template. This paper provides a more systematic new idea and solution to improve the intelligence and automation of the scaleless cartography.

NEAR: Visualizing Information Relations in Multimedia Repository A•VI•RE

This paper describes the NEAR (Navigating Exhibitions, Annotations and Resources) panel, a novel interactive visualization technique designed to help people navigate and interpret groups of resources, exhibitions and annotations by revealing hidden relations such as similarities and references. NEAR is implemented on A•VI•RE, an extended online information repository. A•VI•RE supports a semi-structured collection of exhibitions containing various resources and annotations. Users are encouraged to contribute, share, annotate and interpret resources in the system by building their own exhibitions and annotations. However, it is hard to navigate smoothly and efficiently in A•VI•RE because of its high capacity and complexity. We present a visual panel that implements new navigation and communication approaches that support discovery of implied relations. By quickly scanning and interacting with NEAR, users can see not only implied relations but also potential connections among different data elements. NEAR was tested by several users in the A•VI•RE system and shown to be a supportive navigation tool. In the paper, we further analyze the design, report the evaluation and consider its usage in other applications.

A Recommender Agent to Support Virtual Learning Activities

This article describes the implementation of an intelligent agent that provides recommendations for educational resources in a virtual learning environment (VLE). It aims to support pending (undeveloped) student learning activities. It begins by analyzing the proposed VLE data model entities in the recommender process. The pending student activities are then identified, which constitutes the input information for the agent. By using the attribute-based recommender technique, the information can be processed and resource recommendations can be obtained. These serve as support for pending activity development in the course. To integrate this technique, we used an ontology. This served as support for the semantic annotation of attributes and recommended files recovery.

Correspondence between Function and Interaction in Protein Interaction Network of Saccaromyces cerevisiae

Understanding the cell's large-scale organization is an interesting task in computational biology. Thus, protein-protein interactions can reveal important organization and function of the cell. Here, we investigated the correspondence between protein interactions and function for the yeast. We obtained the correlations among the set of proteins. Then these correlations are clustered using both the hierarchical and biclustering methods. The detailed analyses of proteins in each cluster were carried out by making use of their functional annotations. As a result, we found that some functional classes appear together in almost all biclusters. On the other hand, in hierarchical clustering, the dominancy of one functional class is observed. In the light of the clustering data, we have verified some interactions which were not identified as core interactions in DIP and also, we have characterized some functionally unknown proteins according to the interaction data and functional correlation. In brief, from interaction data to function, some correlated results are noticed about the relationship between interaction and function which might give clues about the organization of the proteins, also to predict new interactions and to characterize functions of unknown proteins.

Computational Method for Annotation of Protein Sequence According to Gene Ontology Terms

Annotation of a protein sequence is pivotal for the understanding of its function. Accuracy of manual annotation provided by curators is still questionable by having lesser evidence strength and yet a hard task and time consuming. A number of computational methods including tools have been developed to tackle this challenging task. However, they require high-cost hardware, are difficult to be setup by the bioscientists, or depend on time intensive and blind sequence similarity search like Basic Local Alignment Search Tool. This paper introduces a new method of assigning highly correlated Gene Ontology terms of annotated protein sequences to partially annotated or newly discovered protein sequences. This method is fully based on Gene Ontology data and annotations. Two problems had been identified to achieve this method. The first problem relates to splitting the single monolithic Gene Ontology RDF/XML file into a set of smaller files that can be easy to assess and process. Thus, these files can be enriched with protein sequences and Inferred from Electronic Annotation evidence associations. The second problem involves searching for a set of semantically similar Gene Ontology terms to a given query. The details of macro and micro problems involved and their solutions including objective of this study are described. This paper also describes the protein sequence annotation and the Gene Ontology. The methodology of this study and Gene Ontology based protein sequence annotation tool namely extended UTMGO is presented. Furthermore, its basic version which is a Gene Ontology browser that is based on semantic similarity search is also introduced.

MIM: A Species Independent Approach for Classifying Coding and Non-Coding DNA Sequences in Bacterial and Archaeal Genomes

A number of competing methodologies have been developed to identify genes and classify DNA sequences into coding and non-coding sequences. This classification process is fundamental in gene finding and gene annotation tools and is one of the most challenging tasks in bioinformatics and computational biology. An information theory measure based on mutual information has shown good accuracy in classifying DNA sequences into coding and noncoding. In this paper we describe a species independent iterative approach that distinguishes coding from non-coding sequences using the mutual information measure (MIM). A set of sixty prokaryotes is used to extract universal training data. To facilitate comparisons with the published results of other researchers, a test set of 51 bacterial and archaeal genomes was used to evaluate MIM. These results demonstrate that MIM produces superior results while remaining species independent.

A New Approach to Annotate the Text's of the Websites and Documents with a Quite Comprehensive Knowledge Base

Machine-understandable data when strongly interlinked constitutes the basis for the SemanticWeb. Annotating web documents is one of the major techniques for creating metadata on the Web. Annotating websites defines the containing data in a form which is suitable for interpretation by machines. In this paper, we present a new approach to annotate websites and documents by promoting the abstraction level of the annotation process to a conceptual level. By this means, we hope to solve some of the problems of the current annotation solutions.

Collaborative Document Evaluation: An Alternative Approach to Classic Peer Review

Research papers are usually evaluated via peer review. However, peer review has limitations in evaluating research papers. In this paper, Scienstein and the new idea of 'collaborative document evaluation' are presented. Scienstein is a project to evaluate scientific papers collaboratively based on ratings, links, annotations and classifications by the scientific community using the internet. In this paper, critical success factors of collaborative document evaluation are analyzed. That is the scientists- motivation to participate as reviewers, the reviewers- competence and the reviewers- trustworthiness. It is shown that if these factors are ensured, collaborative document evaluation may prove to be a more objective, faster and less resource intensive approach to scientific document evaluation in comparison to the classical peer review process. It is shown that additional advantages exist as collaborative document evaluation supports interdisciplinary work, allows continuous post-publishing quality assessments and enables the implementation of academic recommendation engines. In the long term, it seems possible that collaborative document evaluation will successively substitute peer review and decrease the need for journals.

MNECLIB2 – A Classical Music Digital Library

Lately there has been a significant boost of interest in music digital libraries, which constitute an attractive area of research and development due to their inherent interesting issues and challenging technical problems, solutions to which will be highly appreciated by enthusiastic end-users. We present here a DL that we have developed to support users in their quest for classical music pieces within a particular collection of 18,000+ audio recordings. To cope with the early DL model limitations, we have used a refined socio-semantic and contextual model that allows rich bibliographic content description, along with semantic annotations, reviewing, rating, knowledge sharing etc. The multi-layered service model allows incorporation of local and distributed information, construction of rich hypermedia documents, expressing the complex relationships between various objects and multi-dimensional spaces, agents, actors, services, communities, scenarios etc., and facilitates collaborative activities to offer to individual users the needed collections and services.

Cross Signal Identification for PSG Applications

The standard investigational method for obstructive sleep apnea syndrome (OSAS) diagnosis is polysomnography (PSG), which consists of a simultaneous, usually overnight recording of multiple electro-physiological signals related to sleep and wakefulness. This is an expensive, encumbering and not a readily repeated protocol, and therefore there is need for simpler and easily implemented screening and detection techniques. Identification of apnea/hypopnea events in the screening recordings is the key factor for the diagnosis of OSAS. The analysis of a solely single-lead electrocardiographic (ECG) signal for OSAS diagnosis, which may be done with portable devices, at patient-s home, is the challenge of the last years. A novel artificial neural network (ANN) based approach for feature extraction and automatic identification of respiratory events in ECG signals is presented in this paper. A nonlinear principal component analysis (NLPCA) method was considered for feature extraction and support vector machine for classification/recognition. An alternative representation of the respiratory events by means of Kohonen type neural network is discussed. Our prospective study was based on OSAS patients of the Clinical Hospital of Pneumology from Iaşi, Romania, males and females, as well as on non-OSAS investigated human subjects. Our computed analysis includes a learning phase based on cross signal PSG annotation.

The Spiral_OWL Model – Towards Spiral Knowledge Engineering

The Spiral development model has been used successfully in many commercial systems and in a good number of defense systems. This is due to the fact that cost-effective incremental commitment of funds, via an analogy of the spiral model to stud poker and also can be used to develop hardware or integrate software, hardware, and systems. To support adaptive, semantic collaboration between domain experts and knowledge engineers, a new knowledge engineering process, called Spiral_OWL is proposed. This model is based on the idea of iterative refinement, annotation and structuring of knowledge base. The Spiral_OWL model is generated base on spiral model and knowledge engineering methodology. A central paradigm for Spiral_OWL model is the concentration on risk-driven determination of knowledge engineering process. The collaboration aspect comes into play during knowledge acquisition and knowledge validation phase. Design rationales for the Spiral_OWL model are to be easy-to-implement, well-organized, and iterative development cycle as an expanding spiral.

An Experiment on Personal Archiving and Retrieving Image System (PARIS)

PARIS (Personal Archiving and Retrieving Image System) is an experiment personal photograph library, which includes more than 80,000 of consumer photographs accumulated within a duration of approximately five years, metadata based on our proposed MPEG-7 annotation architecture, Dozen Dimensional Digital Content (DDDC), and a relational database structure. The DDDC architecture is specially designed for facilitating the managing, browsing and retrieving of personal digital photograph collections. In annotating process, we also utilize a proposed Spatial and Temporal Ontology (STO) designed based on the general characteristic of personal photograph collections. This paper explains PRAIS system.

Dynamic Capitalization and Visualization Strategy in Collaborative Knowledge Management System for EI Process

Knowledge is attributed to human whose problemsolving behavior is subjective and complex. In today-s knowledge economy, the need to manage knowledge produced by a community of actors cannot be overemphasized. This is due to the fact that actors possess some level of tacit knowledge which is generally difficult to articulate. Problem-solving requires searching and sharing of knowledge among a group of actors in a particular context. Knowledge expressed within the context of a problem resolution must be capitalized for future reuse. In this paper, an approach that permits dynamic capitalization of relevant and reliable actors- knowledge in solving decision problem following Economic Intelligence process is proposed. Knowledge annotation method and temporal attributes are used for handling the complexity in the communication among actors and in contextualizing expressed knowledge. A prototype is built to demonstrate the functionalities of a collaborative Knowledge Management system based on this approach. It is tested with sample cases and the result showed that dynamic capitalization leads to knowledge validation hence increasing reliability of captured knowledge for reuse. The system can be adapted to various domains.

AnQL: A Query Language for Annotation Documents

This paper presents data annotation models at five levels of granularity (database, relation, column, tuple, and cell) of relational data to address the problem of unsuitability of most relational databases to express annotations. These models do not require any structural and schematic changes to the underlying database. These models are also flexible, extensible, customizable, database-neutral, and platform-independent. This paper also presents an SQL-like query language, named Annotation Query Language (AnQL), to query annotation documents. AnQL is simple to understand and exploits the already-existent wide knowledge and skill set of SQL.

Soccer Video Edition Using a Multimodal Annotation

In this paper, we present an approach for soccer video edition using a multimodal annotation. We propose to associate with each video sequence of a soccer match a textual document to be used for further exploitation like search, browsing and abstract edition. The textual document contains video meta data, match meta data, and match data. This document, generated automatically while the video is analyzed, segmented and classified, can be enriched semi automatically according to the user type and/or a specialized recommendation system.

Automatic Real-Patient Medical Data De-Identification for Research Purposes

Our Medicine-oriented research is based on a medical data set of real patients. It is a security problem to share patient private data with peoples other than clinician or hospital staff. We have to remove person identification information from medical data. The medical data without private data are available after a de-identification process for any research purposes. In this paper, we introduce an universal automatic rule-based de-identification application to do all this stuff on an heterogeneous medical data. A patient private identification is replaced by an unique identification number, even in burnedin annotation in pixel data. The identical identification is used for all patient medical data, so it keeps relationships in a data. Hospital can take an advantage of a research feedback based on results.