Semi-Automatic Approach for Semantic Annotation

The third phase of web means semantic web requires many web pages which are annotated with metadata. Thus, a crucial question is where to acquire these metadata. In this paper we propose our approach, a semi-automatic method to annotate the texts of documents and web pages and employs with a quite comprehensive knowledge base to categorize instances with regard to ontology. The approach is evaluated against the manual annotations and one of the most popular annotation tools which works the same as our tool. The approach is implemented in .net framework and uses the WordNet for knowledge base, an annotation tool for the Semantic Web.

A Proposed Information Extraction Technique in Engineering Drawing for Reuse Design

The extensive number of engineering drawing will be referred for planning process and the changes will produce a good engineering design to meet the demand in producing a new model. The advantage in reuse of engineering designs is to allow continuous product development to further improve the quality of product development, thus reduce the development costs. However, to retrieve the existing engineering drawing, it is time consuming, a complex process and are expose to errors. Engineering drawing file searching system will be proposed to solve this problem. It is essential for engineer and designer to have some sort of medium to enable them to search for drawing in the most effective way. This paper lays out the proposed research project under the area of information extraction in engineering drawing.

Use of Bayesian Network in Information Extraction from Unstructured Data Sources

This paper applies Bayesian Networks to support information extraction from unstructured, ungrammatical, and incoherent data sources for semantic annotation. A tool has been developed that combines ontologies, machine learning, and information extraction and probabilistic reasoning techniques to support the extraction process. Data acquisition is performed with the aid of knowledge specified in the form of ontology. Due to the variable size of information available on different data sources, it is often the case that the extracted data contains missing values for certain variables of interest. It is desirable in such situations to predict the missing values. The methodology, presented in this paper, first learns a Bayesian network from the training data and then uses it to predict missing data and to resolve conflicts. Experiments have been conducted to analyze the performance of the presented methodology. The results look promising as the methodology achieves high degree of precision and recall for information extraction and reasonably good accuracy for predicting missing values.