Thematic Role Extraction Using Shallow Parsing

Extracting thematic (semantic) roles is one of the major steps in representing text meaning. It refers to finding the semantic relations between a predicate and syntactic constituents in a sentence. In this paper we present a rule-based approach to extract semantic roles from Persian sentences. The system exploits a two-phase architecture to (1) identify the arguments and (2) label them for each predicate. For the first phase we developed a rule-based shallow parser to chunk Persian sentences, and for the second phase we developed a knowledge-based system to assign 16 selected thematic roles to the chunks. The experimental results of testing each phase are shown at the end of the paper.

Behavior Model Mapping and Transformation using Model-Driven Architecture

Model mapping and transformation are important processes in high-level system abstractions, and form the cornerstone of model-driven architecture (MDA) techniques. Considerable research in this field has devoted attention to static system abstraction, despite the fact that most systems are dynamic, with high-frequency changes in behavior. In this paper we provide an overview of work that has been done with regard to behavior model mapping and transformation, based on: (1) the completeness of the platform independent model (PIM); (2) semantics of behavioral models; (3) languages supporting behavior model transformation processes; and (4) an evaluation of model composition to effect the best approach to describing large systems with high complexity.

Talent in Autism: Cognitive Style based on Weak Central Coherence and Special Sensory Characteristics in State of Kuwait: Case Study

The study aimed to identify the nature of autistic talent, the manifestations of weak central coherence, and the associated sensory characteristics. The case study comprised four talented autistic males: two talented in drawing, one in clay formation, and one in jigsaw puzzles. The data collection tools were the Group Embedded Figures Test, the Block Design Test, the Sensory Profile Checklist Revised, interview forms, and direct observation. Results indicated that talent among autistic individuals emerges in a limited domain and is extraordinary in each case, with overlapping construction properties. The cases show three perceptual aspects of weak central coherence: weak visual spatial-constructional coherence, weak perceptual coherence, and weak verbal-semantic coherence. Moreover, the majority of the study cases used the three strategies of weak central coherence (segmentation, obliqueness, and rotation). As for sensory characteristics, all study cases showed a number of them, emerging especially in the visual system.

A New Model for Discovering XML Association Rules from XML Documents

The inherent flexibility of XML in both structure and semantics makes mining XML data a complex task, with more challenges than traditional association rule mining in relational databases. In this paper, we propose a new model for the effective extraction of generalized association rules from an XML document collection. We directly use frequent subtree mining techniques in the discovery process and do not ignore the tree structure of the data in the final rules. The frequent subtrees, found using the user-provided support, are split into complement subtrees to form the rules. We explain our model in multiple steps, from data preparation to rule generation.

A New Similarity Measure Based On Edge Counting

In the field of concepts, the measure of Wu and Palmer [1] has the advantage of being simple to implement and performing well compared to other similarity measures [2]. Nevertheless, the Wu and Palmer measure has the following disadvantage: in some situations, the similarity of two elements of an IS-A ontology contained in the neighborhood exceeds the similarity value of two elements contained in the same hierarchy. This situation is inadequate within the information retrieval framework. To overcome this problem, we propose a new similarity measure based on the Wu and Palmer measure. Our objective is to obtain realistic results for concepts not located in the same way. The obtained results show that, compared to the Wu and Palmer approach, our measure offers a gain in terms of relevance and execution time.
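
For orientation, the baseline Wu and Palmer measure that the abstract builds on can be sketched as follows. This is a minimal illustration of the commonly cited formula sim(a, b) = 2 * depth(lcs) / (depth(a) + depth(b)), not the new measure proposed in the paper. The toy taxonomy, the function names, and the convention that the root has depth 1 are all assumptions made for this example.

```python
# Hypothetical IS-A taxonomy (child -> parent); the root has parent None.
PARENT = {
    "animal": None,
    "mammal": "animal",
    "bird": "animal",
    "dog": "mammal",
    "cat": "mammal",
    "sparrow": "bird",
}


def ancestors(concept):
    """Return the chain [concept, parent, ..., root]."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = PARENT[concept]
    return chain


def depth(concept):
    """Depth counted in nodes, so the root has depth 1 (an assumption)."""
    return len(ancestors(concept))


def lowest_common_subsumer(a, b):
    """Deepest concept that is an ancestor of (or equal to) both a and b."""
    seen = set(ancestors(a))
    for candidate in ancestors(b):  # walks from b up towards the root
        if candidate in seen:
            return candidate
    raise ValueError("concepts do not share a root")


def wu_palmer(a, b):
    """Baseline Wu and Palmer similarity on the toy taxonomy."""
    lcs = lowest_common_subsumer(a, b)
    return 2 * depth(lcs) / (depth(a) + depth(b))
```

On this toy taxonomy, `wu_palmer("dog", "cat")` evaluates to 2/3 and `wu_palmer("dog", "sparrow")` to 1/3, because the first pair shares the deeper subsumer `mammal` while the second pair only shares the root.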

A Semantic Web Based Ontology in the Financial Domain

The paper describes the design of an ontology in the financial domain for mutual funds. The design of this ontology consists of four steps, namely, specification, knowledge acquisition, implementation and semantic query. Specification includes a description of the taxonomy and the different types of mutual funds and their scope. Knowledge acquisition involves information extraction from heterogeneous resources. Implementation describes the conceptualization and encoding of this data. Finally, semantic query permits complex queries over the integrated data and the mapping of database entities to ontological concepts.

Software Architectural Design Ontology

Software Architecture plays a key role in software development, but the absence of a formal description of Software Architecture causes various impediments in software development. To cope with these difficulties, ontology has been used as an artifact. This paper proposes an ontology for Software Architectural design based on the IEEE model for architecture description and the Kruchten 4+1 model for viewpoint classification. For the categorization of styles and views, ISO/IEC 42010 has been used. The corpus method has been used to evaluate the ontology. The main aim of the proposed ontology is to classify and locate Software Architectural design information.

Word Stemming Algorithms and Retrieval Effectiveness in Malay and Arabic Documents Retrieval Systems

Document retrieval in Information Retrieval Systems (IRS) is generally about understanding the information in the documents concerned. The better the system understands the contents of documents, the more effective the retrieval outcomes will be. But understanding the contents is a very complex task. Conventional IRS apply algorithms that can only approximate the meaning of document contents through a keyword approach using the vector space model. Keywords may be unstemmed or stemmed. When keywords are stemmed and conflated in the retrieval process, we are a step closer to applying semantic technology in IRS. Word stemming is a process in morphological analysis under natural language processing, performed before syntactic and semantic analysis. We have developed algorithms for Malay and Arabic and incorporated stemming in our experimental systems in order to measure retrieval effectiveness. The results show that retrieval effectiveness increases when stemming is used in the systems.

A Review of Enterprise Risk Management Practices among Malaysian Public Listed Companies

The risk sphere in business is fast changing and expanding. Almost anything has become a risk factor that will have potent, direct, and far-reaching impacts on business. This paper examines the intensity of enterprise risk management (ERM) practices among Malaysian public listed companies. The paper espouses an ERM framework comprising fourteen important implementation elements and processes. Results of the analysis indicate that the intensity of ERM implementation among the respondents is in the ‘good’ category of the semantic scale, which is deemed encouraging vis-à-vis the country’s regulatory regime.

Computational Networks for Knowledge Representation

In the artificial intelligence field, knowledge representation and reasoning are important areas for intelligent systems, especially knowledge base systems and expert systems. Knowledge representation methods have an important role in designing these systems. There have been many models for knowledge, such as semantic networks, conceptual graphs, and neural networks. These models are useful tools for designing intelligent systems. However, they are not suitable for representing knowledge in real-world application domains. In this paper, new models for knowledge representation called computational networks are presented. They have been used in designing several knowledge base systems in education for solving problems, such as a system that supports studying knowledge and solving analytic geometry problems, a program for studying and solving problems in plane geometry, and a program for solving problems about alternating current in physics.

The Semantic Web: a New Approach for Future World Wide Web

The purpose of semantic web research is to transform the Web from a linked document repository into a distributed knowledge base and application platform, thus allowing the vast range of available information and services to be more efficiently exploited. As a first step in this transformation, languages such as OWL have been developed. Although fully realizing the Semantic Web still seems some way off, OWL has already been very successful and has rapidly become a de facto standard for ontology development in fields as diverse as geography, geology, astronomy, agriculture, defence and the life sciences. The aim of this paper is to classify key concepts of the Semantic Web and to introduce a new practical approach that uses these concepts to outperform the World Wide Web.

On Analysis of Boundness Property for ECATNets by Using Rewriting Logic

To analyze the behavior of Petri nets, the accessibility graph and Model Checking are widely used. However, if the analyzed Petri net is unbounded, the accessibility graph becomes infinite and Model Checking cannot be used, even for small Petri nets. ECATNets [2] are a category of algebraic Petri nets. The main feature of ECATNets is their sound and complete semantics based on rewriting logic [8] and its language Maude [9]. ECATNets can be analyzed using the accessibility analysis and Model Checking techniques defined in Maude. However, these two techniques do not work with infinite-state systems either. As a category of Petri nets, ECATNets can be unbounded and therefore infinite-state systems. To determine whether Maude's accessibility analysis and Model Checking can be applied to an ECATNet, we propose in this paper an algorithm that detects whether the ECATNet is bounded. Moreover, we propose a rewriting-logic-based tool implementing this algorithm. We show that the development of this tool using the Maude system is facilitated by the reflectivity of rewriting logic. Indeed, the self-interpretation of this logic allows us both to model an ECATNet and to act on it.
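
As a rough illustration of what a boundedness check involves, the following sketch applies the classical coverability-style test to an ordinary place/transition net. It is not the ECATNet algorithm or the Maude tool described in the abstract. The encoding of markings as tuples, the encoding of transitions as (consume, produce) pairs, and the names `enabled`, `fire`, and `is_bounded` are assumptions made for this example.

```python
def enabled(marking, transition):
    """A transition is enabled if every place holds enough tokens."""
    consume, _ = transition
    return all(have >= need for have, need in zip(marking, consume))


def fire(marking, transition):
    """Return the marking reached by firing an enabled transition."""
    consume, produce = transition
    return tuple(h - c + p for h, c, p in zip(marking, consume, produce))


def strictly_covers(m, a):
    """True if m >= a in every place and m differs from a."""
    return m != a and all(x >= y for x, y in zip(m, a))


def is_bounded(initial, transitions):
    """Depth-first exploration of the reachable markings.

    The net is reported unbounded as soon as a marking strictly covers
    one of its ancestors on the current path, since the firing sequence
    between them can then be repeated forever.
    """
    visited = set()

    def explore(marking, path):
        if any(strictly_covers(marking, a) for a in path):
            return False
        if marking in visited:
            return True
        visited.add(marking)
        path.append(marking)
        for t in transitions:
            if enabled(marking, t) and not explore(fire(marking, t), path):
                return False
        path.pop()
        return True

    return explore(tuple(initial), [])


# Hypothetical nets: a token cycling between two places (bounded) and
# a single transition that produces tokens from nothing (unbounded).
cycle = [((1, 0), (0, 1)), ((0, 1), (1, 0))]
source = [((0,), (1,))]
```

With these toy nets, `is_bounded((1, 0), cycle)` returns `True` and `is_bounded((0,), source)` returns `False`. The sketch uses recursion for brevity, so it is only suitable for small nets.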

New Approach for Manipulation of Stratified Programs

Negation is useful in the majority of real-world applications. However, its introduction leads to semantic and canonical problems. We propose in this paper an approach based on stratification to deal with negation problems. This approach is based on an extension of predicate nets. It makes two main contributions. The first concerns the management of the whole class of stratified programs. The second concerns the optimization of usual operations on stratified programs (maximal stratification, incremental updates ...).
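
To make the notion of stratification concrete, here is a minimal sketch of the textbook stratum-assignment check for logic programs with negation. It does not use predicate nets and is not the approach proposed in the paper. The rule encoding (a head predicate plus a list of body predicates flagged as negated or not) and the function name `stratify` are assumptions made for this example.

```python
def stratify(rules):
    """Assign a stratum to every predicate, or return None.

    rules: list of (head, body), where body is a list of
    (predicate, negated) pairs. A head must sit at least as high as
    each positive body predicate and strictly higher than each negated
    one. If a stratum ever exceeds the number of predicates, some cycle
    passes through negation and the program is not stratified.
    """
    predicates = set()
    for head, body in rules:
        predicates.add(head)
        predicates.update(p for p, _ in body)

    stratum = {p: 1 for p in predicates}
    limit = len(predicates)
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            for pred, negated in body:
                required = stratum[pred] + 1 if negated else stratum[pred]
                if stratum[head] < required:
                    if required > limit:
                        return None  # recursion through negation
                    stratum[head] = required
                    changed = True
    return stratum


# Hypothetical program: reachability, then its complement via negation.
reach_program = [
    ("reach", [("edge", False)]),
    ("reach", [("reach", False), ("edge", False)]),
    ("unreach", [("node", False), ("reach", True)]),
]

# Hypothetical non-stratified program: p and q negate each other.
bad_program = [
    ("p", [("q", True)]),
    ("q", [("p", True)]),
]
```

For `reach_program` the function places `edge`, `node`, and `reach` in stratum 1 and `unreach` in stratum 2. For `bad_program` it returns `None`.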

Development of a Semantic Wiki-based Feature Library for the Extraction of Manufacturing Feature and Manufacturing Information

A manufacturing feature can be defined simply as a geometric shape together with the manufacturing information needed to create that shape. In a feature-based process planning system, a feature library, which consists of pre-defined manufacturing features and the manufacturing information needed to create their shapes, plays an important role in the extraction of manufacturing features with their proper manufacturing information. However, to manage the manufacturing information flexibly, it is important to build a feature library that can be easily modified. In this paper, the implementation of a Semantic Wiki for the development of the feature library is proposed.

Analyzing Multi-Labeled Data Based on the Roll of a Concept against a Semantic Range

Classifying data hierarchically is an efficient approach to analyzing data. Data is usually classified into multiple categories, or annotated with a set of labels. To analyze multi-labeled data, such data must be specified by giving a set of labels as a semantic range. Data is analyzed for certain purposes. This paper shows which multi-labeled data should be the target of analysis for those purposes, and discusses the role of a label against a set of labels by investigating the change that occurs when a label is added to the set. These discussions yield methods for the advanced analysis of multi-labeled data, which are based on the role of a label against a semantic range.

Mapping Semantic Networks to Undirected Networks

There exists an injective, information-preserving function that maps a semantic network (i.e. a directed labeled network) to a directed network (i.e. a directed unlabeled network). The edge label in the semantic network is represented as a topological feature of the directed network. Also, there exists an injective function that maps a directed network to an undirected network (i.e. an undirected unlabeled network). The edge directionality in the directed network is represented as a topological feature of the undirected network. Through function composition, there exists an injective function that maps a semantic network to an undirected network. Thus, aside from space constraints, the semantic network construct does not have any modeling functionality that is not possible with either a directed or undirected network representation. Two proofs of this idea will be presented. The first is a proof of the aforementioned function composition concept. The second is a simpler proof involving an undirected binary encoding of a semantic network.

Maya Semantic Technique: A Mathematical Technique Used to Determine Partial Semantics for Declarative Sentences

This research uses computational linguistics, an area of study that employs a computer to process natural language, and aims at discerning the patterns that exist in declarative sentences used in technical texts. The approach is mathematical, and the focus is on instructional texts found on web pages. The technique developed by the author and named the MAYA Semantic Technique is used here and organized into four stages. In the first stage, the parts of speech in each sentence are identified. In the second stage, the subject of the sentence is determined. In the third stage, MAYA performs a frequency analysis on the remaining words to determine the verb and its object. In the fourth stage, MAYA does statistical analysis to determine the content of the web page. The advantage of the MAYA Semantic Technique lies in its use of mathematical principles to represent grammatical operations which assist processing and accuracy if performed on unambiguous text. The MAYA Semantic Technique is part of a proposed architecture for an entire web-based intelligent tutoring system. On a sample set of sentences, partial semantics derived using the MAYA Semantic Technique were approximately 80% accurate. The system currently processes technical text in one domain, namely Cµ programming. In this domain all the keywords and programming concepts are known and understood.

Hybrid Machine Learning Approach for Text Categorization

Text categorization - the assignment of natural language documents to one or more predefined categories based on their semantic content - is an important component in many information organization and management tasks. The performance of neural network learning is known to be sensitive to the initial weights and architecture. This paper discusses the use of multilayer neural network initialization with a decision tree classifier for improving text categorization accuracy. An adaptation of the algorithm is proposed in which a decision tree, from the root node to a final leaf, is used for the initialization of a multilayer neural network. The experimental evaluation demonstrates that this approach provides better classification accuracy on the Reuters-21578 corpus, one of the standard benchmarks for text categorization tasks. We present results comparing the accuracy of this approach with a multilayer neural network initialized with the traditional random method and with decision tree classifiers.

An Ontology Based Question Answering System on Software Test Document Domain

Processing data by computers and performing reasoning tasks is an important aim in Computer Science. The Semantic Web is one step towards it. The use of ontologies to enhance information semantically is the current trend. A huge amount of domain-specific, unstructured on-line data needs to be expressed in a machine-understandable and semantically searchable format. Currently users are often forced to search manually in the results returned by keyword-based search services. They also want to use their native languages to express what they search for. In this paper, an ontology-based automated question answering system for the software test document domain is presented. The system allows users to enter a question about the domain in natural language and returns the exact answer to the question. Converting the natural language question into an ontology-based query is the challenging part of the system. To achieve this, a new algorithm for converting free text into an ontology-based search engine query is proposed. The algorithm is based on identifying the suitable question type and parsing the words of the question sentence.

A Universal Model for Content-Based Image Retrieval

In this paper a novel approach for generalized image retrieval based on semantic contents is presented. It combines three feature extraction methods, namely color, texture, and edge histogram descriptor. There is a provision to add new features in the future for better retrieval efficiency. Any combination of these methods that is more appropriate for the application can be used for retrieval. This is provided through the User Interface (UI) in the form of relevance feedback. The image properties are analyzed using computer vision and image processing algorithms. For color, the histograms of the images are computed; for texture, co-occurrence matrix based entropy, energy, etc. are calculated; and for edge density, the Edge Histogram Descriptor (EHD) is found. For the retrieval of images, a novel idea based on a greedy strategy is developed to reduce the computational complexity. The entire system was developed using AForge.Imaging (an open source product), MATLAB .NET Builder, C#, and Oracle 10g. The system was tested with the Coral Image database containing 1000 natural images and achieved better results.
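
The co-occurrence matrix texture features mentioned above (entropy and energy) can be sketched as follows. This is a generic, pure-Python illustration of the standard definitions, not the system's AForge.Imaging or MATLAB implementation. The pixel offset, the use of base-2 logarithms, and the function names are assumptions made for this example.

```python
import math
from collections import Counter


def cooccurrence(image, dx=1, dy=0):
    """Normalized gray-level co-occurrence probabilities.

    image: list of equal-length rows of integer gray levels.
    (dx, dy): offset to the neighbor pixel; the default pairs each
    pixel with its right-hand neighbor, so the image needs at least
    two columns.
    """
    rows, cols = len(image), len(image[0])
    counts = Counter()
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[(image[r][c], image[r2][c2])] += 1
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


def energy(probabilities):
    """Sum of squared co-occurrence probabilities."""
    return sum(p * p for p in probabilities.values())


def entropy(probabilities):
    """Shannon entropy of the co-occurrence distribution, in bits."""
    return -sum(p * math.log2(p) for p in probabilities.values() if p > 0)


# Hypothetical 2x2 test images: a flat patch and a checkerboard.
flat = [[5, 5], [5, 5]]
checker = [[0, 1], [1, 0]]
```

A flat patch yields energy 1 and entropy 0, because only one gray-level pair ever occurs. The checkerboard yields energy 0.5 and entropy 1 bit, because two pairs occur equally often.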