Abstract: Technological advances in computer science and data
analysis continuously produce huge volumes of
biological data, which are available on the web. Such advances
involve and require powerful techniques for data integration to
extract pertinent knowledge and information for a specific question.
Biomedical exploration of these big data often requires the use
of complex queries across multiple autonomous, heterogeneous
and distributed data sources. Semantic integration is an active
area of research in several disciplines, such as databases,
information integration, and ontology. We provide a survey of some
approaches and techniques for integrating biological data, focusing
on those developed in the ontology community.
Abstract: Hyperspectral imagery (HSI) typically provides a
wealth of information captured in a wide range of the
electromagnetic spectrum for each pixel in the image. Hence, a
pixel in HSI is a high-dimensional vector of intensities with a
large spectral range and a high spectral resolution. Semantic
interpretation is therefore a challenging task in HSI analysis. In
this paper, we focus on object classification as a form of HSI semantic
interpretation. However, HSI classification still faces some issues,
among which are the following: the spatial variability of spectral
signatures, the high number of spectral bands, and the high cost
of true sample labeling. Together, the high number of spectral
bands and the low number of training samples give rise to
the curse of dimensionality. To address this problem, we
introduce a dimensionality reduction step to improve
the classification of HSI. The presented approach is a
semi-supervised band selection method based on a spatial hypergraph
embedding model that represents higher-order relationships with
different weights for the spatial neighbors corresponding to the
centroid pixel. This semi-supervised band selection has been
developed to select useful bands for object classification. The
presented approach is evaluated on AVIRIS and ROSIS HSIs
and compared to other dimensionality reduction methods. The
experimental results demonstrate the efficacy of our approach
compared to many existing dimensionality reduction methods for
HSI classification.
Abstract: Document analysis is an important research field that aims to gather information by analyzing the data in documents. Because understanding what people actually want is an important goal in many fields, sentiment analysis has become a vital field closely tied to document analysis. This research focuses on analyzing text documents to classify each document according to the opinion it expresses. The aim is to detect emotions in text documents by enriching the lexicon and adapting its content through semantic pattern extraction. The proposed approach is presented, and experiments from different perspectives reveal its positive impact on the classification results.
Abstract: Attribute or feature selection is one of the basic
strategies to improve the performances of data classification tasks,
and, at the same time, to reduce the complexity of classifiers,
and it is a particularly fundamental one when the number
of attributes is relatively high. Its application to unsupervised
classification is restricted to a limited number of experiments in
the literature. Evolutionary computation has already proven itself
to be a very effective choice to consistently reduce the number
of attributes towards a better classification rate and a simpler
semantic interpretation of the inferred classifiers. We present a feature
selection wrapper model composed of a multi-objective evolutionary
algorithm, the clustering method Expectation-Maximization (EM),
and the classifier C4.5 for the unsupervised classification of data
extracted from a psychological test named BASC-II (Behavior
Assessment System for Children - II ed.) with two objectives:
maximizing the likelihood of the clustering model and maximizing
the accuracy of the obtained classifier. We present a methodology
to integrate feature selection for unsupervised classification, model
evaluation, decision making (to choose the most satisfactory model
according to an a posteriori process in a multi-objective context), and
testing. We compare the performance of the classifier obtained by the
multi-objective evolutionary algorithms ENORA and NSGA-II, and
the best solution is then validated by the psychologists that collected
the data.
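
The decision-making step in a two-objective setting can be illustrated with a Pareto-dominance filter. This is a minimal sketch, not the ENORA or NSGA-II algorithm: the `Candidate` type, the feature names, and the scores are hypothetical, and both objectives (clustering likelihood and classifier accuracy) are assumed to be maximized.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """Hypothetical candidate: a feature subset scored on two objectives."""
    features: frozenset
    log_likelihood: float  # likelihood of the clustering model (maximize)
    accuracy: float        # accuracy of the resulting classifier (maximize)


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both objectives and better on one."""
    no_worse = a.log_likelihood >= b.log_likelihood and a.accuracy >= b.accuracy
    better = a.log_likelihood > b.log_likelihood or a.accuracy > b.accuracy
    return no_worse and better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Illustrative scores only; they are not results from the paper.
candidates = [
    Candidate(frozenset({"f1", "f2"}), -120.0, 0.80),
    Candidate(frozenset({"f1"}), -150.0, 0.85),
    Candidate(frozenset({"f2", "f3"}), -130.0, 0.78),  # dominated by the first
]
front = pareto_front(candidates)
```

An a posteriori decision process would then pick one model from `front` according to the analyst's preferences.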
Abstract: The present study addressed the nature of bilingual semantic processing in Mandarin Chinese and Southern Min and examined category effects and age effects. Nineteen bilingual adults of Mandarin Chinese and Southern Min, nine monolingual seniors of Mandarin Chinese, and ten monolingual seniors of Southern Min in Taiwan individually completed two semantic tasks: Picture naming and category fluency tasks. The instruments for the naming task were sixty black-and-white pictures, including thirty-five object pictures and twenty-five action pictures. The category fluency task also consisted of two semantic categories – objects (or nouns) and actions (or verbs). The reaction time for each picture/question was additionally calculated and analyzed. Oral productions in Mandarin Chinese and in Southern Min were compared and discussed to examine the category effects and age effects. The results of the category fluency task indicated that the content of information of these seniors was comparatively deteriorated, and thus they produced a smaller number of semantic-lexical items. Significant group differences were also found in the reaction time results. Category effects were significant for both adults and seniors in the semantic fluency task. The findings of the present study will help characterize the nature of the bilingual semantic processing of adults and seniors, and contribute to the fields of contrastive and corpus linguistics.
Abstract: Web service adaptation involves the creation of adapters that solve Web services incompatibilities known as mismatches. Since the importance of Web services adaptation is increasing because of the frequent implementation and use of online Web services, this paper presents a literature review of web services to investigate the main methods of adaptation, their theoretical underpinnings and the metrics used to measure adapters performance. Eighteen publications were reviewed independently by two researchers. We found that adaptation techniques are needed to solve different types of problems that may arise due to incompatibilities in Web service interfaces, including protocols, messages, data and semantics that affect the interoperability of the services. Although adapters are non-invasive methods that can improve Web services interoperability and there are current approaches for service adaptation; there is, however, not yet one solution that fits all types of mismatches. Our results also show that only a few research projects incorporate theoretical frameworks and that metrics to measure adapters’ performance are very limited. We conclude that further research on software adaptation should improve current adaptation methods in different layers of the service interoperability and that an adaptation theoretical framework that incorporates a theoretical underpinning and measures of qualitative and quantitative performance needs to be created.
Abstract: Nowadays, ontologies are used for achieving a
common understanding within a user community and for sharing
domain knowledge. However, the decentralized nature of the web
makes it inevitable that small communities will use their own
ontologies to describe their data and to index their own resources.
Accessing resources from various independently created
ontologies is therefore an important challenge for answering end-user
queries. Ontology mapping is thus required for combining ontologies.
However, mapping complete ontologies at run time is a
computationally expensive task. This paper proposes a system in
which mappings between concepts may be generated dynamically as
the concepts are encountered during user queries. In this way, the
interaction itself defines the context in which small and relevant
portions of ontologies are mapped. We illustrate the application of the
proposed system in the context of Technology Enhanced Learning
(TEL), where learners need to access learning resources covering
specific concepts.
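
The idea of generating mappings only when a concept is encountered in a query can be sketched as a lazy, cached lookup. This is an illustration under stated assumptions: the `OnDemandMapper` class, the string-similarity measure from `difflib`, the 0.8 threshold, and the concept labels are all hypothetical, and the abstract does not say how the proposed system compares concepts.

```python
from difflib import SequenceMatcher
from typing import Dict, Iterable, Optional


def similarity(a: str, b: str) -> float:
    """Case-insensitive label similarity in [0, 1] (an assumed measure)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


class OnDemandMapper:
    """Maps source concepts to target concepts only when they are requested."""

    def __init__(self, target_concepts: Iterable[str], threshold: float = 0.8) -> None:
        self.target_concepts = list(target_concepts)
        self.threshold = threshold
        self.cache: Dict[str, Optional[str]] = {}

    def map_concept(self, concept: str) -> Optional[str]:
        # Compute the mapping the first time the concept is seen, then reuse it.
        if concept not in self.cache:
            best = max(self.target_concepts, key=lambda t: similarity(concept, t))
            matched = similarity(concept, best) >= self.threshold
            self.cache[concept] = best if matched else None
        return self.cache[concept]


# Hypothetical target ontology labels and query concepts.
mapper = OnDemandMapper(["Algorithm", "DataStructure", "Database"])
query_concepts = ["algorithms", "zzz"]
mappings = {c: mapper.map_concept(c) for c in query_concepts}
```

Only the concepts that appear in queries end up in `mapper.cache`, so the rest of the ontology is never compared.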
Abstract: Social networks have recently gained a growing
interest on the web. Traditional formalisms for representing social
networks are static and lack semantics. In this
paper, we show how semantic web technologies can be used to
model social data. The SemTemp ontology aligns and extends
existing ontologies such as FOAF, SIOC, SKOS and OWL-Time to
provide a temporal and semantically rich description of social data.
We also present a modeling scenario to illustrate how our ontology
can be used to model social networks.
Abstract: This paper proposes a method of learning topics for
broadcasting contents. Two kinds of text relate to
broadcasting contents. One is the broadcasting script, a series of
texts including directions and dialogues. The other is blogposts, which
contain relatively abstract content, stories, and diverse
information about broadcasting contents. Although the two kinds of text
cover similar broadcasting contents, the words in blogposts and
broadcasting scripts differ. When unseen words appear, a method is needed to
reflect them in the existing topics. In this paper, we introduce a semantic
vocabulary expansion method to reflect unseen words. We expand
topics of the broadcasting script by incorporating the words in
blogposts. Each word in blogposts is added to the most semantically
correlated topics. We use word2vec to get the semantic correlation
between words in blogposts and topics of scripts. The vocabularies of
topics are updated and then posterior inference is performed to
rearrange the topics. In experiments, we verified that the proposed
method can discover more salient topics for broadcasting contents.
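
The step of adding each blogpost word to its most semantically correlated topic can be sketched as follows. This is a minimal illustration with assumptions: the tiny hand-written vectors stand in for word2vec embeddings, each topic is represented by the mean of its word vectors, and the topic names and words are hypothetical. The posterior inference step is not shown.

```python
import math
from typing import Dict, List

Vector = List[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def centroid(vectors: List[Vector]) -> Vector:
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def expand_topics(topics: Dict[str, List[str]],
                  embeddings: Dict[str, Vector],
                  new_words: List[str]) -> Dict[str, List[str]]:
    """Add each unseen word to the topic whose centroid is most similar."""
    centroids = {name: centroid([embeddings[w] for w in words])
                 for name, words in topics.items()}
    expanded = {name: list(words) for name, words in topics.items()}
    for word in new_words:
        best = max(centroids, key=lambda name: cosine(embeddings[word], centroids[name]))
        expanded[best].append(word)
    return expanded


# Toy 2-D vectors standing in for word2vec embeddings (illustrative only).
embeddings = {
    "love": [0.9, 0.1], "kiss": [0.8, 0.2],
    "police": [0.1, 0.9], "murder": [0.2, 0.8],
    "date": [0.7, 0.3], "detective": [0.15, 0.85],
}
script_topics = {"romance": ["love", "kiss"], "crime": ["police", "murder"]}
expanded = expand_topics(script_topics, embeddings, ["date", "detective"])
```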
Abstract: In order to retrieve images efficiently from a large
database, a unique method integrating color and texture features
using genetic programming has been proposed. An opponent color
histogram, which is invariant to shadow, shade, and light intensity,
is employed in the proposed framework for extracting color
features. For texture feature extraction, the fast discrete curvelet
transform, which captures more orientation information at different
scales, is incorporated to represent curve-like edges. A current
challenge in image retrieval is to reduce the semantic gap
between the user's preferences and low-level features. To address this
concern, a genetic algorithm combined with relevance feedback is
embedded to reduce the semantic gap and retrieve the images the user
prefers. Extensive and comparative experiments have been conducted
to evaluate the proposed framework for content-based image retrieval on
two databases, i.e., COIL-100 and Corel-1000. Experimental results
clearly show that the proposed system surpassed other existing
systems in terms of precision and recall. The proposed work achieves
the highest performance, with an average precision of 88.2% on COIL-100
and 76.3% on Corel and an average recall of 69.9% on COIL and 76.3%
on Corel. Thus, the experimental results confirm that the proposed
content-based image retrieval system architecture provides a better
solution for image retrieval.
Abstract: Scripts are one of the basic text resources for understanding
broadcasting contents. Topic modeling is a method for obtaining a
summary of broadcasting contents from their scripts. Generally,
scripts represent contents descriptively with directions and speeches,
and provide scene segments that can be seen as semantic units.
Therefore, a script can be topic modeled by treating each scene segment
as a document. Because scene segments consist mainly of speeches,
however, relatively few word co-occurrences are observed within
scene segments. This inevitably degrades the quality of the
topics learned by statistical methods. To tackle this problem, we
propose a method to improve topic quality with additional word
co-occurrence information obtained using scene similarities. The
main idea is that knowing that two or more texts are topically
related helps to learn high-quality topics. Conversely, more
accurate topical representations yield more accurate information
about whether two texts are related. In this paper, we regard two
scene segments as related if their topical similarity is high enough.
We also consider words to co-occur if they appear together in
topically related scene segments. By iteratively inferring topics and
determining semantically neighboring scene segments, we obtain a
topic space that represents broadcasting contents well. In the
experiments, we showed that the proposed method generates
higher-quality topics from Korean drama scripts than the
baselines.
Abstract: Advances in spatial and spectral resolution of satellite
images have led to tremendous growth in large image databases. The
data we acquire through satellites, radars, and sensors consists of
important geographical information that can be used for remote
sensing applications such as region planning and disaster management.
Spatial data classification and object recognition are important tasks
for many applications. However, classifying objects and identifying
them manually from images is a difficult task. Object recognition is
often treated as a classification problem, so it can be
performed using machine-learning techniques. Although many
machine-learning algorithms exist, the classification is done using
supervised classifiers such as Support Vector Machines (SVM) because the
area of interest is known. We propose a classification method
that considers neighboring pixels in a region for feature extraction
and evaluates classifications according to neighboring
classes for semantic interpretation of the region of interest (ROI). A
dataset was created for training and testing purposes; we
generated the attributes from pixel intensity values and
mean values of reflectance. We demonstrate the benefits of using
knowledge discovery and data-mining techniques, which can be applied to
image data for accurate information extraction and classification from
high spatial resolution remote sensing imagery.
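
Deriving attributes from a pixel and its neighbors can be illustrated with a small sketch. The window shape is an assumption: a square window of `radius` 1 that includes the pixel itself and is clipped at the image border. The function name and the toy image are hypothetical, and a real pipeline would feed such features to a classifier such as an SVM.

```python
from typing import List, Tuple

Image = List[List[float]]


def pixel_features(image: Image, row: int, col: int, radius: int = 1) -> Tuple[float, float]:
    """Return (pixel intensity, mean intensity of the surrounding window)."""
    rows, cols = len(image), len(image[0])
    # Collect the window around (row, col), clipped at the image border.
    window = [image[r][c]
              for r in range(max(0, row - radius), min(rows, row + radius + 1))
              for c in range(max(0, col - radius), min(cols, col + radius + 1))]
    return image[row][col], sum(window) / len(window)


# Toy single-band image (illustrative values only).
image = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0],
]
center = pixel_features(image, 1, 1)  # full 3x3 window
corner = pixel_features(image, 0, 0)  # clipped 2x2 window
```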
Abstract: This article discusses the conversion of a relational
database (RDB) to XML documents (schema and data) based on metadata
and semantic enrichment, which flattens the RDB and
enriches it with the object concept. The integration and use of
the object concept in XML relies on a syntax that allows the
conformity of the XML document to be verified during its
creation. The information extracted from the RDB is therefore
analyzed and filtered to fit the structure of
the XML files and the associated object model. The elements placed
in the XML document are built dynamically through SQL queries. A
prototype was implemented to perform automatic migration, which
proves the effectiveness of this particular approach.
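
A metadata-driven export from a relational table to XML can be sketched with the Python standard library. This is only an illustration: it uses SQLite as a stand-in database, the `table_to_xml` helper and the `student` table are hypothetical, and it does not cover the object-concept enrichment or the conformity checking described in the abstract.

```python
import sqlite3
import xml.etree.ElementTree as ET


def table_to_xml(conn: sqlite3.Connection, table: str) -> ET.Element:
    """Export one table as <table><row><column>value</column>...</row></table>."""
    # Read the column names from the database metadata.
    # Index 1 of each PRAGMA table_info row is the column name.
    columns = [info[1] for info in conn.execute(f"PRAGMA table_info({table})")]
    # Build the SELECT statement dynamically from that metadata.
    query = "SELECT {} FROM {}".format(", ".join(columns), table)
    root = ET.Element(table)
    for record in conn.execute(query):
        row_element = ET.SubElement(root, "row")
        for name, value in zip(columns, record):
            child = ET.SubElement(row_element, name)
            child.text = "" if value is None else str(value)
    return root


# Hypothetical in-memory database used only for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO student VALUES (1, 'Ada')")
xml_text = ET.tostring(table_to_xml(conn, "student"), encoding="unicode")
```

The table name is interpolated directly into the SQL text, which is acceptable for a sketch with trusted input but not for untrusted names.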
Abstract: Radio Frequency Identification (RFID) has become a
key technology in the emerging concept of Internet of Things (IoT).
Naturally, business applications would require the deployment of
various RFID systems developed by different vendors that use
different data formats and structures. This heterogeneity poses a
challenge in developing real-life IoT systems with RFID, as
integration is becoming very complex and challenging. Semantic
integration is a key approach to deal with this challenge. To this end,
an ontology for RFID systems needs to be developed in order to
semantically annotate RFID systems and hence facilitate their
integration. Accordingly, in this paper, we propose an ontology for
RFID systems. The proposed ontology can be used to semantically
enrich RFID systems, and hence, improve their usage and reasoning.
Abstract: There is a real need to integrate types of Open
Educational Resources (OER) with an intelligent system to extract
information and knowledge at the semantic search level. The
need arises because most current learning standards have adopted
web-based learning, and e-learning systems do not always serve all
educational goals. Semantic Web systems provide educators,
students, and researchers with intelligent queries based on a semantic
knowledge management learning system. An ontology-based learning
system is an advanced system in which ontology forms the core of the
semantic web in a smart learning environment. The objective of this
paper is to discuss the potential of ontologies and of mapping different
kinds of ontologies, heterogeneous or homogeneous, to manage and
control different types of Open Educational Resources. The main
contribution of this research is that it uses logical rules and
conceptual relations to map between ontologies of different
educational resources. We expect this methodology to establish
an intelligent educational system supporting student tutoring,
self-learning, and lifelong learning.
Abstract: In this paper, the secure BioSemantic Scheme is
presented to bridge biological/biomedical research problems and
computational solutions via semantic computing. Due to the diversity
of problems in various research fields, the semantic capability
description language (SCDL) plays an important role as a common
language and generic form for problem formalization. SCDL is
expected to be essential for future semantic and logical computing in
the BioSemantic field. We show several examples of biomedical problems
in this paper. Moreover, in the coming age of cloud computing,
security is considered a crucial issue, and we present a
practical scheme to cope with this problem.
Abstract: The enormous amount of information stored on the
web increases from one day to the next, leaving the web
faced with the inevitable difficulty of finding the pertinent information
that users really want. The problem today is not limited to expanding
the size of the information highways; it also involves designing a system
for intelligent search. The vast majority of this information is stored in
relational databases, which in turn represent a backend for managing
RDF data of the semantic web. This problem has motivated us to
write this paper in order to establish an effective approach to support
a semantic transformation algorithm from SPARQL queries to SQL
queries, more precisely SPARQL SELECT queries. By adopting this
method, the relational database can easily be queried with
SPARQL queries maintaining the same performance.
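
The general shape of translating a SPARQL SELECT query into SQL can be illustrated with a small sketch. It makes strong assumptions that are not from the paper: the RDF data is stored in a single `triples(s, p, o)` table, the query is already parsed into a list of triple patterns, and only basic graph patterns are handled. The function name `bgp_to_sql` and the sample data are hypothetical.

```python
import sqlite3
from typing import Dict, List, Tuple

TriplePattern = Tuple[str, str, str]  # terms starting with "?" are variables


def bgp_to_sql(select_vars: List[str],
               patterns: List[TriplePattern]) -> Tuple[str, List[str]]:
    """Translate a basic graph pattern into one self-join over `triples`."""
    bindings: Dict[str, str] = {}  # variable -> first column that binds it
    conditions: List[str] = []
    params: List[str] = []
    for i, pattern in enumerate(patterns):
        for column, term in zip(("s", "p", "o"), pattern):
            ref = f"t{i}.{column}"
            if term.startswith("?"):
                if term in bindings:
                    # Repeated variable: becomes a join condition.
                    conditions.append(f"{ref} = {bindings[term]}")
                else:
                    bindings[term] = ref
            else:
                # Constant term: becomes a parameterized filter.
                conditions.append(f"{ref} = ?")
                params.append(term)
    columns = ", ".join(f'{bindings[v]} AS "{v[1:]}"' for v in select_vars)
    tables = ", ".join(f"triples AS t{i}" for i in range(len(patterns)))
    sql = f"SELECT {columns} FROM {tables}"
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    return sql, params


# Corresponds to:
# SELECT ?name WHERE { ?person rdf:type foaf:Person . ?person foaf:name ?name }
sql, params = bgp_to_sql(
    ["?name"],
    [("?person", "rdf:type", "foaf:Person"), ("?person", "foaf:name", "?name")],
)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", [
    ("ex:alice", "rdf:type", "foaf:Person"),
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:rex", "rdf:type", "ex:Dog"),
    ("ex:rex", "foaf:name", "Rex"),
])
rows = conn.execute(sql, params).fetchall()
```

A mapping onto an existing relational schema, rather than a generic triple table, would replace the `triples` references with the mapped tables and columns.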
Abstract: Model transformation, as a pivotal aspect of model-driven
engineering, attracts more and more attention from both
researchers and practitioners. Many domains (enterprise engineering,
software engineering, knowledge engineering, etc.) use model
transformation principles and practices to address their
domain-specific problems; furthermore, model transformation could also be
used to bridge the gap between different domains by sharing and
exchanging knowledge. Since model transformation has been widely
used, a new requirement has emerged: to define the transformation
process effectively and efficiently and to reduce the manual effort
involved. This paper presents an automatic model transformation
methodology based on semantic and syntactic comparisons, and
focuses particularly on the granularity issue that exists in the
transformation process. Compared to traditional model transformation
methodologies, this methodology serves a general purpose: it is a
cross-domain methodology. Semantic and syntactic checking
measurements are combined into a refined transformation process,
which solves the granularity issue. Moreover, semantic and syntactic
comparisons are supported by a software tool; manual effort is replaced
in this way.
Abstract: Ontology validation is an important part of web
applications’ development, where knowledge integration and
ontological reasoning play a fundamental role. It aims to ensure the
consistency and correctness of ontological knowledge and to
guarantee that ontological reasoning is carried out in a meaningful
way. Existing approaches to ontology validation address more or less
specific validation issues, but the overall process of validating web
ontologies has not been formally established yet. As the size and the
number of web ontologies continue to grow, more web applications’
developers will rely on the existing repository of ontologies rather
than develop ontologies from scratch. If an application utilizes
multiple independently created ontologies, their consistency must be
validated and eventually adjusted to ensure proper interoperability
between them. This paper presents a validation technique intended to
test the consistency of independent ontologies utilized by a common
application.
Abstract: The centuries-long growth in the volume of text data, such as
books and articles in libraries, has made it necessary to establish
effective mechanisms to locate them. Early techniques such as
abstraction, indexing and the use of classification categories
marked the birth of a new field of research called "Information
Retrieval". Information Retrieval (IR) can be defined as the task of
defining models and systems whose purpose is to facilitate access to
a set of documents in electronic form (a corpus) so that users can find
the relevant ones, that is, the contents that match
their information needs. This paper presents a new
semantic indexing approach for a documentary corpus. The indexing
process starts with a term weighting phase to determine the
importance of the terms in the documents. Then the use of a
thesaurus such as WordNet allows moving to the conceptual level.
Each candidate concept is evaluated by determining its level of
representation of the document, that is, the importance of the
concept in relation to other concepts of the document. Finally, the
semantic index is constructed by attaching to each concept of the
ontology the documents of the corpus in which that concept is
found.
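
The three indexing phases (term weighting, moving to concepts, attaching documents to concepts) can be sketched as below. This is an illustration under assumptions: TF-IDF is used as the weighting scheme, a small hand-written dictionary stands in for a WordNet lookup, and a concept's level of representation in a document is taken to be the sum of the weights of its terms. None of these choices is stated in the abstract.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, List


def tf_idf(docs: Dict[str, List[str]]) -> Dict[str, Dict[str, float]]:
    """Weight each term of each document by term frequency times log(N / df)."""
    n_docs = len(docs)
    doc_freq = Counter(term for tokens in docs.values() for term in set(tokens))
    weights: Dict[str, Dict[str, float]] = {}
    for doc_id, tokens in docs.items():
        counts = Counter(tokens)
        weights[doc_id] = {term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
                           for term, count in counts.items()}
    return weights


def build_semantic_index(docs: Dict[str, List[str]],
                         term_to_concept: Dict[str, str]) -> Dict[str, Dict[str, float]]:
    """Map weighted terms to concepts and attach documents to each concept."""
    index: Dict[str, Dict[str, float]] = defaultdict(dict)
    for doc_id, term_weights in tf_idf(docs).items():
        concept_scores: Dict[str, float] = defaultdict(float)
        for term, weight in term_weights.items():
            concept = term_to_concept.get(term)
            if concept is not None:
                concept_scores[concept] += weight  # assumed aggregation rule
        for concept, score in concept_scores.items():
            if score > 0.0:
                index[concept][doc_id] = score
    return dict(index)


# Toy corpus and a hand-written stand-in for a thesaurus lookup.
docs = {
    "d1": ["car", "engine", "road"],
    "d2": ["automobile", "wheel"],
    "d3": ["banana", "fruit"],
}
term_to_concept = {
    "car": "motor_vehicle", "automobile": "motor_vehicle",
    "banana": "fruit", "fruit": "fruit",
}
index = build_semantic_index(docs, term_to_concept)
```

Documents `d1` and `d2` share no term, yet both are attached to the same concept, which is the benefit of indexing at the conceptual level.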