Abstract: Image registration is an important topic for many imaging systems and computer vision applications. Standard image registration techniques, such as methods based on mutual information (MI) or normalized mutual information (NMI), have limited performance because they do not consider spatial information, that is, the relationships between neighbouring pixels or voxels. In addition, image noise may significantly affect registration accuracy. Therefore, this paper proposes an efficient method that explicitly considers the relationships between adjacent pixels: the gradient information of the reference and scene images is extracted first, and the cosine similarity of the extracted gradient information is then computed and used to improve the accuracy of the standard normalized mutual information measure. Our experimental results on different data types (i.e. CT, MRI and thermal images) show that the proposed method outperforms a number of image registration techniques in terms of accuracy.
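The abstract above does not say how the gradient cosine similarity enters the NMI measure, so the following is only a minimal sketch under stated assumptions: NMI is taken as (H(A) + H(B)) / H(A, B) over quantized intensities, gradients are central differences, and the two terms are combined by simple multiplication. The function names, the bin count and the combination rule are all illustrative choices, not the authors' method.

```python
import math
from collections import Counter


def entropy(counts, total):
    """Shannon entropy (in bits) of a histogram given as raw counts."""
    return -sum((c / total) * math.log2(c / total) for c in counts if c)


def nmi(a, b, bins=8, max_val=256):
    """Normalized mutual information (H(A) + H(B)) / H(A, B) of two images."""
    qa = [v * bins // max_val for row in a for v in row]
    qb = [v * bins // max_val for row in b for v in row]
    n = len(qa)
    h_a = entropy(Counter(qa).values(), n)
    h_b = entropy(Counter(qb).values(), n)
    h_ab = entropy(Counter(zip(qa, qb)).values(), n)
    # Constant images give H(A, B) = 0; returning 1.0 is a convention chosen here.
    return (h_a + h_b) / h_ab if h_ab > 0 else 1.0


def gradients(img):
    """Central-difference gradient (gx, gy) per pixel, clamped at the borders."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        for x in range(w):
            gx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
            gy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
            out.append((gx, gy))
    return out


def gradient_cosine(a, b):
    """Mean cosine similarity of gradient vectors where both are non-zero."""
    sims = []
    for (ax, ay), (bx, by) in zip(gradients(a), gradients(b)):
        na, nb = math.hypot(ax, ay), math.hypot(bx, by)
        if na > 0 and nb > 0:
            sims.append((ax * bx + ay * by) / (na * nb))
    return sum(sims) / len(sims) if sims else 0.0


def gradient_weighted_nmi(a, b):
    """Assumed combination: NMI scaled by the non-negative gradient agreement."""
    return nmi(a, b) * max(gradient_cosine(a, b), 0.0)


reference = [
    [0, 50, 100, 150],
    [10, 60, 110, 160],
    [20, 70, 120, 170],
    [30, 80, 130, 180],
]
# A transposed copy stands in for a misaligned scene image.
scene = [list(col) for col in zip(*reference)]
```

In a registration loop, a score like `gradient_weighted_nmi` would be evaluated for each candidate transformation of the scene image and the transformation with the highest score kept.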
Abstract: Current database management systems offer various tools, such as backup and maintenance, and also provide statistical information on matters such as resource usage and security. With respect to query performance, this paper covers query optimization, views, indexed tables, precomputed materialized views, and query performance analysis, in which alternative query plans can be created and the least costly one selected to optimize a query. Indexes and views can be created for related table columns. The literature review of this study showed that, despite the growing capabilities of database management systems over time, only database administrators are aware of the need to handle archival and transactional data differently. Such data may be constantly changing data used in everyday life, or data from a completed questionnaire whose data entry has finished. The database applies its capabilities to both types of data; but, as shown in the findings section, instead of repeating heavy calculations that return the same results for the same query over survey results, using materialized view results is simpler. In this study, this performance difference was observed quantitatively in terms of query cost.
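The idea of reusing a precomputed result over completed survey data can be sketched as follows. This is an illustration only: SQLite has no native materialized views, so the sketch emulates one with a summary table built once from the aggregate query, and the table and column names (`survey_answers`, `survey_summary`) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE survey_answers (respondent INTEGER, question TEXT, score INTEGER);
    INSERT INTO survey_answers VALUES
        (1, 'q1', 4), (2, 'q1', 2), (3, 'q1', 3),
        (1, 'q2', 5), (2, 'q2', 5), (3, 'q2', 2);
""")

# The "heavy" aggregate that would otherwise be recomputed on every request.
AGGREGATE = """
    SELECT question, COUNT(*) AS answers, AVG(score) AS mean_score
    FROM survey_answers
    GROUP BY question
    ORDER BY question
"""

# Emulated materialized view: run the aggregate once and store its result.
conn.execute(f"CREATE TABLE survey_summary AS {AGGREGATE}")

# Recomputing from the base table versus reading the stored result.
live = conn.execute(AGGREGATE).fetchall()
cached = conn.execute(
    "SELECT question, answers, mean_score FROM survey_summary ORDER BY question"
).fetchall()
```

Because the questionnaire data no longer changes, the stored result never goes stale. In a system with native support (for example PostgreSQL's `CREATE MATERIALIZED VIEW`), the summary table would be replaced by the built-in construct.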
Abstract: Big Data (BD) is associated with a new generation of technologies and architectures which can harness the value of extremely large volumes of very varied data through real time processing and analysis. It involves changes in (1) data types, (2) accumulation speed, and (3) data volume. This paper presents the main concepts related to the BD paradigm, and introduces architectures and technologies for BD and BD sets. The integration of BD with the Hadoop Framework is also underlined. BD has attracted a lot of attention in the public sector due to the newly emerging technologies that allow the availability of network access. The volume of different types of data has exponentially increased. Some applications of BD in the public sector in Romania are briefly presented.
Abstract: A knowledge base stores facts and rules about the
world that applications can use for the purpose of reasoning. By
applying the concept of granular computing to a knowledge base,
several advantages emerge. These can be harnessed by applications
to improve their capabilities and performance. In this paper, the
concept behind such a construct, called a granular knowledge cube,
is defined, and its intended use as an instrument that manages to
cope with different data types and detect knowledge domains is
elaborated. Furthermore, the underlying architecture, consisting of the
three layers of the storing, representing, and structuring of knowledge,
is described. Finally, benefits as well as challenges of deploying it
are listed alongside application types that could profit from having
such an enhanced knowledge base.
Abstract: Over the past years, a large amount of work has been done on data clustering, an unsupervised learning technique in data mining. Several algorithms and methods have been proposed that focus on clustering different data types, on the representation of cluster models, and on the accuracy of the clusters. However, no single clustering algorithm has proved to be the most efficient at providing the best results. To address this issue, a new technique called the cluster ensemble method emerged. The cluster ensemble is a good alternative approach to the cluster analysis problem. Its main aim is to merge different clustering solutions in such a way as to achieve accuracy and to improve on the quality of the individual data clusterings. The substantial and continuing development of new methods in data mining, together with the sustained interest in inventing new algorithms, makes it necessary to analyse the existing techniques critically and to consider future developments. This paper presents a comparative study of different cluster ensemble methods, covering their features, their systematic working process, and the average accuracy and error rates of each ensemble method. This theoretical and comprehensive analysis will be useful for the community of clustering practitioners and will also help in deciding which method is most suitable for the problem at hand.
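To make the idea of merging clustering solutions concrete, here is a minimal sketch of one well-known ensemble scheme, the co-association (evidence accumulation) approach. It is not necessarily one of the methods compared in the paper; the function names and the 0.5 threshold are assumptions made for illustration.

```python
def co_association(partitions):
    """Fraction of base clusterings in which each pair of points shares a cluster."""
    n = len(partitions[0])
    m = len(partitions)
    return [
        [sum(p[i] == p[j] for p in partitions) / m for j in range(n)]
        for i in range(n)
    ]


def consensus(partitions, threshold=0.5):
    """Link points that co-occur in more than `threshold` of the base
    clusterings and return the connected components as consensus labels."""
    ca = co_association(partitions)
    n = len(ca)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and ca[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Three base clusterings of six points; each one disagrees slightly with the others.
base_clusterings = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 0, 0, 0, 0],  # point 2 placed with the second group
    [2, 2, 2, 1, 1, 0],  # point 5 split off into its own cluster
]
merged = consensus(base_clusterings)
```

Label values differ across the base clusterings, which is why the sketch compares co-membership of pairs rather than the labels themselves.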
Abstract: To achieve competitive advantage nowadays, most
industrial companies consider that success depends on
great product development, that is, on managing the product throughout
its entire lifetime, from design and manufacture to operation and
destruction. Achieving this goal requires a tight collaboration
between partners from a wide variety of domains, resulting in various
product data types and formats, as well as different software tools. So
far, the lack of a meaningful unified representation for product data
semantics has slowed down efficient product development. This
paper proposes an ontology based approach to enable such semantic
interoperability. A generic and extensible product ontology is
described, gathering the main concepts pertaining to the mechanical field
and the relations that hold among them. The ontology is not
exhaustive; nevertheless, it shows that such a unified representation
is possible and easily exploitable. This is illustrated through a case study
with an example product and some semantic requests to which the
ontology responds quite easily. The study proves the efficiency of
ontologies as a support to product data exchange and information
sharing, especially in product development environments where
collaboration is not just a choice but a mandatory prerequisite.
Abstract: The novelty proposed in this study is twofold: the development of a new color similarity metric based on the human visual system, and a new color indexing method based on a textual approach. Because the proposed metric is based on the color perception of the human visual system, the results returned by the indexing system can meet user expectations as closely as possible. We developed a web application to collect users' judgments about the similarities between colors; the results are used to estimate the metric proposed in this study. To index an image's colors, we used a text indexing engine, which facilitates the integration of visual features in a database of text documents. The textual signature is built by weighting the image's colors according to their occurrence in the image. The use of a textual indexing engine provides a simple, fast and robust solution for indexing images. A typical use of the proposed system is the development of applications whose data are both visual and textual. To evaluate the proposed method, we chose a price comparison engine as a case study, collecting a series of commercial offers, each containing a textual description and an image representing the offer.
Abstract: In many applications there is a broad variety of
information relevant to a focal “object” of interest, and the fusion of such heterogeneous data types is desirable for classification and
categorization. While these various data types can sometimes be treated as orthogonal (such as the hull number, superstructure color,
and speed of an oil tanker), there are instances where the inference and the correlation between quantities can provide improved fusion
capabilities (such as the height, weight, and gender of a person). A
service-oriented architecture has been designed and prototyped to
support the fusion of information for such “object-centric” situations.
It is modular, scalable, and flexible, and designed to support new data sources, fusion algorithms, and computational resources without affecting existing services. The architecture is designed to simplify
the incorporation of legacy systems, support exact and probabilistic entity disambiguation, recognize and utilize multiple types of
uncertainties, and minimize network bandwidth requirements.
Abstract: Software reuse can be considered as the most realistic
and promising way to improve software engineering productivity and
quality. Automated assistance for software reuse involves the
representation, classification, retrieval and adaptation of components.
The representation and retrieval of components are important to
software reuse in Component-Based Software Development
(CBSD). However, current industrial component models mainly focus
on implementation techniques and ignore the semantic information
about components, so it is difficult to retrieve the components that
satisfy the user's requirements. This paper presents a method of business
component retrieval based on specification matching to support
software reuse in enterprise information systems. First, a reuse-oriented
business component model is proposed. In our model, the
business data type is represented as sign data type based on XML,
which can express the variable business data type that can describe the
variety of business operations. Based on this model, we propose
specification match relationships in two levels: business operation
level and business component level. At the business operation level, we
use input business data types, output business data types and the
taxonomy of business operations to evaluate the similarity between
business operations. At the business component level, we propose five
specification matches between business components. To retrieve
reusable business components, we propose a measure of similarity
degrees to calculate the similarities between business components.
Finally, a SQL-like business component retrieval command is
proposed to help users retrieve approximate business components
from the component repository.
Abstract: This paper focuses on testing the database of an existing
information system. First, we describe the basic problems
of implemented databases, such as data redundancy, poor design of
database logical structure or inappropriate data types in columns of
database tables. These problems are often the result of incorrect
understanding of the primary requirements for a database of an
information system. Then we propose an algorithm to compare the
conceptual model created from vague requirements for a database
with a conceptual model reconstructed from the implemented database.
The algorithm also suggests steps leading to optimization of the
implemented database. The proposed algorithm is verified by an
implemented prototype. The paper also describes a fuzzy system
that works with the vague requirements for a database of an
information system, a procedure for creating a conceptual model from vague
requirements and an algorithm for reconstructing a conceptual model
from the implemented database.
Abstract: Multimedia distributed systems deal with heterogeneous
data, such as texts, images, graphics, video and audio. The specification
of temporal relations among different data types and distributed
sources is an open research area. This paper proposes a fully
distributed synchronization model to be used in multimedia systems.
One original aspect of the model is that it avoids the use of a common
reference (e.g. wall clock and shared memory). To achieve this, all
possible multimedia temporal relations are specified according to
their causal dependencies.
Abstract: This paper presents an algebraic approach to optimizing
queries in a domain-specific database management system
for protein structure data. The approach involves the introduction of
several protein structure specific algebraic operators to query the
complex data stored in an object-oriented database system. The
Protein Algebra provides an extensible set of high-level Genomic
Data Types and Protein Data Types along with a comprehensive
collection of appropriate genomic and protein functions. The paper
also presents a query translator that converts high-level query
specifications in algebra into low-level query specifications in
Protein-QL, a query language designed to query protein structure
data. The query transformation process uses a Protein Ontology that
serves the purpose of a dictionary.
Abstract: The scientific achievements coming from molecular
biology depend greatly on the capability of computational
applications to analyze laboratory results. A comprehensive
analysis of an experiment typically requires the simultaneous study
of the obtained dataset with data that is available in several distinct
public databases. Nevertheless, developing centralized access to
these distributed databases raises a set of challenges, such as: what
is the best integration strategy, how to resolve nomenclature clashes,
how to handle data that overlaps between databases and how to deal with huge
datasets. In this paper we present GeNS, a system that uses a simple yet innovative approach to address several biological data integration issues. Compared with existing systems, the main
advantages of GeNS are related to its maintenance simplicity and to its coverage and scalability, in terms of number of supported
databases and data types. To support our claims we present the current use of GeNS in two concrete applications. GeNS currently contains more than 140 million biological relations and it can be
publicly downloaded or remotely accessed through SOAP web services.
Abstract: Bio-chips are used for experiments on genes and
contain various information such as genes, samples and so on. The
two-dimensional bio-chips, in which one axis represents genes and the
other represents samples, are widely used these days. Instead of
experimenting with real genes, which costs a lot of money and takes much
time to yield results, bio-chips are used for biological
experiments. Extracting data from the bio-chips with high
accuracy and finding patterns or useful information in such
data is very important. Bio-chip analysis systems extract data from
various kinds of bio-chips and mine the data in order to get useful
information. One of the commonly used methods to mine the data is
classification. The algorithm used to classify the data can
vary depending on the data types, numerical characteristics and so
on. Considering that bio-chip data is extremely large, an algorithm that
imitates an ecosystem, such as the ant algorithm, is suitable
for classification. This paper focuses on finding
classification rules from bio-chip data using the Ant Colony
algorithm, which imitates an ecosystem. The developed system takes
into consideration the accuracy of the discovered rules when it applies them
to the bio-chip data in order to predict the classes.
Abstract: Data migration is currently a highly topical area. Current tools for data migration between relational databases have several disadvantages, which are presented in this paper. We propose a methodology for migrating database tables and their data between various types of relational database management systems (RDBMS). The proposed methodology contains an expert system. The expert system contains a knowledge base composed of IF-THEN rules and, based on the input data, suggests appropriate data types for the columns of database tables. The proposed tool, which contains the expert system, can also optimize the data types in the target RDBMS database tables based on the processed data of the source RDBMS database tables. The proposed expert system is demonstrated on the migration of a selected database from the source RDBMS to the target RDBMS.
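A rule base of this kind can be pictured as an ordered list of IF-THEN pairs evaluated against the values of a source column. The sketch below is a hypothetical illustration, not the paper's knowledge base: the rules, their order, the target type names and the `suggest_type` function are all assumptions.

```python
def is_int(value):
    """True for integers, excluding booleans (which Python treats as ints)."""
    return isinstance(value, int) and not isinstance(value, bool)


# Ordered (IF condition, THEN conclusion) pairs; the first matching rule wins.
RULES = [
    (lambda vs: all(isinstance(v, bool) for v in vs),
     lambda vs: "BOOLEAN"),
    (lambda vs: all(is_int(v) and -2**15 <= v < 2**15 for v in vs),
     lambda vs: "SMALLINT"),
    (lambda vs: all(is_int(v) and -2**31 <= v < 2**31 for v in vs),
     lambda vs: "INTEGER"),
    (lambda vs: all(is_int(v) for v in vs),
     lambda vs: "BIGINT"),
    (lambda vs: all(is_int(v) or isinstance(v, float) for v in vs),
     lambda vs: "DOUBLE PRECISION"),
    (lambda vs: all(isinstance(v, str) for v in vs),
     lambda vs: f"VARCHAR({max(len(v) for v in vs)})"),
]


def suggest_type(values):
    """Suggest a target column type from the source column's values."""
    non_null = [v for v in values if v is not None]
    if not non_null:
        return "VARCHAR(255)"  # assumed default when the column holds no data
    for condition, conclusion in RULES:
        if condition(non_null):
            return conclusion(non_null)
    return "TEXT"  # assumed fallback for mixed content
```

Choosing the narrowest type that fits the observed data corresponds to the optimization step mentioned in the abstract; a real tool would also need to account for values that may arrive after migration.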
Abstract: Schema matching plays a key role in many different
applications, such as schema integration, data integration, data
warehousing, data transformation, E-commerce, peer-to-peer data
management, ontology matching and integration, semantic Web,
semantic query processing, etc. Manual matching is expensive and
error-prone, so it is important to develop techniques to
automate the schema matching process. In this paper, we present a
solution to the automated XML schema matching problem that
produces semantic mappings between corresponding schema
elements of given source and target schemas. This solution
helps solve the automated XML schema matching problem
more comprehensively and efficiently. Our solution is based on
combining linguistic similarity, data type compatibility and structural
similarity of XML schema elements. After describing our solution,
we present experimental results that demonstrate the effectiveness of
this approach.
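One simple way to combine the three kinds of evidence named above is a weighted sum per element pair. The sketch below assumes that form; the weights, the type compatibility table, the use of `difflib` string similarity as the linguistic measure, and the Jaccard overlap of child names as the structural measure are all stand-ins chosen for illustration, since the abstract does not specify them.

```python
from difflib import SequenceMatcher

# Hypothetical compatibility scores for pairs of non-identical XSD types.
TYPE_COMPAT = {
    ("xs:int", "xs:integer"): 0.9,
    ("xs:decimal", "xs:double"): 0.8,
    ("xs:date", "xs:dateTime"): 0.7,
}


def type_compatibility(a, b):
    if a == b:
        return 1.0
    return TYPE_COMPAT.get((a, b), TYPE_COMPAT.get((b, a), 0.0))


def element_similarity(src, tgt, weights=(0.5, 0.2, 0.3)):
    """Weighted sum of linguistic, data type and structural similarity."""
    linguistic = SequenceMatcher(
        None, src["name"].lower(), tgt["name"].lower()
    ).ratio()
    data_type = type_compatibility(src["type"], tgt["type"])
    s_children, t_children = set(src["children"]), set(tgt["children"])
    union = s_children | t_children
    # Two leaf elements are treated as structurally identical (a convention).
    structural = len(s_children & t_children) / len(union) if union else 1.0
    w_ling, w_type, w_struct = weights
    return w_ling * linguistic + w_type * data_type + w_struct * structural


def match_schemas(source, target, threshold=0.5):
    """Map each source element to its best-scoring target element."""
    mapping = {}
    for src in source:
        best = max(target, key=lambda tgt: element_similarity(src, tgt))
        if element_similarity(src, best) >= threshold:
            mapping[src["name"]] = best["name"]
    return mapping


source_schema = [
    {"name": "CustomerName", "type": "xs:string", "children": []},
    {"name": "OrderDate", "type": "xs:date", "children": []},
]
target_schema = [
    {"name": "custName", "type": "xs:string", "children": []},
    {"name": "dateOfOrder", "type": "xs:date", "children": []},
]
mapping = match_schemas(source_schema, target_schema)
```

Here the data type term is what separates `OrderDate` from `custName`, which shows why combining the measures can be more reliable than name similarity alone.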
Abstract: In general, class complexity is measured based on one
of several factors, such as Lines of Code (LOC), Function Points
(FP), Number of Methods (NOM), Number of Attributes (NOA) and so on. Researchers continue to develop new techniques, methods and metrics based on
different factors for calculating the complexity of a class in Object Oriented (OO)
software. Earlier, Arockiam et al. proposed a new complexity measure, Extended Weighted Class Complexity (EWCC),
which is an extension of the Weighted Class Complexity proposed by Mishra et al. EWCC is the sum of the cognitive weights of the
attributes and methods of the class and those of its derived classes. In EWCC, the cognitive weight of each attribute is taken to be 1.
The main problem with the EWCC metric is that every attribute holds the
same value, whereas in general the cognitive load of understanding
different types of attributes cannot be the same. We therefore propose a new metric, Attribute Weighted Class Complexity
(AWCC). In AWCC, the cognitive weights assigned to the attributes are derived from the effort needed to understand
their data types. Case studies and experiments show the proposed metric to be a better
measure of the complexity of a class with attributes.
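The difference between the two metrics can be shown on a single class. In this sketch the per-type weights are invented for illustration (the abstract does not give the actual values), derived classes are left out for brevity, and the function names are hypothetical.

```python
# Hypothetical cognitive weights per attribute data type.
TYPE_WEIGHTS = {"int": 1, "float": 1, "str": 2, "list": 3, "object": 4}


def ewcc_single_class(attributes, method_weights):
    """EWCC-style value for one class: every attribute counts as weight 1."""
    return len(attributes) + sum(method_weights)


def awcc_single_class(attributes, method_weights, type_weights=TYPE_WEIGHTS):
    """AWCC-style value for one class: attribute weights depend on data type."""
    attribute_weight = sum(type_weights[dtype] for dtype in attributes.values())
    return attribute_weight + sum(method_weights)


# A class with three attributes and two methods of cognitive weight 2 and 3.
attributes = {"order_id": "int", "customer_name": "str", "items": "list"}
method_weights = [2, 3]

ewcc_value = ewcc_single_class(attributes, method_weights)
awcc_value = awcc_single_class(attributes, method_weights)
```

With these assumed weights the attribute contribution rises from 3 to 6, so two classes with the same attribute count but different attribute types no longer receive the same score.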
Abstract: In recent years, scanning probe atomic force
microscopy (SPM AFM) has gained acceptance over a wide spectrum
of research and science applications. Most work focuses on physical,
chemical and biological fields, while less attention is devoted to manufacturing
and machining aspects. The purpose of the current study is to assess
the possible implementation of the SPM AFM features and its
NanoScope software in general machining applications with special
attention to the tribological aspects of cutting tools. The surface
morphology of coated and uncoated as-received carbide inserts is
examined, analyzed, and characterized through the determination of
the appropriate scanning setting, the suitable data type imaging
techniques and the most representative data analysis parameters
using the MultiMode SPM AFM in contact mode. The NanoScope
operating software is used to capture real-time images of three data types:
“Height”, “Deflection” and “Friction”. Three scan sizes are
independently performed: 2, 6, and 12 μm with a 2.5 μm vertical
range (Z). Offline mode analysis includes the determination of three
functional topographical parameters: surface “Roughness”, power
spectral density “PSD” and “Section”. The 12 μm scan size in
association with “Height” imaging is found to be efficient at capturing every
tiny feature and tribological aspect of the examined surface. Also,
“Friction” analysis is found to produce a comprehensive explanation
of the lateral characteristics of the scanned surface. The configuration
of many surface defects and drawbacks has been precisely detected
and analyzed.