Abstract: In this paper, we propose an efficient hierarchical DNA
sequence search method to improve the search speed while the
accuracy is being kept constant. For a given query DNA sequence,
firstly, a fast local search method using histogram features is used as a
filtering mechanism before scanning the sequences in the database.
An overlapping processing is newly added to improve the robustness
of the algorithm. A large number of DNA sequences with low
similarity will be excluded for latter searching. The Smith-Waterman
algorithm is then applied to each remainder sequences. Experimental
results using GenBank sequence data show the proposed method
combining histogram information and Smith-Waterman algorithm is
more efficient for DNA sequence search.
Abstract: The paper proposes a unified model for multimedia data retrieval which includes data representatives, content representatives, index structure, and search algorithms. The multimedia data are defined as k-dimensional signals indexed in a multidimensional k-tree structure. The benefits of using the k-tree unified model were demonstrated by running the data retrieval application on a six networked nodes test bed cluster. The tests were performed with two retrieval algorithms, one that allows parallel searching using a single feature, the second that performs a weighted cascade search for multiple features querying. The experiments show a significant reduction of retrieval time while maintaining the quality of results.
Abstract: This paper aims at improving web server performance
by establishing a middleware layer between web and database
servers, which minimizes the overload on the database server. A
middleware system has been developed as a service mainly to
improve the performance. This system manages connection accesses
in a way that would result in reducing the overload on the database
server. In addition to the connection management, this system acts as
an object-oriented model for best utilization of operating system
resources. A web developer can use this Service Broker to improve
web server performance.
Abstract: Gabor-based face representation has achieved enormous success in face recognition. This paper addresses a novel algorithm for face recognition using neural networks trained by Gabor features. The system is commenced on convolving a face image with a series of Gabor filter coefficients at different scales and orientations. Two novel contributions of this paper are: scaling of rms contrast and introduction of fuzzily skewed filter. The neural network employed for face recognition is based on the multilayer perceptron (MLP) architecture with backpropagation algorithm and incorporates the convolution filter response of Gabor jet. The effectiveness of the algorithm has been justified over a face database with images captured at different illumination conditions.
Abstract: The database reverse engineering problems and
solving processes are getting mature, even though, the academic
community is facing the complex problem of knowledge transfer,
both in university and industrial contexts. This paper presents a new
CASE tool developed at the University of Jordan which addresses an
efficient support of this transfer, namely UJ-CASE-TOOL. It is a
small and self-contained application exhibiting representative
problems and appropriate solutions that can be understood in a
limited time. It presents an algorithm that describes the developed
academic CASE tool which has been used for several years both as
an illustration of the principles of database reverse engineering and
as an exercise aimed at academic and industrial students.
Abstract: Number of documents being created increases at an
increasing pace while most of them being in already known topics
and little of them introducing new concepts. This fact has started a
new era in information retrieval discipline where the requirements
have their own specialties. That is digging into topics and concepts
and finding out subtopics or relations between topics. Up to now IR
researches were interested in retrieving documents about a general
topic or clustering documents under generic subjects. However these
conventional approaches can-t go deep into content of documents
which makes it difficult for people to reach to right documents they
were searching. So we need new ways of mining document sets
where the critic point is to know much about the contents of the
documents. As a solution we are proposing to enhance LSI, one of
the proven IR techniques by supporting its vector space with n-gram
forms of words. Positive results we have obtained are shown in two
different application area of IR domain; querying a document
database, clustering documents in the document database.
Abstract: This paper presents data annotation models at
five levels of granularity (database, relation, column, tuple, and cell) of relational data to address the problem of unsuitability of most relational databases to express annotations. These models
do not require any structural and schematic changes to the
underlying database. These models are also flexible, extensible,
customizable, database-neutral, and platform-independent. This paper also presents an SQL-like query language, named Annotation Query Language (AnQL), to query annotation documents. AnQL is simple to understand and exploits the already-existent wide knowledge and skill set of SQL.
Abstract: The development of distributed systems has been affected by the need to accommodate an increasing degree of flexibility, adaptability, and autonomy. The Mobile Agent technology is emerging as an alternative to build a smart generation of highly distributed systems. In this work, we investigate the performance aspect of agent-based technologies for information retrieval. We present a comparative performance evaluation model of Mobile Agents versus Remote Method Invocation by means of an analytical approach. We demonstrate the effectiveness of mobile agents for dynamic code deployment and remote data processing by reducing total latency and at the same time producing minimum network traffic. We argue that exploiting agent-based technologies significantly enhances the performance of distributed systems in the domain of information retrieval.
Abstract: Address Matching is an important application of
Geographic Information System (GIS). Prior to Address Matching
working, obtaining X,Y coordinates is necessary, which process is
calling Address Geocoding. This study will illustrate the effective
address geocoding process of using household registry database, and
the check system for geocoded address.
Abstract: We present here the results for a comparative study of
some techniques, available in the literature, related to the relevance
feedback mechanism in the case of a short-term learning. Only one
method among those considered here is belonging to the data mining
field which is the K-nearest neighbors algorithm (KNN) while the
rest of the methods is related purely to the information retrieval field
and they fall under the purview of the following three major axes:
Shifting query, Feature Weighting and the optimization of the
parameters of similarity metric. As a contribution, and in addition to
the comparative purpose, we propose a new version of the KNN
algorithm referred to as an incremental KNN which is distinct from
the original version in the sense that besides the influence of the
seeds, the rate of the actual target image is influenced also by the
images already rated. The results presented here have been obtained
after experiments conducted on the Wang database for one iteration
and utilizing color moments on the RGB space. This compact
descriptor, Color Moments, is adequate for the efficiency purposes
needed in the case of interactive systems. The results obtained allow
us to claim that the proposed algorithm proves good results; it even
outperforms a wide range of techniques available in the literature.
Abstract: The rapid expansion of the web is causing the
constant growth of information, leading to several problems such as
increased difficulty of extracting potentially useful knowledge. Web
content mining confronts this problem gathering explicit information
from different web sites for its access and knowledge discovery.
Query interfaces of web databases share common building blocks.
After extracting information with parsing approach, we use a new
data mining algorithm to match a large number of schemas in
databases at a time. Using this algorithm increases the speed of
information matching. In addition, instead of simple 1:1 matching,
they do complex (m:n) matching between query interfaces. In this
paper we present a novel correlation mining algorithm that matches
correlated attributes with smaller cost. This algorithm uses Jaccard
measure to distinguish positive and negative correlated attributes.
After that, system matches the user query with different query
interfaces in special domain and finally chooses the nearest query
interface with user query to answer to it.
Abstract: A new automatic system for the recognition and re¬construction of resealed and/or rotated partially occluded objects is presented. The objects to be recognized are described by 2D views and each view is occluded by several half-planes. The whole object views and their visible parts (linear cuts) are then stored in a database. To establish if a region R of an input image represents an object possibly occluded, the system generates a set of linear cuts of R and compare them with the elements in the database. Each linear cut of R is associated to the most similar database linear cut. R is recognized as an instance of the object 0 if the majority of the linear cuts of R are associated to a linear cut of views of 0. In the case of recognition, the system reconstructs the occluded part of R and determines the scale factor and the orientation in the image plane of the recognized object view. The system has been tested on two different datasets of objects, showing good performance both in terms of recognition and reconstruction accuracy.
Abstract: The iris recognition technology is the most accurate,
fast and less invasive one compared to other biometric techniques
using for example fingerprints, face, retina, hand geometry, voice or
signature patterns. The system developed in this study has the
potential to play a key role in areas of high-risk security and can
enable organizations with means allowing only to the authorized
personnel a fast and secure way to gain access to such areas. The
paper aim is to perform the iris region detection and iris inner and
outer boundaries localization. The system was implemented on
windows platform using Visual C# programming language. It is easy
and efficient tool for image processing to get great performance
accuracy. In particular, the system includes two main parts. The first
is to preprocess the iris images by using Canny edge detection
methods, segments the iris region from the rest of the image and
determine the location of the iris boundaries by applying Hough
transform. The proposed system tested on 756 iris images from 60
eyes of CASIA iris database images.
Abstract: The main objective of this article is to present the semi-active vibration control using an electro-rheological fluid embedded sandwich structure for a cantilever beam. ER fluid is a smart material, which cause the suspended particles polarize and connect each other to form chain. The stiffness and damping coefficients of the ER fluid can be changed in 10 micro seconds; therefore, ERF is suitable to become the material embedded in the tunable vibration absorber to become a smart absorber. For the ERF smart material embedded structure, the fuzzy control law depends on the experimental expert database and the proposed self-tuning strategy. The electric field is controlled by a CRIO embedded system to implement the real application. This study investigates the different performances using the Type-1 fuzzy and interval Type-2 fuzzy controllers. The Interval type-2 fuzzy control is used to improve the modeling uncertainties for this ERF embedded shock absorber. The self-tuning vibration controllers using Type-1 and Interval Type-2 fuzzy law are implemented to the shock absorber system. Based on the resulting performance, Internal Type-2 fuzzy is better than the traditional Type-1 fuzzy control for this vibration control system.
Abstract: The inherent flexibilities of XML in both structure
and semantics makes mining from XML data a complex task with
more challenges compared to traditional association rule mining in
relational databases. In this paper, we propose a new model for the
effective extraction of generalized association rules form a XML
document collection. We directly use frequent subtree mining
techniques in the discovery process and do not ignore the tree
structure of data in the final rules. The frequent subtrees based on the
user provided support are split to complement subtrees to form the
rules. We explain our model within multi-steps from data preparation
to rule generation.
Abstract: This paper presents an automatic feature recognition
method based on center-surround difference detecting and fuzzy logic
that can be applied in ground-penetrating radar (GPR) image
processing. Adopted center-surround difference method, the salient
local image regions are extracted from the GPR images as features of
detected objects. And fuzzy logic strategy is used to match the
detected features and features in template database. This way, the
problem of objects detecting, which is the key problem in GPR image
processing, can be converted into two steps, feature extracting and
matching. The contributions of these skills make the system have the
ability to deal with changes in scale, antenna and noises. The results of
experiments also prove that the system has higher ratio of features
sensing in using GPR to image the subsurface structures.
Abstract: This paper proposes an auto-classification algorithm
of Web pages using Data mining techniques. We consider the
problem of discovering association rules between terms in a set of
Web pages belonging to a category in a search engine database, and
present an auto-classification algorithm for solving this problem that
are fundamentally based on Apriori algorithm. The proposed
technique has two phases. The first phase is a training phase where
human experts determines the categories of different Web pages, and
the supervised Data mining algorithm will combine these categories
with appropriate weighted index terms according to the highest
supported rules among the most frequent words. The second phase is
the categorization phase where a web crawler will crawl through the
World Wide Web to build a database categorized according to the
result of the data mining approach. This database contains URLs and
their categories.
Abstract: Apparel product development is an important stage in the life cycle of a product. Shortening this stage will help to reduce the costs of a garment. The aim of this study is to examine the production parameters in knitwear apparel companies by defining the unit costs, and developing a software to calculate the unit costs of garments and make the cost estimates. In this study, with the help of a questionnaire, different companies- systems of unit cost estimating and cost calculating were tried to be analyzed. Within the scope of the questionnaire, the importance of cost estimating process for apparel companies and the expectations from a new cost estimating program were investigated. According to the results of the questionnaire, it was seen that the majority of companies which participated to the questionnaire use manual cost calculating methods or simple Microsoft Excel spreadsheets to make cost estimates. Furthermore, it was discovered that many companies meet with difficulties in archiving the cost data for future use and as a solution to that problem, it is thought that prior to making a cost estimate, sub units of garment costs which are fabric, accessory and the labor costs should be analyzed and added to the database of the programme beforehand. Another specification of the cost estimating unit prepared in this study is that the programme was designed to consist of two main units, one of which makes the product specification and the other makes the cost calculation. The programme is prepared as a web-based application in order that the supplier, the manufacturer and the customer can have the opportunity to communicate through the same platform.
Abstract: This paper sets forth the possibility and importance about applying Data Mining in Web logs mining and shows some problems in the conventional searching engines. Then it offers an improved algorithm based on the original AprioriAll algorithm which has been used in Web logs mining widely. The new algorithm adds the property of the User ID during the every step of producing the candidate set and every step of scanning the database by which to decide whether an item in the candidate set should be put into the large set which will be used to produce next candidate set. At the meantime, in order to reduce the number of the database scanning, the new algorithm, by using the property of the Apriori algorithm, limits the size of the candidate set in time whenever it is produced. Test results show the improved algorithm has a more lower complexity of time and space, better restrain noise and fit the capacity of memory.
Abstract: The latest Geographic Information System (GIS)
technology makes it possible to administer the spatial components of
daily “business object," in the corporate database, and apply suitable
geographic analysis efficiently in a desktop-focused application. We
can use wireless internet technology for transfer process in spatial
data from server to client or vice versa. However, the problem in
wireless Internet is system bottlenecks that can make the process of
transferring data not efficient. The reason is large amount of spatial
data. Optimization in the process of transferring and retrieving data,
however, is an essential issue that must be considered. Appropriate
decision to choose between R-tree and Quadtree spatial data indexing
method can optimize the process. With the rapid proliferation of
these databases in the past decade, extensive research has been
conducted on the design of efficient data structures to enable fast
spatial searching. Commercial database vendors like Oracle have also
started implementing these spatial indexing to cater to the large and
diverse GIS. This paper focuses on the decisions to choose R-tree
and quadtree spatial indexing using Oracle spatial database in mobile
GIS application. From our research condition, the result of using
Quadtree and R-tree spatial data indexing method in one single
spatial database can save the time until 42.5%.