Abstract: This paper presents an algebraic approach to optimizing
queries in a domain-specific database management system
for protein structure data. The approach involves the introduction of
several protein-structure-specific algebraic operators to query the
complex data stored in an object-oriented database system. The
Protein Algebra provides an extensible set of high-level Genomic
Data Types and Protein Data Types along with a comprehensive
collection of appropriate genomic and protein functions. The paper
also presents a query translator that converts high-level query
specifications in algebra into low-level query specifications in
Protein-QL, a query language designed to query protein structure
data. The query transformation process uses a Protein Ontology that
serves the purpose of a dictionary.
Abstract: In the area of Human Resource Management, the trend is towards online exchange of information about human resources. For example, online applications for employment have become standard and job offerings are posted in many job portals. However, there are too many job portals for someone interested in a new job to monitor all of them. We developed a prototype for integrating information from different job portals into one meta-search engine. First, existing job portals were investigated and XML schema documents were derived automatically from these portals. Second, translation rules for transforming each schema to a central HR-XML-conformant schema were determined. The HR-XML schema is used to build a form for searching jobs. The data supplied by a user in this form is then translated into queries for the different job portals. Each result obtained from a job portal is sent to the meta-search engine, which ranks all received job offers according to the user's preferences.
Abstract: Distributed Computing Systems are usually considered the most suitable model for practical solutions of many parallel algorithms. In this paper an enhanced distributed system is presented to improve the time complexity of Binary Indexed Trees (BIT). The proposed system uses multi-uniform processors with identical architectures and a specially designed distributed memory system. The analysis of this system shows that it reduces the time complexity of the read query to O(Log(Log(N))) and that of the update query to constant complexity, whereas the naive solution has a time complexity of O(Log(N)) for both queries. The system was implemented and simulated using the VHDL and Verilog Hardware Description Languages, with Xilinx ISE 10.1 as the development environment and ModelSim 6.1c as the simulation tool. The simulation shows that the overhead resulting from the wiring and communication between the system fragments can reasonably be neglected, which makes it practical to approach the maximum speedup offered by the proposed model.
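For reference, the O(Log(N)) baseline that the abstract above compares against can be sketched as a conventional single-processor Binary Indexed Tree. This is only the textbook structure, not the proposed distributed design; the class name and the sample values are illustrative assumptions.

```python
class BinaryIndexedTree:
    """Textbook single-processor Binary Indexed (Fenwick) tree."""

    def __init__(self, n):
        self.n = n
        self.tree = [0] * (n + 1)  # 1-based indexing; tree[0] is unused

    def update(self, i, delta):
        """Add delta to element i (1-based) in O(log n) steps."""
        while i <= self.n:
            self.tree[i] += delta
            i += i & -i  # move to the next node whose range covers index i

    def prefix_sum(self, i):
        """Return the sum of elements 1..i in O(log n) steps."""
        total = 0
        while i > 0:
            total += self.tree[i]
            i -= i & -i  # strip the lowest set bit
        return total


# Illustrative data only (not from the paper).
bit = BinaryIndexedTree(8)
for index, value in enumerate([5, 3, 7, 9, 6, 4, 1, 2], start=1):
    bit.update(index, value)

first_four = bit.prefix_sum(4)   # read query over elements 1..4
everything = bit.prefix_sum(8)   # read query over all elements
```

Both loops visit at most one node per bit of the index, which is where the logarithmic cost of the read and update queries comes from.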
Abstract: This paper introduces and studies new indexing techniques for content-based queries in image databases. Indexing is the key to providing sophisticated, accurate and fast searches for queries in image data. This research describes a new indexing approach, which depends on linear modeling of signals, using bases for modeling. A basis is a set of chosen images, and modeling an image is a least-squares approximation of the image as a linear combination of the basis images. The coefficients of the basis images are taken together to serve as the index for that image. The paper describes the implementation of the indexing scheme, and presents the findings of an extensive evaluation that was conducted to optimize (1) the choice of the basis matrix (B), and (2) the size of the index A (N). Furthermore, we compare the performance of our indexing scheme with other schemes. Our results show that our scheme has significantly higher performance.
Abstract: Generally, administrative systems in an academic
environment are disjoint and support independent queries. The
objective in this work is to semantically connect these independent
systems to provide support to queries run on the integrated platform.
The proposed framework, by enriching educational material in the
legacy systems, provides a value-added semantics layer where
activities such as annotation, query and reasoning can be carried out
to support management requirements. We discuss the development of
this ontology framework with a case study of UAE University
program administration to show how semantic web technologies can
be used by administration to develop student profiles for better
academic program management.
Abstract: Today's business environment requires that companies have access to highly relevant information in a matter of seconds.
Modern Business Intelligence tools rely on data structured mostly in traditional dimensional database schemas, typically represented by
star schemas. Dimensional modeling is already recognized as a
leading industry standard in the field of data warehousing, although
several drawbacks and pitfalls have been reported. This paper focuses on
the analysis of another data warehouse modeling technique, anchor
modeling, and compares its characteristics with those of the standard
dimensional modeling technique from a query performance perspective.
The results of the analysis characterize the performance of queries
executed on database schemas structured according to the principles of
each database modeling technique.
Abstract: Access control is a critical security service in Wireless
Sensor Networks (WSNs). To prevent malicious nodes from joining
the sensor network, access control is required. On the one hand, a WSN
must be able to authorize users and grant them the right to access the
network. On the other hand, a WSN must organize data collected by
sensors in such a way that an unauthorized entity (the adversary)
cannot make arbitrary queries. This restricts the network access only
to eligible users and sensor nodes, while queries from outsiders will
not be answered or forwarded by nodes. In this paper we present
different access control schemes so as to find out their objectives,
provision, communication complexity, limits, etc. Using the node
density parameter, we also provide a comparison of these proposed
access control algorithms based on the network topology which can
be flat or hierarchical.
Abstract: This study proposes a conceptual model and
empirically tests the relationships between customers and librarians
(i.e. tangibles, responsiveness, assurance, reliability and empathy)
with a dependent variable (customer satisfaction) regarding library
services. The SERVQUAL instrument was administered to 100
respondents, comprising staff and students at a public higher
learning institution in the Federal Territory of Labuan, Malaysia.
They were public university library users. Results revealed that all
service quality dimensions tested were significant and influenced
customer satisfaction of visitors to a public university library.
Assurance is the most important factor that influences customer
satisfaction with the services rendered by the librarian. It is
imperative for the library management to take note that the top five
service attributes that gained the greatest attention from the library
visitors' perspective are employee willingness to help customers,
online availability of customer representatives to respond to
queries, active and prompt service from library staff, clear signs in
the building, and friendly and courteous library staff.
This study provides valuable results concerning the determinants of
the service quality and customer satisfaction of public university
library services from the users' perspective.
Abstract: In this paper, we present a system for content-based
retrieval from a large database of classified satellite images, based on
the user's relevance feedback (RF). In our proposed system, we
divide each satellite image scene into small subimages, which are stored
in the database. A modified radial basis function neural network
plays an important role in clustering the subimages of the database
according to the Euclidean distance between the query feature vector and
the feature vectors of the other subimages. The advantage of using the RF
technique in such queries is demonstrated by analyzing the database
retrieval results.
Abstract: In this research, we propose to use the discrete cosine
transform to approximate the cumulative distributions of data cube
cells' values. The cosine transform is known to have a good energy
compaction property and thus can approximate data distribution
functions easily with a small number of coefficients. The derived
estimator is accurate and easy to update. We perform experiments to
compare its performance with a well-known technique - the (Haar)
wavelet. The experimental results show that the cosine transform
performs much better than the wavelet in estimation accuracy, speed,
space efficiency, and ease of update.
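The energy compaction idea in the abstract above can be illustrated with a small sketch: a cumulative distribution is transformed with an orthonormal DCT-II, only the first few coefficients are kept, and the distribution is rebuilt from them. The cell counts and all function names are assumptions made for illustration; the paper's actual estimator is not described in the abstract.

```python
import math


def _scale(k, n):
    """Normalization factor that makes the DCT-II basis orthonormal."""
    return math.sqrt((1.0 if k == 0 else 2.0) / n)


def dct(values):
    """Orthonormal DCT-II of a sequence of numbers."""
    n = len(values)
    return [
        _scale(k, n) * sum(v * math.cos(math.pi * (i + 0.5) * k / n)
                           for i, v in enumerate(values))
        for k in range(n)
    ]


def idct(coeffs, n):
    """Inverse transform; coefficients beyond len(coeffs) count as zero."""
    return [
        sum(_scale(k, n) * c * math.cos(math.pi * (i + 0.5) * k / n)
            for k, c in enumerate(coeffs))
        for i in range(n)
    ]


def squared_error(values, kept):
    """Squared reconstruction error when only `kept` coefficients are stored."""
    approx = idct(dct(values)[:kept], len(values))
    return sum((a - v) ** 2 for a, v in zip(approx, values))


# Hypothetical per-cell counts along one dimension of a data cube.
counts = [4, 6, 9, 12, 14, 13, 10, 7, 5, 3, 2, 1, 1, 0, 0, 1]
cumulative = [sum(counts[: i + 1]) for i in range(len(counts))]

for kept in (2, 4, 8, 16):
    print(kept, squared_error(cumulative, kept))
```

Because the transform is linear, adding a value to one cell changes each stored coefficient by a fixed multiple of that value, which is one plausible reading of why such an estimator is easy to update.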
Abstract: We demonstrate through a sample application, Ebanking,
that the Web Service Modelling Language Ontology component
can be used as a very powerful object-oriented database design
language with logic capabilities. Its conceptual syntax allows the
definition of class hierarchies, and its logic syntax allows the definition
of constraints in the database. Relations, which are available for
modelling relations of three or more concepts, can be connected to
logical expressions, allowing the implicit specification of database
content. Using a reasoning tool, logic queries can also be made
against the database in simulation mode.
Abstract: It is a challenge to provide a wide range of queries in
database query systems for small mobile devices, such as PDAs
and cell phones. Currently, due to the physical and resource
limitations of these devices, most reported database querying systems
developed for them offer only a small set of predetermined
queries that users can pose. This limitation can be addressed by
allowing free-form queries to be entered on the devices. Hence, a
query language that does not restrict the combination of query terms
entered by users is proposed. This paper presents the free-form query
language and the method used in translating free-form queries to
their equivalent SQL statements.
Abstract: A data warehouse is a repository of information
integrated from source data. Information stored in a data warehouse
takes the form of materialized views in order to provide better
performance when answering queries. Deciding which views are
appropriate to materialize is an important problem. To meet this
requirement, constructing a search space that is close to optimal is a
necessary task, since it yields effective results when selecting views to
materialize. In this paper we propose an approach to reoptimize the
Multiple View Processing Plan (MVPP) by using global
common subexpressions. Merged queries whose query
processing cost is not close to optimal are rewritten. The
experiments show that our approach helps to improve the total
query processing cost of the MVPP, and that the sum of query processing
cost and materialized view maintenance cost is also reduced after views
are selected for materialization.
Abstract: With the rapid growth in business size, today's businesses orient towards electronic technologies. Amazon.com and e-bay.com are some of the major stakeholders in this regard. Unfortunately, the enormous size and hugely unstructured nature of data on the web, even for a single commodity, have become a cause of ambiguity for consumers. Extracting valuable information from such ever-increasing data is an extremely tedious task and is fast becoming critical to the success of businesses. Web content mining can play a major role in solving these issues. It involves using efficient algorithmic techniques to search and retrieve the desired information from seemingly unsearchable unstructured data on the Internet. Application of web content mining can be very encouraging in the areas of Customer Relations Modeling, billing records, logistics investigations, product cataloguing and quality management. In this paper we present a review of some very interesting, efficient yet implementable techniques from the field of web content mining and study their impact in the area specific to business user needs, focusing on both the customer and the producer. The techniques reviewed include mining by developing a knowledge-base repository of the domain, iterative refinement of user queries for personalized search, using a graph-based approach for the development of a web crawler, and filtering information for personalized search using website captions. These techniques have been analyzed and compared on the basis of their execution time and the relevance of the results they produce for a particular search.
Abstract: The explosive growth of the World Wide Web has posed
a challenging problem in extracting relevant data. Traditional web
crawlers focus only on the surface web while the deep web keeps
expanding behind the scenes. Deep web pages are created
dynamically as a result of queries posed to specific web databases.
The structure of the deep web pages makes it impossible for
traditional web crawlers to access deep web contents. This paper
presents Deep iCrawl, a novel vision-based approach for extracting
data from the deep web. Deep iCrawl splits the process into two
phases. The first phase includes Query analysis and Query translation
and the second covers vision-based extraction of data from the
dynamically created deep web pages. There are several established
approaches for the extraction of deep web pages but the proposed
method aims at overcoming the inherent limitations of the former.
This paper also aims at comparing the data items and presenting them
in the required order.
Abstract: A data warehouse (DW) is a system that supports decision-making through querying. Queries to a DW are demanding in terms of their complexity and length. They often access millions of tuples, and involve joins between relations and aggregations. Materialized views are able to provide better performance for DW queries. However, these views have a maintenance cost, so materialization of all views is not possible. An important challenge in a DW environment is materialized view selection, because we have to manage the trade-off between performance and view maintenance cost. Therefore, in this paper, we introduce a new approach aimed at solving this challenge based on Two-Phase Optimization (2PO), which is a combination of Simulated Annealing (SA) and Iterative Improvement (II), with the use of the Multiple View Processing Plan (MVPP). Our experiments show that our method provides a further improvement in terms of query processing cost and view maintenance cost.
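The general shape of Two-Phase Optimization, as usually described (Iterative Improvement to find a good local minimum, then Simulated Annealing started from it at a low temperature), can be sketched on a toy view-selection problem. Everything concrete here is an assumption for illustration: the queries, views, costs, parameters and function names are hypothetical, and no MVPP is built.

```python
import math
import random

# Hypothetical toy instance: each query is answered from base tables or from
# any materialized view that lists it, at the given cost.
BASE_COST = {"q1": 100, "q2": 80, "q3": 60}
VIEW_COST = {  # view -> {query: cost when answered from that view}
    "v1": {"q1": 20, "q2": 30},
    "v2": {"q2": 10, "q3": 25},
    "v3": {"q1": 15},
    "v4": {"q3": 5},
}
MAINTENANCE = {"v1": 40, "v2": 35, "v3": 50, "v4": 30}
VIEWS = sorted(VIEW_COST)


def total_cost(selected):
    """Query processing cost plus maintenance cost of the selected views."""
    cost = sum(MAINTENANCE[v] for v in selected)
    for query, base in BASE_COST.items():
        options = [VIEW_COST[v][query] for v in selected if query in VIEW_COST[v]]
        cost += min(options + [base])
    return cost


def random_neighbor(selected, rng):
    """Flip the materialization decision of one randomly chosen view."""
    return selected ^ {rng.choice(VIEWS)}


def iterative_improvement(rng, starts=5, tries=20):
    """Phase 1: random-restart descent; keep the best local minimum found."""
    best = frozenset()
    for _ in range(starts):
        state = frozenset(v for v in VIEWS if rng.random() < 0.5)
        failures = 0
        while failures < tries:
            candidate = random_neighbor(state, rng)
            if total_cost(candidate) < total_cost(state):
                state, failures = candidate, 0
            else:
                failures += 1
        if total_cost(state) < total_cost(best):
            best = state
    return best


def simulated_annealing(start, rng, temperature=5.0, cooling=0.9, steps=20):
    """Phase 2: anneal from the phase-1 result at a low initial temperature."""
    state = best = start
    while temperature > 0.1:
        for _ in range(steps):
            candidate = random_neighbor(state, rng)
            delta = total_cost(candidate) - total_cost(state)
            if delta <= 0 or rng.random() < math.exp(-delta / temperature):
                state = candidate
                if total_cost(state) < total_cost(best):
                    best = state
        temperature *= cooling
    return best


def two_phase_optimization(seed=0):
    rng = random.Random(seed)
    return simulated_annealing(iterative_improvement(rng), rng)


chosen = two_phase_optimization()
print(sorted(chosen), total_cost(chosen))
```

The cost function makes the trade-off explicit: materializing a view lowers the cost of the queries it serves but adds its maintenance cost, so selecting every view is not automatically best.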
Abstract: Computer languages are usually lumped together
into broad "paradigms", leaving us in want of a finer classification
of kinds of language. Theories distinguishing between "genuine
differences" in language have been called for, and we propose that
such differences can be observed through a notion of expressive mode.
We outline this concept, propose how it could be operationalized and
indicate a possible context for the development of a corresponding
theory. Finally we consider a possible application in connection
with evaluation of language revision. We illustrate this with a case,
investigating possible revisions of the relational algebra in order to
overcome weaknesses of the division operator in connection with
universal queries.
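As background for the case mentioned above, the standard division operator of relational algebra can be sketched over Python sets. The relation contents and names are hypothetical. The empty-divisor behaviour shown at the end is one commonly discussed subtlety of universal queries; the abstract does not say which weaknesses the paper addresses.

```python
def divide(r, s):
    """Relational division r(a, b) / s(b): each a paired in r with every b in s."""
    candidates = {a for a, _ in r}
    return {a for a in candidates if all((a, b) in r for b in s)}


# Hypothetical data: which supplier supplies which part.
supplies = {("s1", "p1"), ("s1", "p2"), ("s2", "p1"), ("s3", "p2")}
required = {"p1", "p2"}

# "Which suppliers supply all required parts?"
suppliers_of_all = divide(supplies, required)

# With an empty divisor the universal condition holds vacuously, so every
# supplier that appears in the dividend is returned.
vacuous_result = divide(supplies, set())
```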
Abstract: The majority of today's IR systems base the IR task on two main processes: indexing and searching. There exists a special group of dynamic IR systems where both processes (indexing and searching) happen simultaneously; such a system discards obsolete information while simultaneously handling the insertion of new information and still answering user queries. In these dynamic, time-critical text document databases, it is often important to modify index structures quickly as documents arrive. This paper presents a method for dynamization which may be used for this task. Experimental results show that the dynamization process is possible and that it guarantees the response time for the query operation and index updates.
Abstract: Databases have become ubiquitous. Almost all IT applications store information in and retrieve information from databases. Retrieving information from a database requires knowledge of technical languages such as Structured Query Language (SQL). However, the majority of users who interact with databases do not have a technical background and are intimidated by the idea of using languages such as SQL. This has led to the development of a few Natural Language Database Interfaces (NLDBIs). An NLDBI allows the user to query the database in a natural language. This paper describes the architecture of a new NLDBI system and its implementation, and discusses the results obtained. In most typical NLDBI systems, the natural language statement is converted into an internal representation based on the syntactic and semantic knowledge of the natural language. This representation is then converted into queries using a representation converter. A natural language query is translated to an equivalent SQL query after processing through various stages. The system has been tested on primitive database queries with certain constraints.
Abstract: In this paper we present the semantic assistant agent
(SAA), an open source digital library agent which takes a user query
for finding information in the digital library, and takes resources'
metadata and stores it semantically. SAA uses the Semantic Web to
improve browsing and searching for resources in the digital library. All
metadata stored in the library are available in RDF format for
querying and processing by SemanSreach which is a part of SAA
architecture. The architecture includes a generic RDF-based model
that represents relationships among objects and their components.
Queries against these relationships are supported by an RDF triple
store.
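What querying relationships in a triple store amounts to can be sketched with a minimal in-memory store and wildcard pattern matching. This is not SAA's implementation; the class, its methods and the sample identifiers are assumptions for illustration, and a real deployment would use an RDF library and a query language such as SPARQL.

```python
class TripleStore:
    """Minimal in-memory store of (subject, predicate, object) triples."""

    def __init__(self):
        self.triples = set()

    def add(self, subject, predicate, obj):
        self.triples.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return matching triples in sorted order; None acts as a wildcard."""
        pattern = (subject, predicate, obj)
        return sorted(
            t for t in self.triples
            if all(p is None or p == v for p, v in zip(pattern, t))
        )


# Hypothetical library metadata using Dublin Core style predicate names.
store = TripleStore()
store.add("doc:1", "dc:title", "Report A")
store.add("doc:1", "dc:creator", "person:ann")
store.add("doc:2", "dc:creator", "person:ann")
store.add("doc:1", "dcterms:hasPart", "doc:1#ch1")

by_ann = store.match(predicate="dc:creator", obj="person:ann")
about_doc1 = store.match(subject="doc:1")
```

Here `by_ann` lists every resource whose creator is `person:ann`, and `about_doc1` collects all metadata and component relationships recorded for one object.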