Abstract: A distributed database is a collection of logically related databases that cooperate in a transparent manner. Query processing in such a system uses a communication network to transmit data between sites, and it remains one of the challenges in the database world. The development of sophisticated query optimization technology is a major reason for the commercial success of database systems, but its complexity and cost increase with the number of relations in the query. Mariposa, query trading, and query trading with processing task-trading strategies were developed for autonomous distributed database systems, but they incur a high optimization cost because all nodes are involved in generating an optimal plan. In this paper, we propose a modification of the autonomous strategy K-QTPT that gives the seller nodes with the lowest cost gradually higher priorities in order to reduce the optimization time. We implement the proposed strategy and present the results together with an analysis based on them.
Abstract: In-memory database systems are becoming popular
due to the availability and affordability of sufficiently large RAM and
processors in modern high-end servers with the capacity to manage
large in-memory database transactions. While fast and reliable in-memory
systems are still being developed to overcome cache misses,
CPU/IO bottlenecks and distributed transaction costs, disk-based data
stores still serve as the primary persistence. In addition, with the
recent growth in multi-tenancy cloud applications and associated
security concerns, many organisations consider the trade-offs and
continue to require fast and reliable transaction processing of disk-based
database systems as an available choice. For these
organizations, the only way of increasing throughput is by improving
the performance of disk-based concurrency control. This warrants a
hybrid database system with the ability to selectively apply an
enhanced disk-based data management within the context of in-memory
systems that would help improve overall throughput.
The general view is that in-memory systems substantially
outperform disk-based systems. We question this assumption and
examine how a modified variation of access invariance that we call
enhanced memory access (EMA) can be used to allow very high
levels of concurrency in the pre-fetching of data in disk-based
systems. We demonstrate how this prefetching in disk-based systems
can yield close to in-memory performance, which paves the way for
improved hybrid database systems. This paper proposes a novel EMA
technique and presents a comparative study between disk-based EMA
systems and in-memory systems running on hardware configurations
of equivalent power in terms of the number of processors and their
speeds. The results of the experiments conducted clearly substantiate
that when used in conjunction with all concurrency control
mechanisms, EMA can increase the throughput of disk-based systems
to levels quite close to those achieved by in-memory systems. The
promising results of this work show that enhanced disk-based
systems help improve hybrid data management within the
broader context of in-memory systems.
Abstract: An algorithm is a well-defined procedure that takes
some input values, processes them, and produces the
desired output. Sorting forms the basis of many other algorithms, such as
searching, pattern matching, and digital filters, and further applications
have been found in database systems, data statistics and processing,
and data communications. This paper introduces
the “Enhanced Bidirectional Selection” sort algorithm, which is
bidirectional and stable. It is called bidirectional because it selects two
values, the smallest from the front and the largest from the rear, and assigns
them to their appropriate locations, thus roughly halving the number of
passes compared with selection
sort.
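The two-ended selection idea described above can be illustrated with a minimal Python sketch. This is not the authors' "Enhanced Bidirectional Selection" algorithm: the function name `bidirectional_selection_sort` and the swap-based placement are assumptions made for illustration. In particular, plain swapping does not preserve the relative order of equal elements, so the sketch does not reproduce the stability property claimed in the abstract.

```python
def bidirectional_selection_sort(values):
    """Sort a copy of `values` by placing the minimum and maximum in each pass.

    Hypothetical illustration only; returns (sorted_list, number_of_passes).
    """
    a = list(values)
    left, right = 0, len(a) - 1
    passes = 0
    while left < right:
        min_idx = max_idx = left
        # One scan over the unsorted window finds both extremes.
        for i in range(left, right + 1):
            if a[i] < a[min_idx]:
                min_idx = i
            if a[i] >= a[max_idx]:
                max_idx = i
        # Place the smallest value at the front of the window.
        a[left], a[min_idx] = a[min_idx], a[left]
        # If the largest value was at `left`, the swap above moved it to min_idx.
        if max_idx == left:
            max_idx = min_idx
        # Place the largest value at the rear of the window.
        a[right], a[max_idx] = a[max_idx], a[right]
        left += 1
        right -= 1
        passes += 1
    return a, passes


result, passes = bidirectional_selection_sort([5, 2, 9, 1, 5, 6])
print(result, passes)
```

For an input of n elements the loop runs n // 2 times, whereas a one-ended selection sort needs n - 1 passes.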
Abstract: Running transactions under SNAPSHOT isolation
achieves a good level of concurrency, especially in databases
with read-intensive workloads. However, SNAPSHOT is not
immune to all the problems that arise from competing transactions,
and therefore serializability is not guaranteed. We propose in this
paper a technique to obtain data consistency with SNAPSHOT by using
some special triggers that we named DAEMON TRIGGERS. Besides
keeping the benefits of the SNAPSHOT isolation, the technique is
especially useful for those database systems that do not have an
isolation level that ensures serializability, like Firebird and Oracle. We
describe all the anomalies that might arise when using the SNAPSHOT
isolation and show how to preclude them with DAEMON TRIGGERS.
Based on the methodology presented here, we also propose the
creation of a new isolation level: DAEMON SNAPSHOT.
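The abstract does not list the anomalies it covers. One well-known anomaly under snapshot isolation is write skew, and the following Python sketch simulates it as an illustration. The `SnapshotStore` class, the `go_off_call` function, and the on-call example are hypothetical and are not taken from the paper. The store models only the first-committer-wins rule on write-write conflicts, and it does not model the DAEMON TRIGGERS technique.

```python
class SnapshotStore:
    """Toy key-value store with snapshot reads and first-committer-wins commits."""

    def __init__(self, data):
        self.data = dict(data)
        self.version = {key: 0 for key in data}

    def begin(self):
        # Each transaction reads from a private copy taken at start time.
        return {"snapshot": dict(self.data), "seen": dict(self.version), "writes": {}}

    def commit(self, txn):
        # Abort only if another transaction already wrote one of the same keys.
        for key in txn["writes"]:
            if self.version[key] != txn["seen"][key]:
                return False
        for key, value in txn["writes"].items():
            self.data[key] = value
            self.version[key] += 1
        return True


def go_off_call(txn, doctor):
    # Application rule: at least one doctor must stay on call.
    on_call = [name for name, flag in txn["snapshot"].items() if flag]
    if len(on_call) >= 2:
        txn["writes"][doctor] = False


store = SnapshotStore({"alice": True, "bob": True})
t1, t2 = store.begin(), store.begin()
go_off_call(t1, "alice")   # sees both on call in its snapshot
go_off_call(t2, "bob")     # sees both on call in its snapshot
ok1, ok2 = store.commit(t1), store.commit(t2)
doctors_on_call = sum(store.data.values())
print(ok1, ok2, doctors_on_call)
```

Both transactions commit because their write sets are disjoint, yet the combined result violates the rule that each transaction checked against its own snapshot. No serial execution of the two transactions could produce this state.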
Abstract: Early networked systems assume that connections are reliable and treat the loss of a connection as a faulty operation. Transient connections, however, are typical of mobile devices. The application areas of data sharing systems such as these lead to the conclusion that network connections may not always be reliable and that the conventional approaches can be improved. The Nigerian commercial banking industry is a critical system whose operation is becoming increasingly dependent on information technology (IT) driven information systems. The proposed solution to this problem makes use of a hierarchically clustered network structure, which we selected to reflect (as much as possible) the typical organizational structure of Nigerian commercial banks. Representative transactions, such as data updates and replication of the results of such updates, were used to simulate the proposed model and show its applicability.
Abstract: Due to new distributed database applications such as
huge deductive database systems, search complexity is constantly
increasing, and we need better algorithms to speed up traditional
relational database queries. An optimal dynamic programming
method for such high-dimensional queries has the major disadvantage of
exponential complexity, and thus we are interested in semi-optimal but
faster approaches. In this work we present a multi-agent based
mechanism to meet this demand and compare its results with those of
some commonly used query optimization algorithms.
Abstract: In-place sorting algorithms play an important role in many fields such as very large database systems, data warehouses, and data mining. Such algorithms maximize the size of data that can be processed in main memory without input/output operations. In this paper, a novel in-place sorting algorithm is presented. The algorithm comprises two phases. The first phase rearranges the unsorted input array in place, producing segments that are ordered relative to each other but whose elements are yet to be sorted. The first phase requires linear time, while in the second phase the elements of each segment are sorted in place in O(z log z) time, where z is the size of the segment, using O(1) auxiliary storage. In the worst case, for an array of size n, the algorithm performs O(n log z) element comparisons and O(n log z) element moves. Further, no auxiliary arithmetic operations with indices are required. Besides these theoretical achievements, the algorithm is of practical interest because of its simplicity. Experimental results also show that it outperforms other in-place sorting algorithms. Finally, an analysis of time and space complexity and of the required number of moves is presented, along with the auxiliary storage requirements of the proposed algorithm.
Abstract: The volume of XML data exchange is growing explosively, and the need for efficient mechanisms of XML data management is vital. Many XML storage models have been proposed for storing DTD-independent XML documents in relational database systems. Benchmarking is the best way to highlight the pros and cons of different approaches. In this study, we use a common benchmarking scheme, known as XMark, to compare the most cited and the newly proposed DTD-independent methods in terms of logical reads, physical I/O, CPU time, and duration. We show the effect of the Label Path, of extracting values and storing them in a separate table, and of the type of join needed to answer queries under each method.
Abstract: Main Memory Database systems (MMDB) store their
data in main physical memory and provide very high-speed access.
Conventional database systems are optimized for the particular
characteristics of disk storage mechanisms. Memory resident
systems, on the other hand, use different optimizations to structure
and organize data, as well as to make it reliable.
This paper provides a brief overview of MMDBs and of one
memory-resident system, FastDB, and compares the
processing time of this system with that of a typical disk-resident database,
based on the results of implementing a TPC benchmark
environment on both.
Abstract: In a wide-area environment such as a Grid, data
placement is an important aspect of distributed database systems. In
this paper, we address the problem of the initial placement of
non-replicated database fragments in a Grid architecture. We propose a
graph-based approach that considers resource restrictions. The goal is to
optimize the use of computing, storage and communication
resources. The proposed approach is developed in two phases: in the
first phase, we group fragments using knowledge about
fragment dependencies and, in the second phase, we determine an
efficient placement of the fragment groups on the Grid. We also
show, via experimental analysis, that our approach gives solutions
that are close to being optimal for different databases and Grid
configurations.
Abstract: The speculative locking (SL) protocol extends the two-phase locking (2PL) protocol to allow for parallelism among conflicting transactions. The adaptive speculative locking (ASL) protocol provided further enhancements and outperformed SL protocols under most conditions. Neither of these protocols considers the impact of network latency on the performance of distributed database systems. We have studied the performance of the ASL protocol taking into account the communication overhead. The results indicate that, although system load can counter network latency, latency can still become a bottleneck in many situations. The impact of latency on performance depends on many factors, including the system resources. A flexible discrete event simulator was used as the testbed for this study.
Abstract: Data migration is currently a highly topical area. Current tools for data migration between relational databases have several disadvantages, which are presented in this paper. We propose a methodology for migrating database tables and their data between various types of relational database management systems (RDBMSs). The proposed methodology contains an expert system. The expert system contains a knowledge base that is composed of IF-THEN rules and, based on the input data, suggests appropriate data types for the columns of database tables. The proposed tool, which contains the expert system, can also optimize the data types in the target RDBMS database tables based on the processed data of the source RDBMS database tables. The proposed expert system is demonstrated on the migration of a selected database from the source RDBMS to the target RDBMS.
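A knowledge base of IF-THEN rules that suggests column data types from the data itself can be sketched as an ordered rule list in Python. Everything in this sketch is an assumption made for illustration: the rules, the thresholds, the target type names, and the function `suggest_type` are hypothetical and do not come from the paper's expert system.

```python
import re

INT_RE = re.compile(r"[+-]?[0-9]+")
DEC_RE = re.compile(r"[+-]?[0-9]+\.[0-9]+")


def all_ints_within(values, bits):
    """True if every value is an integer literal that fits a signed `bits`-bit type."""
    limit = 2 ** (bits - 1)
    return all(INT_RE.fullmatch(v) and -limit <= int(v) < limit for v in values)


def all_numeric(values):
    return all(INT_RE.fullmatch(v) or DEC_RE.fullmatch(v) for v in values)


# Each rule is (IF condition, THEN conclusion); the first matching rule wins.
RULES = [
    (lambda vs: all_ints_within(vs, 16), lambda vs: "SMALLINT"),
    (lambda vs: all_ints_within(vs, 32), lambda vs: "INTEGER"),
    (lambda vs: all_ints_within(vs, 64), lambda vs: "BIGINT"),
    (all_numeric, lambda vs: "DECIMAL"),
    (lambda vs: True, lambda vs: f"VARCHAR({max(len(v) for v in vs)})"),
]


def suggest_type(values):
    """Suggest a target column type from the source column's values (as strings)."""
    if not values:
        raise ValueError("cannot suggest a type for an empty column")
    for condition, conclusion in RULES:
        if condition(values):
            return conclusion(values)


print(suggest_type(["1", "200", "-5"]))
print(suggest_type(["70000", "12"]))
print(suggest_type(["1.5", "2"]))
print(suggest_type(["abc", "de"]))
```

Ordering the rules from the narrowest type to the widest means the first match is also the most compact type that fits the observed data, which is one simple way to express the type optimization the abstract mentions.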
Abstract: Mining sequential patterns from large customer transaction databases has been recognized as a key research topic in database systems. However, previous work has focused mainly on mining sequential patterns at a single concept level. In this study, we introduce concept hierarchies into this problem and present several algorithms for discovering multiple-level sequential patterns based on the hierarchies. An experiment was conducted to assess the performance of the proposed algorithms, measured by the relative time spent on completing the mining tasks on two different datasets. The experimental results show that the performance depends on the characteristics of the datasets and on the pre-defined minimum support threshold for each level of the concept hierarchy. Based on the experimental results, some suggestions are also given on how to select an appropriate algorithm for a given dataset.
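The effect of a concept hierarchy on pattern support can be shown with a small Python sketch. It is not one of the paper's algorithms. The hierarchy, the customer sequences, and the helper names (`generalize`, `contains`, `support`) are hypothetical, and each transaction is simplified to a single item, whereas sequential pattern mining normally treats each element as an itemset.

```python
# Hypothetical two-level concept hierarchy: item -> category.
HIERARCHY = {"cola": "drink", "juice": "drink", "chips": "snack", "nuts": "snack"}


def generalize(sequence, hierarchy):
    """Replace each item by its parent concept (items without a parent stay as-is)."""
    return [hierarchy.get(item, item) for item in sequence]


def contains(sequence, pattern):
    """True if `pattern` occurs in `sequence` as an order-preserving subsequence."""
    remaining = iter(sequence)
    return all(item in remaining for item in pattern)


def support(sequences, pattern):
    """Fraction of customer sequences that contain the pattern."""
    return sum(contains(s, pattern) for s in sequences) / len(sequences)


customers = [
    ["cola", "chips"],
    ["juice", "nuts"],
    ["chips", "cola"],
    ["juice", "chips"],
]

item_level = support(customers, ["cola", "chips"])
category_level = support(
    [generalize(s, HIERARCHY) for s in customers], ["drink", "snack"]
)
print(item_level, category_level)
```

A pattern that is infrequent at the item level can become frequent once items are generalized, which is why a separate minimum support threshold per hierarchy level, as the abstract describes, is a natural parameter.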