Enhanced Disk-Based Databases Towards Improved Hybrid In-Memory Systems

In-memory database systems are becoming popular due to the availability and affordability of sufficiently large RAM and processors in modern high-end servers with the capacity to manage large in-memory database transactions. While fast and reliable inmemory systems are still being developed to overcome cache misses, CPU/IO bottlenecks and distributed transaction costs, disk-based data stores still serve as the primary persistence. In addition, with the recent growth in multi-tenancy cloud applications and associated security concerns, many organisations consider the trade-offs and continue to require fast and reliable transaction processing of diskbased database systems as an available choice. For these organizations, the only way of increasing throughput is by improving the performance of disk-based concurrency control. This warrants a hybrid database system with the ability to selectively apply an enhanced disk-based data management within the context of inmemory systems that would help improve overall throughput. The general view is that in-memory systems substantially outperform disk-based systems. We question this assumption and examine how a modified variation of access invariance that we call enhanced memory access, (EMA) can be used to allow very high levels of concurrency in the pre-fetching of data in disk-based systems. We demonstrate how this prefetching in disk-based systems can yield close to in-memory performance, which paves the way for improved hybrid database systems. This paper proposes a novel EMA technique and presents a comparative study between disk-based EMA systems and in-memory systems running on hardware configurations of equivalent power in terms of the number of processors and their speeds. The results of the experiments conducted clearly substantiate that when used in conjunction with all concurrency control mechanisms, EMA can increase the throughput of disk-based systems to levels quite close to those achieved by in-memory system. The promising results of this work show that enhanced disk-based systems facilitate in improving hybrid data management within the broader context of in-memory systems.

Effect of Network Communication Overhead on the Performance of Adaptive Speculative Locking Protocol

The speculative locking (SL) protocol extends the twophase locking (2PL) protocol to allow for parallelism among conflicting transactions. The adaptive speculative locking (ASL) protocol provided further enhancements and outperformed SL protocols under most conditions. Neither of these protocols consider the impact of network latency on the performance of the distributed database systems. We have studied the performance of ASL protocol taking into account the communication overhead. The results indicate that though system load can counter network latency, it can still become a bottleneck in many situations. The impact of latency on performance depends on many factors including the system resources. A flexible discrete event simulator was used as the testbed for this study.

A New Approach for Recoverable Timestamp Ordering Schedule

A new approach for timestamp ordering problem in serializable schedules is presented. Since the number of users using databases is increasing rapidly, the accuracy and needing high throughput are main topics in database area. Strict 2PL does not allow all possible serializable schedules and so does not result high throughput. The main advantages of the approach are the ability to enforce the execution of transaction to be recoverable and the high achievable performance of concurrent execution in central databases. Comparing to Strict 2PL, the general structure of the algorithm is simple, free deadlock, and allows executing all possible serializable schedules which results high throughput. Various examples which include different orders of database operations are discussed.

An Exploratory Environment for Concurrency Control Algorithms

Designing, implementing, and debugging concurrency control algorithms in a real system is a complex, tedious, and errorprone process. Further, understanding concurrency control algorithms and distributed computations is itself a difficult task. Visualization can help with both of these problems. Thus, we have developed an exploratory environment in which people can prototype and test various versions of concurrency control algorithms, study and debug distributed computations, and view performance statistics of distributed systems. In this paper, we describe the exploratory environment and show how it can be used to explore concurrency control algorithms for the interactive steering of distributed computations.