Database Placement on Large-Scale Systems

Large-scale systems such as Grids offer infrastructures for both data distribution and parallel processing. Although the use of Grid infrastructures is a relatively recent development, it is already affecting the distributed database management system (DBMS) industry. In distributed DBMSs, distributed query processing has emerged as a fundamental technique for achieving high performance. Database placement is particularly important in large-scale systems because a good placement reduces communication costs and improves resource usage. In this paper, we propose a dynamic database placement policy that depends on query patterns and on the capabilities of Grid sites. We evaluate the performance of the proposed policy using simulations. The results show that dynamic database placement can significantly improve the performance of distributed query processing.



