A Graph-Based Approach for Placement of No-Replicated Databases in Grid

In a wide-area environment such as a Grid, data placement is an important aspect of distributed database systems. In this paper, we address the problem of the initial placement of non-replicated database fragments in a Grid architecture. We propose a graph-based approach that takes resource restrictions into account. The goal is to optimize the use of computing, storage, and communication resources. The proposed approach proceeds in two phases: in the first phase, we group fragments using knowledge about the dependencies between them; in the second phase, we determine an efficient placement of the fragment groups on the Grid. We also show, via experimental analysis, that our approach gives solutions that are close to optimal for different databases and Grid configurations.
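
The two-phase structure described above can be illustrated with a minimal sketch. It is not the algorithm evaluated in this paper: the grouping rule (merge fragments whose dependency weight reaches a threshold), the greedy placement rule, and all names (`group_fragments`, `place_groups`, `threshold`, `capacity`) are assumptions made only for illustration, and only the storage restriction is modelled.

```python
from collections import defaultdict


def group_fragments(sizes, deps, threshold):
    """Phase 1 (illustrative assumption): merge fragments connected by a
    dependency edge of weight >= threshold, using union-find."""
    parent = {f: f for f in sizes}

    def find(f):
        while parent[f] != f:
            parent[f] = parent[parent[f]]  # path halving
            f = parent[f]
        return f

    for (a, b), weight in deps.items():
        if weight >= threshold:
            parent[find(a)] = find(b)

    groups = defaultdict(list)
    for f in sizes:
        groups[find(f)].append(f)
    return [sorted(g) for g in groups.values()]


def place_groups(groups, sizes, deps, capacity):
    """Phase 2 (illustrative assumption): greedy placement, largest group
    first. A group goes to the node that has enough free storage and
    already hosts the most dependency weight towards the group; ties go
    to the node with the most free storage."""
    free = dict(capacity)   # node -> remaining storage
    node_of = {}            # fragment -> node

    def group_size(g):
        return sum(sizes[f] for f in g)

    for g in sorted(groups, key=group_size, reverse=True):
        need = group_size(g)
        candidates = [n for n in free if free[n] >= need]
        if not candidates:
            raise ValueError(f"no node can store group {g}")

        def affinity(node):
            # Dependency weight between this group and fragments on `node`.
            total = 0
            for (a, b), weight in deps.items():
                if a in g and node_of.get(b) == node:
                    total += weight
                elif b in g and node_of.get(a) == node:
                    total += weight
            return total

        best = max(candidates, key=lambda n: (affinity(n), free[n]))
        free[best] -= need
        for f in g:
            node_of[f] = best
    return node_of


# Hypothetical input: fragment sizes, dependency weights, node storage.
sizes = {"F1": 4, "F2": 3, "F3": 2, "F4": 1}
deps = {("F1", "F2"): 10, ("F2", "F3"): 1, ("F3", "F4"): 8}
capacity = {"n1": 7, "n2": 5}

groups = group_fragments(sizes, deps, threshold=5)
placement = place_groups(groups, sizes, deps, capacity)
```

In this toy input, strongly dependent fragments end up in the same group and each group is stored whole on a single node, so the only inter-node communication is along the weak dependency between F2 and F3.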



