Parallel and Distributed Mining of Association Rule on Knowledge Grid

In a virtual organization, a Knowledge Discovery (KD) service comprises distributed data resources and computing grid nodes. A computational grid is integrated with a data grid to form a Knowledge Grid, which implements the Apriori algorithm for mining association rules on a grid network. This paper describes the development of a parallel and distributed version of the Apriori algorithm on the Globus Toolkit using the Message Passing Interface extended with Grid Services (MPICH-G2). The Knowledge Grid is built on top of the data and computational grids to support decision making in real-time applications. The case study describes the design and implementation of local and global mining of frequent item sets. Experiments were conducted on different configurations of the grid network, and the computation time was recorded for each operation. Analysis of the results across these configurations shows that the speedup in computation time is almost superlinear.
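
The split between local and global mining of frequent item sets can be illustrated with a minimal single-process sketch. It is not the MPICH-G2 implementation described above: each grid node is simulated as a list of transactions, the summation loop stands in for the message-passing step, and the count-summing scheme itself is an assumption about how local results might be combined. The function names (`local_counts`, `next_candidates`, `global_frequent_itemsets`) and the sample data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def local_counts(partition, candidates):
    """Local mining step: count each candidate item set in one node's data."""
    counts = Counter()
    for transaction in partition:
        for candidate in candidates:
            if candidate <= transaction:  # candidate is a subset of the transaction
                counts[candidate] += 1
    return counts


def next_candidates(frequent, k):
    """Apriori join and prune: size-k candidates whose (k-1)-subsets are all frequent."""
    items = sorted({item for itemset in frequent for item in itemset})
    return [
        frozenset(combo)
        for combo in combinations(items, k)
        if all(frozenset(sub) in frequent for sub in combinations(combo, k - 1))
    ]


def global_frequent_itemsets(partitions, min_support):
    """Global mining step: sum local counts, keep item sets meeting min_support."""
    partitions = [[frozenset(t) for t in p] for p in partitions]
    items = sorted({item for p in partitions for t in p for item in t})
    candidates = [frozenset([item]) for item in items]
    result = {}
    k = 1
    while candidates:
        total = Counter()
        # Assumption: each loop iteration stands in for one grid node; in a
        # message-passing version the counts would be combined by a reduction.
        for partition in partitions:
            total.update(local_counts(partition, candidates))
        frequent = {c: n for c, n in total.items() if n >= min_support}
        result.update(frequent)
        k += 1
        candidates = next_candidates(set(frequent), k)
    return result


# Hypothetical data: two simulated nodes, absolute support threshold of 2.
node_a = [{"bread", "milk"}, {"bread", "butter"}]
node_b = [{"bread", "milk", "butter"}, {"milk"}]
result = global_frequent_itemsets([node_a, node_b], min_support=2)
```

In this sketch every node counts the same candidate set, so only the counts, not the transactions, would need to be exchanged between nodes at each pass.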



