Post Mining: Discovering Valid Rules from Different Sized Data Sources

A large organization may have multiple branches spread across different locations. Processing the data from these branches centrally becomes a huge task when each branch records a very large number of transactions. Branches may also be reluctant to forward their raw data for centralized processing, yet be willing to share the association rules they have mined locally. Local mining, however, can itself generate a large number of rules, and in practice the local data sources are rarely all the same size. A model is proposed for discovering valid rules from different sized data sources, where the valid rules are the high weighted rules. These rules are obtained from the high frequency rules generated at each data source. A data source selection procedure is applied so that the rules can be synthesized efficiently. Support Equalization is a second proposed method; it eliminates low frequency rules at the local sites themselves, which reduces the number of rules significantly.
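The general idea of weighting rules by source size can be illustrated with a minimal Python sketch. It is not the model proposed here, whose exact formulas are not given in this abstract. The sketch assumes that each source's weight is its share of the total transaction count, that a rule's synthesized support is the weighted sum of its local supports, and that a rule must be reported by a minimum number of sources. The names `prune_local_rules`, `synthesize`, `min_support`, and `min_votes`, and all sample figures, are hypothetical.

```python
from collections import defaultdict


def prune_local_rules(rules, min_support):
    """Drop rules whose local support is below min_support.

    Stands in for filtering low frequency rules at the local site
    before anything is forwarded (an assumed, simplified filter).
    """
    return {rule: supp for rule, supp in rules.items() if supp >= min_support}


def synthesize(sources, min_support, min_votes):
    """Combine local rules into globally weighted rules.

    sources: list of (num_transactions, {rule: local_support}) pairs.
    Assumption: a source's weight is its share of all transactions.
    """
    total = sum(size for size, _ in sources)
    votes = defaultdict(int)       # how many sources report each rule
    weighted = defaultdict(float)  # size-weighted sum of local supports
    for size, rules in sources:
        weight = size / total
        for rule, supp in prune_local_rules(rules, min_support).items():
            votes[rule] += 1
            weighted[rule] += weight * supp
    # Keep only rules reported by at least min_votes sources.
    return {r: s for r, s in weighted.items() if votes[r] >= min_votes}


# Hypothetical data: three branches of different sizes.
sources = [
    (1000, {"A->B": 0.40, "B->C": 0.05}),
    (4000, {"A->B": 0.30, "C->D": 0.20}),
    (5000, {"A->B": 0.50, "C->D": 0.10, "B->C": 0.30}),
]
result = synthesize(sources, min_support=0.1, min_votes=2)
print({rule: round(supp, 2) for rule, supp in result.items()})
```

With this sample data, "A->B" and "C->D" survive with weighted supports of roughly 0.41 and 0.13. "B->C" is dropped because it passes the local filter at only one branch.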




