SUPAR: System for User-Centric Profiling of Association Rules in Streaming Data

With the surge of stream processing applications, novel techniques are required for generating and analyzing association rules in streams. Traditional rule mining solutions cannot handle streams because they generally require multiple passes over the data and do not guarantee results within a small, predictable time. Although researchers have proposed algorithms for generating rules from streams, the analysis of those rules has received little attention. We propose association rule profiling, a user-centric process for analyzing association rules and attaching suitable profiles to them according to how their frequency has changed over a previous snapshot of time in a data stream. Association rule profiles provide insight into the changing nature of associations and can be used to characterize them. We discuss the importance of characteristics such as the predictability of linkages present in the data and propose a metric to quantify it. We also show how association rule profiles can aid in generating user-specific rules that are more understandable and actionable. The framework is implemented as SUPAR: System for User-centric Profiling of Association Rules in streaming data. The proposed system offers the following capabilities: i) continuous monitoring of the frequency of streaming item-sets and detection of significant changes therein for association rule profiling; ii) computation of metrics for quantifying the predictability of associations present in the data; iii) user-centric control of the characterization process, in which the user controls the framework through a) constraint specification and b) elimination of non-interesting rules.
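To make the idea of frequency monitoring and profiling concrete, the following is a minimal sketch and not the SUPAR implementation. It assumes the stream is cut into fixed-size, non-overlapping windows, that the frequency of an item-set is its support within a window, and that a change is "significant" when the support differs from the previous window by more than a fixed threshold. The function names, the window size, the threshold, and the profile labels are all hypothetical choices made for illustration.

```python
def support(window, itemset):
    """Fraction of transactions in the window that contain every item of itemset."""
    if not window:
        return 0.0
    hits = sum(1 for transaction in window if itemset <= transaction)
    return hits / len(window)


def profile_itemset(stream, itemset, window_size=4, threshold=0.2):
    """Return a list of (support, label), one entry per full window.

    The stream is read once, transaction by transaction. The labels
    ("initial", "rising", "falling", "stable") are illustrative only.
    """
    itemset = frozenset(itemset)
    window = []
    previous = None
    profile = []
    for transaction in stream:
        window.append(frozenset(transaction))
        if len(window) == window_size:
            current = support(window, itemset)
            if previous is None:
                label = "initial"
            elif current - previous > threshold:
                label = "rising"
            elif previous - current > threshold:
                label = "falling"
            else:
                label = "stable"
            profile.append((current, label))
            previous = current
            window = []  # start the next non-overlapping window
    return profile


# Hypothetical stream of 12 transactions, i.e. three full windows plus one more.
example_stream = [
    {"a", "b"}, {"a"}, {"b"}, {"c"},
    {"a", "b"}, {"a", "b"}, {"a", "b", "c"}, {"a"},
    {"a", "b"}, {"a", "b"}, {"a", "b"}, {"c"},
    {"a"}, {"b"}, {"c"}, {"c"},
]
result = profile_itemset(example_stream, {"a", "b"})
for window_support, label in result:
    print(f"{window_support:.2f} {label}")
```

A user-specified constraint could be layered on top of such a sketch by filtering which item-sets are profiled at all; how SUPAR itself defines windows, significance, and profiles is not stated here.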



