Development of Subjective Measures of Interestingness: From Unexpectedness to Shocking

Knowledge Discovery in Databases (KDD) is the process of extracting previously unknown but useful and significant information from large volumes of data. Data mining is the stage of the KDD process that applies an algorithm to extract interesting patterns. Such algorithms usually generate a huge number of patterns, which must then be evaluated with interestingness measures so that the results reflect the user's requirements. Interestingness measures fall into two groups: (i) objective measures and (ii) subjective measures. Objective measures, such as support and confidence, identify meaningful patterns based on the structure of the patterns themselves, while subjective measures, such as unexpectedness and novelty, reflect the user's perspective. In this report, we briefly review the most widely used and successful subjective measures and propose a new subjective measure of interestingness: shocking.
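
As a concrete illustration of the objective measures named above, the following minimal sketch computes support and confidence for an association rule using their standard definitions. The function names and the toy transactions are hypothetical and serve only as an example; they are not taken from the report.

```python
from typing import FrozenSet, List

Transaction = FrozenSet[str]


def support(itemset: FrozenSet[str], transactions: List[Transaction]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(
    antecedent: FrozenSet[str],
    consequent: FrozenSet[str],
    transactions: List[Transaction],
) -> float:
    """support(antecedent and consequent together) / support(antecedent)."""
    base = support(antecedent, transactions)
    if base == 0.0:
        # The rule never fires, so its confidence is reported as 0 here.
        return 0.0
    return support(antecedent | consequent, transactions) / base


# Hypothetical toy data, for illustration only.
TOY_TRANSACTIONS: List[Transaction] = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

if __name__ == "__main__":
    rule_lhs = frozenset({"bread"})
    rule_rhs = frozenset({"milk"})
    print("support:", support(rule_lhs | rule_rhs, TOY_TRANSACTIONS))
    print("confidence:", confidence(rule_lhs, rule_rhs, TOY_TRANSACTIONS))
```

Both values depend only on the data, which is why they are called objective. A subjective measure would additionally need some representation of what the user already believes or expects, which this sketch does not attempt to model.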



