Novelty as a Measure of Interestingness in Knowledge Discovery

Rule discovery is an important technique for mining knowledge from large databases. Using objective measures to discover interesting rules leads to another data mining problem, although one of reduced complexity. Data mining researchers have studied subjective measures of interestingness to reduce the volume of discovered rules and, ultimately, to improve the overall efficiency of the KDD process. In this paper, we study the novelty of discovered rules as a subjective measure of interestingness. We propose a hybrid approach, based on both objective and subjective measures, that quantifies the novelty of discovered rules in terms of their deviations from the known rules (knowledge). We analyze the types of deviation that can arise between two rules and categorize the discovered rules according to a user-specified threshold. We implement the proposed framework and experiment with several public datasets. The experimental results are promising.
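
To make the idea of threshold-based categorization concrete, the following is a minimal illustrative sketch, not the measure defined in the paper. It assumes that a rule is a pair of condition sets (antecedent and consequent), that the deviation between two rules is the average Jaccard distance of those two parts, and that a rule's novelty is its deviation from the closest known rule. The names `Rule`, `deviation`, `novelty`, and `categorize`, and the labels "novel" and "known", are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """A rule 'antecedent -> consequent'; each side is a set of conditions."""
    antecedent: frozenset
    consequent: frozenset


def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b|, or 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def deviation(r1, r2):
    """Assumed deviation: mean of antecedent and consequent distances, in [0, 1]."""
    return 0.5 * (jaccard_distance(r1.antecedent, r2.antecedent)
                  + jaccard_distance(r1.consequent, r2.consequent))


def novelty(rule, known_rules):
    """Deviation from the closest known rule; 1.0 when nothing is known yet."""
    if not known_rules:
        return 1.0
    return min(deviation(rule, k) for k in known_rules)


def categorize(rule, known_rules, threshold):
    """Label a rule 'novel' if its novelty reaches the user-specified threshold."""
    return "novel" if novelty(rule, known_rules) >= threshold else "known"


# Hypothetical example data: one known rule and two discovered rules.
known = [Rule(frozenset({"age=young", "income=low"}), frozenset({"buys=no"}))]
same_rule = Rule(frozenset({"age=young", "income=low"}), frozenset({"buys=no"}))
other_rule = Rule(frozenset({"age=young", "student=yes"}), frozenset({"buys=yes"}))

labels = [categorize(r, known, threshold=0.5) for r in (same_rule, other_rule)]
```

Under these assumptions, a discovered rule identical to a known rule has deviation 0 and is labeled "known", while a rule that shares only part of its antecedent and has a different consequent exceeds the threshold of 0.5 and is labeled "novel". Any other set-based or attribute-wise distance could be substituted for the Jaccard distance without changing the overall structure.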



