Discovery of Quantified Hierarchical Production Rules from Large Set of Discovered Rules

Automated rule discovery is, owing to its wide applicability, one of the most fundamental and important methods in knowledge discovery in databases (KDD), and it has been an active research area in recent years. A hierarchical representation makes it easier to manage the complexity of knowledge, to view knowledge at different levels of detail, and to focus attention on only the interesting aspects. One efficient and easy-to-understand system of this kind is the Hierarchical Production Rules (HPRs) system. An HPR is a standard production rule augmented with generality and specificity information, and has the following form: Decision If <condition> Generality <general information> Specificity <specific information>. HPR systems can handle the taxonomical structures inherent in knowledge about the real world. This paper addresses the mining of quantified rules with a crisp hierarchical structure using a Genetic Programming (GP) approach to knowledge discovery. The post-processing scheme presented here takes quantified production rules as the initial individuals of the GP and discovers the hierarchical structure among them. In the proposed approach, rules are quantified using Dempster-Shafer theory. Suitable genetic operators are proposed for the suggested encoding, and an appropriate fitness function based on the Subsumption Matrix (SM) is suggested. Finally, quantified HPRs are generated from the discovered hierarchy, again using Dempster-Shafer theory. Experimental results are presented to demonstrate the performance of the proposed algorithm.
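
As background for the quantification step, the following is a minimal sketch of Dempster's rule of combination, the standard way of merging two bodies of evidence in Dempster-Shafer theory. It is not the paper's quantification procedure, which the abstract does not spell out. The function names (combine, belief, plausibility) and the two-element frame {"a", "b"} are illustrative assumptions.

```python
from itertools import product


def combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function is a dict mapping a frozenset of hypotheses
    to its mass; the masses of each function are assumed to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, x), (b, y) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            # Agreeing evidence supports the intersection of the two sets.
            combined[common] = combined.get(common, 0.0) + x * y
        else:
            # Disjoint focal sets contribute to the conflict mass K.
            conflict += x * y
    if conflict >= 1.0:
        raise ValueError("totally conflicting evidence cannot be combined")
    # Normalise by 1 - K so the combined masses sum to 1 again.
    return {s: v / (1.0 - conflict) for s, v in combined.items()}


def belief(m, hypothesis):
    """Total mass of all focal sets contained in the hypothesis."""
    return sum(v for s, v in m.items() if s <= hypothesis)


def plausibility(m, hypothesis):
    """Total mass of all focal sets that intersect the hypothesis."""
    return sum(v for s, v in m.items() if s & hypothesis)


# Hypothetical evidence over the frame {"a", "b"}.
A, B, AB = frozenset({"a"}), frozenset({"b"}), frozenset({"a", "b"})
m1 = {A: 0.6, AB: 0.4}
m2 = {B: 0.5, AB: 0.5}
m12 = combine(m1, m2)
```

Here the conflict mass is 0.6 * 0.5 = 0.3, so each surviving product is divided by 0.7. The belief and plausibility of a hypothesis then bound the support that the combined evidence gives it.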




