Evolutionary Approach for Automated Discovery of Censored Production Rules

In the recent past, there has been an increasing interest in applying evolutionary methods to Knowledge Discovery in Databases (KDD) and a number of successful applications of Genetic Algorithms (GA) and Genetic Programming (GP) to KDD have been demonstrated. The most predominant representation of the discovered knowledge is the standard Production Rules (PRs) in the form If P Then D. The PRs, however, are unable to handle exceptions and do not exhibit variable precision. The Censored Production Rules (CPRs), an extension of PRs, were proposed by Michalski & Winston that exhibit variable precision and supports an efficient mechanism for handling exceptions. A CPR is an augmented production rule of the form: If P Then D Unless C, where C (Censor) is an exception to the rule. Such rules are employed in situations, in which the conditional statement 'If P Then D' holds frequently and the assertion C holds rarely. By using a rule of this type we are free to ignore the exception conditions, when the resources needed to establish its presence are tight or there is simply no information available as to whether it holds or not. Thus, the 'If P Then D' part of the CPR expresses important information, while the Unless C part acts only as a switch and changes the polarity of D to ~D. This paper presents a classification algorithm based on evolutionary approach that discovers comprehensible rules with exceptions in the form of CPRs. The proposed approach has flexible chromosome encoding, where each chromosome corresponds to a CPR. Appropriate genetic operators are suggested and a fitness function is proposed that incorporates the basic constraints on CPRs. Experimental results are presented to demonstrate the performance of the proposed algorithm.




References:
[1] K.K. Bharadwaj and N.K Jain, "Hierarchical censored production rules
(HCPRs) system," Data & Knowledge Engineering, North Holland, vol. 8,
pp. 19-34, 1992.
[2] K.K. Bharadwaj, N.M. Hewahi and M.A. Brando, "Adaptive
hierarchical censored production rule-based system: A genetic algorithm
approach," Advances in Artificial Intelligence, SBIA -96, Lecture Notes
in Artificial Intelligence, No. 1159, Berlin, Germany, Springer-Verlag,
pp. 81-90, 1996.
[3] Basheer M. Al-Maqaleh and Kamal K. Bharadwaj "Genetic
programming approach to hierarchical production discovery," in Proc.
5th International Conference on Databases and Data Mining
(DBDM2005), vol. 6, Istanbul, Turkey, June 2005, pp. 271-274.
[4] M. Brameier and W. Banzhaf, "A comparison of linear genetic
programming and neural networks in medical data mining," IEEE
Transaction on Evolutionary Computations 5(1), pp. 17-26, 2001.
[5] I. De Falco, A. Della Cioppa and E. Tarantino, "Discovering interesting
classification rules with genetic programming," Applied Soft Computing,
1, pp. 257-269, 2002.
[6] K. A. De Jong, W.M. Spears and D. F. Gordon, "Using genetic
algorithms for concept learning," Machine Learning, vol. 13, pp. 161-
188, 1993.
[7] W. Romao, A.A. Freitas and I.M. de S. Gimenes, "Discovering
interesting knowledge from a science and technology database with a
genetic algorithm," Applied Soft Computing, 4, pp. 121-137, 2004.
[8] U. M. Fayyad, G. P. Shapiro and P. Smyth, "The KDD process for
extracting useful knowledge from volumes of data," Communication of
ACM. Nov. vol. 39 (11), pp. 27-34, 1996.
[9] M. V. Fidelis, H. S. Lopes and A. A. Freitas, "Discovering
comprehensible classification rules with a genetic algorithm," in Proc.
Congress on Evolutionary Computation-2000 (CEC-2000), La Jolla,
CA, USA, IEEE, July 2000, pp. 805-810.
[10] A.A. Freitas, "A survey of evolutionary algorithms for data mining and
knowledge discovery," In: A. Ghosh and S. Tsutsui (Eds.) Advances in
Evolutionary Computation, Springer-Verlag, 2002.
[11] B.R. Gaines, "Transforming rules and trees into comprehensible
knowledge structures," AAAI, MIT Press, 1995.
[12] B.R. Gaines and P. Compton, "Induction of ripple down rules applied to
modeling large database," Journal of Intelligent Information System.
5(3), pp. 211-228, 1995.
[13] D. E. Goldberg, "Genetic algorithms in search, optimization and
machine learning," Addison-Wesley, 1989.
[14] Han and Kamber, "Data mining: concepts and techniques," Academic
Press, 2001.
[15] N. M. Hewahi and K. K. Bharadwaj, "Bucket brigade algorithm for
HCPRs based system," International Journal of Intelligent Systems, 11,
pp. 197-226, 1996.
[16] F. Hussain, H. Liu and E. Suzuki, "Exception rule mining with a relative
interestingness measure," in Proc. 4th Pacific-Asia Conference on
Knowledge Discovery and Data Mining, 2000, pp. 86-97.
[17] N. K. Jain, K.K. Bharadwaj and N. Marranghello, "Extended of
hierarchical censored production rules (EHCPRs) system: An approach
toward generalized knowledge representation," Journal of Intelligent
System, 9(3, 4), pp. 259-295, 1999.
[18] J. Kivinen, H. Mannila and E. Ukkonen, "Learning rules with local
exceptions," EuroCOLT, 1994, pp. 35-46.
[19] W. Kwedlo and M. Kretowski, "Discovery of decision rules from
databases: an evolutionary approach," in Proc. 2nd European Symp. On
Principles of Data Mining and Knowledge Discovery (PKDD-98),
Lecture Notes in Computer Science 1510, Springer, 1998, pp. 370-378.
[20] B. Liu, M. Hu and W. Hsu, "Intuitive representation of decision tress
using general rules and exceptions," AAAI-2000, 2000.
[21] R. S. Michalski and P. H. Winston, "Variable precision logic," Artificial
Intelligence, vol. 29, pp. 121-146,1986.
[22] N. J. Radcliffe and P. D. Surry, "Co-operation through hierarchical
competition in genetic data mining," EPCC-TR94-09, 1994.
[23] K.K. Bharadwaj, Neerja and G.C. Goel, "Hierarchical censored
production rules system employing Dampster-Shafer uncertainty
calculus", Information and Software Technology, 36, pp.155-164, 1994.
[24] P. Haddaway, "A variable precision logic inference system employing
the Dempster-Shafer uncertainty calculus", MS Thesis (UILU-ENG-86-
1777), University of Illinois, Urbana-Champaign, IL, USA, 1987.
[25] E. Suzuki, and J.M. Zytkow, "Unified algorithm for undirected
discovery of exception rules," International Journal of Intelligent
Systems, vol.20, pp. 673-691, 2005.
[26] B. Alatas and A. Arslan, "Mining of interesting prediction rules with
uniform two-level genetic algorithm," International Journal of
Computational Intelligence, vol. 1, no. 4, pp. 276-281,2004.
[27] M. C. J. Bot and W. B. Langdon, " Application of genetic programming
to induction of linear classification trees," Genetic Programming: Proc.
of the 3rd European Conference (EuroCP-2000), Lecture Notes in
Computer Science, 1802,Springer, 2000, pp. 247-258.
[28] K.K. Gundogan, B. Alatas, A. Karci and Y. Tatar, "Comprehensible
classification rule mining with two-level genetic algorithm," in Proc. 2nd
FAE International Symposium, Cyprus, 2002, pp. 373-377.
[29] S. Bhattacharyya, "Evolutionary algorithms in data mining: multiobjective
performance modeling for direct marketing," in Proc. 6th ACM
SIGKDD International Conference on Knowledge Discovery and Data
Mining (KDD-2000), ACM, 2000, pp. 465-473.
[30] UCI Repository of Machine Learning Databases. Department of
Information and Computer Science University of California, 1994.
Available: http://www.ics.uci.edu/~mlearn/MLRepositry.html.