Evolutionary Approach for Automated Discovery of Censored Production Rules
In the recent past, there has been an increasing interest
in applying evolutionary methods to Knowledge Discovery in
Databases (KDD) and a number of successful applications of Genetic
Algorithms (GA) and Genetic Programming (GP) to KDD have been
demonstrated. The most predominant representation of the
discovered knowledge is the standard Production Rules (PRs) in the
form If P Then D. The PRs, however, are unable to handle
exceptions and do not exhibit variable precision. The Censored
Production Rules (CPRs), an extension of PRs, were proposed by
Michalski & Winston that exhibit variable precision and supports an
efficient mechanism for handling exceptions. A CPR is an
augmented production rule of the form:
If P Then D Unless C, where C (Censor) is an exception to the rule.
Such rules are employed in situations, in which the conditional
statement 'If P Then D' holds frequently and the assertion C holds
rarely. By using a rule of this type we are free to ignore the exception
conditions, when the resources needed to establish its presence are
tight or there is simply no information available as to whether it
holds or not. Thus, the 'If P Then D' part of the CPR expresses
important information, while the Unless C part acts only as a switch
and changes the polarity of D to ~D.
This paper presents a classification algorithm based on evolutionary
approach that discovers comprehensible rules with exceptions in the
form of CPRs.
The proposed approach has flexible chromosome encoding, where
each chromosome corresponds to a CPR. Appropriate genetic
operators are suggested and a fitness function is proposed that
incorporates the basic constraints on CPRs. Experimental results are
presented to demonstrate the performance of the proposed algorithm.
[1] K.K. Bharadwaj and N.K Jain, "Hierarchical censored production rules
(HCPRs) system," Data & Knowledge Engineering, North Holland, vol. 8,
pp. 19-34, 1992.
[2] K.K. Bharadwaj, N.M. Hewahi and M.A. Brando, "Adaptive
hierarchical censored production rule-based system: A genetic algorithm
approach," Advances in Artificial Intelligence, SBIA -96, Lecture Notes
in Artificial Intelligence, No. 1159, Berlin, Germany, Springer-Verlag,
pp. 81-90, 1996.
[3] Basheer M. Al-Maqaleh and Kamal K. Bharadwaj "Genetic
programming approach to hierarchical production discovery," in Proc.
5th International Conference on Databases and Data Mining
(DBDM2005), vol. 6, Istanbul, Turkey, June 2005, pp. 271-274.
[4] M. Brameier and W. Banzhaf, "A comparison of linear genetic
programming and neural networks in medical data mining," IEEE
Transaction on Evolutionary Computations 5(1), pp. 17-26, 2001.
[5] I. De Falco, A. Della Cioppa and E. Tarantino, "Discovering interesting
classification rules with genetic programming," Applied Soft Computing,
1, pp. 257-269, 2002.
[6] K. A. De Jong, W.M. Spears and D. F. Gordon, "Using genetic
algorithms for concept learning," Machine Learning, vol. 13, pp. 161-
188, 1993.
[7] W. Romao, A.A. Freitas and I.M. de S. Gimenes, "Discovering
interesting knowledge from a science and technology database with a
genetic algorithm," Applied Soft Computing, 4, pp. 121-137, 2004.
[8] U. M. Fayyad, G. P. Shapiro and P. Smyth, "The KDD process for
extracting useful knowledge from volumes of data," Communication of
ACM. Nov. vol. 39 (11), pp. 27-34, 1996.
[9] M. V. Fidelis, H. S. Lopes and A. A. Freitas, "Discovering
comprehensible classification rules with a genetic algorithm," in Proc.
Congress on Evolutionary Computation-2000 (CEC-2000), La Jolla,
CA, USA, IEEE, July 2000, pp. 805-810.
[10] A.A. Freitas, "A survey of evolutionary algorithms for data mining and
knowledge discovery," In: A. Ghosh and S. Tsutsui (Eds.) Advances in
Evolutionary Computation, Springer-Verlag, 2002.
[11] B.R. Gaines, "Transforming rules and trees into comprehensible
knowledge structures," AAAI, MIT Press, 1995.
[12] B.R. Gaines and P. Compton, "Induction of ripple down rules applied to
modeling large database," Journal of Intelligent Information System.
5(3), pp. 211-228, 1995.
[13] D. E. Goldberg, "Genetic algorithms in search, optimization and
machine learning," Addison-Wesley, 1989.
[14] Han and Kamber, "Data mining: concepts and techniques," Academic
Press, 2001.
[15] N. M. Hewahi and K. K. Bharadwaj, "Bucket brigade algorithm for
HCPRs based system," International Journal of Intelligent Systems, 11,
pp. 197-226, 1996.
[16] F. Hussain, H. Liu and E. Suzuki, "Exception rule mining with a relative
interestingness measure," in Proc. 4th Pacific-Asia Conference on
Knowledge Discovery and Data Mining, 2000, pp. 86-97.
[17] N. K. Jain, K.K. Bharadwaj and N. Marranghello, "Extended of
hierarchical censored production rules (EHCPRs) system: An approach
toward generalized knowledge representation," Journal of Intelligent
System, 9(3, 4), pp. 259-295, 1999.
[18] J. Kivinen, H. Mannila and E. Ukkonen, "Learning rules with local
exceptions," EuroCOLT, 1994, pp. 35-46.
[19] W. Kwedlo and M. Kretowski, "Discovery of decision rules from
databases: an evolutionary approach," in Proc. 2nd European Symp. On
Principles of Data Mining and Knowledge Discovery (PKDD-98),
Lecture Notes in Computer Science 1510, Springer, 1998, pp. 370-378.
[20] B. Liu, M. Hu and W. Hsu, "Intuitive representation of decision tress
using general rules and exceptions," AAAI-2000, 2000.
[21] R. S. Michalski and P. H. Winston, "Variable precision logic," Artificial
Intelligence, vol. 29, pp. 121-146,1986.
[22] N. J. Radcliffe and P. D. Surry, "Co-operation through hierarchical
competition in genetic data mining," EPCC-TR94-09, 1994.
[23] K.K. Bharadwaj, Neerja and G.C. Goel, "Hierarchical censored
production rules system employing Dampster-Shafer uncertainty
calculus", Information and Software Technology, 36, pp.155-164, 1994.
[24] P. Haddaway, "A variable precision logic inference system employing
the Dempster-Shafer uncertainty calculus", MS Thesis (UILU-ENG-86-
1777), University of Illinois, Urbana-Champaign, IL, USA, 1987.
[25] E. Suzuki, and J.M. Zytkow, "Unified algorithm for undirected
discovery of exception rules," International Journal of Intelligent
Systems, vol.20, pp. 673-691, 2005.
[26] B. Alatas and A. Arslan, "Mining of interesting prediction rules with
uniform two-level genetic algorithm," International Journal of
Computational Intelligence, vol. 1, no. 4, pp. 276-281,2004.
[27] M. C. J. Bot and W. B. Langdon, " Application of genetic programming
to induction of linear classification trees," Genetic Programming: Proc.
of the 3rd European Conference (EuroCP-2000), Lecture Notes in
Computer Science, 1802,Springer, 2000, pp. 247-258.
[28] K.K. Gundogan, B. Alatas, A. Karci and Y. Tatar, "Comprehensible
classification rule mining with two-level genetic algorithm," in Proc. 2nd
FAE International Symposium, Cyprus, 2002, pp. 373-377.
[29] S. Bhattacharyya, "Evolutionary algorithms in data mining: multiobjective
performance modeling for direct marketing," in Proc. 6th ACM
SIGKDD International Conference on Knowledge Discovery and Data
Mining (KDD-2000), ACM, 2000, pp. 465-473.
[30] UCI Repository of Machine Learning Databases. Department of
Information and Computer Science University of California, 1994.
Available: http://www.ics.uci.edu/~mlearn/MLRepositry.html.
[1] K.K. Bharadwaj and N.K Jain, "Hierarchical censored production rules
(HCPRs) system," Data & Knowledge Engineering, North Holland, vol. 8,
pp. 19-34, 1992.
[2] K.K. Bharadwaj, N.M. Hewahi and M.A. Brando, "Adaptive
hierarchical censored production rule-based system: A genetic algorithm
approach," Advances in Artificial Intelligence, SBIA -96, Lecture Notes
in Artificial Intelligence, No. 1159, Berlin, Germany, Springer-Verlag,
pp. 81-90, 1996.
[3] Basheer M. Al-Maqaleh and Kamal K. Bharadwaj "Genetic
programming approach to hierarchical production discovery," in Proc.
5th International Conference on Databases and Data Mining
(DBDM2005), vol. 6, Istanbul, Turkey, June 2005, pp. 271-274.
[4] M. Brameier and W. Banzhaf, "A comparison of linear genetic
programming and neural networks in medical data mining," IEEE
Transaction on Evolutionary Computations 5(1), pp. 17-26, 2001.
[5] I. De Falco, A. Della Cioppa and E. Tarantino, "Discovering interesting
classification rules with genetic programming," Applied Soft Computing,
1, pp. 257-269, 2002.
[6] K. A. De Jong, W.M. Spears and D. F. Gordon, "Using genetic
algorithms for concept learning," Machine Learning, vol. 13, pp. 161-
188, 1993.
[7] W. Romao, A.A. Freitas and I.M. de S. Gimenes, "Discovering
interesting knowledge from a science and technology database with a
genetic algorithm," Applied Soft Computing, 4, pp. 121-137, 2004.
[8] U. M. Fayyad, G. P. Shapiro and P. Smyth, "The KDD process for
extracting useful knowledge from volumes of data," Communication of
ACM. Nov. vol. 39 (11), pp. 27-34, 1996.
[9] M. V. Fidelis, H. S. Lopes and A. A. Freitas, "Discovering
comprehensible classification rules with a genetic algorithm," in Proc.
Congress on Evolutionary Computation-2000 (CEC-2000), La Jolla,
CA, USA, IEEE, July 2000, pp. 805-810.
[10] A.A. Freitas, "A survey of evolutionary algorithms for data mining and
knowledge discovery," In: A. Ghosh and S. Tsutsui (Eds.) Advances in
Evolutionary Computation, Springer-Verlag, 2002.
[11] B.R. Gaines, "Transforming rules and trees into comprehensible
knowledge structures," AAAI, MIT Press, 1995.
[12] B.R. Gaines and P. Compton, "Induction of ripple down rules applied to
modeling large database," Journal of Intelligent Information System.
5(3), pp. 211-228, 1995.
[13] D. E. Goldberg, "Genetic algorithms in search, optimization and
machine learning," Addison-Wesley, 1989.
[14] Han and Kamber, "Data mining: concepts and techniques," Academic
Press, 2001.
[15] N. M. Hewahi and K. K. Bharadwaj, "Bucket brigade algorithm for
HCPRs based system," International Journal of Intelligent Systems, 11,
pp. 197-226, 1996.
[16] F. Hussain, H. Liu and E. Suzuki, "Exception rule mining with a relative
interestingness measure," in Proc. 4th Pacific-Asia Conference on
Knowledge Discovery and Data Mining, 2000, pp. 86-97.
[17] N. K. Jain, K.K. Bharadwaj and N. Marranghello, "Extended of
hierarchical censored production rules (EHCPRs) system: An approach
toward generalized knowledge representation," Journal of Intelligent
System, 9(3, 4), pp. 259-295, 1999.
[18] J. Kivinen, H. Mannila and E. Ukkonen, "Learning rules with local
exceptions," EuroCOLT, 1994, pp. 35-46.
[19] W. Kwedlo and M. Kretowski, "Discovery of decision rules from
databases: an evolutionary approach," in Proc. 2nd European Symp. On
Principles of Data Mining and Knowledge Discovery (PKDD-98),
Lecture Notes in Computer Science 1510, Springer, 1998, pp. 370-378.
[20] B. Liu, M. Hu and W. Hsu, "Intuitive representation of decision tress
using general rules and exceptions," AAAI-2000, 2000.
[21] R. S. Michalski and P. H. Winston, "Variable precision logic," Artificial
Intelligence, vol. 29, pp. 121-146,1986.
[22] N. J. Radcliffe and P. D. Surry, "Co-operation through hierarchical
competition in genetic data mining," EPCC-TR94-09, 1994.
[23] K.K. Bharadwaj, Neerja and G.C. Goel, "Hierarchical censored
production rules system employing Dampster-Shafer uncertainty
calculus", Information and Software Technology, 36, pp.155-164, 1994.
[24] P. Haddaway, "A variable precision logic inference system employing
the Dempster-Shafer uncertainty calculus", MS Thesis (UILU-ENG-86-
1777), University of Illinois, Urbana-Champaign, IL, USA, 1987.
[25] E. Suzuki, and J.M. Zytkow, "Unified algorithm for undirected
discovery of exception rules," International Journal of Intelligent
Systems, vol.20, pp. 673-691, 2005.
[26] B. Alatas and A. Arslan, "Mining of interesting prediction rules with
uniform two-level genetic algorithm," International Journal of
Computational Intelligence, vol. 1, no. 4, pp. 276-281,2004.
[27] M. C. J. Bot and W. B. Langdon, " Application of genetic programming
to induction of linear classification trees," Genetic Programming: Proc.
of the 3rd European Conference (EuroCP-2000), Lecture Notes in
Computer Science, 1802,Springer, 2000, pp. 247-258.
[28] K.K. Gundogan, B. Alatas, A. Karci and Y. Tatar, "Comprehensible
classification rule mining with two-level genetic algorithm," in Proc. 2nd
FAE International Symposium, Cyprus, 2002, pp. 373-377.
[29] S. Bhattacharyya, "Evolutionary algorithms in data mining: multiobjective
performance modeling for direct marketing," in Proc. 6th ACM
SIGKDD International Conference on Knowledge Discovery and Data
Mining (KDD-2000), ACM, 2000, pp. 465-473.
[30] UCI Repository of Machine Learning Databases. Department of
Information and Computer Science University of California, 1994.
Available: http://www.ics.uci.edu/~mlearn/MLRepositry.html.
@article{"International Journal of Information, Control and Computer Sciences:63171", author = "Kamal K. Bharadwaj and Basheer M. Al-Maqaleh", title = "Evolutionary Approach for Automated Discovery of Censored Production Rules", abstract = "In the recent past, there has been an increasing interest
in applying evolutionary methods to Knowledge Discovery in
Databases (KDD) and a number of successful applications of Genetic
Algorithms (GA) and Genetic Programming (GP) to KDD have been
demonstrated. The most predominant representation of the
discovered knowledge is the standard Production Rules (PRs) in the
form If P Then D. The PRs, however, are unable to handle
exceptions and do not exhibit variable precision. The Censored
Production Rules (CPRs), an extension of PRs, were proposed by
Michalski & Winston that exhibit variable precision and supports an
efficient mechanism for handling exceptions. A CPR is an
augmented production rule of the form:
If P Then D Unless C, where C (Censor) is an exception to the rule.
Such rules are employed in situations, in which the conditional
statement 'If P Then D' holds frequently and the assertion C holds
rarely. By using a rule of this type we are free to ignore the exception
conditions, when the resources needed to establish its presence are
tight or there is simply no information available as to whether it
holds or not. Thus, the 'If P Then D' part of the CPR expresses
important information, while the Unless C part acts only as a switch
and changes the polarity of D to ~D.
This paper presents a classification algorithm based on evolutionary
approach that discovers comprehensible rules with exceptions in the
form of CPRs.
The proposed approach has flexible chromosome encoding, where
each chromosome corresponds to a CPR. Appropriate genetic
operators are suggested and a fitness function is proposed that
incorporates the basic constraints on CPRs. Experimental results are
presented to demonstrate the performance of the proposed algorithm.", keywords = "Censored Production Rule, Data Mining, MachineLearning, Evolutionary Algorithms.", volume = "1", number = "10", pages = "3299-6", }