Learning Classifier Systems Approach for Automated Discovery of Censored Production Rules

In the recent past Learning Classifier Systems have been successfully used for data mining. Learning Classifier System (LCS) is basically a machine learning technique which combines evolutionary computing, reinforcement learning, supervised or unsupervised learning and heuristics to produce adaptive systems. A LCS learns by interacting with an environment from which it receives feedback in the form of numerical reward. Learning is achieved by trying to maximize the amount of reward received. All LCSs models more or less, comprise four main components; a finite population of condition–action rules, called classifiers; the performance component, which governs the interaction with the environment; the credit assignment component, which distributes the reward received from the environment to the classifiers accountable for the rewards obtained; the discovery component, which is responsible for discovering better rules and improving existing ones through a genetic algorithm. The concatenate of the production rules in the LCS form the genotype, and therefore the GA should operate on a population of classifier systems. This approach is known as the 'Pittsburgh' Classifier Systems. Other LCS that perform their GA at the rule level within a population are known as 'Mitchigan' Classifier Systems. The most predominant representation of the discovered knowledge is the standard production rules (PRs) in the form of IF P THEN D. The PRs, however, are unable to handle exceptions and do not exhibit variable precision. The Censored Production Rules (CPRs), an extension of PRs, were proposed by Michalski and Winston that exhibit variable precision and supports an efficient mechanism for handling exceptions. A CPR is an augmented production rule of the form: IF P THEN D UNLESS C, where Censor C is an exception to the rule. Such rules are employed in situations, in which conditional statement IF P THEN D holds frequently and the assertion C holds rarely. By using a rule of this type we are free to ignore the exception conditions, when the resources needed to establish its presence are tight or there is simply no information available as to whether it holds or not. Thus, the IF P THEN D part of CPR expresses important information, while the UNLESS C part acts only as a switch and changes the polarity of D to ~D. In this paper Pittsburgh style LCSs approach is used for automated discovery of CPRs. An appropriate encoding scheme is suggested to represent a chromosome consisting of fixed size set of CPRs. Suitable genetic operators are designed for the set of CPRs and individual CPRs and also appropriate fitness function is proposed that incorporates basic constraints on CPR. Experimental results are presented to demonstrate the performance of the proposed learning classifier system.




References:
[1] Ian H. Witten and Eibe Frank, ÔÇÿDATA MINING: Practical Machine
Learning Tools and Techniques-, Second Edition, Morgan Kaufmann
Publishers, An imprint of Elsevier.
[2] U.M. Fayyad, G. P. Shapiro and P. Smyth, ÔÇÿThe KDD process for
extracting useful knowledge from volumes from data-, Communication
of ACM, Nov. vol. 39(11), page 27 - 34, 1996.
[3] Pieter Adriaans and Dolf Zantinge, in the book on DATA MINING by
Pearson Ed. Publication.
[4] Holland, J.H. (1975) Adaptation in Natural and Artificial Systems.
University of Michigan Press.
[5] Holland, J.H. (1976) Adaptation. In Rosen & Snell (eds) Progress in
Theoretical Biology, 4.Plenum.
[6] Bernado, E., Llora, X., & Garrell, J. M. (2001). XCS and GALE: a
comparative study of two learning classifier systems with six other
learning algorithms on classification tasks. In Fourth International
Workshop on Learning Classifier Systems - IWLCS-2001 pages 337-
341.
[7] Jaume Bacardit and Martin V. Butz, Enginyeria i Arquitectura La Salle,
Universitat Ramon Llull, Passeig Bonanova 8, 08022, Barcelona,
[email protected], ÔÇÿData Mining in Learning Classifer
Systems:Comparing XCS with GAssist-
[8] Holland, J.H., ÔÇÿ Escaping Brittleness: The possibilities of General
Purpose Learning Algorithms applied to Parallel rule - based systems-,
ÔÇÿMachine Learning: An Artificial Intelligence approach-, Vol. II, pages
593 - 623, 1978.
[9] John H. Homes, Pier Luca lanzi, Wolfgang Stolzmann, Stewart W.
Wilson, ÔÇÿLearning Classifier Systems: New Models, Successful
Applications.
[10] Kenneth A. De Jong, ÔÇÿLearning with Genetic Algorithms: An
Overview-, Machine Learning, 1988.
[11] Michalski, R.S. and Winston, P.H. 1986, ÔÇÿVariable precision logic-,
Artificial Intelligence, Vol 29, pp 121-146.
[12] Winston, P.H., ÔÇÿ Learning by augmenting rules and accumulating
censors-, MIT AI Laboratory, AIM - 678, Cambridge, MA, 1982; also in
R.S. Michalski, j.G. Carbonell and T. M Mitchell (Eds.), ÔÇÿMachine
learning: An Artificial Intelligence Approach 2 ( Morgan Kaufmann,
Los Altos, CA, 1986).
[13] Alwyn Barry, John Holmes, and Xavier Llora, ÔÇÿData Mining using
Learning Classifier Systems- , pages 21 - 23
[14] Pierre Bonelli, Alexandre Parodi, Sandip Sen, and Stewart W. Wilson.
ÔÇÿNEWBOOLE: A Fast GBML System-, In International Conference on
Machine Learning, pages 153-159, San Mateo, California, 1990.
Morgan Kaufmann.
[15] John H. Holmes, ÔÇÿEvolution-Assisted Discovery of Sentinel Features in
Epidemiologic Surveillance-. PhD thesis, Drexel University, 1996.
[16] John H. Holmes, ÔÇÿA genetics-based machine learning approach to
knowledge discovery in clinical data-. Journal of the American Medical
Informatics Association Supplement, 1996.
[17] Stewart W. Wilson,- Knowledge Growth in an Artificial Animal-. In
Grefenstette [45], pages 16-23. Also appeared in Proceedings of the 4th
Yale.
[18] Stewart W. Wilson,- Classifier Systems and the Animat Problem-.
Machine Learning, 2:199-228, 1987. Also Research Memo RIS-36r, the
Rowland Institute for Science, Cambridge, MA, 1986.
[19] John H. Holmes, ÔÇÿDiscovering Risk of Disease with a Learning
Classifier System-, In Thomas Back, editor, Proceedings of the 7th
International Conference on Genetic Algorithms (ICGA97). Morgan
Kaufmann, 1997.
[20] Website address of an online data repository; from where Mushroom
training dataset and other training data sets are taken:
http://www.sgi.com/tech/mlc/db/
[21] K.K. Bharadwaj and N.K. Jain, ÔÇÿHierarchical Censored production Rules
System-, Data and Knowledge Engineering, North Holland, vol. 8, page
19 - 34, 1992.
[22] E. Suzuki, and J.M. Zytkow, ÔÇÿUnified algorithm for undirected discovery
of exception rules-, International Journal of Intelligent Systems, vol. 20,
page 673 - 691, 2005.
[23] N.J. Radcliffe and P.D. Surry, ÔÇÿCo - operation through hierarchical
competition in genetic data mining-, EPCC - TR94 - 09, 1994.