Abstract: In the recent past, there has been an increasing interest
in applying evolutionary methods to Knowledge Discovery in
Databases (KDD) and a number of successful applications of Genetic
Algorithms (GA) and Genetic Programming (GP) to KDD have been
demonstrated. The predominant representation of the
discovered knowledge is the standard Production Rule (PR) of the
form If P Then D. PRs, however, are unable to handle
exceptions and do not exhibit variable precision. Censored
Production Rules (CPRs), an extension of PRs proposed by
Michalski & Winston, exhibit variable precision and support an
efficient mechanism for handling exceptions. A CPR is an
augmented production rule of the form:
If P Then D Unless C, where C (Censor) is an exception to the rule.
Such rules are employed in situations in which the conditional
statement 'If P Then D' holds frequently and the assertion C holds
rarely. With a rule of this type we are free to ignore the exception
condition when the resources needed to establish its presence are
tight or when there is simply no information available as to whether
it holds. Thus, the 'If P Then D' part of the CPR expresses
important information, while the Unless C part acts only as a switch
that changes the polarity of D to ~D.
This paper presents a classification algorithm based on an evolutionary
approach that discovers comprehensible rules with exceptions in the
form of CPRs.
The proposed approach uses a flexible chromosome encoding in which
each chromosome corresponds to a CPR. Appropriate genetic
operators are suggested and a fitness function is proposed that
incorporates the basic constraints on CPRs. Experimental results are
presented to demonstrate the performance of the proposed algorithm.
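The switch-like behaviour of the censor described above can be sketched in a few lines of Python. This is a minimal illustration only, not the encoding used by the proposed algorithm: the class name `CPR`, the `check_censor` flag, and the bird/penguin facts are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CPR:
    """If premise Then decision Unless censor (illustrative names only)."""
    premise: Callable[[dict], bool]
    decision: str
    censor: Callable[[dict], Optional[bool]]  # None means "unknown"

    def infer(self, facts: dict, check_censor: bool = True) -> Optional[str]:
        if not self.premise(facts):
            return None  # the rule does not apply
        if not check_censor:
            return self.decision  # resources are tight: ignore the exception
        if self.censor(facts):
            return "~" + self.decision  # censor holds: flip D to ~D
        return self.decision  # censor is false or unknown


# Hypothetical rule: If is_bird Then flies Unless is_penguin
rule = CPR(
    premise=lambda f: f.get("is_bird", False),
    decision="flies",
    censor=lambda f: f.get("is_penguin"),  # missing key -> None (unknown)
)
```

Calling `rule.infer(facts, check_censor=False)` corresponds to ignoring the exception condition when establishing it is too costly; with no information about the censor, the rule still concludes D.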
Abstract: A Censored Production Rule (CPR) is an extension of the
standard production rule that addresses the problems of reasoning with
incomplete information under resource constraints and of reasoning
efficiently with exceptions. A CPR has the form: IF A
(Condition) THEN B (Action) UNLESS C (Censor), where C is the
exception condition. Fuzzy CPRs are obtained by augmenting the
ordinary fuzzy production rule “If X is A then Y is B” with an
exception condition and are written in the form “If X is A then Y is B
Unless Z is C”. Such rules are employed in situations in which the
fuzzy conditional statement “If X is A then Y is B” holds frequently
and the exception condition “Z is C” holds rarely. Thus the “If X is A
then Y is B” part of the fuzzy CPR expresses important information,
while the Unless part acts only as a switch that changes the polarity of
“Y is B” to “Y is not B” when the assertion “Z is C” holds. The
proposed approach is an attempt to discover fuzzy censored
production rules from a set of discovered fuzzy if-then rules, in the
form:
A(X) → B(Y) || C(Z).
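To make the role of the fuzzy censor concrete, the following Python sketch evaluates a single fuzzy CPR from two membership degrees. The abstract does not say how the switch is computed, so the threshold test and the use of `min` for the strength of the flipped conclusion are assumptions made purely for illustration; `fuzzy_cpr` and `censor_threshold` are hypothetical names.

```python
def fuzzy_cpr(mu_a: float, mu_c: float, censor_threshold: float = 0.5):
    """Evaluate 'If X is A then Y is B Unless Z is C'.

    mu_a: membership degree of X in A.
    mu_c: membership degree of Z in C.
    Returns (conclusion, strength). The threshold switch is an assumption.
    """
    for mu in (mu_a, mu_c):
        if not 0.0 <= mu <= 1.0:
            raise ValueError("membership degrees must lie in [0, 1]")
    if mu_c >= censor_threshold:
        # The exception holds strongly enough: flip the polarity of the conclusion.
        return ("Y is not B", min(mu_a, mu_c))
    # The exception holds weakly or not at all: the ordinary fuzzy rule fires.
    return ("Y is B", mu_a)
```

Under these assumptions, `fuzzy_cpr(0.8, 0.1)` keeps the conclusion “Y is B”, while `fuzzy_cpr(0.8, 0.9)` switches it to “Y is not B”.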
Abstract: In the recent past, Learning Classifier Systems have
been successfully used for data mining. A Learning Classifier System
(LCS) is a machine learning technique that combines
evolutionary computing, reinforcement learning, supervised or
unsupervised learning, and heuristics to produce adaptive systems. An
LCS learns by interacting with an environment from which it
receives feedback in the form of numerical reward. Learning is
achieved by trying to maximize the amount of reward received.
Virtually all LCS models comprise four main components: a finite
population of condition–action rules, called classifiers; the
performance component, which governs the interaction with the
environment; the credit assignment component, which distributes the
reward received from the environment to the classifiers accountable
for it; and the discovery component, which is
responsible for discovering better rules and improving existing ones
through a genetic algorithm (GA). When the concatenation of the
production rules in an LCS forms the genotype, the GA operates
on a population of classifier systems; this approach is known as the
'Pittsburgh' Classifier System. LCSs that instead apply the GA at
the rule level within a population are known as 'Michigan' Classifier
Systems. The predominant representation of the discovered
knowledge is the standard production rule (PR) of the form IF P
THEN D. PRs, however, are unable to handle exceptions and do
not exhibit variable precision. Censored Production Rules
(CPRs), an extension of PRs proposed by Michalski and
Winston, exhibit variable precision and support an efficient
mechanism for handling exceptions. A CPR is an augmented
production rule of the form: IF P THEN D UNLESS C, where the
Censor C is an exception to the rule. Such rules are employed in
situations in which the conditional statement IF P THEN D holds
frequently and the assertion C holds rarely. With a rule of this
type we are free to ignore the exception condition when the
resources needed to establish its presence are tight or when there is
simply no information available as to whether it holds. Thus, the IF P
THEN D part of the CPR expresses important information, while the
UNLESS C part acts only as a switch that changes the polarity of D
to ~D. In this paper, a Pittsburgh-style LCS approach is used for the
automated discovery of CPRs. An appropriate encoding scheme is
suggested to represent a chromosome consisting of a fixed-size set of
CPRs. Suitable genetic operators are designed for sets of CPRs
and for individual CPRs, and an appropriate fitness function is proposed
that incorporates the basic constraints on CPRs. Experimental results are
presented to demonstrate the performance of the proposed learning
classifier system.
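The Pittsburgh-style representation, in which one chromosome is a fixed-size set of CPRs, can be sketched as follows. The abstract does not specify the encoding or the operators, so the triple-of-strings rule format and the one-point crossover at rule boundaries are assumptions chosen for illustration; `Rule`, `Chromosome`, and `one_point_crossover` are hypothetical names.

```python
import random
from typing import List, Tuple

# Hypothetical encoding: a CPR is a (premise, decision, censor) triple of strings.
Rule = Tuple[str, str, str]
# Pittsburgh style: one individual is a whole fixed-size rule set.
Chromosome = List[Rule]


def one_point_crossover(a: Chromosome, b: Chromosome, rng: random.Random):
    """Swap rule-set tails at a random cut so both children keep the fixed size."""
    if len(a) != len(b):
        raise ValueError("chromosomes must hold the same number of CPRs")
    if len(a) < 2:
        return list(a), list(b)  # nothing to exchange
    cut = rng.randint(1, len(a) - 1)  # cut falls between whole rules
    return a[:cut] + b[cut:], b[:cut] + a[cut:]
```

Because the cut always falls between whole rules, this operator recombines rule sets without altering any individual CPR; an operator acting on individual CPRs would instead modify the premise, decision, or censor inside a triple.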
Abstract: Knowledge is indispensable, but voluminous knowledge becomes a bottleneck for efficient processing. A major challenge in data mining is the large number of potential rules generated by the mining process. In fact, the result is sometimes comparable in size to the original data. Traditional pruning measures such as support do not sufficiently reduce the huge rule space. Moreover, many practical applications are characterized by continual change of data and knowledge, so the knowledge base grows with each change. The predominant representation of the discovered knowledge is the standard Production Rule (PR) of the form If P Then D. Michalski & Winston proposed Censored Production Rules (CPRs) as an extension of production rules that exhibit variable precision and support an efficient mechanism for handling exceptions. A CPR is an augmented production rule of the form: If P Then D Unless C, where C (Censor) is an exception to the rule. Such rules are employed in situations in which the conditional statement 'If P Then D' holds frequently and the assertion C holds rarely. With a rule of this type we are free to ignore the exception condition when the resources needed to establish its presence are tight or when there is simply no information available as to whether it holds. Thus the 'If P Then D' part of the CPR expresses important information, while the Unless C part acts only as a switch that changes the polarity of D to ~D. In this paper, a scheme based on a Dempster-Shafer Theory (DST) interpretation of a CPR is suggested for discovering CPRs from previously discovered flat PRs. Discovering CPRs from flat rules results in a considerable reduction in the number of discovered rules. The proposed scheme incrementally incorporates new knowledge and also reduces the size of the knowledge base considerably with each episode. Examples are given to demonstrate the behaviour of the proposed scheme.
The suggested cumulative learning scheme would be useful in mining data streams.
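The reduction obtained by folding flat rules into CPRs can be illustrated with a purely syntactic sketch. It is not the DST-based scheme of the paper: it uses no belief or support values, and simply pairs a flat rule "P and ~C -> D" with "P and C -> ~D" to emit "If P Then D Unless C". The literal format (a leading `~` for negation) and the names `negate` and `merge_into_cprs` are assumptions introduced here.

```python
from typing import FrozenSet, List, Tuple

FlatRule = Tuple[FrozenSet[str], str]      # (premise literals, decision)
CPRule = Tuple[FrozenSet[str], str, str]   # (premise P, decision D, censor C)


def negate(lit: str) -> str:
    """Toggle the '~' prefix that marks a negated literal."""
    return lit[1:] if lit.startswith("~") else "~" + lit


def merge_into_cprs(rules: List[FlatRule]):
    """Pair 'P & ~C -> D' with 'P & C -> ~D'; return (CPRs, leftover flat rules)."""
    cprs: List[CPRule] = []
    used = set()
    for i, (prem_i, dec_i) in enumerate(rules):
        for j in range(i + 1, len(rules)):
            if i in used or j in used:
                continue
            prem_j, dec_j = rules[j]
            if dec_j != negate(dec_i):
                continue  # decisions must be opposite
            only_i, only_j = prem_i - prem_j, prem_j - prem_i
            if len(only_i) != 1 or len(only_j) != 1:
                continue  # premises must differ in exactly one literal
            (lit_i,), (lit_j,) = only_i, only_j
            if lit_i != negate(lit_j):
                continue  # the differing literals must be C and ~C
            common = prem_i & prem_j
            if lit_i.startswith("~"):
                cprs.append((common, dec_i, lit_j))  # rule i is the main rule
            else:
                cprs.append((common, dec_j, lit_i))  # rule j is the main rule
            used.update((i, j))
    leftover = [r for k, r in enumerate(rules) if k not in used]
    return cprs, leftover
```

In this sketch two flat rules collapse into one CPR, and rules that find no partner are passed through unchanged, which mirrors the idea of shrinking the rule base while retaining its content.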