Association Rule and Decision Tree based Methods for Fuzzy Rule Base Generation

This paper focuses on the data-driven generation of fuzzy IF...THEN rules. The resulting fuzzy rule base can be used to build a classifier or a model for prediction, or to form a decision support system. Among the wide range of possible approaches, decision tree and association rule based algorithms are reviewed, and two new approaches are presented that rely on a priori partitioning of the continuous input variables by fuzzy clustering. An application study is also presented, in which the developed methods are tested on the well-known Wisconsin Breast Cancer classification problem.
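
As a rough illustration of the two ingredients named above, the following minimal sketch partitions continuous inputs with fuzzy cluster memberships and evaluates the antecedent of one fuzzy IF...THEN rule. It rests on several assumptions that are not taken from the paper: the standard fuzzy c-means membership formula stands in for whichever clustering algorithm is actually used, the cluster centers are given rather than learned, the minimum is used as the AND operator, and the feature names `cell_size` and `clump` are hypothetical.

```python
def fcm_memberships(x, centers, m=2.0):
    """Fuzzy c-means style memberships of scalar x in each cluster (they sum to 1).

    Assumption: centers are already known; m is the usual fuzzifier (> 1).
    """
    dists = [abs(x - c) for c in centers]
    # If x coincides with a center, it belongs fully to that cluster.
    if any(d == 0.0 for d in dists):
        return [1.0 if d == 0.0 else 0.0 for d in dists]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** exponent for d_k in dists) for d_i in dists]


def firing_strength(sample, antecedent, partitions):
    """Degree to which a sample satisfies a rule antecedent (min used as AND).

    antecedent maps a feature name to the index of the fuzzy set it must match.
    partitions maps a feature name to the list of cluster centers of that feature.
    """
    degrees = []
    for feature, set_index in antecedent.items():
        mu = fcm_memberships(sample[feature], partitions[feature])
        degrees.append(mu[set_index])
    return min(degrees)


# Hypothetical partitions: two fuzzy sets ("low", "high") per input variable.
partitions = {"cell_size": [1.0, 5.0], "clump": [2.0, 8.0]}

# Rule: IF cell_size is low AND clump is high THEN class = malignant (illustrative only).
antecedent = {"cell_size": 0, "clump": 1}

sample = {"cell_size": 2.0, "clump": 8.0}
strength = firing_strength(sample, antecedent, partitions)
print(round(strength, 3))
```

In a rule-based classifier of this kind, the firing strengths of all rules would typically be compared or aggregated to select the predicted class; how the paper does this is not stated in the abstract.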

