An Attribute-Centre Based Decision Tree Classification Algorithm

Decision tree algorithms hold an important place among the classification models of data mining. In the literature, algorithms use the entropy concept or the Gini index to form the tree. The shape of the classes and their closeness to each other are some of the factors that affect the performance of the algorithm. In this paper we introduce a new decision tree algorithm which employs a data (attribute) folding method and the variation of the class variables over the branches to be created. A comparative performance analysis has been carried out between the proposed algorithm and C4.5.
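For readers unfamiliar with the two conventional splitting criteria named above, the following is a minimal sketch of entropy and the Gini index computed over class labels. It illustrates only the standard measures from the literature, not the proposed attribute-folding method; the function names `entropy`, `gini`, and `weighted_impurity` are hypothetical and chosen for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini index (impurity) of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def weighted_impurity(branches, measure):
    """Size-weighted average impurity of the branches produced by a split.

    `branches` is a list of label lists, one per branch (an illustrative
    representation, assumed here for simplicity).
    """
    total = sum(len(b) for b in branches)
    return sum(len(b) / total * measure(b) for b in branches)


if __name__ == "__main__":
    parent = ["a", "a", "b", "b"]
    split = [["a", "a"], ["b", "b"]]  # a split that separates the classes
    print(entropy(parent), gini(parent))
    print(weighted_impurity(split, entropy), weighted_impurity(split, gini))
```

A conventional tree builder would pick, at each node, the split that most reduces such an impurity measure relative to the parent node.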



