Performance Optimization of Data Mining Application Using Radial Basis Function Classifier
Text data mining is a process of exploratory data
analysis. Classification maps data into predefined groups or classes.
It is often referred to as supervised learning because the classes are
determined before the data are examined. This paper describes a proposed
radial basis function classifier that performs comparative cross-validation
against an existing radial basis function classifier. The feasibility
and benefits of the proposed approach are demonstrated on a
data mining problem: direct marketing. Direct marketing has
become an important application field of data mining. Comparative
cross-validation estimates accuracy by either stratified
k-fold cross-validation or equivalent repeated random subsampling.
The proposed method may have high bias, and its performance
(accuracy estimation in our case) may be poor due to high variance.
Accordingly, the accuracy with the proposed radial basis function classifier was
lower than with the existing radial basis function classifier. However,
there is a smaller improvement in runtime and a larger improvement
in precision and recall. In the proposed method, both classification
accuracy and prediction accuracy are determined, and the
prediction accuracy is comparatively high.
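
The two accuracy estimators named in the abstract, stratified k-fold cross-validation and repeated random subsampling, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the synthetic two-class data, the nearest-centroid classifier (a simple stand-in for a radial basis function classifier), and all function names are hypothetical and do not come from the paper.

```python
import random
from collections import defaultdict

def make_data(n_per_class=50, seed=0):
    """Two synthetic 2-D Gaussian classes (illustrative data, not from the paper)."""
    rng = random.Random(seed)
    data = []
    for label, (cx, cy) in ((0, (0.0, 0.0)), (1, (2.0, 2.0))):
        for _ in range(n_per_class):
            data.append(((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)), label))
    return data

def fit_centroids(train):
    """Placeholder classifier (assumption): one mean vector per class."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for (x, y), label in train:
        s = sums[label]
        s[0] += x
        s[1] += y
        s[2] += 1
    return {label: (s[0] / s[2], s[1] / s[2]) for label, s in sums.items()}

def predict(centroids, point):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    return min(
        centroids,
        key=lambda c: (point[0] - centroids[c][0]) ** 2
        + (point[1] - centroids[c][1]) ** 2,
    )

def accuracy(train, test):
    """Fraction of test points classified correctly by a model fit on train."""
    model = fit_centroids(train)
    hits = sum(predict(model, p) == label for p, label in test)
    return hits / len(test)

def stratified_folds(data, k, seed=0):
    """Split indices into k folds while preserving class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, (_, label) in enumerate(data):
        by_class[label].append(i)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for j, idx in enumerate(indices):
            folds[j % k].append(idx)  # deal each class round-robin over folds
    return folds

def stratified_kfold_accuracy(data, k=10, seed=0):
    """Mean accuracy over k folds; each fold is held out exactly once."""
    scores = []
    for fold in stratified_folds(data, k, seed):
        held_out = set(fold)
        train = [d for i, d in enumerate(data) if i not in held_out]
        test = [data[i] for i in fold]
        scores.append(accuracy(train, test))
    return sum(scores) / len(scores)

def random_subsampling_accuracy(data, repeats=10, test_fraction=0.1, seed=0):
    """Mean accuracy over repeated random train/test splits of the same size."""
    rng = random.Random(seed)
    n_test = max(1, int(len(data) * test_fraction))
    scores = []
    for _ in range(repeats):
        shuffled = data[:]
        rng.shuffle(shuffled)
        scores.append(accuracy(shuffled[n_test:], shuffled[:n_test]))
    return sum(scores) / len(scores)

if __name__ == "__main__":
    data = make_data()
    print("stratified 10-fold accuracy:", stratified_kfold_accuracy(data))
    print("random subsampling accuracy:", random_subsampling_accuracy(data))
```

With ten repeats and a 10% test fraction, the random subsampling estimate uses the same number of train/test splits and the same split sizes as 10-fold cross-validation, which is one reading of "equivalent" in the abstract. Unlike the folds, the random test sets may overlap and are not stratified.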
@article{"International Journal of Information, Control and Computer Sciences:58168", author = "M. Govindarajan and R. M. Chandrasekaran", title = "Performance Optimization of Data Mining Application Using Radial Basis Function Classifier", abstract = "Text data mining is a process of exploratory data
analysis. Classification maps data into predefined groups or classes.
It is often referred to as supervised learning because the classes are
determined before the data are examined. This paper describes a proposed
radial basis function classifier that performs comparative cross-validation
against an existing radial basis function classifier. The feasibility
and benefits of the proposed approach are demonstrated on a
data mining problem: direct marketing. Direct marketing has
become an important application field of data mining. Comparative
cross-validation estimates accuracy by either stratified
k-fold cross-validation or equivalent repeated random subsampling.
The proposed method may have high bias, and its performance
(accuracy estimation in our case) may be poor due to high variance.
Accordingly, the accuracy with the proposed radial basis function classifier was
lower than with the existing radial basis function classifier. However,
there is a smaller improvement in runtime and a larger improvement
in precision and recall. In the proposed method, both classification
accuracy and prediction accuracy are determined, and the
prediction accuracy is comparatively high.", keywords = "Text Data Mining, Comparative Cross-validation, Radial Basis Function, runtime, accuracy.", volume = "3", number = "2", pages = "386-6", }