Unsupervised Feature Selection Using Feature Density Functions

Because dealing with high-dimensional data is computationally complex and sometimes even intractable, several feature reduction methods have recently been developed to reduce the dimensionality of the data and thereby simplify analysis in applications such as text categorization, signal processing, image retrieval, and gene expression analysis. Among feature reduction techniques, feature selection is one of the most popular because it preserves the original features. In this paper, we propose a new unsupervised feature selection method that removes redundant features from the original feature space using the probability density functions of the features. To show the effectiveness of the proposed method, we implemented and compared several popular feature selection methods. Experimental results on several datasets from the UCI repository illustrate the effectiveness of the proposed method relative to the compared methods, in terms of both classification accuracy and the number of selected features.
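The abstract does not specify how the feature density functions are compared, so the following is only a minimal illustrative sketch of the general idea, under the assumption that each feature's density is estimated with a normalized histogram and that two features are treated as redundant when their estimated densities are nearly identical (small total-variation distance). The function names and the threshold are hypothetical, not from the paper.

```python
# Hypothetical sketch of density-based unsupervised feature selection:
# 1. Estimate each feature's probability density with a normalized histogram.
# 2. Call two features redundant when their densities are very similar
#    (total-variation distance below a threshold).
# 3. Keep one representative from each group of redundant features.

def histogram_density(values, bins=10):
    """Normalized histogram over [min, max] as a crude density estimate."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # guard against a constant feature
    counts = [0] * bins
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    total = len(values)
    return [c / total for c in counts]

def tv_distance(p, q):
    """Total-variation distance between two discrete densities."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

def select_features(columns, threshold=0.1, bins=10):
    """columns: list of feature columns (lists of floats).
    Returns indices of features kept after redundancy removal."""
    densities = [histogram_density(col, bins) for col in columns]
    kept = []
    for j, d in enumerate(densities):
        # keep feature j only if it differs from every feature kept so far
        if all(tv_distance(d, densities[k]) > threshold for k in kept):
            kept.append(j)
    return kept

# Tiny usage example: f2 is a shifted copy of f1, so its density is the
# same and it is dropped; f3 has a different distribution and is kept.
f1 = [0.1 * i for i in range(100)]
f2 = [0.1 * i + 0.001 for i in range(100)]
f3 = [0.01 * i * i for i in range(100)]
print(select_features([f1, f2, f3]))  # → [0, 2]
```

Note that this sketch discards a feature whenever its marginal density matches one already kept; a real implementation would also need to handle multi-class or multi-modal data and choose the bin count and threshold carefully.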



