A New Approach for Classifying Large Number of Mixed Variables

The problem of classifying objects into one of several predefined groups when the measured variables are of mixed types has interested statisticians for many years. Methods introduced for this situation include parametric, semi-parametric and nonparametric approaches. This paper discusses the problem of classifying data when the number of measured mixed variables is larger than the sample size. A proposed approach that integrates dimensionality reduction via principal component analysis with a discriminant function based on the location model is discussed. The study aims to offer practitioners another potential tool for classification problems in which the observed variables are mixed and too numerous.
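
To make the idea of combining principal component analysis with a location-model discriminant function more concrete, the following is a minimal sketch and not the authors' implementation. It assumes two groups, that PCA is applied to the continuous variables only, that the binary variables define the cells of the location model, and that the covariance is pooled across cells and groups. All function names, the add-one smoothing of cell probabilities, the fallback to the group mean for empty cells, and the simulated data are hypothetical choices made for illustration.

```python
import numpy as np

def fit_pca(Y, k):
    """Return the mean and the first k principal axes of continuous data Y (n x p)."""
    mean = Y.mean(axis=0)
    # SVD of the centred data works even when p > n.
    _, _, vt = np.linalg.svd(Y - mean, full_matrices=False)
    return mean, vt[:k].T

def cell_of(B):
    """Map each row of binary variables B (n x q) to a cell index in 0..2**q - 1."""
    return B @ (2 ** np.arange(B.shape[1]))

def fit_location_model(Z, cells, groups, n_cells):
    """Estimate cell means, cell probabilities and a pooled covariance (two groups)."""
    k = Z.shape[1]
    means = np.zeros((2, n_cells, k))
    probs = np.zeros((2, n_cells))
    scatter = np.zeros((k, k))
    dof = 0
    for g in (0, 1):
        Zg, cg = Z[groups == g], cells[groups == g]
        for m in range(n_cells):
            Zgm = Zg[cg == m]
            # Assumption: an empty cell falls back to the overall group mean.
            means[g, m] = Zgm.mean(axis=0) if len(Zgm) else Zg.mean(axis=0)
            # Assumption: add-one smoothing keeps every cell probability positive.
            probs[g, m] = (len(Zgm) + 1) / (len(Zg) + n_cells)
            if len(Zgm) > 1:
                R = Zgm - means[g, m]
                scatter += R.T @ R
                dof += len(Zgm) - 1
    return means, probs, scatter / max(dof, 1)

def classify(z, cell, means, probs, cov):
    """Allocate to group 0 or 1 with the linear rule for the observed cell."""
    m0, m1 = means[0, cell], means[1, cell]
    score = (m0 - m1) @ np.linalg.solve(cov, z - 0.5 * (m0 + m1))
    # Equal priors and equal misclassification costs are assumed here.
    return 0 if score >= np.log(probs[1, cell] / probs[0, cell]) else 1

# Simulated example: 40 objects, 60 continuous and 2 binary variables (p > n).
rng = np.random.default_rng(0)
groups = np.repeat([0, 1], 20)
Y = rng.normal(size=(40, 60)) + 3.0 * groups[:, None]  # group 1 is shifted
B = rng.integers(0, 2, size=(40, 2))

mean, axes = fit_pca(Y, k=2)          # reduce 60 continuous variables to 2 components
Z = (Y - mean) @ axes
cells = cell_of(B)
means, probs, cov = fit_location_model(Z, cells, groups, n_cells=4)

pred = np.array([classify(z, c, means, probs, cov) for z, c in zip(Z, cells)])
accuracy = (pred == groups).mean()
print("in-sample accuracy:", accuracy)
```

The accuracy printed here is computed on the same objects used for fitting, so it is optimistic. In practice the PCA step and the location-model estimates would both be refitted inside each cross-validation fold before error rates are reported.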
