The Performance of Predictive Classification Using Empirical Bayes

This research aims to compare the percentage of correct classification achieved by the Empirical Bayes (EB) method with that of the Classical method when the data are constructed as near-normal, short-tailed and long-tailed symmetric, and short-tailed and long-tailed asymmetric. The study uses a conjugate prior for a normal distribution with known mean and unknown variance. The hyper-parameters estimated by the EB method are substituted into the posterior predictive probability, which is then used to predict new observations. The generated data consist of a training set and a test set, with sample sizes of 100, 200 and 500, for binary classification. The results showed that the EB method performed better than the Classical method in all situations under study.
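As an illustration of the predictive step described above, the following is a minimal sketch rather than the procedure used in the study. It assumes a univariate normal model per class with known mean `mu` and an inverse-gamma prior on the variance (the standard conjugate choice for this case), so that the posterior predictive density is a Student-t distribution. The hyper-parameter values `a` and `b`, the function names, and the toy data are all hypothetical; in an EB analysis `a` and `b` would be estimated from the training data, and the abstract does not state which estimator is used.

```python
import math

def posterior_params(xs, mu, a, b):
    """Update an inverse-gamma(a, b) prior on the variance (mean mu known)."""
    n = len(xs)
    ss = sum((x - mu) ** 2 for x in xs)  # sum of squares about the known mean
    return a + n / 2.0, b + ss / 2.0

def log_predictive(x_new, mu, a_post, b_post):
    """Log density of the Student-t posterior predictive at x_new."""
    nu = 2.0 * a_post             # degrees of freedom
    scale2 = b_post / a_post      # squared scale of the t distribution
    z2 = (x_new - mu) ** 2 / (nu * scale2)
    return (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
            - 0.5 * math.log(nu * math.pi * scale2)
            - (nu + 1) / 2 * math.log1p(z2))

def classify(x_new, classes, log_priors=None):
    """Return the label with the largest log prior + log predictive density.

    classes maps label -> (mu, a_post, b_post); equal class priors are
    assumed when log_priors is None.
    """
    best_label, best_score = None, -math.inf
    for label, (mu, a_post, b_post) in classes.items():
        score = log_predictive(x_new, mu, a_post, b_post)
        if log_priors is not None:
            score += log_priors[label]
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy binary example; a = 2.0 and b = 1.0 are placeholder hyper-parameters,
# standing in for values that an EB step would estimate.
a, b = 2.0, 1.0
train = {0: (0.0, [-1.0, 0.5, 1.0, -0.5]),
         1: (5.0, [4.0, 5.5, 6.0, 4.5])}
classes = {label: (mu, *posterior_params(xs, mu, a, b))
           for label, (mu, xs) in train.items()}
predictions = [classify(x, classes) for x in (0.2, 4.8)]
```

Working on the log scale avoids underflow when a new observation lies far from a class mean, and the percentage of correct classification on a test set is then simply the share of predicted labels that match the true ones.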




