On Speeding Up Support Vector Machines: Proximity Graphs Versus Random Sampling for Pre-Selection Condensation

Support vector machines (SVMs) are considered to be the best machine learning algorithms for minimizing the predictive probability of misclassification. However, their drawback is that for large data sets the computation of the optimal decision boundary is a time consuming function of the size of the training set. Hence several methods have been proposed to speed up the SVM algorithm. Here three methods used to speed up the computation of the SVM classifiers are compared experimentally using a musical genre classification problem. The simplest method pre-selects a random sample of the data before the application of the SVM algorithm. Two additional methods use proximity graphs to pre-select data that are near the decision boundary. One uses k-Nearest Neighbor graphs and the other Relative Neighborhood Graphs to accomplish the task.




References:
[1] M.B. Almeida, A.P. Braga, and J.P. Braga, "SVM-KM: speeding SVMs
learning with a priori cluster selection and k-means," In: Proceedings of
the 6th Brazilian Symposium on Neural Networks, pp. 162-167, 2000.
[2] F. Angiulli, "Fast nearest neighbor condensation for large data sets
classification," IEEE Transactions on Knowledge and Data Engineering,
vol. 19, no. 11, pp. 1450-1464, Nov. 2007.
[3] F. Angiulli and A. Astorino, "Scaling up support vector machines using
nearest neighbor condensation," IEEE Transactions on Neural
Networks, vol. 21, no. 2, pp. 351-357, February 2010.
[4] B. Bhattacharya, K. Mukherjee and G.T. Toussaint, "Geometric decision
rules for instance-based learning algorithms," Proc. Pattern Recognition
and Machine Intelligence: First International Conference, S. K. Pal et
al., (Eds.): LNCS 3776, Kolkata, India, pp. 60-69, December 20-22,
2005.
[5] L. Breiman, "Bagging predictors," Machine Learning, vol. 24, pp. 123-
140, 1996.
[6] D. Bryant and V. Moulton, "NeighborNet: An agglomerative algorithm
for the construction of phylogenetic networks," Molecular Biology and
Evolution, vol. 21, no. 2, pp. 255-265, 2004.
[7] C.J.C. Burges and B. Scholkopf, "Improving the accuracy and speed of
support vector learning machines," In Proceedings of the 9th NIPS
Conference, pages 375-381, 1997.
[8] O. Chapelle, V. Vapnik, O. Bousquet, and S. Mukherjee, "Choosing
multiple parameters for support vector machines," Machine Learning,
vol. 46, pp. 131-159, 2002.
[9] J. Chen and C.-L. Liu, "Fast multi-class sample reduction for speeding
up support vector machines," Proceedings of the IEEE International
Workshop on Machine Learning for Signal Processing, Beijing, China,
September 18-21, 2011.
[10] T.M. Cover and P.E. Hart, "Nearest neighbor pattern classification,"
IEEE Transactions on Information Theory, vol. 13, pp. 21-27, 1967.
[11] B.V. Dasarathi, "Tandem fusion of nearest neighbor editing and
condensing algorithms - data dimensionality effects," Proceedings of the
15th International Conference on Pattern Recognition, vol. 2, pp. 692-
695, 2000.
[12] T. Downs, K.E. Gates, and A. Masters, "Exact simplification of support
vector solutions," Journal of Machine Learning Research, vol. 2, pp.
293-297, 2001.
[13] S. Fine and K. Scheinberg, "Ecient SVM training using low-rank
kernel representation, Journal of Machine Learning Research, pp. 243-
264, 2009.
[14] O. Gascuel, "BIONJ: an improved version of the NJ algorithm based on
a simple model of sequence data," Molecular Biology and Evolution,
vol. 14, no. 7, pp. 685-695, 1997.
[15] E. Gomez and P. Herrera, "Comparative analysis of music recordings
from Western and Non-Western traditions by automatic tonal feature
extraction," Empirical Musicology Review, vol. 3, pp. 140-156, 2008.
[16] D. Han, C. Han, Y. Yang, Y. Liu, and W. Mao, "Pre-extracting method
for SVM classification based on the non-parametric K-NN rule,"
Proceedings of the 19th International Conference on Pattern
Recognition, pp. 1-4, Dec. 8-11, 2008.
[17] P.E. Hart, "The condensed nearest neighbor rule," IEEE Transactions on
Information Theory, vol. 14, pp. 515-516, 1968.
[18] D.H. Huson, "SplitsTree: A program for analyzing and visualizing
evolutionary data," Bioinformatics, vol. 14, no. 10, pp. 68-73, 1998.
[19] J.W. Jaromczyk and G.T. Toussaint, "Relative neighborhood graphs and
their relatives," Proc. of the IEEE, vol. 80, no. 9, September, pp. 1502-
1517, 1992.
[20] T. Joachims, "Making large-scale SVM learning practical," Advances,
In: Kernel Methods - Support Vector Learning, MIT-Press, Cambridge,
MA, 1999.
[21] S.S. Keerthi, O. Chapelle, and D. DeCoste, "Building support vector
machines with reduced classifier complexity," Journal of Machine
Learning Research, vol. 7, pp. 1493-1515, 2006.
[22] S. S. Keerthi, S. K. Shevade, C. Bhattacharyya, and K. R. K. Murthy,
"Improvements to Platt-s SMO algorithm for SVM classifier design,"
Neural Computation, vol. 13, pp. 637-649, 2001.
[23] R. Koggalage and S. Halgamuge, "Reducing the number of training
samples for fast support vector machine classification," Neural
Information Processing - Letters and Reviews, vol. 2, no. 3, pp. 57-65,
March 2004.
[24] Y.J. Lee and O.L. Mangasarian, "RSVM: Reduced support vector
machines," In Proceedings of the First SIAM International Conference
on Data Mining, SIAM, Chicago, April 5-7, 2001.
[25] D. Li and S. Simske, "Training set compression by incremental
clustering," Journal of Pattern Recognition Research, vol. 1, pp. 56-64,
2011.
[26] X. Liang, R.-C. Chen, and X. Guo, "Pruning support vector machines
without altering performances," IEEE Transactions on Neural Networks,
vol. 19, no. 10, pp. 1792-1803, October 2008.
[27] M.E. Mavroforakis and S. Theodoridis, "A geometric approach to
support vector machine (SVM) classification," IEEE Transactions on
Neural Networks, vol. 17, no. 3, pp. 671-682, May 2006.
[28] E. Mayr, "Cladistic analysis or cladistic classification?" Zeitschrift fuer
Zoologische Systematik und Evolutionsforschung, vol. 12, pp. 94-128,
1974.
[29] D.H. Nghi and L.C. Mai, "Training data selection for support vector
machines model," Proceedings of the International Conference on
Information and Electronics Engineering, vol. 6, Press, Singapore, 2011.
[30] N. Panda, E. Y. Chang, and G. Wu, "Concept boundary detection for
speeding up SVMs," Proceedings of the 23 International Conference on
Machine Learning, Pittsburgh, PA, 2006.
[31] J.C. Platt, "Fast training of support vector machines using sequential
minimal optimization," In B. Scholkopf, C. J. C. Burges, and A. J.
Smola, editors, Advances in Kernel Methods - Support Vector Learning,
Cambridge, MA, MIT Press, 1998.
[32] T. Thies and F. Weber, "Optimal reduced-set vectors for support vector
machines with a quadratic kernel," Neural Computation, vol. 16, pp.
1769-1777, 2004.
[33] G.T. Toussaint, B. K. Bhattacharya, and R. S. Poulsen, "The application
of Voronoi diagrams to nonparametric decision rules," Proc. Computer
Science and Statistics: 16th Symposium on the Interface, Atlanta,
Georgia, March 14-16,1984, Published by North-Holland in 1985,
Amsterdam, L. Billard, Ed., pp. 97-108.
[34] G.T. Toussaint and C. Berzan, "Proximity-graph instance-based
learning, support vector machines, and high dimensionality: An
empirical comparison." Proceedings of the Eighth International
Conference on Machine Learning and Data Mining, July 16-19, 2012,
Berlin, Germany. P. Perner (Ed.): MLDM 2012, LNAI 7376, pp. 222-
236, 2012. Springer-Verlag Berlin Heidelberg.
[35] G.T. Toussaint, "Geometric proximity graphs for improving nearest
neighbor methods in instance-based learning and data mining,"
International Journal of Computational Geometry and Applications, vol.
15, April 2005, pp. 101-150.
[36] G.T. Toussaint, "The relative neighborhood graph of a finite planar set,"
Pattern Recognition, vol. 12, pp. 261-268, 1980.
[37] I. W. Tsang, J. T. Kwok, and P.-M. Cheung, "Core vector machines: Fast
SVM training on very large data sets," Journal of Machine Learning
Research, vol. 6, pp. 363-392, 2005.
[38] V. Vapnik, The Nature of Statistical Learning Theory, Springer-Verlag,
New York, NY, 1995.
[39] J. Wang, P. Neskovic, L. N. Cooper, "Selecting data for fast support
vector machine training," Studies in Computational Intelligence (SCI),
vol. 35, pp. 61-84, 2007.
[40] Y. Wang, C.G. Zhou, Y.X. Huang, Y.C. Liang, and X.W. Yang, "A
boundary method to speed up training support vector machines," In: G.
R. Liu et al. (eds), Computational Methods, Springer, Printed in the
Netherlands, pp. 1209-1213, 2006.
[41] D.L. Wilson, "Asymptotic properties of nearest neighbor rules using
edited-data," IEEE Transactions on Systems, Man, and Cybernetics, vol.
2, pp. 408-421, 1973.
[42] M.H. Yang and N. Ahuja, "A geometric approach to train support vector
machines," In: Proceedings of the IEEE Conference on Computer Vision
and Pattern Recognition 2000 (CVPR 2000), June, Hilton Head Island,
pp. 430-437, 2000.
[43] W. Zhang and I. King, "Locating support vectors via β-skeleton
technique," In: Proceedings of the International Conference on Neural
Information Processing (ICONIP), pp. 1423-1427, 2002.
[44] W. Zhang and I. King, "A study of the relationship between support
vector machine and Gabriel graph." In: Proceedings of the IEEE
International Joint Conference on Neural Networks, IJCNN, Honolulu,
vol. 1, pp. 239-244, 2002.