Autonomously Determining the Parameters for SVDD with RBF Kernel from a One-Class Training Set

The one-class support vector machine "support vector
data description" (SVDD) is well suited to anomaly and outlier
detection. For SVDD to be usable in real-world applications,
however, ease of use is crucial. The results of SVDD are
largely determined by the choice of the regularisation parameter C
and the kernel parameter σ of the widely used RBF kernel. While for
two-class SVMs these parameters can be tuned using cross-validation
based on the confusion matrix, this is not possible for a one-class
SVM, because only true positives and false negatives can occur
during training. This paper proposes an approach that finds the optimal
set of parameters for SVDD solely based on a training set from
one class and without any user parameterisation. Results on artificial
and real data sets are presented, demonstrating the usefulness of the
approach.
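To make the role of the kernel parameter σ concrete, the sketch below computes the RBF kernel K(x, y) = exp(−‖x − y‖²/σ²) and a kernel-centroid anomaly score, i.e. the squared feature-space distance of a test point to the mean of the training set. This is a simplified special case of the SVDD decision value (it corresponds to SVDD when all training points are equally weighted support vectors), not the authors' full method; the function and variable names are illustrative only.

```python
import math

def rbf(x, y, sigma):
    # RBF kernel K(x, y) = exp(-||x - y||^2 / sigma^2)
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-d2 / sigma ** 2)

def centroid_score(z, train, sigma):
    # Squared distance from z to the mean of the training set in the
    # RBF feature space, expanded via the kernel trick:
    #   ||phi(z) - m||^2 = K(z,z) - 2/n * sum_i K(z,x_i)
    #                      + 1/n^2 * sum_ij K(x_i,x_j)
    # Larger scores indicate points farther from the training data.
    n = len(train)
    k_zz = rbf(z, z, sigma)  # always 1 for the RBF kernel
    k_zm = sum(rbf(z, x, sigma) for x in train) / n
    k_mm = sum(rbf(x, y, sigma) for x in train for y in train) / n ** 2
    return k_zz - 2 * k_zm + k_mm

# Toy one-class training set (only "normal" points, no counter-examples).
train = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
inlier, outlier = (0.5, 0.5), (5.0, 5.0)
for sigma in (0.5, 2.0):
    print(sigma, centroid_score(inlier, train, sigma),
          centroid_score(outlier, train, sigma))
```

Running the loop shows that the outlier scores higher than the inlier for both values of σ, while the absolute scores shift with σ, which is exactly why the choice of σ (together with C) determines the quality of the learned description.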
