Semi-Supervised Outlier Detection Using a Generative and Adversary Framework

In many outlier detection tasks, only training data
belonging to one class, i.e., the positive class, is available. The
task is then to predict whether a new data point belongs to
the positive class or to the negative class; in the latter case, the
data point is considered an outlier. For this task, we propose a
novel corrupted Generative Adversarial Network (CorGAN). In the
adversarial process of training CorGAN, the Generator produces
outlier samples for the negative class, and the Discriminator is trained
to distinguish the positive training data from the generated negative
data. The proposed framework is evaluated using an image dataset
and a real-world network intrusion dataset. Our outlier-detection
method achieves state-of-the-art performance on both tasks.
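
To make the adversarial setup concrete, the following is a minimal PyTorch-style sketch of the training loop described above, not the authors' implementation: the Generator maps noise to candidate negative samples, and the Discriminator is trained to assign high scores to the positive training data and low scores to the generated negatives. The data loader pos_loader, the layer sizes, and the learning rates are illustrative assumptions, and the specific "corruption" mechanism that distinguishes CorGAN from a vanilla GAN is not reproduced here.

    # Minimal sketch (not the authors' code) of the adversarial setup,
    # assuming small fully connected networks, the standard GAN objective,
    # and a DataLoader `pos_loader` yielding batches of positive samples.
    import torch
    import torch.nn as nn

    LATENT_DIM, DATA_DIM = 32, 121  # hypothetical sizes

    # Generator: noise -> candidate negative (outlier) sample
    G = nn.Sequential(
        nn.Linear(LATENT_DIM, 128), nn.ReLU(),
        nn.Linear(128, DATA_DIM))

    # Discriminator: sample -> probability of belonging to the positive class
    D = nn.Sequential(
        nn.Linear(DATA_DIM, 128), nn.ReLU(),
        nn.Linear(128, 1), nn.Sigmoid())

    bce = nn.BCELoss()
    opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
    opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)

    for x_pos in pos_loader:                     # only positive-class data
        z = torch.randn(x_pos.size(0), LATENT_DIM)
        x_neg = G(z)                             # generated negative samples

        # Discriminator step: positives -> 1, generated negatives -> 0.
        opt_d.zero_grad()
        loss_d = (bce(D(x_pos), torch.ones(x_pos.size(0), 1)) +
                  bce(D(x_neg.detach()), torch.zeros(x_pos.size(0), 1)))
        loss_d.backward()
        opt_d.step()

        # Generator step: push generated samples toward the positive region,
        # so they land near the boundary of the positive class.
        opt_g.zero_grad()
        loss_g = bce(D(G(z)), torch.ones(x_pos.size(0), 1))
        loss_g.backward()
        opt_g.step()

At test time, the Discriminator's output D(x) can serve directly as an inlier score: a new data point with a low score is flagged as an outlier.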


