A New History Based Method to Handle the Recurring Concept Shifts in Data Streams

Recent developments in storage technology and networking architectures have made it possible for a broad range of applications to rely on data streams for quick response and accurate decision making. Data streams are generated by real-world events, so it is reasonable to expect that the associations among the occurrences of those events also hold among the concepts of a data stream. Extracting these hidden associations can help predict subsequent concepts in concept-shifting data streams. In this paper we present a new method that learns the associations among the concepts of a data stream and predicts what the next concept will be. Knowing the next concept makes an informed update of the data model possible. The results of the conducted experiments show that the proposed method is suitable for classifying concept-shifting data streams.



