A Sequential Pattern Mining Method Based On Sequential Interestingness

Sequential pattern mining methods efficiently discover all frequent sequential patterns contained in sequential data. These methods evaluate frequency with the support, the conventional criterion, which satisfies the Apriori property. However, the discovered patterns do not always match the interests of analysts, because frequent patterns tend to be commonplace and offer analysts little new knowledge. This paper proposes a new criterion, the sequential interestingness, for discovering sequential patterns that are more attractive to analysts. It shows that the criterion satisfies the Apriori property and how it is related to the support. It also proposes an efficient sequential pattern mining method based on the proposed criterion. Finally, it demonstrates the effectiveness of the proposed method by applying it to two kinds of sequential data.
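
As an illustration of the support and the Apriori property mentioned above, the following minimal Python sketch computes the support of a sequential pattern over a toy database. It assumes the simplified setting in which each sequence is a list of single items (not itemsets) and a pattern matches when its items occur in order with gaps allowed; the function names and the data are hypothetical and are not taken from the paper. The sketch covers only the conventional support, not the proposed sequential interestingness.

```python
def is_subsequence(pattern, sequence):
    """Return True if the items of pattern occur in sequence in order (gaps allowed)."""
    it = iter(sequence)
    # "item in it" advances the iterator, so each match must follow the previous one.
    return all(item in it for item in pattern)


def support(pattern, database):
    """Fraction of sequences in the database that contain the pattern."""
    return sum(is_subsequence(pattern, s) for s in database) / len(database)


# Hypothetical toy database of four sequences.
database = [
    ["a", "b", "c"],
    ["a", "c"],
    ["b", "a", "c"],
    ["a", "b"],
]

sup_ac = support(["a", "c"], database)        # sub-pattern
sup_abc = support(["a", "b", "c"], database)  # super-pattern

# Apriori property: a pattern is never more frequent than any of its sub-patterns.
print(sup_ac, sup_abc, sup_abc <= sup_ac)
```

In this toy database the pattern (a, c) is contained in three of the four sequences and (a, b, c) in only one, so extending a pattern cannot increase its support. This is the property that lets a mining method prune every extension of an infrequent pattern.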




