Web usage mining is an application of data mining that provides insight into customer behaviour on the Internet. An important technique for discovering user access and navigation trails is sequential patterns mining. One of the key challenges in web access patterns mining is the mining of richly structured patterns. This paper proposes a novel model, the Web Access Patterns Graph (WAP-Graph), to represent all of the access patterns from web mining graphically. WAP-Graph also motivates the search for new structural relation patterns, namely Concurrent Access Patterns (CAP), to identify and predict more complex web page requests. Corresponding CAP mining and modelling methods are proposed and shown to be effective in finding and representing concurrency between access patterns on the web. Experiments on large-scale synthetic sequence data and on real web access data demonstrate that CAP mining provides a powerful method for structural knowledge discovery, which can be visualised through the CAP-Graph model.
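To make the terms above concrete, the following is a minimal sketch and not the paper's algorithm. It assumes that an access pattern is an ordered, not necessarily contiguous, subsequence of a user session, and that the concurrence of a set of patterns is the fraction of sessions containing all of them. The function names (`contains`, `support`, `concurrence`) and the sample sessions are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def contains(session: Sequence[str], pattern: Sequence[str]) -> bool:
    """Return True if `pattern` occurs in `session` in order (gaps allowed)."""
    remaining = iter(session)
    # `page in remaining` advances the iterator, so order is enforced.
    return all(page in remaining for page in pattern)


def support(sessions: List[Sequence[str]], pattern: Sequence[str]) -> float:
    """Fraction of sessions that contain the access pattern."""
    return sum(contains(s, pattern) for s in sessions) / len(sessions)


def concurrence(sessions: List[Sequence[str]],
                patterns: List[Sequence[str]]) -> float:
    """Fraction of sessions that contain every pattern in `patterns`.

    Assumption: this is an illustrative definition of concurrency,
    not necessarily the one used for CAP mining in the paper.
    """
    return sum(all(contains(s, p) for p in patterns)
               for s in sessions) / len(sessions)


# Hypothetical web sessions (page names are made up).
sessions = [
    ["home", "search", "product", "cart"],
    ["home", "product", "reviews", "cart"],
    ["home", "search", "reviews"],
    ["search", "product", "reviews", "cart"],
]

home_to_cart = ["home", "cart"]
search_to_product = ["search", "product"]

print(support(sessions, home_to_cart))
print(support(sessions, search_to_product))
print(concurrence(sessions, [home_to_cart, search_to_product]))
```

Under these assumptions, a set of patterns whose concurrence meets a user-chosen threshold would be reported as occurring together, which is the kind of structural relation that a sequential pattern list alone does not expose.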
@article{"International Journal of Information, Control and Computer Sciences:59241",
  author = "Jing Lu and Malcolm Keech and Weiru Chen",
  title = "Concurrency in Web Access Patterns Mining",
  abstract = "Web usage mining is an interesting application of data mining which provides insight into customer behaviour on the Internet. An important technique to discover user access and navigation trails is based on sequential patterns mining. One of the key challenges for web access patterns mining is tackling the problem of mining richly structured patterns. This paper proposes a novel model called Web Access Patterns Graph (WAP-Graph) to represent all of the access patterns from web mining graphically. WAP-Graph also motivates the search for new structural relation patterns, i.e. Concurrent Access Patterns (CAP), to identify and predict more complex web page requests. Corresponding CAP mining and modelling methods are proposed and shown to be effective in the search for and representation of concurrency between access patterns on the web. From experiments conducted on large-scale synthetic sequence data as well as real web access data, it is demonstrated that CAP mining provides a powerful method for structural knowledge discovery, which can be visualised through the CAP-Graph model.",
  keywords = "concurrent access patterns (CAP), CAP mining and modelling, CAP-Graph, web access patterns (WAP), WAP-Graph, Web usage mining.",
  volume = "3",
  number = "10",
  pages = "2433-10",
}