Analysis Model for the Relationship of Users, Products, and Stores on Online Marketplace Based on Distributed Representation

Online marketplaces such as Rakuten and Alibaba have recently become some of the most popular e-commerce sites in Asia. On these sites, consumers can choose products to purchase from a large number of stores. In addition, consumers must register their name, age, gender, and other information in advance in order to use an account. Therefore, a method for analyzing consumer preferences from both the store side and the product side is required. This study uses Doc2Vec, a method developed in the field of natural language processing. Doc2Vec has been widely used in document classification to extract semantic relationships between documents and words; in our setting, consumers play the role of documents and products play the role of words. This concept is applicable to representing the relationship between users and items; however, Doc2Vec does not account for one additional factor, namely shops. More precisely, a method for analyzing the relationship among consumers, stores, and products is required. The purpose of this study is to combine the Doc2Vec models for users and shops and for users and items in the same feature space. This makes it possible to compute, for each user, the most similar shops and items. We apply the proposed method to real data accumulated in an online marketplace and demonstrate its effectiveness.
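To illustrate what placing users, shops, and items in the same feature space allows, the following is a minimal sketch of a per-user similarity lookup. It is not the model proposed in this study: the three-dimensional vectors, the names `user_vecs`, `shop_vecs`, and `item_vecs`, and the helper functions `cosine` and `most_similar` are hypothetical, and cosine similarity is assumed as the similarity measure. In practice the vectors would come from the trained models.

```python
import math

# Hypothetical toy embeddings in one shared 3-dimensional space.
# Real vectors would be learned from purchase histories.
user_vecs = {"user_a": [0.9, 0.1, 0.0], "user_b": [0.0, 0.2, 0.9]}
shop_vecs = {"shop_1": [0.8, 0.2, 0.1], "shop_2": [0.1, 0.1, 0.9]}
item_vecs = {
    "item_x": [1.0, 0.0, 0.1],
    "item_y": [0.0, 0.3, 0.8],
    "item_z": [0.5, 0.5, 0.5],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def most_similar(query, candidates, topn=2):
    """Return the topn (name, similarity) pairs closest to the query vector."""
    scored = [(name, cosine(query, vec)) for name, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:topn]


# Because all three entity types share one space, the same lookup
# works for shops and for items.
for user, vec in user_vecs.items():
    print(user, most_similar(vec, shop_vecs, topn=1), most_similar(vec, item_vecs, topn=2))
```

The single `most_similar` function serves both lookups only because the vectors are comparable across entity types; embeddings trained in separate spaces would not support this directly.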
