Sentiment Analysis: Comparative Analysis of Multilingual Sentiment and Opinion Classification Techniques

Sentiment analysis and opinion mining have emerged
as topics of research in recent years, but most of the work
is focused on data in the English language. A comprehensive
analysis is needed that considers multiple
languages, machine translation techniques, and different classifiers.
This paper presents a comparative analysis of different approaches
for multilingual sentiment analysis. These approaches are divided
into two groups: the first classifies text without language
translation, and the second translates the testing data to a
target language, such as English, before classification. The presented
research and results are useful for understanding whether machine
translation should be used for multilingual sentiment analysis or
whether building language-specific sentiment classification systems is a better
approach. The effects of language translation techniques, features,
and the accuracy of various classifiers for multilingual sentiment analysis
are also discussed in this study.
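
As an illustration of the two groups of approaches compared in this study,
the following minimal Python sketch contrasts them on toy data. It is not the
experimental setup of this paper: the naive Bayes classifier, the tiny French
and English training sets, and the word-for-word lookup table `TOY_FR_EN`
(a hypothetical stand-in for a real machine translation system) are all
assumptions introduced for illustration only.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; real systems do more)."""
    return text.lower().split()


class NaiveBayes:
    """Multinomial naive Bayes over bag-of-words with add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)
            for word in tokenize(text):
                if word in self.vocab:  # skip out-of-vocabulary words
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical word-for-word French-to-English table standing in for real MT.
TOY_FR_EN = {
    "un": "a", "film": "movie", "très": "very", "bon": "good",
    "excellent": "excellent", "ennuyeux": "boring", "mauvais": "bad",
}


def toy_translate(text):
    """Translate token by token; unknown words are passed through unchanged."""
    return " ".join(TOY_FR_EN.get(word, word) for word in tokenize(text))


# Toy labeled data (invented for this sketch).
FRENCH_TRAIN = [
    ("un très bon film", "pos"), ("un film excellent", "pos"),
    ("un film ennuyeux", "neg"), ("un très mauvais film", "neg"),
]
ENGLISH_TRAIN = [
    ("a very good movie", "pos"), ("an excellent film", "pos"),
    ("a bad boring movie", "neg"), ("very boring and bad", "neg"),
]


def train(pairs):
    texts, labels = zip(*pairs)
    return NaiveBayes().fit(texts, labels)


native_clf = train(FRENCH_TRAIN)    # approach 1: language-specific classifier
english_clf = train(ENGLISH_TRAIN)  # approach 2: English classifier


def classify_native(text):
    """Approach 1: classify the text in its original language."""
    return native_clf.predict(text)


def classify_via_translation(text):
    """Approach 2: translate the test text to English, then classify."""
    return english_clf.predict(toy_translate(text))


if __name__ == "__main__":
    for sentence in ["un bon film", "un film ennuyeux"]:
        print(sentence, classify_native(sentence), classify_via_translation(sentence))
```

In a realistic setting, `toy_translate` would be replaced by an actual machine
translation system, and any translation errors would propagate into the
English classifier; the first approach avoids that step but needs labeled
training data in each language.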




