A Methodology for Investigating Public Opinion Using Multilevel Text Analysis

Many users now frequently share their opinions on diverse issues through social media, and numerous governments have accordingly attempted to establish or improve national policies based on the public opinion captured from these channels. In this paper, we identify several limitations of traditional approaches to analyzing public opinion on science and technology and propose an alternative methodology to overcome them. First, we separate the science and technology analysis phase from the social issue analysis phase, reflecting the fact that public opinion forms only when a particular science or technology is applied to a specific social issue. Next, we successively apply a start list and a stop list to obtain clearer and more interesting results. Finally, to identify the documents that best fit a given subject, we develop a new logical filter concept that consists not merely of keywords but also of the logical relationships among them. We then examine the practical applicability of the proposed methodology by applying it to discover core issues and public opinions from 1,700,886 documents drawn from SNS, blogs, news, and discussions.
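
To make the start list, stop list, and logical filter ideas concrete, the following is a minimal Python sketch and not the implementation used in this study. The whitespace tokenizer, the helper names (has, all_of, any_of, not_, apply_lists, filter_documents), the sample documents, and the reading of a start list as "keep only these terms" followed by a stop list as "then drop these terms" are all assumptions made for illustration.

```python
from typing import Callable, Iterable, List, Set

# A predicate takes the set of tokens in one document and returns True or False.
Predicate = Callable[[Set[str]], bool]


def tokenize(text: str) -> List[str]:
    """Hypothetical tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def apply_lists(tokens: Iterable[str], start_list: Set[str], stop_list: Set[str]) -> List[str]:
    """Assumed semantics: keep only start-list terms, then drop stop-list terms."""
    kept = [t for t in tokens if t in start_list]
    return [t for t in kept if t not in stop_list]


def has(keyword: str) -> Predicate:
    """True when the keyword occurs in the document."""
    return lambda tokens: keyword in tokens


def all_of(*preds: Predicate) -> Predicate:
    """Logical AND over predicates."""
    return lambda tokens: all(p(tokens) for p in preds)


def any_of(*preds: Predicate) -> Predicate:
    """Logical OR over predicates."""
    return lambda tokens: any(p(tokens) for p in preds)


def not_(pred: Predicate) -> Predicate:
    """Logical NOT of a predicate."""
    return lambda tokens: not pred(tokens)


def filter_documents(docs: Iterable[str], rule: Predicate) -> List[str]:
    """Return the documents whose token set satisfies the logical rule."""
    return [d for d in docs if rule(set(tokenize(d)))]


# Illustrative documents (invented for this sketch).
docs = [
    "drone delivery raises privacy concerns",
    "drone racing league opens season",
    "new privacy law passed",
    "Drone surveillance and safety debate",
]

# Keywords plus a logical relationship among them:
# "drone" AND ("privacy" OR "safety") AND NOT "racing".
rule = all_of(has("drone"), any_of(has("privacy"), has("safety")), not_(has("racing")))
matched = filter_documents(docs, rule)

# Start list followed by stop list on the first document.
terms = apply_lists(
    tokenize(docs[0]),
    start_list={"drone", "delivery", "privacy", "concerns"},
    stop_list={"concerns"},
)
```

Under these assumptions, the rule keeps the first and fourth documents, and the list step reduces the first document to the terms drone, delivery, and privacy. Expressing the filter as composable predicates keeps the logical relationship separate from the keyword vocabulary, so either can be revised independently.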



