A Methodology for Investigating Public Opinion Using Multilevel Text Analysis
Recently, many users have begun to share their opinions on diverse
issues frequently through social media. Accordingly, numerous
governments have attempted to establish or improve national policies
based on the public opinion captured from these media. In this paper,
we point out several limitations of traditional approaches to analyzing
public opinion on science and technology and propose an alternative
methodology that overcomes them. First, we separate the science and
technology analysis phase from the social issue analysis phase,
reflecting the fact that public opinion forms only when a particular
science and technology is applied to a specific social issue. Next, we
successively apply a start list and a stop list to obtain clearer and
more interesting results. Finally, to identify the documents that best
fit a given subject, we develop a new logical filter concept that
consists not only of keywords but also of the logical relationships
among those keywords. We then assess the practical applicability of the
proposed methodology by applying it to discover core issues and public
opinions from 1,700,886 documents comprising SNS, blogs, news, and
discussions.
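
The two filtering ideas named in the abstract, successive start-list and stop-list application and a logical filter over keywords, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the tokenizer, the helper names (`apply_lists`, `has`, `all_of`, `any_of`, `none_of`), the word lists, and the sample documents are all assumptions introduced here for illustration only.

```python
import re
from typing import Callable, List, Set

# Hypothetical predicate type: takes a document's token set, returns True/False.
Predicate = Callable[[Set[str]], bool]


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def apply_lists(tokens: List[str], start_list: Set[str], stop_list: Set[str]) -> List[str]:
    """Keep only start-list terms first, then drop stop-list terms."""
    kept = [t for t in tokens if t in start_list]
    return [t for t in kept if t not in stop_list]


def has(word: str) -> Predicate:
    """Keyword test: the word occurs in the document."""
    return lambda toks: word in toks


def all_of(*preds: Predicate) -> Predicate:
    """Logical AND over predicates."""
    return lambda toks: all(p(toks) for p in preds)


def any_of(*preds: Predicate) -> Predicate:
    """Logical OR over predicates."""
    return lambda toks: any(p(toks) for p in preds)


def none_of(*preds: Predicate) -> Predicate:
    """Logical NOT: true only if no predicate matches."""
    return lambda toks: not any(p(toks) for p in preds)


# Illustrative data only; none of this comes from the paper's corpus.
DOCS = [
    "Drone delivery raises privacy concerns in cities",
    "New toy drone on sale, no privacy issues",
    "Privacy law update announced",
]
START_LIST = {"drone", "delivery", "privacy", "concerns", "cities"}
STOP_LIST = {"cities"}

# "drone" AND ("privacy" OR "surveillance") AND NOT "toy"
logical_filter = all_of(
    has("drone"),
    any_of(has("privacy"), has("surveillance")),
    none_of(has("toy")),
)

matched = [d for d in DOCS if logical_filter(set(tokenize(d)))]
terms = apply_lists(tokenize(DOCS[0]), START_LIST, STOP_LIST)

if __name__ == "__main__":
    print(matched)
    print(terms)
```

Expressing the filter as composable predicates keeps the keyword relationships explicit, so a subject definition can be read and adjusted as a Boolean expression instead of a flat keyword list.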
@article{"International Journal of Information, Control and Computer Sciences:71411",
  author = "William Xiu Shun Wong and Myungsu Lim and Yoonjin Hyun and Chen Liu and Seongi Choi and Dasom Kim and Kee-Young Kwahk and Namgyu Kim",
  title = "A Methodology for Investigating Public Opinion Using Multilevel Text Analysis",
  abstract = "Recently, many users have begun to frequently share
their opinions on diverse issues using various social media. Therefore,
numerous governments have attempted to establish or improve
national policies according to the public opinions captured from
various social media. In this paper, we indicate several limitations of
the traditional approaches to analyze public opinion on science and
technology and provide an alternative methodology to overcome these
limitations. First, we distinguish between the science and technology
analysis phase and the social issue analysis phase to reflect the fact that
public opinion can be formed only when a certain science and
technology is applied to a specific social issue. Next, we successively
apply a start list and a stop list to acquire clarified and interesting
results. Finally, to identify the most appropriate documents that fit
with a given subject, we develop a new logical filter concept that
consists of not only mere keywords but also a logical relationship
among the keywords. This study then analyzes the possibilities for the
practical use of the proposed methodology through its application to
discover core issues and public opinions from 1,700,886 documents
comprising SNS, blogs, news, and discussions.",
  keywords = "Big data, social network analysis, text mining, topic modeling.",
  volume = "9",
  number = "12",
  pages = "2416-5",
}