Topic Modeling Using Latent Dirichlet Allocation and Latent Semantic Indexing on South African Telco Twitter Data

Twitter is one of the most popular social media platforms where users share their opinions on different subjects. Twitter can be considered a great source for mining text due to the high volumes of data generated through the platform daily. Many industries such as telecommunication companies can leverage the availability of Twitter data to better understand their markets and make an appropriate business decision. This study performs topic modeling on Twitter data using Latent Dirichlet Allocation (LDA). The obtained results are benchmarked with another topic modeling technique, Latent Semantic Indexing (LSI). The study aims to retrieve topics on a Twitter dataset containing user tweets on South African Telcos. Results from this study show that LSI is much faster than LDA. However, LDA yields better results with higher topic coherence by 8% for the best-performing model in this experiment. A higher topic coherence score indicates better performance of the model.

A Structured Mechanism for Identifying Political Influencers on Social Media Platforms: Top 10 Saudi Political Twitter Users

Social media networks, such as Twitter, offer the perfect opportunity to either positively or negatively affect political attitudes on large audiences. The existence of influential users who have developed a reputation for their knowledge and experience of specific topics is a major factor contributing to this impact. Therefore, knowledge of the mechanisms to identify influential users on social media is vital for understanding their effect on their audience. The concept of the influential user is related to the concept of opinion leaders' to indicate that ideas first flow from mass media to opinion leaders and then to the rest of the population. Hence, the objective of this research was to provide reliable and accurate structural mechanisms to identify influential users, which could be applied to different platforms, places, and subjects. Twitter was selected as the platform of interest, and Saudi Arabia as the context for the investigation. These were selected because Saudi Arabia has a large number of Twitter users, some of whom are considerably active in setting agendas and disseminating ideas. The study considered the scientific methods that have been used to identify public opinion leaders before, utilizing metrics software on Twitter. The key findings propose multiple novel metrics to compare Twitter influencers, including the number of followers, social authority and the use of political hashtags, and four secondary filtering measures. Thus, using ratio and percentage calculations to classify the most influential users, Twitter accounts were filtered, analyzed and included. The structured approach is used as a mechanism to explore the top ten influencers on Twitter from the political domain in Saudi Arabia.

Opinion Mining and Sentiment Analysis on DEFT

Current research practices sentiment analysis with a focus on social networks, DEfi Fouille de Texte (DEFT) (Text Mining Challenge) evaluation campaign focuses on opinion mining and sentiment analysis on social networks, especially social network Twitter. It aims to confront the systems produced by several teams from public and private research laboratories. DEFT offers participants the opportunity to work on regularly renewed themes and proposes to work on opinion mining in several editions. The purpose of this article is to scrutinize and analyze the works relating to opinions mining and sentiment analysis in the Twitter social network realized by DEFT. It examines the tasks proposed by the organizers of the challenge and the methods used by the participants.

Real Time Classification of Political Tendency of Twitter Spanish Users based on Sentiment Analysis

What people say on social media has turned into a rich source of information to understand social behavior. Specifically, the growing use of Twitter social media for political communication has arisen high opportunities to know the opinion of large numbers of politically active individuals in real time and predict the global political tendencies of a specific country. It has led to an increasing body of research on this topic. The majority of these studies have been focused on polarized political contexts characterized by only two alternatives. Unlike them, this paper tackles the challenge of forecasting Spanish political trends, characterized by multiple political parties, by means of analyzing the Twitters Users political tendency. According to this, a new strategy, named Tweets Analysis Strategy (TAS), is proposed. This is based on analyzing the users tweets by means of discovering its sentiment (positive, negative or neutral) and classifying them according to the political party they support. From this individual political tendency, the global political prediction for each political party is calculated. In order to do this, two different strategies for analyzing the sentiment analysis are proposed: one is based on Positive and Negative words Matching (PNM) and the second one is based on a Neural Networks Strategy (NNS). The complete TAS strategy has been performed in a Big-Data environment. The experimental results presented in this paper reveal that NNS strategy performs much better than PNM strategy to analyze the tweet sentiment. In addition, this research analyzes the viability of the TAS strategy to obtain the global trend in a political context make up by multiple parties with an error lower than 23%.

Quantifying Mobility of Urban Inhabitant Based on Social Media Data

Check-in locations on social media provide information about an individual’s location. The millions of units of data generated from these sites provide knowledge for human activity. In this research, we used a geolocation service and users’ texts posted on Twitter social media to analyze human mobility. Our research will answer the questions; what are the movement patterns of a citizen? And, how far do people travel in the city? We explore the people trajectory of 201,118 check-ins and 22,318 users over a period of one month in Makassar city, Indonesia. To accommodate individual mobility, the authors only analyze the users with check-in activity greater than 30 times. We used sampling method with a systematic sampling approach to assign the research sample. The study found that the individual movement shows a high degree of regularity and intensity in certain places. The other finding found that the average distance an urban inhabitant can travel per day is as far as 9.6 km.

The Study of Implications on Modern Businesses Performances by Digital Communities: Case of Data Leak

This study aims to investigate the impact of data leak of M&S customers on digital communities. Modern businesses are using digital communities as an important public relations tool for marketing purposes. This form of communication helps companies to build better relationship with their customers which also act as another source of information. The communication between the customers and the organizations is not regulated so users may post positive and negative comments. There are new platforms being developed on a daily basis and it is very crucial for the businesses to not only get themselves familiar with those but also know how to reach their existing and perspective consumers. The driving force of marketing and communication in modern businesses is the digital communities and these are continuously increasing and developing. This phenomenon is changing the way marketing is conducted. The current research has discussed the implications on M&S business performance since the data was exploited on digital communities; users contacted M&S and raised the security concerns. M&S closed down its website for few hours to try to resolve the issue. The next day M&S made a public apology about this incidence. This information was proliferated on various digital communities and it has impacted negatively on M&S brand name, sales and customers. The content analysis approach is being used to collect qualitative data from 100 digital bloggers including social media communities such as Facebook and Twitter. The results and finding provide useful new insights into the nature and form of security concerns of digital users. Findings have theoretical and practical implications. This research will showcase a large corporation utilizing various digital community platforms and can serve as a model for future organizations.

Fake Account Detection in Twitter Based on Minimum Weighted Feature set

Social networking sites such as Twitter and Facebook attracts over 500 million users across the world, for those users, their social life, even their practical life, has become interrelated. Their interaction with social networking has affected their life forever. Accordingly, social networking sites have become among the main channels that are responsible for vast dissemination of different kinds of information during real time events. This popularity in Social networking has led to different problems including the possibility of exposing incorrect information to their users through fake accounts which results to the spread of malicious content during life events. This situation can result to a huge damage in the real world to the society in general including citizens, business entities, and others. In this paper, we present a classification method for detecting the fake accounts on Twitter. The study determines the minimized set of the main factors that influence the detection of the fake accounts on Twitter, and then the determined factors are applied using different classification techniques. A comparison of the results of these techniques has been performed and the most accurate algorithm is selected according to the accuracy of the results. The study has been compared with different recent researches in the same area; this comparison has proved the accuracy of the proposed study. We claim that this study can be continuously applied on Twitter social network to automatically detect the fake accounts; moreover, the study can be applied on different social network sites such as Facebook with minor changes according to the nature of the social network which are discussed in this paper.

The Use of Emoticons in Polite Phrases of Greetings and Thanks

This paper shows the connection between emoticons and politeness in written computer-mediated communication. It studies if there are some differences in the use of emoticon between Czech and English written tweets. The assumptions about the use of emoticons were based on the use of greetings and thanks in real, faceto-face situations. The first assumption, that welcome greeting phrase would be accompanied by positive emoticon, was correct. But for the farewell greeting are both positive and negative emoticons possible. The results show lower frequency of negative emoticons in this context. There were also quite often found both positive and negative emoticon in the same tweet. The expression of gratitude is associated with positive emotions. The results show that emoticons accompany polite phrases of greeting and thanks very often both in Czech and English. The use of emoticons with studied polite phrases shows that emoticons have become an integral part of these phrases. 

Survey on Arabic Sentiment Analysis in Twitter

Large-scale data stream analysis has become one of the important business and research priorities lately. Social networks like Twitter and other micro-blogging platforms hold an enormous amount of data that is large in volume, velocity and variety. Extracting valuable information and trends out of these data would aid in a better understanding and decision-making. Multiple analysis techniques are deployed for English content. Moreover, one of the languages that produce a large amount of data over social networks and is least analyzed is the Arabic language. The proposed paper is a survey on the research efforts to analyze the Arabic content in Twitter focusing on the tools and methods used to extract the sentiments for the Arabic content on Twitter.

Stock Market Prediction by Regression Model with Social Moods

This paper presents a regression model with autocorrelated errors in which the inputs are social moods obtained by analyzing the adjectives in Twitter posts using a document topic model, where document topics are extracted using LDA. The regression model predicts Dow Jones Industrial Average (DJIA) more precisely than autoregressive moving-average models.

Why Are Entrepreneurs Resistant to E-tools?

Latvia is the fourth in the world by means of broadband internet speed. The total number of internet users in Latvia exceeds 70% of its population. The number of active mailboxes of the local internet e-mail service Inbox.lv accounts for 68% of the population and 97.6% of the total number of internet users. The Latvian portal Draugiem.lv is a phenomenon of social media, because 58.4 % of the population and 83.5% of internet users use it. A majority of Latvian company profiles are available on social networks, the most popular being Twitter.com. These and other parameters prove the fact consumers and companies are actively using the Internet.  However, after the authors in a number of studies analyzed how enterprises are employing the e-environment, namely, e-environment tools, they arrived to the conclusions that are not as flattering as the aforementioned statistics. There is an obvious contradiction between the statistical data and the actual studies. As a result, the authors have posed a question: Why are entrepreneurs resistant to e-tools? In order to answer this question, the authors have addressed the Technology Acceptance Model (TAM). The authors analyzed each phase and determined several factors affecting the use of e-environment, reaching the main conclusion that entrepreneurs do not have a sufficient level of e-literacy (digital literacy).  The authors employ well-established quantitative and qualitative methods of research: grouping, analysis, statistic method, factor analysis in SPSS 20  environment etc.  The theoretical and methodological background of the research is formed by, scientific researches and publications, that from the mass media and professional literature, statistical information from legal institutions as well as information collected by the author during the survey.