Abstract: Today, a large number of political transcripts are
available on the Web to be mined and used for statistical analysis
and product recommendations. As online political resources are
used for various purposes, automatically determining the political
orientation of these transcripts becomes crucial. The methodologies
used by machine learning algorithms for automatic classification
are based on different features, which are grouped into categories
such as Linguistic and Personality. Considering the ideological
differences between Liberals and Conservatives, in this paper, the
effect of Personality traits on political orientation classification is
studied. The experiments in this study were based on the correlation
between LIWC features and the Big Five personality traits. Several
experiments were conducted using the Convote U.S. Congressional-Speech
dataset with seven benchmark classification algorithms. The
different methodologies were applied to several LIWC feature sets
consisting of between 8 and 64 features that are correlated with
the five personality traits. The experiments showed that
Neuroticism was the most differentiating personality trait for the
classification of political orientation. At the same time, it was
observed that the personality-trait-based classification
methodology gives results that are comparable to, or better than,
those of related work.
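The per-trait comparison described above can be sketched as follows. This is a minimal illustration only, not the study's implementation: the trait-to-feature mapping, the feature names, the class labels, and the data rows are all invented for the example, and a simple nearest-centroid classifier with leave-one-out evaluation stands in for the seven benchmark algorithms, which the abstract does not name.

```python
from statistics import mean

# Hypothetical trait-to-feature mapping (assumption). The study derives its
# subsets from LIWC/Big Five correlations, which are not reproduced here.
TRAIT_FEATURES = {
    "neuroticism": ["negemo", "anx"],
    "extraversion": ["social", "posemo"],
}


def centroid(rows, features):
    """Mean value of each selected feature over the given rows."""
    return [mean(r[f] for r in rows) for f in features]


def predict(train, features, sample):
    """Nearest-centroid rule: return the label whose mean vector is closest."""
    best_label, best_dist = None, float("inf")
    for label in sorted({r["label"] for r in train}):
        c = centroid([r for r in train if r["label"] == label], features)
        dist = sum((sample[f] - m) ** 2 for f, m in zip(features, c))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loo_accuracy(rows, features):
    """Leave-one-out accuracy using only the given feature subset."""
    hits = 0
    for i, held_out in enumerate(rows):
        train = rows[:i] + rows[i + 1:]
        hits += predict(train, features, held_out) == held_out["label"]
    return hits / len(rows)


# Toy rows with invented values and neutral labels (not Convote data).
rows = [
    {"label": "class_a", "negemo": 0.9, "anx": 0.8, "social": 0.5, "posemo": 0.4},
    {"label": "class_a", "negemo": 0.8, "anx": 0.9, "social": 0.4, "posemo": 0.6},
    {"label": "class_b", "negemo": 0.2, "anx": 0.1, "social": 0.5, "posemo": 0.5},
    {"label": "class_b", "negemo": 0.1, "anx": 0.2, "social": 0.6, "posemo": 0.5},
]

# One accuracy score per personality trait, each using only that trait's features.
scores = {trait: loo_accuracy(rows, feats) for trait, feats in TRAIT_FEATURES.items()}
```

Comparing the entries of `scores` across traits mirrors, in miniature, how one trait's feature subset can be identified as more discriminative than another's.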
Abstract: With the evolution of technology, the expression of
opinions has shifted to the digital world. The domain of politics,
one of the hottest topics in opinion mining research, has merged
with behavior analysis for determining affiliation in texts, which
is the subject of this paper.
This study aims to classify the text in news/blogs as either
Republican or Democrat with the minimum number of features. As
an initial set, 68 features, 64 of which were Linguistic
Inquiry and Word Count (LIWC) features, were tested with 14
benchmark classification algorithms. In later experiments, the
dimensionality of the feature vector was reduced using 7 feature
selection algorithms. The results show that the “Decision Tree”,
“Rule Induction” and “M5 Rule” classifiers, when used with the “SVM”
and “IGR” feature selection algorithms, performed best, reaching up
to 82.5% accuracy on a given dataset. Further tests on a single feature
and on the linguistic-based feature sets showed similar results. The
feature “Function”, an aggregate feature of the linguistic category,
was found to be the most differentiating of the 68 features,
with an accuracy of 81% in classifying articles as either Republican
or Democrat.
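As a rough illustration of ranking features before classification, the sketch below scores each feature by its information gain ratio, assuming that “IGR” in the abstract stands for information gain ratio. The feature names, the low/high binning, and the four toy samples are invented for the example; the study's own tooling and data are not shown here.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(items)
    return -sum((n / total) * math.log2(n / total) for n in Counter(items).values())


def gain_ratio(values, labels):
    """Information gain ratio of one discrete feature with respect to the labels."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Expected label entropy after splitting on the feature's values.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    # Split information: entropy of the feature's own value distribution.
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0


def rank_features(columns, labels):
    """Return feature names sorted by decreasing gain ratio."""
    return sorted(columns, key=lambda name: gain_ratio(columns[name], labels), reverse=True)


# Toy data (invented): each feature is already binned into "low"/"high".
labels = ["Republican", "Republican", "Democrat", "Democrat"]
columns = {
    "function": ["high", "high", "low", "low"],  # separates the toy classes perfectly
    "pronoun": ["high", "low", "high", "low"],   # carries no class information here
}

ranking = rank_features(columns, labels)
```

Keeping only the top-ranked names from `ranking` and training a classifier on them corresponds to the dimensionality reduction step; taking just the first name corresponds to the single-feature test.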