Leveraging Quality Metrics in Voting Model Based Thread Retrieval

Online forums have become popular in recent years as places to seek and share
knowledge. Although they are valuable sources of information, their messages
come from a wide variety of sources, so retrieving reliable threads with
high-quality content remains a challenge. The majority of existing information
retrieval systems ignore the quality of retrieved documents, particularly in thread
retrieval. In this research, we present an approach that employs
various quality features to assess the quality of retrieved
threads. These features capture different aspects of content quality, including
completeness, comprehensiveness, and politeness, which helps to find threads
that are not only textually but also conceptually relevant
to a user query within a forum. To analyse the influence of
the features, we used an adapted version of the voting model for thread
search as the retrieval system. We equipped it with each feature individually,
and with various combinations of features, over multiple
runs. The results show that incorporating the quality features
significantly improves the effectiveness of the retrieval
system.
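
The experimental setup described above can be pictured with a minimal sketch. It is an illustration under stated assumptions, not the authors' implementation: posts retrieved for a query act as votes for their parent threads, the votes are aggregated with a standard fusion rule (vote count, CombSUM, or CombMAX), and the aggregated relevance is multiplied by a thread-level quality score. The function names `rank_threads` and `combine_quality`, the multiplicative combination, the unweighted averaging of features, and all the numbers are hypothetical choices made for this example.

```python
from collections import defaultdict


def combine_quality(features, names):
    """Average the selected quality features of one thread.

    Assumption: every feature is already normalised to [0, 1].
    """
    return sum(features[name] for name in names) / len(names)


def rank_threads(post_scores, post_thread, thread_quality=None, aggregate="combsum"):
    """Rank threads from the posts retrieved for a query.

    post_scores    -- {post_id: relevance score of the post}
    post_thread    -- {post_id: thread_id}
    thread_quality -- optional {thread_id: quality score}; threads without an
                      entry keep their relevance unchanged (hypothetical rule)
    aggregate      -- "votes", "combsum" or "combmax"
    """
    # Each retrieved post is a vote for the thread it belongs to.
    votes = defaultdict(list)
    for post_id, score in post_scores.items():
        votes[post_thread[post_id]].append(score)

    ranking = []
    for thread_id, scores in votes.items():
        if aggregate == "votes":
            relevance = float(len(scores))
        elif aggregate == "combsum":
            relevance = sum(scores)
        elif aggregate == "combmax":
            relevance = max(scores)
        else:
            raise ValueError(f"unknown aggregation rule: {aggregate}")
        quality = 1.0 if thread_quality is None else thread_quality.get(thread_id, 1.0)
        ranking.append((thread_id, relevance * quality))

    # Highest combined score first.
    ranking.sort(key=lambda item: item[1], reverse=True)
    return ranking


# Toy data, invented for illustration only.
post_scores = {"p1": 0.9, "p2": 0.4, "p3": 0.7, "p4": 0.5}
post_thread = {"p1": "A", "p2": "A", "p3": "B", "p4": "B"}
features = {
    "A": {"completeness": 0.5, "politeness": 0.5},
    "B": {"completeness": 1.0, "politeness": 0.5},
}

baseline = rank_threads(post_scores, post_thread)

# One run per feature set: a single feature, then a combination.
for names in (["completeness"], ["completeness", "politeness"]):
    quality = {t: combine_quality(f, names) for t, f in features.items()}
    print(names, rank_threads(post_scores, post_thread, quality))
```

In this toy data the baseline puts thread A first, while either quality configuration promotes thread B. That is the kind of re-ranking effect the runs with individual and combined features are meant to measure.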



