Text Summarization for Oil and Gas Drilling Topic

Information sharing and gathering are important in an era of rapid technological advancement. The World Wide Web has driven an explosion in the volume of available information. Readers are overloaded with lengthy text documents when they would prefer shorter versions, and the oil and gas industry is no exception. In this paper, we develop an automated text summarization system, AutoTextSumm, that extracts the salient points of oil and gas drilling articles by combining a statistical approach with keyword identification, synonyms, and sentence position. We interviewed Petroleum Engineering experts and English Language experts to identify the most commonly used keywords in the oil and gas drilling domain. The performance of AutoTextSumm is evaluated using precision, recall, and F-score. In our experiments, AutoTextSumm performed satisfactorily, achieving an F-score of 0.81.
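
As an illustration of the evaluation measures named above, the following is a minimal sketch of how precision, recall, and F-score could be computed for an extractive summary. It assumes that the system output and an expert reference summary are both represented as sets of sentence indices, and that the F-score is the balanced harmonic mean of precision and recall. The abstract does not state either detail, so the function name `precision_recall_f1` and the sample data are hypothetical.

```python
def precision_recall_f1(selected, reference):
    """Compare system-selected sentences against a reference summary.

    Both arguments are collections of sentence indices (an assumption;
    the abstract does not specify the unit of comparison).
    """
    selected, reference = set(selected), set(reference)
    overlap = len(selected & reference)  # sentences chosen by both

    # Precision: share of selected sentences that are in the reference.
    precision = overlap / len(selected) if selected else 0.0
    # Recall: share of reference sentences that were selected.
    recall = overlap / len(reference) if reference else 0.0

    # Balanced F-score: harmonic mean of precision and recall.
    if precision + recall == 0:
        return precision, recall, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Hypothetical example: sentence indices, not data from the study.
system_summary = {0, 2, 5, 7}
expert_summary = {0, 2, 3, 7, 9}
p, r, f = precision_recall_f1(system_summary, expert_summary)
print(f"precision={p:.2f} recall={r:.2f} f_score={f:.2f}")
```

In this made-up case three of the four selected sentences appear in the reference summary, so precision is higher than recall. The F-score balances the two so that a system cannot score well simply by selecting very few or very many sentences.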




