@article{fattah64024,
  author   = "Mohamed Abdel Fattah and Fuji Ren",
  title    = "Automatic Text Summarization",
  journal  = "International Journal of Information, Control and Computer Sciences",
  volume   = "2",
  number   = "1",
  pages    = "216-4",
  abstract = "This work proposes a trainable summarizer for automatic text
summarization. The summarizer takes into account several features of each
sentence: sentence position, positive keywords, negative keywords, sentence
centrality, sentence resemblance to the title, inclusion of a named entity,
inclusion of numerical data, relative sentence length, the bushy path of the
sentence, and aggregated similarity. We first investigate the effect of each
sentence feature on the summarization task. We then use the score function
over all features to train genetic algorithm (GA) and mathematical regression
(MR) models that learn a suitable combination of feature weights. The
performance of the proposed approach is measured at several compression rates
on a corpus of 100 English religious articles. The results of the proposed
approach are promising.",
  keywords = "Automatic Summarization, Genetic Algorithm, Mathematical Regression, Text Features",
}