Ontology Population via NLP Techniques in Risk Management

In this paper we propose an NLP-based method for Ontology Population from texts and apply it to semi-automatically instantiate a Generic Knowledge Base (Generic Domain Ontology) in the risk management domain. The approach is semi-automatic: it relies on a domain expert for validation. It uses a set of Instance Recognition Rules based on syntactic structures, and on the predicative power of verbs in the instantiation process. Because it relies heavily on linguistic knowledge, it is not domain dependent. We describe an experiment performed on a part of the ontology of the PRIMA project (supported by the European Community). A first validation of the method is done by populating this ontology with Chemical Fact Sheets from the Environmental Protection Agency. The results of this experiment support the hypothesis that relying on the predicative power of verbs in the instantiation process improves performance.
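To make the idea of verb-driven instance recognition concrete, the following is a minimal sketch and not the rule set used in the paper. It assumes a hypothetical table `VERB_RULES` that maps a trigger verb to a subject concept, a relation, and an object concept, and it uses a plain regular expression in place of a real syntactic analysis. All concept and relation names, and the example sentence, are illustrative assumptions. Candidates are only proposed; a validation callback stands in for the domain expert.

```python
import re
from typing import Callable, List, NamedTuple


class Candidate(NamedTuple):
    subject: str          # text span proposed as an instance
    subject_concept: str  # ontology concept suggested for the subject
    relation: str         # relation suggested by the verb
    obj: str              # text span proposed as an instance
    object_concept: str   # ontology concept suggested for the object


# Hypothetical rules: verb -> (subject concept, relation, object concept).
# These names are assumptions for illustration, not taken from the paper.
VERB_RULES = {
    "causes": ("Risk", "hasConsequence", "Consequence"),
    "contaminates": ("Substance", "affects", "Resource"),
}

# Very rough stand-in for a syntactic structure: <subject> <verb> <object>.
PATTERN = re.compile(
    r"^(?P<subj>[\w\s-]+?)\s+(?P<verb>"
    + "|".join(map(re.escape, VERB_RULES))
    + r")\s+(?P<obj>[\w\s-]+?)\.?$"
)


def extract_candidates(sentence: str) -> List[Candidate]:
    """Propose instances for a sentence whose verb matches a rule."""
    match = PATTERN.match(sentence.strip())
    if match is None:
        return []
    subj_concept, relation, obj_concept = VERB_RULES[match.group("verb")]
    return [
        Candidate(
            match.group("subj").strip(),
            subj_concept,
            relation,
            match.group("obj").strip(),
            obj_concept,
        )
    ]


def populate(
    sentences: List[str], expert_accepts: Callable[[Candidate], bool]
) -> List[Candidate]:
    """Keep only the candidates that the (human) validator accepts."""
    accepted = []
    for sentence in sentences:
        for candidate in extract_candidates(sentence):
            if expert_accepts(candidate):
                accepted.append(candidate)
    return accepted


if __name__ == "__main__":
    # The lambda accepts everything; a real expert would review each one.
    for c in populate(["Benzene causes leukemia."], lambda c: True):
        print(c)
```

In this sketch the verb alone decides which concepts and relation are proposed, which mirrors the role the abstract gives to the predicative power of verbs. A realistic system would replace the regular expression with part-of-speech tagging or parsing.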



