A Web-Based Self-Learning Grammar for Spoken Language Understanding

One of the major goals of Spoken Dialog Systems
(SDS) is to understand what the user utters.
In the SDS domain, the Spoken Language Understanding (SLU)
module classifies user utterances by means of predefined
conceptual knowledge. The SLU module can recognize only the
meanings previously included in its knowledge base. Because that
knowledge is so vast, storing it is a very expensive
process.
Updating and managing the knowledge base are time-consuming
and error-prone processes because of the rapidly growing number of
entities such as proper nouns and domain-specific nouns. This paper
proposes a solution to the problem of Named Entity Recognition
(NER) applied to an SDS domain. The proposed solution attempts to
automatically recognize the meaning associated with an utterance by
using the PANKOW (Pattern-based Annotation through Knowledge
on the Web) method at runtime.
The proposed method extracts information from the Web to
extend the knowledge of the SLU module and reduce the development
effort. In particular, the Google search engine is used to extract
information from the Facebook social network.
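
As a rough illustration of the pattern-based idea behind PANKOW, the sketch below fills a few Hearst-style patterns with an unknown entity and each candidate concept, sums the hit counts reported for the resulting phrases, and picks the concept with the highest total. The pattern list, the function names (build_queries, classify_entity, fake_hit_count), and the hit counts are all hypothetical and chosen for illustration; no real search engine is queried, and this is not the paper's implementation.

```python
from typing import Callable, Dict, Iterable, List, Optional

# Illustrative Hearst-style patterns (an assumption, not the paper's list).
PATTERNS = (
    "{concept}s such as {entity}",
    "{entity} is a {concept}",
    "{entity} and other {concept}s",
)


def build_queries(entity: str, concept: str) -> List[str]:
    """Instantiate every pattern for one (entity, concept) pair."""
    return [p.format(entity=entity, concept=concept) for p in PATTERNS]


def classify_entity(
    entity: str,
    concepts: Iterable[str],
    hit_count: Callable[[str], int],
) -> Optional[str]:
    """Return the concept whose pattern phrases get the most hits.

    hit_count is any function mapping a phrase to a number of hits;
    in a real system it would wrap a web search, here it is injected
    so the logic can be exercised offline. Returns None when there are
    no candidate concepts or no phrase gets any hit.
    """
    scores: Dict[str, int] = {
        c: sum(hit_count(q) for q in build_queries(entity, c))
        for c in concepts
    }
    if not scores:
        return None
    best = max(scores, key=lambda c: scores[c])
    return best if scores[best] > 0 else None


# Hypothetical hit counts standing in for search-engine results.
FAKE_HITS = {
    "restaurants such as Da Mario": 120,
    "Da Mario is a restaurant": 45,
    "hotels such as Da Mario": 3,
}


def fake_hit_count(query: str) -> int:
    """Look up a phrase in the made-up table; unknown phrases get 0 hits."""
    return FAKE_HITS.get(query, 0)


label = classify_entity("Da Mario", ["restaurant", "hotel"], fake_hit_count)
```

Passing the hit-count function as a parameter keeps the scoring logic independent of any particular search service, so the same sketch could be pointed at a different source of counts.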