Organization Model of Semantic Document Repository and Search Techniques for Studying Information Technology
Organizing a repository of documents and learning resources for a
specialized field such as Information Technology (IT), together with
search techniques based on domain knowledge or document content, is an
urgent practical need in teaching, learning, and research. Several
works have addressed methods of organization and search by content.
However, the results are still limited and insufficient to meet users'
demand for semantic document retrieval. This paper presents a solution
for organizing a repository that supports semantic representation and
processing in search. The proposed solution is a model that integrates
components such as an ontology describing domain knowledge, a database
for the document repository, a semantic representation for documents,
and a file system, together with problems, semantic processing
techniques, and advanced search techniques based on measuring semantic
similarity. The solution is applied to build an IT learning materials
management system for a university, with a semantic search function
serving students, teachers, and managers. The application has been
implemented and tested at the University of Information Technology,
Ho Chi Minh City, Vietnam, and has achieved good results.
@article{"International Journal of Information, Control and Computer Sciences:49409",
  author = "Nhon Do and Thuong Huynh and An Pham",
  title = "Organization Model of Semantic Document Repository and Search Techniques for Studying Information Technology",
  abstract = "Organizing a repository of documents and learning resources for a
specialized field such as Information Technology (IT), together with
search techniques based on domain knowledge or document content, is an
urgent practical need in teaching, learning, and research. Several
works have addressed methods of organization and search by content.
However, the results are still limited and insufficient to meet users'
demand for semantic document retrieval. This paper presents a solution
for organizing a repository that supports semantic representation and
processing in search. The proposed solution is a model that integrates
components such as an ontology describing domain knowledge, a database
for the document repository, a semantic representation for documents,
and a file system, together with problems, semantic processing
techniques, and advanced search techniques based on measuring semantic
similarity. The solution is applied to build an IT learning materials
management system for a university, with a semantic search function
serving students, teachers, and managers. The application has been
implemented and tested at the University of Information Technology,
Ho Chi Minh City, Vietnam, and has achieved good results.",
  keywords = "document retrieval system, knowledge representation, document representation, semantic search, ontology",
  volume = "5",
  number = "11",
  pages = "1172-6",
}