Abstract: When using Information Retrieval Systems (IRS), users often submit search queries made of ad-hoc keywords. It is then up to the IRS to obtain a precise representation of the user’s information need and the context of the information. This paper investigates optimizing an IRS so that it serves individual information needs in order of relevance. The study addresses the development of algorithms that optimize the ranking of documents retrieved from an IRS, and describes a Document Ranking Optimization (DROPT) algorithm for information retrieval (IR) in an Internet-based or designated-database environment. As the volume of information available online and in designated databases grows continuously, ranking algorithms can play a major role in the context of search results. In this paper, a DROPT technique for documents retrieved from a corpus is developed with respect to document index keywords and the query vectors. This is based on calculating the weight (
Abstract: Seeking and sharing knowledge on online forums
has made them popular in recent years. Although online forums are
valuable sources of information, the variety of message sources
makes retrieving reliable threads with high-quality content an
issue. The majority of existing information retrieval systems ignore
the quality of retrieved documents, particularly in the field of thread
retrieval. In this research, we present an approach that employs
various quality features in order to investigate the quality of retrieved
threads. Different aspects of content quality, including completeness,
comprehensiveness, and politeness, are assessed using these features,
which leads to finding threads that are not only textually but also
conceptually relevant to a user query within a forum. To analyse the
influence of the features, we used an adapted version of the voting
model for thread search as the retrieval system. We equipped it with
each feature alone and with various combinations of features in turn
over multiple runs. The results show that incorporating the quality features
enhances the effectiveness of the utilised retrieval system
significantly.
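As a rough illustration of the voting idea mentioned above, the sketch below treats each retrieved message as a vote for its thread and sums the message scores per thread (a CombSUM-style aggregation). The abstract does not say how the quality features enter the score, so the multiplicative `quality` weight, the function name `rank_threads`, and all sample data are assumptions made only for this example.

```python
from collections import defaultdict


def rank_threads(message_scores, thread_of, quality=None):
    """Rank threads by summing the scores of their retrieved messages.

    message_scores: message id -> retrieval score of that message
    thread_of:      message id -> id of the thread containing it
    quality:        optional thread id -> weight (hypothetical quality
                    feature; threads without an entry keep weight 1.0)
    """
    quality = quality or {}
    totals = defaultdict(float)
    for message_id, score in message_scores.items():
        # Each retrieved message votes for its thread with its score.
        totals[thread_of[message_id]] += score
    # Illustrative assumption: scale each thread total by its quality weight.
    weighted = {t: s * quality.get(t, 1.0) for t, s in totals.items()}
    return sorted(weighted.items(), key=lambda item: item[1], reverse=True)


# Made-up scores: thread t1 has two matching messages, t2 has one.
message_scores = {"m1": 0.9, "m2": 0.4, "m3": 0.7}
thread_of = {"m1": "t1", "m2": "t1", "m3": "t2"}

baseline = rank_threads(message_scores, thread_of)
with_quality = rank_threads(message_scores, thread_of,
                            quality={"t1": 0.5, "t2": 1.0})
```

With these made-up numbers, thread t1 leads on message scores alone, while halving its weight moves t2 to the top, which is the kind of reordering a quality feature could cause.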
Abstract: Information retrieval has become an important field of study and research in computer science due to the explosive growth of information available in the form of full text, hypertext, administrative text, directory, numeric, or bibliographic text. Research is ongoing on various aspects of information retrieval systems to improve their efficiency and reliability. This paper presents a comprehensive study that discusses not only the emergence and evolution of information retrieval but also different information retrieval models and some important aspects such as document representation, similarity measures, and query expansion.
Abstract: XML is a markup language which is becoming the
standard format for information representation and data exchange. A
major purpose of XML is the explicit representation of the logical
structure of a document. Much research has been performed to
exploit the logical structure of documents in information retrieval in
order to precisely extract the information a user needs from large
collections of XML documents. In this paper, we describe an XML
information retrieval weighting scheme that tries to find the most
relevant elements in XML documents in response to a user query.
We present this weighting model for information retrieval systems
that utilize plausible inferences to infer the relevance of elements in
XML documents. We also add to this model the Dempster-Shafer
theory of evidence to express the uncertainty in plausible inferences
and the Dempster-Shafer rule of combination to combine evidence
derived from different inferences.
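For readers unfamiliar with the Dempster-Shafer rule of combination named above, here is a minimal sketch of the standard rule applied to two mass functions over a two-hypothesis frame. The frame ("relevant" / "not_relevant") and the mass values are invented for illustration and are not taken from the paper; how the paper derives masses from plausible inferences is not stated in the abstract.

```python
from itertools import product


def combine(m1, m2):
    """Combine two mass functions with Dempster's rule of combination.

    Each mass function maps a frozenset of hypotheses to its mass,
    and the masses of each function sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = (
                combined.get(intersection, 0.0) + mass_a * mass_b
            )
        else:
            # Mass assigned to contradictory focal elements.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("total conflict: the sources cannot be combined")
    # Renormalise by the non-conflicting mass.
    return {h: m / (1.0 - conflict) for h, m in combined.items()}


REL = frozenset({"relevant"})
NON = frozenset({"not_relevant"})
EITHER = REL | NON  # mass on the whole frame expresses ignorance

# Two hypothetical pieces of evidence about one XML element.
m_a = {REL: 0.6, EITHER: 0.4}
m_b = {REL: 0.5, NON: 0.2, EITHER: 0.3}
fused = combine(m_a, m_b)
```

Here the conflicting mass is 0.6 × 0.2 = 0.12, so the remaining masses are divided by 0.88; the fused belief in "relevant" is 0.68 / 0.88.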
Abstract: Due to new distributed database applications such as
huge deductive database systems, the search complexity is constantly
increasing and we need better algorithms to speed up traditional
relational database queries. An optimal dynamic programming
method for such high-dimensional queries has the major disadvantage of
exponential complexity, and thus we are interested in semi-optimal but
faster approaches. In this work we present a multi-agent based
mechanism to meet this demand and also compare the results with
some commonly used query optimization algorithms.
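To make the exponential cost of the optimal dynamic programming approach concrete, the following sketch enumerates left-deep join orders over all subsets of tables. It is not the paper's multi-agent mechanism. The cost model (sum of intermediate result sizes), the function name `best_join_order`, and the cardinalities and selectivities are all assumptions chosen for illustration.

```python
from itertools import combinations


def best_join_order(cardinalities, selectivity):
    """Find the cheapest left-deep join order by dynamic programming.

    cardinalities: table name -> row count
    selectivity:   (table_a, table_b) with table_a < table_b -> join
                   selectivity; missing pairs count as 1.0 (cross product)
    Assumed cost model: the sum of all intermediate result sizes.
    """
    tables = sorted(cardinalities)

    def result_size(subset):
        size = 1.0
        for t in subset:
            size *= cardinalities[t]
        for a, b in combinations(sorted(subset), 2):
            size *= selectivity.get((a, b), 1.0)
        return size

    # best[subset] = (cost, order) of the cheapest plan joining subset.
    # The table has one entry per non-empty subset, i.e. 2^n - 1 entries.
    best = {frozenset([t]): (0.0, (t,)) for t in tables}
    for k in range(2, len(tables) + 1):
        for combo in combinations(tables, k):
            subset = frozenset(combo)
            size = result_size(subset)
            candidates = []
            for last in combo:
                cost, order = best[subset - {last}]
                candidates.append((cost + size, order + (last,)))
            best[subset] = min(candidates)
    return best[frozenset(tables)]


cardinalities = {"A": 1000, "B": 100, "C": 10}
selectivity = {("A", "B"): 0.01, ("B", "C"): 0.1}
cost, order = best_join_order(cardinalities, selectivity)
```

Because the table grows as 2^n in the number of relations, this exact search becomes impractical for large queries, which is the motivation the abstract gives for semi-optimal methods.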
Abstract: In this paper, a model for an information retrieval
system is proposed which takes into account that knowledge about
documents and the information needs of users are dynamic. Two
methods are combined, one qualitative or symbolic and the other
quantitative or numeric, which are deemed suitable for many
clustering contexts, data analysis, concept exploring and
knowledge discovery. These two methods may be classified as
inductive learning techniques. In this model, they are introduced to
build “long-term” knowledge about past queries and concepts in a
collection of documents. The “long-term” knowledge can guide
and assist the user to formulate an initial query and can be
exploited in the process of retrieving relevant information. The
different kinds of knowledge are organized from different points of
view. This may be considered an enrichment of the exploration
level which is coherent with the concept of document/query
structure.
Abstract: This study investigates the use of genetic algorithms
in information retrieval. The method is shown to be applicable to
three well-known document collections, where more relevant
documents are presented to users with the genetic modification. In this
paper we present a new fitness function for approximate information
retrieval which is faster and more flexible than the cosine similarity
fitness function.
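The abstract does not define the new fitness function, so only the cosine similarity baseline it is compared against can be sketched here. The example below computes cosine similarity between a query vector and a document vector; the function name and the term-weight vectors are made up for illustration.

```python
import math


def cosine_similarity(query, document):
    """Cosine of the angle between two equal-length term-weight vectors."""
    dot = sum(q * d for q, d in zip(query, document))
    norm_q = math.sqrt(sum(q * q for q in query))
    norm_d = math.sqrt(sum(d * d for d in document))
    if norm_q == 0 or norm_d == 0:
        # A vector with no terms matches nothing.
        return 0.0
    return dot / (norm_q * norm_d)


# Hypothetical weights over a three-term vocabulary.
query = [1.0, 0.0, 1.0]
doc_same_terms = [2.0, 0.0, 2.0]
doc_other_terms = [0.0, 3.0, 0.0]

score_same = cosine_similarity(query, doc_same_terms)
score_other = cosine_similarity(query, doc_other_terms)
```

In a genetic algorithm, a score like this could serve as the fitness of a candidate query or document representation; it needs two vector norms and a dot product per comparison.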
Abstract: In this paper we propose a multi-agent architecture for web information retrieval using a fuzzy-logic-based result fusion mechanism. The model is designed in the JADE framework and takes advantage of the JXTA agent communication method to allow agent communication through firewalls and network address translators. This approach enables developers to build and deploy P2P applications through a unified medium to manage agent-based document retrieval from multiple sources.
Abstract: In this study, a K-Means-like clustering technique with a hierarchical initial set (HKM) has been implemented. The goal of this study is to show that clustering document sets enhances precision in information retrieval systems, as was shown by Bellot & El-Beze for the French language. A comparison is made between the traditional information retrieval system and the clustered one, and the effect of increasing the number of clusters on precision is also studied. The indexing technique is Term Frequency * Inverse Document Frequency (TF * IDF). It has been found that Hierarchical K-Means-like clustering (HKM) with 3 clusters over 242 Arabic abstract documents from the Saudi Arabian National Computer Conference has a significant effect compared with a traditional information retrieval system without clustering. Additionally, it has been found that increasing the number of clusters does not necessarily improve precision further.
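The TF * IDF indexing named above can be sketched as follows. The abstract does not say which variant was used, so this example assumes one common form: raw term count multiplied by the natural logarithm of (number of documents / number of documents containing the term). The function name and the tiny tokenised corpus are placeholders.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per tokenised document.

    Assumed variant: weight = raw term count * ln(N / document frequency).
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        # Count each term once per document.
        doc_freq.update(set(tokens))
    weights = []
    for tokens in documents:
        counts = Counter(tokens)
        weights.append({
            term: count * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Placeholder corpus of two tokenised documents.
corpus = [["a", "b", "a"], ["b", "c"]]
weights = tf_idf(corpus)
```

A term that occurs in every document, such as "b" here, gets weight 0 under this variant, so it contributes nothing when documents are compared or clustered.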