Abstract: Content Based Image Retrieval (CBIR) coupled with
Case Based Reasoning (CBR) is a paradigm that is becoming
increasingly popular in the diagnosis and therapy planning of medical
ailments utilizing the digital content of medical images. This paper
presents a survey of some of the promising approaches used in the
detection of abnormalities in retina images as well as in
mammographic screening and detection of regions of interest
in MRI scans of the brain. We also describe our proposed
algorithm to detect hard exudates in fundus images of the
retina of Diabetic Retinopathy patients.
Abstract: Large amounts of data are typically stored in relational databases (DB), which can efficiently handle user queries intended to elicit the appropriate information from data sources. However, direct access to and use of these data require the end users to have an adequate technical background, and to cope with the internal data structures and values. Consequently, information retrieval is quite a difficult process even for IT or DB experts, given the limited contribution of relational databases at the conceptual level. Ontologies enable users to formally describe a domain of knowledge in terms of concepts and the relations among them, and hence can be used for unambiguously specifying the information captured by a relational database. However, accessing information residing in a database through ontologies is feasible only if the users are comfortable with semantic web technologies. To enable users from different disciplines to retrieve the appropriate data, a Graphical User Interface is necessary. In this work, we present an interactive, ontology-based, semantically enabled web tool that can be used for information retrieval purposes. The tool is based entirely on an ontological representation of the underlying database schema, and it provides a user-friendly environment through which users can graphically formulate and execute their queries.
Abstract: The legends about “user-friendly” and “easy-to-use” birotical tools (computer-related office tools) have been spreading and misleading end-users. This attitude has led to an extremely high number of incorrect documents, causing serious financial losses in the creating, modifying, and retrieving processes. Our research identified at least two sources of this underachievement: (1) The lack of a definition of correctly edited and formatted documents. Consequently, end-users do not know whether their methods and results are correct or not; they are not even aware of their ignorance, which prevents them from realizing their lack of knowledge. (2) The end-users’ problem-solving methods. We have found that in non-traditional programming environments end-users apply, almost exclusively, surface-approach metacognitive methods to carry out their computer-related activities, which have proved less effective than deep-approach methods. Based on these findings, we have developed deep-approach methods that are based on and adapted from traditional programming languages. In this study, we focus on the most popular type of birotical documents, text-based documents. We have provided a definition of correctly edited text and, based on this definition, adapted the debugging method known from programming. According to this method, before actual text editing takes place, a thorough debugging of existing texts and a categorization of their errors are carried out. In this way, in advance of real text editing, users learn the requirements of text-based documents and of correctly formatted text. The method has proved much more effective than the previously applied surface-approach methods. Its advantages are that proper text handling requires far fewer human and computer resources than clicking aimlessly in the GUI (Graphical User Interface), and that data retrieval is much more effective than from error-prone documents.
Abstract: The present paper summarizes an analysis of the requests for consultation of information and data on industrial emissions made publicly available on the web site of the Ministry of Environment, Land and Sea concerning integrated pollution prevention and control of large industrial installations, the so-called “AIA Portal”. A huge amount of information on national industrial plants is in fact already available on the internet, although it is usually published as textual documentation or images. It is therefore not possible to access all the relevant information through interoperability systems, nor to retrieve the information relevant for decision-making purposes or for raising awareness of environmental issues. Moreover, since in Italy a substantial number of institutional and private subjects are involved in the management of public information on industrial emissions, access to this information is provided on internet web sites according to different criteria; at present it is therefore not structurally homogeneous and comparable. To overcome these difficulties, in the case of the Coordinating Committee for the implementation of the Agreement for the industrial area of Taranto and Statte, which operated before the IPPC permit-granting procedures for the relevant installations located in the area, a considerable effort was devoted to elaborating and validating the data and information on the characterization of soil, groundwater aquifers and coastal sea available to the different subjects, in order to derive a global perspective for decision-making purposes. The present paper therefore also focuses on the main outcomes of this experience.
Abstract: Image search engines rely on surrounding textual keywords for the retrieval of images. It is tedious for search engines like Google and Bing to interpret the user’s search intention and to provide the desired results. Recent research also indicates that Google image search does not work well on all images. This motivates efficient image retrieval techniques that interpret the user’s search intention and show the desired results; accomplishing this task requires an efficient image re-ranking framework. In this paper, a new image re-ranking framework for improved image retrieval is evaluated experimentally. The implemented framework retrieves images from an image dataset and re-ranks them according to the user’s desired images. It operates in two stages, one offline and one online. In the offline stage, the framework learns different reference classes (semantic spaces) for different user query keywords, and semantic signatures are generated by combining the textual and visual features of the images. In the online stage, images are re-ranked by comparing the semantic signatures obtained from the reference classes associated with the user-specified query keywords. This re-ranking methodology increases image retrieval efficiency and yields results that are more useful to the user.
Abstract: Content-based image retrieval (CBIR) uses the contents of images to characterize and access them. This paper focuses on retrieving images by separating each image into its three color channels, R, G and B, to which a Discrete Wavelet Transform is applied. A wavelet-based Generalized Gaussian Density (GGD) is then used to model the coefficients of the wavelet transform. The result is then passed to a Histogram of Oriented Gradients (HOG) to extract feature vectors, in combination with a Relevance Feedback technique. The performance of this approach is evaluated in terms of precision, and the results confirm that the method is efficient for image retrieval.
Abstract: The growth in the volume of text data, such as the books and articles accumulated in libraries over centuries, has made it necessary to establish effective mechanisms to locate them. Early techniques such as abstracting, indexing and the use of classification categories marked the birth of a new field of research called "Information Retrieval". Information Retrieval (IR) can be defined as the task of defining models and systems whose purpose is to facilitate access to a set of documents in electronic form (a corpus), allowing a user to find those relevant to him, that is to say, the content that matches his information needs. Most information retrieval models use a specific data structure to index a corpus, called an "inverted file" or "inverted index". This inverted file collects information on all the terms occurring in the corpus documents, specifying the identifiers of the documents that contain each term, the frequency of the term in those documents, the positions of its occurrences, and so on. In this paper we use an object-oriented database (db4o) instead of the inverted file; that is to say, instead of searching for a term in the inverted file, we search for it in the db4o database. The purpose of this work is a comparative study to see whether object-oriented databases can compete with the inverted index in terms of access speed and resource consumption on a large volume of data.
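The inverted-file structure described in this abstract can be sketched in a few lines. This is a generic illustration only, not the authors' db4o implementation; the corpus, function names and document ids are invented for the example:

```python
from collections import defaultdict

def build_inverted_index(corpus):
    """Map each term to {doc_id: [positions]}; term frequency is the
    length of the position list, as in a classic inverted file."""
    index = defaultdict(dict)
    for doc_id, text in corpus.items():
        for pos, term in enumerate(text.lower().split()):
            index[term].setdefault(doc_id, []).append(pos)
    return index

def search(index, term):
    """Return, for one term, the ids of documents containing it
    together with the term frequency in each document."""
    postings = index.get(term.lower(), {})
    return {doc_id: len(positions) for doc_id, positions in postings.items()}

corpus = {
    1: "information retrieval with an inverted file",
    2: "an inverted index maps terms to documents",
}
index = build_inverted_index(corpus)
```

A lookup such as `search(index, "inverted")` then returns each matching document with the frequency of the term, and the stored position lists support phrase or proximity queries.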
Abstract: Search engines play an important role on the internet, retrieving the relevant documents from among a huge number of web pages. However, a search retrieves a large number of documents, not all of which are relevant to the search topic. To retrieve the most meaningful documents related to a search topic, ranking algorithms are used in information retrieval; ranking the retrieved documents is one of the practical problems of information retrieval and an open issue in data mining. This paper surveys various page ranking and page segmentation algorithms and compares those used for information retrieval. Diverse PageRank-based algorithms such as PageRank (PR), Weighted Page Rank (WPR), Weighted Page Content Rank (WPCR), Hyperlink-Induced Topic Search (HITS), Distance Rank, EigenRumor, Time Rank, Tag Rank, Relation-Based Page Rank and Query-Dependent Ranking are discussed and compared.
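Of the algorithms surveyed in this abstract, the original PageRank admits a compact sketch. The following is a minimal power-iteration version on a toy link graph (dangling nodes are not handled), not any particular paper's variant:

```python
def page_rank(links, d=0.85, iterations=50):
    """Iterate PR(p) = (1-d)/N + d * sum over inlinks q of PR(q)/outdeg(q).
    `links` maps each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    pr = {p: 1.0 / n for p in pages}          # uniform start
    for _ in range(iterations):
        new = {}
        for p in pages:
            inbound = sum(pr[q] / len(links[q]) for q in pages if p in links[q])
            new[p] = (1 - d) / n + d * inbound
        pr = new
    return pr

# toy web: A links to B and C, B links to C, C links back to A
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = page_rank(links)
```

Here C ends up ranked highest, since it receives links from both A and B; because every page has outlinks, the scores remain a probability distribution.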
Abstract: This paper analyzes the role of natural language processing (NLP) in the context of automated data retrieval, automated question answering, and text structuring. NLP techniques are gaining wider acceptance in real-life applications and industrial settings. There are various complexities involved in processing natural language text so that it can satisfy the needs of decision makers. This paper begins with a description of the qualities of NLP practices, then focuses on the challenges in natural language processing and discusses its major techniques. The last section describes opportunities and challenges for future research.
Abstract: The fuzzy composition of objects depicted in images
acquired through MR imaging or the use of bio-scanners has often
been a point of controversy for field experts attempting to effectively
delineate between the visualized objects. Modern approaches in
medical image segmentation tend to consider fuzziness as a
characteristic and inherent feature of the depicted object, instead of
an undesirable trait. In this paper, a novel technique for efficient
image retrieval in the context of images in which segmented objects
are either crisp or fuzzily bounded is presented. Moreover, the
proposed method is applied in the case of multiple, even conflicting,
segmentations from field experts. Experimental results demonstrate
the efficiency of the suggested method in retrieving similar objects
from the aforementioned categories while taking into account the
fuzzy nature of the depicted data.
Abstract: Key frame extraction methods select the most representative frames of a video, which can be used in different areas of video processing such as video retrieval, video summarization, and video indexing. In this paper we present a novel approach for extracting key frames from video sequences. Each frame is characterized uniquely by its contours, which are represented by dominant blocks located on the contours and the textures near them. When the video frames change noticeably, their dominant blocks change as well, and a key frame can then be extracted. The dominant blocks of every frame are computed, feature vectors are extracted from the dominant-block image of each frame and arranged in a feature matrix, and Singular Value Decomposition is used to calculate the ranks of sliding windows over those matrices. Finally, the computed ranks are traced to extract the key frames of the video. Experimental results show that the proposed approach is robust against a large range of the digital effects used during shot transitions.
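As a rough illustration of the rank-tracing idea described above (not the authors' dominant-block pipeline), the numerical rank of sliding windows over per-frame feature vectors can be read off the singular values; a rank jump marks new visual content. The synthetic features below stand in for the dominant-block descriptors:

```python
import numpy as np

def window_ranks(features, window=3, tol=1e-3):
    """Numerical rank of each sliding window of consecutive frame
    feature vectors, via SVD: count singular values above tol * s_max."""
    ranks = []
    for start in range(len(features) - window + 1):
        block = np.asarray(features[start:start + window])
        s = np.linalg.svd(block, compute_uv=False)
        ranks.append(int(np.sum(s > tol * s[0])))
    return ranks

# two synthetic static shots: frames 0-4 identical, frames 5-9 identical
shot1 = [1.0, 0.0, 0.0]
shot2 = [0.0, 1.0, 0.0]
features = [shot1] * 5 + [shot2] * 5
ranks = window_ranks(features)
```

Windows lying entirely inside one shot have rank 1; the windows straddling the shot boundary jump to rank 2, and tracing that jump locates the candidate key frame.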
Abstract: Nowadays, the Web has become one of the most pervasive platforms for information exchange and retrieval, collecting the suitable, well-fitting information one requires from websites. Data mining is the process of extracting the data available on the internet. Web mining is a branch of data mining that relates to various research communities such as information retrieval, database management systems and artificial intelligence. In this paper we discuss the concepts of Web mining, focusing mainly on one of its categories, Web content mining, and its various tasks. Mining tools are essential for scanning the many images, text files and HTML documents, and their results are then used by the various search engines. We conclude by presenting a comparative table of these tools based on some pertinent criteria.
Abstract: The system is designed to show images related to a query image. Extracting color, texture, and shape features from an image plays a vital role in content-based image retrieval (CBIR). Initially, the RGB image is converted into the HSV color space owing to its perceptual uniformity. From the HSV image, color features are extracted using a block color histogram, texture features using the Haar transform, and shape features using the Fuzzy C-means algorithm. Then the characteristics of the global and local color histograms, of the texture features obtained through the co-occurrence matrix and the Haar wavelet transform, and of the shape features are compared and analyzed for CBIR. Finally, the best method for each feature is fused during similarity measurement to improve image retrieval effectiveness and accuracy.
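A block color histogram over the HSV hue channel, as mentioned above, can be sketched as follows. This is a simplified stand-in (hue-only bins on a tiny synthetic image), not the paper's exact feature; the function name and parameters are invented for the example:

```python
import colorsys

def block_color_histogram(pixels, width, height, blocks=2, bins=4):
    """Split the image into blocks x blocks cells and build a quantized
    hue histogram per cell. `pixels` is row-major (r, g, b) in [0, 1]."""
    hist = [[0] * bins for _ in range(blocks * blocks)]
    for i, (r, g, b) in enumerate(pixels):
        x, y = i % width, i // width
        bx = min(x * blocks // width, blocks - 1)    # cell column
        by = min(y * blocks // height, blocks - 1)   # cell row
        h, _, _ = colorsys.rgb_to_hsv(r, g, b)       # hue in [0, 1)
        hist[by * blocks + bx][min(int(h * bins), bins - 1)] += 1
    return hist

# 2x2 all-red image: every cell puts its one pixel in the first hue bin
red_image = [(1.0, 0.0, 0.0)] * 4
hist = block_color_histogram(red_image, width=2, height=2)
```

Keeping one histogram per block preserves coarse spatial layout, which a single global histogram discards.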
Abstract: Search is the most obvious application of information retrieval. The variety of widely obtainable biomedical data is enormous and expanding fast. This expansion means that existing techniques are no longer enough to extract the most interesting patterns from a collection according to the user's requirements. Recent research concentrates more on semantics-based searching than on traditional term-based searches. Algorithms for semantic search are implemented based on the relations existing between the words of the documents. Ontologies are used as domain knowledge for identifying semantic relations as well as for structuring the data for effective information retrieval. Annotating data with the concepts of an ontology is one of the most widespread practices for clustering documents. In this paper, concept-based and annotation-based indexing are proposed for clustering biomedical documents, and the Fuzzy c-means (FCM) clustering algorithm is used to cluster them. The performance of the proposed methods is compared with traditional term-based clustering on PubMed articles from five different disease communities. The experimental results show that the proposed methods outperform term-based fuzzy clustering.
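The Fuzzy c-means step used above can be sketched with the standard update equations for soft memberships and centroids. The toy 2-D points below stand in for document feature vectors, and the parameter choices (c, m, iteration count) are illustrative only:

```python
import numpy as np

def fuzzy_c_means(X, c=2, m=2.0, iters=100, seed=0):
    """Fuzzy c-means: alternate centroid and membership updates.
    u[i, k] is the degree to which point i belongs to cluster k;
    each row of u sums to 1."""
    rng = np.random.default_rng(seed)
    u = rng.random((len(X), c))
    u /= u.sum(axis=1, keepdims=True)
    for _ in range(iters):
        w = u ** m                                    # fuzzified weights
        centers = (w.T @ X) / w.sum(axis=0)[:, None]  # weighted centroids
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        # u_ik = 1 / sum_j (d_ik / d_jk)^(2/(m-1))
        u = 1.0 / (d ** (2 / (m - 1)) *
                   np.sum(d ** (-2 / (m - 1)), axis=1, keepdims=True))
    return u, centers

# two well-separated toy "document" groups
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
u, centers = fuzzy_c_means(X)
```

Unlike hard k-means, each point keeps a graded membership in every cluster, which suits documents that touch several disease topics at once.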
Abstract: Digital information is expanding exponentially in our lives. Information residing online and offline is stored in huge repositories relating to every aspect of our lives, and getting the required information is the task of retrieval systems. Content-based image retrieval (CBIR) is a retrieval system that retrieves the required information from repositories on the basis of the contents of an image. Time is a critical factor in a retrieval system, and using indexed views with a CBIR system improves the time efficiency of the retrieved results.
Abstract: Chord formation in western music notation is an intelligent art form that a musician learns over many years. Still, it is a question of creativity to find the perfect chord sequence that matches a music score. This work focuses on the process of forming chords using a custom-designed knowledgebase (KB) of a Music Expert System. An optimal Chord-Set for a given music score is arrived at by using the chord-pool in the KB and finding the chord match using Jusic Distance (JD). A Conceptual Graph based knowledge representation model is followed for knowledge storage and retrieval in the knowledgebase.
Abstract: The analysis of scientific collaboration networks has contributed significantly to improving our understanding of how collaboration between researchers takes place and of how the scientific production of researchers or research groups evolves. However, identifying collaborations in large scientific databases is not a trivial task, given the high computational cost of the methods commonly used. This paper proposes a method for identifying collaborations in large databases of researcher curricula. The proposed method has a low computational cost and gives satisfactory results, proving to be an interesting alternative for modeling and characterizing large scientific collaboration networks.
Abstract: The Genetic Algorithm (GA) is a powerful technique for solving optimization problems. It follows the idea of survival of the fittest: better and better solutions evolve from previous generations until a near-optimal solution is obtained. A GA uses three main operations, selection, crossover and mutation, to produce new generations from the old ones. GAs have been widely used to solve optimization problems in many applications such as the traveling salesman problem, airport traffic control, information retrieval (IR), reactive power optimization, job-shop scheduling, and hydraulic systems such as water pipeline systems. In water pipeline systems we need to achieve some goals optimally, such as minimum construction cost, minimum pipe lengths and diameters, and the placement of protection devices. The GA shows high performance compared with other optimization techniques; moreover, it is easy to implement and use, and it needs to examine only a limited number of candidate solutions.
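The three GA operations named above (selection, crossover and mutation) can be illustrated on the classic OneMax toy problem, maximizing the number of 1 bits in a bit string. All parameters here are arbitrary, and this sketch is not tied to the water-pipeline application:

```python
import random

def genetic_algorithm(fitness, length=20, pop_size=30, generations=60,
                      crossover_rate=0.9, mutation_rate=0.02, seed=42):
    """Tournament selection, one-point crossover, bit-flip mutation."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        def tournament():
            # selection: best of 3 randomly drawn individuals
            return max(rng.sample(pop, 3), key=fitness)
        nxt = []
        while len(nxt) < pop_size:
            a, b = tournament(), tournament()
            if rng.random() < crossover_rate:
                # one-point crossover: swap tails at a random cut
                cut = rng.randrange(1, length)
                a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]
            # mutation: flip each bit with small probability
            nxt.extend([[bit ^ (rng.random() < mutation_rate) for bit in c]
                        for c in (a, b)])
        pop = nxt[:pop_size]
    return max(pop, key=fitness)

best = genetic_algorithm(fitness=sum)  # OneMax: fitness = number of 1 bits
```

For a real problem such as pipe sizing, only the chromosome encoding and the fitness function change; the three operators stay the same.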
Abstract: Information retrieval has become an important field of study and research within computer science due to the explosive growth of information available in the form of full text, hypertext, administrative text, directories, and numeric or bibliographic data. Research is ongoing on various aspects of information retrieval systems so as to improve their efficiency and reliability. This paper presents a comprehensive study that discusses not only the emergence and evolution of information retrieval but also the different information retrieval models and some important aspects such as document representation, similarity measures and query expansion.
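Document representation and similarity measurement, two of the aspects mentioned above, are commonly realized as TF-IDF vectors compared by cosine similarity. A minimal sketch with invented example documents:

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Bag-of-words TF-IDF vectors: weight(t, d) = tf(t, d) * log(N / df(t))."""
    tokenized = [doc.lower().split() for doc in docs]
    df = Counter(term for doc in tokenized for term in set(doc))
    n = len(docs)
    return [{t: tf * math.log(n / df[t]) for t, tf in Counter(doc).items()}
            for doc in tokenized]

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = (math.sqrt(sum(x * x for x in u.values())) *
            math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0

docs = ["information retrieval models",
        "retrieval of information",
        "query expansion methods"]
vecs = tfidf_vectors(docs)
```

The first two documents share terms and so score a positive similarity, while the third shares none and scores zero; query expansion aims precisely at bridging such vocabulary gaps.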
Abstract: Reformulating the user query is a technique that aims to improve the performance of an Information Retrieval System (IRS) in terms of precision and recall. This paper evaluates the technique of query reformulation guided by an external resource for Arabic texts. To do this, various precision and recall measurements were conducted, and two corpora with different external resources, Arabic WordNet (AWN) and the Arabic Dictionary (thesaurus) of Meaning (ADM), were used. Examination of the results obtained allows us to measure the real contribution of this reformulation technique to improving IRS performance.