Abstract: Centroid terms are single words that semantically and
topically characterise text documents and so may serve as their
very compact representation in automatic text processing. In the
present paper, centroids are used to measure the relevance of text
documents with respect to a given search query. Thus, a new graph-based
paradigm for searching texts in large corpora is proposed
and evaluated against keyword-based methods. The first, promising
experimental results demonstrate the usefulness of the centroid-based
search procedure. It is shown that especially the routing of search
queries in interactive and decentralised search systems can be greatly
improved by applying this approach. A detailed discussion of further
fields of application completes this contribution.
Abstract: The problem of entity relation discovery in structured
data, a well-covered topic in the literature, consists in searching within
unstructured sources (typically, text) in order to find connections
among entities. The entities can be a whole dictionary, or a specific
collection of named items. In many cases machine learning and/or
text mining techniques are used for this goal. These approaches
might be unfeasible in computationally challenging problems, such
as processing massive data streams. A faster approach consists in
collecting the co-occurrences of any two words (entities) in order to
create a graph of relations, a co-occurrence graph. Indeed, each
co-occurrence indicates some degree of semantic correlation between
the words, because related words are more likely to appear close to
each other than at opposite ends of the text. Some authors have used
sliding windows for this problem: they count all the co-occurrences
within a sliding window running over the whole text. In this paper we
generalise this technique into a Weighted-Distance Sliding Window,
where each co-occurrence of two named items within the window is
counted with a weight depending on the distance between the items: a
closer distance implies stronger evidence of a relationship. We
develop an experiment to support this intuition, applying the
technique to a data set consisting of the text of the Bible, split
into verses.
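The weighted-distance idea can be illustrated with a minimal sketch. The abstract does not state the weighting function, so the inverse-distance weight `1/d`, the function name `weighted_cooccurrence`, and the `window` parameter are assumptions made for illustration only, not the authors' implementation.

```python
from collections import defaultdict


def weighted_cooccurrence(tokens, window=5, weight=lambda d: 1.0 / d):
    """Build an undirected weighted co-occurrence graph from a token list.

    Two tokens co-occur when they fall inside the same window of `window`
    consecutive positions, i.e. when their distance d satisfies
    1 <= d < window. Each co-occurrence adds weight(d) to the edge.
    The default 1/d weighting is an assumption, not taken from the paper.
    """
    graph = defaultdict(float)
    for i, first in enumerate(tokens):
        for d in range(1, window):
            j = i + d
            if j >= len(tokens):
                break
            second = tokens[j]
            if first == second:
                continue  # ignore self-loops
            edge = tuple(sorted((first, second)))
            graph[edge] += weight(d)
    return dict(graph)


# One verse treated as one unit of text, as in the Bible data set.
verse = "in the beginning god created the heaven and the earth".split()
graph = weighted_cooccurrence(verse, window=4)
```

Passing `weight=lambda d: 1.0` reduces the sketch to a plain sliding window that only counts co-occurrences, which is the technique the weighted variant generalises.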
Abstract: In the framework of adaptive parametric modelling of images, we propose in this paper a new technique based on the Chandrasekhar fast adaptive filter for texture characterization. An Auto-Regressive (AR) linear model of texture is obtained by scanning the image row by row and modelling this data with an adaptive Chandrasekhar linear filter. The characterization efficiency of the obtained model is compared with the model adapted with the Least Mean Square (LMS) 2-D adaptive algorithm and with the co-occurrence method features. The comparison criterion is based on the computation of a characterization degree using the ratio of "between-class" variances with respect to "within-class" variances of the estimated coefficients. Extensive experiments show that the coefficients estimated with the Chandrasekhar adaptive filter give better results in texture discrimination than those estimated by other algorithms, even in a noisy context.
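The abstract does not give the exact formula for the characterization degree. As a rough illustration, the ratio of between-class to within-class variance for a single estimated coefficient could be computed as follows; the function name, the input layout, and the sample-count weighting are assumptions.

```python
from statistics import fmean


def characterization_degree(classes):
    """Ratio of between-class to within-class variance for one coefficient.

    `classes` maps a texture class label to the list of values that the
    coefficient took over the samples of that class (hypothetical layout).
    A larger ratio means the coefficient separates the classes better.
    """
    all_values = [v for values in classes.values() for v in values]
    grand_mean = fmean(all_values)
    n = len(all_values)
    # Spread of the class means around the grand mean, weighted by class size.
    between = sum(
        len(values) * (fmean(values) - grand_mean) ** 2
        for values in classes.values()
    ) / n
    # Spread of the samples around their own class mean.
    within = sum(
        (v - fmean(values)) ** 2
        for values in classes.values()
        for v in values
    ) / n
    return between / within


degree = characterization_degree({"grass": [1.0, 3.0], "brick": [5.0, 7.0]})
```

The sketch raises `ZeroDivisionError` when every class has zero internal spread, so real data would need a guard for that case.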
Abstract: In this paper, we propose an approach for the classification of fingerprint databases. It is based on the fact that a fingerprint image is composed of regular texture regions that can be successfully represented by co-occurrence matrices. So, we first extract the features based on certain characteristics of the co-occurrence matrix and then we use these features to train a neural network for classifying fingerprints into four common classes. Compared with existing approaches, the obtained results demonstrate the superior performance of our proposed approach.
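The abstract does not say which co-occurrence characteristics are used. A minimal sketch of a gray-level co-occurrence matrix and three commonly used features (energy, entropy, contrast) might look like this; the function names, the pixel offset, and the choice of features are illustrative assumptions.

```python
import math


def glcm(image, dx=1, dy=0, levels=4):
    """Normalised gray-level co-occurrence matrix for the offset (dx, dy).

    `image` is a list of rows of integer gray levels in range(levels).
    Entry [i][j] is the relative frequency of level i having level j at
    the given offset.
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(map(sum, counts))
    return [[n / total for n in row] for row in counts]


def glcm_features(p):
    """Energy, entropy and contrast of a normalised co-occurrence matrix."""
    energy = sum(v * v for row in p for v in row)
    entropy = -sum(v * math.log2(v) for row in p for v in row if v > 0)
    contrast = sum(
        (i - j) ** 2 * v for i, row in enumerate(p) for j, v in enumerate(row)
    )
    return {"energy": energy, "entropy": entropy, "contrast": contrast}


patch = [[0, 0, 1, 1], [0, 0, 1, 1], [2, 2, 3, 3], [2, 2, 3, 3]]
features = glcm_features(glcm(patch, dx=1, dy=0, levels=4))
```

Feature vectors of this kind, computed for several offsets, are the sort of input a classifier such as a neural network could be trained on.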
Abstract: The main aim of this study was to examine whether
people understand indicative conditionals on the basis of syntactic
factors or on the basis of subjective conditional probability. The
second aim was to investigate whether the conditional probability of
q given p depends on the antecedent and consequent sizes or derives
from inductive processes that establish a link of plausible
co-occurrence between semantically or experientially associated events.
These competing hypotheses have been tested through a 3 x 2 x 2 x 2
mixed design involving the manipulation of four variables: type of
instructions (“Consider the following statement to be true”, “Read the
following statement”, and a condition with no conditional statement);
antecedent size (high/low); consequent size (high/low); statement
probability (high/low). The first variable was between-subjects, the
others were within-subjects. The inferences investigated were Modus
Ponens and Modus Tollens. Ninety undergraduates of the Second
University of Naples, without any prior knowledge of logic or
conditional reasoning, participated in this study.
Results suggest that people understand conditionals in a syntactic
way rather than in a probabilistic way, even though the perception of
the conditional probability of q given p is at least partially involved in
the comprehension of conditionals. They also show that, in the presence
of a conditional syllogism, inferences are not affected by the
antecedent or consequent sizes. From a theoretical point of view these
findings suggest that it would be inappropriate to abandon the idea
that conditionals are naturally understood in a syntactic way for the
idea that they are understood in a probabilistic way.
Abstract: In this paper a novel approach for generalized image
retrieval based on semantic contents is presented. It combines
three feature extraction methods, namely color, texture, and the edge
histogram descriptor. There is a provision to add new features in the
future for better retrieval efficiency. Any combination of these
methods that is appropriate for the application can be used
for retrieval. This is provided through a User Interface (UI) in the
form of relevance feedback. The image properties in this work are
analyzed using computer vision and image processing algorithms.
For color, the histogram of each image is computed; for texture,
co-occurrence matrix based entropy, energy, etc. are calculated; and for
edge density, the Edge Histogram Descriptor (EHD) is found.
For retrieval of images, a novel idea based on a greedy
strategy is developed to reduce the computational complexity. The entire system
was developed using AForge.Imaging (an open source product),
MATLAB .NET Builder, C#, and Oracle 10g. The system was tested
with the Coral Image database containing 1000 natural images and
achieved better results.
Abstract: In this paper, we propose a new image segmentation approach for colour textured images. The proposed method consists of two stages. In the first stage, textural features using the gray level co-occurrence matrix (GLCM) are computed for regions of interest (ROI) considered for each class. The ROIs act as ground truth for the classes. The Ohta model (I1, I2, I3) is the colour model used for segmentation. The statistical mean feature at a certain inter-pixel distance (IPD) of the I2 component was considered to be the optimized textural feature for further segmentation. In the second stage, the feature matrix obtained is assumed to be a degraded version of the image labels, and the unknown image labels are modeled as a Markov Random Field (MRF). The labels are estimated through the maximum a posteriori (MAP) estimation criterion using the ICM algorithm. The performance of the proposed approach is compared with that of two existing schemes: JSEG and another scheme which uses GLCM and MRF in RGB colour space. The proposed method is found to outperform the existing ones in terms of segmentation accuracy, with an acceptable rate of convergence. The results are validated with synthetic and real textured images.
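The second stage can be illustrated with a minimal sketch of iterated conditional modes (ICM) on a single feature map. The abstract does not specify the energy function, so the unit-variance Gaussian data term, the Potts smoothness prior with weight `beta`, the 4-neighbourhood, and all names below are assumptions rather than the authors' model.

```python
def icm(features, class_means, beta=1.0, sweeps=5):
    """Approximate MAP labelling of a 2-D feature map with ICM.

    Energy of label k at a pixel (assumed form):
      (feature - class_means[k])**2      data term
      + beta * (#neighbours with a label other than k)   Potts prior
    """
    rows, cols = len(features), len(features[0])
    n_classes = len(class_means)
    # Initialise with the nearest class mean (maximum-likelihood labels).
    labels = [
        [
            min(range(n_classes), key=lambda k: (features[r][c] - class_means[k]) ** 2)
            for c in range(cols)
        ]
        for r in range(rows)
    ]
    for _ in range(sweeps):
        changed = False
        for r in range(rows):
            for c in range(cols):
                neighbours = [
                    labels[rr][cc]
                    for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                    if 0 <= rr < rows and 0 <= cc < cols
                ]

                def energy(k):
                    data = (features[r][c] - class_means[k]) ** 2
                    prior = beta * sum(1 for n in neighbours if n != k)
                    return data + prior

                best = min(range(n_classes), key=energy)
                if best != labels[r][c]:
                    labels[r][c] = best
                    changed = True
        if not changed:
            break  # converged to a local minimum of the energy
    return labels


# A noisy pixel in the middle of a uniform region is smoothed away.
noisy = [[0.0, 0.0, 0.0], [0.0, 0.6, 0.0], [0.0, 0.0, 0.0]]
segmentation = icm(noisy, class_means=[0.0, 1.0], beta=1.0)
```

ICM only reaches a local minimum of the energy, so the result depends on the initial labelling and on the value chosen for `beta`.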