
Interactive, Topic-Oriented Search Support by a Centroid-Based Text Categorisation

Year: 2019 Volume: 13 Issue: 4 Pages: 178 - 184

Abstract: Centroid terms are single words that semantically and topically characterise text documents and so may serve as their very compact representation in automatic text processing. In the present paper, centroids are used to measure the relevance of text documents with respect to a given search query. Thus, a new graph-based paradigm for searching texts in large corpora is proposed and evaluated against keyword-based methods. The first, promising experimental results demonstrate the usefulness of the centroid-based search procedure. It is shown that the routing of search queries in interactive and decentralised search systems, in particular, can be greatly improved by applying this approach. A detailed discussion of further fields of application completes this contribution.

Weighted-Distance Sliding Windows and Cooccurrence Graphs for Supporting Entity-Relationship Discovery in Unstructured Text

Year: 2018 Volume: 12 Issue: 8 Pages: 663 - 669

Abstract: The problem of Entity relation discovery in structured data, a well covered topic in literature, consists in searching within unstructured sources (typically, text) in order to find connections among entities. These can be a whole dictionary, or a specific collection of named items. In many cases machine learning and/or text mining techniques are used for this goal. These approaches might be unfeasible in computationally challenging problems, such as processing massive data streams. A faster approach consists in collecting the cooccurrences of any two words (entities) in order to create a graph of relations, a cooccurrence graph. Indeed, each cooccurrence highlights some degree of semantic correlation between the words, because related words are more likely to appear close to each other than on opposite sides of the text. Some authors have used sliding windows for this problem: they count all the occurrences within a sliding window running over the whole text. In this paper we generalise this technique, arriving at a Weighted-Distance Sliding Window, where each occurrence of two named items within the window is counted with a weight depending on the distance between the items: a closer distance implies stronger evidence of a relationship. We develop an experiment to support this intuition, applying the technique to a data set consisting of the text of the Bible, split into verses.
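
The weighted-distance sliding window described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the weighting function, so the 1/d decay, the window size, the whitespace tokenisation and the function name weighted_cooccurrences are all hypothetical choices, not the authors' implementation.

```python
from collections import defaultdict


def weighted_cooccurrences(tokens, window=5, weight=lambda d: 1.0 / d):
    """Accumulate distance-weighted co-occurrence scores over a sliding window.

    Assumptions (not taken from the paper): `window` is the largest token
    distance considered, and the default weight decays as 1/d.
    """
    graph = defaultdict(float)
    for i, left in enumerate(tokens):
        # Pair token i with every token at most `window` positions to its right.
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            right = tokens[j]
            if left == right:
                continue  # a word does not co-occur with itself
            edge = tuple(sorted((left, right)))  # undirected edge
            graph[edge] += weight(j - i)
    return dict(graph)


verse = "in the beginning god created the heaven and the earth".split()
graph = weighted_cooccurrences(verse, window=3)
```

Setting the weight to a constant 1 recovers the plain sliding-window count that the abstract attributes to earlier work, so the two variants can be compared on the same text.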

Texture Characterization Based on a Chandrasekhar Fast Adaptive Filter

Year: 2010 Volume: 4 Issue: 3 Pages: 616 - 620

Abstract: In the framework of adaptive parametric modelling of images, we propose in this paper a new technique based on the Chandrasekhar fast adaptive filter for texture characterization. An Auto-Regressive (AR) linear model of texture is obtained by scanning the image row by row and modelling this data with an adaptive Chandrasekhar linear filter. The characterization efficiency of the obtained model is compared with the model adapted with the Least Mean Square (LMS) 2-D adaptive algorithm and with the cooccurrence method features. The comparison criterion is based on the computation of a characterization degree using the ratio of "between-class" variances to "within-class" variances of the estimated coefficients. Extensive experiments show that the coefficients estimated by the Chandrasekhar adaptive filter give better results in texture discrimination than those estimated by the other algorithms, even in a noisy context.
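
A ratio of between-class to within-class variance, as used for the characterization degree above, might be computed along these lines. The abstract does not give the exact formula, so this sketch assumes one scalar coefficient per sample, an unweighted population variance of the class means as the between-class term, and the mean of the per-class population variances as the within-class term. The function name and the sample values are hypothetical.

```python
from statistics import fmean, pvariance


def characterization_degree(coeffs_by_class):
    """Between-class variance divided by within-class variance.

    Assumed definition (not confirmed by the paper): variance of the class
    means over the average of the per-class variances.
    """
    class_means = [fmean(values) for values in coeffs_by_class.values()]
    class_vars = [pvariance(values) for values in coeffs_by_class.values()]
    between = pvariance(class_means)
    within = fmean(class_vars)
    if within == 0:
        raise ValueError("within-class variance is zero; the ratio is undefined")
    return between / within


# Hypothetical AR coefficient estimates for two texture classes.
coeffs = {"grass": [0.9, 1.1], "brick": [2.9, 3.1]}
degree = characterization_degree(coeffs)
```

A larger value means the classes are further apart relative to their internal spread, which is the sense in which a coefficient set would discriminate textures better.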

A New Approach for the Fingerprint Classification Based On Gray-Level Co-Occurrence Matrix

Year: 2008 Volume: 2 Issue: 11 Pages: 3923 - 3926

Abstract: In this paper, we propose an approach for the classification of fingerprint databases. It is based on the fact that a fingerprint image is composed of regular texture regions that can be successfully represented by co-occurrence matrices. We first extract features based on certain characteristics of the co-occurrence matrix and then use these features to train a neural network that classifies fingerprints into four common classes. Compared with existing approaches, the obtained results demonstrate the superior performance of our proposed approach.

Probability and Instruction Effects in Syllogistic Conditional Reasoning

Year: 2008 Volume: 2 Issue: 7 Pages: 780 - 788

Abstract: The main aim of this study was to examine whether people understand indicative conditionals on the basis of syntactic factors or on the basis of subjective conditional probability. The second aim was to investigate whether the conditional probability of q given p depends on the antecedent and consequent sizes or derives from inductive processes that establish a link of plausible cooccurrence between events that are semantically or experientially associated. These competing hypotheses were tested through a 3 x 2 x 2 x 2 mixed design involving the manipulation of four variables: type of instructions (“Consider the following statement to be true”, “Read the following statement”, and a condition with no conditional statement); antecedent size (high/low); consequent size (high/low); statement probability (high/low). The first variable was between-subjects, the others were within-subjects. The inferences investigated were Modus Ponens and Modus Tollens. Ninety undergraduates of the Second University of Naples, without any prior knowledge of logic or conditional reasoning, participated in this study. Results suggest that people understand conditionals in a syntactic way rather than in a probabilistic way, even though the perception of the conditional probability of q given p is at least partially involved in the comprehension of conditionals. They also showed that, in the presence of a conditional syllogism, inferences are not affected by the antecedent or consequent sizes. From a theoretical point of view, these findings suggest that it would be inappropriate to abandon the idea that conditionals are naturally understood in a syntactic way in favour of the idea that they are understood in a probabilistic way.

A Universal Model for Content-Based Image Retrieval

Year: 2008 Volume: 2 Issue: 10 Pages: 3431 - 3434

Abstract: In this paper a novel approach for generalized image retrieval based on semantic contents is presented. It combines three feature extraction methods, namely color, texture, and edge histogram descriptor. There is a provision to add new features in future for better retrieval efficiency. Any combination of these methods that is appropriate for the application can be used for retrieval. This is provided through the User Interface (UI) in the form of relevance feedback. The image properties in this work are analyzed using computer vision and image processing algorithms. For color, the histograms of the images are computed; for texture, cooccurrence-matrix-based entropy, energy, etc. are calculated; and for edge density, the Edge Histogram Descriptor (EHD) is found. For retrieval of images, a novel idea based on a greedy strategy is developed to reduce the computational complexity. The entire system was developed using AForge.Imaging (an open source product), MATLAB .NET Builder, C#, and Oracle 10g. The system was tested with Coral Image database containing 1000 natural images and achieved better results.
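
The cooccurrence-matrix texture features named above (entropy and energy) can be illustrated with a small pure-Python sketch. It is not the system described in the abstract, which was built with AForge.Imaging, MATLAB and C#. The single horizontal offset, the non-symmetric matrix, the added contrast feature and the toy 4 x 4 image are all assumptions made for the example.

```python
import math
from collections import Counter


def glcm(image, dx=1, dy=0):
    """Normalised co-occurrence probabilities for pixel pairs offset by (dx, dy)."""
    counts = Counter()
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[(image[r][c], image[r2][c2])] += 1
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


def glcm_features(p):
    """Energy, entropy and contrast of a normalised co-occurrence matrix."""
    energy = sum(v * v for v in p.values())
    entropy = -sum(v * math.log2(v) for v in p.values())
    contrast = sum((i - j) ** 2 * v for (i, j), v in p.items())
    return {"energy": energy, "entropy": entropy, "contrast": contrast}


# Toy image with four gray levels (hypothetical data).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
probabilities = glcm(image)
features = glcm_features(probabilities)
```

In a retrieval setting, a feature vector like this would be computed once per image and compared across images, alongside the color and edge descriptors.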

Featured based Segmentation of Color Textured Images using GLCM and Markov Random Field Model

Year: 2011 Volume: 5 Issue: 5 Pages: 455 - 460

Abstract: In this paper, we propose a new image segmentation approach for colour textured images. The proposed method consists of two stages. In the first stage, textural features using the gray level co-occurrence matrix (GLCM) are computed for the regions of interest (ROI) considered for each class. The ROIs act as ground truth for the classes. The Ohta model (I1, I2, I3) is the colour model used for segmentation. The statistical mean feature at a certain inter pixel distance (IPD) of the I2 component was found to be the optimized textural feature for further segmentation. In the second stage, the feature matrix obtained is assumed to be a degraded version of the image labels, and the unknown image labels are modeled as a Markov Random Field (MRF). The labels are estimated through the maximum a posteriori (MAP) estimation criterion using the ICM algorithm. The performance of the proposed approach is compared with that of existing schemes: JSEG and another scheme which uses GLCM and MRF in RGB colour space. The proposed method is found to outperform the existing ones in terms of segmentation accuracy, with an acceptable rate of convergence. The results are validated with synthetic and real textured images.
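
The second stage, MAP label estimation with ICM (iterated conditional modes), can be sketched as below. The abstract does not specify the energy function, so this example assumes a squared-error data term against known class means and a Potts smoothness prior over the four nearest neighbours. The parameter beta, the class means and the toy feature matrix are hypothetical.

```python
def icm(features, class_means, beta=0.5, sweeps=5):
    """Greedy MAP labelling: each pixel takes the label with the lowest local energy.

    Assumed energy (not from the paper): (feature - class mean)^2 plus beta
    times the number of 4-neighbours carrying a different label.
    """
    rows, cols = len(features), len(features[0])
    classes = range(len(class_means))
    # Initial labels: nearest class mean, ignoring neighbours.
    labels = [
        [min(classes, key=lambda k: (features[r][c] - class_means[k]) ** 2)
         for c in range(cols)]
        for r in range(rows)
    ]
    for _ in range(sweeps):
        changed = False
        for r in range(rows):
            for c in range(cols):
                neighbours = [
                    labels[rr][cc]
                    for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                    if 0 <= rr < rows and 0 <= cc < cols
                ]

                def energy(k):
                    data = (features[r][c] - class_means[k]) ** 2
                    prior = beta * sum(1 for n in neighbours if n != k)
                    return data + prior

                best = min(classes, key=energy)
                if best != labels[r][c]:
                    labels[r][c] = best
                    changed = True
        if not changed:
            break  # converged: a full sweep changed no label
    return labels


# Toy feature matrix: left half near 0, right half near 1, one noisy pixel (0.7).
feature_matrix = [
    [0.1, 0.0, 1.0, 0.9],
    [0.0, 0.7, 1.1, 1.0],
    [0.1, 0.0, 0.9, 1.0],
]
labels = icm(feature_matrix, class_means=[0.0, 1.0])
```

On this toy input the noisy pixel starts with label 1, because 0.7 is closer to 1.0, and the neighbourhood prior flips it to label 0 in the first sweep. ICM only finds a local optimum, so the result depends on the initial labelling and on beta.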
