Abstract: Text categorization techniques are widely used in many Information Retrieval (IR) applications. In this paper, we propose a simple but efficient method that automatically finds the relationship between any pair of terms and documents and builds an indexing matrix for text categorization. We call this method the Indexing Matrix Categorization Machine (IMCM). Several experiments are conducted to show the efficiency and robustness of our algorithm.
Abstract: One problem in object-oriented software development
is the difficulty of finding appropriate and suitable objects with
which to start the system. In this work, ontologies are used to
support object discovery in the initial phase of object-oriented
software development. Many studies have tried to demonstrate
that there is great potential in relating object models and ontologies.
Constructing an ontology from an object model, which is called
ontology engineering, can be done; this research, on the other hand,
aims to support the idea that building an object model from an
ontology is also promising and practical. Ontology classes are
available online in various specific areas and can be found with a
semantic search engine. There are also many tools that help to do so;
those used in this research are the Protégé ontology editor and Visual
Paradigm. Combining them gives a good outcome. This research
shows how the approach works efficiently in a real case study using
ontology classes in the travel/tourism domain. Classes, properties,
and relationships from more than two ontologies need to be combined
in order to generate the object model. This paper presents a simple
methodology framework that explains the process of discovering
objects. The results show that this framework has great value and
that there is room for expansion. Reusing existing ontologies offers
a much cheaper alternative than building new ones from scratch.
More ontologies are becoming available on the web, and online
ontology libraries for storing and indexing ontologies are increasing
in number and demand. Semantic and ontology search engines have
also started to appear, to facilitate search and retrieval of online
ontologies.
Abstract: The majority of today's IR systems base the IR task on two main processes: indexing and searching. There exists a special group of dynamic IR systems where both processes (indexing and searching) happen simultaneously; such a system discards obsolete information while inserting new information and still answering user queries. In these dynamic, time-critical text document databases, it is often important to modify index structures quickly as documents arrive. This paper presents a method for dynamization which may be used for this task. Experimental results show that the dynamization process is possible and that it guarantees the response time for the query operation and for index updates.
Abstract: This study proposes a novel hybrid approach combining social network analysis and collaborative filtering (CF) to enhance the performance of recommender systems. The proposed model selects subgroups of users in an Internet community through social network analysis (SNA), and then performs clustering analysis using the information about the subgroups. Finally, it makes recommendations using cluster-indexing CF based on the clustering results. This study uses the cores of the subgroups as initial seeds for a conventional clustering algorithm. The model chooses the five cores with the highest degree centrality from SNA, and then performs clustering analysis using the cores as initial centroids (cluster centers). The model then amplifies the impact of friends in the social network in the process of cluster-indexing CF.
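The seeding step described in the abstract above (choosing the users with the highest degree centrality as initial cluster centers) can be illustrated with a minimal sketch. The function names, the edge-list input, and the rating-vector format are assumptions made for illustration only; the abstract does not specify them, and the subsequent clustering and cluster-indexing CF steps are not shown.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Return {node: degree / (n - 1)} for an undirected friendship graph."""
    neighbors = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            neighbors[u].add(v)
            neighbors[v].add(u)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


def pick_seed_users(edges, k=5):
    """Return the k users with the highest degree centrality.

    Ties are broken by user name so the result is deterministic
    (an assumption; the abstract does not say how ties are handled).
    """
    centrality = degree_centrality(edges)
    ranked = sorted(centrality, key=lambda node: (-centrality[node], node))
    return ranked[:k]


def initial_centroids(ratings, seeds):
    """Copy the seed users' rating vectors to use as initial centroids.

    `ratings` is a hypothetical {user: [rating, ...]} mapping.
    """
    return [list(ratings[user]) for user in seeds]


if __name__ == "__main__":
    friendships = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("e", "a")]
    ratings = {"a": [5, 1], "b": [4, 2], "c": [1, 5], "d": [3, 3], "e": [2, 4]}
    seeds = pick_seed_users(friendships, k=2)
    print(seeds, initial_centroids(ratings, seeds))
```

The resulting centroids would then be handed to a conventional clustering algorithm such as k-means in place of random initialization.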
Abstract: The data exchanged on the Web are of a different nature
from those handled by classical database management systems;
these data are called semi-structured data since they do not have a
regular and static structure like data found in a relational database;
their schema is dynamic and may contain missing data or types.
Therefore, the need has arisen for further techniques and
algorithms to exploit and integrate such data and to extract relevant
information for the user. In this paper we present
the system OSIX (Osiris based System for Integration of XML
Sources). This system has a Data Warehouse model designed for the
integration of semi-structured data and more precisely for the
integration of XML documents. The architecture of OSIX relies on
the Osiris system, a DL-based model designed for the representation
and management of databases and knowledge bases. Osiris is a view-based
data model whose indexing system supports semantic query
optimization. We show that query processing on an
XML source is optimized by the indexing approach proposed by
Osiris.
Abstract: With the rapid development of the life
sciences and the flood of genomic information, the need for faster
and more scalable search methods has become urgent. One of the
approaches that has been investigated is indexing. Indexing methods
fall into three categories: length-based
index algorithms, transformation-based algorithms, and mixed
techniques-based algorithms. In this research, we focus on the
transformation-based methods. We embedded the N-gram method
into the transformation-based method to build an inverted index
table. We then applied parallel methods to speed up index
building and to reduce the overall retrieval time when querying
the genomic database. Our experiments show that the N-gram
transformation algorithm is an economical solution; it saves both time
and space. The results show that the index is smaller than
the dataset when the N-gram size is 5 or 6. The results for the
parallel N-gram transformation algorithm indicate that
parallel programming with large datasets is promising and
can be improved further.
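To make the idea of an N-gram inverted index table concrete, the following is a minimal serial sketch. It is not the authors' transformation-based algorithm, whose details the abstract does not give, and it omits the parallel step; the function names and the in-memory dictionary layout are assumptions for illustration.

```python
from collections import defaultdict


def build_ngram_index(sequences, n=5):
    """Map every length-n substring to the (sequence id, offset) pairs where it occurs.

    `sequences` is a {sequence id: sequence string} mapping.
    """
    index = defaultdict(list)
    for seq_id, seq in sequences.items():
        for pos in range(len(seq) - n + 1):
            index[seq[pos:pos + n]].append((seq_id, pos))
    return index


def search(index, sequences, query, n=5):
    """Return sorted (sequence id, offset) pairs where `query` occurs exactly.

    The first n-gram of the query selects candidate positions from the
    index; each candidate is then verified against the full query.
    """
    if len(query) < n:
        raise ValueError("query must be at least n characters long")
    hits = []
    for seq_id, pos in index.get(query[:n], []):
        if sequences[seq_id].startswith(query, pos):
            hits.append((seq_id, pos))
    return sorted(hits)


if __name__ == "__main__":
    genome = {"s1": "ACGTACGTAC", "s2": "TTACGTAG"}
    idx = build_ngram_index(genome, n=5)
    print(search(idx, genome, "ACGTAG"))
```

Because each sequence contributes to the index independently, index construction could in principle be split across workers and the partial indexes merged, which is one plausible way to parallelize it.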
Abstract: Knowing the geometrical pose of products on a manufacturing line before robot manipulation is required, and it is less time consuming than measuring the overall shape. To obtain it, information on the shape representation and matching of objects is required. Objects are compared through their descriptors, which are conceptually subtracted from each other to form a scalar metric. The smaller the metric value, the closer the objects are considered to be to each other. Rotating an object from a static pose in some direction changes the scalar metric value of the boundary information obtained after feature extraction of the object. This paper proposes an indexing technique for retrieval of 3D geometrical models based on similarity between boundary shapes, in order to measure the pose of 3D CAD objects using object shape feature matching for a Computer Aided Testing (CAT) system on a production line. Experimental results show the effectiveness of the proposed method.
Abstract: In this study, a K-Means-like clustering technique with a hierarchical initial set (HKM) has been implemented. The goal of this study is to show that clustering document sets enhances precision in information retrieval systems, as was shown by Bellot & El-Beze for the French language. A comparison is made between a traditional information retrieval system and the clustered one. The effect of increasing the number of clusters on precision is also studied. The indexing technique is Term Frequency * Inverse Document Frequency (TF * IDF). It has been found that Hierarchical K-Means Like clustering (HKM) with 3 clusters over 242 Arabic abstract documents from the Saudi Arabian National Computer Conference gives significant results compared with a traditional information retrieval system without clustering. Additionally, it has been found that it is not necessary to increase the number of clusters to improve precision further.
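The TF * IDF weighting named in the abstract above can be sketched as follows. The abstract does not state which variant was used, so the raw term count for TF and the natural logarithm of (number of documents / document frequency) for IDF are assumptions, as are the function name and the token-list input.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    `docs` is a list of token lists. The weight of a term is
    (count of the term in the document) * ln(N / df), where N is the
    number of documents and df is the number of documents containing the term.
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each term once per document
    weights = []
    for tokens in docs:
        term_freq = Counter(tokens)
        weights.append(
            {term: term_freq[term] * math.log(n_docs / doc_freq[term])
             for term in term_freq}
        )
    return weights


if __name__ == "__main__":
    corpus = [["index", "search", "index"], ["search", "cluster"]]
    for vector in tf_idf(corpus):
        print(vector)
```

A term that appears in every document receives weight zero under this variant, which is why common words contribute little to retrieval or clustering.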