SC-LSH: An Efficient Indexing Method for Approximate Similarity Search in High Dimensional Space

Locality Sensitive Hashing (LSH) is one of the most promising techniques for solving nearest neighbour search problem in high dimensional space. Euclidean LSH is the most popular variation of LSH that has been successfully applied in many multimedia applications. However, the Euclidean LSH presents limitations that affect structure and query performances. The main limitation of the Euclidean LSH is the large memory consumption. In order to achieve a good accuracy, a large number of hash tables is required. In this paper, we propose a new hashing algorithm to overcome the storage space problem and improve query time, while keeping a good accuracy as similar to that achieved by the original Euclidean LSH. The Experimental results on a real large-scale dataset show that the proposed approach achieves good performances and consumes less memory than the Euclidean LSH.

A New Approach for the Fingerprint Classification Based On Gray-Level Co- Occurrence Matrix

In this paper, we propose an approach for the classification of fingerprint databases. It is based on the fact that a fingerprint image is composed of regular texture regions that can be successfully represented by co-occurrence matrices. So, we first extract the features based on certain characteristics of the cooccurrence matrix and then we use these features to train a neural network for classifying fingerprints into four common classes. The obtained results compared with the existing approaches demonstrate the superior performance of our proposed approach.

Word Base Line Detection in Handwritten Text Recognition Systems

An approach is offered for more precise definition of base lines- borders in handwritten cursive text and general problems of handwritten text segmentation have also been analyzed. An offered method tries to solve problems arose in handwritten recognition with specific slant or in other words, where the letters of the words are not on the same vertical line. As an informative features, some recognition systems use ascending and descending parts of the letters, found after the word-s baseline detection. In such recognition systems, problems in baseline detection, impacts the quality of the recognition and decreases the rate of the recognition. Despite other methods, here borders are found by small pieces containing segmentation elements and defined as a set of linear functions. In this method, separate borders for top and bottom border lines are found. At the end of the paper, as a result, azerbaijani cursive handwritten texts written in Latin alphabet by different authors has been analyzed.

Design as Contract and Blueprint – Tackling Maturity Level 1 Software Vendors in an e-School Project

Process improvements have drawn much attention in practical software engineering. The capability maturity levels from CMMI have become an important index to assess a software company-s software engineering capability. However, in countries like Taiwan, customers often have no choices but to deal with vendors that are not CMMI prepared or qualified. We call these vendors maturitylevel- 1 (ML1) vendors. In this paper, we describe our experience from consulting an e-school project. We propose an approach to help our client tackle the ML1 vendors. Through our system analysis, we produce a design. This design is suggested to be used as part of contract and a blueprint to guide the implementation.

Applications of Rough Set Decompositions in Information Retrieval

This paper proposes rough set models with three different level knowledge granules in incomplete information system under tolerance relation by similarity between objects according to their attribute values. Through introducing dominance relation on the discourse to decompose similarity classes into three subclasses: little better subclass, little worse subclass and vague subclass, it dismantles lower and upper approximations into three components. By using these components, retrieving information to find naturally hierarchical expansions to queries and constructing answers to elaborative queries can be effective. It illustrates the approach in applying rough set models in the design of information retrieval system to access different granular expanded documents. The proposed method enhances rough set model application in the flexibility of expansions and elaborative queries in information retrieval.

New Adaptive Linear Discriminante Analysis for Face Recognition with SVM

We have applied new accelerated algorithm for linear discriminate analysis (LDA) in face recognition with support vector machine. The new algorithm has the advantage of optimal selection of the step size. The gradient descent method and new algorithm has been implemented in software and evaluated on the Yale face database B. The eigenfaces of these approaches have been used to training a KNN. Recognition rate with new algorithm is compared with gradient.

Toward a Use of Ontology to Reinforcing Semantic Classification of Message Based On LSA

For best collaboration, Asynchronous tools and particularly the discussion forums are the most used thanks to their flexibility in terms of time. To convey only the messages that belong to a theme of interest of the tutor in order to help him during his tutoring work, use of a tool for classification of these messages is indispensable. For this we have proposed a semantics classification tool of messages of a discussion forum that is based on LSA (Latent Semantic Analysis), which includes a thesaurus to organize the vocabulary. Benefits offered by formal ontology can overcome the insufficiencies that a thesaurus generates during its use and encourage us then to use it in our semantic classifier. In this work we propose the use of some functionalities that a OWL ontology proposes. We then explain how functionalities like “ObjectProperty", "SubClassOf" and “Datatype" property make our classification more intelligent by way of integrating new terms. New terms found are generated based on the first terms introduced by tutor and semantic relations described by OWL formalism.

Query Algebra for Semistuctured Data

With the tremendous growth of World Wide Web (WWW) data, there is an emerging need for effective information retrieval at the document level. Several query languages such as XML-QL, XPath, XQL, Quilt and XQuery are proposed in recent years to provide faster way of querying XML data, but they still lack of generality and efficiency. Our approach towards evolving a framework for querying semistructured documents is based on formal query algebra. Two elements are introduced in the proposed framework: first, a generic and flexible data model for logical representation of semistructured data and second, a set of operators for the manipulation of objects defined in the data model. In additional to accommodating several peculiarities of semistructured data, our model offers novel features such as bidirectional paths for navigational querying and partitions for data transformation that are not available in other proposals.

A Study on the Secure ebXML Transaction Models

ebXML (Electronic Business using eXtensible Markup Language) is an e-business standard, sponsored by UN/CEFACT and OASIS, which enables enterprises to exchange business messages, conduct trading relationships, communicate data in common terms and define and register business processes. While there is tremendous e-business value in the ebXML, security remains an unsolved problem and one of the largest barriers to adoption. XML security technologies emerging recently have extensibility and flexibility suitable for security implementation such as encryption, digital signature, access control and authentication. In this paper, we propose ebXML business transaction models that allow trading partners to securely exchange XML based business transactions by employing XML security technologies. We show how each XML security technology meets the ebXML standard by constructing the test software and validating messages between the trading partners.

CSOLAP (Continuous Spatial On-Line Analytical Processing)

Decision support systems are usually based on multidimensional structures which use the concept of hypercube. Dimensions are the axes on which facts are analyzed and form a space where a fact is located by a set of coordinates at the intersections of members of dimensions. Conventional multidimensional structures deal with discrete facts linked to discrete dimensions. However, when dealing with natural continuous phenomena the discrete representation is not adequate. There is a need to integrate spatiotemporal continuity within multidimensional structures to enable analysis and exploration of continuous field data. Research issues that lead to the integration of spatiotemporal continuity in multidimensional structures are numerous. In this paper, we discuss research issues related to the integration of continuity in multidimensional structures, present briefly a multidimensional model for continuous field data. We also define new aggregation operations. The model and the associated operations and measures are validated by a prototype.

Complex Wavelet Transform Based Image Denoising and Zooming Under the LMMSE Framework

This paper proposes a dual tree complex wavelet transform (DT-CWT) based directional interpolation scheme for noisy images. The problems of denoising and interpolation are modelled as to estimate the noiseless and missing samples under the same framework of optimal estimation. Initially, DT-CWT is used to decompose an input low-resolution noisy image into low and high frequency subbands. The high-frequency subband images are interpolated by linear minimum mean square estimation (LMMSE) based interpolation, which preserves the edges of the interpolated images. For each noisy LR image sample, we compute multiple estimates of it along different directions and then fuse those directional estimates for a more accurate denoised LR image. The estimation parameters calculated in the denoising processing can be readily used to interpolate the missing samples. The inverse DT-CWT is applied on the denoised input and interpolated high frequency subband images to obtain the high resolution image. Compared with the conventional schemes that perform denoising and interpolation in tandem, the proposed DT-CWT based noisy image interpolation method can reduce many noise-caused interpolation artifacts and preserve well the image edge structures. The visual and quantitative results show that the proposed technique outperforms many of the existing denoising and interpolation methods.

Sub-Image Detection Using Fast Neural Processors and Image Decomposition

In this paper, an approach to reduce the computation steps required by fast neural networksfor the searching process is presented. The principle ofdivide and conquer strategy is applied through imagedecomposition. Each image is divided into small in sizesub-images and then each one is tested separately usinga fast neural network. The operation of fast neuralnetworks based on applying cross correlation in thefrequency domain between the input image and theweights of the hidden neurons. Compared toconventional and fast neural networks, experimentalresults show that a speed up ratio is achieved whenapplying this technique to locate human facesautomatically in cluttered scenes. Furthermore, fasterface detection is obtained by using parallel processingtechniques to test the resulting sub-images at the sametime using the same number of fast neural networks. Incontrast to using only fast neural networks, the speed upratio is increased with the size of the input image whenusing fast neural networks and image decomposition.

Estimation of the Bit Side Force by Using Artificial Neural Network

Horizontal wells are proven to be better producers because they can be extended for a long distance in the pay zone. Engineers have the technical means to forecast the well productivity for a given horizontal length. However, experiences have shown that the actual production rate is often significantly less than that of forecasted. It is a difficult task, if not impossible to identify the real reason why a horizontal well is not producing what was forecasted. Often the source of problem lies in the drilling of horizontal section such as permeability reduction in the pay zone due to mud invasion or snaky well patterns created during drilling. Although drillers aim to drill a constant inclination hole in the pay zone, the more frequent outcome is a sinusoidal wellbore trajectory. The two factors, which play an important role in wellbore tortuosity, are the inclination and side force at bit. A constant inclination horizontal well can only be drilled if the bit face is maintained perpendicular to longitudinal axis of bottom hole assembly (BHA) while keeping the side force nil at the bit. This approach assumes that there exists no formation force at bit. Hence, an appropriate BHA can be designed if bit side force and bit tilt are determined accurately. The Artificial Neural Network (ANN) is superior to existing analytical techniques. In this study, the neural networks have been employed as a general approximation tool for estimation of the bit side forces. A number of samples are analyzed with ANN for parameters of bit side force and the results are compared with exact analysis. Back Propagation Neural network (BPN) is used to approximation of bit side forces. Resultant low relative error value of the test indicates the usability of the BPN in this area.

Clustering Unstructured Text Documents Using Fading Function

Clustering unstructured text documents is an important issue in data mining community and has a number of applications such as document archive filtering, document organization and topic detection and subject tracing. In the real world, some of the already clustered documents may not be of importance while new documents of more significance may evolve. Most of the work done so far in clustering unstructured text documents overlooks this aspect of clustering. This paper, addresses this issue by using the Fading Function. The unstructured text documents are clustered. And for each cluster a statistics structure called Cluster Profile (CP) is implemented. The cluster profile incorporates the Fading Function. This Fading Function keeps an account of the time-dependent importance of the cluster. The work proposes a novel algorithm Clustering n-ary Merge Algorithm (CnMA) for unstructured text documents, that uses Cluster Profile and Fading Function. Experimental results illustrating the effectiveness of the proposed technique are also included.

A Type-2 Fuzzy Adaptive Controller of a Class of Nonlinear System

In this paper we propose a robust adaptive fuzzy controller for a class of nonlinear system with unknown dynamic. The method is based on type-2 fuzzy logic system to approximate unknown non-linear function. The design of the on-line adaptive scheme of the proposed controller is based on Lyapunov technique. Simulation results are given to illustrate the effectiveness of the proposed approach.

Versioning OWL Ontologies using Temporal Tags

Ontologies play an important role in semantic web applications and are often developed by different groups and continues to evolve over time. The knowledge in ontologies changes very rapidly that make the applications outdated if they continue to use old versions or unstable if they jump to new versions. Temporal frames using frame versioning and slot versioning are used to take care of dynamic nature of the ontologies. The paper proposes new tags and restructured OWL format enabling the applications to work with the old or new version of ontologies. Gene Ontology, a very dynamic ontology, has been used as a case study to explain the OWL Ontology with Temporal Tags.

Level of Service Based Methodology for Municipal Infrastructure Management

Development of levels of service in municipal context is a flexible vehicle to assist in performing quality-cost trade-off analysis for municipal services. This trade-off depends on the willingness of a community to pay as well as on the condition of the assets. Community perspective of the performance of an asset from service point of view may be quite different from the municipality perspective of the performance of the same asset from condition point of view. This paper presents a three phased level of service based methodology for water mains that consists of :1)development of an Analytical Hierarchy model of level of service 2) development of Fuzzy Weighted Sum model of water main condition index and 3) deriving a Fuzzy logic based function that maps level of service to asset condition index. This mapping will assist asset managers in quantifying condition improvement requirement to meet service goals and to make more informed decisions on interventions and relayed priorities.

Development of Content Management System with Animated Graph

Animated graph gives some good impressions in presenting information. However, not many people are able to produce it because the process of generating an animated graph requires some technical skills. This work presents Content Management System with Animated Graph (CMS-AG). It is a webbased system enabling users to produce an effective and interactive graphical report in a short time period. It allows for three levels of user authentication, provides update profile, account management, template management, graph management, and track changes. The system development applies incremental development approach, object-oriented concepts and Web programming technologies. The design architecture promotes new technology of reporting. It also helps user cut off unnecessary expenses, save time and learn new things on different levels of users. In this paper, the developed system is described.

The Knapsack Sharing Problem: A Tree Search Exact Algorithm

In this paper, we study the knapsack sharing problem, a variant of the well-known NP-Hard single knapsack problem. We investigate the use of a tree search for optimally solving the problem. The used method combines two complementary phases: a reduction interval search phase and a branch and bound procedure one. First, the reduction phase applies a polynomial reduction strategy; that is used for decomposing the problem into a series of knapsack problems. Second, the tree search procedure is applied in order to attain a set of optimal capacities characterizing the knapsack problems. Finally, the performance of the proposed optimal algorithm is evaluated on a set of instances of the literature and its runtime is compared to the best exact algorithm of the literature.

Data and Control Flow Analysis of VDMµ Specifications

Formal Specification languages are being widely used for system specification and testing. Highly critical systems such as real time systems, avionics, and medical systems are represented using Formal specification languages. Formal specifications based testing is mostly performed using black box testing approaches thus testing only the set of inputs and outputs of the system. The formal specification language such as VDMµ can be used for white box testing as they provide enough constructs as any other high level programming language. In this work, we perform data and control flow analysis of VDMµ class specifications. The proposed work is discussed with an example of SavingAccount.