A Hybrid Classification Method using Artificial Neural Network Based Decision Tree for Automatic Sleep Scoring

In this paper we propose a new classification method for automatic sleep scoring using an artificial neural network based decision tree. It attempts to treat sleep scoring progress as a series of two-class problems and solves them with a decision tree made up of a group of neural network classifiers, each of which uses a special feature set and is aimed at only one specific sleep stage in order to maximize the classification effect. A single electroencephalogram (EEG) signal is used for our analysis rather than depending on multiple biological signals, which makes greatly simplifies the data acquisition process. Experimental results demonstrate that the average epoch by epoch agreement between the visual and the proposed method in separating 30s wakefulness+S1, REM, S2 and SWS epochs was 88.83%. This study shows that the proposed method performed well in all the four stages, and can effectively limit error propagation at the same time. It could, therefore, be an efficient method for automatic sleep scoring. Additionally, since it requires only a small volume of data it could be suited to pervasive applications.

NOHIS-Tree: High-Dimensional Index Structure for Similarity Search

In Content-Based Image Retrieval systems it is important to use an efficient indexing technique in order to perform and accelerate the search in huge databases. The used indexing technique should also support the high dimensions of image features. In this paper we present the hierarchical index NOHIS-tree (Non Overlapping Hierarchical Index Structure) when we scale up to very large databases. We also present a study of the influence of clustering on search time. The performance test results show that NOHIS-tree performs better than SR-tree. Tests also show that NOHIS-tree keeps its performances in high dimensional spaces. We include the performance test that try to determine the number of clusters in NOHIS-tree to have the best search time.

A Thai to English Machine Translation System Using Thai LFG Tree Structure as Interlingua

Machine Translation (MT) between the Thai and English languages has been a challenging research topic in natural language processing. Most research has been done on English to Thai machine translation, but not the other way around. This paper presents a Thai to English Machine Translation System that translates a Thai sentence into interlingua of a Thai LFG tree using LFG grammar and a bottom up parser. The Thai LFG tree is then transformed into the corresponding English LFG tree by pattern matching and node transformation. Finally, an equivalent English sentence is created using structural information prescribed by the English LFG tree. Based on results of experiments designed to evaluate the performance of the proposed system, it can be stated that the system has been proven to be effective in providing a useful translation from Thai to English.

Game-Tree Simplification by Pattern Matching and Its Acceleration Approach using an FPGA

In this paper, we propose a Connect6 solver which adopts a hybrid approach based on a tree-search algorithm and image processing techniques. The solver must deal with the complicated computation and provide high performance in order to make real-time decisions. The proposed approach enables the solver to be implemented on a single Spartan-6 XC6SLX45 FPGA produced by XILINX without using any external devices. The compact implementation is achieved through image processing techniques to optimize a tree-search algorithm of the Connect6 game. The tree search is widely used in computer games and the optimal search brings the best move in every turn of a computer game. Thus, many tree-search algorithms such as Minimax algorithm and artificial intelligence approaches have been widely proposed in this field. However, there is one fundamental problem in this area; the computation time increases rapidly in response to the growth of the game tree. It means the larger the game tree is, the bigger the circuit size is because of their highly parallel computation characteristics. Here, this paper aims to reduce the size of a Connect6 game tree using image processing techniques and its position symmetric property. The proposed solver is composed of four computational modules: a two-dimensional checkmate strategy checker, a template matching module, a skilful-line predictor, and a next-move selector. These modules work well together in selecting next moves from some candidates and the total amount of their circuits is small. The details of the hardware design for an FPGA implementation are described and the performance of this design is also shown in this paper.

An Attribute-Centre Based Decision Tree Classification Algorithm

Decision tree algorithms have very important place at classification model of data mining. In literature, algorithms use entropy concept or gini index to form the tree. The shape of the classes and their closeness to each other some of the factors that affect the performance of the algorithm. In this paper we introduce a new decision tree algorithm which employs data (attribute) folding method and variation of the class variables over the branches to be created. A comparative performance analysis has been held between the proposed algorithm and C4.5.

Cumulative Learning based on Dynamic Clustering of Hierarchical Production Rules(HPRs)

An important structuring mechanism for knowledge bases is building clusters based on the content of their knowledge objects. The objects are clustered based on the principle of maximizing the intraclass similarity and minimizing the interclass similarity. Clustering can also facilitate taxonomy formation, that is, the organization of observations into a hierarchy of classes that group similar events together. Hierarchical representation allows us to easily manage the complexity of knowledge, to view the knowledge at different levels of details, and to focus our attention on the interesting aspects only. One of such efficient and easy to understand systems is Hierarchical Production rule (HPRs) system. A HPR, a standard production rule augmented with generality and specificity information, is of the following form Decision If < condition> Generality Specificity . HPRs systems are capable of handling taxonomical structures inherent in the knowledge about the real world. In this paper, a set of related HPRs is called a cluster and is represented by a HPR-tree. This paper discusses an algorithm based on cumulative learning scenario for dynamic structuring of clusters. The proposed scheme incrementally incorporates new knowledge into the set of clusters from the previous episodes and also maintains summary of clusters as Synopsis to be used in the future episodes. Examples are given to demonstrate the behaviour of the proposed scheme. The suggested incremental structuring of clusters would be useful in mining data streams.

The Mutated Distance between Two Mixture Trees

The evolutionary tree is an important topic in bioinformation. In 2006, Chen and Lindsay proposed a new method to build the mixture tree from DNA sequences. Mixture tree is a new type evolutionary tree, and it has two additional information besides the information of ordinary evolutionary tree. One of the information is time parameter, and the other is the set of mutated sites. In 2008, Lin and Juan proposed an algorithm to compute the distance between two mixture trees. Their algorithm computes the distance with only considering the time parameter between two mixture trees. In this paper, we proposes a method to measure the similarity of two mixture trees with considering the set of mutated sites and develops two algorithm to compute the distance between two mixture trees. The time complexity of these two proposed algorithms are O(n2 × max{h(T1), h(T2)}) and O(n2), respectively

An Efficient Algorithm for Delay Delay-variation Bounded Least Cost Multicast Routing

Many multimedia communication applications require a source to transmit messages to multiple destinations subject to quality of service (QoS) delay constraint. To support delay constrained multicast communications, computer networks need to guarantee an upper bound end-to-end delay from the source node to each of the destination nodes. This is known as multicast delay problem. On the other hand, if the same message fails to arrive at each destination node at the same time, there may arise inconsistency and unfairness problem among users. This is related to multicast delayvariation problem. The problem to find a minimum cost multicast tree with delay and delay-variation constraints has been proven to be NP-Complete. In this paper, we propose an efficient heuristic algorithm, namely, Economic Delay and Delay-Variation Bounded Multicast (EDVBM) algorithm, based on a novel heuristic function, to construct an economic delay and delay-variation bounded multicast tree. A noteworthy feature of this algorithm is that it has very high probability of finding the optimal solution in polynomial time with low computational complexity.

Balancing of Quad Tree using Point Pattern Analysis

Point quad tree is considered as one of the most common data organizations to deal with spatial data & can be used to increase the efficiency for searching the point features. As the efficiency of the searching technique depends on the height of the tree, arbitrary insertion of the point features may make the tree unbalanced and lead to higher time of searching. This paper attempts to design an algorithm to make a nearly balanced quad tree. Point pattern analysis technique has been applied for this purpose which shows a significant enhancement of the performance and the results are also included in the paper for the sake of completeness.

Comparison among Various Question Generations for Decision Tree Based State Tying in Persian Language

Performance of any continuous speech recognition system is highly dependent on performance of the acoustic models. Generally, development of the robust spoken language technology relies on the availability of large amounts of data. Common way to cope with little data for training each state of Markov models is treebased state tying. This tying method applies contextual questions to tie states. Manual procedure for question generation suffers from human errors and is time consuming. Various automatically generated questions are used to construct decision tree. There are three approaches to generate questions to construct HMMs based on decision tree. One approach is based on misrecognized phonemes, another approach basically uses feature table and the other is based on state distributions corresponding to context-independent subword units. In this paper, all these methods of automatic question generation are applied to the decision tree on FARSDAT corpus in Persian language and their results are compared with those of manually generated questions. The results show that automatically generated questions yield much better results and can replace manually generated questions in Persian language.

Constraint Based Frequent Pattern Mining Technique for Solving GCS Problem

Generalized Center String (GCS) problem are generalized from Common Approximate Substring problem and Common substring problems. GCS are known to be NP-hard allowing the problems lies in the explosion of potential candidates. Finding longest center string without concerning the sequence that may not contain any motifs is not known in advance in any particular biological gene process. GCS solved by frequent pattern-mining techniques and known to be fixed parameter tractable based on the fixed input sequence length and symbol set size. Efficient method known as Bpriori algorithms can solve GCS with reasonable time/space complexities. Bpriori 2 and Bpriori 3-2 algorithm are been proposed of any length and any positions of all their instances in input sequences. In this paper, we reduced the time/space complexity of Bpriori algorithm by Constrained Based Frequent Pattern mining (CBFP) technique which integrates the idea of Constraint Based Mining and FP-tree mining. CBFP mining technique solves the GCS problem works for all center string of any length, but also for the positions of all their mutated copies of input sequence. CBFP mining technique construct TRIE like with FP tree to represent the mutated copies of center string of any length, along with constraints to restraint growth of the consensus tree. The complexity analysis for Constrained Based FP mining technique and Bpriori algorithm is done based on the worst case and average case approach. Algorithm's correctness compared with the Bpriori algorithm using artificial data is shown.

A High-Speed Multiplication Algorithm Using Modified Partial Product Reduction Tree

Multiplication algorithms have considerable effect on processors performance. A new high-speed, low-power multiplication algorithm has been presented using modified Dadda tree structure. Three important modifications have been implemented in inner product generation step, inner product reduction step and final addition step. Optimized algorithms have to be used into basic computation components, such as multiplication algorithms. In this paper, we proposed a new algorithm to reduce power, delay, and transistor count of a multiplication algorithm implemented using low power modified counter. This work presents a novel design for Dadda multiplication algorithms. The proposed multiplication algorithm includes structured parts, which have important effect on inner product reduction tree. In this paper, a 1.3V, 64-bit carry hybrid adder is presented for fast, low voltage applications. The new 64-bit adder uses a new circuit to implement the proposed carry hybrid adder. The new adder using 80 nm CMOS technology has been implemented on 700 MHz clock frequency. The proposed multiplication algorithm has achieved 14 percent improvement in transistor count, 13 percent reduction in delay and 12 percent modification in power consumption in compared with conventional designs.

Predicting Protein Function using Decision Tree

The drug discovery process starts with protein identification because proteins are responsible for many functions required for maintenance of life. Protein identification further needs determination of protein function. Proposed method develops a classifier for human protein function prediction. The model uses decision tree for classification process. The protein function is predicted on the basis of matched sequence derived features per each protein function. The research work includes the development of a tool which determines sequence derived features by analyzing different parameters. The other sequence derived features are determined using various web based tools.

Learning User Keystroke Patterns for Authentication

Keystroke authentication is a new access control system to identify legitimate users via their typing behavior. In this paper, machine learning techniques are adapted for keystroke authentication. Seven learning methods are used to build models to differentiate user keystroke patterns. The selected classification methods are Decision Tree, Naive Bayesian, Instance Based Learning, Decision Table, One Rule, Random Tree and K-star. Among these methods, three of them are studied in more details. The results show that machine learning is a feasible alternative for keystroke authentication. Compared to the conventional Nearest Neighbour method in the recent research, learning methods especially Decision Tree can be more accurate. In addition, the experiment results reveal that 3-Grams is more accurate than 2-Grams and 4-Grams for feature extraction. Also, combination of attributes tend to result higher accuracy.

An Optimal Algorithm for HTML Page Building Process

Demand over web services is in growing with increases number of Web users. Web service is applied by Web application. Web application size is affected by its user-s requirements and interests. Differential in requirements and interests lead to growing of Web application size. The efficient way to save store spaces for more data and information is achieved by implementing algorithms to compress the contents of Web application documents. This paper introduces an algorithm to reduce Web application size based on reduction of the contents of HTML files. It removes unimportant contents regardless of the HTML file size. The removing is not ignored any character that is predicted in the HTML building process.

Construction Of Decentralized Lifetime Maximizing Tree for Data Aggregation in Wireless Sensor Networks

To meet the demands of wireless sensor networks (WSNs) where data are usually aggregated at a single source prior to transmitting to any distant user, there is a need to establish a tree structure inside any given event region. In this paper , a novel technique to create one such tree is proposed .This tree preserves the energy and maximizes the lifetime of event sources while they are constantly transmitting for data aggregation. The term Decentralized Lifetime Maximizing Tree (DLMT) is used to denote this tree. DLMT features in nodes with higher energy tend to be chosen as data aggregating parents so that the time to detect the first broken tree link can be extended and less energy is involved in tree maintenance. By constructing the tree in such a way, the protocol is able to reduce the frequency of tree reconstruction, minimize the amount of data loss ,minimize the delay during data collection and preserves the energy.

Classification and Analysis of Risks in Software Engineering

Despite various methods that exist in software risk management, software projects have a high rate of failure. When complexity and size of the projects are increased, managing software development becomes more difficult. In these projects the need for more analysis and risk assessment is vital. In this paper, a classification for software risks is specified. Then relations between these risks using risk tree structure are presented. Analysis and assessment of these risks are done using probabilistic calculations. This analysis helps qualitative and quantitative assessment of risk of failure. Moreover it can help software risk management process. This classification and risk tree structure can apply to some software tools.

Specialization-based parallel Processing without Memo-trees

The purpose of this paper is to propose a framework for constructing correct parallel processing programs based on Equivalent Transformation Framework (ETF). ETF regards computation as In the framework, a problem-s domain knowledge and a query are described in definite clauses, and computation is regarded as transformation of the definite clauses. Its meaning is defined by a model of the set of definite clauses, and the transformation rules generated must preserve meaning. We have proposed a parallel processing method based on “specialization", a part of operation in the transformations, which resembles substitution in logic programming. The method requires “Memo-tree", a history of specialization to maintain correctness. In this paper we proposes the new method for the specialization-base parallel processing without Memo-tree.

Hybrid Coding for Animated Polygonal Meshes

A new hybrid coding method for compressing animated polygonal meshes is presented. This paper assumes the simplistic representation of the geometric data: a temporal sequence of polygonal meshes for each discrete frame of the animated sequence. The method utilizes a delta coding and an octree-based method. In this hybrid method, both the octree approach and the delta coding approach are applied to each single frame in the animation sequence in parallel. The approach that generates the smaller encoded file size is chosen to encode the current frame. Given the same quality requirement, the hybrid coding method can achieve much higher compression ratio than the octree-only method or the delta-only method. The hybrid approach can represent 3D animated sequences with higher compression factors while maintaining reasonable quality. It is easy to implement and have a low cost encoding process and a fast decoding process, which make it a better choice for real time application.

Use XML Format like a Model of Data Backup

Nowadays data backup format doesn-t cease to appear raising so the anxiety on their accessibility and their perpetuity. XML is one of the most promising formats to guarantee the integrity of data. This article suggests while showing one thing man can do with XML. Indeed XML will help to create a data backup model. The main task will consist in defining an application in JAVA able to convert information of a database in XML format and restore them later.