Abstract: In this paper we propose a new classification method for automatic sleep scoring using an artificial neural network based decision tree. It attempts to treat sleep scoring progress as a series of two-class problems and solves them with a decision tree made up of a group of neural network classifiers, each of which uses a special feature set and is aimed at only one specific sleep stage in order to maximize the classification effect. A single electroencephalogram (EEG) signal is used for our analysis rather than depending on multiple biological signals, which makes greatly simplifies the data acquisition process. Experimental results demonstrate that the average epoch by epoch agreement between the visual and the proposed method in separating 30s wakefulness+S1, REM, S2 and SWS epochs was 88.83%. This study shows that the proposed method performed well in all the four stages, and can effectively limit error propagation at the same time. It could, therefore, be an efficient method for automatic sleep scoring. Additionally, since it requires only a small volume of data it could be suited to pervasive applications.
Abstract: In Content-Based Image Retrieval systems it is
important to use an efficient indexing technique in order to perform
and accelerate the search in huge databases. The used indexing
technique should also support the high dimensions of image features.
In this paper we present the hierarchical index NOHIS-tree (Non
Overlapping Hierarchical Index Structure) when we scale up to very
large databases. We also present a study of the influence of clustering
on search time. The performance test results show that NOHIS-tree
performs better than SR-tree. Tests also show that NOHIS-tree keeps
its performances in high dimensional spaces. We include the
performance test that try to determine the number of clusters in
NOHIS-tree to have the best search time.
Abstract: Machine Translation (MT) between the Thai and English languages has been a challenging research topic in natural language processing. Most research has been done on English to Thai machine translation, but not the other way around. This paper presents a Thai to English Machine Translation System that translates a Thai sentence into interlingua of a Thai LFG tree using LFG grammar and a bottom up parser. The Thai LFG tree is then transformed into the corresponding English LFG tree by pattern matching and node transformation. Finally, an equivalent English sentence is created using structural information prescribed by the English LFG tree. Based on results of experiments designed to evaluate the performance of the proposed system, it can be stated that the system has been proven to be effective in providing a useful translation from Thai to English.
Abstract: In this paper, we propose a Connect6 solver which
adopts a hybrid approach based on a tree-search algorithm and image
processing techniques. The solver must deal with the complicated
computation and provide high performance in order to make real-time
decisions. The proposed approach enables the solver to be
implemented on a single Spartan-6 XC6SLX45 FPGA produced by
XILINX without using any external devices. The compact
implementation is achieved through image processing techniques to
optimize a tree-search algorithm of the Connect6 game. The tree
search is widely used in computer games and the optimal search brings
the best move in every turn of a computer game. Thus, many
tree-search algorithms such as Minimax algorithm and artificial
intelligence approaches have been widely proposed in this field.
However, there is one fundamental problem in this area; the
computation time increases rapidly in response to the growth of the
game tree. It means the larger the game tree is, the bigger the circuit
size is because of their highly parallel computation characteristics.
Here, this paper aims to reduce the size of a Connect6 game tree using
image processing techniques and its position symmetric property. The
proposed solver is composed of four computational modules: a
two-dimensional checkmate strategy checker, a template matching
module, a skilful-line predictor, and a next-move selector. These
modules work well together in selecting next moves from some
candidates and the total amount of their circuits is small. The details of
the hardware design for an FPGA implementation are described and
the performance of this design is also shown in this paper.
Abstract: Decision tree algorithms have very important place at
classification model of data mining. In literature, algorithms use
entropy concept or gini index to form the tree. The shape of the
classes and their closeness to each other some of the factors that
affect the performance of the algorithm. In this paper we introduce a
new decision tree algorithm which employs data (attribute) folding
method and variation of the class variables over the branches to be
created. A comparative performance analysis has been held between
the proposed algorithm and C4.5.
Abstract: An important structuring mechanism for knowledge bases is building clusters based on the content of their knowledge objects. The objects are clustered based on the principle of maximizing the intraclass similarity and minimizing the interclass similarity. Clustering can also facilitate taxonomy formation, that is, the organization of observations into a hierarchy of classes that group similar events together. Hierarchical representation allows us to easily manage the complexity of knowledge, to view the knowledge at different levels of details, and to focus our attention on the interesting aspects only. One of such efficient and easy to understand systems is Hierarchical Production rule (HPRs) system. A HPR, a standard production rule augmented with generality and specificity information, is of the following form Decision If < condition> Generality Specificity . HPRs systems are capable of handling taxonomical structures inherent in the knowledge about the real world. In this paper, a set of related HPRs is called a cluster and is represented by a HPR-tree. This paper discusses an algorithm based on cumulative learning scenario for dynamic structuring of clusters. The proposed scheme incrementally incorporates new knowledge into the set of clusters from the previous episodes and also maintains summary of clusters as Synopsis to be used in the future episodes. Examples are given to demonstrate the behaviour of the proposed scheme. The suggested incremental structuring of clusters would be useful in mining data streams.
Abstract: The evolutionary tree is an important topic in bioinformation. In 2006, Chen and Lindsay proposed a new method to build the mixture tree from DNA sequences. Mixture tree is a new type evolutionary tree, and it has two additional information besides the information of ordinary evolutionary tree. One of the information is time parameter, and the other is the set of mutated sites. In 2008, Lin and Juan proposed an algorithm to compute the distance between two mixture trees. Their algorithm computes the distance with only considering the time parameter between two mixture trees. In this paper, we proposes a method to measure the similarity of two mixture trees with considering the set of mutated sites and develops two algorithm to compute the distance between two mixture trees. The time complexity of these two proposed algorithms are O(n2 × max{h(T1), h(T2)}) and O(n2), respectively
Abstract: Many multimedia communication applications require a
source to transmit messages to multiple destinations subject to quality
of service (QoS) delay constraint. To support delay constrained
multicast communications, computer networks need to guarantee an
upper bound end-to-end delay from the source node to each of
the destination nodes. This is known as multicast delay problem.
On the other hand, if the same message fails to arrive at each
destination node at the same time, there may arise inconsistency and
unfairness problem among users. This is related to multicast delayvariation
problem. The problem to find a minimum cost multicast
tree with delay and delay-variation constraints has been proven to
be NP-Complete. In this paper, we propose an efficient heuristic
algorithm, namely, Economic Delay and Delay-Variation Bounded
Multicast (EDVBM) algorithm, based on a novel heuristic function,
to construct an economic delay and delay-variation bounded multicast
tree. A noteworthy feature of this algorithm is that it has very high
probability of finding the optimal solution in polynomial time with
low computational complexity.
Abstract: Point quad tree is considered as one of the most
common data organizations to deal with spatial data & can be used to
increase the efficiency for searching the point features. As the
efficiency of the searching technique depends on the height of the
tree, arbitrary insertion of the point features may make the tree
unbalanced and lead to higher time of searching. This paper attempts
to design an algorithm to make a nearly balanced quad tree. Point
pattern analysis technique has been applied for this purpose which
shows a significant enhancement of the performance and the results
are also included in the paper for the sake of completeness.
Abstract: Performance of any continuous speech recognition system is highly dependent on performance of the acoustic models. Generally, development of the robust spoken language technology relies on the availability of large amounts of data. Common way to cope with little data for training each state of Markov models is treebased state tying. This tying method applies contextual questions to tie states. Manual procedure for question generation suffers from human errors and is time consuming. Various automatically generated questions are used to construct decision tree. There are three approaches to generate questions to construct HMMs based on decision tree. One approach is based on misrecognized phonemes, another approach basically uses feature table and the other is based on state distributions corresponding to context-independent subword units. In this paper, all these methods of automatic question generation are applied to the decision tree on FARSDAT corpus in Persian language and their results are compared with those of manually generated questions. The results show that automatically generated questions yield much better results and can replace manually generated questions in Persian language.
Abstract: Generalized Center String (GCS) problem are
generalized from Common Approximate Substring problem
and Common substring problems. GCS are known to be
NP-hard allowing the problems lies in the explosion of
potential candidates. Finding longest center string without
concerning the sequence that may not contain any motifs is
not known in advance in any particular biological gene
process. GCS solved by frequent pattern-mining techniques
and known to be fixed parameter tractable based on the
fixed input sequence length and symbol set size. Efficient
method known as Bpriori algorithms can solve GCS with
reasonable time/space complexities. Bpriori 2 and Bpriori
3-2 algorithm are been proposed of any length and any
positions of all their instances in input sequences. In this
paper, we reduced the time/space complexity of Bpriori
algorithm by Constrained Based Frequent Pattern mining
(CBFP) technique which integrates the idea of Constraint
Based Mining and FP-tree mining. CBFP mining technique
solves the GCS problem works for all center string of any
length, but also for the positions of all their mutated copies
of input sequence. CBFP mining technique construct TRIE
like with FP tree to represent the mutated copies of center
string of any length, along with constraints to restraint
growth of the consensus tree. The complexity analysis for
Constrained Based FP mining technique and Bpriori
algorithm is done based on the worst case and average case
approach. Algorithm's correctness compared with the
Bpriori algorithm using artificial data is shown.
Abstract: Multiplication algorithms have considerable effect on
processors performance. A new high-speed, low-power
multiplication algorithm has been presented using modified Dadda
tree structure. Three important modifications have been implemented
in inner product generation step, inner product reduction step and
final addition step. Optimized algorithms have to be used into basic
computation components, such as multiplication algorithms. In this
paper, we proposed a new algorithm to reduce power, delay, and
transistor count of a multiplication algorithm implemented using low
power modified counter. This work presents a novel design for
Dadda multiplication algorithms. The proposed multiplication
algorithm includes structured parts, which have important effect on
inner product reduction tree. In this paper, a 1.3V, 64-bit carry hybrid
adder is presented for fast, low voltage applications. The new 64-bit
adder uses a new circuit to implement the proposed carry hybrid
adder. The new adder using 80 nm CMOS technology has been
implemented on 700 MHz clock frequency. The proposed
multiplication algorithm has achieved 14 percent improvement in
transistor count, 13 percent reduction in delay and 12 percent
modification in power consumption in compared with conventional
designs.
Abstract: The drug discovery process starts with protein
identification because proteins are responsible for many functions
required for maintenance of life. Protein identification further needs
determination of protein function. Proposed method develops a
classifier for human protein function prediction. The model uses
decision tree for classification process. The protein function is
predicted on the basis of matched sequence derived features per each
protein function. The research work includes the development of a
tool which determines sequence derived features by analyzing
different parameters. The other sequence derived features are
determined using various web based tools.
Abstract: Keystroke authentication is a new access control system
to identify legitimate users via their typing behavior. In this paper,
machine learning techniques are adapted for keystroke authentication.
Seven learning methods are used to build models to differentiate user
keystroke patterns. The selected classification methods are Decision
Tree, Naive Bayesian, Instance Based Learning, Decision Table, One
Rule, Random Tree and K-star. Among these methods, three of them
are studied in more details. The results show that machine learning
is a feasible alternative for keystroke authentication. Compared to
the conventional Nearest Neighbour method in the recent research,
learning methods especially Decision Tree can be more accurate. In
addition, the experiment results reveal that 3-Grams is more accurate
than 2-Grams and 4-Grams for feature extraction. Also, combination
of attributes tend to result higher accuracy.
Abstract: Demand over web services is in growing with increases number of Web users. Web service is applied by Web application. Web application size is affected by its user-s requirements and interests. Differential in requirements and interests lead to growing of Web application size. The efficient way to save store spaces for more data and information is achieved by implementing algorithms to compress the contents of Web application documents. This paper introduces an algorithm to reduce Web application size based on reduction of the contents of HTML files. It removes unimportant contents regardless of the HTML file size. The removing is not ignored any character that is predicted in the HTML building process.
Abstract: To meet the demands of wireless sensor networks
(WSNs) where data are usually aggregated at a single source prior to
transmitting to any distant user, there is a need to establish a tree
structure inside any given event region. In this paper , a novel
technique to create one such tree is proposed .This tree preserves the
energy and maximizes the lifetime of event sources while they are
constantly transmitting for data aggregation. The term Decentralized
Lifetime Maximizing Tree (DLMT) is used to denote this tree.
DLMT features in nodes with higher energy tend to be chosen as data
aggregating parents so that the time to detect the first broken tree link
can be extended and less energy is involved in tree maintenance. By
constructing the tree in such a way, the protocol is able to reduce the
frequency of tree reconstruction, minimize the amount of data loss
,minimize the delay during data collection and preserves the energy.
Abstract: Despite various methods that exist in software risk management, software projects have a high rate of failure. When complexity and size of the projects are increased, managing software development becomes more difficult. In these projects the need for more analysis and risk assessment is vital. In this paper, a classification for software risks is specified. Then relations between these risks using risk tree structure are presented. Analysis and assessment of these risks are done using probabilistic calculations. This analysis helps qualitative and quantitative assessment of risk of failure. Moreover it can help software risk management process. This classification and risk tree structure can apply to some software tools.
Abstract: The purpose of this paper is to propose a framework for constructing correct parallel processing programs based on Equivalent Transformation Framework (ETF). ETF regards computation as In the framework, a problem-s domain knowledge and a query are described in definite clauses, and computation is regarded as transformation of the definite clauses. Its meaning is defined by a model of the set of definite clauses, and the transformation rules generated must preserve meaning. We have proposed a parallel processing method based on “specialization", a part of operation in the transformations, which resembles substitution in logic programming. The method requires “Memo-tree", a history of specialization to maintain correctness. In this paper we proposes the new method for the specialization-base parallel processing without Memo-tree.
Abstract: A new hybrid coding method for compressing
animated polygonal meshes is presented. This paper assumes
the simplistic representation of the geometric data: a temporal
sequence of polygonal meshes for each discrete frame of the
animated sequence. The method utilizes a delta coding and an
octree-based method. In this hybrid method, both the octree
approach and the delta coding approach are applied to each
single frame in the animation sequence in parallel. The
approach that generates the smaller encoded file size is chosen
to encode the current frame. Given the same quality
requirement, the hybrid coding method can achieve much
higher compression ratio than the octree-only method or the
delta-only method. The hybrid approach can represent 3D
animated sequences with higher compression factors while
maintaining reasonable quality. It is easy to implement and have
a low cost encoding process and a fast decoding process, which
make it a better choice for real time application.
Abstract: Nowadays data backup format doesn-t cease to appear raising so the anxiety on their accessibility and their perpetuity. XML is one of the most promising formats to guarantee the integrity of data. This article suggests while showing one thing man can do with XML. Indeed XML will help to create a data backup model. The main task will consist in defining an application in JAVA able to convert information of a database in XML format and restore them later.