Abstract: India lacks a standard, systematic sizing approach for producing ready-made garments. Garment manufacturing companies use their own size tables, created by modifying international sizing charts for ready-made garments. The purpose of this study is to tabulate anthropometric data that cover the variety of figure proportions in both height and girth. A total of 3,000 records were collected through an anthropometric survey of females between the ages of 16 and 80 years from some states of India, in order to produce a sizing system suitable for clothing manufacture and retailing. The data are used for the statistical analysis of body measurements and the formulation of sizing systems and body measurement tables. Factor analysis is used to filter the control body dimensions from the large number of variables. Decision tree-based data mining is used to cluster the data. The standard, structured sizing system can facilitate pattern grading and garment production. Moreover, it can improve buying ratios and size allocations to retail segments.
Abstract: Many supervised machine learning tasks require
decision making across numerous different classes. Multi-class
classification has several applications, such as face recognition, text
recognition and medical diagnostics. The objective of this article is
to analyze an adapted method of Stacking in multi-class problems,
which combines ensembles within the ensemble itself. For this
purpose, a training similar to Stacking was used, but with three
levels, where the final decision-maker (level 2) performs its training
by combining outputs from the tree-based pair of meta-classifiers
(level 1) from Bayesian families. These are in turn trained by pairs
of base classifiers (level 0) of the same family. This strategy seeks to
promote diversity among the ensembles forming the meta-classifier
level 2. Three performance measures were used: (1) accuracy, (2)
area under the ROC curve, and (3) time for three factors: (a)
datasets, (b) experiments and (c) levels. To compare the factors, a
three-way ANOVA test was performed for each performance measure,
considering 5 datasets by 25 experiments by 3 levels. A triple
interaction between factors was observed only for time. The accuracy
and area under the ROC curve presented similar results, showing
a double interaction between level and experiment, as well as for
the dataset factor. It was concluded that level 2 had an average
performance above the other levels and that the proposed method
is especially efficient for multi-class problems when compared to
binary problems.
Abstract: The traditional k-means algorithm is widely used as a simple and efficient clustering method. However, it often converges to local minima because it is sensitive to the initial cluster centers. This paper presents an algorithm for selecting initial cluster centers on the basis of a minimum spanning tree (MST). The vertices of the MST that share the same degree are treated as a group, which is used to find the skeleton data points. Furthermore, a distance measure between skeleton data points that takes both degree and Euclidean distance into account is presented. Finally, an MST-based initialization method for the k-means algorithm is presented, and its time complexity is analyzed. The algorithm is tested on five data sets from the UCI Machine Learning Repository. The experimental results illustrate its effectiveness compared to three existing initialization methods.
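The first step described above, building an MST over the data and grouping its vertices by degree, can be illustrated with a minimal sketch. It assumes a complete Euclidean graph and uses Prim's algorithm; the function names are hypothetical, and the skeleton-point selection and the degree-aware distance measure are not specified in the abstract, so they are not attempted here.

```python
import math

def minimum_spanning_tree(points):
    """Prim's algorithm on the complete Euclidean graph; returns MST edges (i, j)."""
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n   # cheapest known connection to the tree
    best_from = [-1] * n         # tree vertex that provides that connection
    best_dist[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the vertex outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_from[u] >= 0:
            edges.append((best_from[u], u))
        # Relax the connection cost of the remaining vertices.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    best_from[v] = u
    return edges

def vertex_degrees(n, edges):
    """Degree of every vertex in the tree."""
    degrees = [0] * n
    for i, j in edges:
        degrees[i] += 1
        degrees[j] += 1
    return degrees

def group_by_degree(degrees):
    """Map each degree value to the list of vertex indices having that degree."""
    groups = {}
    for index, degree in enumerate(degrees):
        groups.setdefault(degree, []).append(index)
    return groups

if __name__ == "__main__":
    pts = [(0, 0), (1, 0), (2, 0), (10, 0)]
    tree = minimum_spanning_tree(pts)
    print(group_by_degree(vertex_degrees(len(pts), tree)))
```

This version runs in O(n^2) time, which is adequate for small data sets; the complexity analysis in the paper itself may differ.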
Abstract: In wireless sensor networks, location and positioning information can be captured using the Global Positioning System (GPS). This information can be gathered initially from the field to identify the system. Users can retrieve information of interest from a wireless sensor network (WSN) by injecting queries and gathering results from the mobile sink nodes. Routing is the process of choosing an optimal path in a mobile network. Intermediate nodes group sensor nodes into clusters and elect cluster heads, which gather data from the nodes of their own cluster and forward the aggregated data to the base station. WSNs are widely used for gathering data. Since sensors are power-constrained devices, it is vital for them to reduce power consumption. A tree-based data fusion clustering routing algorithm (TBDFC) is used to reduce energy consumption in wireless sensor networks. Here, the nodes in a tree use the cluster formation, and the height of the tree is decided based on the distance of the member nodes to the cluster head. Network simulation shows that this scheme reduces the power consumed by the nodes and thus considerably extends the network lifetime.
Abstract: This work addresses decision tree-based classification for
the disbursement of scholarships. A tree-based data mining
classification technique is used in order to determine the generic rule
to be used to disburse the scholarship. The system, based on the
rules defined from the tree, is able to determine the class (status) to
which an applicant belongs: Granted or Not Granted. The
applicants that fall into the Granted class have successfully
obtained the scholarship, while those in the Not Granted class are
unsuccessful in the scheme. An algorithm that can be used to classify
the applicants based on the rules from tree-based classification was
also developed. The tree-based classification is adopted because of its
efficiency, effectiveness, and ease of comprehension. The
system was tested with data from the National Information Technology
Development Agency (NITDA), Abuja, a parastatal of the Federal
Ministry of Communication Technology that is mandated to develop
and regulate information technology in Nigeria. The system was
found to work according to the specification. It is therefore
recommended for all scholarship disbursement organizations.
Abstract: One of the most important applications of
wireless sensor networks is data collection. This paper
proposes an efficient approach for data collection in wireless
sensor networks by introducing a Member Forward List. This list
includes the nodes with the highest priority for forwarding the data.
When a node fails or dies, this list is used to select the node
with the next-highest priority. The benefit of this list is that it prevents
the algorithm from repeating when a node fails or dies. The
results show that the Member Forward List decreases power
consumption and latency in wireless sensor networks.
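The failover behaviour described above can be sketched as a priority-ordered list that is consulted, rather than rebuilt, when a node dies. This is only an illustration: the class name, the method names, and the numeric priorities are hypothetical, and the abstract does not say how priorities are computed.

```python
class MemberForwardList:
    """Priority-ordered forwarding candidates (hypothetical sketch)."""

    def __init__(self, candidates):
        # candidates: iterable of (node_id, priority); higher priority is preferred.
        ordered = sorted(candidates, key=lambda c: c[1], reverse=True)
        self._ordered = [node_id for node_id, _ in ordered]
        self._failed = set()

    def mark_failed(self, node_id):
        """Record that a node has failed or died."""
        self._failed.add(node_id)

    def next_forwarder(self):
        """Return the highest-priority live node, or None if none is left."""
        for node_id in self._ordered:
            if node_id not in self._failed:
                return node_id
        return None

if __name__ == "__main__":
    mfl = MemberForwardList([("a", 0.2), ("b", 0.9), ("c", 0.5)])
    print(mfl.next_forwarder())   # highest-priority node
    mfl.mark_failed("b")
    print(mfl.next_forwarder())   # falls back without re-running selection
```

Because the ordering is computed once, a failure costs only a lookup in the existing list, which is the saving the abstract attributes to the approach.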
Abstract: The similarity comparison of RNA secondary
structures is important in studying the functions of RNAs. In recent
years, most existing tools have represented secondary structures with a
tree-based representation and calculated similarity by tree alignment
distance. Unlike previous approaches, we propose a new method
based on a maximum clique detection algorithm to extract the maximum
common structural elements in the compared RNA secondary structures.
A new graph-based similarity measurement and maximum common
subgraph detection procedures for comparing purely RNA secondary
structures are introduced. Given two RNA secondary structures, the
proposed algorithm consists of a process to determine the score of the
structural similarity, followed by comparing vertices labelling, the
labelled edges and the exact degree of each vertex. The proposed
algorithm also consists of a process to extract the common structural
elements between compared secondary structures based on a proposed
maximum clique detection of the problem. This graph-based model
can also work with the NC-IUB code to perform pattern-based
searching. Therefore, it can be used to identify functional RNA motifs
from a database or to extract common substructures between complex
RNA secondary structures. We have demonstrated the performance of the
proposed algorithm through experimental results. It provides a new idea for
comparing RNA secondary structures. This tool is helpful to those
who are interested in structural bioinformatics.
Abstract: A new hybrid coding method for compressing
animated polygonal meshes is presented. This paper assumes
a simple representation of the geometric data: a temporal
sequence of polygonal meshes, one for each discrete frame of the
animated sequence. The method combines delta coding and an
octree-based method. In this hybrid method, both the octree
approach and the delta coding approach are applied to each
single frame in the animation sequence in parallel. The
approach that generates the smaller encoded file size is chosen
to encode the current frame. Given the same quality
requirement, the hybrid coding method can achieve a much
higher compression ratio than the octree-only method or the
delta-only method. The hybrid approach can represent 3D
animated sequences with higher compression factors while
maintaining reasonable quality. It is easy to implement and has
a low-cost encoding process and a fast decoding process, which
make it a better choice for real-time applications.
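The per-frame selection rule described above (run both coders, keep the smaller output) can be sketched as follows. This is a lossless toy under stated assumptions, not the paper's codec: frames are lists of quantized integer coordinates of equal length, `encode_direct` is only a stand-in for the octree coder, and all function names are hypothetical.

```python
import struct
import zlib

def encode_direct(frame, prev):
    """Stand-in for the octree coder: compress the frame on its own."""
    return zlib.compress(struct.pack(f"<{len(frame)}i", *frame))

def decode_direct(payload, prev):
    raw = zlib.decompress(payload)
    return list(struct.unpack(f"<{len(raw) // 4}i", raw))

def encode_delta(frame, prev):
    """Delta coder: compress the difference from the previous frame."""
    deltas = [cur - old for cur, old in zip(frame, prev)]
    return zlib.compress(struct.pack(f"<{len(deltas)}i", *deltas))

def decode_delta(payload, prev):
    raw = zlib.decompress(payload)
    deltas = struct.unpack(f"<{len(raw) // 4}i", raw)
    return [old + d for old, d in zip(prev, deltas)]

# Tag -> (encoder, decoder); the tag is stored with each frame.
CODERS = {0: (encode_direct, decode_direct), 1: (encode_delta, decode_delta)}

def encode_sequence(frames):
    """Encode every frame with both coders and keep the smaller payload."""
    prev = [0] * len(frames[0]) if frames else []
    encoded = []
    for frame in frames:
        tag, payload = min(
            ((t, enc(frame, prev)) for t, (enc, _) in CODERS.items()),
            key=lambda candidate: len(candidate[1]),
        )
        encoded.append((tag, payload))
        prev = frame
    return encoded

def decode_sequence(encoded, n_values):
    """Rebuild the frames by dispatching on the stored tag."""
    prev = [0] * n_values
    frames = []
    for tag, payload in encoded:
        frame = CODERS[tag][1](payload, prev)
        frames.append(frame)
        prev = frame
    return frames

if __name__ == "__main__":
    frames = [[i * 3 + t for i in range(50)] for t in range(4)]
    encoded = encode_sequence(frames)
    print([(tag, len(payload)) for tag, payload in encoded])
```

Storing a tag per frame is what lets the decoder stay simple: it never has to guess which coder produced a payload.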
Abstract: This work presents a new phonetic transcription system based on a tree of hierarchical pronunciation rules expressed as context-specific grapheme-phoneme correspondences. The tree is automatically inferred from a phonetic dictionary by incrementally analyzing deeper context levels, eventually representing a minimum set of exhaustive rules that pronounce all the words in the training dictionary without errors and that can be applied to out-of-vocabulary words. The proposed approach improves upon existing rule-tree-based techniques in that it uses graphemes, rather than letters, as elementary orthographic units. A new linear algorithm for the segmentation of a word into graphemes is introduced to enable out-of-vocabulary grapheme-based phonetic transcription. Exhaustive rule trees provide a canonical representation of the pronunciation rules of a language that can be used not only to pronounce out-of-vocabulary words, but also to analyze and compare the pronunciation rules inferred from different dictionaries. The proposed approach has been implemented in C and tested on Oxford British English and Basic English. Experimental results show that grapheme-based rule trees represent phonetically sound rules and provide better performance than letter-based rule trees.
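The abstract does not describe its segmentation algorithm, so the following is only a common baseline for splitting a word into graphemes: greedy longest match against a known grapheme inventory. The inventory and the function name are hypothetical. Because the longest grapheme has bounded length, the running time is linear in the word length.

```python
def segment_graphemes(word, inventory):
    """Greedy longest-match segmentation of a word into graphemes.

    Multi-letter chunks are accepted only if they appear in the inventory;
    any single letter is accepted as a fallback grapheme.
    """
    max_len = max(1, max((len(g) for g in inventory), default=1))
    graphemes = []
    i = 0
    while i < len(word):
        # Try the longest candidate first, shrinking down to one letter.
        for size in range(min(max_len, len(word) - i), 0, -1):
            chunk = word[i:i + size]
            if size == 1 or chunk in inventory:
                graphemes.append(chunk)
                i += size
                break
    return graphemes

if __name__ == "__main__":
    inventory = {"sh", "th", "ch", "ee", "ough"}   # illustrative only
    print(segment_graphemes("though", inventory))
    print(segment_graphemes("sheep", inventory))
```

Greedy matching can pick the wrong boundary when graphemes overlap, which is one reason a purpose-built algorithm such as the one the paper introduces may behave differently.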
Abstract: Feature selection is gaining importance because it reduces classification cost in terms of time and computational load. One way to search for essential features is via the decision tree, which acts as an intermediate feature space inducer for choosing essential features. In decision tree-based feature selection, some studies use the decision tree as a feature ranker with a direct threshold measure, while others retain the decision tree but use a pruning condition that acts as a threshold mechanism to choose features. This paper proposes a threshold measure using Manhattan Hierarchical Cluster distance for use in feature ranking, in order to choose relevant features as part of the feature selection process. The results are promising, and the method can be improved in the future by including test cases with a higher number of attributes.